Getting started with a single-worker instance - OpenSearch

This topic describes how to purchase an OpenSearch Vector Search Edition instance that has a single worker, configure the instance, and run query tests.

Purchase an instance

Log on to the OpenSearch console. In the upper-left corner, switch to OpenSearch Vector Search Edition. On the Instances page, click Create Instance.
On the buy page, select Vector Search Edition as Service Edition and configure the following parameters for purchasing an OpenSearch Vector Search Edition instance: Region and Zone, Data Node Quantity, Data Node Type, Total Storage Space of Single Searcher Worker, VPC, and vSwitch. Then, specify a username and password and click Buy Now. The username and password are used for permission verification in queries. We recommend that you do not specify your Alibaba Cloud account and password as the username and password.
Note
- Determine the number and specifications of Searcher workers to purchase based on your business requirements. After you specify the number and specifications, the price is automatically calculated and displayed on the buy page.
- If you purchase a single-worker instance, you do not need to purchase Query Result Searcher (QRS) workers. In this case, leave the Query Node Quantity parameter set to 0. The QRS Worker Family parameter then has no effect.
- You must specify the same virtual private cloud (VPC) and vSwitch as those of the Elastic Compute Service (ECS) instance that you use to access the OpenSearch Vector Search Edition instance. Otherwise, the {'errors':{'code':'403','message':'Forbidden'}} error message is reported when you access the OpenSearch Vector Search Edition instance.
- A Searcher worker that uses local SSDs is provided with a free quota of 50 GB of storage space. You can purchase extra storage space for the Searcher worker in increments of 50 GB. For a Searcher worker that uses cloud disks, no free quota of storage space is provided and you can purchase storage space in increments of 50 GB.
On the Confirm Order page, confirm the configurations, and read and agree to the terms of service agreement. Then, click Activate Now.
After you purchase the instance, click Management Console. On the Instances page, you can view the purchased instance.
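
The storage rule in the preceding note can be worked through as simple arithmetic. The following Python snippet is a minimal sketch, not part of the console or any SDK: the function name and the disk type labels local_ssd and cloud_disk are hypothetical, and the price shown on the buy page remains authoritative.

```python
# Sketch of the storage rule from the note above (names are hypothetical).
INCREMENT_GB = 50  # storage is purchased in increments of 50 GB
FREE_QUOTA_GB = {"local_ssd": 50, "cloud_disk": 0}  # free quota per Searcher worker


def extra_storage_to_purchase(required_gb: int, disk_type: str) -> int:
    """Return the storage (in GB) to purchase beyond the free quota."""
    if required_gb < 0:
        raise ValueError("required_gb must be non-negative")
    beyond_free = max(0, required_gb - FREE_QUOTA_GB[disk_type])
    increments = -(-beyond_free // INCREMENT_GB)  # ceiling division
    return increments * INCREMENT_GB


print(extra_storage_to_purchase(120, "local_ssd"))   # 100
print(extra_storage_to_purchase(120, "cloud_disk"))  # 150
```

For example, a Searcher worker that needs 120 GB on local SSDs uses the 50 GB free quota and rounds the remaining 70 GB up to 100 GB.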

Configure the instance

On the Instances page, the purchased instance is in the Pending Configuration state. The system automatically deploys an instance that contains no data. The number and specifications of Searcher workers in the automatically deployed instance are the same as those you specify when you purchase the instance. Before you can use the instance for searches, perform the following steps: configure a table, add a data source, configure fields, configure an index schema, and then perform reindexing for the instance.

1. Configure the basic information of a table

In the left-side navigation pane on the instance details page, click Table Management. On the Table Management page, click Add Table. In the Basic Table Information step of the Create wizard, configure the Table Name, Data Shards, Number of Resources for Data Updates, and Scenario Template parameters. Then, click Next.

Parameters:

Table Name: the name of the table. You can customize the table name.
Data Shards: the number of data shards. For a single-worker instance, the default value is 1 and you cannot modify the value. However, you can increase the number of workers of the instance. For more information, see Change the configurations of an instance.
Number of Resources for Data Updates: the number of resources used for data updates. By default, OpenSearch provides a free quota of two resources for data updates for each data source in an OpenSearch Vector Search Edition instance. Each resource consists of 4 CPU cores and 8 GB of memory. You are charged for resources that exceed the free quota. For more information, see Billing overview of OpenSearch Vector Search Edition.
Scenario Template: the template that is used to create the table. Valid values: Common Template, Vector: Image Search, and Vector: Semantic Search for Text.
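
To estimate how many resources for data updates are billable, you can apply the free quota described for the Number of Resources for Data Updates parameter. The following Python snippet is a rough sketch for one data source. The function name is hypothetical, and actual charges are determined by the billing rules of OpenSearch Vector Search Edition.

```python
# Sketch of the free-quota arithmetic for one data source (hypothetical helper).
FREE_UPDATE_RESOURCES = 2    # free quota per data source
CORES_PER_RESOURCE = 4       # each resource has 4 CPU cores
MEMORY_GB_PER_RESOURCE = 8   # each resource has 8 GB of memory


def update_resource_summary(requested: int) -> dict:
    """Summarize billable resources and total capacity for a requested count."""
    if requested < 0:
        raise ValueError("requested must be non-negative")
    return {
        "billable": max(0, requested - FREE_UPDATE_RESOURCES),
        "cpu_cores": requested * CORES_PER_RESOURCE,
        "memory_gb": requested * MEMORY_GB_PER_RESOURCE,
    }


print(update_resource_summary(5))
# {'billable': 3, 'cpu_cores': 20, 'memory_gb': 40}
```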

2. Add a data source

In the Data Synchronization step, add a data source. Object Storage Service (OSS) data sources, MaxCompute data sources, and API data sources are supported. In this example, MaxCompute + API is selected as Full Data Source. Configure the Project, AccessKey, AccessKey Secret, Table, and Partition Key parameters, set the Automatic Reindexing parameter to Yes or No, and then click Check. If the data source information passes the check, click Next.

For more information about MaxCompute data sources, see Create a table for a MaxCompute data source.
For more information about API data sources, see Create a table for an API data source.
For more information about OSS data sources, see Create a table for an OSS data source.

3. Configure fields

OpenSearch provides relevant preset fields based on the scenario template that you select and automatically imports fields in the full data source to the field list.

In the Field Configuration step, configure fields. You must configure at least two fields: a primary key field and a vector field. You must define the vector field as a multi-value field of the FLOAT type.

If you want to classify vectors by namespace, add a namespace field between the primary key field and vector field.

Note:

The primary key field and vector field are required. For the primary key field, you must set the Type parameter to INT or STRING and select the option in the Primary Key column. For the vector field, you must set the Type parameter to FLOAT and select the check box in the Vector Field column.
By default, the vector field is a multi-value field of the FLOAT type, and its values are separated by the HA3 multi-value delimiter (^]). This delimiter is the control character \x1D in UTF-8 encoding. You can also enter a custom multi-value delimiter.
When you configure a vector index, you must specify the fields in the order of the primary key field, namespace field, and vector field. The namespace field is optional. The preceding figure shows an example.
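
When you prepare source data, each vector must be written as a single multi-value string. The following Python snippet is a minimal sketch of joining and splitting float values with the default \x1D delimiter. The helper names are hypothetical, and the snippet assumes that your data source accepts the vector as delimited text.

```python
# Sketch: serialize a vector as a multi-value FLOAT string (hypothetical helpers).
HA3_MULTI_VALUE_SEP = "\x1d"  # the ^] control character (ASCII 29)


def encode_vector(values, sep=HA3_MULTI_VALUE_SEP):
    """Join float values into one delimiter-separated string."""
    return sep.join(repr(float(v)) for v in values)


def decode_vector(text, sep=HA3_MULTI_VALUE_SEP):
    """Split a delimiter-separated string back into a list of floats."""
    return [float(part) for part in text.split(sep)] if text else []


encoded = encode_vector([0.5, 1.0, 0.25])
print(repr(encoded))           # '0.5\x1d1.0\x1d0.25'
print(decode_vector(encoded))  # [0.5, 1.0, 0.25]
```

If you configure a custom multi-value delimiter for the field, pass the same character as the sep argument.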

4. Configure the index schema

Vector index

OpenSearch automatically creates indexes for the primary key field and vector field. The index names are the same as the field names. You only need to configure the vector index in the OpenSearch console.

You must separately configure parameters for the advanced configurations of the vector index. For more information, see Common configurations of vector indexes.

Note

The primary key field and vector field are required. The namespace field is optional and can be left empty.
You can configure only the three fixed fields for the Fields Contained parameter and cannot add fields.
The system automatically configures the parameters for the vector index. If you do not have special requirements, click Next to complete the configuration.
Namespace field: If the engine version of the instance is earlier than vector_service_1.0.2, the namespace field cannot be of the STRING type. If the engine version of the instance is vector_service_1.0.2 or later, no limit is imposed on the field type.

5. Confirm the creation

In the Confirm step, click Confirm. On the Table Management page, the table is in the Adding state.

6. View the change history

In the left-side navigation pane on the instance details page, click Change History. On the Change History page, you can view the change history of the instance within the previous three days, seven days, or 30 days. For example, you can view the processes of creating tables, creating indexes, scaling out the instance, and performing reindexing for full data. The following figure shows the process of creating a table. After all processes on the Change History page are complete, the search service is built and you can run query tests.

7. Run query tests

Sample query:

{
  "vector": [0.0019676427,0.005902928,0.021644069,0.21644068,0.12199384,0.043288138,0.007870571,0.0,0.08460863,0.041320495,0.043288138,0.035417568,0.011805856,0.055093993,0.12592913,0.017708784,0.021644069,0.0019676427,0.0,0.0,0.0019676427,0.078705706,0.1987319,0.041320495,0.039352853,0.0039352854,0.007870571,0.0039352854,0.0039352854,0.017708784,0.035417568,0.06886749,0.0019676427,0.0019676427,0.013773498,0.049191065,0.2125054,0.22824654,0.123961486,0.0039352854,0.0,0.0,0.021644069,0.14560555,0.078705706,0.1987319,0.22824654,0.005902928,0.064932205,0.0019676427,0.0019676427,0.021644069,0.027546996,0.035417568,0.22824654,0.22824654,0.1337997,0.023611711,0.009838213,0.007870571,0.0039352854,0.0039352854,0.017708784,0.20069954,0.033449925,0.005902928,0.019676426,0.035417568,0.015741142,0.029514639,0.13183205,0.123961486,0.029514639,0.0,0.027546996,0.22824654,0.15741141,0.0,0.0039352854,0.043288138,0.18889369,0.072802775,0.055093993,0.17315255,0.08460863,0.0019676427,0.007870571,0.035417568,0.22824654,0.10034977,0.009838213,0.021644069,0.062964566,0.027546996,0.015741142,0.04525578,0.086576276,0.033449925,0.023611711,0.017708784,0.0,0.0,0.03738521,0.072802775,0.16724962,0.035417568,0.031482283,0.20463483,0.043288138,0.011805856,0.0039352854,0.051158708,0.023611711,0.11412327,0.13183205,0.16134669,0.049191065,0.023611711,0.0039352854,0.0039352854,0.049191065,0.035417568,0.015741142,0.0039352854,0.03738521,0.08264099,0.094446845,0.021644069],
  "topK": 10,
  "includeVector": true
}

vector: the vector to be queried.
topK: the number of top-ranked documents to return.
includeVector: specifies whether to return vector information in documents.
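
The query body can also be assembled in code before you send it. The following Python snippet is a minimal sketch that builds the same three parameters and serializes them as JSON. The function name is hypothetical, the three-element vector is a placeholder, and the snippet does not show how the request is sent because this topic does not describe the endpoint or SDK call.

```python
import json


def build_vector_query(vector, top_k=10, include_vector=False):
    """Build a query body with the vector, topK, and includeVector parameters."""
    if not vector:
        raise ValueError("vector must contain at least one value")
    if top_k <= 0:
        raise ValueError("top_k must be a positive integer")
    return {
        "vector": [float(v) for v in vector],
        "topK": int(top_k),
        "includeVector": bool(include_vector),
    }


# Placeholder vector; use a vector with the same dimension as your vector field.
body = json.dumps(build_vector_query([0.1, 0.2, 0.3], top_k=10, include_vector=True))
print(body)
```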

Sample results:

For more information about the query syntax, see the "Syntax" section of this topic.

Syntax

Syntax for vector-based queries: Vector-based query
Syntax for primary key-based queries: Primary key-based query
Syntax for filter expressions: Filter expression

Use an SDK to query and update data

Use an SDK to perform vector-based queries or primary key-based queries. For more information, see Query data.
Use an SDK to add or delete documents. For more information, see Update Data.

Scale out the instance

To increase the number of workers of a single-worker instance, perform the following steps:

Warning

When a single-worker instance is being scaled out, the instance is unavailable. You can view the scale-out progress on the Change History page.

On the Instances page, find the instance that you want to scale out and click Scale In/Out in the Actions column.
On the configuration change page, specify the number and specifications of QRS workers that you want to purchase, read and agree to the terms of service, and then click Buy Now.