Overview - Tablestore - Alibaba Cloud Documentation Center

You can use the k-nearest neighbor (KNN) vector query feature of Tablestore to perform an approximate nearest neighbor search based on vectors. This way, you can find data items that have the highest similarity to the vector that you want to query in a large-scale dataset. This feature is suitable for scenarios such as retrieval-augmented generation (RAG), recommendation systems, similarity detection, natural language processing (NLP), and semantic search.

Scenarios

The KNN vector query feature is suitable for the following scenarios:

RAG
RAG is an AI framework that combines retrieval capabilities with the capabilities of large language models (LLMs) to improve the accuracy of outputs generated by LLMs, especially for private or domain-specific data. RAG is widely used in knowledge bases.
Recommendation system
KNN vector query can be used by platforms such as e-commerce, social media, and video streaming platforms. For example, a platform can encode content, such as user behaviors, preferences, and content features, as vectors and store the vectors. Then, KNN vector query can be used to quickly find products, articles, or videos that match user interests. This way, the platform can provide custom recommendations to improve user experience and loyalty.
Similarity detection
In content recognition scenarios, such as image recognition, video recognition, speech recognition, voiceprint recognition, and facial recognition, unstructured data is converted to vectors and stored. Then, the system uses KNN vector query to quickly find the most similar content. For example, after a user uploads an image of a commodity to an e-commerce platform, the system can quickly find the images of commodities that are similar to the uploaded image in style, color, or pattern.
NLP and semantic search
In the NLP field, text is converted to vectors by using techniques such as Word2Vec and Bidirectional Encoder Representations from Transformers (BERT). Then, KNN vector query is used to understand the semantics of query statements and find the most semantically relevant content such as documents, news, and Q&A pairs. This helps improve the relevance of query results and user experience.
Knowledge graph and intelligent conversational search
The nodes and relationships of nodes in knowledge graphs can be represented by vectors. Then, KNN vector query can be used to accelerate entity linking, relational inference, and the response of an intelligent conversational search system. This helps the system accurately understand the questions and provide answers to complex questions.

Features

You can use the KNN vector query feature to perform an approximate nearest neighbor search based on vectors. This way, you can find data items that have the highest similarity to the vector that you want to query in a large-scale dataset.

KNN vector query is an out-of-the-box feature that inherits all the features of search indexes. You can use the KNN vector query feature in pay-as-you-go mode, without the need to build or deploy a system. KNN vector query supports the streaming mode of search indexes. Data can be queried in near real time after the data is written to tables. KNN vector query also supports high-throughput insert, update, and delete operations. The query performance is comparable to that of systems that use the hierarchical navigable small world (HNSW) algorithm.

When you use the KNN vector query feature, you must specify the vector for which you want to query the similarity, the vector field that you want to query, and the number of most similar results that you want to return (top K). You can also use the features of other query methods in combination with KNN vector query to filter the query results.

Vector fields

Before you use the KNN vector query feature, you must specify the following information when you create a search index for a data table: the vector field, the number of dimensions of the vector field, the data type of the vector field, and the algorithm that you want to use to measure the distance between vectors.

The data type of the field in the data table that is mapped to the vector field must be STRING. The value of the vector field in the search index must be a string that represents an array of FLOAT32 values, such as `[1, 5.1, 4.7, 0.08]`. The following table describes the parameters of the vector field.

Parameter	Description
dimension	The number of dimensions of the vector field. The maximum value is 2048. The number of dimensions of the vector field must be the same as that of vectors generated by the vector generation system. The array length of each vector field value must be equal to the value of the dimension parameter. For example, if the value of a vector field is `[1, 5.1, 4.7, 0.08]`, the value of the dimension parameter for the field is 4. Note: Only dense vectors are supported. The number of dimensions of each vector that is written to the field must be the same as the number specified in the schema when the search index is created. A vector that has more or fewer dimensions causes index building to fail.
dataType	The data type of the vector field. Only FLOAT32 is supported. FLOAT32 does not support special values such as NaN and Infinity. The data type of the vector field must be the same as that of vectors generated by the vector generation system. Note: If you want to use vectors of other data types, submit a ticket.
metricType	The algorithm that you want to use to measure the distance between vectors. Valid values: euclidean, cosine, and dot_product. The distance measurement algorithm of the vector field must be the same as that of vectors generated by the vector generation system. For more information, see the Distance measurement algorithms for vectors section of this topic.

Note

The attributes of vectors vary based on the model or version used in the vector generation system. The attributes include the number of dimensions, data type, and algorithm used to measure the distance between vectors. The attributes of a vector field in a vector retrieval system must be the same as those of vectors generated by the vector generation system. For more information about the methods that you can use to generate vectors, see Generate vectors.
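As a rough illustration of the vector field constraints described above, the following sketch encodes a dense vector as a string and rejects input that a FLOAT32 vector field does not accept. The class `VectorFieldCodec` and its `encode` method are hypothetical helpers and are not part of any Tablestore SDK. The sketch assumes that the string value uses the bracketed, comma-separated array format shown in the `[1, 5.1, 4.7, 0.08]` example.

```java
import java.util.StringJoiner;

/** Hypothetical helper for preparing vector field values; not a Tablestore API. */
class VectorFieldCodec {

    /**
     * Encodes a dense vector as a string such as "[1.0, 5.1, 4.7, 0.08]".
     * Rejects vectors whose length differs from the dimension of the vector field
     * and values that FLOAT32 vector fields do not support (NaN, Infinity).
     */
    static String encode(float[] vector, int dimension) {
        if (vector.length != dimension) {
            throw new IllegalArgumentException(
                    "expected " + dimension + " dimensions but got " + vector.length);
        }
        StringJoiner joiner = new StringJoiner(", ", "[", "]");
        for (float value : vector) {
            if (Float.isNaN(value) || Float.isInfinite(value)) {
                throw new IllegalArgumentException("NaN and Infinity are not supported");
            }
            joiner.add(Float.toString(value));
        }
        return joiner.toString();
    }

    public static void main(String[] args) {
        // Prints [1.0, 5.1, 4.7, 0.08]
        System.out.println(encode(new float[] {1f, 5.1f, 4.7f, 0.08f}, 4));
    }
}
```

Validating the dimension on the client before a write is a design choice in this sketch. It surfaces a mismatch early instead of leaving it to be detected when the index is built.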

Distance measurement algorithms for vectors

The KNN vector query feature supports the following distance measurement algorithms for vectors: Euclidean distance, dot product, and cosine similarity. The following table describes the algorithms. A greater value that is obtained by using an algorithm indicates a higher similarity between two vectors.

Metric type	Formula	Performance	Description
Euclidean distance (euclidean)		Relatively high	Measures the shortest path between two vectors in a multi-dimensional space. The Euclidean distance algorithm in Tablestore does not perform the final square root calculation. This is done to improve performance. A greater value that is obtained by using the Euclidean distance algorithm indicates a higher similarity between two vectors.
Dot product (dot_product)		Highest	Multiplies the corresponding coordinates of two vectors of the same dimension and adds the products. A greater value that is obtained by using the dot product algorithm indicates a higher similarity between two vectors. Float32 vectors must be normalized before they are written to tables. For example, you can use the L2 norm to normalize Float32 vectors. If you do not normalize Float32 vectors before they are written to tables, you may encounter issues such as inaccurate query results, slow construction of vector indexes, and poor query performance.
Cosine similarity (cosine)		Relatively low	Calculates the cosine of the angle between two vectors in a vector space. A greater value that is obtained by using the cosine similarity algorithm indicates a higher similarity between two vectors. In most cases, this algorithm is used to calculate the similarity between text data. The calculation divides by the lengths of the two vectors, so the sum of squares of a FLOAT32 vector cannot be 0. Cosine similarity is relatively expensive to calculate. We recommend that you normalize vectors before you write data to tables and use the dot product algorithm to measure the distance between vectors.
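
The following sketch computes the three underlying quantities on plain Java arrays so that the relationships between them are easier to see. `VectorMetrics` and its methods are hypothetical names, not Tablestore APIs. The code shows only the textbook formulas. How Tablestore converts these quantities into the score that a query returns is not covered here and should not be inferred from this code.

```java
/** Textbook formulas for illustration only; hypothetical helper, not a Tablestore API. */
class VectorMetrics {

    private static void requireSameLength(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vectors must have the same dimension");
        }
    }

    /** Squared Euclidean distance: the final square root is skipped. */
    static double squaredEuclidean(float[] a, float[] b) {
        requireSameLength(a, b);
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double diff = (double) a[i] - b[i];
            sum += diff * diff;
        }
        return sum;
    }

    /** Dot product: sum of the products of corresponding coordinates. */
    static double dotProduct(float[] a, float[] b) {
        requireSameLength(a, b);
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += (double) a[i] * b[i];
        }
        return sum;
    }

    /** Cosine of the angle between a and b; undefined if either vector is all zeros. */
    static double cosine(float[] a, float[] b) {
        double norms = Math.sqrt(dotProduct(a, a)) * Math.sqrt(dotProduct(b, b));
        if (norms == 0.0) {
            throw new IllegalArgumentException("cosine is undefined for a zero vector");
        }
        return dotProduct(a, b) / norms;
    }

    /** Returns a new vector scaled to unit L2 length. */
    static float[] l2Normalize(float[] v) {
        double length = Math.sqrt(dotProduct(v, v));
        if (length == 0.0) {
            throw new IllegalArgumentException("cannot normalize a zero vector");
        }
        float[] out = new float[v.length];
        for (int i = 0; i < v.length; i++) {
            out[i] = (float) (v[i] / length);
        }
        return out;
    }

    public static void main(String[] args) {
        float[] a = {3f, 4f};
        float[] b = {4f, 3f};
        System.out.println(squaredEuclidean(a, b)); // 2.0
        System.out.println(dotProduct(a, b));       // 24.0
        System.out.println(cosine(a, b));           // 0.96
        // After L2 normalization, the dot product is approximately the cosine similarity.
        System.out.println(dotProduct(l2Normalize(a), l2Normalize(b)));
    }
}
```

The last line of `main` illustrates the recommendation in the table: after vectors are normalized, the cheaper dot product gives approximately the same value as cosine similarity, up to FLOAT32 rounding.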

Benefits

Cost-effectiveness

The core engine of the KNN vector query feature uses the optimized DiskANN technology and does not need to load all index data into memory. Compared with systems that use the HNSW algorithm, the KNN vector query feature consumes less than 10% of the memory while achieving a recall rate and performance comparable to those of the HNSW algorithm. This significantly reduces the overall costs.

Ease of use

As a feature of search indexes, KNN vector query is also serverless. You can use the KNN vector query feature after you create an instance in the Tablestore console, without the need to build or deploy a system.
The feature supports the pay-as-you-go billing method. You do not need to worry about the usage or scaling. The system supports the horizontal scaling of storage and computing resources. KNN vector query supports up to hundreds of billions of data entries, whereas other query features support up to 10 trillion data entries.
The internal engine of the KNN vector query feature uses the query optimizer to automatically select the optimal algorithm and execution path. You can achieve a high recall rate and high performance without the need to tune numerous parameters. This significantly simplifies the use of KNN vector query and effectively shortens the business development cycle.
You can use the KNN vector query feature by using SQL statements, SDKs for Java, Go, Python, and Node.js, and open source frameworks such as LangChain, LangChain4J, and LlamaIndex.

Billing rules

During the public preview, you are not charged for the billable items specific to the KNN vector query feature when you use the feature. You are charged for other billable items based on the existing billing rules.

When you use a search index to query data, you are charged for the read throughput that is consumed. For more information, see Billable items of search indexes.

Prerequisites

A vector field is specified when you create a search index. For more information, see Create a search index.

Note

If a search index already exists, you can dynamically modify the schema of the search index to add a vector field. For more information, see Dynamically modify the schema of a search index.

Usage notes

When you use the KNN vector query feature, take note of the following items:

Limits are imposed on the number of vector fields and the number of dimensions for a vector field. For more information, see Search index limits.
The search index server has multiple partitions. Each partition of the search index server returns the top K neighbors nearest to the vector that you want to query. The top K nearest neighbors returned by the partitions are aggregated on the client node. If you want to use tokens to query all data by page, the total number of rows in the response is related to the number of partitions of the search index server.
The KNN vector query feature is supported in the following regions: China (Hangzhou), China (Shanghai), China (Qingdao), China (Beijing), China (Zhangjiakou), China (Ulanqab), China (Shenzhen), China (Guangzhou), China (Chengdu), China (Hong Kong), Singapore, Malaysia (Kuala Lumpur), Thailand (Bangkok), US (Virginia), Indonesia (Jakarta), Japan (Tokyo), Germany (Frankfurt), UK (London), SAU (Riyadh - Partner Region), and Philippines (Manila).
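
The partition-level aggregation mentioned in the usage notes above can be pictured with the following sketch. It is a simplified model, not the actual service implementation: `PartitionTopKMerge`, `Hit`, and `mergeTopK` are hypothetical names, and the sketch assumes that each hit carries a score for which a higher value means a higher similarity.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

/** Simplified model of merging per-partition top K results; not a Tablestore API. */
class PartitionTopKMerge {

    /** A row key together with its similarity score (higher means more similar). */
    static final class Hit {
        final String rowKey;
        final double score;

        Hit(String rowKey, double score) {
            this.rowKey = rowKey;
            this.score = score;
        }
    }

    /** Collects the candidates of all partitions and keeps the k highest scores. */
    static List<Hit> mergeTopK(List<List<Hit>> partitionResults, int k) {
        List<Hit> candidates = new ArrayList<>();
        for (List<Hit> partition : partitionResults) {
            candidates.addAll(partition);
        }
        candidates.sort(Comparator.comparingDouble((Hit h) -> h.score).reversed());
        return new ArrayList<>(candidates.subList(0, Math.min(k, candidates.size())));
    }

    public static void main(String[] args) {
        // Three partitions, each returning its own top 2 neighbors: 6 candidates in total.
        List<List<Hit>> partitions = Arrays.asList(
                Arrays.asList(new Hit("a", 0.90), new Hit("b", 0.50)),
                Arrays.asList(new Hit("c", 0.95), new Hit("d", 0.40)),
                Arrays.asList(new Hit("e", 0.70), new Hit("f", 0.60)));
        for (Hit hit : mergeTopK(partitions, 2)) {
            System.out.println(hit.rowKey + " " + hit.score);
        }
    }
}
```

In this model, the number of candidates before the merge is the top K value multiplied by the number of partitions. This is why the total number of rows that are returned when you page through all data depends on the number of partitions.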

API operation

You can call the Search operation and set the query type to KnnVectorQuery to use the KNN vector query feature.

Parameters

Parameter	Required	Description
fieldName	Yes	The name of the vector field.
topK	Yes	The number of query results to return that have the highest similarity to the vector that you want to query. For information about the maximum value of the topK parameter, see Search index limits. Important: A greater value of K results in a higher recall rate but also higher query latency and costs. If the value of the topK parameter is less than the value of the limit parameter in SearchQuery, the server automatically uses the value of the limit parameter as the value of the topK parameter.
float32QueryVector	Yes	The vector for which you want to query the similarity.
filter	No	The filter. You can use a combination of query conditions that are not KNN vector query conditions.
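
The interaction between the topK and limit parameters described in the table can be summarized in one line of code. The following sketch only restates the documented rule. `TopKRule` and `effectiveTopK` are hypothetical names, and the actual adjustment is performed by the server.

```java
/** Restates the documented topK/limit rule; hypothetical helper, not a Tablestore API. */
class TopKRule {

    /** If topK is less than limit, the value of limit is used as topK. */
    static int effectiveTopK(int topK, int limit) {
        return Math.max(topK, limit);
    }

    public static void main(String[] args) {
        System.out.println(effectiveTopK(5, 10));  // 10
        System.out.println(effectiveTopK(20, 10)); // 20
    }
}
```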

Methods

Note

If an exception occurs when you use the KNN vector query feature, submit a ticket.

You can use the Tablestore console or Tablestore SDKs to use the KNN vector query feature. Before you use the KNN vector query feature to query data, make sure that the following requirements are met:

You have an Alibaba Cloud account or a Resource Access Management (RAM) user that has permissions to perform operations on Tablestore. For more information about how to grant the operation permissions on Tablestore to a RAM user, see Use a RAM policy to grant permissions to a RAM user.
An AccessKey pair is created for your Alibaba Cloud account or RAM user if you want to use Tablestore SDKs or the Tablestore CLI to perform operations on Tablestore. For more information, see Create an AccessKey pair.
A data table is created. For more information, see Operations on a data table.
A search index is created for the data table. For more information, see Create a search index.
A vector field is specified when you create the search index.
An OTSClient instance is initialized if you want to use Tablestore SDKs to perform operations on Tablestore. For more information, see Initialize an OTSClient instance.

Use the Tablestore console

1. Go to the Indexes tab.
  1. Log on to the Tablestore console.
  2. In the top navigation bar, select a resource group and a region.
  3. On the Overview page, click the name of the instance that you want to manage or click Manage Instance in the Actions column of the instance.
  4. On the Tables tab of the Instance Details tab, click the name of the data table or click Indexes in the Actions column of the data table.
2. On the Indexes tab, find the search index that you want to use to query data and click Manage Data in the Actions column.
3. In the Search dialog box, specify query conditions.
  1. By default, the system returns all attribute columns. To return specific attribute columns, turn off All Columns and specify the attribute columns that you want to return. Separate multiple attribute columns with commas (,).
    Note
    By default, the system returns all primary key columns of the data table.
  2. Select the And, Or, or Not logical operator based on your business requirements.
    If you select the And logical operator, data that meets all query conditions is returned. If you select the Or logical operator and specify a single query condition, data that meets the query condition is returned. If you select the Or logical operator and specify multiple query conditions, data that meets one of the query conditions is returned. If you select the Not logical operator, data that does not meet the query conditions is returned.
  3. Select a vector field and click Add.
  4. Set the Query Type parameter to KNN Vector Query (KnnVectorQuery) and enter the vector that you want to query and the value of the topK parameter.
    Enter a vector in the valid format as prompted.
  5. By default, the sorting feature is disabled. If you want to sort the query results based on specific fields, turn on Sort and specify the fields based on which you want to sort the query results and the sorting order.
  6. By default, the aggregation feature is disabled. If you want to collect statistics on a specific field, turn on Collect Statistics, specify the field based on which you want to collect statistics, and then configure the information that is required to collect statistics.
4. Click OK.
  Data that meets the query conditions is displayed in the specified order on the Indexes tab.

Use Tablestore SDKs

Note

The KNN vector query feature is supported by Tablestore SDK for Java V5.17.0 and later, Tablestore SDK for Go of the latest version, Tablestore SDK for Python V5.4.4 and later, and Tablestore SDK for Node.js V5.5.0 and later.

You can use Tablestore SDK for Java, Tablestore SDK for Go, Tablestore SDK for Python, or Tablestore SDK for Node.js to use the KNN vector query feature. In this example, Tablestore SDK for Java is used.

The following sample code shows how to query the top 10 vectors in a table that have the highest similarity to the specified vector. In this example, the top 10 vectors must meet the following query conditions: the value of the Col_Keyword column is hangzhou and the value of the Col_Long column is less than 4.

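// Imports that this method needs. Place them at the top of the enclosing class file.
import com.alicloud.openservices.tablestore.SyncClient;
import com.alicloud.openservices.tablestore.model.search.SearchHit;
import com.alicloud.openservices.tablestore.model.search.SearchQuery;
import com.alicloud.openservices.tablestore.model.search.SearchRequest;
import com.alicloud.openservices.tablestore.model.search.SearchResponse;
import com.alicloud.openservices.tablestore.model.search.query.KnnVectorQuery;
import com.alicloud.openservices.tablestore.model.search.query.QueryBuilders;
import com.alicloud.openservices.tablestore.model.search.sort.ScoreSort;
import com.alicloud.openservices.tablestore.model.search.sort.Sort;
import java.util.Arrays;
import java.util.Collections;

private static void knnVectorQuery(SyncClient client) {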
    SearchQuery searchQuery = new SearchQuery();
    KnnVectorQuery query = new KnnVectorQuery();
    query.setFieldName("Col_Vector");
    query.setTopK(10); // Return the top 10 vectors in the table that have the highest similarity to the specified vector.
    query.setFloat32QueryVector(new float[]{0.1f, 0.2f, 0.3f, 0.4f});
    // Specify the query conditions for the top 10 vectors: the value of the Col_Keyword column is hangzhou and the value of the Col_Long column is less than 4. 
    query.setFilter(QueryBuilders.bool()
            .must(QueryBuilders.term("Col_Keyword", "hangzhou"))
            .must(QueryBuilders.range("Col_Long").lessThan(4))
    );
    searchQuery.setQuery(query);
    searchQuery.setLimit(10);
    // Sort the query results based on scores. 
    searchQuery.setSort(new Sort(Collections.singletonList(new ScoreSort())));
    SearchRequest searchRequest = new SearchRequest("<TABLE_NAME>", "<SEARCH_INDEX_NAME>", searchQuery);
    SearchRequest.ColumnsToGet columnsToGet = new SearchRequest.ColumnsToGet();
    columnsToGet.setColumns(Arrays.asList("Col_Keyword", "Col_Long"));
    searchRequest.setColumnsToGet(columnsToGet);
    // Call the Search operation. 
    SearchResponse resp = client.search(searchRequest);
    for (SearchHit hit : resp.getSearchHits()) {
        // Display the scores. 
        System.out.println(hit.getScore());
        // Display the data. 
        System.out.println(hit.getRow());
    }
}

References

When you use a search index to query data, you can use the following query methods: term query, terms query, match all query, match query, match phrase query, prefix query, range query, wildcard query, fuzzy query, Boolean query, geo query, nested query, KNN vector query, and exists query. You can select query methods based on your business requirements to query data from multiple dimensions.
You can sort or paginate rows that meet the query conditions by using the sorting and paging features. For more information, see Perform sorting and paging.
You can use the collapse (distinct) feature to collapse the result set based on a specific column. This way, data of the specified type appears only once in the query results. For more information, see Collapse (distinct).
If you want to analyze data in a data table, you can use the aggregation feature of the Search operation or execute SQL statements. For example, you can obtain the minimum and maximum values, sum, and total number of rows. For more information, see Aggregation and SQL query.
If you want to obtain all rows that meet the query conditions without the need to sort the rows, you can call the ParallelScan and ComputeSplits operations to use the parallel scan feature. For more information, see Parallel scan.

Appendix 1: Combination of KNN vector query and Boolean query

You can use KNN vector query and Boolean query in various combinations. The query performance varies based on the combination that you use. In the example in this section, a small amount of data meets the filter conditions.

In this example, 100 million images are stored in a table and 50,000 of them belong to User A. Of these 50,000 images, 50 were stored in the previous seven days. User A wants to find the 10 images among these 50 images that have the highest similarity to a specified image. The following table describes two common combinations of KNN vector query and Boolean query that User A can use to meet the query requirements.

Combination	Query condition	Description
Boolean query in the filter of KNN vector query	The rows that meet the query conditions of KNN vector query are the top K rows that meet the query conditions of Boolean query. The top K rows have the highest similarity to the vector that User A wants to query. The number of rows in the response to SearchRequest is determined based on the value of the Size parameter. The Size parameter specifies the number of rows that User A wants to return in order of similarity from high to low among the top K rows.	When this combination is used, a filter of KNN vector query is used to obtain all images of User A that were stored in the previous seven days, which are 50 images. Then, the top 10 images that have the highest similarity to the image that User A wants to query among the 50 images are found and returned to User A.
KNN vector query in Boolean query	Each subquery condition of Boolean query is matched first, and then the intersection of the results of all subqueries is calculated.	When this combination is used, KNN vector query returns the top 500 images that have the highest similarity to the image that User A wants to query among 100 million images. Then, a term query and a range query are performed to find the images of User A that were stored in the previous seven days among the 500 images. The top 500 images may not include all 50 images of User A that were stored in the previous seven days. Therefore, User A may fail to obtain the 10 images that have the highest similarity to the specified image among the 50 images. In extreme cases, User A may obtain no images.
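
The difference between the two combinations can be reproduced on a small in-memory dataset. The following sketch is a simplified model of the semantics described above, not of how the engine executes a query. All names (`KnnFilterOrder`, `Item`, `filterInsideKnn`, `knnInsideBool`) are hypothetical, and each item stores a precomputed distance to the query vector instead of a real vector.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Toy model of the two combinations; hypothetical names, not a Tablestore API. */
class KnnFilterOrder {

    static final class Item {
        final int id;
        final double distance;       // distance to the query vector; smaller is more similar
        final boolean matchesFilter; // whether the row meets the Boolean query conditions

        Item(int id, double distance, boolean matchesFilter) {
            this.id = id;
            this.distance = distance;
            this.matchesFilter = matchesFilter;
        }
    }

    /** Returns the k candidates that are nearest to the query vector. */
    static List<Item> nearest(List<Item> candidates, int k) {
        List<Item> sorted = new ArrayList<>(candidates);
        sorted.sort(Comparator.comparingDouble((Item item) -> item.distance));
        return new ArrayList<>(sorted.subList(0, Math.min(k, sorted.size())));
    }

    static List<Item> applyFilter(List<Item> items) {
        List<Item> kept = new ArrayList<>();
        for (Item item : items) {
            if (item.matchesFilter) {
                kept.add(item);
            }
        }
        return kept;
    }

    /** Boolean query in the filter of KNN vector query: filter first, then take the top K. */
    static List<Item> filterInsideKnn(List<Item> all, int topK) {
        return nearest(applyFilter(all), topK);
    }

    /** KNN vector query in Boolean query: take the top K of all rows, then intersect. */
    static List<Item> knnInsideBool(List<Item> all, int topK) {
        return applyFilter(nearest(all, topK));
    }

    /** 1,000 rows; row i has distance i, and only rows 99, 199, ..., 999 match the filter. */
    static List<Item> sampleData() {
        List<Item> items = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            items.add(new Item(i, i, i % 100 == 99));
        }
        return items;
    }

    public static void main(String[] args) {
        List<Item> data = sampleData();
        System.out.println(filterInsideKnn(data, 3).size()); // 3 (rows 99, 199, 299)
        System.out.println(knnInsideBool(data, 150).size()); // 1 (only row 99)
        System.out.println(knnInsideBool(data, 50).size());  // 0
    }
}
```

In this toy data, only a small fraction of rows match the filter, which mirrors the scenario in this appendix. The second combination returns fewer rows than requested, or none, even though enough matching rows exist.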

Appendix 2: Sample vector normalization

The following sample code shows how to apply L2 normalization to a vector:

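  /**
   * Normalizes a vector in place to unit L2 length and returns the same array.
   * If the sum of squares is 0, throws an exception when throwOnZero is true;
   * otherwise, returns the vector unchanged.
   */
  public static float[] l2normalize(float[] v, boolean throwOnZero) {
    double squareSum = 0.0;
    int dim = v.length;
    for (float x : v) {
      // Accumulate in double precision to reduce rounding errors.
      squareSum += (double) x * x;
    }
    if (squareSum == 0) {
      if (throwOnZero) {
        throw new IllegalArgumentException("can't normalize a zero-norm vector");
      } else {
        return v;
      }
    }
    double length = Math.sqrt(squareSum);
    for (int i = 0; i < dim; i++) {
      v[i] /= length;
    }
    return v;
  }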