Overview
OpenSearch provides the vector search feature to meet diverse and complex business requirements. In specific scenarios, especially test question search and image search, you can use vector search together with the multimodal search feature to improve the accuracy of search results. This topic describes the syntax and usage notes of vector indexes.
Syntax
query = vector_index:'Vector'
Example: Use a 64-dimensional vector index to query data
vector: '0.377796,-0.958450,0.409853,-0.238177,-1.293826,0.356797,-0.295727,0.847301,-1.220337,0.148032,-1.128458,0.903187,0.509352,0.293686,-1.005852,-0.488839,0.888227,-0.555556,-0.658025,0.267552,-0.567601,0.003045,0.591734,-0.515983,-1.316453,-1.462450,0.091946,1.554954,0.384802,0.720498,0.144338,1.217826,0.724039,0.044212,0.571332,-1.425430,0.618965,0.481887,-1.617787,1.505416,-0.683652,1.030900,0.562021,0.162437,0.816546,0.112229,-0.739288,-0.342643,-0.199292,0.508368,-1.384887,-1.842170,0.952622,-1.699499,0.199430,-0.232464,-0.273227,-0.383696,-0.511302,0.005458,1.873572,-0.926169,-0.417587,-0.660156'
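The following Python sketch shows one way to assemble a query clause in the form shown above from a list of floating-point values. The function name build_vector_query and the fixed six-decimal formatting are assumptions made for illustration and are not part of OpenSearch.

```python
def build_vector_query(index_name, vector):
    """Return a clause of the form query=index_name:'v1,v2,...'."""
    # Six fixed decimals mirror the example values above and avoid
    # scientific notation for very small numbers (assumed formatting).
    values = ",".join(f"{x:.6f}" for x in vector)
    return f"query={index_name}:'{values}'"


# Shortened to three values for readability; a 64-dimensional
# vector index expects 64 values.
clause = build_vector_query("vector_index", [0.377796, -0.958450, 0.409853])
print(clause)
```

The index name is passed as an argument so that the same helper can be reused for any vector index.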
Examples
Syntax for a query that includes a specified threshold value
Description: excludes a vector from the retrieval process if its score is lower than the specified lower threshold or higher than the specified upper threshold.
Parameter format: &sf=number
Example:
query=vector_index:'0.1,0.2,0.98,0.6;0.3,0.4,0.98,0.6&sf=0.8'
Syntax for a query for top N vectors
Description: specifies the number of top-scoring vectors to return.
Parameter format: &n=number
Append the n parameter after the vector.
Example:
query=vector_index:'0.1,0.2,0.98,0.6;0.3,0.4,0.98,0.6&n=10'
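The threshold and top N examples above follow the same pattern: the vectors are separated by semicolons and the parameter is appended inside the quotation marks. The Python sketch below reproduces both strings. The helper name build_vector_clause is hypothetical, and the sketch appends only one parameter per clause because the examples show only that case.

```python
def build_vector_clause(index_name, vectors, param=None):
    """Build a vector clause with an optional (name, value) parameter."""
    # Values within a vector are comma-separated; vectors are
    # separated by semicolons, as in the examples above.
    body = ";".join(",".join(str(x) for x in v) for v in vectors)
    if param is not None:
        name, value = param
        # The parameter goes after the last vector, inside the quotes.
        body += f"&{name}={value}"
    return f"query={index_name}:'{body}'"


vectors = [[0.1, 0.2, 0.98, 0.6], [0.3, 0.4, 0.98, 0.6]]
threshold_query = build_vector_clause("vector_index", vectors, ("sf", 0.8))
top_n_query = build_vector_clause("vector_index", vectors, ("n", 10))
print(threshold_query)
print(top_n_query)
```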
Syntax for a query for sorting the results based on the scores of vectors
Description: uses the proxima_score() function in the sort expression to sort query results based on the scores of vectors.
Procedure:
Create a fine sort policy.
Note: Specify the name of the vector index as the parameter of the proxima_score function.
The following figure shows that the created fine sort policy is used to perform a test on the Search Test page.
Two types of vector distance are supported. By default, the system uses the Euclidean distance (l2).
Inner product distance (ip): The larger the vector score, the higher the document relevance.
Euclidean distance (l2): The smaller the vector score, the higher the document relevance.
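To illustrate why the two distance types rank documents in opposite directions, the Python sketch below scores a few toy vectors with both measures. It uses the textbook definitions of Euclidean distance and inner product; the exact score that the engine computes is not given here, so the numbers are illustrative only, and the function names and sample data are hypothetical.

```python
import math


def l2_distance(a, b):
    # Textbook Euclidean distance between two equal-length vectors.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def inner_product(a, b):
    return sum(x * y for x, y in zip(a, b))


def rank(query, docs, metric="l2"):
    """Return document IDs ordered from most to least relevant."""
    if metric == "l2":
        # Smaller distance means higher relevance: ascending order.
        return sorted(docs, key=lambda d: l2_distance(query, docs[d]))
    if metric == "ip":
        # Larger inner product means higher relevance: descending order.
        return sorted(docs, key=lambda d: inner_product(query, docs[d]), reverse=True)
    raise ValueError(f"unknown metric: {metric}")


query = [1.0, 0.0]
docs = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [0.5, 0.5], "d": [3.0, 0.0]}
print(rank(query, docs, "l2"))
print(rank(query, docs, "ip"))
```

In this toy data, document d points in the same direction as the query but is three times as long, so it ranks first by inner product and last by Euclidean distance.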
Usage notes
By default, the system uses the Euclidean distance (l2) as the vector distance type when a vector index is built. To use the inner product distance (ip), you must normalize the vectors before you pass the data to the engine.
Vector indexes are applicable only to fields of the DOUBLE_ARRAY type.
OpenSearch supports 64-dimensional, 128-dimensional, 256-dimensional, and 512-dimensional vector analyzers. A field of the DOUBLE_ARRAY type must contain exactly 64, 128, 256, or 512 elements to match the corresponding vector analyzer.
The maximum length of a vector index is 4 KB before encoding. A query supports up to two vector indexes.
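The usage notes above can be checked on the client side before data or queries are sent. The Python sketch below validates the dimension, normalizes a vector to unit length for the inner product case, and checks a clause against the size limit. All function names are hypothetical. The sketch also assumes that 4 KB means 4,096 bytes and that the limit applies to the UTF-8 text of the vector clause before URL encoding; neither detail is stated above.

```python
import math

SUPPORTED_DIMENSIONS = (64, 128, 256, 512)
MAX_CLAUSE_BYTES = 4 * 1024  # assumption: 4 KB = 4,096 bytes


def check_dimension(vector):
    """Raise ValueError unless the vector matches a supported analyzer."""
    if len(vector) not in SUPPORTED_DIMENSIONS:
        raise ValueError(
            f"expected {SUPPORTED_DIMENSIONS} elements, got {len(vector)}"
        )
    return len(vector)


def normalize(vector):
    """Scale the vector to unit length (L2 norm of 1)."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vector]


def fits_size_limit(clause):
    # Assumption: the limit is measured on the raw clause text,
    # before any URL encoding is applied.
    return len(clause.encode("utf-8")) <= MAX_CLAUSE_BYTES


vector = [1.0] * 64
check_dimension(vector)
unit = normalize(vector)
print(unit[0], fits_size_limit(",".join(str(x) for x in unit)))
```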