
OpenSearch: Common configurations of vector indexes

Last Updated: Feb 27, 2024

When you create a table, you can specify advanced configurations for vector indexes in the Index Schema step. This topic describes the parameters of the advanced configurations.

When you create a table, configure the index schema in the Index Schema step.

image.png

The following figure shows the parameters for advanced configurations.

image.png

The following table describes the parameters.

Parameter

Valid value

Description

Vector Dimension

N/A

The number of dimensions (features or attributes) in each vector. The dimension determines how much information a vector can represent. Set this parameter to the dimension of the vectors that your vector model generates.

Distance Type

  • SquareEuclidean

  • InnerProduct

The type of the distance that is used to calculate the vector similarity. If you set this parameter to SquareEuclidean, a smaller vector score indicates that the vector is more relevant.

If you set this parameter to InnerProduct, a greater vector score indicates that the vector is more relevant.

Vector Index Algorithm

  • Qc

  • HNSW

  • Linear

The vector indexing algorithm. For more information, see Introduction to vectors.

Real-time Indexing

  • true

  • false

Specifies whether to enable real-time indexing. If you set this parameter to true, the OpenSearch Vector Search Edition instance builds indexes for the data that you push by calling API operations, and you can query the data in real time.

Real-time Indexing Parameters

{"proxima.oswg.streamer.segment_size":2048}

The parameters for real-time indexing. We recommend that you use the default value.

Index Retrieval Parameters

N/A

The parameters for real-time retrieval. You must configure this parameter based on the vector indexing algorithm that you select. For more information, see the topics for the individual vector indexing algorithms.

Vector Separator

Customizable

The delimiter that separates the values of individual dimensions in a vector during vector retrieval. For example, a comma (,) is used as the delimiter in vector:'1.05066,0.15610,0.156145...'.

Threshold for Linear Building

Default value: 5000

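The document-count threshold for linear building. If the number of documents is less than this value, a linear index is built instead of an index that uses the configured vector index algorithm. With the default value of 5000, a linear index is built if the number of documents is less than 5,000.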

Ignore Invalid Vector Data

  • true

  • false

Specifies whether to ignore invalid vector data. If you set this parameter to true, the system continues to build indexes for full data or batch incremental data as expected even if the dimension of a vector is invalid or the vector data is empty.
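
The following Python sketch illustrates how the Vector Separator, Vector Dimension, Ignore Invalid Vector Data, and Distance Type parameters relate to each other. It is a client-side illustration only and does not call OpenSearch. The function names (parse_vector, rank) and the sample data are hypothetical, and the check for invalid vectors is an assumption based on the description above, not the actual implementation of the service.

```python
from typing import Dict, List, Optional


def parse_vector(text: str, separator: str = ",",
                 dimension: Optional[int] = None) -> Optional[List[float]]:
    """Split a vector string on the separator.

    Returns None if the string is empty, contains a non-numeric value,
    or does not match the expected dimension (assumed meaning of
    "invalid vector data" for this sketch).
    """
    if not text.strip():
        return None
    try:
        values = [float(part) for part in text.split(separator)]
    except ValueError:
        return None
    if dimension is not None and len(values) != dimension:
        return None
    return values


def square_euclidean(a: List[float], b: List[float]) -> float:
    """Squared Euclidean distance: smaller means more relevant."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def inner_product(a: List[float], b: List[float]) -> float:
    """Inner product: greater means more relevant."""
    return sum(x * y for x, y in zip(a, b))


def rank(query: List[float], docs: Dict[str, List[float]],
         distance_type: str) -> List[str]:
    """Return document IDs ordered from most to least relevant."""
    if distance_type == "SquareEuclidean":
        scores = {doc_id: square_euclidean(query, v) for doc_id, v in docs.items()}
        return sorted(scores, key=scores.get)                # ascending score
    if distance_type == "InnerProduct":
        scores = {doc_id: inner_product(query, v) for doc_id, v in docs.items()}
        return sorted(scores, key=scores.get, reverse=True)  # descending score
    raise ValueError(f"unsupported distance type: {distance_type}")


# Hypothetical sample data: "c" is empty and "d" has the wrong dimension.
raw_docs = {"a": "0.9,0.1", "b": "3.0,0.0", "c": "", "d": "1.0,2.0,3.0"}

# Mimic ignoring invalid vector data: skip entries that fail to parse.
docs = {}
for doc_id, text in raw_docs.items():
    vector = parse_vector(text, separator=",", dimension=2)
    if vector is not None:
        docs[doc_id] = vector

query = parse_vector("1.0,0.0", separator=",", dimension=2)
by_distance = rank(query, docs, "SquareEuclidean")
by_product = rank(query, docs, "InnerProduct")
```

With this sample data, the two distance types order the valid documents differently: document a is closest to the query under SquareEuclidean, whereas document b has the greatest inner product with the query. Choose the distance type that matches how your vector model was trained.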