This topic describes how to configure vector indexes when you create a table. Use these configurations to meet your business requirements for performance, cost, and real-time capabilities.
Parameter configurations
In step 4 of table creation, configure the Index Schema. You can configure vector fields in detail at this step.

Vector dimension
Specifies the number of features in a vector. Set this value to match the output dimension of your vector model exactly.
Recommendations:
Keep consistent: If the configured dimension does not match the actual vector data, index building fails.
Performance impact: Higher dimensions capture more information but increase memory usage and computational overhead. Doubling the dimension roughly doubles memory usage.
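To reason about the memory impact of a dimension choice, a back-of-the-envelope estimate helps. The sketch below assumes 4-byte float32 storage and ignores index-structure overhead; the function name is illustrative, not part of any product API:

```python
def index_memory_bytes(num_vectors: int, dim: int, bytes_per_value: int = 4) -> int:
    """Rough raw-vector memory estimate, assuming float32 (4-byte) values.

    Index structures (graphs, quantization tables) add overhead on top of this,
    so treat the result as a lower bound.
    """
    return num_vectors * dim * bytes_per_value

# 1 million 768-dimensional float32 vectors: about 2.86 GiB of raw vector data.
gib = index_memory_bytes(1_000_000, 768) / 1024**3
```

Note that doubling `dim` doubles the estimate, consistent with the guidance above.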
Distance type
Specifies how to compute similarity between vectors. Choose the distance type that best fits your data characteristics and business scenario. The choice directly affects retrieval quality.
Selection guide:
Cosine distance (Cosine): The score ranges from [-1, 1]. A higher score means higher similarity. A score of 1 means identical vectors, and a score of -1 means opposite vectors.
Inner product distance (InnerProduct): A higher score means higher similarity.
Squared Euclidean distance (SquaredEuclidean): A lower score means higher similarity. A score of 0 means identical vectors.
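The score semantics above can be verified with a few lines of plain Python. These are just the mathematical formulas, not how the service computes scores internally:

```python
import math

def cosine_score(a, b):
    """Cosine similarity: dot product divided by the product of the norms."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def inner_product(a, b):
    """Inner product: higher means more similar."""
    return sum(x * y for x, y in zip(a, b))

def squared_euclidean(a, b):
    """Squared Euclidean distance: lower means more similar."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

v = [1.0, 2.0, 3.0]
cosine_score(v, v)                  # ~1.0: identical vectors
cosine_score(v, [-x for x in v])    # ~-1.0: opposite vectors
squared_euclidean(v, v)             # 0.0: identical vectors
```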
Vector index algorithm
Specifies the underlying algorithm used to build vector indexes. Each algorithm balances build speed, memory usage, query performance, and recall rate differently.
Selection guide:
FLAT (formerly Linear)
Description: Exact brute-force search with 100% recall.
Distance types: InnerProduct, SquaredEuclidean, Cosine
Data scale: Very small (up to tens of thousands of vectors)
Recall: 100% (exact)
Latency: Very slow
RAM usage: Very low
Scenarios: Benchmarking. Exact ranking for very small datasets.

HNSW
Description: The performance benchmark. For strict requirements on accuracy and latency.
Distance types: InnerProduct, SquaredEuclidean, Cosine
Data scale: Medium (up to tens of millions of vectors)
Recall: Very high
Latency: Low
RAM usage: Very high
Scenarios: High-performance in-memory online search.

HNSW_RaBitQ
Description: Optimized for massive datasets under strict memory constraints. For lower accuracy requirements.
Distance types: SquaredEuclidean
Data scale: Medium to large (hundreds of millions to one billion vectors)
Recall: High
Latency: Very low
RAM usage: Very low
Scenarios: Lightweight search optimized for binary quantization.

CagraHNSW
Description: GPU-accelerated graph indexing. Works best with multiple GPUs for large-scale workloads.
Distance types: InnerProduct, SquaredEuclidean
Data scale: Medium to large (up to hundreds of millions of vectors)
Recall: Very high
Latency: Very low (GPU)
RAM usage: Very high
Scenarios: GPU acceleration. Extremely high throughput.

HNSW_SQ (formerly QGraph)
Description: High query speed and performance. For lower accuracy requirements.
Distance types: InnerProduct, SquaredEuclidean, Cosine
Data scale: Large (up to billions of vectors)
Recall: High
Latency: Low
RAM usage: High

IVF_SQ8
Description: Balanced trade-off. For moderate requirements on both accuracy and latency.
Distance types: InnerProduct, SquaredEuclidean, Cosine
Data scale: Large (up to 500 million vectors)
Recall: Medium to high
Latency: Medium
RAM usage: Low
Scenarios: Cold-hot tiered storage for budget-constrained, large-scale workloads. Reduces memory usage by compressing vectors. A classic balance of cost and scale.

DiskANN
Description: Uses local disks. For workloads that tolerate higher latency and need minimal memory.
Distance types: InnerProduct, SquaredEuclidean, Cosine
Data scale: Massive (billions of vectors or more)
Recall: High
Latency: Medium to high
RAM usage: Very low
Scenarios: Disk-resident, ultra-large-scale search.
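As a first pass, the selection guide can be condensed into a small decision sketch. The function and thresholds below are illustrative simplifications of the guidance above, not an official API; always validate against your own workload:

```python
def suggest_algorithm(num_vectors: int, gpu: bool = False,
                      low_memory: bool = False, on_disk: bool = False) -> str:
    """Condense the selection guide into a first-pass recommendation.

    Thresholds are illustrative approximations of the table above.
    """
    if num_vectors < 10_000:
        return "FLAT"          # exact brute-force search, 100% recall
    if on_disk or num_vectors >= 1_000_000_000:
        return "DiskANN"       # disk-resident, minimal memory, higher latency
    if gpu:
        return "CagraHNSW"     # GPU-accelerated graph index
    if low_memory:
        return "HNSW_RaBitQ"   # binary quantization, very low RAM
    if num_vectors >= 100_000_000:
        return "IVF_SQ8"       # compressed vectors, cost/scale balance
    return "HNSW"              # high recall, low latency, in memory
```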
Real-time indexing
Enables immediate indexing and querying of incremental data written via API. Data becomes visible within seconds.
How it works: The system first builds temporary in-memory indexes for real-time writes. When enough data accumulates, it merges those indexes with the full disk-based index.
Recommendations:
Enable (true): Use for online services where data must be immediately searchable. This uses extra memory and CPU resources.
Disable (false): Use for batch imports or offline analytics where updates are infrequent.
Advanced configurations
Threshold for linear building
If the number of documents in a shard is less than this threshold, the system uses Linear (brute-force scan) for search, even if you selected another vector index algorithm.
Recommendations:
Default: 5000. This is an empirical value. At this scale, brute-force search often performs as well as or better than building a complex index.
Adjust only if needed: Usually, leave this unchanged. If your query concurrency is very high and your data volume is near this threshold, lower the value to force use of high-performance indexes such as HNSW. Note that this may increase index-building overhead.
Ignore invalid vector data
Controls how the system handles abnormal vectors—such as mismatched dimensions or empty values—during full or incremental index building.
Recommendations:
true: Skips rows with invalid vectors; index building continues and a warning appears in the logs. Recommended for development and testing, where it helps you debug quickly without letting a few dirty records break the entire job.
false: Stops index building and returns an error on any invalid vector. Recommended for production environments, where it ensures data quality and prevents silent data loss. Use with upstream data cleaning workflows.
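Whichever option you choose, catching bad rows before upload is cheaper than failing a build. A minimal client-side check might look like the following; the helper is illustrative and not part of any SDK:

```python
import math

def split_valid_vectors(rows, dim, skip_invalid=True):
    """Separate well-formed vectors from rows with a wrong dimension,
    empty values, or non-numeric/NaN entries.

    With skip_invalid=False, raise on the first bad row instead,
    mirroring the strict 'false' behavior described above.
    """
    valid, bad_rows = [], []
    for i, vec in enumerate(rows):
        ok = (
            vec is not None
            and len(vec) == dim
            and all(isinstance(x, (int, float)) and not math.isnan(x) for x in vec)
        )
        if ok:
            valid.append(vec)
        elif skip_invalid:
            bad_rows.append(i)   # skipped, like the service's warning log
        else:
            raise ValueError(f"row {i}: invalid vector {vec!r}")
    return valid, bad_rows
```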
Real-time indexing parameters
Tunes how real-time data streams are processed after real-time indexing is enabled.
Example: {"proxima.oswg.streamer.segment_size": 2048}
Explanation: The proxima.oswg.streamer.segment_size parameter controls how many records accumulate in memory before being flushed to a small in-memory segment.
Tuning recommendations:
High write QPS: Increase this value (for example, to 4096) to reduce the number of segments in memory and lower index management overhead. This slightly increases the delay between write and query availability.
Low write QPS: Keep the default value 2048, or decrease it slightly, to make newly written data available for queries faster.
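To pick a value, it helps to estimate how long one segment takes to fill at your write rate. The arithmetic below assumes a segment is flushed only when it reaches segment_size; real systems may also flush on a timer, so treat it as a rough upper bound:

```python
def segment_fill_seconds(segment_size: int, write_qps: float) -> float:
    """Approximate time for one in-memory segment to fill at a steady write rate."""
    return segment_size / write_qps

# Default 2048 at 500 writes/s: the buffer fills in about 4.1 seconds;
# doubling segment_size to 4096 doubles that window.
segment_fill_seconds(2048, 500)
```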
Real-time retrieval parameters
Dynamically adjusts search behavior per index algorithm to balance recall and latency. Keys and values depend on the selected vector index algorithm.
General note: These parameters usually control search scope. For example, with HNSW, the ef parameter sets how many neighbor nodes to traverse during search. A larger ef improves recall but increases latency.
Example (HNSW): {"searcher_name":"HNSW", "ef":200}
The ef value typically ranges from k (the number of top-K results requested) to 4096. Start testing at 100 and adjust based on your recall and latency requirements.
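When sweeping ef, you need a way to score each setting. A common metric is recall@K measured against exact (brute-force) results; a minimal helper for that comparison (illustrative, not part of the service API):

```python
def recall_at_k(approx_ids, exact_ids):
    """Fraction of the exact top-K results that the approximate search also returned."""
    return len(set(approx_ids) & set(exact_ids)) / len(exact_ids)

# Typical tuning loop: for ef in (100, 200, 400, ...), run a sample query set
# with that ef, compare against brute-force results, and stop at the smallest
# ef that meets your recall target within your latency budget.
recall_at_k([1, 2, 3, 5, 9], [1, 2, 3, 4, 5])  # 0.8
```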
Vector separator
Specifies the delimiter between vector dimensions in string-formatted vector data.
Example: In 1.05,0.15,0.14, the delimiter is a comma (,). This is the default. You rarely need to change it.
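Client-side parsing mirrors this setting; for example, in Python:

```python
def parse_vector(text: str, sep: str = ",") -> list:
    """Split a string-formatted vector on the configured separator."""
    return [float(part) for part in text.split(sep)]

parse_vector("1.05,0.15,0.14")  # [1.05, 0.15, 0.14]
```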