TairVector performance test and results - Tair (Redis® OSS-Compatible)

TairVector is an in-house data structure of Tair (Enterprise Edition) that provides high-performance real-time storage and retrieval of vectors. This topic describes the method used to test the performance of TairVector and provides the test results obtained by Alibaba Cloud.

TairVector supports the approximate nearest neighbor (ANN) search algorithm. You can use TairVector for semantic retrieval of unstructured data and personalized recommendations. For more information, see Vector.

Test description

Database test environment

Item	Description
Region and zone	Zone A in the China (Zhangjiakou) region
Storage type	DRAM-based instance that runs Redis 6.0
Engine version	6.2.8.2
Instance architecture	Standard master-replica architecture for which the cluster mode is disabled. For more information, see Standard architecture.
Instance type	tair.rdb.16g. The instance type has little impact on the test results.

Client test environment

An Elastic Compute Service (ECS) instance is created in the same virtual private cloud (VPC) as the Tair (Redis OSS-compatible) instance and connects to the Tair instance over the VPC.
The Linux operating system is used.
Python 3.7 or later is installed.

Test data

The Sift-128-euclidean, Gist-960-euclidean, Glove-200-angular, and Deep-image-96-angular datasets are used to test the Hierarchical Navigable Small World (HNSW) indexing algorithm. The Random-s-100-euclidean and Mnist-784-euclidean datasets are used to test the Flat Search indexing algorithm.

Dataset	Description	Vector dimension	Number of vectors	Number of queries	Data volume	Distance formula
Sift-128-euclidean	Image feature vectors that are generated by using the Texmex dataset and the scale-invariant feature transform (SIFT) algorithm.	128	1,000,000	10,000	488 MB	L2
Gist-960-euclidean	Image feature vectors that are generated by using the Texmex dataset and the GIST global image descriptor algorithm.	960	1,000,000	1,000	3.57 GB	L2
Glove-200-angular	Word vectors that are generated by applying the GloVe algorithm to the text data from the Internet.	200	1,183,514	10,000	902 MB	COSINE
Deep-image-96-angular	Vectors that are extracted from the output layer of the GoogLeNet neural network with the ImageNet training dataset.	96	9,990,000	10,000	3.57 GB	COSINE
Random-s-100-euclidean	Randomly generated vectors.	100	90,000	10,000	34 MB	L2
Mnist-784-euclidean	A dataset from the Modified National Institute of Standards and Technology (MNIST) database of handwritten digits.	784	60,000	10,000	179 MB	L2
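
The values in the Data volume column are consistent with each vector component being stored as a 4-byte float32 value. The following minimal sketch makes that arithmetic explicit. It assumes binary units (1 MB = 1024² bytes) and ignores file and index overhead; the function name is illustrative, and the results can differ from the table in the last digit because of rounding.

```python
def raw_size_bytes(num_vectors: int, dim: int, bytes_per_component: int = 4) -> int:
    """Raw size of a dataset of dense vectors (float32 components by default)."""
    return num_vectors * dim * bytes_per_component


# (number of vectors, vector dimension) taken from the table above.
datasets = {
    "sift-128-euclidean": (1_000_000, 128),
    "gist-960-euclidean": (1_000_000, 960),
    "glove-200-angular": (1_183_514, 200),
    "deep-image-96-angular": (9_990_000, 96),
    "random-s-100-euclidean": (90_000, 100),
    "mnist-784-euclidean": (60_000, 784),
}

for name, (num_vectors, dim) in datasets.items():
    size = raw_size_bytes(num_vectors, dim)
    print(f"{name}: {size / 1024**2:.0f} MB ({size / 1024**3:.2f} GB)")
```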

Test tools and methods

Install tair-py and hiredis on the test server.
Run the following command to install tair-py and hiredis:
```
pip install tair hiredis
```
Download and decompress Ann-benchmarks.
Run the following command to decompress Ann-benchmarks:
```
tar -zxvf ann-benchmarks.tar.gz
```
Configure the endpoint, port number, username, and password of the Tair instance in the algos.yaml file.
Open the algos.yaml file, search for tairvector to find the relevant configuration items, and then configure the following parameters of base-args:
- url: the connection URL, which contains the username, password, endpoint, and port number of the Tair instance. Format: redis://user:password@host:port.
- parallelism: the number of concurrent threads. Default value: 4. We recommend that you use the default value.
Example:
```
{"url": "redis://testaccount:Rp829dlwa@r-bp18uownec8it5****.redis.rds.aliyuncs.com:6379", "parallelism": 4}
```
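
Before you start a long test run, it can help to confirm that the url value matches the redis://user:password@host:port format. The following sketch uses only the Python standard library. The parse_tair_url helper and the example-host endpoint are illustrative placeholders, not part of Ann-benchmarks or tair-py.

```python
from urllib.parse import urlsplit


def parse_tair_url(url: str) -> dict:
    """Split a redis://user:password@host:port URL into its parts and validate them."""
    parts = urlsplit(url)
    if parts.scheme != "redis":
        raise ValueError(f"unexpected scheme: {parts.scheme!r}")
    # parts.port raises ValueError by itself if the port is not a valid number.
    if not (parts.username and parts.password and parts.hostname and parts.port):
        raise ValueError("url must contain user, password, host, and port")
    return {
        "host": parts.hostname,
        "port": parts.port,
        "username": parts.username,
        "password": parts.password,
    }


# Placeholder credentials and endpoint; replace them with your own values.
cfg = parse_tair_url("redis://user:password@example-host:6379")
print(cfg["host"], cfg["port"])
```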

Run the run.py script to start the test.

Important

Running the run.py script starts the entire test: the script creates an index, writes data to the index, and then runs queries and records the results. Do not run the script more than once on the same dataset.

Example:

```
# Run a multi-threaded test by using the Sift-128-euclidean dataset and HNSW indexing algorithm.
python run.py --local --runs 3 --algorithm tairvector-hnsw --dataset sift-128-euclidean --batch
# Run a multi-threaded test by using the Mnist-784-euclidean dataset and Flat Search indexing algorithm.
python run.py --local --runs 3 --algorithm tairvector-flat --dataset mnist-784-euclidean --batch
```

You can also use the built-in web frontend to execute the test. Example:

```
# Install the Streamlit dependency in advance.
pip3 install streamlit
# Start the web frontend. Then, you can enter http://localhost:8501 in your browser.
streamlit run webrunner.py
```

Run the data_export.py script and export the results.

Example:

```
# Multiple threads.
python data_export.py --output out.csv --batch
```

Test results

We recommend that you focus on three aspects of the test results: write performance, k-nearest neighbor (kNN) query performance, and memory efficiency.

Write performance: Write performance is measured by write throughput. The higher the write throughput, the better the write performance of TairVector.
kNN query performance: The number of queries per second (QPS) reflects the system performance, and the recall rate reflects the accuracy of the results. Typically, the higher the recall rate, the lower the QPS. A QPS comparison is meaningful only at the same recall rate. For this reason, the test results are presented as "QPS vs. Recall" curves. For FLAT indexes, only QPS is presented because the recall rate is always 1.
Memory efficiency: The lower the memory usage of vector indexes, the better the performance of TairVector.
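
The recall rate mentioned above can be illustrated with a small, self-contained sketch. It is not the Ann-benchmarks implementation. It assumes that recall is the fraction of the true k nearest neighbors that a query returns, and it uses a brute-force search to obtain the true neighbors. A brute-force search ranks every vector, which is why a flat index always reaches a recall rate of 1. The function names and the sample vectors are illustrative.

```python
import math


def l2_distance(a, b):
    """Euclidean (L2) distance between two vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def exact_knn(vectors, query, k):
    """Brute-force search: rank every vector by distance and keep the k closest IDs."""
    order = sorted(range(len(vectors)), key=lambda i: l2_distance(vectors[i], query))
    return order[:k]


def recall_at_k(returned_ids, true_ids):
    """Fraction of the true nearest neighbors that appear in the returned result."""
    true_set = set(true_ids)
    if not true_set:
        raise ValueError("true_ids must not be empty")
    return len(true_set.intersection(returned_ids)) / len(true_set)


vectors = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [3.0, 3.0]]
query = [0.1, 0.1]

truth = exact_knn(vectors, query, k=2)
approx = [0, 2]  # hypothetical ANN result that misses one true neighbor

print(recall_at_k(truth, truth))   # exact search compared with itself
print(recall_at_k(approx, truth))  # approximate result compared with the truth
```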

Note

Both write and kNN query tests involve four concurrent threads.

In this example, the performance of TairVector is tested with the float32 and float16 data types. The default data type is float32. The performance of the HNSW indexing algorithm is tested with the AUTO_GC feature enabled.

HNSW indexes

Write performance

The following figures show the write performance of the HNSW indexing algorithm at different values of the M parameter when ef_construct is set to 500. The M parameter specifies the maximum number of outgoing neighbors on each layer in a graph index structure.

The write performance of the HNSW indexing algorithm decreases as the value of the M parameter increases.
Compared with the float32 data type, the write performance of the HNSW indexing algorithm slightly decreases in most cases when the float16 data type is used.
After the AUTO_GC feature is enabled, the write performance of the HNSW indexing algorithm increases by up to 30%.

[Figure: write performance of HNSW indexes at different values of M]

kNN query performance

The higher the recall rate and QPS, the better the kNN query performance. Therefore, the closer the curve is to the upper-right corner, the better the performance of the HNSW indexing algorithm.
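
One way to read such a curve is as a Pareto frontier: a parameter setting belongs on the curve only if no other setting offers both a higher recall rate and a higher QPS. The following sketch shows the idea on (recall, QPS) pairs. The sample values are made up for illustration and are not measurements from this test, and the function is not taken from Ann-benchmarks.

```python
def pareto_frontier(points):
    """Return the (recall, qps) points that no other point beats on both axes."""
    frontier = []
    best_qps = float("-inf")
    # Scan from the highest recall downward. A point is kept only if its QPS is
    # higher than that of every point with an equal or higher recall rate.
    for recall, qps in sorted(points, key=lambda p: (-p[0], -p[1])):
        if qps > best_qps:
            frontier.append((recall, qps))
            best_qps = qps
    return sorted(frontier)


# Made-up (recall, QPS) pairs, one per hypothetical parameter setting.
points = [(0.90, 5000), (0.95, 4000), (0.95, 3000), (0.99, 1500), (0.97, 1200)]

for recall, qps in pareto_frontier(points):
    print(f"recall={recall:.2f} qps={qps}")
```

In this sample, the settings (0.95, 3000) and (0.97, 1200) are dropped because another setting is at least as good on both axes.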

The following figures show the "QPS vs. Recall" curves when HNSW indexes are used with different datasets.

For all four datasets, HNSW indexes can achieve a recall rate of more than 99%.
Compared with the float32 data type, the performance of the HNSW indexing algorithm slightly decreases when the float16 data type is used. The performance of these two data types is extremely close.
After the AUTO_GC feature is enabled, the kNN query performance significantly decreases. Therefore, we recommend that you enable the AUTO_GC feature only when you want to delete a large amount of data.

[Figure: "QPS vs. Recall" curves of HNSW indexes on the four datasets]

To visually present how parameter settings affect the kNN query performance, the following figures show how the QPS and recall rate change with the values of M and ef_search. In this example, the Sift-128-euclidean dataset and the float32 data type are used, and the AUTO_GC feature is disabled.

As the values of M and ef_search increase, the QPS decreases and the recall rate increases.

Note

You can modify the relevant parameters based on your business requirements to balance the kNN query performance with the recall rate.

Memory efficiency

The memory usage of HNSW indexes increases in proportion to the value of the M parameter.

The following figures show the memory usage of HNSW indexes across different datasets.

Compared with the float32 data type, the float16 data type reduces the memory usage by more than 40%.
After the AUTO_GC feature is enabled, the memory usage slightly increases.
Note
You can determine an appropriate value for the M parameter based on the dimension of vectors and your memory capacity budget. If you can accept a certain loss of precision, we recommend that you use the float16 data type to save memory space.
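
The trade-off between memory and precision can be seen on a single vector. The following sketch uses the struct module from the Python standard library to encode the same values as float32 and float16. It only illustrates the two data types and does not reflect how TairVector stores vectors internally. The sample values are arbitrary.

```python
import struct


def pack_vector(values, fmt):
    """Pack floats as little-endian float32 (fmt='f') or float16 (fmt='e')."""
    return struct.pack(f"<{len(values)}{fmt}", *values)


vector = [0.1, 0.25, -1.5, 3.14159]

as_float32 = pack_vector(vector, "f")
as_float16 = pack_vector(vector, "e")
print(len(as_float32), len(as_float16))  # bytes used by each encoding

# Decode the float16 bytes to see how much precision was lost.
restored = struct.unpack(f"<{len(vector)}e", as_float16)
max_error = max(abs(a - b) for a, b in zip(vector, restored))
print(max_error)
```

The raw vector data is halved, whereas the measured reduction for the whole index is stated only as more than 40%. A likely explanation, stated here as an assumption, is that parts of the index other than the vector components do not shrink with the data type.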

FLAT indexes

Write performance

The following figure shows the write throughput of FLAT indexes on the two datasets.

Compared with the float32 data type, the write performance of FLAT indexes decreases by approximately 5% when the float16 data type is used.

[Figure: write throughput of FLAT indexes]

kNN query performance

The following figure shows the kNN QPS of FLAT indexes on the two datasets.

Compared with the float32 data type, the kNN query performance of FLAT indexes increases by approximately 10% when the float16 data type is used.

[Figure: kNN QPS of FLAT indexes]

Memory efficiency

The following figure shows the memory usage of FLAT indexes on the two datasets.

Compared with the float32 data type, the float16 data type can reduce the memory usage by more than 40%.

[Figure: memory usage of FLAT indexes]