This topic describes the methods that are used to test the write and query performance of TairSearch and RediSearch, and provides the test results.
TairSearch is an in-house full-text search data structure of Tair. TairSearch uses query syntax that is similar to that of Elasticsearch to implement effective full-text search. For more information, see Search.
Test description
Client test environment
Item | Description |
Host of the client | An Elastic Compute Service (ECS) instance of the ecs.g7.8xlarge type. For more information, see Overview of instance families. |
Region and zone | Zone K in China (Hangzhou) |
Operating system | CentOS 7.9 64-bit |
Database test environment
A Tair database and a self-managed Redis database are hosted on the same ECS instance.
Item | Description |
Tair version | Tair DRAM-based instance that runs the minor version 5.0.30 and is compatible with Redis 5.0. |
Number of I/O threads | 4 |
CPU resource | 6 vCPUs. Sample command: |
Item | Description |
Redis version | 7.0.10 |
RediSearch version | 2.6.6. The database must have the CONCURRENT_WRITE_MODE parameter set to true. |
RedisJSON version | 2.4.6 |
Number of I/O threads | 4 |
CPU resource | 6 vCPUs. Sample command: |
Test data
The test data is a collection of articles in Chinese and a collection of articles in English from Wikimedia. For more information, see Index of /zhwiki/latest/ and Index of /enwiki/latest/.
Examples:
{
"id":"History_of_Pakistan",
"title":"History of Pakistan",
"url":"https://en.wikipedia.org/wiki/History_of_Pakistan",
"abstract":"The history of Pakistan for the period preceding the country's independence in 1947Pakistan was created as the Dominion of Pakistan on 14 August 1947 after the end of British rule in, and partition of British India. is shared with that of Afghanistan, India."
}
{
"id":"Wikipedia:哲学",
"title":"Wikipedia:哲学",
"url":"https://zh.wikipedia.org/wiki/%E5%93%B2%E5%AD%A6",
"abstract":"哲学()是研究普遍的、基本问题的学科,包括存在、知识、价值、理智、心灵、语言等领域。哲学与其他学科不同之处在於哲学有独特之思考方式,例如批判的方式、通常是系统化的方法,并以理性论证为基础。"
}
Test tool
Download the binary executable file that matches your operating system. The file for Darwin is named TairSearchBench.Darwin, the file for Linux is named TairSearchBench.Linux, and the file for Windows is named TairSearchBench.Windows.
In this example, TairSearchBench.Linux is used. Run the ./TairSearchBench.Linux --help command to check how to use the tool.
Usage of ./TairSearchBench.linux:
-a string
The address(ip:port) of network to connect
# The endpoint of the Tair instance.
-c int
Benchmark concurrency (default 30)
# The number of tests that can be run concurrently. Default value: 30.
-d uint
Specify the number of seconds for the benchmark (default 30)
# The duration of the test. When the duration ends, the test is terminated. Default value: 30. Unit: seconds.
-e string
The engine backend to run [tairsearch/redisearch]
# Specify TairSearch or RediSearch as the engine that the instance runs.
-f string
Input file to ingest data from (wikipedia abstracts)
# The path of the input data file.
-h string
Print usage (default "help")
# Display the usage of the tool.
-j string
Specify the big json file to write
# Specify the path of the JSON file to be written.
-n uint
Specify the number of times to benchmark (default 100000)
# The total number of operations to perform for a test. Default value: 100000.
-o int
Overwrite the doc (We will write the document with the same document id)
# Specify whether to overwrite the original document. Valid values: 1 (true) and 0 (false). Default value: 0.
-p string
The password of redis to connect
# The password of the instance.
-q string
Search query string to benchmark
# The query statement that is used to run tests.
-s uint
Specify the compress threshold for tairsearch (default 10000000000)
# Specify the compression threshold for TairSearch. If the size of a document exceeds the threshold, the document is compressed. Unit: bytes. Default value: 10000000000 (approximately 10 GB).
-t string
Specify the type of benchmark [write/search/readwrite]
# Set the test type to write, search, or readwrite.
-z string
Specify the analyzer to use for query (default "standard")
# Specify the analyzer for the query. Default value: standard.
Before you perform testing, use taskset to bind the test tool to 20 vCPUs of the ECS instance. Sample command: taskset -c 10-30 ./TairSearchBench.linux.
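taskset -c takes a list of logical CPU numbers, and a range such as 10-30 includes both endpoints. The following is a small sketch for checking how many CPUs a list covers before a run. expand_cpu_list is a hypothetical helper that is not part of the test tool, and it handles only plain numbers and first-last ranges.

```python
from typing import List


def expand_cpu_list(spec: str) -> List[int]:
    """Expand a taskset-style CPU list such as "0,2,10-30" into CPU numbers."""
    cpus = set()
    for part in spec.split(","):
        if "-" in part:
            first, last = part.split("-", 1)
            # Ranges are inclusive on both ends.
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


cpus = expand_cpu_list("10-30")
print(len(cpus), cpus[0], cpus[-1])
```

For the list 10-30, the sketch counts 21 logical CPUs because both 10 and 30 are included.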
Preparations
Create a schema (index). Examples:
TairSearch
{
  "settings": {
    "compress_doc": {
      "size": "user-defined compression threshold",
      "enable": true
    }
  },
  "mappings": {
    "properties": {
      "id": {"type": "keyword"},
      "url": {"type": "keyword", "index": false},
      "title": {"type": "text", "analyzer": "user-defined analyzer"},
      "abstract": {"type": "text", "analyzer": "user-defined analyzer"},
      "url_len": {"type": "integer"},
      "abstract_len": {"type": "integer"},
      "title_len": {"type": "integer"}
    }
  }
}
RediSearch
SCHEMA $.id AS id TEXT $.url AS url TEXT NOINDEX $.title AS title TEXT $.abstract AS abstract TEXT $.abstract_len AS abstract_len NUMERIC $.url_len AS url_len NUMERIC $.title_len AS title_len NUMERIC
If the test data consists of documents in Chinese, add LANGUAGE CHINESE to the preceding schema definition.
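The sample records in the Test data section contain only id, title, url, and abstract, whereas both schemas also index url_len, abstract_len, and title_len. The following sketch shows how a record might be extended with these fields. It assumes that each length is the character count of the corresponding string and that id is the title with spaces replaced by underscores. Neither assumption is confirmed by this topic, and to_document is a hypothetical helper.

```python
import json


def to_document(title: str, url: str, abstract: str) -> dict:
    """Build one test document with the fields named in the schemas.

    Assumptions (not confirmed by the source): the *_len fields are
    character counts, and id is the title with spaces replaced by "_".
    """
    return {
        "id": title.replace(" ", "_"),
        "title": title,
        "url": url,
        "abstract": abstract,
        "url_len": len(url),
        "abstract_len": len(abstract),
        "title_len": len(title),
    }


doc = to_document(
    "History of Pakistan",
    "https://en.wikipedia.org/wiki/History_of_Pakistan",
    "Placeholder abstract text.",
)
# ensure_ascii=False keeps Chinese text readable in the serialized JSON.
print(json.dumps(doc, ensure_ascii=False))
```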
Test commands and test results
In the following tests, one million documents are written for each write test. Each query test runs against one million documents and issues the number of queries specified by the -n option in its command, which is one million in most tests. Each test that combines write and query operations is configured to run for 60 seconds.
Write data in English
Commands
TairSearch
taskset -c 10-30 ./TairSearchBench.linux -t write -e tairsearch -f ./enwiki-latest-abstract.xml -c 20 -n 1000000 -a 127.0.0.1:6379
RediSearch
taskset -c 10-30 ./TairSearchBench.linux -t write -e redisearch -f ./enwiki-latest-abstract.xml -c 20 -n 1000000 -a 127.0.0.1:6379
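When many test runs are scripted, it can help to assemble the command from its parts rather than edit long strings by hand. The following sketch uses only the flags listed in the --help output. bench_command is a hypothetical wrapper that is not shipped with the tool.

```python
import shlex


def bench_command(engine, test_type, address, cpu_list="10-30", **options):
    """Return the argv list for one TairSearchBench run pinned with taskset.

    Extra options are passed as single-letter keywords that match the
    tool's flags, for example f="./data.xml" for -f.
    """
    argv = ["taskset", "-c", cpu_list, "./TairSearchBench.linux",
            "-t", test_type, "-e", engine]
    for flag, value in options.items():
        argv += [f"-{flag}", str(value)]
    argv += ["-a", address]
    return argv


cmd = bench_command("tairsearch", "write", "127.0.0.1:6379",
                    f="./enwiki-latest-abstract.xml", c=20, n=1000000)
# shlex.join quotes each argument so the line can be pasted into a shell.
print(shlex.join(cmd))
```

The printed line matches the TairSearch write command above. The argv list can also be passed directly to subprocess.run without going through a shell.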
Results
Engine | QPS | Average latency (ms) | 99th percentile latency (ms) | Memory used (GB) |
TairSearch | 22,615.15 | 0.874 | 1.735 | 1.39 |
RediSearch | 18,295.10 | 1.092 | 2.352 | 1.67 |
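With a fixed number of concurrent clients, throughput and average latency are linked by Little's law: the number of requests in flight is approximately QPS multiplied by average latency. The following sketch uses this as a sanity check on the table above. It assumes that each of the 20 concurrent workers (-c 20) keeps one request outstanding at a time, which this topic does not state explicitly.

```python
def in_flight(qps: float, avg_latency_ms: float) -> float:
    """Little's law: average number of requests in flight."""
    return qps * avg_latency_ms / 1000.0


# QPS and average latency (ms) from the English write test.
results = {
    "TairSearch": (22615.15, 0.874),
    "RediSearch": (18295.10, 1.092),
}

for engine, (qps, latency_ms) in results.items():
    print(f"{engine}: {in_flight(qps, latency_ms):.2f} requests in flight")
```

Both values come out close to 20, which is consistent with the concurrency used in the commands.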
Write data in Chinese
Commands
TairSearch
taskset -c 10-30 ./TairSearchBench.linux -t write -e tairsearch -f ./zhwiki-latest-abstract.xml -c 20 -n 1000000 -a 127.0.0.1:6379 -z jieba
RediSearch
taskset -c 10-30 ./TairSearchBench.linux -t write -e redisearch -f ./zhwiki-latest-abstract.xml -c 20 -n 1000000 -a 127.0.0.1:6379 -z chinese
Results
Engine | QPS | Average latency (ms) | 99th percentile latency (ms) | Memory used (GB) |
TairSearch | 13,980.41 | 1.427 | 3.275 | 1.87 |
RediSearch | 10,924.40 | 1.830 | 3.857 | 1.83 |
TairSearch has a higher memory usage than RediSearch because the jieba analyzer is used and more fine-grained tokens are generated.
Overwrite data in English
Commands
TairSearch
taskset -c 10-30 ./TairSearchBench.linux -t write -e tairsearch -f ./enwiki-latest-abstract.xml -c 20 -n 1000000 -o 1 -a 127.0.0.1:6379
RediSearch
taskset -c 10-30 ./TairSearchBench.linux -t write -e redisearch -f ./enwiki-latest-abstract.xml -c 20 -n 1000000 -o 1 -a 127.0.0.1:6379
Results
Engine | QPS | Average latency (ms) | 99th percentile latency (ms) | Memory used (GB) |
TairSearch | 9,775.03 | 2.041 | 3.974 | 0.0002 |
RediSearch | 22,239.67 | 0.898 | 1.38 | 0.165 |
When you perform an overwrite operation, RediSearch marks the original document for later deletion. This causes additional memory usage. In comparison, TairSearch deletes the original document in real time.
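The difference between the two overwrite strategies can be illustrated with a toy model. The following sketch is purely conceptual: neither class reflects the actual internals of TairSearch or RediSearch, and all names are hypothetical.

```python
class EagerIndex:
    """Toy model: an overwrite replaces the old document immediately."""

    def __init__(self):
        self.docs = {}

    def write(self, doc_id, body):
        self.docs[doc_id] = body  # the old body is released right away

    def stored_bodies(self):
        return len(self.docs)


class LazyIndex:
    """Toy model: an overwrite keeps the old copy until cleanup runs."""

    def __init__(self):
        self.live = {}        # doc_id -> current body
        self.tombstones = []  # old bodies waiting for cleanup

    def write(self, doc_id, body):
        if doc_id in self.live:
            self.tombstones.append(self.live[doc_id])
        self.live[doc_id] = body

    def cleanup(self):
        self.tombstones.clear()

    def stored_bodies(self):
        return len(self.live) + len(self.tombstones)


eager, lazy = EagerIndex(), LazyIndex()
for version in range(3):
    eager.write("doc-1", f"v{version}")
    lazy.write("doc-1", f"v{version}")

print(eager.stored_bodies(), lazy.stored_bodies())  # 1 3
lazy.cleanup()
print(lazy.stored_bodies())  # 1
```

In the toy model, the lazy variant holds every superseded copy until cleanup runs, which mirrors why deferred deletion shows higher memory usage during an overwrite-heavy test.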
Overwrite data in Chinese
Commands
TairSearch
taskset -c 10-30 ./TairSearchBench.linux -t write -e tairsearch -f ./zhwiki-latest-abstract.xml -c 20 -n 1000000 -o 1 -a 127.0.0.1:6379 -z jieba
RediSearch
taskset -c 10-30 ./TairSearchBench.linux -t write -e redisearch -f ./zhwiki-latest-abstract.xml -c 20 -n 1000000 -o 1 -a 127.0.0.1:6379
Results
Engine | QPS | Average latency (ms) | 99th percentile latency (ms) | Memory used (GB) |
TairSearch | 6,194.15 | 3.206 | 6.456 | 0.025 (including the memory used by the jieba analyzer dictionary) |
RediSearch | 25,096.18 | 0.796 | 1.338 | 0.671 |
When you perform an overwrite operation, RediSearch marks the original document for later deletion. This causes additional memory usage. In comparison, TairSearch deletes the original document in real time.
Use a term statement to query data in English
Commands
TairSearch
taskset -c 10-30 ./TairSearchBench.linux -t search -e tairsearch -c 20 -n 1000000 -q '{"query":{"term":{"abstract":"hello"}}}' -a 127.0.0.1:6379
RediSearch
taskset -c 10-30 ./TairSearchBench.linux -t search -e redisearch -c 20 -n 1000000 -q "@abstract:hello" -a 127.0.0.1:6379
Results
Engine | QPS | Average latency (ms) | 99th percentile latency (ms) |
TairSearch | 45,501.13 | 0.437 | 0.563 |
RediSearch | 28,513.87 | 0.700 | 0.833 |
Use a term statement to query data in Chinese
Commands
TairSearch
taskset -c 10-30 ./TairSearchBench.linux -t search -e tairsearch -c 20 -n 1000000 -q '{"query":{"term":{"abstract":"你好"}}}' -a 127.0.0.1:6379
RediSearch
taskset -c 10-30 ./TairSearchBench.linux -t search -e redisearch -c 20 -n 1000000 -q "@abstract:你好" -a 127.0.0.1:6379
Results
Engine | QPS | Average latency (ms) | 99th percentile latency (ms) |
TairSearch | 40,670.47 | 0.489 | 0.635 |
RediSearch | 24,437.48 | 0.817 | 1.331 |
Use a match statement to query data in English
Commands
TairSearch
taskset -c 10-30 ./TairSearchBench.linux -t search -e tairsearch -c 20 -n 1000000 -q '{"query":{"match":{"abstract":{"operator":"and","query":"chinese history"}}}}' -a 127.0.0.1:6379
RediSearch
taskset -c 10-30 ./TairSearchBench.linux -t search -e redisearch -c 20 -n 1000000 -q "@abstract:chinese history" -a 127.0.0.1:6379
Results
Engine | QPS | Average latency (ms) | 99th percentile latency (ms) |
TairSearch | 24,548.94 | 0.812 | 0.971 |
RediSearch | 2,420.66 | 8.261 | 8.523 |
Use a match statement to query data in Chinese
Commands
TairSearch
taskset -c 10-30 ./TairSearchBench.linux -t search -e tairsearch -c 20 -n 100000 -q '{"query":{"match":{"abstract":{"operator":"and","query":"中国的历史"}}}}' -a 127.0.0.1:6379
RediSearch
taskset -c 10-30 ./TairSearchBench.linux -t search -e redisearch -c 20 -n 100000 -q "@abstract:中国的历史" -a 127.0.0.1:6379 -z jieba
Results
Engine | QPS | Average latency (ms) | 99th percentile latency (ms) |
TairSearch | 6,601.05 | 3.027 | 3.669 |
RediSearch | 889.37 | 22.486 | 22.985 |
Use a bool statement to query data in English
Commands
TairSearch
taskset -c 10-30 ./TairSearchBench.linux -t search -e tairsearch -c 20 -n 100000 -q '{"query":{"bool":{"must":[{"term":{"abstract":"war"}},{"term":{"abstract":"japanese"}},{"range":{"abstract_len":{"gt":500}}}],"must_not":{"term":{"abstract":"America"}},"should":[{"term":{"abstract":"chinese"}},{"term":{"abstract":"china"}}]}}}' -a 127.0.0.1:6379
RediSearch
taskset -c 10-30 ./TairSearchBench.linux -t search -e redisearch -c 20 -n 100000 -q "@abstract:(war japanese -America (chinese|china)) @abstract_len:[500 +inf]" -a 127.0.0.1:6379
Results
Engine | QPS | Average latency (ms) | 99th percentile latency (ms) |
TairSearch | 4,554.22 | 4.388 | 5.702 |
RediSearch | 1,124.08 | 17.791 | 18.444 |
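A nested query such as the TairSearch bool query above is easier to maintain as a data structure than as a hand-written string. The following sketch builds the same query with the standard json module and quotes it for use as the -q argument. The variable names are illustrative.

```python
import json
import shlex

query = {
    "query": {
        "bool": {
            "must": [
                {"term": {"abstract": "war"}},
                {"term": {"abstract": "japanese"}},
                {"range": {"abstract_len": {"gt": 500}}},
            ],
            "must_not": {"term": {"abstract": "America"}},
            "should": [
                {"term": {"abstract": "chinese"}},
                {"term": {"abstract": "china"}},
            ],
        }
    }
}

# Compact separators reproduce the single-line form used in the command.
query_json = json.dumps(query, separators=(",", ":"), ensure_ascii=False)
# shlex.quote wraps the JSON in single quotes for the shell.
print("-q", shlex.quote(query_json))
```

Note that gt is an exclusive bound, whereas the RediSearch range [500 +inf] is inclusive. RediSearch expresses an exclusive lower bound as (500.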
Use a bool statement to query data in Chinese
Commands
TairSearch
taskset -c 10-30 ./TairSearchBench.linux -t search -e tairsearch -c 20 -n 100000 -q '{"query":{"bool":{"must":[{"term":{"abstract":"战争"}},{"term":{"abstract":"日本"}},{"range":{"abstract_len":{"gt":500}}}],"must_not":{"term":{"abstract":"美国"}},"should":[{"term":{"abstract":"中国"}},{"term":{"abstract":"亚洲"}}]}}}' -a 127.0.0.1:6379
RediSearch
taskset -c 10-30 ./TairSearchBench.linux -t search -e redisearch -c 20 -n 1000000 -q "@abstract:(战争 日本 -美国 (中国|亚洲)) @abstract_len:[500 +inf]" -a 127.0.0.1:6379 -z jieba
Results
Engine | QPS | Average latency (ms) | 99th percentile latency (ms) |
TairSearch | 2,619.00 | 7.623 | 18.42 |
RediSearch | 1,199.76 | 16.669 | 17.064 |
Use a range statement to query data in English
Commands
TairSearch
taskset -c 10-30 ./TairSearchBench.linux -t search -e tairsearch -c 20 -n 1000000 -q '{"query":{"range":{"abstract_len":{"lte":420, "gte":400}}}}' -a 127.0.0.1:6379
RediSearch
taskset -c 10-30 ./TairSearchBench.linux -t search -e redisearch -c 20 -n 1000000 -q "@abstract_len:[400,420]" -a 127.0.0.1:6379
Results
Engine | QPS | Average latency (ms) | 99th percentile latency (ms) |
TairSearch | 2,840.02 | 7.038 | 8.599 |
RediSearch | 1,307.02 | 15.300 | 16.817 |
Use a prefix statement to query data in English
Commands
TairSearch
taskset -c 10-30 ./TairSearchBench.linux -t search -e tairsearch -c 20 -n 1000000 -q '{"query":{"prefix":{"abstract":"happiness"}}}' -a 127.0.0.1:6379
RediSearch
taskset -c 10-30 ./TairSearchBench.linux -t search -e redisearch -c 20 -n 1000000 -q "@abstract:happiness*" -a 127.0.0.1:6379
Results
Engine | QPS | Average latency (ms) | 99th percentile latency (ms) |
TairSearch | 36,491.10 | 0.545 | 0.688 |
RediSearch | 25,558.92 | 0.781 | 0.930 |
Use a prefix statement to query data in Chinese
Commands
TairSearch
taskset -c 10-30 ./TairSearchBench.linux -t search -e tairsearch -c 20 -n 1000000 -q '{"query":{"prefix":{"abstract":"开心"}}}' -a 127.0.0.1:6379
RediSearch
taskset -c 10-30 ./TairSearchBench.linux -t search -e redisearch -c 20 -n 1000000 -q "@abstract:开心*" -a 127.0.0.1:6379 -z chinese
Results
Engine | QPS | Average latency (ms) | 99th percentile latency (ms) |
TairSearch | 41,308.71 | 0.481 | 0.638 |
RediSearch | 27,457.86 | 0.727 | 1.234 |
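To compare the query-only tests at a glance, the QPS columns of the result tables can be reduced to one ratio per test. The following sketch copies the figures from the tables above and computes TairSearch QPS divided by RediSearch QPS. The test labels are shorthand introduced here.

```python
# (TairSearch QPS, RediSearch QPS) copied from the query result tables.
query_qps = {
    "term (English)": (45501.13, 28513.87),
    "term (Chinese)": (40670.47, 24437.48),
    "match (English)": (24548.94, 2420.66),
    "match (Chinese)": (6601.05, 889.37),
    "bool (English)": (4554.22, 1124.08),
    "bool (Chinese)": (2619.00, 1199.76),
    "range (English)": (2840.02, 1307.02),
    "prefix (English)": (36491.10, 25558.92),
    "prefix (Chinese)": (41308.71, 27457.86),
}

# Ratio above 1 means TairSearch served more queries per second.
speedups = {name: tair / redi for name, (tair, redi) in query_qps.items()}

for name, factor in sorted(speedups.items(), key=lambda item: item[1]):
    print(f"{name:18s} {factor:5.2f}x")
```

In these tests the ratio ranges from roughly 1.4x for the English prefix query to roughly 10x for the English match query.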
Write data and use a term statement to query data
Commands
TairSearch
taskset -c 10-30 ./TairSearchBench.linux -t readwrite -e tairsearch -f ./enwiki-latest-abstract.xml -c 20 -d 60 -q '{"query":{"term":{"abstract":"hello"}}}' -a 127.0.0.1:6379
RediSearch
taskset -c 10-30 ./TairSearchBench.linux -t readwrite -e redisearch -f ./enwiki-latest-abstract.xml -c 20 -d 60 -q "@abstract:hello" -a 127.0.0.1:6379
Results
Engine | Average write QPS | Average write latency (ms) | Average query QPS | Average query latency (ms) |
TairSearch | 14,699.77 | 1.359 | 16,224.03 | 1.232 |
RediSearch | 11,386.75 | 1.755 | 11,386.70 | 1.755 |
Write data and use a bool statement to query data
Commands
TairSearch
taskset -c 10-30 ./TairSearchBench.linux -t readwrite -e tairsearch -f ./enwiki-latest-abstract.xml -c 20 -d 60 -q '{"query":{"bool":{"must":[{"term":{"abstract":"war"}},{"term":{"abstract":"japanese"}},{"range":{"abstract_len":{"gt":500}}}],"must_not":{"term":{"abstract":"America"}},"should":[{"term":{"abstract":"chinese"}},{"term":{"abstract":"china"}}]}}}' -a 127.0.0.1:6379
RediSearch
taskset -c 10-30 ./TairSearchBench.linux -t readwrite -e redisearch -f ./enwiki-latest-abstract.xml -c 20 -d 60 -q "@abstract:(war japanese -America (chinese|china)) @abstract_len:[500 +inf]" -a 127.0.0.1:6379
Results
Engine | Average write QPS | Average write latency (ms) | Average query QPS | Average query latency (ms) |
TairSearch | 9,589.18 | 2.085 | 10,504.31 | 1.903 |
RediSearch | 5,284.01 | 3.784 | 5,283.96 | 3.784 |
Summary
TairSearch uses multi-core parallel computing technology and inverted indexes designed specifically for text search to deliver high throughput and low latency. Additionally, TairSearch uses a dedicated data structure for document compression to reduce memory usage and save costs without compromising read and write performance.