TairSearch is a full-text search data structure developed in-house for Tair. Its query syntax is similar to that of Elasticsearch.
Overview
TairSearch has the following features:
Low latency and high performance: provides millisecond-level write and full-text search capabilities based on the ultra-high computing power of Tair. For more information, see TairSearch performance whitepaper.
Incremental and partial updates: supports partial updates of indexes and incremental updates of documents, including adding, updating, removing, and auto-incrementing fields.
Flexible syntax: provides custom sorting order and supports the JSON syntax that allows bool, match, term, and paging queries. This syntax is similar to that of Elasticsearch.
Aggregate query: supports terms, metrics, and filter aggregations. For more information, see Aggregations.
Auto-complete suggestion: supports fuzzy match with prefixes and automatic completion for search operations.
Custom analyzers: provides built-in analyzers for major languages such as English and Chinese and meets personalized requirements for the dictionaries and stop words of analyzers. For more information, see Search analyzers.
Shard index query: allows you to use the TFT.MSEARCH command to search for multiple shard indexes and return an aggregated result set.
Document compression: supports the storage of compressed documents to reduce memory usage. By default, this feature is disabled.
Query caching: stores the latest query results in cache to improve the query efficiency of hot data.
Release notes
Redis 5.0-compatible DRAM-based instances
On March 11, 2022, TairSearch was released with Tair V1.7.27.
On May 24, 2022, Tair V1.8.5 was released to add support for the TairSearch aggregation feature.
On September 6, 2022, Tair V5.0.15 was released to add support for the TFT.MSEARCH command.
On January 13, 2023, Tair V5.0.25 was released to add support for analyzers.
On March 15, 2023, Tair V5.0.28 was released to add support for query caching, document compression, and the TFT.ANALYZER command.
On June 12, 2023, Tair V5.0.35 was released to add support for documents of the ARRAY data type and the similarity ranking algorithm Okapi BM25.
Redis 6.0-compatible DRAM-based instances
On February 7, 2023, Tair V6.2.4.1 was released to add support for TairSearch.
Tair V6.2.4.1 has all the features provided by Redis 5.0-compatible DRAM-based instances of Tair V5.0.25.
On March 14, 2023, Tair V6.2.5.0 was released to add support for query caching, document compression, and the TFT.ANALYZER command.
Tair V6.2.5.0 has all the features provided by Redis 5.0-compatible DRAM-based instances of Tair V5.0.28.
On June 12, 2023, Tair V6.2.7.3 was released to add support for documents of the ARRAY data type and the similarity ranking algorithm Okapi BM25.
Tair V6.2.7.3 has all the features provided by Redis 5.0-compatible DRAM-based instances of Tair V5.0.35.
On December 21, 2023, Tair V23.12.1.2 was released with support for the TFT.EXPLAINSCORE command.
Best practices
Prerequisites
The instance is a Tair DRAM-based instance that meets one of the following requirements:
DRAM-based instance that is compatible with Redis 5.0: runs minor version 1.7.27 or later.
DRAM-based instance that is compatible with Redis 6.0: runs minor version 6.2.4.1 or later.
The latest minor version provides more features and higher stability. We recommend that you update the instance to the latest minor version. For more information, see Update the minor version of an instance. If your instance is a cluster instance or read/write splitting instance, we recommend that you update the proxy nodes in the instance to the latest minor version to ensure that all commands can be run as expected.
Precautions
The TairSearch data that you want to manage is stored on a Tair instance.
To reduce memory usage, we recommend that you perform the following operations:
When you create indexes, set index to true only for the document fields that you need to search, so that only those fields are added to the inverted index. For other document fields, set index to false.
In the _source parameter, specify an object that contains includes and excludes pattern arrays so that only the document fields you need are stored and the rest are filtered out.
If you want to split a document into tokens, choose an appropriate analyzer to prevent unnecessary splitting and increased memory usage.
If a document is excessively large, use the document compression feature to automatically compress and decompress the document.
Do not add a large number of documents to a single index. Add the documents to multiple indexes. We recommend that you keep the number of documents per index within 5 million to prevent data skew in cluster instances, balance read and write requests, and reduce the number of large keys and hotkeys.
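The field-level recommendations above (index set to true or false per field, and a _source filter) can be sketched as a mapping built in Python. This is a minimal sketch, not a definitive schema: the index name articles, the field names, the text field type, and the placement of _source directly under mappings (following the Elasticsearch-style explicit mapping) are assumptions for illustration.

```python
import json

# Hypothetical index definition. Only "title" and "body" are searched,
# so only they are added to the inverted index.
mappings = {
    "mappings": {
        # Assumed placement of the _source filter: store the fields that must
        # be returned, and drop the large "body" field from the stored document.
        "_source": {"includes": ["title", "views"], "excludes": ["body"]},
        "properties": {
            "title": {"type": "text", "index": True},
            "body": {"type": "text", "index": True},
            "views": {"type": "long", "index": False},
        },
    }
}

# Compact JSON keeps the command argument short.
payload = json.dumps(mappings, separators=(",", ":"))
print("TFT.CREATEINDEX articles '" + payload + "'")
```

The printed line is only a command string. Sending it to an instance is left to whichever Redis client you use.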
Supported commands
Table 1. Full-text search commands
Command | Syntax | Description |
TFT.CREATEINDEX | | Creates an index and a mapping for the index. The syntax used to create a mapping is similar to that used to create an explicit mapping in Elasticsearch. For more information, see Explicit mapping. You must create an index before you can add documents to the index. |
TFT.UPDATEINDEX | | Adds the properties field to the specified index or modifies the settings of the index. |
TFT.GETINDEX | | Obtains the mapping content of an index. |
TFT.ADDDOC | | Adds a document to an index. You can specify a unique ID for the document in the index by using WITH_ID doc_id. If a document with the same ID already exists, the existing document is overwritten. If you do not specify WITH_ID doc_id, a document ID is automatically generated. By default, WITH_ID doc_id is not specified. |
TFT.MADDDOC | | Adds multiple documents to an index. Each document must have a document ID that is specified by doc_id. If any document fails to be added due to an invalid format, none of the documents in the command are added to the index. |
TFT.UPDATEDOCFIELD | | Updates the document specified by doc_id in an index. If a field that you want to update is an indexed field defined in the mapping, its data type must match the type specified by the mapping. Fields that are not indexed can be of any data type. Note: If the fields already exist, they are updated. If the fields do not exist, they are added. If the document does not exist, it is automatically created, in which case the command is equivalent to TFT.ADDDOC. |
TFT.DELDOCFIELD | | Deletes the specified field from the document specified by doc_id in an index. If the field is an indexed field, the information of the field is also deleted from the index. Note: If the field does not exist, the operation cannot be performed. For example, the field may not exist because it is filtered out by using _source. |
TFT.INCRLONGDOCFIELD | | Adds an increment to the specified field in the document specified by doc_id in an index. The increment can be a positive or negative integer. The data type of the field can only be LONG or INTEGER. Note: If the document does not exist, it is automatically created. In this case, the field is treated as having an existing value of 0, so the updated value equals the increment. If the field does not exist, the operation cannot be performed. For example, the field may not exist because it is filtered out by using _source. |
TFT.INCRFLOATDOCFIELD | | Adds an increment to the specified field in the document specified by doc_id in an index. The increment can be a positive or negative floating-point number. The data type of the field can be DOUBLE. Note: If the document does not exist, it is automatically created. In this case, the field is treated as having an existing value of 0, so the updated value equals the increment. If the field does not exist, the operation cannot be performed. For example, the field may not exist because it is filtered out by using _source. |
TFT.GETDOC | | Obtains the content of the document specified by doc_id in an index. |
TFT.EXISTS | | Checks whether the document specified by doc_id exists in an index. |
TFT.DOCNUM | | Obtains the number of documents in an index. |
TFT.SCANDOCID | | Obtains the IDs of all documents in an index. |
TFT.DELDOC | | Deletes the document specified by doc_id from an index. Multiple document IDs can be specified. |
TFT.DELALL | | Deletes all documents from an index but retains the index. |
TFT.ANALYZER | | Queries the tokenization effects of the specified analyzer. |
TFT.SEARCH | | Queries the documents in an index. The query syntax is similar to the Query Domain Specific Language (DSL) syntax used in Elasticsearch. For more information, see Query DSL. |
TFT.MSEARCH | | Queries documents in multiple indexes that have identical mappings and settings by using the query clause, and gathers the results from these indexes. The gathered results are then scored, sorted, aggregated, and returned. |
TFT.EXPLAINCOST | | Queries the execution duration of a query statement. The output includes the number of documents that are involved in the query and the amount of time consumed by each operation in the query. |
TFT.EXPLAINSCORE | | Queries the detailed score information of documents resulting from the execution of a query statement. You can use this command to see how document scores are calculated and then optimize search queries to improve retrieval. |
DEL | | Deletes one or more TairSearch keys. |
Table 2. Auto-complete commands
Command | Syntax | Description |
TFT.ADDSUG | | Adds one or more auto-complete text entries and their weights to the specified index. |
TFT.DELSUG | | Deletes one or more auto-complete text entries from the specified index. |
TFT.SUGNUM | | Obtains the number of auto-complete text entries in the specified index. |
TFT.GETSUG | | Obtains the auto-complete text entries that can be matched based on the specified prefix. Text entries are returned in descending order of weights. |
TFT.GETALLSUGS | | Obtains all auto-complete text entries in the specified index. |
Uppercase keyword: indicates the command keyword.
Italic text: indicates variables.
[options]: indicates that the enclosed parameters are optional. Parameters that are not enclosed by brackets must be specified.
A|B: indicates that the parameters separated by the vertical bars (|) are mutually exclusive. Only one of the parameters can be specified.
...: indicates that the parameter preceding this symbol can be repeatedly specified.
TFT.CREATEINDEX
Item | Description |
Syntax |
|
Command description | Creates an index and a mapping for the index. The syntax used to create a mapping is similar to that used to create an explicit mapping in Elasticsearch. For more information, see Explicit mapping. You must create an index before you can add documents to the index. Note: To prevent large keys, you can split a large index into several smaller indexes and define a load distribution rule that decides which index each document is written to. Create these indexes with identical mappings and settings so that you can query them together by using TFT.MSEARCH. |
Parameter |
|
Output |
|
Example | Sample command:
Sample output:
|
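The note on splitting a large index can be sketched as a simple load distribution rule. This is one possible approach under stated assumptions, not a rule prescribed by Tair: the shard count, the base index name, the sample mapping, and the choice of a CRC32 hash of the document ID are all hypothetical.

```python
import json
import zlib

NUM_SHARDS = 4              # hypothetical number of small indexes
BASE_NAME = "today_shares"  # hypothetical base index name

# One mapping shared by every small index, so that the indexes have
# identical mappings and can be queried together.
MAPPINGS = json.dumps(
    {"mappings": {"properties": {"investor": {"type": "keyword"}}}},
    separators=(",", ":"),
)

def shard_index(doc_id):
    """Pick the index for a document from a stable CRC32 hash of its ID."""
    shard = zlib.crc32(doc_id.encode("utf-8")) % NUM_SHARDS
    return f"{BASE_NAME}_{shard}"

def create_commands():
    """Build one TFT.CREATEINDEX command string per small index."""
    return [f"TFT.CREATEINDEX {BASE_NAME}_{i} '{MAPPINGS}'" for i in range(NUM_SHARDS)]

for command in create_commands():
    print(command)

# The same document ID always maps to the same index.
print(shard_index("order-1001"))
```

A hash of the document ID spreads writes evenly without a lookup table, at the cost of having to re-route documents if the shard count changes.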
TFT.UPDATEINDEX
Item | Description |
Syntax |
|
Command description | Adds the properties field to the specified index or modifies the settings of the index. |
Parameter |
Note: For more information about the mappings and settings parameters, see TFT.CREATEINDEX. |
Output |
|
Example | Sample command:
Sample output:
|
TFT.GETINDEX
Item | Description |
Syntax |
|
Command description | Obtains the mapping content of an index. |
Parameter |
|
Output |
|
Example | Sample command:
Sample output:
|
TFT.ADDDOC
Item | Description |
Syntax |
|
Command description | Adds a document to an index. You can specify a unique ID for the document in the index by using WITH_ID doc_id. If a document with the same ID already exists, the existing document is overwritten. If you do not specify WITH_ID doc_id, a document ID is automatically generated. By default, WITH_ID doc_id is not specified. |
Parameter |
|
Output |
|
Example | Sample command:
Sample output:
Sample arrays to be added:
|
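The optional WITH_ID doc_id clause can be illustrated with a small helper that assembles the command arguments. This is a sketch only: the helper name adddoc_args is hypothetical, and the argument order (index, document, then WITH_ID doc_id) is an assumption based on the description above and on the TFT.ADDDOC samples elsewhere in this topic.

```python
import json

def adddoc_args(index, document, doc_id=None):
    """Build the argument list for TFT.ADDDOC.

    Assumed argument order: index name, document JSON, then the optional
    WITH_ID doc_id pair.
    """
    args = ["TFT.ADDDOC", index, json.dumps(document, separators=(",", ":"))]
    if doc_id is not None:
        # Only add the clause when the caller wants to control the ID.
        args += ["WITH_ID", str(doc_id)]
    return args

doc = {"shares_name": "XAX", "investor": "Jay"}
print(adddoc_args("today_shares", doc))            # no WITH_ID: the ID is generated
print(adddoc_args("today_shares", doc, doc_id=1))  # explicit document ID
```

The returned list can be passed to a Redis client's generic command call, for example redis-py's execute_command(*args).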
TFT.MADDDOC
Item | Description |
Syntax |
|
Command description | Adds multiple documents to an index. Each document must have a document ID that is specified by doc_id. If a document fails to be added due to an invalid format, all documents that the command involves are not added to the index. |
Parameter |
|
Output |
|
Example | Sample command:
Sample output:
|
TFT.UPDATEDOCFIELD
Item | Description |
Syntax |
|
Command description | Updates the document specified by doc_id in an index. If a field that you want to update is an indexed field defined in the mapping, its data type must match the type specified by the mapping. Fields that are not indexed can be of any data type. Note: If the fields already exist, they are updated. If the fields do not exist, they are added. If the document does not exist, it is automatically created, in which case the command is equivalent to TFT.ADDDOC. |
Parameter |
|
Output |
|
Example | Sample command:
Sample output:
|
TFT.DELDOCFIELD
Item | Description |
Syntax |
|
Command description | Deletes the specified field from the document specified by doc_id in an index. If the field is an indexed field, the information of the field is also deleted from the index. Note: If the field does not exist, the operation cannot be performed. For example, the field may not exist because it is filtered out by using _source. |
Parameter |
|
Output |
|
Example | Sample command:
Sample output:
|
TFT.INCRLONGDOCFIELD
Item | Description |
Syntax |
|
Command description | Adds an increment to the specified field in the document specified by doc_id in an index. The increment can be a positive or negative integer. The data type of the field can only be LONG or INTEGER. Note: If the document does not exist, it is automatically created. In this case, the field is treated as having an existing value of 0, so the updated value equals the increment. If the field does not exist, the operation cannot be performed. For example, the field may not exist because it is filtered out by using _source. |
Parameter |
|
Output |
|
Example | Sample command:
Sample output:
|
TFT.INCRFLOATDOCFIELD
Item | Description |
Syntax |
|
Command description | Adds an increment to the specified field in the document specified by doc_id in an index. The increment can be a positive or negative floating-point number. The data type of the field can be DOUBLE. Note: If the document does not exist, it is automatically created. In this case, the field is treated as having an existing value of 0, so the updated value equals the increment. If the field does not exist, the operation cannot be performed. For example, the field may not exist because it is filtered out by using _source. |
Parameter |
|
Output |
|
Example | Sample command:
Sample output:
|
TFT.GETDOC
Item | Description |
Syntax |
|
Command description | Obtains the content of the document specified by doc_id in an index. |
Parameter |
|
Output |
|
Example | Sample command:
Sample output:
|
TFT.EXISTS
Item | Description |
Syntax |
|
Command description | Checks whether the document specified by doc_id exists in an index. |
Parameter |
|
Output |
|
Example | Sample command:
Sample output:
|
TFT.DOCNUM
Item | Description |
Syntax |
|
Command description | Obtains the number of documents in an index. |
Parameter |
|
Output |
|
Example | Sample command:
Sample output:
|
TFT.SCANDOCID
Item | Description |
Syntax |
|
Command description | Obtains the IDs of all documents in an index. |
Parameter |
|
Output |
|
Example | Sample command:
Sample output:
|
TFT.DELDOC
Item | Description |
Syntax |
|
Command description | Deletes the document specified by doc_id from an index. Multiple document IDs can be specified. |
Parameter |
|
Output |
|
Example | Sample command:
Sample output:
|
TFT.DELALL
Item | Description |
Syntax |
|
Command description | Deletes all documents from an index but retains the index. |
Parameter |
|
Output |
|
Example | Sample command:
Sample output:
|
TFT.ANALYZER
Item | Description |
Syntax |
|
Command description | Queries the tokenization effects of the specified analyzer. |
Parameter |
|
Output |
|
Example | Sample command:
Sample output:
|
TFT.SEARCH
Item | Description |
Syntax |
|
Command description | Queries the documents in an index. The query syntax is similar to the Query Domain Specific Language (DSL) syntax used in Elasticsearch. For more information, see Query DSL. |
Parameter | index: the name of the index that you want to manage by running this command. query: the Query DSL statement that is similar to the syntax used in Elasticsearch. The following fields are supported:
|
Output |
|
Example | Sample command:
Sample output:
|
TFT.MSEARCH
Item | Description |
Syntax |
|
Command description | Queries documents in multiple indexes that have identical mappings and settings by using the query clause, and gathers the results from these indexes. The gathered results are then scored, sorted, aggregated, and returned. Note: The output of TFT.MSEARCH is produced by scoring, sorting, and aggregating the query results of the individual indexes. It can differ from the output of scoring, sorting, and aggregating the combined dataset of all the indexes directly. TFT.MSEARCH policy:
|
Parameter |
Note: Unlike the query statement of TFT.SEARCH, the query statement of TFT.MSEARCH does not support the from parameter. To page through results, use the size, reply_with_keys_cursor, and keys_cursor parameters instead. For more information about the syntax of other parameters, see TFT.SEARCH. |
Output |
|
Example | The following commands are run in advance:
Sample command:
Sample output:
Sample command for querying the second page:
Sample output:
|
TFT.EXPLAINCOST
Item | Description |
Syntax |
|
Command description | Queries the execution duration of a query statement. The output includes the number of documents that are involved in the query and the amount of time consumed by each operation in the query. |
Parameter |
|
Output |
|
Example | Sample command:
Sample output:
|
TFT.EXPLAINSCORE
Item | Description |
Syntax |
|
Command description | Queries the detailed score information of documents resulting from the execution of a query statement. You can use this command to gain insights into the process of how document scores are calculated. Then, you can optimize search queries to enhance the effectiveness of document retrieval. This command is available only for DRAM-based instances that are compatible with Redis 6.0. |
Parameter |
|
Output |
|
Example | Sample command:
Sample output:
|
TFT.ADDSUG
Item | Description |
Syntax |
|
Command description | Adds one or more auto-complete text entries and their weights to the specified index. |
Parameter |
|
Output |
|
Example | Sample command:
Sample output:
|
TFT.DELSUG
Item | Description |
Syntax |
|
Command description | Deletes one or more auto-complete text entries from the specified index. |
Parameter |
|
Output |
|
Example | Sample command:
Sample output:
|
TFT.SUGNUM
Item | Description |
Syntax |
|
Command description | Obtains the number of auto-complete text entries in the specified index. |
Parameter |
|
Output |
|
Example | Sample command:
Sample output:
|
TFT.GETSUG
Item | Description |
Syntax |
|
Command description | Obtains the auto-complete text entries that can be matched based on the specified prefix. Text entries are returned in descending order of weights. |
Parameter |
|
Output |
|
Example | Sample command:
Sample output:
|
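The behavior described for TFT.GETSUG (prefix match, results in descending order of weight) can be modeled in a few lines of Python. This is an in-memory illustration of the described semantics only, not how Tair implements the command: the function name, the max_count parameter, the sample entries, and the alphabetical tie-break are assumptions.

```python
def get_suggestions(entries, prefix, max_count=None):
    """Return the entries that start with prefix, highest weight first."""
    matches = [(text, weight) for text, weight in entries.items()
               if text.startswith(prefix)]
    # Sort by descending weight; break ties alphabetically for a stable result.
    matches.sort(key=lambda item: (-item[1], item[0]))
    texts = [text for text, _ in matches]
    return texts if max_count is None else texts[:max_count]

# Hypothetical auto-complete entries mapped to their weights.
entries = {
    "redis is a nosql database": 3,
    "redis cluster": 5,
    "tair search": 9,
}

print(get_suggestions(entries, "redis"))
# prints ['redis cluster', 'redis is a nosql database']
```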
TFT.GETALLSUGS
Item | Description |
Syntax |
|
Command description | Obtains all auto-complete text entries in the specified index. |
Parameter |
|
Output |
|
Example | Sample command:
Sample output:
|
Aggregations
You can add the aggs (aggregations) parameter to TFT.SEARCH commands to aggregate results obtained by using query clauses.
Usage
In most cases, you must specify the aggregation name, type, and field for the aggs parameter. Only fields of the numeric and keyword types are supported. Sample command:
TFT.SEARCH shares '{"query":{"term":{"investor":"Jay"}},"aggs":{"Jay_Sum":{"sum":{"field":"purchase_price"}}}}'
# Specify Jay_Sum as the aggregation name, sum as the aggregation type, and purchase_price as the aggregation field.
Query and aggregation results are returned. Example:
{"hits":{"hits":[{"_id":"16581351808123930","_index":"today_shares0718","_score":1.0,"_source":{"shares_name":"XAX","logictime":14300210,"purchase_type":1,"purchase_price":101.1,"purchase_count":100,"investor":"Jay"}},{"_id":"16581351809626430","_index":"today_shares0718","_score":1.0,"_source":{"shares_name":"XAX","logictime":14300310,"purchase_type":1,"purchase_price":111.1,"purchase_count":100,"investor":"Jay"}}],"max_score":1.0,"total":{"relation":"eq","value":2}},"aggregations":{"Jay_Sum":{"value":212.2}}}
You can add "size":0 to the query syntax so that only aggregation results are returned.
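The structure of the aggs parameter (aggregation name, then type, then field) and the effect of "size":0 can be sketched with a small Python builder. The helper name build_agg_query and its only_aggregations flag are hypothetical, and the helper covers only single-field aggregations of the form used in the sample command above.

```python
import json

def build_agg_query(name, agg_type, field, query, only_aggregations=False):
    """Build a TFT.SEARCH query body with one single-field aggregation."""
    body = {}
    if only_aggregations:
        body["size"] = 0  # suppress the hits, keep the aggregation results
    body["query"] = query
    # name -> type -> field, as in the sample command.
    body["aggs"] = {name: {agg_type: {"field": field}}}
    return json.dumps(body, separators=(",", ":"))

query = {"term": {"investor": "Jay"}}
print("TFT.SEARCH shares '" + build_agg_query("Jay_Sum", "sum", "purchase_price", query) + "'")
print("TFT.SEARCH shares '" + build_agg_query("Jay_Sum", "sum", "purchase_price", query, True) + "'")
```

The first printed command reproduces the sample command above. The second adds "size":0 so that the hits array is returned empty.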
Aggregation types supported by aggs
Metrics, terms, and filter aggregations are supported by the aggs parameter.
Item | Description |
Metrics aggregation | Computes metrics based on values extracted from the fields of documents that are being aggregated. These fields are typically of a numeric type, such as INTEGER or DOUBLE. Nested aggregations are not supported. Valid values:
Note: All aggregation types except value_count support only numeric fields. Output: a value of the DOUBLE type calculated from the specified field. |
Terms aggregation | Groups documents by the distinct values of a field and counts the documents in each group. Only fields of the keyword type are supported. Nested aggregations are supported. Valid values:
Example:
Output: a JSON object keyed by the aggregation name. The object contains a buckets array. Each bucket contains a key (a value of the aggregation field) and a doc_count value (the number of documents that have that value). Example:
|
Filter aggregation | Filters query results of a query statement. Nested aggregations are supported. Output: the number of documents (doc_count) that match the filter conditions. |
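The bucket layout of a terms aggregation result can be read on the client side as follows. This is a minimal sketch: the helper name bucket_counts is hypothetical, and the response string follows the shape described above (a buckets array of key and doc_count pairs under the aggregation name).

```python
import json

def bucket_counts(response_text, agg_name):
    """Map each bucket key to its doc_count for one terms aggregation."""
    response = json.loads(response_text)
    buckets = response["aggregations"][agg_name]["buckets"]
    return {bucket["key"]: bucket["doc_count"] for bucket in buckets}

# Response in the shape described above, with a single bucket.
sample = (
    '{"hits":{"hits":[],"max_score":null,"total":{"relation":"eq","value":3}},'
    '"aggregations":{"Per_Investor_Freq":{"buckets":[{"key":"Jay","doc_count":2}]}}}'
)

print(bucket_counts(sample, "Per_Investor_Freq"))
# prints {'Jay': 2}
```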
Aggregation examples
Create an index.
TFT.CREATEINDEX today_shares '{"mappings":{"properties":{"shares_name":{"type":"keyword"},"logictime":{"type":"long"},"purchase_type":{"type":"integer"},"purchase_price":{"type":"double"},"purchase_count":{"type":"long"},"investor":{"type":"keyword"}}}}'
# Create an index that represents the stock trading volume of today.
# shares_name: the name of each stock.
# logictime: the time when the deal is complete.
# purchase_type: the purchase type.
# purchase_price: the purchase price.
# purchase_count: the number of purchased stock shares.
# investor: the ID of the buyer.
Expected output:
OK
Add document data.
Run the following commands:
TFT.ADDDOC today_shares '{"shares_name":"XAX","logictime":14300210, "purchase_type":1,"purchase_price":101.1, "purchase_count":100,"investor":"Jay"}'
TFT.ADDDOC today_shares '{"shares_name":"XAX","logictime":14300310, "purchase_type":1,"purchase_price":111.1, "purchase_count":100,"investor":"Jay"}'
TFT.ADDDOC today_shares '{"shares_name":"YBY","logictime":14300410, "purchase_type":1,"purchase_price":11.1, "purchase_count":100,"investor":"Mila"}'
Expected output:
OK
Perform a query.
Sample commands:
sum
# Query the total amount that Jay spent to purchase the stocks.
TFT.SEARCH today_shares '{"size":0,"query":{"term":{"investor":"Jay"}},"aggs":{"Jay_Sum":{"sum":{"field":"purchase_price"}}}}'
# Expected output:
{"hits":{"hits":[],"max_score":null,"total":{"relation":"eq","value":2}},"aggregations":{"Jay_Sum":{"value":212.2}}}
max
# Query the largest amount that Jay spent to purchase a stock.
TFT.SEARCH today_shares '{"size":0,"query":{"term":{"investor":"Jay"}},"aggs":{"Jay_Max":{"max":{"field":"purchase_price"}}}}'
# Expected output:
{"hits":{"hits":[],"max_score":null,"total":{"relation":"eq","value":2}},"aggregations":{"Jay_Max":{"value":111.1}}}
avg
# Query the average amount that Jay spent to purchase different stocks.
TFT.SEARCH today_shares '{"size":0,"query":{"term":{"investor":"Jay"}},"aggs":{"Jay_Avg":{"avg":{"field":"purchase_price"}}}}'
# Expected output:
{"hits":{"hits":[],"max_score":null,"total":{"relation":"eq","value":2}},"aggregations":{"Jay_Avg":{"value":106.1}}}
std_deviation
# Query the standard deviation of the amount that Jay spent to purchase stocks.
TFT.SEARCH today_shares '{"size":0,"query":{"term":{"investor":"Jay"}},"aggs":{"Jay_Std_Deviation":{"std_deviation":{"field":"purchase_price"}}}}'
# Expected output:
{"hits":{"hits":[],"max_score":null,"total":{"relation":"eq","value":2}},"aggregations":{"Jay_Std_Deviation":{"value":5.0}}}
extended_stats
# Query the statistics of the amount that Jay spent to purchase stocks.
TFT.SEARCH today_shares '{"size":0,"query":{"term":{"investor":"Jay"}},"aggs":{"Jay_Extended_Stats":{"extended_stats":{"field":"purchase_price"}}}}'
# Expected output:
{"hits":{"hits":[],"max_score":null,"total":{"relation":"eq","value":2}},"aggregations":{"Jay_Extended_Stats":{"count":2,"sum":212.2,"max":111.1,"min":101.1,"avg":106.1,"sum_of_squares":22564.42,"variance":25.0,"std_deviation":5.0}}}
terms
# Query the buyers that have completed at least two transactions.
TFT.SEARCH today_shares '{"size":0,"query":{"term":{"purchase_type":1}},"aggs":{"Per_Investor_Freq":{"terms":{"field":"investor","min_doc_count":2,"order": {"_key":"desc"}}}}}'
# Expected output:
{"hits":{"hits":[],"max_score":null,"total":{"relation":"eq","value":3}},"aggregations":{"Per_Investor_Freq":{"buckets":[{"key":"Jay","doc_count":2}]}}}
nested terms aggregation
# Query the number of transactions conducted for each stock and the average amount spent to purchase each stock. The XAX stock is excluded.
TFT.SEARCH today_shares '{"size":0,"query":{"term":{"purchase_type":1}},"aggs":{"Per_Investor_Freq":{"terms":{"field":"shares_name","include":"[A-Z]+","exclude":["XAX"]},"aggs":{"Price_Avg":{"avg":{"field":"purchase_price"}}}}}}'
# Expected output:
{"hits":{"hits":[],"max_score":null,"total":{"relation":"eq","value":3}},"aggregations":{"Per_Investor_Freq":{"buckets":[{"key":"YBY","doc_count":1,"Price_Avg":{"value":11.1}}]}}}
nested filter aggregation
# Query the number of purchases made by Jay and the statistics of the amount that Jay spent to purchase stocks.
TFT.SEARCH today_shares '{"size":0,"query":{"term":{"purchase_type":1}}, "aggs":{"Jay_BuyIn_Filter": {"filter": {"term":{"investor": "Jay"}},"aggs":{"Jay_BuyIn_Quatation":{"extended_stats":{"field":"purchase_price"}}}}}}'
# Expected output:
{"hits":{"hits":[],"max_score":null,"total":{"relation":"eq","value":3}},"aggregations":{"Jay_BuyIn_Filter":{"doc_count":2,"Jay_BuyIn_Quatation":{"count":2,"sum":212.2,"max":111.1,"min":101.1,"avg":106.1,"sum_of_squares":22564.42,"variance":25.0,"std_deviation":5.0}}}}
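The extended_stats figures in the examples above follow from the two purchase_price values in Jay's documents. The sketch below recomputes them in Python under two assumptions that the examples do not state: variance and std_deviation are population statistics (divided by the document count), and sum_of_squares is the plain sum of the squared values, as in the extended_stats aggregation of Elasticsearch.

```python
import math

# purchase_price values of Jay's two documents in the examples above.
prices = [101.1, 111.1]

count = len(prices)
total = sum(prices)
average = total / count
sum_of_squares = sum(p * p for p in prices)
# Population variance: divide by the number of values, not by count - 1.
variance = sum((p - average) ** 2 for p in prices) / count
std_deviation = math.sqrt(variance)

# Round for display, because binary floating point cannot represent
# values such as 101.1 exactly.
print("count:", count, "sum:", round(total, 2))
print("max:", max(prices), "min:", min(prices), "avg:", round(average, 2))
print("sum_of_squares:", round(sum_of_squares, 2))
print("variance:", round(variance, 2), "std_deviation:", round(std_deviation, 2))
```

Dividing by the document count gives a variance of 25.0 and a standard deviation of 5.0. A sample variance (dividing by count minus 1) would give 50.0 instead.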