This topic describes how to create a search index for a data table and use the search index to perform queries, such as queries based on non-primary key columns and Boolean queries, in the Tablestore console.
Prerequisites
A data table for which the Max Versions parameter is set to 1 is created. The Time to Live parameter of the data table must meet one of the following conditions. For more information, see Operations on a data table.
The time to live (TTL) of the data table is set to -1, which specifies that data in the data table never expires.
The TTL of the data table is set to a value other than -1, and update operations on the data table are prohibited.
Usage notes
The data types of the fields in a search index must match the data types of the fields in the data table for which the search index is created. For more information, see Mappings of basic data types.
If you want to set the TTL of a search index to a value other than -1, make sure that the UpdateRow operation is prohibited on the data table for which the search index is created. The TTL of a search index that is created for a data table must be less than or equal to the TTL of the data table. For more information, see Specify the TTL of a search index.
Step 1: Create a search index
You can create a search index to accelerate data queries. You must add the fields that you want to query to the search index. You can configure advanced settings, such as the Routing Key, Time to Live, and Pre-sorting parameters, based on your business requirements.
Go to the Indexes tab.
Log on to the Tablestore console.
In the top navigation bar, select a resource group and a region.
On the Overview page, click the name of the instance that you want to manage or click Manage Instance in the Actions column of the instance.
On the Tables tab of the Instance Details tab, click the name of the data table for which you want to create a search index or click Indexes in the Actions column of the data table.
On the Indexes tab, click Create Search Index.
In the Create Index dialog box, configure the parameters of the search index.
Retain the default index name that is generated by the system, or enter an index name based on your business requirements.
Select the method that you want to use to generate a schema for the search index.
ImportantThe values of the Field Name and Field Type parameters must match the values of these parameters in the data table. For information about the mappings between field types in data tables and field types in search indexes, see Mappings of basic data types.
If you set the Schema Generation Type parameter to Manual, specify the names and types for the fields. Specify whether to turn on Array for each field based on your business requirements.
If you set the Schema Generation Type parameter to Auto Generate, the system automatically uses the primary key columns and attribute columns of the data table as index fields. You can select field types and specify whether to turn on Array based on your business requirements.
NoteTo optimize indexing performance in specific cases, use virtual columns. For more information, see Virtual columns.
If you want to configure advanced settings, such as the Routing Key, Time to Live, and Pre-sorting parameters, turn on Advanced Settings. The following table describes the parameters.
Parameter
Description
Routing Key
The custom routing fields. You can select one or more primary key columns as the routing fields. Tablestore calculates the distribution of the index data based on the values of the routing fields. The records with the same values of the routing fields are distributed to the same data partition.
Time to Live
The retention period of data in the search index. Unit: seconds. The default value is -1, which specifies that the data never expires.
The TTL must be at least 86,400 seconds (one day) or -1. A value of -1 specifies that the data never expires. The TTL of a search index that is created for a data table must be less than or equal to the TTL of the data table.
If you want the system to automatically clear the historical data in the search index, specify a value that is greater than or equal to 86400. If the retention period exceeds the value of this parameter, the data expires and Tablestore automatically deletes the expired data.
Pre-sorting
The default order in which the data is returned.
Valid values: Default and Custom. A value of Default specifies that the data is sorted based on the primary key. A value of Custom specifies that the data is sorted based on the field that you specify. Configure the Pre-sorting parameter based on your business requirements.
ImportantSearch indexes that contain Nested fields do not support index presorting.
Click OK.
After you create the search index, click Index Details in the Actions column of the search index on the Indexes tab. On the Index Details dialog box, you can view the parameter settings in the Basic Index Information, Index Meters, Routing Key, Index Fields, and Pre-sorting sections.
Step 2: Query data
When you use a search index to query data, you can select the query types based on your business requirements. You can configure the columns that you want to return and the sorting method of the returned data.
Search indexes support the following query types: term query, range query, prefix query, match query, wildcard query, match phrase query, exists query, terms query, Boolean query, geo query, and nested query. Geo queries are available only for Geo fields, and nested queries are available only for Nested fields.
Go to the Indexes tab.
Log on to the Tablestore console.
In the top navigation bar, select a resource group and a region.
On the Overview page, click the name of the instance that you want to manage or click Manage Instance in the Actions column of the instance.
On the Tables tab of the Instance Details tab, click the name of the data table for which you want to create a search index or click Indexes in the Actions column of the data table.
On the Indexes tab, find the search index that you want to use to query data and click Manage Data in the Actions column.
In the Search dialog box, specify the query conditions.
By default, the system returns all attribute columns. To return specific attribute columns, turn off All Columns and specify the attribute columns that you want to return. Separate multiple attribute columns with commas (,).
NoteBy default, the system returns all primary key columns of the data table.
Select an index field and click Add to add the index field as a query condition. Specify the query type and value for the index field. Repeat this step to add more query conditions based on your business requirements.
Query type
Description
This query uses exact matches to retrieve data from the data table. A term query is similar to a query based on string match. If the field is of the Text type, Tablestore tokenizes the field values and finds tokens that exactly match the specified value.
This query retrieves data within the specified range from the data table. If the field is of the Text type, Tablestore tokenizes the field values and finds tokens that are within the specified range.
This query retrieves data that contains the specified prefix from the data table. If the field is of the Text type, Tablestore tokenizes the field values and finds tokens that contain the specified prefix.
This query retrieves data from the data table based on approximate matches. If the field is of the Text type, Tablestore tokenizes the field values and the query string by using the specified tokenization method and matches tokens based on approximate matches.
This query allows you to specify wildcard characters in the string that you want to match.
The asterisk (*) and question mark (?) wildcard characters are supported. An asterisk (*) matches a string of any length, and a question mark (?) matches a single character. The string that you want to match can start with an asterisk (*) or a question mark (?).
This query is similar to a match query, except that a match phrase query evaluates the positions of tokens. A row meets the query condition only if the order and positions of the tokens in the row match the specified order and positions.
If the tokenization method used for the column that you want to query is fuzzy tokenization, a match phrase query provides a faster response than a wildcard query.
An exists query is also called a NULL query or a NULL-value query. This query is used in sparse data to determine whether a column of a row exists. For example, you can query the rows in which the value of the address column is not empty.
If you want to check whether the value of a column is empty, use an exists query together with the NOT operator.
This query is similar to a term query. A terms query supports multiple query strings. If at least one of the query strings matches the field value, the row of data is returned. Terms queries are equivalent to the IN operator in SQL statements.
This query retrieves data from the data table based on a combination of subqueries. Tablestore returns the rows that match the subqueries.
Each subquery can be of any type.
You can select the AND, OR, or NOT operator based on your business requirements.
Geo query
Geo queries include geo-distance queries, geo-bounding box queries, and geo-polygon queries. Geo queries are available only for Geo fields.
Geo-distance query: This query allows you to specify a circular geographical area that is defined by a central point and a radius as a query condition. Tablestore returns the rows in which the value of the specified field falls within the circular geographical area.
Geo-bounding box query: This query allows you to specify a rectangular geographical area as a query condition. Tablestore returns the rows in which the value of the specified field falls within the rectangular geographical area.
Geo-polygon query: This query allows you to specify a polygon geographical area as a query condition. Tablestore returns the rows in which the value of the specified field falls within the polygon geographical area.
This query retrieves the data from the child rows of Nested fields. Nested queries are available only for Nested fields.
To query a Nested field, you must select subfields and specify the query type and value for the subfields.
This query performs an approximate nearest neighbor search based on vectors. KNN vector queries are available only for Vector fields.
To query a Vector field, you must specify the vector to query and topK value.
By default, the sorting feature is disabled. To enable sorting, turn on Sort, add the index fields based on which the query results are sorted, and then specify the sorting methods.
Click OK.
Data that meets the query conditions is displayed based on the specified order on the Indexes tab.
Appendix: Manage search indexes
The following table describes the operations you can perform on search indexes.
Operation | Description |
View the details of a search index | To view basic index information, index meters, routing key, index fields, and presorting details of the search index, perform the following steps:
|
Modify the schema of a search index | If you want to add, update, or remove index fields for a search index, you can dynamically modify the schema of the search index. For more information, see Dynamically modify the schema of a search index. |
Modify the TTL of a search index | If you want the system to automatically clear historical data in a search index or extend the retention period of the data in a search index, you can modify the TTL of the search index. To modify the TTL of a search index, perform the following steps:
|
Delete a search index | If you no longer require a search index to accelerate data queries, you can delete the search index. Important You cannot restore a deleted search index. If you want to use a deleted search index, recreate the search index. To delete a search index, perform the following steps:
|
FAQ
References
You can use search indexes by using the Tablestore CLI and Tablestore SDKs. For more information, see Use the Tablestore CLI and Use Tablestore SDKs.
If you want to query new fields or data of new field types without modifying the storage schema and data in data tables, use the virtual column feature. For more information, see Virtual columns.