If you want to perform queries such as queries based on non-primary key columns or Boolean queries during business operations, you can create a search index for a data table in the Tablestore console, and then use the search index to query and analyze data.
Prerequisites
A data table is created. The Max Versions parameter of the data table is set to 1. The Time to Live parameter of the data table meets one of the following conditions. For more information, see Operations on tables.
The time to live (TTL) of the data table is set to -1, which indicates that data in the data table never expires.
The TTL of the data table is set to a value other than -1, and the Allow Updates parameter is set to No, which indicates that update operations are disabled on the data table.
An OTSClient instance is initialized. For more information, see Initialize an OTSClient instance.
Usage notes
The data types of fields in a search index must match the data types of fields in the data table for which the search index is created. For more information, see Data type mappings.
If you want to set the TTL value of a data table to a value other than -1, make sure that the UpdateRow operation is forbidden on the data table. The TTL value of a search index that is created for the data table must be less than or equal to the TTL value of the data table. For more information, see Specify the TTL of a search index.
Use Tablestore SDKs
You can use the following Tablestore SDKs to get started with the TimeSeries model. In this example, Tablestore SDK for Java is used to describe how to use search indexes.
Tablestore SDK for Java: Search Index
Tablestore SDK for Go: Search Index
Tablestore SDK for Python: Search Index
Tablestore SDK for Node.js: Search Index
Tablestore SDK for .NET: Search Index
Tablestore SDK for PHP: Search index
Step 1: Create a search index
You can create a search index to accelerate data queries. When you create a search index, you must add the fields whose data you want to query to the search index. You can configure advanced settings such as the TTL and presorting settings for the search index.
Create a search index by using the default settings
The following sample code provides an example on how to create a search index that consists of the Col_Keyword and Col_Long columns. The value of the Col_Keyword column is of the STRING type and the value of the Col_Long column is of the LONG type. The data in the search index is presorted by primary key and never expires.
private static void createSearchIndex(SyncClient client) {
CreateSearchIndexRequest request = new CreateSearchIndexRequest();
// Specify the name of the data table.
request.setTableName("<TABLE_NAME>");
// Specify the name of the search index.
request.setIndexName("<SEARCH_INDEX_NAME>");
IndexSchema indexSchema = new IndexSchema();
indexSchema.setFieldSchemas(Arrays.asList(
// Specify the name and type of the field.
new FieldSchema("Col_Keyword", FieldType.KEYWORD),
new FieldSchema("Col_Long", FieldType.LONG)));
request.setIndexSchema(indexSchema);
// Call the client to create the search index.
client.createSearchIndex(request);
}
Configure the presorting settings when you create a search index
The following sample code provides an example on how to create a search index with the indexSort parameter specified. The search index consists of the following columns: Col_Keyword, Col_Long, Col_Text, and Timestamp. The value of the Col_Keyword column is of the STRING type, the value of the Col_Long column is of the LONG type, the value of the Col_Text column is of the TEXT type, and the value of the Timestamp column is of the LONG type. The data in the search index is presorted based on the Timestamp column.
private static void createSearchIndexWithIndexSort(SyncClient client) {
CreateSearchIndexRequest request = new CreateSearchIndexRequest();
// Specify the name of the data table.
request.setTableName("<TABLE_NAME>");
// Specify the name of the search index.
request.setIndexName("<SEARCH_INDEX_NAME>");
IndexSchema indexSchema = new IndexSchema();
indexSchema.setFieldSchemas(Arrays.asList(
new FieldSchema("Col_Keyword", FieldType.KEYWORD),
new FieldSchema("Col_Long", FieldType.LONG),
new FieldSchema("Col_Text", FieldType.TEXT),
new FieldSchema("Timestamp", FieldType.LONG)
.setEnableSortAndAgg(true)));
// Presort data based on the Timestamp column.
indexSchema.setIndexSort(new Sort(
Arrays.<Sort.Sorter>asList(new FieldSort("Timestamp", SortOrder.ASC))));
request.setIndexSchema(indexSchema);
// Call the client to create the search index.
client.createSearchIndex(request);
}
Specify the TTL when you create a search index
The TTL feature of a search index is mutually exclusive with the UpdateRow operation on the data table for which the search index is created. Before you specify the TTL of a search index, make sure that the Allow Updates parameter is set to No for the data table for which the search index is created.
The following sample code provides an example on how to create a search index with the TTL specified. The search index consists of the Col_Keyword and Col_Long columns. The value of the Col_Keyword column is of the STRING type and the value of the Col_Long column is of the LONG type. The TTL of the search index is seven days.
// Use Tablestore SDK for Java V5.12.0 or later to create a search index.
public static void createIndexWithTTL(SyncClient client) {
int days = 7;
CreateSearchIndexRequest createRequest = new CreateSearchIndexRequest();
// Specify the name of the data table.
createRequest.setTableName("<TABLE_NAME>");
// Specify the name of the search index.
createRequest.setIndexName("<SEARCH_INDEX_NAME>");
IndexSchema indexSchema = new IndexSchema();
indexSchema.setFieldSchemas(Arrays.asList(
// Specify the name and type of the field.
new FieldSchema("Col_Keyword", FieldType.KEYWORD),
new FieldSchema("Col_Long", FieldType.LONG)));
createRequest.setIndexSchema(indexSchema);
// Specify the TTL for the search index.
createRequest.setTimeToLiveInDays(days);
// Call the client to create the search index.
client.createSearchIndex(createRequest);
}
Step 2: Use the search index to query data
When you use a search index to query data, select the query types based on your business requirements. You can configure the columns to be returned and the method of sorting the returned data.
A search index supports the following query types by using Tablestore SDKs: match all query, term query, range query, prefix query, match query, wildcard query, match phrase query, exists query, terms query, Boolean query, geo query on geo fields, and nested query on nested fields. Geo queries are classified into the following types: geo-distance query, geo-bounding box query, and geo-polygon query.
Query type | Query | Description |
Match all query | This query is used to query the total number of rows in a data table or randomly retrieve multiple rows from a data table. | |
Term query | This query uses full and exact matches to retrieve data from a data table. A term query is similar to a query based on a specific string. If a column of the TEXT type is queried and one of the tokens in a row exactly matches the specified keyword, the row meets the query conditions. | |
Range query | This query retrieves data within the specified range from a data table. If a column of the TEXT type is queried and one of the tokens in a row is within the specified range, the row meets the query conditions. | |
Prefix query | This query retrieves data that contains the specified prefix from a data table. If a column of the TEXT type is queried and one of the tokens in a row contains the specified prefix, the row meets the query conditions. | |
Match query | This query uses approximate matches to retrieve data from a data table. The keyword that you use for a match query and the column values are tokenized based on the analyzer that you specify. Then, a match query is performed based on the tokens. The OR logical operator is used to relate tokens. If the number of tokens in a row that match the tokens in the tokenized keyword reaches the minimum value that you specify, the row meets the query conditions. | |
Wildcard query | This query retrieves data from a data table if the data matches a string that contains one or more wildcard characters. You can use the asterisk (*) and question mark (?) wildcard characters in a string. The asterisk (*) matches a string of any length in, before, or after a term that you want to query. The question mark (?) matches a single character in a specific position. | |
Match phrase query | This query is similar to a match query. A row meets the query conditions only when the order and positions of the tokens in the row match the order and positions of the tokens that are contained in the tokenized keyword. | |
Exists query | This query is also called a NULL query or NULL-value query. This query is used for sparse data to determine whether a column of a row exists. For example, you can query the rows in which the value of the address column is not empty. If a column does not exist in a row or the value of the column is an empty array ("[]"), the column does not exist in the row. | |
Terms query | This query is similar to a term query except that you can specify multiple keywords at the same time. If one of the tokens in a row matches one of the keywords, the row meets the query conditions. | |
Boolean query | This query retrieves data based on one or more subqueries. Tablestore returns the rows that meet the conditions of the subqueries. Subquery conditions can be combined by using logical operators, such as AND, NOT, and OR. | |
Geo-distance query | You can specify a circular geographic area that consists of a central point and radius as a query condition. Tablestore returns the rows in which the value of the specified column falls within the circular geographic area. | |
Geo-bounding box query | You can specify a rectangular geographic area as a query condition. Tablestore returns the rows in which the value of the specified column falls within the rectangular geographic area. | |
Geo-polygon query | You can specify a polygon geographic area as a query condition. Tablestore returns the rows in which the value of the specified column falls within the polygon geographic area. | |
Nested query | This query retrieves data of nested fields. |
Step 3: Use the search index to analyze data
If you want to analyze data in a data table, you can call the Search operation to use the aggregate feature. For example, you can use the aggregate feature to query the maximum value, the sum of values, and the number of rows, and group data by field value. For more information, see Aggregation.
Step 4: Use the search index to export data
If you want to obtain all rows that meet the query conditions without sorting the rows, you can call the ParallelScan and ComputeSplits operations to use the parallel scan feature. For more information, see Parallel scan.
Appendix: Manage search indexes
This section describes the operations that you can perform on the search indexes that are created.
Query the details of a search index
The following sample code provides an example on how to query the details of a search index, such as the time to live (TTL), creation time, synchronization status, and field information:
private static DescribeSearchIndexResponse describeSearchIndex(SyncClient client) {
DescribeSearchIndexRequest request = new DescribeSearchIndexRequest();
// Specify the name of the data table.
request.setTableName("<TABLE_NAME>");
// Specify the name of the search index.
request.setIndexName("<INDEX_NAME>");
DescribeSearchIndexResponse response = client.describeSearchIndex(request);
// Display the details of the response.
System.out.println(response.jsonize());
// Display the synchronization status of the data in the search index.
System.out.println(response.getSyncStat().getSyncPhase().name());
return response;
}
Queries a list of search indexes
The following sample code provides an example on how to list search indexes that are created for a table:
private static List<SearchIndexInfo> listSearchIndex(SyncClient client) {
ListSearchIndexRequest request = new ListSearchIndexRequest();
// Specify the name of the data table.
request.setTableName("<TABLE_NAME>");
// Query all search indexes that are created for the data table.
return client.listSearchIndex(request).getIndexInfos();
}
Change the TTL of a search index
// Use Tablestore SDK for Java V5.12.0 or later to create a search index.
public void updateIndexWithTTL(SyncClient client) {
int days = 7;
UpdateSearchIndexRequest updateSearchIndexRequest = new UpdateSearchIndexRequest(tableName, indexName);
// Change the TTL of a search index.
updateSearchIndexRequest.setTimeToLiveInDays(days);
client.updateSearchIndex(updateSearchIndexRequest);
}
Delete a search index
The following sample code provides an example on how to delete a search index that is created for the specified data table:
private static void deleteSearchIndex(SyncClient client) {
DeleteSearchIndexRequest request = new DeleteSearchIndexRequest();
// Specify the name of the data table.
request.setTableName("<TABLE_NAME>");
// Specify the name of the search index.
request.setIndexName("<INDEX_NAME>");
// Call the client to delete the search index.
client.deleteSearchIndex(request);
}
FAQ
References
You can also use search indexes in the Tablestore console or Tablestore CLI. For more information, see Use the Tablestore console or Use the Tablestore CLI.
If you call the Search operation to query data, you can sort or paginate the rows that meet the query conditions by using the sorting and paging features. For more information, see Sorting and paging.
If you call the Search operation to query data, you can use the collapse (distinct) feature to collapse the result set based on a specific column. This way, data of the specified type appears only once in the query results. For more information, see Collapse (distinct).
If you want to perform a full-text search, you must tokenize the field for which tokenization can be performed and select a suitable query method to query data. For more information, see Tokenization.
If you want to store and query data in multiple logical relationships, you can store the data as a nested field and perform a nested query to query the data. For more information, see ARRAY and Nested field types and Nested query.
If you want the system to automatically delete the data that is retained in a search index for a period of time that exceeds the specified duration, you can specify the TTL of the search index. For more information, see Specify the TTL of a search index.
If you want to add indexed columns to, update indexed columns in, or remove indexed columns from a search index, you can dynamically modify the schema of the search index. For more information, see Dynamically modify schemas.
If you want to query new fields or data of new field types without modifying the storage schema and the data in data tables, you can use the virtual column feature. For more information, see Virtual columns.