Use the CreateSearchIndex method to create a search index for a data table. A data table can have multiple search indexes. When you create a search index, add the fields that you want to query to the index. You can also configure advanced options, such as custom routing keys and pre-sorting.
Prerequisites
Initialize the Tablestore client. For more information, see Initialize Tablestore Client.
Create a data table that meets the following conditions:
The max versions is 1.
The time to live (TTL) is -1, or updates to the table are disabled.
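The table conditions above can be written as a small pre-flight check. This is a minimal sketch in plain Python, not part of the Tablestore SDK: the TableSettings class and the meets_index_prerequisites function are hypothetical names introduced for illustration, and the values are assumed to come from your own table configuration.

```python
from dataclasses import dataclass


@dataclass
class TableSettings:
    """Hypothetical container for the table options that matter here."""
    max_versions: int
    time_to_live: int   # in seconds; -1 means the data never expires
    allow_update: bool  # False means updates to the table are disabled


def meets_index_prerequisites(settings: TableSettings) -> bool:
    """Return True if the table satisfies the conditions listed above."""
    # The max versions must be exactly 1.
    if settings.max_versions != 1:
        return False
    # Either the TTL is -1, or updates to the table are disabled.
    return settings.time_to_live == -1 or not settings.allow_update
```

For example, a table with a TTL of one day qualifies only when updates are disabled: meets_index_prerequisites(TableSettings(1, 86400, False)) is True, and the same settings with allow_update=True give False.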
Notes
When you create a search index, the data type of a field in the index must match the data type of the corresponding field in the data table.
If you want to set a specific TTL for a search index (a value other than -1), you must disable the UpdateRow write feature for the data table. The TTL of the search index must be less than or equal to the TTL of the data table. For more information, see Lifecycle management.
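The TTL rule in the note above can be sketched as a validation function. This is an illustration only, not SDK code: index_ttl_is_valid is a hypothetical helper, TTL values are assumed to be in seconds, and a table TTL of -1 is assumed to mean "never expires", so it is treated as larger than any specific index TTL.

```python
NEVER_EXPIRES = -1


def index_ttl_is_valid(index_ttl: int, table_ttl: int, allow_update: bool) -> bool:
    """Check a search index TTL against the rules in the note above."""
    # An index TTL of -1 is not a "specific" TTL, so the note does not apply.
    if index_ttl == NEVER_EXPIRES:
        return True
    # A specific index TTL requires UpdateRow writes to be disabled.
    if allow_update:
        return False
    # Assumption: a table that never expires accepts any index TTL.
    if table_ttl == NEVER_EXPIRES:
        return True
    # Otherwise the index TTL must not exceed the table TTL.
    return index_ttl <= table_ttl
```

With these assumptions, a one-hour index TTL on a table with a two-hour TTL passes only when updates are disabled, and a two-hour index TTL on a one-hour table is rejected.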
Parameters
When you create a search index, specify the table name (table_name), the search index name (index_name), and the index schema (schema). The schema includes field schemas (field_schemas), index settings (index_setting), and index pre-sorting settings (index_sort). The following table describes these parameters.
Component | Description |
table_name | The name of the data table. |
index_name | The name of the search index. |
field_schemas | A list of field_schema objects. Each field_schema contains the following parameters:
|
index_setting | Index settings, which include the routing_fields setting. routing_fields (Optional): Custom routing fields. You can select one or more primary key columns as routing fields. Typically, you only need to set one. If you set multiple routing fields, the system concatenates their values into a single value. When index data is written, the system calculates the data distribution based on the values of the routing fields. Records with the same routing field values are indexed into the same data partition. |
index_sort | Index pre-sorting settings, which include the sorters setting. If you do not set index_sort, the data is sorted by primary key by default. sorters (Required): The pre-sorting method for the index. You can sort by primary key or by field value. For more information about sorting, see Sorting and pagination. Note: Indexes that contain Nested fields do not support index_sort, so no pre-sorting is performed for them. |
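The effect of routing_fields can be pictured with a toy model. The sketch below only illustrates the idea that routing values are concatenated and that equal values land in the same partition. The separator, the hash function, the partition count, and the function names routing_key and partition_for are all assumptions made for this example; the service's actual algorithm is internal and is not described here.

```python
import hashlib


def routing_key(routing_values):
    """Concatenate the routing field values into a single value."""
    # A separator (assumed here) keeps ('ab', 'c') distinct from ('a', 'bc').
    return "\x00".join(str(value) for value in routing_values)


def partition_for(routing_values, partition_count):
    """Map a record's routing values to a partition number (toy model)."""
    digest = hashlib.sha256(routing_key(routing_values).encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an integer, then take the modulus.
    return int.from_bytes(digest[:8], "big") % partition_count
```

In this model, two records that share the same routing value, for example partition_for(['user-1'], 8) called twice, always resolve to the same partition, which is the property the routing_fields description relies on.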
Examples
Specify an analyzer when creating a search index
The following example shows how to specify an analyzer for a Text field when you create a search index. The search index contains six fields: k (Keyword), t (Text), g (Geopoint), ka (Keyword array), la (Long array), and n (Nested). The n field has three sub-fields: nk (Keyword), nl (Long), and nt (Text).
# Assumes the Tablestore SDK for Python is installed and client is an initialized client.
from tablestore import *

def create_search_index(client):
    # A Keyword field. Create an index and enable sorting and statistical aggregation.
    field_a = FieldSchema('k', FieldType.KEYWORD, index=True, enable_sort_and_agg=True)
    # A Text field. Create an index and use single-word tokenization.
    field_b = FieldSchema('t', FieldType.TEXT, index=True, analyzer=AnalyzerType.SINGLEWORD)
    # Alternative: a Text field that uses fuzzy tokenization.
    # field_b = FieldSchema('t', FieldType.TEXT, index=True, analyzer=AnalyzerType.FUZZY, analyzer_parameter=FuzzyAnalyzerParameter(1, 6))
    # Alternative: a Text field that uses a custom separator (a comma) for tokenization.
    # field_b = FieldSchema('t', FieldType.TEXT, index=True, analyzer=AnalyzerType.SPLIT, analyzer_parameter=SplitAnalyzerParameter(","))
    # A Geopoint field. Create an index.
    field_c = FieldSchema('g', FieldType.GEOPOINT, index=True)
    # A Keyword array field. Create an index.
    field_d = FieldSchema('ka', FieldType.KEYWORD, index=True, is_array=True)
    # A Long array field. Create an index.
    field_e = FieldSchema('la', FieldType.LONG, index=True, is_array=True)
    # A Nested field that includes three sub-fields: nk (Keyword), nl (Long), and nt (Text).
    field_n = FieldSchema('n', FieldType.NESTED, sub_field_schemas=[
        FieldSchema('nk', FieldType.KEYWORD, index=True),
        FieldSchema('nl', FieldType.LONG, index=True),
        FieldSchema('nt', FieldType.TEXT, index=True),
    ])
    fields = [field_a, field_b, field_c, field_d, field_e, field_n]
    # Use the PK1 primary key column as the routing field.
    index_setting = IndexSetting(routing_fields=['PK1'])
    # When a search index contains a Nested field, you cannot set index pre-sorting.
    index_sort = None
    # index_sort = Sort(sorters=[PrimaryKeySort(SortOrder.ASC)])
    index_meta = SearchIndexMeta(fields, index_setting=index_setting, index_sort=index_sort)
    client.create_search_index('<TABLE_NAME>', '<SEARCH_INDEX_NAME>', index_meta)

Create a search index and configure vector fields
The following example shows how to create a search index that contains a vector field. The search index contains three fields: col_keyword (Keyword), col_long (Long), and col_vector (Vector). The vector field has four dimensions and uses the dot product as its distance metric.
# Assumes the Tablestore SDK for Python is installed and client is an initialized client.
from tablestore import *

def create_search_index(client):
    index_meta = SearchIndexMeta([
        # A Keyword field (string type). Create an index and enable sorting and statistical aggregation.
        FieldSchema('col_keyword', FieldType.KEYWORD, index=True, enable_sort_and_agg=True),
        # A Long field (numeric type). Create an index.
        FieldSchema('col_long', FieldType.LONG, index=True),
        # A Vector field with 4 dimensions that uses the dot product as the distance metric.
        FieldSchema('col_vector', FieldType.VECTOR,
                    vector_options=VectorOptions(
                        data_type=VectorDataType.VD_FLOAT_32,
                        dimension=4,
                        metric_type=VectorMetricType.VM_DOT_PRODUCT
                    )),
    ])
    client.create_search_index('<TABLE_NAME>', '<SEARCH_INDEX_NAME>', index_meta)

Enable summary and highlighting when creating a search index
The following example shows how to enable summary and highlighting when you create a search index. The search index contains three fields: k (Keyword), t (Text), and n (Nested). The n field has three sub-fields: nk (Keyword), nl (Long), and nt (Text). The summary and highlighting feature is enabled for the t field and the nt sub-field of the n field.
# Assumes the Tablestore SDK for Python is installed and client is an initialized client.
from tablestore import *

def create_search_index0905(client):
    # A Keyword field. Create an index and enable sorting and statistical aggregation.
    field_a = FieldSchema('k', FieldType.KEYWORD, index=True, enable_sort_and_agg=True)
    # A Text field. Create an index, use single-word tokenization, and enable summary and highlighting for the field.
    field_b = FieldSchema('t', FieldType.TEXT, index=True, analyzer=AnalyzerType.SINGLEWORD,
                          enable_highlighting=True)
    # A Nested field that includes three sub-fields: nk (Keyword), nl (Long), and nt (Text).
    # The summary and highlighting feature is enabled for the nt sub-field.
    field_n = FieldSchema('n', FieldType.NESTED, sub_field_schemas=[
        FieldSchema('nk', FieldType.KEYWORD, index=True),
        FieldSchema('nl', FieldType.LONG, index=True),
        FieldSchema('nt', FieldType.TEXT, index=True, enable_highlighting=True),
    ])
    fields = [field_a, field_b, field_n]
    # Use the id primary key column as the routing field.
    index_setting = IndexSetting(routing_fields=['id'])
    # When a search index contains a Nested field, you cannot set index pre-sorting.
    index_sort = None
    # index_sort = Sort(sorters=[PrimaryKeySort(SortOrder.ASC)])
    index_meta = SearchIndexMeta(fields, index_setting=index_setting, index_sort=index_sort)
    client.create_search_index('pythontest', 'pythontest_0905', index_meta)

FAQ
References
After you create a search index, you can perform multi-dimensional data queries using various query types, such as term query, terms query, match all query, match query, match phrase query, prefix query, range query, wildcard query, geo query, Boolean query, vector search, nested query, and column existence query.
When you query data, you can also perform sorting and pagination, highlighting, or collapse (deduplication) operations on the result set.
After you create a search index, you can perform various management operations. These operations include lifecycle management, dynamically modifying the schema, listing search indexes, querying search index descriptions, and deleting a search index.
To perform data analytics, such as finding the maximum or minimum value, calculating a sum, or counting rows, you can use the statistical aggregation feature or the SQL query feature.
To quickly export data when the order of the entire result set is not important, you can use the parallel scan feature.