Tablestore: Create a search index

Last Updated: Oct 15, 2024

You can call the CreateSearchIndex operation to create one or more search indexes for a data table. When you create a search index, you add the fields that you want to query to the search index. You can also configure advanced settings, such as the routing key and presorting.

Prerequisites

  • An OTSClient instance is initialized. For more information, see Initialize an OTSClient instance.

  • A data table for which the maxVersions parameter is set to 1 is created. For more information, see Create a data table. In addition, the timeToLive parameter of the data table must meet one of the following conditions:

    • The timeToLive parameter of the data table is set to -1, which specifies that the data in the data table never expires.

    • The timeToLive parameter of the data table is set to a value other than -1, and update operations on the data table are prohibited.
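The data table conditions above can be checked before you call CreateSearchIndex. The following is a minimal sketch, not part of the SDK: checkSearchIndexPrerequisites is a hypothetical helper, and the property names maxVersions, timeToLive, and allowUpdate on the plain object are assumptions made for illustration.

```javascript
// Hypothetical helper: returns a list of unmet prerequisites for a data table.
// The tableOptions object and its property names are assumptions of this sketch.
function checkSearchIndexPrerequisites(tableOptions) {
    const problems = [];
    if (tableOptions.maxVersions !== 1) {
        problems.push("maxVersions must be 1");
    }
    const neverExpires = tableOptions.timeToLive === -1;
    // If data can expire, update operations on the data table must be prohibited.
    if (!neverExpires && tableOptions.allowUpdate !== false) {
        problems.push("timeToLive must be -1, or update operations must be prohibited");
    }
    return problems;
}

// A table that never expires and keeps one version meets the prerequisites.
console.log(checkSearchIndexPrerequisites({ maxVersions: 1, timeToLive: -1, allowUpdate: true }));
// A table with three versions and a finite TTL that still allows updates does not.
console.log(checkSearchIndexPrerequisites({ maxVersions: 3, timeToLive: 86400, allowUpdate: true }));
```

An empty array means that both conditions are met.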

Usage notes

  • The data types of the fields in a search index must match the data types of the fields in the data table for which the search index is created. For more information, see Data type mappings.

  • To set the timeToLive parameter of a search index to a value other than -1, make sure that the UpdateRow operation is prohibited on the data table for which the search index is created. The value of the timeToLive parameter for the search index must be less than or equal to the value of the timeToLive parameter for the data table. For more information, see Specify the TTL of a search index.

Parameters

When you create a search index, you must configure the tableName, indexName, and schema parameters. In the schema parameter, you configure the fieldSchemas, indexSetting, and indexSort parameters. The following table describes the parameters.

Parameter

Description

tableName

The name of the data table.

indexName

The name of the search index.

fieldSchemas

The list of field schemas. In each field schema, configure the following parameters:

  • fieldName (required): the name of the field in the search index. The value is used as a column name. Type: String.

    A field in a search index can be a primary key column or an attribute column of the data table for which the search index is created.

  • fieldType (required): the type of the field. Specify the type in the TableStore.FieldType.XXX format. For more information, see Data type mappings.

  • index (optional): specifies whether to enable indexing. Type: Boolean.

    Default value: true. A value of true specifies that Tablestore indexes the field with an inverted indexing schema or a spatio-temporal indexing schema. A value of false specifies that Tablestore does not enable indexing for the field.

  • analyzer (optional): the type of the analyzer that you want to use. You can configure this parameter only if you set the fieldType parameter to Text. If you do not configure this parameter, single-word tokenization is used by default. For more information, see Tokenization.

  • analyzerParameter (optional): the settings of the analyzer. Configure this parameter based on the type of analyzer that you specified. For more information, see Tokenization. If you configure the analyzer parameter for a field, you must configure this parameter for the field.

  • enableSortAndAgg (optional): specifies whether to enable sorting and aggregation. Type: Boolean.

    Sorting can be enabled only for fields for which the enableSortAndAgg parameter is set to true. For more information, see Sorting and paging.

    Important

    Nested fields do not support sorting and aggregation. The subfields of Nested fields support sorting and aggregation.

  • store (optional): specifies whether to store the value of the field in the search index. Type: Boolean.

    If you set the store parameter to true, you can read the value of the field from the search index without the need to query the data table. This improves query performance.

  • isAnArray (optional): specifies whether the value is an array. Type: Boolean.

    If you set this parameter to true, the field stores data as an array. Data written to the field must be a JSON array. Example: ["a","b","c"].

    The values of Nested fields are arrays. If you set the fieldType parameter to Nested, ignore this parameter.

  • fieldSchemas (optional): the list of field schemas for subfields. If the field is a Nested field, you must specify this parameter to configure the index types of subfields in the Nested field.

  • isVirtualField (optional): specifies whether the field is a virtual column. Type: Boolean. Default value: false. If you set this parameter to true, you can use virtual columns. For more information, see Virtual columns.

  • sourceFieldName (optional): the name of the source field to which the virtual column is mapped in the data table. Type: String. If you set the isVirtualField parameter to true, you must configure this parameter.

  • dateFormats (optional): the format of dates. Type: String. If you set the fieldType parameter to Date, you must configure this parameter. For more information, see Date data type.

  • enableHighlighting (optional): specifies whether to enable the highlight feature. Type: Boolean. Default value: false. The default value specifies that the highlight feature is disabled. If you set this parameter to true, you can use the highlight feature. Only Text fields support the highlight feature.

    Important
    • You can enable the highlight feature only by using Tablestore SDKs.

    • Tablestore SDK for Node.js V5.5.0 and later support the highlight feature.

  • vectorOptions (optional): specifies the properties of Vector fields. If you set the fieldType parameter to Vector, you must configure this parameter. A Vector field contains the following properties:

    • dataType: the vector type. Only float32 is supported. If you want to use other types of Vector data, submit a ticket.

    • dimension: the vector dimension. For information about the limits on the number of vector dimensions, see Search index limits.

    • metricType: the algorithm that you want to use to measure the distance between vectors. Valid values: euclidean, cosine, and dot_product.

      • euclidean: the Euclidean distance algorithm that measures the shortest path between two vectors in a multi-dimensional space. For better performance, the Euclidean distance algorithm in Tablestore does not perform the final square root calculation. A greater value that is obtained by using the Euclidean distance algorithm indicates a higher similarity between two vectors.

      • cosine: the cosine similarity algorithm that calculates the cosine of the angle between two vectors in a vector space. A greater value that is obtained by using the cosine similarity algorithm indicates a higher similarity between two vectors. In most cases, the algorithm is used to calculate the similarity between text data.

      • dot_product: the dot product algorithm that multiplies the corresponding coordinates of two vectors of the same dimension and adds the products. A greater value that is obtained by using the dot product algorithm indicates a higher similarity between two vectors.

      For more information, see Appendix: distance measurement algorithms for vectors.
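To make the three metricType values concrete, the following sketch computes the underlying quantities in plain JavaScript. It is illustrative only: the function names are hypothetical, and the code returns the raw squared Euclidean distance, cosine, and dot product. How Tablestore converts these quantities into similarity scores is covered in the appendix referenced above and is not reproduced here.

```javascript
// Sum of the products of corresponding coordinates (dot_product).
function dotProduct(a, b) {
    if (a.length !== b.length) {
        throw new Error("vectors must have the same dimension");
    }
    return a.reduce((sum, x, i) => sum + x * b[i], 0);
}

// Euclidean distance without the final square root (euclidean).
function squaredEuclidean(a, b) {
    if (a.length !== b.length) {
        throw new Error("vectors must have the same dimension");
    }
    return a.reduce((sum, x, i) => sum + (x - b[i]) ** 2, 0);
}

// Cosine of the angle between two non-zero vectors (cosine).
function cosineSimilarity(a, b) {
    return dotProduct(a, b) / (Math.sqrt(dotProduct(a, a)) * Math.sqrt(dotProduct(b, b)));
}

const u = [1, 2, 3, 4];
const v = [4, 3, 2, 1];
console.log("dot product:", dotProduct(u, v));
console.log("squared Euclidean distance:", squaredEuclidean(u, v));
console.log("cosine:", cosineSimilarity(u, v));
```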

indexSetting

The settings of the search index, including the settings of the routingFields parameter.

routingFields (optional): the custom routing fields. You can specify specific primary key columns as routing fields. Tablestore distributes data that is written to a search index across different partitions based on the specified routing fields. Data with the same routing field values is distributed to the same partition.
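Because only primary key columns can be used as routing fields, a quick check before the call can catch mistakes early. The following is a minimal sketch: findInvalidRoutingFields is a hypothetical helper, and the list of primary key column names is assumed to be known to the caller.

```javascript
// Hypothetical helper: returns the routing fields that are not primary key columns.
function findInvalidRoutingFields(routingFields, primaryKeyNames) {
    const primaryKeys = new Set(primaryKeyNames);
    return routingFields.filter((name) => !primaryKeys.has(name));
}

// Assumed primary key columns of the data table, for illustration only.
const primaryKeyNames = ["pic_id", "count"];

console.log(findInvalidRoutingFields(["count", "pic_id"], primaryKeyNames));
console.log(findInvalidRoutingFields(["pic_id", "pic_description"], primaryKeyNames));
```

An empty array means that every routing field is a primary key column.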

indexSort

The presorting settings of the search index, including the settings of the sorters parameter. If you do not configure the indexSort parameter, data is sorted by primary key.

Note

If the search index contains a field whose fieldType parameter is set to Nested, you cannot configure the indexSort parameter.

sorters: This parameter is required and specifies the presorting method for the search index. Valid values: PrimaryKeySort and FieldSort. For more information, see Sorting and paging.

  • PrimaryKeySort: sorts data by primary key. If you set the sorters parameter to PrimaryKeySort, you must configure the following parameter:

    order: the sorting order. Data can be sorted in ascending or descending order. Default value: TableStore.SortOrder.SORT_ORDER_ASC. This specifies that data is sorted in ascending order.

  • FieldSort: sorts data by field value. If you set the sorters parameter to FieldSort, you must configure the following parameters:

    Note

    Only fields for which an index is created and the enableSortAndAgg parameter is set to true can be presorted.

    • fieldName: the name of the field that is used to sort data.

    • order: the sorting order. Data can be sorted in ascending or descending order. Default value: TableStore.SortOrder.SORT_ORDER_ASC. This specifies that data is sorted in ascending order.

    • mode: the sorting method that is used if the field contains multiple values.
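The presorting rules above can be expressed as a small validation step. The following is a sketch under stated assumptions: validateIndexSort is a hypothetical helper, the sorter shape (primaryKeySort and fieldSort) follows the commented-out indexSort block in the examples in this topic, and the SortOrder object and the string field types are stand-ins for TableStore.SortOrder and TableStore.FieldType so that the code runs without the SDK.

```javascript
// Stand-in for TableStore.SortOrder (assumption, so that the sketch runs without the SDK).
const SortOrder = { SORT_ORDER_ASC: "ASC", SORT_ORDER_DESC: "DESC" };

// Hypothetical helper: returns the reasons why an indexSort configuration is not allowed.
function validateIndexSort(fieldSchemas, indexSort) {
    const errors = [];
    if (fieldSchemas.some((field) => field.fieldType === "NESTED")) {
        errors.push("indexSort cannot be configured for an index that contains a Nested field");
    }
    for (const sorter of indexSort.sorters) {
        if (!sorter.fieldSort) {
            continue; // primaryKeySort does not reference a field
        }
        const name = sorter.fieldSort.fieldName;
        const field = fieldSchemas.find((candidate) => candidate.fieldName === name);
        if (!field) {
            errors.push(`${name}: not defined in fieldSchemas`);
        } else if (field.index === false || field.enableSortAndAgg !== true) {
            // The index parameter defaults to true, so only an explicit false fails.
            errors.push(`${name}: index and enableSortAndAgg must both be enabled`);
        }
    }
    return errors;
}

const exampleFields = [
    { fieldName: "count", fieldType: "LONG", index: true, enableSortAndAgg: true },
    { fieldName: "pic_description", fieldType: "TEXT", index: true, enableSortAndAgg: false },
];
const exampleSort = {
    sorters: [{ fieldSort: { fieldName: "count", order: SortOrder.SORT_ORDER_DESC } }],
};
console.log(validateIndexSort(exampleFields, exampleSort));
```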

timeToLive

(Optional) The retention period of the data in the search index. Unit: seconds. Default value: -1.

If the retention period of data exceeds the value of the timeToLive parameter, the data expires. Tablestore automatically deletes the expired data.

The value of this parameter must be greater than or equal to 86400. A value of 86400 specifies one day. You can also set this parameter to -1, which specifies that data never expires.
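The timeToLive constraints described in this topic can be combined into one check. The following is a minimal sketch: validateIndexTimeToLive is a hypothetical helper, and it assumes that a data table timeToLive of -1 (never expires) places no upper limit on the timeToLive of the search index.

```javascript
const ONE_DAY_SECONDS = 86400;

// Hypothetical helper: checks a search index TTL against the rules in this topic.
// indexTtl and tableTtl are in seconds; -1 means that data never expires.
function validateIndexTimeToLive(indexTtl, tableTtl, updateRowAllowed) {
    if (indexTtl === -1) {
        return { ok: true };
    }
    if (!Number.isInteger(indexTtl) || indexTtl < ONE_DAY_SECONDS) {
        return { ok: false, reason: "timeToLive must be -1 or at least 86400 seconds" };
    }
    if (updateRowAllowed) {
        return { ok: false, reason: "UpdateRow must be prohibited on the data table" };
    }
    // Assumption: a table TTL of -1 imposes no upper limit on the index TTL.
    if (tableTtl !== -1 && indexTtl > tableTtl) {
        return { ok: false, reason: "index timeToLive must not exceed the table timeToLive" };
    }
    return { ok: true };
}

console.log(validateIndexTimeToLive(1000000, 2000000, false));
console.log(validateIndexTimeToLive(1000, -1, false));
```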

Examples

Create a search index with an analyzer type specified

The following sample code provides an example on how to create a search index with an analyzer type specified. In this example, the search index consists of the following fields:

  • pic_id: Keyword

  • count: Long

  • time_stamp: Long

  • pic_description: Text

  • col_vector: Vector

  • pos: Geo-point

  • pic_tag: Nested. The pic_tag field consists of the sub_tag_name subfield of the Keyword type and the tag_name subfield of the Keyword type.

  • date: Date

  • analyzer_single_word: Text, with single-word tokenization as the analyzer type

  • analyzer_split: Text, with delimiter tokenization as the analyzer type

  • analyzer_fuzzy: Text, with fuzzy tokenization as the analyzer type

client.createSearchIndex({
    tableName: "<TABLE_NAME>", // Specify the name of the data table. 
    indexName: "<INDEX_NAME>", // Specify the name of the search index. 
    schema: {
        fieldSchemas: [
            {
                fieldName: "pic_id",
                fieldType: TableStore.FieldType.KEYWORD, // Specify the name and type of the field. 
                index: true, // Enable indexing for the field. 
                enableSortAndAgg: true, // Enable sorting and aggregation for the field. 
                store: false,
                isAnArray: false
            },
            {
                fieldName: "count",
                fieldType: TableStore.FieldType.LONG,
                index: true,
                enableSortAndAgg: true,
                store: true,
                isAnArray: false
            },
            {
                fieldName: "time_stamp",
                fieldType: TableStore.FieldType.LONG,
                index: true,
                enableSortAndAgg: false,
                store: true,
                isAnArray: false,
            },
            {
                fieldName: "pic_description",
                fieldType: TableStore.FieldType.TEXT,
                index: true,
                enableSortAndAgg: false,
                store: true,
                isAnArray: false,
            },
            {
                fieldName: "col_vector",
                fieldType: TableStore.FieldType.VECTOR,
                index: true,
                isAnArray: false,
                vectorOptions: {
                    dataType: TableStore.VectorDataType.VD_FLOAT_32,
                    dimension: 4,
                    metricType: TableStore.VectorMetricType.VM_COSINE,
                }
            },
            {
                fieldName: "pos",
                fieldType: TableStore.FieldType.GEO_POINT,
                index: true,
                enableSortAndAgg: true,
                store: true,
                isAnArray: false,
            },
            {
                fieldName: "pic_tag",
                fieldType: TableStore.FieldType.NESTED,
                index: false,
                enableSortAndAgg: false,
                store: false,
                fieldSchemas: [
                    {
                        fieldName: "sub_tag_name",
                        fieldType: TableStore.FieldType.KEYWORD,
                        index: true,
                        enableSortAndAgg: true,
                        store: false,
                    },
                    {
                        fieldName: "tag_name",
                        fieldType: TableStore.FieldType.KEYWORD,
                        index: true,
                        enableSortAndAgg: true,
                        store: false,
                    }
                ]
            },
            {
                fieldName: "date",
                fieldType: TableStore.FieldType.DATE,
                index: true,
                enableSortAndAgg: true,
                store: true,
                isAnArray: false,
                dateFormats: ["yyyy-MM-dd'T'HH:mm:ss.SSSSSS"],
            },
            {
                fieldName: "analyzer_single_word",
                fieldType: TableStore.FieldType.TEXT,
                analyzer: "single_word",
                index: true,
                enableSortAndAgg: false,
                store: true,
                isAnArray: false,
                analyzerParameter: {
                    caseSensitive: true,
                    delimitWord: false,
                }
            },
            {
                fieldName: "analyzer_split",
                fieldType: TableStore.FieldType.TEXT,
                analyzer: "split",
                index: true,
                enableSortAndAgg: false,
                store: true,
                isAnArray: false,
                analyzerParameter: {
                    delimiter: ",",
                }
            },
            {
                fieldName: "analyzer_fuzzy",
                fieldType: TableStore.FieldType.TEXT,
                analyzer: "fuzzy",
                index: true,
                enableSortAndAgg: false,
                store: true,
                isAnArray: false,
                analyzerParameter: {
                    minChars: 1,
                    maxChars: 5,
                }
            },
        ],
        indexSetting: { // Configure the settings of the search index. 
            "routingFields": ["count", "pic_id"], // You can specify only the primary key columns of the data table as the routing fields of the search index. 
            "routingPartitionSize": null
        },
        // The indexSort parameter is commented out because presorting cannot be configured for search indexes that contain Nested fields. 
        // If you do not configure the indexSort parameter, data is sorted by primary key in ascending order. 
        // indexSort: {
        //     sorters: [
        //         {
        //             primaryKeySort: {
        //                 order: TableStore.SortOrder.SORT_ORDER_ASC
        //             }
        //         },
        //         {
        //             fieldSort: {
        //                 fieldName: "Col_Keyword",
        //                 order: TableStore.SortOrder.SORT_ORDER_DESC // Specify the index sort order. 
        //             }
        //         }
        //     ]
        // },
        timeToLive: 1000000, // Specify the data retention period. Unit: seconds. 
    }
}, function (err, data) {
    if (err) {
        console.log('error:', err);
        return;
    }
    console.log('success:',data);
});
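If you prefer async/await over the callback style shown above, the callback form can be wrapped in a Promise by using util.promisify from Node.js. The following is a sketch, not an SDK feature: createSearchIndexAsync is a hypothetical wrapper, and stubClient is a stand-in object that only mimics the (params, callback) signature used in the sample so that the code runs without a Tablestore instance.

```javascript
const util = require("util");

// Hypothetical wrapper: turns the callback-style call into a Promise.
function createSearchIndexAsync(client, params) {
    return util.promisify(client.createSearchIndex.bind(client))(params);
}

// Stand-in client for illustration only. Replace it with your initialized client.
const stubClient = {
    createSearchIndex(params, callback) {
        if (!params.tableName || !params.indexName) {
            callback(new Error("tableName and indexName are required"));
            return;
        }
        callback(null, { created: params.indexName });
    },
};

createSearchIndexAsync(stubClient, {
    tableName: "<TABLE_NAME>",
    indexName: "<INDEX_NAME>",
    schema: { fieldSchemas: [] },
})
    .then((data) => console.log("success:", data))
    .catch((err) => console.log("error:", err));
```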

Create a search index with the highlight feature enabled

The following sample code provides an example on how to create a search index with the highlight feature enabled. In this example, the search index consists of the following fields: the k field of the Keyword type, the t field of the Text type, and the n field of the Nested type. The n field consists of the following subfields: the nk subfield of the Keyword type, the nl subfield of the Long type, and the nt subfield of the Text type. In addition, the highlight feature is enabled for the t field of the Text type and the nt subfield of the Text type.

client.createSearchIndex({
    tableName: "<TABLE_NAME>", // Specify the name of the data table. 
    indexName: "<SEARCH_INDEX_NAME>", // Specify the name of the search index. 
    schema: {
        fieldSchemas: [
            {
                fieldName: "k",
                fieldType: TableStore.FieldType.KEYWORD, // Specify the name and type of the field. 
                index: true, // Enable indexing for the field. 
                enableSortAndAgg: true, // Enable sorting and aggregation for the field. 
                store: false,
                isAnArray: false
            },
            {
                fieldName: "t",
                fieldType: TableStore.FieldType.TEXT,
                index: true,
                enableSortAndAgg: false,
                enableHighlighting: true, // Enable highlight for the field. 
                store: true,
                isAnArray: false,
            },
            {
                fieldName: "n",
                fieldType: TableStore.FieldType.NESTED,
                index: false,
                enableSortAndAgg: false,
                store: false,
                fieldSchemas: [
                    {
                        fieldName: "nk",
                        fieldType: TableStore.FieldType.KEYWORD,
                        index: true,
                        enableSortAndAgg: true,
                        store: false,
                    },
                    {
                        fieldName: "nl",
                        fieldType: TableStore.FieldType.LONG,
                        index: true,
                        enableSortAndAgg: true,
                        store: false,
                    },
                    {
                        fieldName: "nt",
                        fieldType: TableStore.FieldType.TEXT,
                        index: true,
                        enableSortAndAgg: false,
                        enableHighlighting: true, // Enable highlight for the field. 
                        store: false,
                    },
                ]
            },
        ],
        indexSetting: { // Configure the settings of the search index. 
            "routingFields": ["id"], // You can specify only the primary key columns of the data table as the routing fields of the search index. 
            "routingPartitionSize": null
        },
        // The indexSort parameter is commented out because presorting cannot be configured for search indexes that contain Nested fields. 
        // If you do not configure the indexSort parameter, data is sorted by primary key in ascending order. 
        // indexSort: {
        //     sorters: [
        //         {
        //             primaryKeySort: {
        //                 order: TableStore.SortOrder.SORT_ORDER_ASC
        //             }
        //         },
        //         {
        //             fieldSort: {
        //                 fieldName: "Col_Keyword",
        //                 order: TableStore.SortOrder.SORT_ORDER_DESC // Specify the index sort order. 
        //             }
        //         }
        //     ]
        // },
        timeToLive: 1000000, // Specify the data retention period. Unit: seconds. 
    }
}, function (err, data) {
    if (err) {
        console.log('error:', err);
        return;
    }
    console.log('success:',data);
});
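Because only Text fields support the highlight feature, a schema can be checked before it is submitted. The following is a minimal sketch: findInvalidHighlightFields is a hypothetical helper, and the field types are written as plain strings that stand in for the TableStore.FieldType constants so that the code runs without the SDK.

```javascript
// Hypothetical helper: returns the paths of non-Text fields that enable highlighting.
// Subfields of Nested fields are checked recursively.
function findInvalidHighlightFields(fieldSchemas, parentPath = "") {
    const invalid = [];
    for (const field of fieldSchemas) {
        const path = parentPath ? `${parentPath}.${field.fieldName}` : field.fieldName;
        if (field.enableHighlighting === true && field.fieldType !== "TEXT") {
            invalid.push(path);
        }
        if (Array.isArray(field.fieldSchemas)) {
            invalid.push(...findInvalidHighlightFields(field.fieldSchemas, path));
        }
    }
    return invalid;
}

// Mirrors the field layout of the sample above, with string stand-ins for the field types.
const highlightSchema = [
    { fieldName: "k", fieldType: "KEYWORD" },
    { fieldName: "t", fieldType: "TEXT", enableHighlighting: true },
    {
        fieldName: "n",
        fieldType: "NESTED",
        fieldSchemas: [
            { fieldName: "nk", fieldType: "KEYWORD" },
            { fieldName: "nl", fieldType: "LONG" },
            { fieldName: "nt", fieldType: "TEXT", enableHighlighting: true },
        ],
    },
];
console.log(findInvalidHighlightFields(highlightSchema));
```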

References

  • After you create a search index, you can use the query methods provided by the search index to query data from multiple dimensions based on your business requirements. A search index usually provides the following query methods: term query, terms query, match all query, match query, match phrase query, prefix query, range query, wildcard query, geo query, Boolean query, KNN vector query, nested query, and exists query.

    If you call the Search operation to query data, you can sort or paginate the rows that meet the query conditions by using the sorting and paging features. For more information, see Sorting and paging.

  • If you call the Search operation to query data, you can use the collapse (distinct) feature to collapse the result set based on a specific column. This way, data of the specified type appears only once in the query results. For more information, see Collapse (distinct).

  • You can specify the time to live (TTL) for a search index to delete historical data in the search index or extend the retention period of data in the search index. For more information, see Update information about search indexes.

  • If you want to analyze data in a table, you can call the Search operation to use the aggregation feature or the SQL query feature. For example, you can query the maximum and minimum values, the sum of the values, and the number of rows. For more information, see Aggregation and SQL query.

  • If you want to obtain all rows that meet the query conditions without the need to sort the rows, you can call the ParallelScan and ComputeSplits operations to use the parallel scan feature. For more information, see Parallel scan.

  • You can dynamically modify the schema of a search index to add, update, or remove index columns in the search index. For more information, see Dynamically modify the schema of a search index.

  • You can call the ListSearchIndex operation to query all search indexes that are created for a data table. For more information, see List search indexes.

  • You can call the DescribeSearchIndex operation to query the description of a search index, such as the field information and search index configurations. For more information, see Query the description of a search index.

  • You can delete a search index that you no longer require. For more information, see Delete a search index.