All Products
Search
Document Center

Tablestore:Search index

Last Updated:Aug 19, 2024

After a data table is created, you can use Tablestore CLI to create a search index, obtain a list of search indexes, query information about a search index, use a search index to query, and delete search indexes.

Create a search index

You can create a search index.

Syntax

create_search_index -n search_index_name

Parameter description:

Parameter

Required

Example

Description

-i, --input

No

/tmp/indexschema.json

The JSON file that is used to configure the search index schema.

-n, --name

Yes

search_index

The name of the search index.

-t,--table

No

mytable

The name of the data table.

--ttl

No

-1

The retention period of data. The system deletes data based on a Time-to-live (TTL) mechanism. If the retention period exceeds the TTL, Tablestore automatically deletes the expired data. Default value: -1. Unit: seconds.

A value of -1 indicates that data never expires. If you want to specify a data TTL, the value must be equal to or greater than 86400 seconds (one day).

Important

To use the TTL feature, you must set Allow Updates to No for the data table in which the search index is created. The TTL value of a search index is independent of and must be smaller than or equal to the TTL value of the data table for which the search index is created. For more information, see Specify the TTL of a search index.

Examples

Create a search index named search_index:

create_search_index -n search_index

Enter the index schema as prompted:

 {

    "IndexSetting": {
        "RoutingFields": null
    },
    "FieldSchemas": [
        {
            "FieldName": "gid",
            "FieldType": "LONG",
            "Index": true,
            "EnableSortAndAgg": true,
            "Store": true,
            "IsArray": false,
            "IsVirtualField": false
        },
        {
            "FieldName": "uid",
            "FieldType": "LONG",
            "Index": true,
            "EnableSortAndAgg": true,
            "Store": true,
            "IsArray": false,
            "IsVirtualField": false
        },
        {
            "FieldName": "col2",
            "FieldType": "LONG",
            "Index": true,
            "EnableSortAndAgg": true,
            "Store": true,
            "IsArray": false,
            "IsVirtualField": false
        },
        {
            "FieldName": "col3",
            "FieldType": "TEXT",
            "Index": true,
            "Analyzer": "single_word",
            "AnalyzerParameter": {
                "CaseSensitive": true,
                "DelimitWord": null
            },
            "EnableSortAndAgg": false,
            "Store": true,
            "IsArray": false,
            "IsVirtualField": false
        },
        {
            "FieldName": "col1",
            "FieldType": "KEYWORD",
            "Index": true,
            "EnableSortAndAgg": true,
            "Store": true,
            "IsArray": false,
            "IsVirtualField": false
        },
        {
            "FieldName": "col3V",
            "FieldType": "LONG",
            "Index": true,
            "EnableSortAndAgg": true,
            "Store": true,
            "IsArray": false,
            "IsVirtualField": true,
            "SourceFieldNames": [
                "col3"
            ]
        }
    ]
}

The following table describes the parameters of the search index schema.

Parameter

Required

Description

IndexSetting

No

The setting of the search index, which includes the following parameter:

RoutingFields (optional): The custom routing fields. You can specify some primary key columns as routing fields. Tablestore distributes data that is written to a search index across different partitions based on the specified routing fields. The data records whose routing field values are the same is distributed to the same partition.

FieldSchemas

Yes

The list of field schemas. You can configure the following parameters for each field schema:

  • FieldName (required): The name of the field in the search index. The value is used as the column name. Type: String.

    A column in a search index can be a primary key column or an attribute column of the data table.

  • FieldType (required): The type of the field. For more information, see Data type mappings.

  • IsArray (optional): Specifies whether the value is an array. Type: Boolean.

    If you set this parameter to true, the column stores data as an array. Data written to the column must be a JSON array. Example: ["a","b","c"].

    Nested values are arrays. If you set FieldType to Nested, skip this parameter.

  • Index (optional): Specifies whether to enable indexing for the column. Type: Boolean.

    Default value: true. A value of true indicates that Tablestore creates an inverted index or a spatial index for the column. A value of false indicates that Tablestore does not create an index for the column.

  • Analyzer (optional): The type of the analyzer that you want to use. If FieldType is set to Text, you can configure this parameter. If you do not configure this parameter, the default analyzer type single-word tokenization is used. For more information about tokenization, see Tokenization.

  • AnalyzerParameter: The tokenization configuration when you use single-word tokenization. It includes the CaseSensitive and DelimitWord parameters.

    • CaseSensitive: Specifies whether to enable case sensitivity. Default value: false. If you set the parameter to false, all letters are converted to lowercase letters. If you do not need the system to automatically convert letters to lowercase letters, set the CaseSensitive parameter to true.

    • DelimitWord: Specifies whether to tokenize alphanumeric characters. The default value is false. Letters and numbers that are connected together will not be split. If you want to split numbers and letters, set the DelimitWord parameter to true.

  • EnableSortAndAgg (optional): Specifies whether to enable sorting and aggregation. Type: Boolean.

    Sorting can be enabled only for fields for which EnableSortAndAgg is set to true. For more information, see Sorting and paging.

    Important

    Fields of the Nested type do not support sorting and aggregation, but subcolumns of fields of the Nested type support sorting and aggregation.

  • Store (optional): Specifies whether to store the value of the field in the search index. Type: Boolean.

    If you set the Store parameter to true, you can read the values of the column from the search index without the need to query the data table. This improves query performance.

  • IsVirtualField (optional): Specifies whether the column is virtual. Type: Boolean. Default value: false. This parameter is required only when you use virtual columns. For more information about virtual columns, see Virtual columns.

  • SourceFieldNames (optional): Specifies the name of the source field to which the virtual column is mapped in the data table. Type: String. This parameter is required when IsVirtualField is set to true.

IndexSort

No

The presorting setting for the search index, which includes the Sorters parameter. If no value is specified for the IndexSort parameter, field values are sorted based on primary key by default.

Important

You can skip the presorting settings for the search indexes that contain fields of the Nested type.

Sorters (optional): The indexing pre-sorting methods. It includes the Name and Sorter parameters. You can use the PrimaryKeySort and FieldSort methods. For more information, see Perform sorting and paging.

  • Name: The sorting method. Valid values:

    • PrimaryKeySort indicates sorting by primary key.

    • FieldSort indicates sorting by field values.

      Only fields for which indexing is enabled and enableSortAndAgg is set to true can be presorted.

  • Sorter: This parameter includes the following configurations:

    When Name is set to PrimaryKeySort, only Order is required. When Name is set to FieldSort, FieldName and Order are both required.

    • FieldName: the name of the field that is used to sort data.

    • Order: The sorting order, which can be in the ascending order (ASC) or the descending order (DESC). Default value: ASC.

Obtain a list of search indexes

You can obtain a list of search indexes that are created for the current data table.

Syntax

list_search_index

Parameter description:

Parameter

Required

Example

Description

-a, --all

No

N/A

Specifies that the search indexes of all data tables are displayed.

-d, --detail

No

N/A

Specifies that the details about the search indexes are displayed.

-t,--table

No

mytable

The name of the data table.

Examples

Obtain a list of the search indexes of the current data table:

list_search_index

Sample result:

+-----------+--------------+
| TableName | IndexName    |
+-----------+--------------+
| mytable   | search_index |
+-----------+--------------+

Query information about a search index

You can query information about a search index.

Syntax

describe_search_index -n search_index_name

Parameter description:

Parameter

Required

Example

Description

-n, --name

Yes

search_index

The name of the search index.

-o, --output

No

/tmp/describeinfo.json

Export query results to a JSON or TXT format file on your computer.

-t,--table

No

mytable

The name of the data table.

Examples

Query information about the search index named search_index:

describe_search_index -n search_index

The Index schema parameter in the returned results describes the search index schema, and TimeToLive indicates the data retention period for the search index.

Query data by using a search index

You can query data that meets specified conditions by using a search index and then perform aggregation operations on the returned results.

Syntax

search -n search_index_name --return_all_indexed

Parameter description:

Parameter

Required

Example

Description

-c, --column

No

col1,col2

The columns to return. Separate multiple columns with commas (,).

-n, --name

Yes

search_index

The name of the search index.

-f, --print_format

No

table

The format in which the output results are printed. Default value: table.

--return_all

No

N/A

Specifies to return all columns.

--return_all_indexed

No

N/A

Specifies to return all columns that are indexed in the search indexes.

-t,--table

No

mytable

The name of the data table.

Examples

Query data by using the search index named search_index and return all columns for which indexing is enabled:

search -n search_index --return_all_indexed

Enter the query conditions as prompted by the system:

{
    "Offset": -1,
    "Limit": 10,
    "Collapse": null,
    "Sort": null,
    "GetTotalCount": true,
    "Token": null,
    "Query": {
        "Name": "BoolQuery",
        "Query": {
            "MinimumShouldMatch": null,
            "MustQueries": null,
            "MustNotQueries": null,
            "FilterQueries": null,
            "ShouldQueries": [{
                "Name": "RangeQuery",
                "Query": {
                    "FieldName": "gid",
                    "From": null,
                    "To": 10,
                    "IncludeLower": false,
                    "IncludeUpper": false
                }
            }, {
                "Name": "TermQuery",
                "Query": {
                    "FieldName": "gid",
                    "Term": "77"
                }
            }]
        }
    },
    "Aggregations": [{
        "Name": "avg",
        "Aggregation": {
            "AggName": "agg1",
            "Field": "gid",
            "MissingValue": null
        }
    }]
}

Parameter description:

Parameter

Required

Description

Offset

No

The position from which the current query starts.

Limit

No

The maximum number of rows that you want the current query to return.

If you want the current query to return multiple random rows of data, set limit to a positive integer.

To query only the number of rows that meet the query conditions without returning specific data, set limit to 0.

Collapse

No

The configuration of the collapse parameter. It can be used to collapse the result set of a specified column, so that data of the specified types appears only once in the result set. This ensures diversity in the result set. For more information, see Collapse (distinct).

FieldName: The name of the column based on which the result set is collapsed. Only columns whose values are of the INTEGER, FLOATING-POINT, and KEYWORD data types are supported.

Sort

No

The sorting method of the results. For more information, see the "Specify a sorting method when you query data" section of Sorting and paging.

If a sorting method is not specified in the query, the query results are returned by the presorting method. If the presorting method is not specified when the search index is created, the query results are sorted by the primary key.

GetTotalCount

No

Specifies whether to return the total number of rows that meet the query conditions. The default value of this parameter is false, which indicates that the total number of rows that meet the query conditions is not returned.

If this parameter is set to true, the query performance is compromised. To obtain the total number of matched rows, set GetTotalCount to true.

Token

No

If Tablestore cannot read all data that meets the query conditions, Tablestore returns Token. You can use the Token value to continue to read the subsequent data. When you read data for the first time, set the token value to null.

Query

Yes

The query method. The following query methods are supported: match all query (MatchAllQuery), match query (MatchQuery), match phrase query (MatchPhraseQuery), term query (TermQuery), terms query (TermsQuery), and prefix query (PrefixQuery). For more information, see Use Tablestore SDKs.

Aggregations

No

The configuration of aggregation operations. You can perform aggregation operations to obtain the minimum value, maximum value, sum, average, and count and distinct count of rows. For more information, see Aggregation.

  • Name: The aggregation operation type, such as min, max, sum, avg, and count.

  • Aggregation: The aggregation parameters. It includes the following parameters:

    • AggName: The custom name for the aggregation that is used to differentiate aggregations.

    • Field: The field used for aggregation.

    • MissingValue: The default value of a field when it is empty in a record. If MissingValue is not set, the record is ignored during the aggregation. If MissingValue is set, the set value is used as the value of the field for aggregation.

Delete a search index

You can delete a search index that you no longer need.

Syntax

drop_search_index -n search_index_name -y

Parameter description:

Parameter

Required

Example

Description

-n, --name

Yes

search_index

The name of the search index.

-t,--table

No

mytable

The name of the data table.

-y, --yes

Yes

N/A

Specifies that the confirmation information is displayed. This parameter is required.

Examples

Delete the search index named search_index:

drop_search_index -n search_index -y