Lindorm provides two types of efficient and easy-to-use indexes: secondary indexes and search indexes. These indexes are individually applicable to queries in which the primary key is not used for matching and multi-dimensional queries. When you use the CREATE INDEX syntax to create an index, you can specify the index type and add attributes to the index.
Applicable engines and versions
The CREATE INDEX syntax is applicable to all versions of LindormTable.
To use the CREATE INDEX syntax to create a search index or columnar index, the Lindorm SQL version must be 2.6.1 or later. For more information about how to view the Lindorm SQL version, see SQL versions.
Syntax
create_index_statement ::= CREATE INDEX [IF NOT EXISTS] [ index_identifier ]
[ USING index_method_definition ]
ON table_identifier '(' index_key_expression ')'
[ INCLUDE include_expression]
[ PARTITION BY partition_definition ]
[ { ASYNC | SYNC} ]
[ WITH '(' index_options ')' ]
index_method_definition ::= { KV | SEARCH | COLUMNAR }
index_key_expression ::= '('
index_key_definition
|wildcard_string_literal
')'
index_key_definition ::= {
column_identifier [ DESC ]
| column_identifier'(' column_options ')'
| function_expression
}
[ (',' index_key_definition)* ]
column_options ::= '('
option_definition (',' option_definition )*
')'
function_expression ::= function_identifier
'('
column_identifer ( ',' column_identifer )*
')'
option_definition ::= option_identifer '=' string_literal
include_expression ::= '('
column_identifier( ',' column_identifier )*
')'
partition_definition ::= {
{RANGE TIME}
'('
column_identifer
')' [ PARTITIONS number_literal ]
|
HASH '('
column_identifer
( ',' column_identifer )*
')' [ PARTITIONS number_literal ]
|
ENUMERABLE
'('
column_identifer
( ',' column_identifer )*
index_options ::= '('
option_definition (',' option_definition )*
')'
Supported parameters
LindormTable allows you to use the CREATE INDEX syntax to create two types of indexes: secondary indexes and search indexes.
The following table describes the parameters that are supported when you create each type of indexes.
Parameter | Secondary index | Search index |
〇 | 〇 | |
〇 | 〇 | |
〇 | ✖️ | |
✖️ | 〇 | |
Index creation method (ASYNC|SYNC) Important Only LindormTable 2.6.3 and later versions support the | 〇 | 〇 |
〇 | 〇 |
Parameters
You can create up to three secondary indexes and one search index for a wide table.
Index type (index_method_definition)
You can specify the type of an index by using the USING
keyword when you create the index. The following table describes the values that you can configure for this parameter when you create different types of indexes supported by LindormTable.
Parameter value | Index type | Description |
KV | Secondary index | By default, if you do not explicitly specify an index type in the Important Only eight secondary indexes can be created for an instance at the same time. If eight secondary indexes are being created when you try to create a secondary index, the index fails to be created. |
SEARCH | Search index | Search indexes are used for full-text queries based on LindormSearch. Search indexes are suitable for complex multi-dimensional query scenarios, such as word segmentation, fuzzy queries, aggregate analysis, and sorting and paging. For more information, see Overview. Search indexes support all basic data types except for DATE, TIME, and DECIMAL. For more information about the supported data types, see Basic data types. Important
|
Index key expression (index_key_expression)
You can define one column or multiple columns as an index key. An index that consists of multiple index keys is also referred as a federated index.
Index key definition (index_key_definition)
You can use this parameter to add attributes for the index key of a search index. You can also use this parameter to specify an index key as a function expression.
Index key attributes for search indexes (option_definition)
You can specify attributes for index keys when you use the ALTER INDEX
syntax to add index columns. For example, you can specify c3(type=text,analyzer=ik)
in the ALTER INDEX statement to create an index for the c3 column and specify that the IK analyzer is used for the c3 column.
The following table describes the index key attributes that are supported by search indexes.
Attribute | Data type | Description |
indexed | STRING | Specifies whether to create an inverted index for the specified column in the index key.
|
rowStored | STRING | Specifies whether to store raw data. Valid values:
|
columnStored | STRING | Specifies whether to use the columnar structure to accelerate data sorting and analysis. Valid values:
|
type | STRING | The data type. When analyzers are used, set this attribute to text for the related fields. In other scenarios, set the value of this attribute to the data type in your wide table by default. Important The type attribute must be configured together with the analyzer attribute. |
analyzer | STRING | The analyzer that you want to use. Valid values:
Important The analyzer attribute must be configured together with the type attribute. |
mapping | STRING | The custom index key attribute. The value of this attribute is a string that represents a JSON object. This attribute is compatible with the syntax of LindormSearch (Compatible with Elasticsearch). Important
|
Secondary index function expression (function_expression)
When you create a secondary index, you can specify the index key as a function expression. The following five functions are supported: Z-ORDER, S2, CAST, MD5, and SHA256.
ImportantYou can specify the index key as an expression that contains the MD5 or SHA256 function only in LindormTable 2.6.7.5 and later versions. If you cannot upgrade LindormTable in the console, contact the technical support (DingTalk ID: s0s3eg3).
Z-ORDER: creates spatio-temporal secondary indexes for one or more spatio-temporal columns. Syntax:
Z-ORDER '(' column_identifier ( ',' column_identifer )* ')'
The column specified by column_identifer must be a spatio-temporal column. For more information about spatio-temporal indexes, see Spatio-temporal indexes.
CAST: creates indexes for a column whose data type is converted. Syntax:
CAST(column_identifier AS type)
In the preceding syntax, type specifies the data type to which the data in the column is converted. For more information, see Basic data types.
S2: creates an S2 secondary index for a column that contains spatio-temporal data. Syntax:
S2 '(' column_identifier, level ')'
The data type of column_identifer must be POLYGON or MULTIPOLYGON. The value range of level is [1,30]. For more information, see S2 indexing function.
MD5: creates an index for the MD5 hash of a column that contains VARCHAR data. Syntax:
MD5(column_identifier)
For more information, see MD5.
SHA256: creates an index for the SHA256 hash of a column that contains VARCHAR data. Syntax:
SHA256(column_identifier)
For more information, see SHA256.
Wildcard constant (wildcard_string_literal)
Only search indexes support the wildcard constant (*).
The wildcard constant (*) can be used to create an index for all columns. Example: CREATE INDEX IF NOT EXISTS idx5 USING SEARCH ON test(*);
.
Columns that are added to the table after the index is created are not automatically added to the search index or columnar index. You must execute the
ALTER INDEX
statement to manually add the columns to the index.Dynamic columns are not contained in indexes. For more information, see Dynamic columns
Included column (include_expression)
Included columns are index columns that are the same as those in the base table. Indexes with Included columns are also referred as covering indexes. A covering index can provide all data required for a query that hits the index without the need to access the base table. This way, the query performance can be ensured.
You can use the WITH
keyword to specify dynamic columns as included columns in secondary indexes by adding table attributes. For more information, see Secondary indexes.
Index partitioning (partition_definition)
Only search indexes support index partitioning.
Index partitioning is an index management policy based on which the server automatically splits and stores data into different partitions and prunes partitions when the data is queried.
Search indexes support range partitioning and hash partitioning. For more information, see Partition index.
Index creation method (ASYNC|SYNC)
When you use the CREATE INDEX
syntax to create an index, you can specify the ASYNC
or SYNC
keyword to specify whether to create the index synchronously or asynchronously.
ASYNC: creates the index asynchronously. If you use this method, the index starts to be created immediately after the
CREATE INDEX
statement is executed. The response to theCREATE INDEX
statement is immediately returned regardless of whether the index is successfully created.SYNC: creates the index synchronously. If you use this method, the index starts to be created immediately after the
CREATE INDEX
statement is executed. No results are returned for theCREATE INDEX
statement until the index is successfully created.
The following table shows the creation methods supported by secondary indexes and search indexes.
Creation method | Secondary indexes | Search indexes |
ASYNC Important In LindormTable 2.6.1 and later versions, the default index creation method for the | 〇 | 〇 |
SYNC Important The SYNC method is supported only in LindormTable 2.6.3 and later versions. | 〇 | 〇 |
Index attributes (index_options)
When you use the CREATE INDEX
syntax to create an index, you can use the WITH
keyword to specify attributes for the index. The following tables describe the attributes supported by different types of indexes.
Secondary index
Attribute | Data type | Description |
COMPRESSION | STRING | The compression algorithm you want to use for the index table. Valid values:
|
INDEX_COVERED_TYPE | STRING | The included columns in the index. Valid values:
You can use the |
STARTKEY | STRING | The start key of the index. Important Timestamp columns and spatio-temporal columns cannot be set to the start key of the index. |
ENDKEY | STRING | The end key of the index. Important Timestamp columns and spatio-temporal columns cannot be set to the end key of the index. |
NUMREGIONS | INTEGER | The estimated number of partitions of the index table. Important This attribute is not supported by Timestamp columns and spatio-temporal columns. |
Search index
Attribute | Data type | Description |
indexState | STRING | The status of the search index. Valid values:
|
numShards | INTEGER | The number of shards. The default value is equal to twice the number of search nodes. |
RANGE_TIME_PARTITION_START | INTEGER | The number of days before you create the index. The first partition is created based on the point in time n days before the point in time when you create the index. The number of days is the value of n. This attribute can be specified in scenarios in which your table contains historical data. If the timestamp of a row of historical data is earlier than the point in time based on which the first partition is created, an error message is returned. Note This attribute is required if you create a partitioned index. |
RANGE_TIME_PARTITION_INTERVAL | INTEGER | The interval at which a new partition is created. For example, If you set Note This attribute is required if you create a partitioned index. |
RANGE_TIME_PARTITION_TTL | INTEGER | The retention period of your data in a partition. For example, if you set Note This attribute is required if you create a partitioned index. |
RANGE_TIME_PARTITION_MAX_OVERLAP | INTEGER | If the timestamp of the data that you write indicates a time in the future, this option specifies the maximum time period between the point in time when data is written and the point in time when you create the index. Unit: days. If you do not specify this attribute, data of up to one day in the future can be written by default. |
RANGE_TIME_PARTITION_FIELD_TIMEUNIT | LONG | The unit of values in the partition key column. Default value: ms. The value ms means milliseconds.
|
RANGE_TIME_PARTITION_CHS | INTEGER | The hot and cold data boundary. Data that has been stored for a period longer than the specified boundary is archived to cold storage. Unit: seconds. For example, if you set this attribute to Note If you do not specify this attribute, the hot and cold data separation feature is not enabled. In this case, all data is stored in hot storage by default. |
INDEX_SETTINGS | STRING | The custom index attribute. The value of this attribute is a string that represents a JSON object. This attribute is compatible with the syntax of LindormSearch (Compatible with Elasticsearch). Important The INDEX_SETTINGS attribute is applicable only to LindormSearch (Compatible with Elasticsearch). |
Examples
In the following examples, a base table named test is created by executing the following statement:
CREATE TABLE test (
p1 VARCHAR NOT NULL,
p2 INTEGER NOT NULL,
c1 BIGINT,
c2 DOUBLE,
c3 VARCHAR,
c4 TIMESTAMP,
c5 GEOMETRY(POINT),
PRIMARY KEY(p1, p2)
) WITH (CONSISTENCY = 'strong', MUTABILITY='MUTABLE_LATEST');
Examples related to secondary indexes
Create a secondary index asynchronously
By default, if you do not specify the creation method, the index is created asynchronously.
CREATE INDEX idx1 ON test(c1 desc) include(c3,c4) WITH (COMPRESSION='ZSTD');
Verify the result
You can execute the SHOW INDEX FROM test;
statement to check whether the index is created.
Create a federated index
Create a federated index for the c1, c2, and c3 columns.
CREATE INDEX idx1 ON test(c1, c2, c3) include(c4) SYNC WITH ( COMPRESSION ='ZSTD');
Verify the result
You can execute the SHOW INDEX FROM test;
statement to check whether the index is created.
Include all columns in the index
CREATE INDEX idx1 ON test(c4 desc) WITH (INDEX_COVERED_TYPE ='COVERED_ALL_COLUMNS_IN_SCHEMA');
Verify the result
You can execute the SHOW INDEX FROM test;
statement to check whether the index is created.
Include all dynamic columns in the index
CREATE INDEX idx1 ON test(c4 desc) WITH (INDEX_COVERED_TYPE='COVERED_DYNAMIC_COLUMNS');
Verify the result
You can execute the SHOW INDEX FROM test;
statement to check whether the index is created.
Configure the estimated number of partitions of the index table
Set the estimated number of partitions of the index table to 32.
CREATE INDEX idx1 ON test(c4 desc) include(c5,c6) WITH (NUMREGIONS ='32');
Verify the result
You can execute the SHOW INDEX FROM test;
statement to check whether the index is created.
Specify the start and end keys of the index table
Specify that the index table is split into 32 partition in the range from the start key 11111111
to the end key9999999
.
CREATE INDEX idx1 ON test(c3 desc) include(c5,c6) WITH (NUMREGIONS ='32', STARTKEY ='11111111', ENDKEY = '9999999');
Verify the result
You can execute the SHOW INDEX FROM test;
statement to check whether the index is created.
Create a Z-ORDER secondary index
Create a Z-ORDER secondary index for the c5 column that stores POINT data.
CREATE INDEX idx1 ON test(Z-ORDER(c5));
Verify the result
You can execute the SHOW INDEX FROM test;
statement to check whether the index is created.
Create an S2 secondary index
Create an S2 secondary index for the c5 column that stores POINT data. The index can be created only in asynchronous mode.
CREATE INDEX idx1 ON test(S2(c5, 10));
Build the S2 secondary index.
BUILD INDEX s2_idx ON test;
Verify the result
You can execute the SHOW INDEX FROM test;
statement to check whether the index is created.
Convert the data type of the specified column
Create a secondary index for the c3 column after converting its data type to INTEGER.
CREATE INDEX idx1 ON test(CAST(c3 AS INTEGER));
Verify the result
You can execute the SHOW INDEX FROM test;
statement to check whether the index is created.
Examples related to search indexes
Create a secondary index asynchronously
CREATE INDEX IF NOT EXISTS idx2 USING SEARCH ON test(p1, p2, c1, c2, c3);
Verify the result
You can execute the SHOW INDEX FROM test;
statement to check whether the index is created.
Create a search index on all columns
Create a search index for all columns without specifying column attributes. In this case, the default values of all column attributes are retained.
CREATE INDEX IF NOT EXISTS idx2 USING SEARCH ON test('*');
Verify the result
You can execute the SHOW INDEX FROM test;
statement to check whether the index is created.
Add attributes for index keys
Add non-custom attributes for index keys
Create a search index for all columns. If the c3 column is included in the index, set the following attributes for all index columns: type=text, analyzer=ik, indexed=true.
CREATE INDEX IF NOT EXISTS idx2 USING SEARCH ON test('*',c3(type=text,analyzer=ik,indexed=true));
Verify the result
You can execute the
SHOW INDEX FROM test;
statement to check whether the index is created.Add custom attributes for index keys
In this example, a sample table named test is created by executing the following statement:
CREATE TABLE test ( p1 VARCHAR NOT NULL, p2 INTEGER NOT NULL, c1 BIGINT, c2 DOUBLE, c3 VARCHAR, c4 TIMESTAMP, PRIMARY KEY(p1, p2) ) WITH (CONSISTENCY = 'strong', MUTABILITY='MUTABLE_LATEST');
Create a search index for all columns. Configure the following attributes for the c3 column: type=text, analyzer=ik_max_word.
CREATE INDEX IF NOT EXISTS idx2 USING SEARCH ON test('*',c3(mapping='{ "type": "text", "analyzer": "ik_max_word" }'));
ImportantThe mapping attribute is applicable only to LindormSearch (Compatible with Elasticsearch).
If you specify the mapping attribute, other attributes of the index key become invalid.
Verify the result
You can execute the
SHOW INDEX FROM test;
statement to check whether the index is created.
Configure the index status
Create a search index on the specified columns, specify index column attributes, and set the status of the search index to ACTIVE.
CREATE INDEX IF NOT EXISTS idx2 USING SEARCH ON test(c1, c3(type=text,analyzer=ik)) WITH (indexState=ACTIVE,numShards=4);
Verify the result
You can execute the SHOW INDEX FROM test;
statement to check whether the index is created.
Configure custom index attributes
Create a search index and set the status of the search index to ACTIVE. Configure custom attributes to set the number of shards to 4, the compression algorithm to ZSTD, and the interval at which data is updated to 10 seconds.
CREATE INDEX IF NOT EXISTS idx2 USING SEARCH ON test(c1, c3(type=text,analyzer=ik)) WITH (indexState=ACTIVE,INDEX_SETTINGS='{
"index": {
"codec": "zstd",
"refresh_interval": "10s",
"number_of_shards": 4
}
}');
Verify the result
You can execute the SHOW INDEX FROM test;
statement to check whether the index is created.
Configure time-based partitioning
Perform partitioning based on the time column c4. Partitioning is automatically performed every seven days from 30 days ago. By default, partitions are retained for 90 days.
CREATE INDEX IF NOT EXISTS idx2 USING SEARCH ON test (c1, c2, c3, c4)
PARTITION BY RANGE TIME(c4) PARTITIONS 16
WITH (indexState=ACTIVE, RANGE_TIME_PARTITION_START='30', RANGE_TIME_PARTITION_INTERVAL='7', RANGE_TIME_PARTITION_TTL='90', RANGE_TIME_PARTITION_MAX_OVERLAP='90');
Verify the result
You can execute the SHOW INDEX FROM test;
statement to check whether the index is created.