If an application needs to use different attributes as query conditions to query data, you can set these attributes as the primary key columns of a secondary index to speed up the query. Tablestore provides the local secondary index and global secondary index features to meet different requirements on read consistency during data queries. This topic describes the terms and features of secondary indexes, the differences between secondary index types, and the usage notes on secondary indexes.
Background information
You can query the data of a data table in Tablestore by specifying the primary key of the data table. However, in most cases, many applications may need to query data by using attributes other than the primary key of the data table. To meet this requirement, you can create one or more secondary indexes and use the index tables to query data.
The secondary index feature allows you to query data based on the primary key of a data table and the index key columns of the secondary indexes that are created for the data table. This improves query efficiency. When you create a secondary index, you need to set the attributes that you want to query as the primary key columns of the index table. Then, you can use the primary key columns of the index table to query data. For more information, see Scenarios.
Terms
Term | Description |
index table | A table that is created based on the indexed columns of a data table. The data in the index table is read-only. |
predefined column | A non-primary key column that is predefined when you create a data table. A predefined column can be used as an index key column or attribute column of an index table. You need to specify the data type of a predefined column when you add the predefined column. Note Tablestore uses a schema-free model. You can write different attribute columns to a row without the need to specify the attribute columns in the table schema. |
single-column index | An index that is created on a single column. |
combined index | An index that is created on multiple columns. For example, a combined index can have index key columns 1 and 2. |
indexed attribute column | A predefined column that is mapped from a data table to an index table as a predefined column of the index table. |
autocomplete | The feature that allows the system to automatically add the primary key columns of the data table that are not specified as index key columns to an index table as the primary key columns of the index table. |
Secondary index types
In Tablestore, secondary indexes are classified into global secondary indexes and local secondary indexes. If you use the global secondary index feature, Tablestore automatically synchronizes data from the indexed columns and primary key columns of a data table for which an index table is created to the index table in asynchronous mode. The synchronization latency is within a few milliseconds.
To meet user requirements on strong consistency during data queries, Tablestore provides the local secondary index feature. If you use the local secondary index feature, Tablestore automatically synchronizes data from the indexed columns and primary key columns of a data table for which an index table is created to the index table in synchronous mode. After data is written to the data table, you can immediately query the data from the index table.
The following table describes the differences between global secondary indexes and local secondary indexes in terms of their synchronization mode, requirements on the first primary key column, synchronization latency, and read consistency.
Item | Global secondary index | Local secondary index |
Synchronization mode | Asynchronous mode | Synchronous mode |
Requirements on the first primary key column | The first primary key column of a global secondary index can be any primary key column or predefined column of the data table for which the secondary index is created. | The first primary key column of a local secondary index must be the same as the first primary key column of the data table for which the secondary index is created. |
Synchronization latency | Within a few milliseconds | Real-time |
Read consistency | Eventual consistency | Strong consistency |
Features
The features of secondary indexes include single-column indexes, combined indexes, index synchronization, covered indexes, indexes that contain the historical data of data tables, and sparse indexes. The following table describes the preceding features.
Feature | Description |
Single-column index and combined index | You can create an index on one or more columns in a data table. |
Index synchronization | Global secondary indexes and local secondary indexes synchronize data in different modes.
|
Covered index | Index tables can contain attribute columns. You can create predefined columns when you create a data table. Then, you can create an index table based on the predefined columns and primary key columns of the data table. You can specify several predefined columns of the data table as attribute columns of an index table. You can also specify no attribute columns for an index table. If you specify several predefined columns of a data table as attribute columns of an index table, you can query the values in the predefined columns from the index table. You do not need to query the data table. |
Index that contains the historical data of a data table | You can create an index table that contains the historical data of a data table. |
Sparse index | You can specify a predefined column of a data table as an attribute column of an index table. If a row in the data table does not contain the predefined column but contains all index key columns, an index is created on the row. However, an index cannot be created on a row if the row does not contain some index key columns. For example, a data table contains primary key columns PK0, PK1, and PK2, and predefined columns Defined0, Defined1, and Defined2. You create an index table that contains index key columns PK0, Defined0, and Defined1, and an attribute column Defined2.
|
Limits
For more information, see Secondary index limits.
Usage notes
Usage notes on creating index tables
The system automatically adds the primary key columns of the data table that are not specified as index key columns to an index table as the primary key columns of the index table. When you scan data in an index table, you must specify the values of primary key columns. The values of primary key columns range from negative infinity to positive infinity.
When you create an index table, you need to only specify the index key columns. The system automatically adds all primary key columns of the data table to the index table. For example, a data table contains primary key columns PK0 and PK1 and a predefined column Defined0.
If you use the global secondary index feature, you can create an index on columns based on your business requirements.
If you create an index on the Defined0 column, Tablestore generates an index table whose primary key columns are Defined0, PK0, and PK1.
If you create an index on the Defined0 and PK1 columns, Tablestore generates an index table whose primary key columns are Defined0, PK1, and PK0.
If you create an index on the PK1 column, Tablestore generates an index table whose primary key columns are PK1 and PK0.
If you use the local secondary index feature, the first primary key column of an index table must be the same as the first primary key column of the data table for which the index table is created.
If you create an index on the PK0 and Defined0 columns, Tablestore generates an index table whose primary key columns are PK0, Defined0, and PK1.
If you create an index on the PK0, PK1, and Defined0 columns, Tablestore generates an index table whose primary key columns are PK0, PK1, and Defined0.
If you create an index on the PK0 and PK1 columns, Tablestore generates an index table whose primary key columns are PK0 and PK1.
You can specify predefined columns of a data table as attribute columns of the index table that is created for the data table based on your query modes and cost requirements.
After you specify a predefined column of a data table as an attribute column of the index table that is created for the data table, you can query the values in the predefined column from the index table. You do not need to query the data table. However, this increases storage costs. If a predefined column of the data table is not specified as an attribute column of the index table, you must query the values in the predefined column from the data table.
When you use the global secondary index feature, specify a column of a data table as the first primary key column of an index table based on your business requirements.
If a column whose value is time or a date is the first primary key column of an index table, the update speed of the index table may decrease. Therefore, we recommend that you do not specify this type of column as the first primary key column of an index table.
We recommend that you hash the column whose value is time or a date and create an index on the hashed column. If you need to perform hashing, use DingTalk to contact Tablestore technical support.
We recommend that you do not specify a column of low cardinality or a column that contains enumerated values as the first primary key column of an index table. For example, if you specify the gender column as the first primary key column of an index table, the horizontal scalability of the index table is limited. As a result, the write performance is compromised.
Usage notes on using index tables
You must comply with the following rules when you write data to a data table for which an index table is created. Otherwise, the data cannot be written to the data table.
You cannot specify a version number for the data that you write to the data table.
Duplicate rows that have the same primary key value cannot exist in a batch write operation.
You can use an index table only to query data. You cannot write data to an index table.
API operations
The following table describes the API operations that Tablestore provides to manage secondary indexes.
API operation | Description |
Creates an index table for an existing data table. Note
| |
Reads a row of data from an index table. | |
Reads data within a specified range from an index table. | |
Deletes an index table from a data table. Important Before you call the DeleteTable operation to delete a data table, you must first delete the index tables that are created on the data table. Otherwise, you cannot delete the data table. |
Use secondary indexes
You can use secondary indexes by using the Tablestore console, Tablestore CLI, or Tablestore SDKs. For more information, see the following topics:
Billing
You are charged storage fees when you use the secondary index feature. In addition, computing resource are consumed when you write data to a data table, create an index table for a data table, or read data. For more information, see Billable items of secondary indexes.
References
If you want to query data in a more efficient and flexible manner, you can use the search index feature. The search index feature provides multiple query methods, including Boolean query, full-text search, prefix query, and fuzzy query. For more information, see Overview.