You may need to add, update, or delete index columns in a search index, or change its routing key or presorting method, to meet new business requirements or optimize performance. In these cases, you can dynamically modify the schema of the search index. To do so, modify the schema of the source index to create a canary index, wait until all table data is synchronized to the canary index, set the weight of query traffic that is allocated to the canary index to perform A/B testing, switch the schemas of the source index and the canary index, and then delete the canary index.
Feature overview
Data tables of Tablestore are schema-free. However, search indexes have rigid schemas. When you create a search index, you must specify the columns that you want to add to the search index. You can then query these columns by using the search index. To adapt to business changes or optimize performance, you may need to modify the schema of a search index. When you modify the schema of a search index, you can perform the following operations:
Add index columns: You can add index columns if your business requires more columns for queries.
Update index columns: You can modify the data type, virtual column, array, and analyzer settings of index columns.
Delete index columns: You can delete index columns that no longer need to be queried.
Change the routing key: You can specify a proper routing key to reduce read workloads and improve query efficiency.
Change the presorting method: If the presorting method is the same as the method that you want to use to sort the query results, the rows that meet the query conditions are returned in an efficient manner. You can change the presorting method to accelerate data queries.
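To illustrate why a routing key can reduce read workloads, the following toy model places rows into partitions based on the value of a routing field. It is a minimal sketch and not Tablestore code: the hash function, the partition count, and the names `partition_for`, `build_partitions`, `user_id`, and `order_id` are all assumptions made for this example. The only behavior taken from this topic is that rows that share the same routing key value are stored in the same partition.

```python
import hashlib
from collections import defaultdict

NUM_PARTITIONS = 4  # hypothetical partition count, chosen only for this example


def partition_for(routing_values, num_partitions=NUM_PARTITIONS):
    """Map the values of the routing fields to a partition number.

    The hash function is an assumption. The only property that matters here
    is that equal routing values always map to the same partition.
    """
    key = "\x00".join(str(v) for v in routing_values).encode("utf-8")
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


def build_partitions(rows, routing_fields):
    """Distribute rows into partitions based on their routing field values."""
    partitions = defaultdict(list)
    for row in rows:
        values = [row[f] for f in routing_fields]
        partitions[partition_for(values)].append(row)
    return partitions


rows = [
    {"user_id": "u1", "order_id": 1},
    {"user_id": "u2", "order_id": 2},
    {"user_id": "u1", "order_id": 3},
]
partitions = build_partitions(rows, routing_fields=["user_id"])

# A query that supplies the routing value only needs to read one partition
# instead of scanning all of them.
target = partition_for(["u1"])
u1_rows = [r for r in partitions[target] if r["user_id"] == "u1"]
```

In this model, a query for `u1` reads a single partition, which is the effect that a well-chosen routing key aims for.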
Flowchart
The following flowchart shows how to dynamically modify a schema. The procedure does not affect your business. You do not need to change business code.
The following table describes the steps that are required to modify the schema of a search index.
Step | Operation | Description |
1 | Create a canary index | Modify the schema of a search index to create a canary index for the search index. |
2 | Check the index synchronization progress | The data in the data table is automatically synchronized to the canary index. Wait until the existing and incremental data of the data table is synchronized to the canary index and the synchronization progress of the canary index is the same as that of the source index. |
3 | Specify weights for A/B testing | A/B testing allows you to allocate traffic to the source index and the canary index based on proportions and verify the effects of modifications to the schema. Use A/B testing to gradually switch traffic to the canary index and wait until all traffic is switched to the canary index. |
4 | Switch index schemas | After all query traffic is switched to the canary index, switch the schemas of the source index and the canary index. After the switch, the name of the source index is associated with the new schema and the name of the canary index is associated with the old schema. All query traffic is then served by the source index name, which is now associated with the new schema. |
5 | Delete the canary index | You can delete the canary index after you switch the schemas and verify that the new schema is correct. Before you delete the canary index, we recommend that you wait for a period of time, such as one day. |
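The weight-based traffic split in step 3 can be pictured with a small simulation. This is a sketch under stated assumptions and not Tablestore code: the weight is modeled as a percentage from 0 to 100 of queries that go to the canary index, and the names `pick_index` and `simulate` are hypothetical.

```python
import random
from collections import Counter


def pick_index(canary_weight, rng):
    """Return which index serves one query.

    canary_weight is the assumed share of traffic (0 to 100) that is sent to
    the canary index. The remaining traffic goes to the source index.
    """
    if not 0 <= canary_weight <= 100:
        raise ValueError("canary_weight must be between 0 and 100")
    return "canary" if rng.random() * 100 < canary_weight else "source"


def simulate(canary_weight, queries=10_000, seed=0):
    """Count how many simulated queries each index serves."""
    rng = random.Random(seed)
    return Counter(pick_index(canary_weight, rng) for _ in range(queries))


# Gradually raise the weight, as in a canary release.
for weight in (0, 20, 50, 100):
    print(weight, dict(simulate(weight)))
```

With a weight of 0 every query goes to the source index, and with a weight of 100 every query goes to the canary index. Intermediate weights split traffic approximately in proportion.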
Procedure
Go to the Indexes tab.
Log on to the Tablestore console.
In the top navigation bar, select a region and a resource group.
On the Overview page, click the name of the instance that you want to manage or click Manage Instance in the Actions column of the instance.
In the Tables section of the Instance Details tab, click the name of the data table for which you want to manage indexes. On the Manage Table page, click Indexes. You can also click Indexes in the Actions column of the data table.
Create a canary index based on the source index.
On the Indexes tab, click Change Schema in the Actions column of the search index.
In the Reindex dialog box, add, modify, or delete index fields based on your business requirements.
If you want to change the routing key or the presorting method of the search index, turn on Advanced Settings and configure the parameters. The following table describes the parameters.
Parameter
Description
Routing Key
The custom routing fields. You can specify one or more of the primary key columns as routing fields.
When data is written to the index, the location where the data is distributed is determined based on the value of the routing key. The value of the routing key is calculated based on the values of the routing fields. Data records that share the same routing key value are distributed in the same partition. For more information, see How do I use routing fields?
Pre-sorting
The presorting method that is used to sort data in the search index. When you use the search index to query data, the presorting method determines the default order in which the query results are returned. For more information, see Index presorting.
If you want to sort data in the search index based on the primary key, set this parameter to Default.
If you want to sort data in the search index based on field values or a combination of the primary key columns, set this parameter to Custom and perform the following steps to specify a presorting method:
Select Primary Key Pre-sorting or Field Pre-sorting and click Add.
Specify field names and whether to sort the data in the search index in ascending or descending order.
Important: You need to specify field names only if you select Field Pre-sorting.
If you select Custom for Pre-sorting, you can select Primary Key Pre-sorting and Field Pre-sorting at the same time based on your business requirements.
Click OK.
In the Index Comparison message, view the comparison of the routing key, presorting method, and schema between the source index and the canary index. After you confirm that the information is correct, click OK.
View the index synchronization information.
Click the icon before the source index or click the name of the source index.
The system displays the canary index of the source index.
Click Use Gray Index in the Actions column of the canary index.
Important: The existing data synchronization and incremental data synchronization stages are required for the canary index.
If you move the pointer over Use Gray Index in the Actions column before data synchronization is complete, the Yes, but the operation may cause security risks message appears.
If you move the pointer over Use Gray Index in the Actions column after data synchronization is complete, the Yes. The operation is secure message appears. You can proceed to the Use Gray Index dialog box.
In the Use Gray Index dialog box, view the synchronization information of the indexes.
After the synchronization is complete, specify weights to perform A/B testing.
A/B testing allows you to allocate traffic to the source index and the canary index based on proportions and verify the effects of modifications to the schema. You can perform subsequent operations only when all traffic is switched to the canary index.
In the Operations section of the Use Gray Index dialog box, drag the slider to adjust the weights of the source index and the canary index, and then click Set Weight.
In the Set Weight message, view the weight data and the schema comparison information.
After you confirm that the information is correct, click Set Weight.
In the message that appears, click OK.
After all query traffic is switched to the canary index, switch the schemas of the source index and the canary index.
After the switch, the name of the source index is associated with the new schema and the name of the canary index is associated with the old schema. All query traffic is then served by the source index name, which is now associated with the new schema.
In the Operations section of the Use Gray Index dialog box, click Switch Index.
In the Switch Index message, view the comparison of the routing key, presorting method, and schema between the source index and the canary index. After you confirm that the information is correct, click Confirm Switch.
Delete the canary index after you switch the schemas and verify that the new schema is correct. Before you delete the canary index, we recommend that you wait for a period of time, such as one day.
In the Use Gray Index dialog box, click Delete Canary Release.
In the Delete Canary Release dialog box, confirm that the information about the canary index that you want to delete is correct. In the text box that appears, enter the following confirmation text: I confirm that the new index data is synchronized and a canary release is completed for a specific period of time.
Click Yes.
Security
To limit the impact of incorrect operations and minimize the risks of modifying a schema, Tablestore provides a rollback mechanism and switchover notes.
Rollback mechanism
When you dynamically modify the schema of a search index, you can roll back the modification.
After you create a canary index, you can delete the canary index and create a new index if the schema of the canary index does not meet your expectations.
When you perform A/B testing, you can specify weights to gradually switch traffic to the canary index. In this process, you can reset the weights anytime to switch traffic back to the source index if issues occur.
After you switch the schemas of the source index and the canary index, you can cancel the switchover at any time to roll back the schemas if issues occur. Canceling a switchover is the reverse operation of the switchover.
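The last rollback point can be sketched as a swap of the schemas behind two index names: performing the swap a second time restores the original state. This is a toy model and not Tablestore code. The function `switch_schemas` and the index names `orders_index` and `orders_index_canary` are hypothetical.

```python
def switch_schemas(schemas, source_name, canary_name):
    """Return a new name-to-schema mapping with the two schemas exchanged.

    The input mapping is left unchanged so that the state before and after
    the switch can be compared.
    """
    swapped = dict(schemas)
    swapped[source_name] = schemas[canary_name]
    swapped[canary_name] = schemas[source_name]
    return swapped


# Before the switch: the source name has the old schema.
before = {"orders_index": "old schema", "orders_index_canary": "new schema"}

# Switch: the source name is now associated with the new schema.
after_switch = switch_schemas(before, "orders_index", "orders_index_canary")

# Cancel the switch: the same swap applied again restores the original state.
after_cancel = switch_schemas(after_switch, "orders_index", "orders_index_canary")
```

Because the swap is its own inverse, the old schema stays available under the canary index name until the canary index is deleted. This is why waiting before deletion keeps the rollback path open.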
Switchover notes
If you switch traffic to a canary index when the synchronization progress of the canary index is slower than that of the source index, the data you query may not be the latest. In this case, Tablestore determines whether switchover can be performed based on the synchronization status and the last synchronization time of the source index and the canary index.
If the following situations exist, Tablestore determines that switchover can be performed:
The source index is in the full data synchronization stage. The canary index is in the full or incremental data synchronization stage. The synchronization progress of the canary index is the same as that of the source index.
The source index and the canary index are in the incremental data synchronization stage. The last synchronization time of the source index is up to 60 seconds earlier than that of the canary index.
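The two conditions above can be written as a small check. This is a sketch of one possible reading and not the logic that Tablestore actually runs. The types `Stage` and `SyncState`, the function `can_switch`, and the integer `progress` measure are hypothetical. The 60-second condition is interpreted here as a limit on the absolute difference between the two last synchronization times, which is an assumption.

```python
from dataclasses import dataclass
from enum import Enum


class Stage(Enum):
    FULL = "full"
    INCREMENTAL = "incremental"


@dataclass
class SyncState:
    stage: Stage
    progress: int          # hypothetical progress measure, for example a percentage
    last_sync_time: float  # seconds since the epoch

MAX_GAP_SECONDS = 60  # limit taken from the second condition above


def can_switch(source: SyncState, canary: SyncState) -> bool:
    """Return True if a switchover is considered safe in this model."""
    if source.stage is Stage.FULL:
        # Condition 1: the canary index is in the full or incremental stage
        # (every Stage value) and its progress equals that of the source index.
        return canary.progress == source.progress
    if canary.stage is Stage.INCREMENTAL:
        # Condition 2: both indexes are in the incremental stage and their
        # last synchronization times are at most 60 seconds apart (assumed reading).
        gap = abs(source.last_sync_time - canary.last_sync_time)
        return gap <= MAX_GAP_SECONDS
    return False


source = SyncState(Stage.INCREMENTAL, 100, last_sync_time=1_000_000.0)
canary = SyncState(Stage.INCREMENTAL, 100, last_sync_time=999_970.0)
print(can_switch(source, canary))
```

In this model, a canary index that is still in the full synchronization stage while the source index is already in the incremental stage is never eligible, because neither condition covers that combination.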
Billing
You are not charged for building canary indexes and writing data. You are charged for the storage space that the source index and the canary index occupy and the reserved read capacity units (CUs). For more information, see Billable items of search indexes.
References
You can query new fields and data of new field types by modifying the schema of the search index. You can also create a search index and use the virtual column feature. You do not need to modify the table schema. For more information, see Virtual columns.
You can use the following features to query data based on your business requirements: match all query, match query, match phrase query, term query, terms query, prefix query, range query, wildcard query, Boolean query, nested query, geo query, geo-bounding box query, geo-polygon query, exists query, collapse (distinct), fuzzy query, and virtual columns.
You can use the SQL query feature of search indexes to query different types of data in search indexes in an efficient manner. For more information, see Overview, Create mapping tables for search indexes, Query data, Full-text search, Array type in search indexes, NESTED supported in search indexes, and Virtual columns of search indexes.
You can use the aggregation feature of search indexes to perform the following operations: obtain the minimum value, maximum value, sum, average value, count and distinct count of rows, and percentile statistics; group results by field value, range, geographical location, filter, histogram, or date histogram; perform nested queries; and query the rows from the results of an aggregation operation in each group. For more information, see Aggregation.