When you use a search index to query data, you can use a predefined sorting method or specify a sorting method. This way, the rows that meet the query conditions are returned based on the order that you predefined or specified. If a large number of rows are included in the response, you can locate the required data by configuring the limit and offset parameters or by using tokens.
Scenarios
Category | Method | Feature | Scenario |
Sorting | Predefine a sorting method when you create a search index | By default, data in a search index is sorted based on the presorting settings that are specified by the IndexSort parameter. The presorting settings that are specified by the IndexSort parameter determine the default order in which the rows that meet the query conditions are returned. | |
Specify a sorting method when you query data | Sorting based on the BM25-based keyword relevance score (ScoreSort) | You can use ScoreSort to sort query results based on the BM25-based keyword relevance score. ScoreSort is suitable for scenarios such as full-text search. | |
You can use PrimaryKeySort to sort query results based on the primary key value. PrimaryKeySort is suitable for scenarios in which you want to sort data based on the unique identifiers of the data. | |||
Sorting based on the values of one or more columns (FieldSort) | You can use FieldSort to sort query results based on the values of one or more columns. FieldSort is suitable for scenarios in which you want to sort data based on properties such as sales volume or page views. In most cases, FieldSort is used in industries such as e-commerce and social networking and media asset. | ||
You can use GeoDistanceSort to sort query results by geographical location. GeoDistanceSort is suitable for scenarios in which you want to sort data based on the distance from a specific location. In most cases, GeoDistanceSort is used in industries such as mapping and logistics. For example, you can sort restaurants around a location based on the distance from the location. You can specify the mode parameter in FieldSort for multi-value fields, such as Array fields or Nested fields, to determine the elements that are used to sort the query results. | |||
Paging | Specify a paging method when you query data | If the number of rows in the response is less than 50,000, you can use this method to jump to a page. | |
If you use this feature, data is returned page by page and you can only page backward. If you want to page forward, you can cache and use a previous token because tokens are permanently valid during a query. |
Specify a sorting method when you create a search index
By default, data in a search index is sorted based on the value of the IndexSort parameter. When you use a search index to query data, the value of the IndexSort parameter determines the default order in which the matched data is returned.
When you create a search index, you can specify presorting settings by configuring the IndexSort parameter. If you do not specify presorting settings, data in the search index is sorted by primary key.
You can specify PrimaryKeySort or FieldSort as the presorting method for a search index. PrimaryKeySort indicates sorting by primary key and FieldSort indicates sorting by field value.
Search indexes that contain Nested fields do not support index presorting.
Specify a sorting method when you query data
Sorting can be enabled only for fields for which the enableSortAndAgg parameter is set to true.
You can specify a sorting method for each query. Search index-based queries support the following sorting methods. You can also specify multiple sorting methods based on different priorities.
ScoreSort
You can use ScoreSort to sort the query results based on the BM25-based keyword relevance score. ScoreSort is suitable for scenarios such as full-text search.
Before you sort the matched data by keyword relevance score, you must configure the parameters for ScoreSort. Otherwise, the matched data is sorted based on the presorting settings that are specified by the IndexSort parameter.
SearchQuery searchQuery = new SearchQuery();
searchQuery.setSort(new Sort(Arrays.asList(new ScoreSort())));
PrimaryKeySort
You can use PrimaryKeySort to sort the query results based on the value of the primary key.
SearchQuery searchQuery = new SearchQuery();
searchQuery.setSort(new Sort(Arrays.asList(new PrimaryKeySort()))); // Sort the query results in ascending order.
//searchQuery.setSort(new Sort(Arrays.asList(new PrimaryKeySort(SortOrder.DESC)))); // Sort the query result in descending order.
FieldSort
You can use FieldSort to sort the query results based on the values of one or more specific columns.
Sort the query results based on the values of a single column
You can use FieldSort to sort the query results based on the values of a specific column.
SearchQuery searchQuery = new SearchQuery(); searchQuery.setSort(new Sort(Arrays.asList(new FieldSort("col", SortOrder.ASC))));
Sort the query results based on the values of multiple columns
You can also sort query result based on the values of two columns in specific orders to determine the order in which the matched data is returned.
SearchQuery searchQuery = new SearchQuery(); searchQuery.setSort(new Sort(Arrays.asList( new FieldSort("col1", SortOrder.ASC), new FieldSort("col2", SortOrder.ASC))));
Sort the query results based on the values of a supplemental column
If you sort data based on the column values of the Long, Double, or Date type, you can configure the missingField parameter to specify a column of the same data type as a supplemental column that is used when data is missing in the original column.
/** * Sort the query results based on the values of the Col_Long column in descending order. If the data in the Col_Long column of the Long type is missing, you can use the Col_Long_sec column of the same data type as the supplemental column for sorting. */ SearchQuery searchQuery = new SearchQuery(); FieldSort fieldSort = new FieldSort("Col_Long"); // Specify the Col_Long_sec column as the supplemental column for sorting when the data in the Col_Long column is missing. fieldSort.setMissingField("Col_Long_sec"); fieldSort.setOrder(SortOrder.DESC);
Sort the query results based on the specified elements for multi-value fields
You can specify the mode parameter for multi-value fields, such as Array fields or Nested fields, to determine the elements that are used to sort the query results.
You can specify the mode parameter for Array fields to determine the elements that are used to sort the query results.
// The doc1 and doc2 rows contain the field1 column of the Array type. The value of the field1 column in the doc1 row is [2,3] and the value of the field1 column in the doc2 row is [1,3,4]. // You can specify the mode parameter to determine the elements of the Array field in the two rows that are used to sort the two rows. { // If you set the mode parameter to SortMode.MAX, the maximum elements of the Array field are used to sort the rows. In this example, 3 in the doc1 row and 4 in the doc2 row are used to sort the two rows, and the sorting result is doc2 and doc1. FieldSort fieldSort = new FieldSort("field1", SortOrder.DESC); fieldSort.setMode(SortMode.MAX); } { // If you set the mode parameter to SortMode.MIN, the minimum elements of the Array field are used to sort the rows. In this example, 2 in the doc1 row and 1 in the doc2 row are used to sort the two rows, and the sorting result is doc1 and doc2. FieldSort fieldSort = new FieldSort("field1", SortOrder.DESC); fieldSort.setMode(SortMode.MIN); }
You can also specify the mode parameter for Nested fields to determine the elements that are used to sort the child rows.
// The doc1 and doc2 rows contain the field1 column of the Nested type. // The value of the field1 column in the doc1 row is [{"name":"b", "age":1},{"name":"a", "age":7}]. // The value of the field1 column in the doc2 row is [{"name":"a", "age":1},{"name":"c", "age":1},{"name":"d", "age":5}]. { // You can specify the mode parameter to determine the elements of the Nested field in the two rows that are used to sort the child rows. // If you set the mode parameter to SortMode.MAX, the maximum elements of the Nested field are used to sort the rows. In this example, 7 in the doc1 row and 5 in the doc2 row are used to sort the two rows by the age column, and the sorting result is doc1 and doc2. FieldSort fieldSort = new FieldSort("field1.age", SortOrder.DESC); fieldSort.setMode(SortMode.MAX); String path = "field1"; NestedFilter nestedFilter = new NestedFilter(path, QueryBuilders.matchAll().build()); fieldSort.setNestedFilter(nestedFilter); } { // You can specify the mode parameter to determine the elements of the Nested field in the two rows that are used to sort the child rows in which the value of the age column is 1. { // If you set the mode parameter to SortMode.MAX, the maximum elements of the Nested field are used to sort the rows. In this example, "b" in the doc1 row and "c" in the doc2 row are used to sort the two rows by the name column, and the sorting result is doc2 and doc1. FieldSort fieldSort = new FieldSort("field1.name", SortOrder.DESC); fieldSort.setMode(SortMode.MAX); String path = "field1"; NestedFilter nestedFilter = new NestedFilter(path, QueryBuilders.term("field1.age",1).build()); fieldSort.setNestedFilter(nestedFilter); } { // If you set the mode parameter to SortMode.MIN, the minimum elements of the Nested field are used to sort the rows. In this example, "b" in the doc1 row and "a" in the doc2 row are used to sort the two rows by the name column, and the sorting result is doc1 and doc2. FieldSort fieldSort = new FieldSort("field1.name", SortOrder.DESC); fieldSort.setMode(SortMode.MIN); String path = "field1"; NestedFilter nestedFilter = new NestedFilter(path, QueryBuilders.term("field1.age",1).build()); fieldSort.setNestedFilter(nestedFilter); } }
GeoDistanceSort
You can use GeoDistanceSort to sort the query results by geographical location.
SearchQuery searchQuery = new SearchQuery();
// Sort the results based on the distance from the value in the geo column of the Geopoint type to the coordinate pair (0, 0).
Sort.Sorter sorter = new GeoDistanceSort("geo", Arrays.asList("0, 0"));
searchQuery.setSort(new Sort(Arrays.asList(sorter)));
Specify a paging method
You can configure the limit and offset parameters or use tokens to page the rows in the response.
Configure the limit and offset parameters
If the total number of rows in the response is less than 50,000, you can configure the limit and offset parameters to page the rows. The sum of the values of the limit and offset parameters cannot exceed 50,000. The maximum value of the limit parameter is 100.
For information about how to increase the maximum value of the limit parameter, see How do I increase the value of the limit parameter to 1000 when I call the Search operation of the search index feature to query data?
If you do not specify values for the limit and offset parameters, the default values are used. The default value of the limit parameter is 10. The default value of the offset parameter is 0.
SearchQuery searchQuery = new SearchQuery();
searchQuery.setQuery(new MatchAllQuery());
searchQuery.setLimit(100);
searchQuery.setOffset(100);
Use a token
We recommend that you use a token for deep paging because this method has no limits on the paging depth.
If Tablestore cannot read all data that meets the query conditions, Tablestore returns nextToken. You can use nextToken to continue to read the subsequent data.
By default, you can only page backward when you use a token. However, you can cache and use the previous token to page forward because a token is valid during the query.
If you want to persist nextToken or transfer nextToken to the frontend page, you can use Base64 to encode nextToken into a string. Tokens are not strings. If you use new String(nextToken)
to encode a token into a string, information about the token is lost.
When you use a token, the sorting method is the same as the method that is used in the previous request. Tablestore sorts data based on the IndexSort parameter by default or based on the method that you specified. You cannot specify the sorting method when you use a token. You cannot configure the offset parameter when you use a token. Data is returned page by page in sequence. This results in a slow query.
Search indexes that contain Nested fields do not support index presorting. If you require paging and you use a search index that contains Nested fields to query data, you must specify the sorting method in the query conditions to return data in the specified order. Otherwise, Tablestore does not return nextToken when only part of the data that meets the query conditions is returned.
private static void readMoreRowsWithToken(SyncClient client) {
SearchQuery searchQuery = new SearchQuery();
searchQuery.setQuery(new MatchAllQuery());
searchQuery.setGetTotalCount(true);// Specify that the total number of rows that meet the query conditions is returned.
// Specify the name of the data table. Example: sampleTable. Then, specify the name of the search index. Example: sampleSearchIndex. To obtain the name of the search index, you can query the list of search indexes in the Tablestore console or by using Tablestore SDKs. If you query the list of search indexes in the Tablestore console, you can view the list on the Indexes tab of the Manage Table page.
SearchRequest searchRequest = new SearchRequest("<TABLE_NAME>", "<SEARCH_INDEX_NAME>", searchQuery);
SearchResponse resp = client.search(searchRequest);
if (!resp.isAllSuccess()) {
throw new RuntimeException("not all success");
}
List<Row> rows = resp.getRows();
while (resp.getNextToken()!=null) { // If the nextToken value is null in the response, all data is read.
// Query the nextToken value.
byte[] nextToken = resp.getNextToken();
{
// If you need to persist nextToken or transfer nextToken to the frontend page, you can use Base64 to encode nextToken into a string.
// Tokens are not strings. If you use new String(nextToken) to encode a token into a string, information about the token is lost.
String tokenAsString = Base64.toBase64String(nextToken);
// Decode the string into bytes.
byte[] tokenAsByte = Base64.fromBase64String(tokenAsString);
}
// Set the token in this request to the nextToken value in the previous response.
searchRequest.getSearchQuery().setToken(nextToken);
resp = client.search(searchRequest);
if (!resp.isAllSuccess()) {
throw new RuntimeException("not all success");
}
rows.addAll(resp.getRows());
}
System.out.println("RowSize: " + rows.size());
System.out.println("TotalCount: " + resp.getTotalCount());// Specify that the total number of matched rows instead of the number of returned rows is displayed.
}
FAQ
References
When you use a search index to query data, you can use the following query methods: term query, terms query, match all query, match query, match phrase query, prefix query, range query, wildcard query, geo query, Boolean query, KNN vector query, nested query, and exists query. You can use the query methods provided by the search index to query data from multiple dimensions based on your business requirements.
You can sort or paginate rows that meet the query conditions by using the sorting and paging features. For more information, see Perform sorting and paging.
You can use the collapse (distinct) feature to collapse the result set based on a specific column. This way, data of the specified type appears only once in the query results. For more information, see Collapse (distinct).
If you want to analyze data in a data table, you can use the aggregation feature of the Search operation or execute SQL statements. For example, you can obtain the minimum and maximum values, sum, and total number of rows. For more information, see Aggregation and SQL query.
If you want to obtain all rows that meet the query conditions without the need to sort the rows, you can call the ParallelScan and ComputeSplits operations to use the parallel scan feature. For more information, see Parallel scan.