All Products
Search
Document Center

Tablestore:Sorting and paging

Last Updated:Aug 19, 2024

When you use a search index to query data, you can use a predefined sorting method or specify a sorting method. This way, the rows that meet the query conditions are returned based on the order that you predefined or specified. If a large number of rows are included in the response, you can locate the required data by configuring the limit and offset parameters or by using tokens.

Scenarios

Category

Method

Feature

Scenario

Sorting

Predefine a sorting method when you create a search index

Index presorting (IndexSort)

By default, data in a search index is sorted based on the presorting settings that are specified by the IndexSort parameter. The presorting settings that are specified by the IndexSort parameter determine the default order in which the rows that meet the query conditions are returned.

Specify a sorting method when you query data

Sorting based on the BM25-based keyword relevance score (ScoreSort)

You can use ScoreSort to sort query results based on the BM25-based keyword relevance score. ScoreSort is suitable for scenarios such as full-text search.

Sorting based on the primary key value (PrimaryKeySort)

You can use PrimaryKeySort to sort query results based on the primary key value. PrimaryKeySort is suitable for scenarios in which you want to sort data based on the unique identifiers of the data.

Sorting based on the values of one or more columns (FieldSort)

You can use FieldSort to sort query results based on the values of one or more columns. FieldSort is suitable for scenarios in which you want to sort data based on properties such as sales volume or page views. In most cases, FieldSort is used in industries such as e-commerce and social networking and media asset.

Sorting by geographical location (GeoDistanceSort)

You can use GeoDistanceSort to sort query results by geographical location. GeoDistanceSort is suitable for scenarios in which you want to sort data based on the distance from a specific location. In most cases, GeoDistanceSort is used in industries such as mapping and logistics. For example, you can sort restaurants around a location based on the distance from the location.

Paging

Specify a paging method when you query data

Paging based on the limit and offset parameters

If the number of rows in the response is less than 50,000, you can use this method to jump to a page.

Paging based on tokens

If you use this feature, data is returned page by page and you can only page backward. If you want to page forward, you can cache and use a previous token because tokens are valid during the query.

Index presorting

By default, data in a search index is sorted based on the presorting settings that are specified by the IndexSort parameter. When you use a search index to query data, the presorting settings that are specified by the IndexSort parameter determine the default order in which the matched data is returned.

When you create a search index, you can specify presorting settings by configuring the IndexSort parameter. If you do not specify presorting settings, data in the search index is sorted by primary key value.

Important
  • You can specify PrimaryKeySort or FieldSort as the presorting method for a search index. PrimaryKeySort sorts data by primary key value and FieldSort sorts data by field value.

  • Search indexes that contain Nested fields do not support index presorting.

  • If you want to modify the settings of the IndexSort parameter for an existing search index, you can dynamically modify the schema of the search index. For more information, see Dynamically modify the schema of a search index.

Specify a sorting method when you query data

Sorting can be performed only for fields for which the enable_sort_and_agg parameter is set to true.

You can specify a sorting method for each query. Search index-based queries support the following sorting methods. You can also specify multiple sorting methods based on different priorities.

ScoreSort

You can use ScoreSort to sort query results based on the BM25-based keyword relevance score. ScoreSort is suitable for scenarios such as full-text search.

Important

Before you sort the matched data by keyword relevance score, you must configure the parameters for ScoreSort. Otherwise, the matched data is sorted based on the presorting settings that are specified by the IndexSort parameter.

'sort' => array(
    array(
        'score_sort' => array(
            'order' => SortOrderConst::SORT_ORDER_DESC
        )
    ),
)

PrimaryKeySort

You can use PrimaryKeySort to sort query results based on the primary key value.

'sort' => array(
    array(
        'pk_sort' => array(
            'order' => SortOrderConst::SORT_ORDER_ASC
        )
    ),
)

FieldSort

You can use FieldSort to sort the query results based on the values of one or more specific columns.

'sort' => array(
    array(
        'field_sort' => array(
            'field_name' => 'keyword',
            'order' => SortOrderConst::SORT_ORDER_ASC,
            'mode' => SortModeConst::SORT_MODE_AVG,
        )
    ),
)

GeoDistanceSort

You can use GeoDistanceSort to sort query results by geographical location.

'sort' => array(
    array(
        'geo_distance_sort' => array(
            'field_name' => 'geo',
            'order' => SortOrderConst::SORT_ORDER_ASC,
            'distance_type' => GeoDistanceTypeConst::GEO_DISTANCE_PLANE,
            'points' => array('0.6,0.6')
        )
    ),
)

Combination of multiple sorting methods

You can sort query results based on the values of two columns in specific orders to determine the order in which the matched data is returned.

'sort' => array(
    array(
        'field_sort' => array(
            'field_name' => 'keyword',
            'order' => SortOrderConst::SORT_ORDER_ASC,
            'mode' => SortModeConst::SORT_MODE_AVG,
        )
    ),
    array(
        'pk_sort' => array(
            'order' => SortOrderConst::SORT_ORDER_ASC
        )
    ),
)

Specify a paging method

You can configure the limit and offset parameters or use tokens to page the rows in the response.

Configure the limit and offset parameters

If the total number of rows in the response is less than 50,000, you can configure the limit and offset parameters to page the rows. The sum of the values of the limit and offset parameters cannot exceed 50,000. The maximum value of the limit parameter is 100.

Note

For information about how to increase the maximum value of the limit parameter, see How do I increase the value of the limit parameter to 1000 when I call the Search operation of the search index feature to query data?

If you do not specify values for the limit and offset parameters, the default values are used. The default value of the limit parameter is 10. The default value of the offset parameter is 0.

$request = array(
    'table_name' => 'php_sdk_test',
    'index_name' => 'php_sdk_test_search_index',
    'search_query' => array(
        'offset' => 10,
        'limit' => 10,
        'get_total_count' => true,
        'query' => array(
            'query_type' => QueryTypeConst::MATCH_ALL_QUERY
        ),
        'sort' => array(
            array(
                'field_sort' => array(
                    'field_name' => 'keyword',
                    'order' => SortOrderConst::SORT_ORDER_ASC
                )
            ),
        ),
        'token' => null,
    ),
    'columns_to_get' => array(
        'return_type' => ColumnReturnTypeConst::RETURN_SPECIFIED,
        'return_names' => array('col1', 'col2')
    )
);
$response = $otsClient->search($request);

Use a token

We recommend that you use a token for deep paging because this method has no limits on the paging depth.

If Tablestore cannot read full data that meets the query conditions, Tablestore returns next_token. You can use next_token to continue reading the subsequent data.

By default, you can only page backward when you use a token. However, you can cache and use the previous token to page forward because a token is valid during the query.

When you use a token, the sorting method is the same as the method that is used in the previous request. Tablestore sorts data based on the IndexSort parameter by default or based on the method that you specified. You cannot specify the sorting method when you use a token. You cannot configure the offset parameter when you use a token. Data is returned page by page in sequence. This results in a slow query.

Important

Search indexes that contain fields of the Nested type do not support IndexSort. If you require paging and use a search index that contains fields of the Nested type to query data, you must specify the sorting method in the query conditions to return data in the specified order. Otherwise, Tablestore does not return next_token when only part of the data that meets the query conditions is returned.

$request = array(
    'table_name' => 'php_sdk_test',
    'index_name' => 'php_sdk_test_search_index',
    'search_query' => array(
        'offset' => 0,
        'limit' => 10,
        'get_total_count' => true,
        'query' => array(
            'query_type' => QueryTypeConst::FUNCTION_SCORE_QUERY,
            'query' => array(
                'query' => array(
                    'query_type' => QueryTypeConst::TERM_QUERY,
                    'query' => array(
                        'field_name' => 'keyword',
                        'term' => 'keyword'
                    )
                ),
                'field_value_factor' => array(
                    'field_name' => 'long'
                )
            )
        ),
        'sort' => array(
            array(
                'score_sort' => array(
                    'order' => SortOrderConst::SORT_ORDER_DESC
                )
            ),
        )
    ),
    'columns_to_get' => array(
        'return_type' => ColumnReturnTypeConst::RETURN_SPECIFIED,
        'return_names' => array('keyword', 'long')
    )
);

$response = $otsClient->search($request);
print "total_hits: " . $response['total_hits'] . "\n";
print json_encode($response['rows'], JSON_PRETTY_PRINT);

while($response['next_token'] != null) {
    $request['search_query']['token'] = $response['next_token'];
    $request['search_query']['sort'] = null;// If next_token is included in the response, you cannot specify the sorting method because the token includes the sorting information. 
    $response = $otsClient->search($request);
    print json_encode($response['rows'], JSON_PRETTY_PRINT);
}

FAQ

References

  • When you use a search index to query data, you can use the following query methods: term query, terms query, match all query, match query, match phrase query, prefix query, range query, wildcard query, Boolean query, geo query, nested query, and exists query. You can use different query methods to query data from multiple dimensions based on your business requirements.

    If you want to sort or paginate the rows that meet the query conditions, you can use the sorting and paging feature. For more information, see Sorting and paging.

    If you want to collapse the result set based on a specific column, you can use the collapse (distinct) feature. This way, data of the specified type appears only once in the query results. For more information, see Collapse (distinct).

  • If you want to analyze data in a data table, such as obtaining the extreme values, sum, and total number of rows, you can perform aggregation operations or execute SQL statements. For more information, see Aggregation and SQL query.

  • If you want to quickly obtain all rows that meet the query conditions without the need to sort the rows, you can call the ParallelScan and ComputeSplits operations to use the parallel scan feature. For more information, see Parallel scan.