
Tablestore: Match query

Last Updated: Aug 01, 2024

You can use match query (MatchQuery) to query data in a table based on approximate matches. Tablestore tokenizes both the values in TEXT columns and the keywords in your query by using the analyzer type that you specify, and then matches rows based on the resulting tokens. For columns that use fuzzy tokenization, we recommend that you use match phrase query to ensure high performance in fuzzy queries.

Prerequisites

Parameters

Parameter

Description

fieldName

The name of the column that you want to match.

Match query applies to TEXT columns.

text

The keyword that is used to match the value of the column when you perform a match query.

If the column that you want to match is a TEXT column, the keyword is tokenized into multiple tokens based on the analyzer type that you specify when you create the search index. By default, single-word tokenization is performed if you do not specify the analyzer type when you create the search index.

For example, if the column that you want to match is a TEXT column, you set the analyzer type to single-word tokenization, and you use "this is" as a search keyword, you can obtain query results such as "..., this is tablestore", "is this tablestore", "tablestore is cool", "this", and "is".

query

The query type, which is set to matchQuery.

offset

The position from which the current query starts.

limit

The maximum number of rows that you want the current query to return.

To obtain only the number of matched rows, set limit to 0. Tablestore then returns the number of matched rows without any row data.

minimumShouldMatch

The minimum number of matched tokens contained in a column value.

A row is returned only when the value of the fieldName column in the row contains at least the minimum number of matched tokens.

Note

The minimumShouldMatch parameter must be used with the OR logical operator.

operator

The logical operator. By default, OR is used as the logical operator, which specifies that a row meets the query conditions when the column value contains at least the minimum number of tokens.

If you set the operator parameter to AND, the row meets the query conditions only when the column value contains all tokens.

getTotalCount

Specifies whether to return the total number of rows that meet the query conditions. The default value of this parameter is false, which specifies that the total number of rows that meet the query conditions is not returned.

If you set this parameter to true, query performance is reduced.

weight

The weight of the queried field in the calculation of the BM25-based keyword relevance score. This parameter is used in full-text search scenarios. A higher weight produces a higher BM25-based keyword relevance score for the field. The value of this parameter must be a positive floating point number.

This parameter does not affect the number of rows that are returned. However, this parameter affects the BM25-based keyword relevance scores of the query results.

tableName

The name of the data table.

indexName

The name of the search index.

columnsToGet

Specifies the columns to return for each row that meets the query conditions. You can specify the returnAll and columns fields for this parameter.

The default value of the returnAll field is false, which specifies that not all columns are returned. In this case, you can use the columns field to specify the columns that you want to return. If you do not specify the columns that you want to return, only the primary key columns are returned.

If you set the returnAll field to true, all columns are returned.
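The interaction between the text, operator, and minimumShouldMatch parameters can be illustrated with a small stand-alone model. The sketch below does not call Tablestore and is not how the service is implemented: the class MatchQueryModel and its methods are hypothetical, single-word tokenization is approximated by lowercasing the text and splitting on non-alphanumeric characters, and a minimumShouldMatch value of 1 is assumed for the plain OR case. It only reproduces the "this is" example from the text parameter description.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical model of match query semantics; not part of the Tablestore SDK.
final class MatchQueryModel {

    // Assumed approximation of single-word tokenization:
    // lowercase the text and split on anything that is not a letter or digit.
    static Set<String> tokenize(String text) {
        Set<String> tokens = new HashSet<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!token.isEmpty()) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    // Number of distinct keyword tokens that also occur in the column value.
    static int matchedTokens(String columnValue, String keyword) {
        Set<String> matched = tokenize(keyword);
        matched.retainAll(tokenize(columnValue));
        return matched.size();
    }

    // OR operator: at least minimumShouldMatch keyword tokens must be present.
    static boolean matchesOr(String columnValue, String keyword, int minimumShouldMatch) {
        return matchedTokens(columnValue, keyword) >= minimumShouldMatch;
    }

    // AND operator: every keyword token must be present.
    static boolean matchesAnd(String columnValue, String keyword) {
        return matchedTokens(columnValue, keyword) == tokenize(keyword).size();
    }

    public static void main(String[] args) {
        String keyword = "this is";
        List<String> values = Arrays.asList(
                "..., this is tablestore", "is this tablestore",
                "tablestore is cool", "this", "is", "hello world");
        for (String value : values) {
            System.out.printf("%-26s OR(min=1)=%b OR(min=2)=%b AND=%b%n",
                    "\"" + value + "\"",
                    matchesOr(value, keyword, 1),
                    matchesOr(value, keyword, 2),
                    matchesAnd(value, keyword));
        }
    }
}
```

In this model, every value from the "this is" example matches under OR with a minimum of 1, while only the values that contain both tokens match when the minimum is raised to 2 or when AND is used. The behavior of a real search index depends on the analyzer configured for the column.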

Sample code

The following sample code shows how to query the rows in which the value of the Col_Keyword column matches "hangzhou" in a table:

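import com.alicloud.openservices.tablestore.SyncClient;
import com.alicloud.openservices.tablestore.model.search.SearchQuery;
import com.alicloud.openservices.tablestore.model.search.SearchRequest;
import com.alicloud.openservices.tablestore.model.search.SearchResponse;
import com.alicloud.openservices.tablestore.model.search.query.MatchQuery;

/**
 * Query the rows in which the value of the Col_Keyword column matches "hangzhou" in a table. Tablestore returns up to 20 of the matched rows. To also return the total number of matched rows, uncomment the setGetTotalCount call.
 * @param client the Tablestore client that is used to send the search request
 */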
private static void matchQuery(SyncClient client) {
    SearchQuery searchQuery = new SearchQuery();
    MatchQuery matchQuery = new MatchQuery(); // Set the query type to MatchQuery. 
    matchQuery.setFieldName("Col_Keyword"); // Specify the name of the column that you want to query. 
    matchQuery.setText("hangzhou"); // Specify the keyword that you want to match. 
    searchQuery.setQuery(matchQuery);
    searchQuery.setOffset(0); // Set offset to 0. 
    searchQuery.setLimit(20); // Set limit to 20 to return up to 20 rows. 
    //searchQuery.setGetTotalCount(true); // Specify that the total number of matched rows is returned. 

    SearchRequest searchRequest = new SearchRequest("<TABLE_NAME>", "<SEARCH_INDEX_NAME>", searchQuery);
    // You can configure the columnsToGet parameter to specify the columns to return or specify that all columns are returned. If you do not configure this parameter, only the primary key columns are returned. 
    //SearchRequest.ColumnsToGet columnsToGet = new SearchRequest.ColumnsToGet();
    //columnsToGet.setReturnAll(true); // Specify that all columns are returned. 
    //columnsToGet.setColumns(Arrays.asList("ColName1","ColName2")); // Specify the columns that you want to return. 
    //searchRequest.setColumnsToGet(columnsToGet);

    SearchResponse resp = client.search(searchRequest);
    //System.out.println("TotalCount: " + resp.getTotalCount()); // Print the total number of matched rows, not the number of returned rows. 
    System.out.println("Row: " + resp.getRows());
}
            

FAQ

References

  • When you use a search index to query data, you can use the following query methods: term query, terms query, match all query, match query, match phrase query, prefix query, range query, wildcard query, geo query, Boolean query, KNN vector query, nested query, and exists query. You can use the query methods provided by the search index to query data from multiple dimensions based on your business requirements.

    You can sort or paginate rows that meet the query conditions by using the sorting and paging features. For more information, see Perform sorting and paging.

    You can use the collapse (distinct) feature to collapse the result set based on a specific column. This way, data of the specified type appears only once in the query results. For more information, see Collapse (distinct).

  • If you want to analyze data in a data table, you can use the aggregation feature of the Search operation or execute SQL statements. For example, you can obtain the minimum and maximum values, sum, and total number of rows. For more information, see Aggregation and SQL query.

  • If you want to obtain all rows that meet the query conditions without the need to sort the rows, you can call the ParallelScan and ComputeSplits operations to use the parallel scan feature. For more information, see Parallel scan.