Tablestore: Wildcard query

Last Updated: Oct 23, 2024

You can use wildcard queries to perform fuzzy queries. A wildcard query is similar to the SQL LIKE operator: it returns rows in which a field value matches a pattern that contains wildcard characters.

Note

If you want to use the NOT LIKE operator, you must use wildcard query together with the mustNotQueries parameter of Boolean query. For more information, see Boolean query.

Feature overview

A wildcard query matches field values against a query string that contains wildcard characters, so you can find values when you know only part of the string.

In a wildcard query, the query string can contain the asterisk (*) and question mark (?) wildcard characters. The asterisk (*) matches a string of any length and can appear at the start, in the middle, or at the end of the query string. The question mark (?) matches a single character in a specific position. A query string can start with an asterisk (*) or a question mark (?). For example, the query string table*e matches tablestore.
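The matching rules described above can be modeled locally with a regular expression. The following is a minimal sketch, not Tablestore's implementation: the WildcardSketch class and its methods are hypothetical names, and the sketch assumes that the query string must match the whole field value and that the asterisk can also match an empty string.

```java
import java.util.regex.Pattern;

public class WildcardSketch {
    // Translate a wildcard query string into a regular expression.
    // '*' becomes ".*" (any run of characters), '?' becomes "." (one character),
    // and every other run of characters is quoted so it is matched literally.
    public static Pattern toRegex(String wildcard) {
        StringBuilder regex = new StringBuilder();
        StringBuilder literal = new StringBuilder();
        for (char c : wildcard.toCharArray()) {
            if (c == '*' || c == '?') {
                if (literal.length() > 0) {
                    regex.append(Pattern.quote(literal.toString()));
                    literal.setLength(0);
                }
                regex.append(c == '*' ? ".*" : ".");
            } else {
                literal.append(c);
            }
        }
        if (literal.length() > 0) {
            regex.append(Pattern.quote(literal.toString()));
        }
        return Pattern.compile(regex.toString(), Pattern.DOTALL);
    }

    // Assumption: the query string must match the entire value, not a substring.
    public static boolean matches(String wildcard, String value) {
        return toRegex(wildcard).matcher(value).matches();
    }

    public static void main(String[] args) {
        System.out.println(matches("table*e", "tablestore"));    // the example from the text
        System.out.println(matches("table?tore", "tablestore")); // '?' stands for the single 's'
        System.out.println(matches("*store", "tablestore"));     // leading wildcard
        System.out.println(matches("table?", "tablestore"));     // '?' covers one character only
    }
}
```

Under these assumptions, the first three calls report a match and the last one does not, because a single question mark cannot cover the five remaining characters.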

The Keyword and FuzzyKeyword types support wildcard query.

  • Keyword: the basic data type of strings. When the Keyword type is used, the performance of fuzzy queries, such as wildcard queries, on medium- and large-sized data is poor. The performance declines when the data size increases.

  • FuzzyKeyword: a data type that is optimized for fuzzy queries, such as wildcard queries. The FuzzyKeyword type provides better and more stable query performance than the Keyword type regardless of the data size. The performance does not decline when the data size increases.

To meet fuzzy query requirements in various scenarios, search indexes provide three types of wildcard query. The following table describes them.

Note

This topic describes wildcard query based on the Keyword and FuzzyKeyword types. For information about fuzzy query based on the Text type, see Fuzzy query.

| Data type | Query method | Advantage | Disadvantage |
| --- | --- | --- | --- |
| Keyword | WildcardQuery | Compatible with Elasticsearch | The query performance declines when the size of index data increases. |
| FuzzyKeyword | WildcardQuery | The performance is good and stable and does not decline when the data size increases. | Data expansion occurs. |
| Text | MatchPhraseQuery | Data of the Text type can be case-insensitive. | Data expansion occurs. |

Usage notes

The query string in a wildcard query on data of the Keyword or FuzzyKeyword type cannot exceed 32 characters in length.
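Because of this limit, it can be convenient to validate a query string on the client before a request is sent. The following is a minimal sketch using only the Java standard library. The WildcardPatternCheck class and its methods are hypothetical names, and counting Unicode code points is an assumption: the source does not state how the service counts characters.

```java
public class WildcardPatternCheck {
    // Limit taken from the usage note above.
    public static final int MAX_PATTERN_LENGTH = 32;

    // Returns true if the query string is non-null and within the limit.
    // Assumption: "characters" are counted as Unicode code points.
    public static boolean isWithinLimit(String pattern) {
        return pattern != null
                && pattern.codePointCount(0, pattern.length()) <= MAX_PATTERN_LENGTH;
    }

    // Returns the query string unchanged, or fails fast before any request is built.
    public static String requireWithinLimit(String pattern) {
        if (!isWithinLimit(pattern)) {
            throw new IllegalArgumentException(
                    "Wildcard query string must be at most " + MAX_PATTERN_LENGTH + " characters");
        }
        return pattern;
    }

    public static void main(String[] args) {
        System.out.println(isWithinLimit("hang*u"));
        System.out.println(isWithinLimit("abcdefghijklmnopqrstuvwxyz0123456")); // 33 characters
    }
}
```

A caller could pass the result of requireWithinLimit to the value parameter of the query so that oversized query strings are rejected locally.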

API operation

To perform a wildcard query, call the Search or ParallelScan operation and set the query type to WildcardQuery.

Parameters

| Parameter | Description |
| --- | --- |
| fieldName | The name of the field that you want to query. |
| value | The string that contains wildcard characters. The string cannot exceed 32 characters in length. |
| query | The type of the query. Set this parameter to WildcardQuery. |
| getTotalCount | Specifies whether to return the total number of rows that meet the query conditions. The default value of this parameter is false, which specifies that the total number of rows that meet the query conditions is not returned. If you set this parameter to true, the query performance is compromised. |
| weight | The weight that you want to assign to the field that you want to query to calculate the BM25-based keyword relevance score. This parameter is used in full-text search scenarios. If you specify a higher weight for the field that you want to query, the BM25-based keyword relevance score for the field is higher. The value of this parameter is a positive floating point number. This parameter does not affect the number of rows that are returned but affects the BM25-based keyword relevance scores of the query results. |
| tableName | The name of the data table. |
| indexName | The name of the search index. |
| columnsToGet | Specifies whether to return all columns of each row that meets the query conditions. You can configure the returnAll and columns fields for this parameter. The default value of the returnAll field is false, which specifies that not all columns are returned. In this case, you can use the columns field to specify the columns that you want to return. If you do not specify the columns that you want to return, only the primary key columns are returned. If you set the returnAll field to true, all columns are returned. |

Methods

You can use the Tablestore console, Tablestore CLI, or Tablestore SDKs to perform a wildcard query. Before you perform a wildcard query, make sure that a search index is created for the data table and that the field that you want to query is of the Keyword or FuzzyKeyword type.

Important

You can use only Tablestore SDKs to perform a wildcard query on data of the FuzzyKeyword type.

Use the Tablestore console

  1. Go to the Indexes tab.

    1. Log on to the Tablestore console.

    2. In the top navigation bar, select a resource group and a region.

    3. On the Overview page, click the name of the instance that you want to manage or click Manage Instance in the Actions column of the instance.

    4. On the Tables tab of the Instance Details tab, click the name of the data table or click Indexes in the Actions column of the data table.

  2. On the Indexes tab, find the search index that you want to use to query data and click Manage Data in the Actions column.

  3. In the Search dialog box, specify query conditions.

    1. By default, the system returns all attribute columns. To return specific attribute columns, turn off All Columns and specify the attribute columns that you want to return. Separate multiple attribute columns with commas (,).

      Note

      By default, the system returns all primary key columns of the data table.

    2. Select the And, Or, or Not logical operator based on your business requirements.

      If you select the And logical operator, data that meets all the query conditions is returned. If you select the Or logical operator, data that meets at least one of the query conditions is returned. If you specify a single query condition, this is the data that meets that condition. If you select the Not logical operator, data that does not meet the query conditions is returned.

    3. Select a field and click Add.

    4. Set the Query Type parameter to WildcardQuery(WildcardQuery) and enter a value that contains wildcard characters.

    5. By default, the sorting feature is disabled. If you want to sort the query results based on specific fields, turn on Sort and specify the fields based on which you want to sort the query results and the sorting order.

    6. By default, the aggregation feature is disabled. If you want to collect statistics on a specific field, turn on Collect Statistics, specify the field based on which you want to collect statistics, and then configure the information that is required to collect statistics.

  4. Click OK.

    Data that meets the query conditions is displayed in the specified order on the Indexes tab.
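The effect of the And, Or, and Not logical operators in the console steps above can be modeled locally with predicate composition. This is an illustrative sketch only: the LogicalOperatorSketch class and its methods are hypothetical names, plain string checks stand in for query conditions, and Not is assumed to return values that meet none of the conditions.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

public class LogicalOperatorSketch {
    // And: a value must meet every condition.
    public static Predicate<String> and(List<Predicate<String>> conditions) {
        return conditions.stream().reduce(v -> true, Predicate::and);
    }

    // Or: a value must meet at least one condition.
    public static Predicate<String> or(List<Predicate<String>> conditions) {
        return conditions.stream().reduce(v -> false, Predicate::or);
    }

    // Not (assumption): a value must meet none of the conditions.
    public static Predicate<String> not(List<Predicate<String>> conditions) {
        return or(conditions).negate();
    }

    // Keep only the values accepted by the combined condition.
    public static List<String> filter(List<String> values, Predicate<String> condition) {
        return values.stream().filter(condition).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> values = Arrays.asList("hangzhou", "hangar", "guangzhou", "shanghai");
        List<Predicate<String>> conditions = Arrays.<Predicate<String>>asList(
                v -> v.startsWith("hang"),   // stands in for the pattern hang*
                v -> v.endsWith("zhou"));    // stands in for the pattern *zhou
        System.out.println("And: " + filter(values, and(conditions)));
        System.out.println("Or:  " + filter(values, or(conditions)));
        System.out.println("Not: " + filter(values, not(conditions)));
    }
}
```

With these sample values, And keeps only hangzhou, Or keeps hangzhou, hangar, and guangzhou, and Not keeps only shanghai.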

Use the Tablestore CLI

You can use the Tablestore CLI to run the search command to query data by using search indexes. For more information, see Search index.

Important

You can use the Tablestore CLI to perform a wildcard query only on data of the Keyword type. You cannot use the Tablestore CLI to perform a wildcard query on data of the FuzzyKeyword type.

  1. Run the search command to use the search_index search index to query data and return all indexed columns of each row that meets the query conditions.

    search -n search_index --return_all_indexed
  2. Enter the query conditions as prompted by the system:

    {
        "Offset": -1,
        "Limit": 10,
        "Collapse": null,
        "Sort": null,
        "GetTotalCount": true,
        "Token": null,
        "Query": {
            "Name": "WildcardQuery",
            "Query": {
                "FieldName": "col_keyword",
                "Value": "hang*u"
            }
        }
    }
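The query conditions above are plain JSON, so they can be generated for other fields or query strings instead of being typed by hand. The following is a minimal sketch using only the Java standard library. The CliWildcardConditions class and its methods are hypothetical names, and only the keys shown in the example above are emitted.

```java
public class CliWildcardConditions {
    // Wrap a value in double quotes and escape characters that JSON does not allow raw.
    static String quote(String s) {
        StringBuilder out = new StringBuilder("\"");
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  out.append("\\\""); break;
                case '\\': out.append("\\\\"); break;
                case '\n': out.append("\\n"); break;
                case '\r': out.append("\\r"); break;
                case '\t': out.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        out.append(String.format("\\u%04x", (int) c));
                    } else {
                        out.append(c);
                    }
            }
        }
        return out.append('"').toString();
    }

    // Build the same JSON structure as the example, with the field name,
    // query string, limit, and GetTotalCount flag supplied by the caller.
    public static String build(String fieldName, String pattern, int limit, boolean getTotalCount) {
        return "{\n"
                + "    \"Offset\": -1,\n"
                + "    \"Limit\": " + limit + ",\n"
                + "    \"Collapse\": null,\n"
                + "    \"Sort\": null,\n"
                + "    \"GetTotalCount\": " + getTotalCount + ",\n"
                + "    \"Token\": null,\n"
                + "    \"Query\": {\n"
                + "        \"Name\": \"WildcardQuery\",\n"
                + "        \"Query\": {\n"
                + "            \"FieldName\": " + quote(fieldName) + ",\n"
                + "            \"Value\": " + quote(pattern) + "\n"
                + "        }\n"
                + "    }\n"
                + "}";
    }

    public static void main(String[] args) {
        // Reproduces the query conditions from the example above.
        System.out.println(build("col_keyword", "hang*u", 10, true));
    }
}
```

The printed text can then be pasted at the prompt of the search command.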

Use Tablestore SDKs

You can perform a wildcard query by using the following Tablestore SDKs: Tablestore SDK for Java, Tablestore SDK for Go, Tablestore SDK for Python, Tablestore SDK for Node.js, Tablestore SDK for .NET, and Tablestore SDK for PHP. In this example, Tablestore SDK for Java is used.

Note

The query code is the same for wildcard queries on data of the Keyword type and on data of the FuzzyKeyword type. Only the type of the queried field in the search index differs.

The following sample code provides an example on how to query rows in which the value of the Col_Keyword column matches the "hang*u" pattern.

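import com.alicloud.openservices.tablestore.SyncClient;
import com.alicloud.openservices.tablestore.model.search.SearchQuery;
import com.alicloud.openservices.tablestore.model.search.SearchRequest;
import com.alicloud.openservices.tablestore.model.search.SearchResponse;
import com.alicloud.openservices.tablestore.model.search.query.WildcardQuery;

/**
 * Search the table for rows in which the value of the Col_Keyword column matches the "hang*u" pattern.
 * @param client the initialized Tablestore client that sends the search request
 */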
private static void wildcardQuery(SyncClient client) {
    SearchQuery searchQuery = new SearchQuery();
    WildcardQuery wildcardQuery = new WildcardQuery(); // Use WildcardQuery. 
    wildcardQuery.setFieldName("Col_Keyword");
    wildcardQuery.setValue("hang*u"); // Specify a string that contains one or more wildcard characters in wildcardQuery. 
    searchQuery.setQuery(wildcardQuery);
    //searchQuery.setGetTotalCount(true); // Specify that the total number of matched rows is returned. 

    SearchRequest searchRequest = new SearchRequest("<TABLE_NAME>", "<SEARCH_INDEX_NAME>", searchQuery);
    // You can configure the columnsToGet parameter to specify the columns to return or specify that all columns are returned. If you do not configure this parameter, only the primary key columns are returned. 
    //SearchRequest.ColumnsToGet columnsToGet = new SearchRequest.ColumnsToGet();
    //columnsToGet.setReturnAll(true); // Specify that all columns are returned. 
    //columnsToGet.setColumns(Arrays.asList("ColName1","ColName2")); // Specify the columns that you want to return. 
    //searchRequest.setColumnsToGet(columnsToGet);

    SearchResponse resp = client.search(searchRequest);
    //System.out.println("TotalCount: " + resp.getTotalCount()); // Display the total number of matched rows instead of the number of returned rows. 
    System.out.println("Row: " + resp.getRows());
}

Billing rules

When you use a search index to query data, you are charged for the read throughput that is consumed. For more information, see Billable items of search indexes.

References

  • When you use a search index to query data, you can use the following query methods: term query, terms query, match all query, match query, match phrase query, prefix query, range query, wildcard query, fuzzy query, Boolean query, geo query, nested query, KNN vector query, and exists query. You can select query methods based on your business requirements to query data from multiple dimensions.

    You can sort or paginate rows that meet the query conditions by using the sorting and paging features. For more information, see Perform sorting and paging.

    You can use the collapse (distinct) feature to collapse the result set based on a specific column. This way, data of the specified type appears only once in the query results. For more information, see Collapse (distinct).

  • If you want to analyze data in a data table, you can use the aggregation feature of the Search operation or execute SQL statements. For example, you can obtain the minimum and maximum values, sum, and total number of rows. For more information, see Aggregation and SQL query.

  • If you want to obtain all rows that meet the query conditions without the need to sort the rows, you can call the ParallelScan and ComputeSplits operations to use the parallel scan feature. For more information, see Parallel scan.