Highlight the query string

When you create a search index, you can enable the highlight feature for a Text field. This way, when you use the search index to query data based on the Text field, you can configure highlight parameters to highlight the query strings in the segments of the rows that meet the query conditions.

Scenarios

You can use the highlight feature in full-text search to highlight the query strings in the segments of the rows that meet the query conditions. The highlight feature is suitable for scenarios such as web search, chat history retrieval, and document search.

Feature overview

You can use the highlight feature to highlight the text that matches or relates to the query strings in the query results. This helps improve information retrieval efficiency by allowing users to quickly locate the query strings. You can use the highlight feature to accurately locate the required information in nested data that has a complicated structure, such as the JSON structure. By default, Tablestore uses <em></em> to highlight the query strings in the query results.

To use the highlight feature, you must complete the following configurations:

When you create a search index, set the enableHighlighting parameter to True for a Text field. For more information, see Create a search index.
Important
You can enable the highlight feature only for Text fields.
When you use the highlight feature in a query request, you can specify a custom highlight style by configuring parameters, such as the encoding method for the highlighted text fragments, the maximum number of highlighted text fragments that you want to return in each row, the opening tag, and the closing tag.

For example, you enable the highlight feature for a Text field when you create a search index. If you perform a match phrase query to query data that contains the West Lake query string in the Text field and configure the highlight parameters, a row in which the value of the Text field is Hangzhou West Lake Scenic Area meets the query conditions, and the highlighted text segment Hangzhou <em>West Lake</em> Scenic Area is returned.
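
The West Lake example above can be reproduced locally with a minimal sketch. This is only an illustration of the expected output shape, not how Tablestore computes highlights: the class name PhraseHighlightSketch and the highlight method are hypothetical, and the sketch assumes a plain literal substring match instead of real tokenization.

```java
class PhraseHighlightSketch {

    /**
     * Wraps every literal occurrence of the phrase in the text with the given tags.
     * Hypothetical helper for illustration; Tablestore performs highlighting on the server side.
     */
    static String highlight(String text, String phrase, String preTag, String postTag) {
        // String.replace treats both arguments as literals (no regular expressions involved).
        return text.replace(phrase, preTag + phrase + postTag);
    }

    public static void main(String[] args) {
        // Prints: Hangzhou <em>West Lake</em> Scenic Area
        System.out.println(highlight("Hangzhou West Lake Scenic Area", "West Lake", "<em>", "</em>"));
    }
}
```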

Usage notes

If you enable the highlight feature in a match query or match phrase query, the query strings in the query results may be highlighted by using multiple opening tags (preTag) and closing tags (postTag).
If the tokenization method of a Text field is maximum semantic unit-based tokenization (MaxWord), the highlight feature is not supported when you perform a match phrase query on the Text field.
If you want to return multiple fragments, the query strings in the fragments may be split. In this case, the query strings may not be highlighted.
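
The first usage note is easier to picture with a small sketch. The code below assumes whitespace tokenization and case-insensitive token comparison, which is a simplification and not the analyzer that Tablestore uses; the class and method names are hypothetical. It shows why a match query for a multi-word string can produce one pair of tags per matched token instead of a single pair around the whole string.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;
import java.util.stream.Collectors;

class TokenHighlightSketch {

    /**
     * Wraps each word of the text that equals one of the query tokens in its own tag pair.
     * Assumption: tokens are separated by whitespace and compared case-insensitively.
     */
    static String highlightTokens(String text, String query, String preTag, String postTag) {
        Set<String> queryTokens =
                new HashSet<>(Arrays.asList(query.toLowerCase(Locale.ROOT).split("\\s+")));
        return Arrays.stream(text.split("\\s+"))
                .map(word -> queryTokens.contains(word.toLowerCase(Locale.ROOT))
                        ? preTag + word + postTag
                        : word)
                .collect(Collectors.joining(" "));
    }

    public static void main(String[] args) {
        // Prints: Hangzhou <em>West</em> <em>Lake</em> Scenic Area
        System.out.println(highlightTokens("Hangzhou West Lake Scenic Area", "West Lake", "<em>", "</em>"));
    }
}
```

If the consumer of the fragments expects a single tag pair per phrase, it may need to merge adjacent tags after the response is received.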

API operation

To use the highlight feature, call the Search operation and set the query type to TermQuery, TermsQuery, MatchQuery, MatchPhraseQuery, PrefixQuery, WildcardQuery, or NestedQuery.

Parameters

In most cases, you can configure the Highlight parameters to use the highlight feature. For subfields of a nested field, you must configure the InnerHits parameters to use the highlight feature.

Highlight parameters

Parameter		Description
highlightEncoder		The encoding method of the highlighted text fragments. Valid values: PLAIN (default): returns the highlighted text fragments without encoding. HTML: performs HTML encoding on the highlighted text fragments. After HTML encoding is complete, `<` is converted into `&lt;`, `>` into `&gt;`, `"` into `&quot;`, `'` into `&#x27;`, and `/` into `&#x2F;`. If you display the fragments in web pages, we recommend that you use the HTML format.
fieldHighlightParams		The highlight settings for the field. You can enable the highlight feature only for fields that contain the keywords specified in SearchQuery objects.
HighlightParameter	numberOfFragments	The maximum number of highlighted text fragments to return. We recommend that you set the value to 1.
	fragmentSize	The length of each text fragment to return. Default value: 100. Important The actual length of the returned text fragment may differ from the value of this parameter.
	preTag	The opening tag used to highlight the query keywords. Examples: `<em>` and `<b>`. Default value: `<em>`. You can specify a custom opening tag based on your business requirements. The preTag parameter supports the following character sets: `< > " ' /`, `a-z`, `A-Z`, and `0-9`.
	postTag	The closing tag used to highlight the query keywords. Examples: `</em>` and `</b>`. Default value: `</em>`. You can specify a custom closing tag based on your business requirements. The postTag parameter supports the following character sets: `< > " ' /`, `a-z`, `A-Z`, and `0-9`.
	highlightFragmentOrder	The sorting rule of the highlighted text fragments to return. TEXT_SEQUENCE (default): The highlighted text fragments are sorted based on the order of their appearance in the original text. SCORE: The highlighted text fragments are sorted based on the score of the hit keywords.
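
The HTML encoding and the character-set restriction on preTag and postTag described in the table can be sketched as follows. This is a local illustration under stated assumptions, not the service implementation: the class and method names are hypothetical, the entity forms (`&lt;`, `&gt;`, `&quot;`, `&#x27;`, `&#x2F;`) are the standard HTML escapes for the five listed characters, and other characters such as `&` are left unchanged because the table does not mention them.

```java
import java.util.regex.Pattern;

class HighlightEncodingSketch {

    // Characters the table lists as allowed in preTag and postTag: < > " ' / a-z A-Z 0-9
    private static final Pattern TAG_CHARS = Pattern.compile("[<>\"'/a-zA-Z0-9]+");

    /** Returns true if the tag is non-empty and uses only the allowed characters. */
    static boolean isValidTag(String tag) {
        return TAG_CHARS.matcher(tag).matches();
    }

    /** Escapes the five characters named in the highlightEncoder description. */
    static String htmlEncode(String text) {
        StringBuilder sb = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '<':  sb.append("&lt;");   break;
                case '>':  sb.append("&gt;");   break;
                case '"':  sb.append("&quot;"); break;
                case '\'': sb.append("&#x27;"); break;
                case '/':  sb.append("&#x2F;"); break;
                default:   sb.append(c);        // everything else is copied as-is
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(isValidTag("</em>"));           // true
        System.out.println(isValidTag("<span class=x>"));  // false: space and '=' are not allowed
        System.out.println(htmlEncode("<a href='/x'>"));   // &lt;a href=&#x27;&#x2F;x&#x27;&gt;
    }
}
```

Encoding matters when the stored text can itself contain markup: with PLAIN, such markup would reach the page unescaped next to the highlight tags.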

InnerHits parameters

Parameter	Description
sort	The sorting rule for the child rows of the nested field.
offset	The start position of the child rows to return if the nested field consists of multiple child rows.
limit	The maximum number of child rows to return if the nested field consists of multiple child rows. Default value: 3.
highlight	The highlight settings for the subfields of the nested field. For more information, see Highlight parameters.
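
The interaction between offset and limit can be pictured as a window over the child rows that match the query. The sketch below is a local model only: the class and method names are hypothetical, and it assumes the child rows are already in the order defined by sort. The server applies the real logic.

```java
import java.util.Arrays;
import java.util.List;

class InnerHitsPagingSketch {

    /** Returns at most limit child rows, starting at position offset (0-based). */
    static <T> List<T> page(List<T> childRows, int offset, int limit) {
        int from = Math.min(Math.max(offset, 0), childRows.size());
        // Use long arithmetic so that a very large limit cannot overflow.
        int to = (int) Math.min((long) from + Math.max(limit, 0), childRows.size());
        return childRows.subList(from, to);
    }

    public static void main(String[] args) {
        List<String> childRows = Arrays.asList("r0", "r1", "r2", "r3", "r4");
        System.out.println(page(childRows, 0, 3)); // [r0, r1, r2]  (limit 3 is the documented default)
        System.out.println(page(childRows, 1, 3)); // [r1, r2, r3]
        System.out.println(page(childRows, 4, 3)); // [r4]
    }
}
```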

Methods

Important

You can use the highlight feature only by using Tablestore SDKs.

Before you use the highlight feature, make sure that the following preparations are made:

An Alibaba Cloud account or a RAM user that has Tablestore operation permissions is created. For information about how to grant Tablestore operation permissions to a RAM user, see Use a RAM policy to grant permissions to a RAM user.
If you want to use the highlight feature by using Tablestore SDKs, an AccessKey pair is created for your Alibaba Cloud account or RAM user. For more information, see Create an AccessKey pair.
A data table is created. For more information, see Operations on a data table.
A search index is created for the data table and the highlight feature is enabled for a specific field. For more information, see Create a search index.
If you want to use the highlight feature by using Tablestore SDKs, an OTSClient instance is initialized. For more information, see Initialize an OTSClient instance.

You can use the following Tablestore SDKs to use the highlight feature: Tablestore SDK for Java, Tablestore SDK for Go, Tablestore SDK for Python, and Tablestore SDK for Node.js. In this example, Tablestore SDK for Java is used.

Use the highlight feature when you query non-nested fields

The following sample code uses a match query to find rows in which the Col_Text field matches hangzhou shanghai and highlights the keywords in the query results. In this example, the Col_Text field is of the Text type.

/**
 * Enable the highlight feature for keywords in the MatchQuery object. 
 */
public static void matchQueryWithHighlighting(SyncClient client) {
    SearchRequest searchRequest = SearchRequest.newBuilder()
            .tableName("<TABLE_NAME>")
            .indexName("<SEARCH_INDEX_NAME>")
            .returnAllColumnsFromIndex(true)
            .searchQuery(SearchQuery.newBuilder()
                    .limit(5)
                    .query(QueryBuilders.bool()
                            .should(QueryBuilders.match("Col_Text", "hangzhou shanghai")))
                    .highlight(Highlight.newBuilder()
                            .addFieldHighlightParam("Col_Text", HighlightParameter.newBuilder()
                                    .highlightFragmentOrder(HighlightFragmentOrder.TEXT_SEQUENCE)
                                    .preTag("<b>")
                                    .postTag("</b>")
                                    .build())
                            .build())
                    .build())
            .build();
    SearchResponse resp = client.search(searchRequest);

    // Display the highlighted query results. When you query a non-nested field, pass an empty string as the prefix.
    printSearchHit(resp.getSearchHits(), "");
}

/**
 * Display the content that meets the query conditions. 
 * @param searchHits the search hits to display
 * @param prefix the prefix that is added to each output line to show the hierarchy level when the output is nested
 */
private static void printSearchHit(List<SearchHit> searchHits, String prefix) {
    for (SearchHit searchHit : searchHits) {
        if (searchHit.getScore() != null) {
            System.out.printf("%s Score: %s\n", prefix, searchHit.getScore());
        }

        if (searchHit.getOffset() != null) {
            System.out.printf("%s Offset: %s\n", prefix, searchHit.getOffset());
        }

        if (searchHit.getRow() != null) {
            System.out.printf("%s Row: %s\n", prefix, searchHit.getRow().toString());
        }

        // Display the highlighted fragments for each field. 
        if (searchHit.getHighlightResultItem() != null) {
            System.out.printf("%s Highlight: \n", prefix);
            StringBuilder strBuilder = new StringBuilder();
            for (Map.Entry<String, HighlightField> entry : searchHit.getHighlightResultItem().getHighlightFields().entrySet()) {
                strBuilder.append(entry.getKey()).append(":").append("[");
                strBuilder.append(StringUtils.join(",", entry.getValue().getFragments())).append("]\n");
            }
            System.out.printf("%s   %s", prefix, strBuilder);
        }

        System.out.println();
    }
}

Use the highlight feature when you query nested fields

The following sample code uses a nested query to find the rows in which the value of the Level1_Col1_Nested subcolumn of the nested column named Col_Nested matches hangzhou shanghai, and highlights the query strings in the query results.

/**
 * Enable the highlight feature by using the innerHits parameter for the nested query. 
 */
public static void nestedQueryWithHighlighting(SyncClient client) {
        SearchRequest searchRequest = SearchRequest.newBuilder()
                .tableName("<TABLE_NAME>")
                .indexName("<SEARCH_INDEX_NAME>")
                .returnAllColumnsFromIndex(true)
                .searchQuery(SearchQuery.newBuilder()
                        .limit(5)
                        .query(QueryBuilders.nested()
                                .path("Col_Nested")
                                .scoreMode(ScoreMode.Min)
                                .query(QueryBuilders.match("Col_Nested.Level1_Col1_Nested", "hangzhou shanghai"))
                                .innerHits(InnerHits.newBuilder()
                                        .highlight(Highlight.newBuilder()
                                                .addFieldHighlightParam("Col_Nested.Level1_Col1_Nested", HighlightParameter.newBuilder().build())
                                                .build())
                                        .build()))
                        .build())
                .build();
        SearchResponse resp = client.search(searchRequest);

        // Display the highlighted results. 
        printSearchHit(resp.getSearchHits(), "");
}

/**
 * Display the content that meets the query conditions. 
 * @param searchHits the search hits to display
 * @param prefix the prefix that is added to each output line to show the hierarchy level when the output is nested
 */
private static void printSearchHit(List<SearchHit> searchHits, String prefix) {
    for (SearchHit searchHit : searchHits) {
        if (searchHit.getScore() != null) {
            System.out.printf("%s Score: %s\n", prefix, searchHit.getScore());
        }

        if (searchHit.getOffset() != null) {
            System.out.printf("%s Offset: %s\n", prefix, searchHit.getOffset());
        }

        if (searchHit.getRow() != null) {
            System.out.printf("%s Row: %s\n", prefix, searchHit.getRow().toString());
        }

        // Display the highlighted text segments of the column in each row. 
        if (searchHit.getHighlightResultItem() != null) {
            System.out.printf("%s Highlight: \n", prefix);
            StringBuilder strBuilder = new StringBuilder();
            for (Map.Entry<String, HighlightField> entry : searchHit.getHighlightResultItem().getHighlightFields().entrySet()) {
                strBuilder.append(entry.getKey()).append(":").append("[");
                strBuilder.append(StringUtils.join(",", entry.getValue().getFragments())).append("]\n");
            }
            System.out.printf("%s   %s", prefix, strBuilder);
        }

        // The highlighted results of the nested column. 
        for (SearchInnerHit searchInnerHit : searchHit.getSearchInnerHits().values()) {
            System.out.printf("%s Path: %s\n", prefix, searchInnerHit.getPath());
            System.out.printf("%s InnerHit: \n", prefix);
            printSearchHit(searchInnerHit.getSubSearchHits(), prefix + "    ");
        }

        System.out.println();
    }
}

For example, the Col_Nested field consists of the following subfields: the Level1_Col1_Text subfield of the Text type and the Level1_Col2_Nested subfield of the Nested type. The Level1_Col2_Nested subfield in turn contains the Level2_Col1_Text subfield.

The following sample code provides an example on how to add a Boolean query to the nested query to highlight the query strings in the Level1_Col1_Text field and the Level2_Col1_Text subfield of the Level1_Col2_Nested field.

public static void nestedQueryWithHighlighting(SyncClient client) {
    SearchRequest searchRequest = SearchRequest.newBuilder()
            .tableName("<TABLE_NAME>")
            .indexName("<SEARCH_INDEX_NAME>")
            .returnAllColumnsFromIndex(true)
            .searchQuery(SearchQuery.newBuilder()
                    .limit(5)
                    .query(QueryBuilders.nested()
                            .path("Col_Nested")
                            .scoreMode(ScoreMode.Min)
                            .query(QueryBuilders.bool()
                                    .should(QueryBuilders.match("Col_Nested.Level1_Col1_Text", "hangzhou shanghai"))
                                    .should(QueryBuilders.nested()
                                            .path("Col_Nested.Level1_Col2_Nested")
                                            .scoreMode(ScoreMode.Min)
                                            .query(QueryBuilders.match("Col_Nested.Level1_Col2_Nested.Level2_Col1_Text", "hangzhou shanghai"))
                                            .innerHits(InnerHits.newBuilder()
                                                    .highlight(Highlight.newBuilder()
                                                            .addFieldHighlightParam("Col_Nested.Level1_Col2_Nested.Level2_Col1_Text", HighlightParameter.newBuilder().build())
                                                            .build())
                                                    .build())))
                            .innerHits(InnerHits.newBuilder()
                                    .sort(new Sort(Arrays.asList(
                                            new ScoreSort(),
                                            new DocSort()
                                    )))
                                    .highlight(Highlight.newBuilder()
                                            .addFieldHighlightParam("Col_Nested.Level1_Col1_Text", HighlightParameter.newBuilder().build())
                                            .build())
                                    .build()))
                            .build())
            .build();
    SearchResponse resp = client.search(searchRequest);
    // Display the highlighted results. 
    printSearchHit(resp.getSearchHits(), "");
}

/**
 * Display the content that meets the query conditions. 
 * @param searchHits the search hits to display
 * @param prefix the prefix that is added to each output line to show the hierarchy level when the output is nested
 */
private static void printSearchHit(List<SearchHit> searchHits, String prefix) {
    for (SearchHit searchHit : searchHits) {
        if (searchHit.getScore() != null) {
            System.out.printf("%s Score: %s\n", prefix, searchHit.getScore());
        }

        if (searchHit.getOffset() != null) {
            System.out.printf("%s Offset: %s\n", prefix, searchHit.getOffset());
        }

        if (searchHit.getRow() != null) {
            System.out.printf("%s Row: %s\n", prefix, searchHit.getRow().toString());
        }

        // Display the highlighted text segments of the field in each row. 
        if (searchHit.getHighlightResultItem() != null) {
            System.out.printf("%s Highlight: \n", prefix);
            StringBuilder strBuilder = new StringBuilder();
            for (Map.Entry<String, HighlightField> entry : searchHit.getHighlightResultItem().getHighlightFields().entrySet()) {
                strBuilder.append(entry.getKey()).append(":").append("[");
                strBuilder.append(StringUtils.join(",", entry.getValue().getFragments())).append("]\n");
            }
            System.out.printf("%s   %s", prefix, strBuilder);
        }

        // The highlighted results of the nested column. 
        for (SearchInnerHit searchInnerHit : searchHit.getSearchInnerHits().values()) {
            System.out.printf("%s Path: %s\n", prefix, searchInnerHit.getPath());
            System.out.printf("%s InnerHit: \n", prefix);
            printSearchHit(searchInnerHit.getSubSearchHits(), prefix + "    ");
        }

        System.out.println();
    }
}
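
The recursion in printSearchHit, where each level of inner hits is printed with a longer prefix, can be tried without a Tablestore instance. The sketch below uses a hypothetical Hit class as a stand-in for the SDK result types; it models only a highlighted fragment and the inner hits beneath it, so the names and structure are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class NestedHitSketch {

    /** Hypothetical stand-in for a search hit: one fragment plus the hits of its nested rows. */
    static final class Hit {
        final String fragment;
        final List<Hit> innerHits;

        Hit(String fragment, Hit... innerHits) {
            this.fragment = fragment;
            this.innerHits = Arrays.asList(innerHits);
        }
    }

    /** Collects one line per hit; each nesting level is indented by four more spaces. */
    static void render(List<Hit> hits, String prefix, List<String> out) {
        for (Hit hit : hits) {
            out.add(prefix + hit.fragment);
            render(hit.innerHits, prefix + "    ", out);
        }
    }

    public static void main(String[] args) {
        // Two nesting levels, mirroring Col_Nested and Col_Nested.Level1_Col2_Nested.
        Hit row = new Hit("row",
                new Hit("<em>hangzhou</em> trip",
                        new Hit("<em>shanghai</em> office")));
        List<String> lines = new ArrayList<>();
        render(Collections.singletonList(row), "", lines);
        lines.forEach(System.out::println);
    }
}
```

Collecting the lines in a list instead of printing them directly keeps the traversal easy to check in a unit test.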

Billing rules

The highlight feature that you use when you query data does not affect the existing billing rules of Tablestore.

When you use a search index to query data, you are charged for the read throughput that is consumed. For more information, see Billable items of search indexes.

References

When you use a search index to query data, you can use the following query methods: term query, terms query, match all query, match query, match phrase query, prefix query, range query, wildcard query, fuzzy query, Boolean query, geo query, nested query, KNN vector query, and exists query. You can select query methods based on your business requirements to query data from multiple dimensions.
You can sort or paginate rows that meet the query conditions by using the sorting and paging features. For more information, see Perform sorting and paging.
You can use the collapse (distinct) feature to collapse the result set based on a specific column. This way, data of the specified type appears only once in the query results. For more information, see Collapse (distinct).
If you want to analyze data in a data table, you can use the aggregation feature of the Search operation or execute SQL statements. For example, you can obtain the minimum and maximum values, sum, and total number of rows. For more information, see Aggregation and SQL query.
If you want to obtain all rows that meet the query conditions without the need to sort the rows, you can call the ParallelScan and ComputeSplits operations to use the parallel scan feature. For more information, see Parallel scan.