OpenSearch:summary clause

Last Updated:Feb 28, 2024

Overview

Include a summary clause in a query statement to perform the following operations:

  • Perform a phase-2 query to obtain the summaries of the documents that you query. OpenSearch Retrieval Engine Edition allows you to obtain a summary of a document based on the document ID, the hash value of the primary key of the document, or the value of the primary key.

  • Specify the fields to be displayed in a summary.

  • Specify the fields to be highlighted in a summary.

Syntax

{ 
  "summary" : {
  }
}

Perform a phase-2 query

Obtain a summary of a document based on the document ID

{ 
  "config" : {
    "fetch_summary_type" : "docid"
  },
  "summary" : {
    "gids" : [
        "daogou|6|0|0|0|00000000000000004cd645cfd1c63041|184140777",
        "daogou|6|0|0|1|00000000000000005b3ceae33e5ab800|184140777"
    ]
  }
}

Set the fetch_summary_type parameter to docid in a config clause, and list the gid of each document that you want to query in the gids parameter of a summary clause. You can obtain the gid of each document from the response of the corresponding phase-1 query.
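The following Python sketch shows one way to assemble the query above from a list of gid values. The function name build_docid_summary_query and the variable phase1_gids are hypothetical and are not part of OpenSearch; the sketch only builds the JSON body and does not send it.

```python
import json


def build_docid_summary_query(gids):
    """Build a phase-2 query body that fetches summaries by document ID.

    Hypothetical helper for illustration; it only assembles the JSON
    structure shown in this section.
    """
    if not gids:
        raise ValueError("at least one gid is required")
    return {
        "config": {"fetch_summary_type": "docid"},
        "summary": {"gids": list(gids)},
    }


# gid values copied from the example above; in practice they would be
# taken from the response of the phase-1 query.
phase1_gids = [
    "daogou|6|0|0|0|00000000000000004cd645cfd1c63041|184140777",
    "daogou|6|0|0|1|00000000000000005b3ceae33e5ab800|184140777",
]

query = build_docid_summary_query(phase1_gids)
print(json.dumps(query, indent=2))
```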

Obtain a summary of a document based on the hash value of the primary key

Obtaining a summary based on the hash value of the primary key is similar to obtaining a summary based on the document ID. In both cases, OpenSearch Retrieval Engine Edition performs the query based on the gid values that you specify in the query statement. The two methods differ in the following ways:

  • If you want to obtain a summary of a document based on the hash value of the primary key of the document, set the fetch_summary_type parameter to pk in a config clause.

  • The value of the fetch_summary_type parameter determines how documents are identified. In most cases, the primary key alone is enough to identify a document, but the document ID alone is not. If you set fetch_summary_type to docid, OpenSearch Retrieval Engine Edition identifies a document based on the specified document ID together with the version numbers of the full data and the incremental data. If you set fetch_summary_type to pk, no version information is required: OpenSearch Retrieval Engine Edition ignores the components of each gid that specify the document ID and the version numbers of the full data and the incremental data.

  • If you want to obtain a summary of a document based on the hash value of the primary key of the document, specify a field as the primary key and set the has_primary_key_attribute parameter to true in the schema of the cluster in which the document is stored.

Example:

{ 
  "config" : {
    "fetch_summary_type" : "pk"
  },
  "summary" : {
    "gids" : [
        "daogou|6|100|100|100|00000000000000004cd645cfd1c63041|184140777",
        "daogou|6|200|200|200|00000000000000005b3ceae33e5ab800|184140777"
    ]
  }
}
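The effect of ignoring the document ID and version components can be illustrated with a small client-side sketch. It assumes, based only on the example gid values in this section, that a gid has seven pipe-separated components and that the third through fifth components carry the version numbers and the document ID. This layout is an inference, not a documented format, and the function pk_identity is hypothetical.

```python
def pk_identity(gid):
    """Return the parts of a gid that are assumed to matter in pk mode.

    ASSUMPTION: the gid has seven "|"-separated components and the
    components at positions 2, 3, and 4 (zero-based) hold the version
    numbers and the document ID. This is inferred from the examples in
    this section only.
    """
    parts = gid.split("|")
    if len(parts) != 7:
        raise ValueError("expected 7 pipe-separated components, got %d" % len(parts))
    # Keep everything except the assumed version and document ID components.
    return tuple(parts[:2] + parts[5:])


docid_style = "daogou|6|0|0|0|00000000000000004cd645cfd1c63041|184140777"
pk_style = "daogou|6|100|100|100|00000000000000004cd645cfd1c63041|184140777"

# Under the assumption above, both gid values refer to the same document
# when fetch_summary_type is pk.
print(pk_identity(docid_style) == pk_identity(pk_style))
```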

Obtain a summary of a document based on the value of the primary key

If you use this method, OpenSearch Retrieval Engine Edition does not identify a document based on the gid parameter of the document. Instead, OpenSearch Retrieval Engine Edition identifies a document based on the primary key value of the document. If you want to use this method to obtain a summary of a document, perform the following operations:

  • Set the fetch_summary_type parameter to rawpk in a config clause.

  • Specify a field as the primary key of the document in the schema of the cluster in which the document is stored and make sure that the primary key field is included in the value of the hash_field parameter.

Example:

{
  "config" : {
    "fetch_summary_type" : "rawpk"
  },
  "summary" : {
    "gids" : [
        "cluster1:pk1,pk2",
        "cluster2:pk3,pk4"
    ]
  }
}

The same request in query string form:

config=fetch_summary_type:rawpk&&fetch_summary=cluster1:pk1,pk2;cluster2:pk3,pk4
Note: The primary key value of a document may contain characters that conflict with the reserved characters in the query string. Therefore, you must escape the reserved characters in a query string by adding a backslash (\) in front of each reserved character. The following characters must be escaped in a summary clause: commas (,), colons (:), semicolons (;), ampersands (&), equal signs (=), and backslashes (\). For example, if your primary key value is abc,d:e\, specify abc\,d\:e\\ as the primary key value.
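The escaping rule in the note can be sketched in Python as follows. The helper names escape_pk and build_rawpk_query_string are hypothetical. The sketch escapes only primary key values and assumes that cluster names contain no reserved characters.

```python
# Reserved characters listed in the note: , : ; & = and the backslash.
RESERVED = ",:;&=\\"


def escape_pk(value):
    """Prefix every reserved character in a primary key value with a backslash."""
    return "".join("\\" + ch if ch in RESERVED else ch for ch in value)


def build_rawpk_query_string(pks_by_cluster):
    """Assemble the query string form shown above.

    ASSUMPTION: cluster names need no escaping; only primary key values
    are escaped.
    """
    groups = []
    for cluster, pks in pks_by_cluster.items():
        groups.append(cluster + ":" + ",".join(escape_pk(pk) for pk in pks))
    return "config=fetch_summary_type:rawpk&&fetch_summary=" + ";".join(groups)


# The primary key value abc,d:e\ from the note, written as a Python string.
print(escape_pk("abc,d:e\\"))
print(build_rawpk_query_string({"cluster1": ["pk1", "pk2"], "cluster2": ["pk3", "pk4"]}))
```

Because the backslash is itself in the reserved set, a single pass over the characters is enough and no value is escaped twice.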

Specify the fields to be displayed in a summary

Use the fetch_fields parameter in a summary clause to specify the fields to be displayed in a summary.

Example:

{
  "summary" : {
    "fetch_fields" : ["title", "body", "price"]
  }
}

Specify the fields to be highlighted in a summary

You can use a summary clause to specify the fields to be highlighted in a summary. You can specify the following parameters:

  • highlighter: the name of the highlighter to be used.

  • pre_tag: the prefix tag of the fields to be highlighted.

  • post_tag: the suffix tag of the fields to be highlighted.

  • fields: the fields to be highlighted.

    • fragment_size: the length of each fragment.

    • number_of_fragments: the number of fragments.

Example:

{
  "summary" : {
    "highlight" : {
      "highlighter" : "plain",
      "pre_tag" : "<em>",
      "post_tag" : "</em>",
      "fields" : {
        "title" : {
          "fragment_size" : 100,
          "number_of_fragments" : 3
        }
      }
    }
  }
}
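As a rough illustration, the highlight settings above can be generated from a per-field mapping. The function build_highlight_summary is hypothetical, and its default values are copied from the example rather than taken from documented defaults. It rejects per-field options other than the two described in this section.

```python
ALLOWED_FIELD_OPTIONS = {"fragment_size", "number_of_fragments"}


def build_highlight_summary(fields, highlighter="plain", pre_tag="<em>", post_tag="</em>"):
    """Build a summary clause with highlight settings.

    `fields` maps a field name to its fragment_size and
    number_of_fragments settings. The keyword defaults mirror the
    example in this section and are not documented defaults.
    """
    for name, options in fields.items():
        unknown = set(options) - ALLOWED_FIELD_OPTIONS
        if unknown:
            raise ValueError("unsupported options for %s: %s" % (name, sorted(unknown)))
    return {
        "summary": {
            "highlight": {
                "highlighter": highlighter,
                "pre_tag": pre_tag,
                "post_tag": post_tag,
                "fields": {name: dict(options) for name, options in fields.items()},
            }
        }
    }


clause = build_highlight_summary(
    {"title": {"fragment_size": 100, "number_of_fragments": 3}}
)
print(clause)
```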

Usage notes

  • A summary clause is optional.

  • Requests for summaries may fail because of cluster instability or data updates. If the cluster on which a query is performed is unstable, the request may time out. During a data update, the system deletes old data and then loads real-time data, so the documents whose summaries you want to obtain may be in the deleted state for a short period of time.

  • We recommend that you do not use the document ID-based method to obtain summaries because document IDs are not fixed values and may change after the system loads incremental data or updates real-time data.