Each document contains multiple fields, and each field contains a set of words. The purpose of an index is to speed up data retrieval. Indexes are divided into the following types based on what they map:
Inverted index: stores mappings from terms to document IDs in the following format: term -> (Doc1,Doc2,...,DocN). Inverted indexes are used at retrieval time to find the documents that contain the specified search keywords.
Forward index: stores mappings from document IDs to fields in the following format: document ID -> (term1,term2,...,termN). Forward indexes are divided into single-value indexes and multivalued indexes, depending on whether the attribute is single-value or multivalued. A single-value attribute of any data type other than STRING has a fixed length, which makes queries efficient, and the data of a single-value index can be updated. A multivalued attribute is a field that holds a variable number of values, so its length is not fixed. This reduces query performance, and the data of a multivalued index cannot be updated. After a document is retrieved, you can use a forward index to look up the attributes of the document by document ID for statistics collection, sorting, and filtering. OpenSearch Vector Search Edition supports the following field types in forward indexes: INT8, UINT8, INT16, UINT16, INTEGER, UINT32, INT64, UINT64, FLOAT, DOUBLE, and STRING. A multivalued attribute is essentially a series of single-value attributes, so each field type supported for single-value attributes has a corresponding field type for multivalued attributes. For example, INT8 corresponds to multi_int8, and STRING corresponds to multi_string.
Summary index: stores mappings from document IDs to summaries. The format of a summary index is similar to that of a forward index, except that a document ID is mapped to a collection of fields. A summary index lets you quickly find the summary that corresponds to a document ID, and is used to return the values of the fields that you want to display in the results. Summaries are usually large, so summary indexes are not suitable for searches that need to retrieve a large amount of summary content. Retrieve summary content only for the documents whose field values you want to display. OpenSearch Vector Search Edition provides a compression mechanism for summary indexes. If you enable compression for a summary index in the schema, OpenSearch Vector Search Edition compresses the summary index with zlib before storing it. When OpenSearch Vector Search Edition reads data from the summary index, it decompresses the data and then returns the retrieved results to the user.
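The following Python sketch shows how the three index types can work together in one query. It is an illustration only and does not reflect the actual storage layout or API of OpenSearch Vector Search Edition. The sample documents and the names docs, inverted, price_attr, tags_attr, summary, and search are hypothetical, and the zlib module of the Python standard library stands in for the summary compression mechanism.

```python
import json
import zlib
from collections import defaultdict

# Hypothetical sample documents: document ID -> fields.
docs = {
    1: {"title": "red apple", "price": 3, "tags": ["fruit", "red"]},
    2: {"title": "green apple", "price": 5, "tags": ["fruit"]},
    3: {"title": "red car", "price": 900, "tags": ["vehicle", "red", "fast"]},
}

# Inverted index: term -> (Doc1,Doc2,...,DocN), built from the title field.
inverted = defaultdict(list)
for doc_id in sorted(docs):
    for term in sorted(set(docs[doc_id]["title"].split())):
        inverted[term].append(doc_id)

# Forward index: document ID -> attribute value.
# price is a single-value attribute (one value per document).
price_attr = {doc_id: fields["price"] for doc_id, fields in docs.items()}
# tags is a multivalued attribute (a variable number of values per document).
tags_attr = {doc_id: list(fields["tags"]) for doc_id, fields in docs.items()}

# Summary index: document ID -> collection of display fields, zlib-compressed.
summary = {
    doc_id: zlib.compress(json.dumps(fields).encode("utf-8"))
    for doc_id, fields in docs.items()
}


def search(term, max_price):
    """Retrieve by term, filter and sort by price, then load summaries."""
    # 1. The inverted index finds the candidate documents.
    hits = inverted.get(term, [])
    # 2. The forward index filters and sorts them by an attribute.
    kept = sorted(
        (doc_id for doc_id in hits if price_attr[doc_id] <= max_price),
        key=lambda doc_id: price_attr[doc_id],
    )
    # 3. The summary index is read only for the documents to display.
    return [
        json.loads(zlib.decompress(summary[doc_id]).decode("utf-8"))
        for doc_id in kept
    ]


results = search("red", 100)
for fields in results:
    print(fields["title"], fields["price"])
```

In this sketch, summaries are decompressed only in the last step, after the forward index has filtered the candidates. This matches the guidance to retrieve summary content only for the documents that you want to display.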