OpenSearch: Dictionary configurations

Last Updated: Aug 27, 2024

Overview

The dictionary configurations in advanced settings allow you to configure custom dictionaries for text analysis. If the result that a built-in analyzer returns for a search query does not meet your business requirements, you can configure a custom dictionary for that analyzer to override how specific terms are segmented, so that the analyzer returns the expected result.

By default, the system provides two versions of dictionary configurations. The version whose name is suffixed with _offline_adv_v1 is created by the system and contains the following eight types of dictionaries.

Dictionary type

General-purpose Chinese text analyzer.dict

Chinese e-commerce content analyzer of the general-purpose edition.dict

Chinese game content analyzer of the general-purpose edition.dict

Chinese education content analyzer.dict

Chinese entertainment content analyzer.dict

English e-commerce content analyzer of the general-purpose edition.dict

Chinese IT content analyzer.dict

Chinese e-commerce content analyzer.dict

The version whose name is suffixed with _offline_adv_edit is editable. You can add entries to a specific dictionary, save the modifications to advanced settings, and then click Publish in the Actions column. The system automatically generates a new version of dictionary configurations in advanced settings. The names of new versions are suffixed with _offline_adv_v{n}, where {n} indicates an integer that starts from 2. You can specify a version description each time you publish a new version of dictionary configurations in advanced settings to describe the purpose of the new version.
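The naming rules above can be summarized in a short sketch. It is only an illustration: the function name classify_version and the labels it returns are hypothetical, and the prefix that precedes the suffix (shown here as "demo") is an assumption, because the text specifies only the suffixes.

```python
import re
from typing import Optional, Tuple

# Suffixes described in the text: _offline_adv_v1 (system-created),
# _offline_adv_edit (editable), _offline_adv_v{n} with n >= 2 (published).
# The part of the name before the suffix is not specified, so any prefix is accepted.
_VERSION_SUFFIX = re.compile(r"_offline_adv_(edit|v([0-9]+))$")


def classify_version(name: str) -> Optional[Tuple[str, Optional[int]]]:
    """Return (kind, n) for a version name, or None if the suffix is not recognized.

    The kind labels "editable", "system", and "published" are illustrative only.
    """
    match = _VERSION_SUFFIX.search(name)
    if match is None:
        return None
    if match.group(1) == "edit":
        return ("editable", None)
    n = int(match.group(2))
    if n < 1:
        return None  # version numbers start from 1
    return ("system" if n == 1 else "published", n)


if __name__ == "__main__":
    for name in ("demo_offline_adv_v1", "demo_offline_adv_edit", "demo_offline_adv_v2"):
        print(name, classify_version(name))
```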

Add a custom intervention entry

Bad case: The value of a field in a document is "乒乓球拍卖完了". When a user searches for "球拍", the document cannot be retrieved because the analysis result of the field is "乒乓 球 拍卖 完了", and the terms in the search query cannot match the terms in the analysis result of the field.
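The mismatch can be illustrated with a minimal sketch that treats retrieval as exact term matching. This is a simplification and not the actual OpenSearch retrieval logic. The field terms are taken from the analysis result above; the assumption that the query "球拍" is analyzed as a single term, and the helper name matches, are hypothetical.

```python
from typing import Iterable, List

# Analysis result of the field value "乒乓球拍卖完了", as given in the text.
field_terms: List[str] = ["乒乓", "球", "拍卖", "完了"]

# Assumption: the query "球拍" is analyzed as one term.
query_terms: List[str] = ["球拍"]


def matches(query: Iterable[str], field: Iterable[str]) -> bool:
    """Return True only if every query term appears among the field terms."""
    indexed = set(field)
    return all(term in indexed for term in query)


print(matches(query_terms, field_terms))  # prints False: "球拍" is not one of the field terms
```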

Solution: To resolve this issue, perform the following steps to add a custom entry "乒乓球拍":

  1. In the left-side navigation pane of the Instance Details page, choose Configuration Center > Advanced Configurations. On the page that appears, find the version whose name is suffixed with _offline_adv_edit, and click Modify in the Actions column.

  2. Find the dictionary used by the index table that contains the field whose value includes "乒乓球拍", and click Modify in the Actions column.

  3. Use one of the following methods to add a custom intervention entry:

    1. In the panel that appears, enter the custom intervention entry 乒乓球拍. Then, click OK.

    2. In the panel that appears, upload a new dictionary file and enter your custom intervention entry in the field. Then, click OK.

      Note: The file cannot exceed 5 MB in size and must be in the .dict or .txt format.

You can write custom intervention entries in one of the following formats:

1. Entry that is not split: Enter one intervention entry per line. The entry must be UTF-8 encoded and cannot contain spaces or tab characters (\t). Example:

开放搜索
opensearch

2. Entry that is split: Enter the original entry, followed by a tab character (\t), followed by the words into which the entry is split. Separate the words with spaces. The line must be UTF-8 encoded. Example:

开放搜索	开放 搜索
opensearch	open search

  4. Publish the edited version of dictionary configurations.

Specify a description for the new version.

After the edited version is published, the system automatically generates a new version of dictionary configurations.

  5. To make the new version of dictionary configurations take effect on a cluster, synchronize the new version to the cluster and trigger a reindexing task.

You can view the progress of the reindexing task on the Data Source Changes tab of the Change History page in the O&M Center module.

After the indexes are rebuilt, the dictionary configurations take effect for online queries immediately.
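Before you upload a dictionary file, you can check it locally against the format rules described in the procedure above. The following is a minimal sketch, not an official tool: the function names parse_entry and validate_dictionary are hypothetical, 5 MB is assumed to mean 5 × 1024 × 1024 bytes, and skipping blank lines is an assumption rather than documented behavior.

```python
from typing import List, Tuple

MAX_FILE_BYTES = 5 * 1024 * 1024          # assumption: 5 MB = 5 * 1024 * 1024 bytes
ALLOWED_EXTENSIONS = (".dict", ".txt")    # file formats named in the note above


def parse_entry(line: str) -> Tuple[str, List[str]]:
    """Parse one dictionary line into (entry, split_words).

    split_words is empty for an entry that is not split.
    Raises ValueError if the line matches neither documented format.
    """
    line = line.rstrip("\r\n")
    if "\t" not in line:
        # Entry that is not split: no spaces or tab characters allowed.
        if not line or " " in line:
            raise ValueError(f"unsplit entry must not be empty or contain spaces: {line!r}")
        return line, []
    # Entry that is split: original entry, one tab, then space-separated words.
    entry, _, rest = line.partition("\t")
    if not entry or " " in entry or "\t" in rest:
        raise ValueError(f"expected 'entry<TAB>word word ...': {line!r}")
    words = rest.split(" ")
    if any(not word for word in words):
        raise ValueError(f"words must be separated by single spaces: {line!r}")
    return entry, words


def validate_dictionary(filename: str, data: bytes) -> List[Tuple[str, List[str]]]:
    """Check extension, size, and encoding, then parse every non-blank line."""
    if not filename.lower().endswith(ALLOWED_EXTENSIONS):
        raise ValueError("file must be in the .dict or .txt format")
    if len(data) > MAX_FILE_BYTES:
        raise ValueError("file exceeds 5 MB")
    text = data.decode("utf-8")  # raises UnicodeDecodeError if the file is not UTF-8
    # Assumption: blank lines are ignored.
    return [parse_entry(line) for line in text.splitlines() if line.strip()]


if __name__ == "__main__":
    sample = "开放搜索\nopensearch\topen search\n".encode("utf-8")
    print(validate_dictionary("custom.dict", sample))
```

The sketch does not check that the split words concatenate back to the original entry, because the text above does not state such a requirement.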

Delete versions of dictionary configurations

You can delete versions of dictionary configurations that are in the Unused state on the Dictionary Configurations tab of the Advanced Configurations page.

For a version that is in the In Use state, you can only view the dictionary configurations. To delete such a version, you must first switch the cluster to a different version: In the left-side navigation pane of the Instance Details page, choose O&M Center > O&M Management. On the page that appears, click Update Configurations and change the value of the Dictionary Configuration Version parameter to another version. Then, synchronize the update to your cluster and trigger a reindexing task. You can delete the original version only after its state changes to Unused.

Usage notes

  • Each instance can have only one editable version of dictionary configurations.

  • A version of dictionary configurations that is in the In Use state can be viewed but cannot be deleted.

  • On the Advanced Configurations page, you can manage both dictionary configurations and query configurations.