Introduction to the Industry Algorithm Edition for content communities
To meet the requirements of IT content search scenarios in the content industry, OpenSearch provides Industry Algorithm Edition for content communities, which adopts the latest algorithms. This edition provides content-specific capabilities, such as intelligent semantic understanding, vector retrieval, and sorting algorithms, to improve search performance and accuracy for the content industry. It also provides a multimodal search solution for two common problems: long search latency, which is caused by a large amount of dictionary data, and a high zero-result rate, which is caused by high resource consumption. To further improve search accuracy, OpenSearch provides a vector model that implements vector retrieval and multimodal search.
Edition comparison
Feature | General-purpose Edition | Industry Algorithm Edition for content communities |
One-stop configuration | After an application is created, you must create and configure query analysis rules, sort policies, and drop-down suggestion models. | This edition provides the capabilities and features that are required for common search scenarios in the content industry. It also provides templates for you to configure application schemas and index schemas with a few clicks. This way, new users can use this service with ease. |
Query analysis | This edition provides general-purpose query analysis capabilities, including synonym configuration, stop word filtering, spelling correction, term weight analysis, and category prediction. | This edition provides enhanced analyzers and an enhanced query analysis feature that are developed for content search scenarios. It creates more precise indexes and identifies user needs more accurately, so it performs better than General-purpose Edition in these scenarios. |
Sort policy | After an application is created, you must configure the sort policy and perform debugging based on the business scenarios. | In addition to templates for application schemas and index schemas, this edition also provides common rough sort expressions and fine sort expressions. The templates and expressions can satisfy the sorting requirements in most search scenarios of the content industry. |
Feature iteration | This edition regularly updates built-in dictionaries, such as the dictionaries for the analyzer and query analysis features. | This edition is constantly updated in line with the changes in words and services in the content industry. It keeps optimizing the existing features of analysis and query analysis to adapt to industrial changes. |
Comparison of query analysis performance
Compared with General-purpose Edition, Industry Algorithm Edition for content communities provides better query analysis performance. For example, it corrects invalid dictionary entries in General-purpose Edition and enriches the existing dictionaries with a variety of word usages from the content industry. The following tables compare the two editions across different query analysis capabilities.
Space-based analysis
query | General-purpose Edition | Industry Algorithm Edition |
为了解压缩 | 为 了解 压缩 | 为了 解 压缩 |
实参与形参 | 实 参与 形参 | 实参 与 形参 |
结构体重载 | 结构 体重 载 | 结构体重载 |
googlechromeframe | googlechromeframe | google chrome frame |
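The effect of a richer dictionary on analysis can be illustrated with a small sketch. The following Python code is not the OpenSearch analyzer; it is a toy forward maximum matching segmenter, and the function name `segment` and both dictionaries are hypothetical. It only shows how adding an industry term such as 实参 to the dictionary changes the token boundaries, as in the second row of the table above.

```python
def segment(text, dictionary):
    """Toy forward maximum matching: at each position, take the longest
    dictionary word; fall back to a single character if nothing matches."""
    max_len = max((len(word) for word in dictionary), default=1)
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first and shrink until a word matches.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in dictionary:
                tokens.append(piece)
                i += size
                break
    return tokens


# Hypothetical dictionaries, for illustration only.
general_dictionary = {"参与", "形参"}
industry_dictionary = general_dictionary | {"实参"}

print(" ".join(segment("实参与形参", general_dictionary)))
print(" ".join(segment("实参与形参", industry_dictionary)))
```

With the smaller dictionary, the greedy match picks 参与 and splits 实 off on its own. After 实参 is added, the same input is segmented at the intended boundaries.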
Spelling correction
query | General-purpose Edition | Industry Algorithm Edition |
淘宝只能视觉 | 淘宝只能视觉 | 淘宝智能视觉 |
mybatics代码生成 | mybatics代码生成 | mybatis代码生成 |
计算机网路 | 计算机网路 | 计算机网络 |
微行小程序 | 微型小程序 | 微信小程序 |
深度学西 | 深度学西 | 深度学习 |
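One common way to approach spelling correction is to compare a term against a vocabulary by edit distance. The following Python sketch illustrates that general idea only; the source does not describe how OpenSearch implements correction, and the names `edit_distance` and `correct`, the `max_distance` threshold, and the vocabulary are assumptions. It handles single terms, whereas the queries in the table above mix several terms.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings, computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def correct(term, vocabulary, max_distance=1):
    """Return the closest vocabulary term within max_distance edits,
    or the original term if no candidate is close enough."""
    if not vocabulary or term in vocabulary:
        return term
    best = min(vocabulary, key=lambda word: (edit_distance(term, word), word))
    return best if edit_distance(term, best) <= max_distance else term


# Hypothetical industry vocabulary, for illustration only.
vocabulary = {"mybatis", "计算机网络", "深度学习"}

for term in ["mybatics", "计算机网路", "深度学西", "opensearch"]:
    print(term, "->", correct(term, vocabulary))
```

A term such as mybatis can be corrected only if it is present in the vocabulary, which is why enriching the dictionaries with industry terms matters for this capability.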
Vector-based retrieval for the content industry
This feature provides a high-quality vector-based retrieval model that is tailored to the data distribution of the content industry. The model improves retrieval results for long-tail search queries, such as queries that contain spelling errors and queries that require synonym-based rewriting.
Vector-based retrieval
query | 美国gmted2010的shuju下载 |
Vector-based retrieval top 1 | gmt43相关代码、资料下载地址 |
Vector-based retrieval top 2 | gmt0054-2010.pdf |
Vector-based retrieval top 3 | gmted2010美国download地址 |
query | 3D游戏画面处理 |
Vector-based retrieval top 1 | 3d游戏动画处理基础 |
Vector-based retrieval top 2 | 3d游戏动画的基础 |
Vector-based retrieval top 3 | 动画游戏处理 |
query | 禁用n卡 |
Vector-based retrieval top 1 | 网卡的禁止和启动 |
Vector-based retrieval top 2 | 禁用网卡 |
Vector-based retrieval top 3 | 禁用及启用网卡 |
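The ranking step behind results like those above can be sketched as a nearest-neighbor search over embeddings. The following Python code is a minimal illustration, assuming that queries and documents have already been mapped to vectors by some model. The names `cosine` and `top_k`, the document IDs, and the two-dimensional vectors are hypothetical and do not reflect the OpenSearch vector model or its API.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def top_k(query_vec, doc_vecs, k=3):
    """Return the IDs of the k documents most similar to the query vector."""
    scored = [(cosine(query_vec, vec), doc_id) for doc_id, vec in doc_vecs.items()]
    # Highest similarity first; ties are broken by document ID for stable output.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored[:k]]


# Hypothetical embeddings, for illustration only.
doc_vecs = {
    "doc_a": [1.0, 0.0],
    "doc_b": [0.0, 1.0],
    "doc_c": [1.0, 1.0],
}
print(top_k([1.0, 0.1], doc_vecs, k=2))
```

Because matching is done in the vector space instead of on exact terms, a query with a misspelled or rewritten term can still land close to the relevant documents.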
Usage notes
For more information about how to create and configure an application of Industry Algorithm Edition for content communities, see Configure an application of Industry Algorithm Edition for content communities.
Exclusive applications can be converted from General-purpose Edition to Industry Algorithm Edition, but cannot be converted from Industry Algorithm Edition back to General-purpose Edition.
Industry Algorithm Edition for content communities is applicable only to exclusive applications.
If you convert a shared application instance to an exclusive application instance, make sure that both the online application and the application instance are exclusive before you adapt the application instance to Industry Algorithm Edition for content communities.
Make sure that each field tag is associated with a specific field in the application schema. Otherwise, an error message is returned.