Build Conversational Search Based on OpenSearch Vector Search Edition Integrated with a Large Language Model

This article uses charts, scenarios, and configuration steps to explain why enterprises should build conversational search services on vector search integrated with a large language model (LLM).

LLM-Powered Conversational Search Ushers in a New Era

With the release of large language model (LLM) applications in late 2022, the number of monthly active users exceeded 100 million within just two months. The amazing growth rate has set a new record for AI applications and brought the world into the era of LLMs.

The new technology wave has brought about innovations in business scenarios. The participation of large players (such as Google and Microsoft Bing) has made conversational search a new field for enterprises to explore. The primary problem to solve is how to combine the powerful logical reasoning and conversational capabilities of LLMs with the business data of an enterprise's specific field to create a conversational search service dedicated to the enterprise.

Why Is It Impossible to Directly Use LLMs?

LLMs appear able to understand and discuss almost any topic mainly because of the general world knowledge they have learned. They can answer general questions well. However, if LLMs are used directly to answer professional questions in specific fields, the results are often wrong or irrelevant because the general world knowledge does not contain enterprise-specific data.

In the example shown in the following figure, Havenask is an open-source large-scale search engine developed by Alibaba Cloud. As the underlying engine of Alibaba Cloud OpenSearch, Havenask was open-sourced in November 2022. However, LLMs were unaware of this information. It can be inferred that enterprises need a combined solution that allows LLMs to better serve their needs and to build conversational search services based on their own data.

Why Should We Build a Conversational Search Service Based on Vector Search Integrated with LLM?

Application Scenarios for Conversational Search

Conversational search can be applied to e-commerce, content, education, internal search services of enterprises, and other fields. Based on customer characteristics and questions, conversational search provides accurate Q&A results, so users can efficiently obtain information.

How Can Enterprises Build Conversational Search Services in Enterprise-Specific Fields Based on Their Data?

Most enterprises adopt a solution that combines document slicing, vector search, and LLM-based question answering to build conversational search in enterprise-specific fields.

With this solution, vector features are extracted from enterprise data and conversational interaction information and then stored in a vector search engine, which builds indexes and performs similarity-based recall. The returned top-N results are passed to the LLM, which integrates the information in a conversational manner and returns the results to users. This solution has advantages in terms of cost, effectiveness, and business flexibility, making it the preferred solution for enterprises.

What Is a Vector?

Unstructured data generated in the physical world (such as images, audio, video, and dialogue information) is converted into structured multidimensional vectors. These vectors are used to identify entities and relationships between entities. Then, the distance between the vectors is calculated. Generally, the closer the distance is, the higher the similarity is. The top-N results with the highest similarity are recalled to complete the search. We often use vector search in our daily life, for example, image search, price comparison during shopping, personalized search, and semantic understanding.
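The idea of recalling the top-N most similar vectors can be sketched in a few lines of Python. This is only an illustration: cosine similarity is assumed as the measure (the article does not name a specific distance), and the three-dimensional vectors and their names are made up for the example.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_n(query, vectors, n):
    """Return the keys of the n vectors most similar to the query."""
    scored = [(cosine_similarity(query, vec), key) for key, vec in vectors.items()]
    scored.sort(reverse=True)
    return [key for _, key in scored[:n]]

# Hypothetical embeddings of three items (real embeddings have hundreds of dimensions).
vectors = {
    "cat_photo": [0.9, 0.1, 0.0],
    "dog_photo": [0.8, 0.3, 0.1],
    "invoice_scan": [0.0, 0.2, 0.9],
}

result = top_n([1.0, 0.0, 0.0], vectors, 2)
print(result)  # ['cat_photo', 'dog_photo']
```

A production engine replaces the full scan with an approximate index, but the ranking principle is the same.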

Why Can Vectors Be Used for Conversational Search?

One typical application scenario of vector search is natural semantic understanding. Similarly, the core of conversational search is also the semantic understanding of questions and answers.

The following figure shows an example. When a user searches for Zhejiang First Hospital, the traditional keyword-based search method finds no results because the database does not contain the keyword Zhejiang First Hospital. In this case, vector analysis is introduced to analyze the correlation between expressions and clicks in historical search behavior, establish a semantic correlation model, and express data features with high-dimensional vectors. A comparison of vector distances shows that Zhejiang First Hospital and The First Affiliated Hospital, Zhejiang University School of Medicine are highly correlated, so the desired information can be retrieved.

It can be seen that vectors can play an important role in semantic analysis and in returning relevant data results in conversational search solutions.

An Introduction to the Solution of OpenSearch Vector Search Edition Integrated With LLM

OpenSearch Vector Search Edition is a large-scale distributed search engine developed by Alibaba Cloud. Its core capabilities are widely used by Alibaba Group and Ant Group in many services. OpenSearch Vector Search Edition focuses on vector search scenarios and allows you to query data within milliseconds, update data within seconds, and write data in real time. It supports hybrid search that combines tag-based search and vector search, so the same platform can return vector search results for the Q&A scenarios of different enterprises and services.

OpenSearch Vector Search Edition integrated with an LLM works in two stages. In the first stage, OpenSearch Vector Search Edition vectorizes business data. In the second stage, the online search service of OpenSearch Vector Search Edition searches for the required content and returns search results.

(1) Business Data Preprocessing

You must preprocess the business data and build a vector index for vector search to allow end users to search for content based on their requirements.

Step 1: Import the business data in the TEXT format into the text vectorization model to obtain business data in the form of vectors.

Step 2: Import business data in the form of vectors to OpenSearch Vector Search Edition to build a vector index.
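The two preprocessing steps can be sketched as follows. This is a minimal illustration under stated assumptions, not the product's implementation: `embed` is a toy hashed bag-of-words function standing in for a real text vectorization model, `slice_text` is a hypothetical word-count slicer, and the "index" is a plain Python list rather than an OpenSearch index.

```python
import hashlib
import math

DIM = 64  # assumed toy dimension; real embedding models use far more

def embed(text):
    """Toy stand-in for a text vectorization model: hashed bag of words, L2-normalized."""
    vec = [0.0] * DIM
    for token in text.lower().split():
        bucket = int(hashlib.md5(token.encode("utf-8")).hexdigest(), 16) % DIM
        vec[bucket] += 1.0
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else vec

def slice_text(text, max_words=8):
    """Split a document into slices of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]

def build_index(documents):
    """Step 1: vectorize each slice. Step 2: store it as an index record."""
    index = []
    for doc_id, text in documents.items():
        for pos, chunk in enumerate(slice_text(text)):
            index.append({"id": f"{doc_id}#{pos}", "text": chunk, "vector": embed(chunk)})
    return index

docs = {
    "intro": "Havenask is an open-source large-scale search engine developed by "
             "Alibaba Cloud and open-sourced in November 2022",
}
index = build_index(docs)
print([record["id"] for record in index])  # ['intro#0', 'intro#1']
```

In a real deployment, the records produced here would be written to a MaxCompute or API data source and indexed by the engine.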

(2) Online Q&A and Content Search

After the vector search feature is implemented, OpenSearch Vector Search Edition obtains the top-N search results and returns the Q&A results using the LLM.

Step 1: Import the query of an end user to the text vectorization model to obtain the query in the form of vectors.

Step 2: Import the query in the form of vectors to OpenSearch Vector Search Edition.

Step 3: The built-in vector search engine of OpenSearch Vector Search Edition obtains the top-N search results from the business data.

Step 4: OpenSearch Vector Search Edition integrates the top-N search results as a prompt and imports the prompt into the LLM.

Step 5: The system returns the Q&A results generated by the LLM and the search results retrieved based on vectors to the end user.
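The five online steps can be illustrated with the sketch below. Everything in it is an assumption made for the example: `PASSAGES` plays the role of already indexed business data, `embed` is a toy word-count embedding, `echo_llm` is a stub in place of a real LLM call, and the prompt wording is hypothetical.

```python
import math

PASSAGES = [
    "Havenask is an open-source search engine developed by Alibaba Cloud.",
    "OpenSearch Vector Search Edition supports HNSW, QC, and Linear algorithms.",
    "Streaming output reduces the waiting time of LLM responses.",
]

def tokenize(text):
    return [w.strip(".,?!").lower() for w in text.split()]

VOCAB = sorted({w for p in PASSAGES for w in tokenize(p)})

def embed(text):
    """Toy embedding: normalized word counts over a fixed vocabulary."""
    tokens = tokenize(text)
    vec = [float(tokens.count(w)) for w in VOCAB]
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else vec

INDEX = [(p, embed(p)) for p in PASSAGES]

def retrieve(query, n):
    """Steps 1-3: vectorize the query and recall the top-N passages."""
    q = embed(query)
    scored = sorted(INDEX, key=lambda item: sum(a * b for a, b in zip(q, item[1])),
                    reverse=True)
    return [text for text, _ in scored[:n]]

def build_prompt(query, hits):
    """Step 4: merge the top-N results into a single prompt."""
    context = "\n".join(f"- {h}" for h in hits)
    return f"Answer using only the context below.\nContext:\n{context}\nQuestion: {query}"

def answer(query, llm, n=2):
    """Step 5: return both the generated answer and the retrieved results."""
    hits = retrieve(query, n)
    return {"answer": llm(build_prompt(query, hits)), "hits": hits}

def echo_llm(prompt):
    # Stub: a real service would send the prompt to an LLM here.
    return "stub answer built from a %d-character prompt" % len(prompt)

result = answer("What is Havenask?", echo_llm)
print(result["hits"][0])
```

Returning the hits alongside the answer lets the caller show sources or fall back to a traditional result list.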

(3) Search Result Demonstration

The following figure shows the result of conversational search built using this solution and using OpenSearch product documentation as business data.

What Are the Advantages of OpenSearch Vector Search Edition Integrated with LLM?

(1) High-Performance: The Vector Search Engine Developed by Alibaba Cloud Ensures High Performance

In LLM scenarios, searching high-dimensional vectors places high demands on engine performance.
OpenSearch Vector Search Edition can return results over hundreds of billions of data records within milliseconds, update data in real time, and make the updates visible in search results within seconds.
The search performance of OpenSearch Vector Search Edition is several times higher than that of an open-source vector search engine. The recall rate of OpenSearch Vector Search Edition is significantly higher than that of an open-source vector search engine in high queries per second (QPS) scenarios.

OpenSearch Vector Search Edition vs. Open-Source Vector Search Engine: Medium Data Scenarios

OpenSearch Vector Search Edition vs. Open-Source Vector Search Engine: Big Data Scenarios

Data Source: Alibaba AI Engine Division Team, November 2022

(2) Cost-Effective: Multiple Methods Are Available to Reduce Storage Costs and Resource Consumption

Data Compression: You can convert raw data into the FLOAT data type for storage and then use efficient algorithms (such as ZSTD) to compress data. This way, storage costs are reduced.
Fine-Grained Index Schema Design: Different optimization strategies can be used to reduce the index size for different indexes.
Index Loading: With the mmap loading policy, indexes can be loaded without being locked in memory, which effectively reduces memory overhead.
Engine: Compared with open-source vector search engines, OpenSearch Vector Search Edition can build indexes of smaller sizes and consumes fewer GPU resources. Under the same data conditions, OpenSearch Vector Search Edition occupies only about 50% of the memory occupied by an open-source vector search engine.
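The effect of the data compression item can be illustrated with a small standard-library sketch. It assumes that FLOAT means a 4-byte single-precision value and uses zlib only as a stand-in, because ZSTD is not part of the Python standard library. The data is a toy sequence chosen to compress well, so the ratio says nothing about real embeddings.

```python
import array
import zlib

# Toy data: 1,000 values drawn from only 10 distinct numbers.
values = [(i % 10) / 10 for i in range(1000)]

as_double = array.array("d", values).tobytes()  # 8 bytes per value
as_float = array.array("f", values).tobytes()   # 4 bytes per value
compressed = zlib.compress(as_float, 9)         # stand-in for ZSTD

print(len(as_double), len(as_float), len(compressed))

# Decompression restores the float bytes exactly.
restored = zlib.decompress(compressed)
```

Storing single-precision values halves the raw size before any compression is applied; how much the compressor saves on top of that depends on the data.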

(3) Rich Vector Search Capabilities

Supports multiple vector search algorithms (such as HNSW, QC, and Linear)
Supports hybrid search that combines tags, inverted indexes for text search, and vector indexes
Supports filtering by expression and filtering while searching
Supports flexible configurations of parameters (such as the similarity threshold and the number of nodes returned by scanning)
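How a tag filter and a similarity threshold interact with vector scoring can be sketched with a brute-force (Linear) scan. The function `search`, its parameters, and the record layout are hypothetical and do not reflect the OpenSearch query syntax. The sketch only shows the idea of filtering while searching.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def search(query_vec, records, n, min_score=0.0, predicate=None):
    """Linear scan that applies the filter during scoring, then a score threshold."""
    hits = []
    for rec in records:
        if predicate is not None and not predicate(rec):
            continue  # filtered out before any similarity is computed
        score = dot(query_vec, rec["vector"])
        if score >= min_score:
            hits.append((score, rec["id"]))
    hits.sort(reverse=True)
    return [rec_id for _, rec_id in hits[:n]]

records = [
    {"id": "a", "category": "doc", "vector": [1.0, 0.0]},
    {"id": "b", "category": "faq", "vector": [0.8, 0.6]},
    {"id": "c", "category": "doc", "vector": [0.6, 0.8]},
    {"id": "d", "category": "doc", "vector": [0.0, 1.0]},
]

ids = search([1.0, 0.0], records, n=3, min_score=0.5,
             predicate=lambda rec: rec["category"] == "doc")
print(ids)  # ['a', 'c']
```

Record "b" is dropped by the tag filter and record "d" by the similarity threshold, so fewer than n results come back.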

(4) Support for Massive Data to Deal with Business Expansion

Supports fast import and indexing of large-scale vectors. A single node supports 100 million 348-dimensional vectors, and with optimized configurations, indexes for the full data can be built within 3.5 hours.
Supports dynamic data updates, immediate query, and automatic reindexing
Supports horizontal data scaling

(5) Flexible and Fast Building of Enterprise-Specific Conversational Search

Stability and Reliability: This system uses your business data instead of public data to generate content. The output results are more stable and reliable.
Improved Interaction Form: This system can return conversational search results to customers. It can also work as a traditional search engine to return top-N search results. This allows you to flexibly respond to various business scenarios.
Streaming Output: LLM interactions after vector search usually take a long time. OpenSearch supports streaming output to alleviate the problem of long waiting times.
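Streaming output can be pictured as a generator that hands over partial text as soon as it is available instead of waiting for the whole answer. The sketch below is an assumption-laden illustration: `stream_answer` is a hypothetical helper that splits an already known string, whereas a real service would forward chunks as the LLM produces them.

```python
def stream_answer(text, chunk_size=3):
    """Yield the answer a few words at a time."""
    words = text.split()
    for i in range(0, len(words), chunk_size):
        yield " ".join(words[i:i + chunk_size])

chunks = []
for chunk in stream_answer("Havenask was open-sourced in November 2022"):
    chunks.append(chunk)  # a UI would render each chunk immediately
    print(chunk)
```

The total generation time is unchanged, but the user sees the first words much sooner.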

Product Configuration Process

When you create an Alibaba Cloud account and log on to the console for the first time, the system prompts you to create an AccessKey pair (an AccessKey ID and an AccessKey secret) before performing subsequent operations.
OpenSearch Vector Search Edition supports MaxCompute data sources and API data sources. You must prepare vector data in advance. (Text embeddings will be integrated into OpenSearch Vector Search Edition in later versions. You can follow the product update announcement.)
After you purchase an OpenSearch Vector Search Edition instance, the system automatically deploys an empty cluster that has the same specifications as the instance purchased. You must configure a data source and an index schema and rebuild the index for the cluster. After you complete the vector data import and index creation, you can use the vector search feature.
Use the query test page in the console, or the API or SDK, to test the vector search performance.
Download the tool for combining OpenSearch Vector Search Edition and an LLM and configure LLM-related information. (You can modify the code of the tool to select a third-party LLM.)
Start the conversational search service.

Read the Detailed Tutorial:
https://www.alibabacloud.com/help/en/opensearch/latest/opensearch-big-model-enterprise-specific-intelligent-question-and-answer-scheme

Learn More about OpenSearch:
https://www.alibabacloud.com/product/opensearch

Buy OpenSearch:
https://common-buy-intl.alibabacloud.com/?commodityCode=opensearch_ha3post_public_intl

Note: The open-source vector embedding model and LLM mentioned in this solution come from third parties (collectively referred to as third-party models). Alibaba Cloud cannot guarantee the compliance and accuracy of third-party models and assumes no responsibility for third-party models or your behavior and the results of using third-party models. Therefore, proceed with caution before visiting or using third-party models. In addition, we remind you that third-party models come with agreements (such as open-source licenses and other license agreements). Please carefully read and strictly abide by the provisions of these agreements.
