Scenarios - Vector Retrieval Service

This topic describes how DashVector can be used in the following scenarios: smart search and preference recommendation on e-commerce platforms, AI question-and-answer systems based on natural language processing, multi-modal search on gallery websites, video search, and molecular detection and screening.

Smart search and preference recommendation on e-commerce platforms

In these scenarios, search and recommendation are driven by the similarity between vectors stored in a vector database. For example, an e-commerce platform holds a large number of product images and product descriptions. The platform wants users to be able to search for products by image or by description, and it also wants to automatically recommend products that users are likely to be interested in.

The platform only needs to embed the image and description of each product into vector data and store the vector data in a vector database. When a user initiates a search request, DashVector converts the request into a vector, calculates the similarity between that vector and the vectors of the products in the database, and returns the products that are most similar to the request. In addition, DashVector converts each user's browsing and purchase history into vector data and compares it against the product vectors. The products that best match the user's past behavior and preferences are returned as recommendations. This provides a more intelligent and personalized service and a more efficient shopping experience.
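The search and recommendation flow described above can be sketched in a few lines of plain Python. This is an in-memory illustration only, not the DashVector SDK: the product vectors are made-up 3-dimensional values, cosine similarity is an assumed choice of metric, and the names `top_k` and `profile` are hypothetical helpers. Averaging the vectors of viewed products is one simple way to represent a user's history, and is likewise an assumption.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k(query, catalog, k=2, exclude=()):
    """Return the ids of the k catalog items most similar to the query vector."""
    scored = [(cosine(query, vec), pid)
              for pid, vec in catalog.items() if pid not in exclude]
    scored.sort(reverse=True)
    return [pid for _, pid in scored[:k]]

def profile(history, catalog):
    """Average the vectors of previously viewed items into one preference vector."""
    vecs = [catalog[pid] for pid in history]
    return [sum(col) / len(vecs) for col in zip(*vecs)]

# Toy product embeddings (invented for illustration).
catalog = {
    "red_sneaker":  [0.9, 0.1, 0.0],
    "blue_sneaker": [0.8, 0.2, 0.1],
    "rain_boot":    [0.5, 0.5, 0.1],
    "desk_lamp":    [0.0, 0.1, 0.9],
}

# Search: the query vector stands in for an embedded search request.
search_hits = top_k([1.0, 0.0, 0.0], catalog, k=2)

# Recommendation: compare the user's history vector against unseen products.
history = ["red_sneaker"]
recommended = top_k(profile(history, catalog), catalog, k=1, exclude=set(history))
```

In a real deployment the exhaustive scan in `top_k` would be replaced by a query against the vector database, which is what makes the approach practical for large catalogs.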

AI question-and-answer systems based on natural language processing

Question-and-answer systems are a common application of natural language processing. Typical examples include Tongyi Qianwen, ChatGPT, online customer service systems, and QA chatbots. For example, a question-and-answer system contains a set of predefined questions and their answers. When a user enters a question, the system is expected to find the most similar predefined question and return the corresponding answer. To implement this feature, DashVector converts the predefined questions and answers into vector data and stores the vector data in a vector database. When a user enters a question, DashVector converts the question into a vector and runs a vector query to find the most similar predefined question in the database. Combined with further steps such as model training, question-and-answer reasoning, and optimization, this retrieval step helps build an intelligent language interaction system such as Tongyi Qianwen or ChatGPT.
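As a rough illustration of the matching step, the sketch below finds the predefined question closest to a user's question and returns its answer. Everything here is an assumption made for the example: `embed` is a toy bag-of-words stand-in for a real embedding model, the FAQ entries are invented, and the `min_score` cutoff is a hypothetical way to decline to answer when nothing matches well.

```python
import math
from collections import Counter

def embed(text):
    """Toy bag-of-words 'embedding'; a real system would call an embedding model."""
    return Counter(text.lower().replace("?", "").split())

def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

# Predefined questions and answers (invented for illustration).
faq = {
    "How do I reset my password?": "Use the 'Forgot password' link on the sign-in page.",
    "How do I track my order?": "Open 'My orders' and select the order to see its status.",
}

# Index step: embed every predefined question once, ahead of time.
index = [(embed(question), question, reply) for question, reply in faq.items()]

def answer(user_question, min_score=0.3):
    """Return the answer of the most similar predefined question, or None."""
    query = embed(user_question)
    score, _, best_reply = max((cosine(query, vec), q, a) for vec, q, a in index)
    return best_reply if score >= min_score else None
```

Calling `answer("I want to reset my password")` returns the password-reset answer, while an unrelated question such as `answer("What is the weather today")` falls below the cutoff and returns `None`.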

Multi-modal search on gallery websites

Large-scale image material websites and social networking applications usually host hundreds of millions or even tens of billions of images. At that scale, users often cannot find the images they need by searching with simple keywords or with images alone. With DashVector, image content and descriptions are stored as vector data in a vector database, so users can search by text, by image, or by a combination of both. Each search request is converted into a vector and compared against the vector data in the database, which lets users find the images they need faster and more easily.
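One way to picture a combined text-and-image query is to blend the two query vectors before searching. The sketch below assumes that text and images are embedded into the same vector space (as multi-modal embedding models typically do) and that a weighted sum is an acceptable way to merge them; both are assumptions, and the vectors, gallery entries, and helper names are invented for illustration.

```python
import math

def normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v] if n else list(v)

def combine(text_vec, image_vec, text_weight=0.5):
    """Blend text and image query vectors; assumes a shared embedding space."""
    t, i = normalize(text_vec), normalize(image_vec)
    return normalize([text_weight * a + (1 - text_weight) * b for a, b in zip(t, i)])

def search(query, gallery, k=1):
    """Return the names of the k gallery images with the highest cosine similarity."""
    q = normalize(query)
    scored = sorted(
        ((sum(a * b for a, b in zip(q, normalize(vec))), name)
         for name, vec in gallery.items()),
        reverse=True,
    )
    return [name for _, name in scored[:k]]

# Toy image embeddings: dimension 0 ~ "beach", dimension 1 ~ "sunset".
gallery = {
    "beach_sunset": [0.7, 0.7, 0.0],
    "beach_noon":   [0.9, 0.1, 0.1],
    "city_sunset":  [0.1, 0.9, 0.2],
}

text_query = [1.0, 0.0, 0.0]   # stands in for the embedded text "beach"
image_query = [0.0, 1.0, 0.0]  # stands in for an embedded photo of a sunset

text_only = search(text_query, gallery)
image_only = search(image_query, gallery)
both = search(combine(text_query, image_query), gallery)
```

With these toy values, the text-only query finds `beach_noon`, the image-only query finds `city_sunset`, and the combined query finds `beach_sunset`, the only image that satisfies both parts of the request.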

Video search

In video search scenarios, platforms such as video surveillance systems, video resource websites, and short video applications hold a large amount of video data. DashVector converts video data into vector data and stores the vector data in a vector database. If a user has a movie clip or a video screenshot and runs a content-based vector search in the video similarity search system, DashVector finds the video that is most similar to the source clip or screenshot and returns it to the user. DashVector also allows users to search for videos based on clustering. Videos are grouped into clusters, and a search is conducted within the matching cluster to improve the efficiency and accuracy of the search.
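The clustering idea can be illustrated with a small two-stage search: first pick the cluster whose centroid is closest to the query, then compare the query only against videos in that cluster. This is a generic sketch of the technique, not a description of how DashVector implements it. The centroids are assumed to be precomputed (for example, by k-means), and all vectors and names are invented.

```python
import math

def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest(vec, candidates):
    """Return the key of the candidate vector closest to vec."""
    return min(candidates, key=lambda key: dist(vec, candidates[key]))

# Assumed precomputed cluster centroids and toy video embeddings.
centroids = {"sports": [1.0, 0.0], "cooking": [0.0, 1.0]}
videos = {
    "football_final": [0.9, 0.1],
    "tennis_rally":   [0.8, 0.3],
    "pasta_recipe":   [0.1, 0.9],
    "bread_baking":   [0.2, 0.8],
}

# Index step: file each video under its nearest centroid.
clusters = {name: {} for name in centroids}
for video_id, vec in videos.items():
    clusters[nearest(vec, centroids)][video_id] = vec

def clustered_search(query):
    """Pick the closest cluster, then the closest video inside that cluster."""
    cluster = nearest(query, centroids)
    return cluster, nearest(query, clusters[cluster])

# The query vector stands in for an embedded clip or screenshot.
result = clustered_search([0.95, 0.05])
```

Searching one cluster skips most comparisons, but the true nearest video can occasionally sit just across a cluster boundary, so cluster-based indexes commonly probe several of the closest clusters rather than only one.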

Molecular detection and screening

In molecular detection scenarios, molecular fingerprints such as Extended-Connectivity Fingerprints (ECFP) and Molecular ACCess System (MACCS) keys are converted into vector data and stored in a vector database. When a user initiates a search request, DashVector uses the same method to convert the query molecule into a vector, compares it with the molecular vector data in the database to find the most similar entry, and returns that entry to the user. This enables molecular search and screening based on structural similarity and provides more intelligent and efficient solutions for molecule discovery and drug design.
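To make fingerprint comparison concrete, the sketch below screens a small library against a query fingerprint. It assumes Tanimoto similarity, a metric widely used for binary molecular fingerprints; the source text does not say which metric is applied. Each fingerprint is represented as the set of its on-bit positions, the bit values are invented, and in practice they would come from a cheminformatics toolkit rather than being written by hand.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two sets of on-bit positions."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0

def screen(query, library, threshold=0.5):
    """Return (score, name) pairs at or above the threshold, best match first."""
    scored = [(tanimoto(query, fp), name) for name, fp in library.items()]
    return sorted((pair for pair in scored if pair[0] >= threshold), reverse=True)

# Toy fingerprints: each set lists the bit positions that are switched on.
library = {
    "mol_A": {1, 4, 7, 9, 12},
    "mol_B": {1, 4, 7, 10},
    "mol_C": {2, 3, 15},
}

query = {1, 4, 7, 9}
results = screen(query, library)
```

Here `mol_A` shares four of five distinct bits with the query (score 0.8) and `mol_B` shares three of five (score 0.6), so both pass the 0.5 threshold, while `mol_C` shares none and is filtered out.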