Overview
Vector Retrieval Service is built on the kernel of Proxima, Alibaba Cloud's vector engine. It provides efficient, fully managed vector retrieval on a cloud-native architecture and can be scaled horizontally and vertically to meet business needs. Through simple APIs and SDKs, it offers powerful vector management and retrieval capabilities that you can easily integrate with applications in scenarios such as intelligent Q&A and multimodal search.
How It Works
-
Applications
Applications in scenarios such as multimodal search, intelligent Q&A, and LLM-based services can be integrated with Vector Retrieval Service through low-code APIs and simple SDKs.
-
Vector Retrieval Service
Built on a cloud-native architecture, Vector Retrieval Service can be deployed across clusters and regions. You can adjust service capacity and performance (such as QPS) by scaling clusters and services (searchers) to respond flexibly to business changes. You can manage and configure vector search resources, such as clusters, collections, and API keys, in the Vector Retrieval Service console. SDKs for mainstream programming languages help you get started quickly with little code.
-
Cloud Infrastructure
Alibaba Cloud provides robust cloud resources for computing, storage, and networking, as well as powerful services for data processing and container management, to support a flexible, reliable, cloud-native architecture.
Benefits
-
Fully-Managed Service
Quickly integrate a fully managed, cloud-native vector retrieval service into your business. The serverless architecture keeps O&M costs low, and you pay only for the data you consume.
-
Designed for Vector Search
Meet diverse search needs across business scenarios with rich features such as conditional filtering, data partitions, different search types, and multimodal data retrieval.
-
Scale-Performance Balance
Adjust service capacity and QPS flexibly according to business needs by scaling clusters or search services in a few easy steps, balancing scale, accuracy, and performance.
-
Low Code
Start managing and searching vector data with easy-to-use APIs that require minimal coding and out-of-the-box SDKs.
Features
Fully Managed Vector Retrieval Services on the Cloud
Highly Accurate and Efficient Search
Vector Retrieval Service integrates Proxima, Alibaba Cloud's vector search engine, which provides high-performance algorithms for low-latency search over large-scale data.
Low O&M Costs
The fully managed, cloud-native vector search service with horizontal and vertical scalability reduces O&M costs, enabling you to focus only on business needs without concerns about the underlying architecture.
Minimalist SDK Design
Low-code APIs and easy-to-use SDKs support rapid integration with AI applications across business scenarios.
Real-Time Indexing of Vector Data
Indexing with Streaming Data
Vector Retrieval Service adopts a flat index architecture and supports online indexing of large-scale streaming data from scratch.
Real-time Online Update
When vector data is added, deleted, or modified, the change takes effect immediately. Newly added vector data is checked right away and written to disk in real time, and its status is updated in real time as well.
Fast Indexing of Massive Data
Vector Retrieval Service optimizes the index structure and loading method in various ways, and supports importing large-scale vector data with 2 to 20,000 dimensions.
Filtered Search
Customizable Schema
For filtered search, Vector Retrieval Service uses pre-defined fields to achieve higher retrieval speed and less computing power consumption.
Filtered Search with Multiple Expressions
You can perform combination search with comparison, logic, and string operators including "<", "<=", "=", "!=", ">=", ">", "and", "or", and "like".
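As a rough illustration of how a filter narrows the candidates before similarity ranking, the following self-contained sketch runs a filtered search over a small in-memory collection. It does not use the service's SDK: `Doc`, `like`, and `filtered_search` are hypothetical names, the filter is a Python callable rather than an expression string, and the `%` wildcard semantics of `like` are an assumption.

```python
import math
import re
from dataclasses import dataclass, field


@dataclass
class Doc:
    id: str
    vector: list
    fields: dict = field(default_factory=dict)


def like(value, pattern):
    """SQL-style LIKE where '%' matches any run of characters (assumed semantics)."""
    regex = ".*".join(re.escape(part) for part in pattern.split("%"))
    return re.fullmatch(regex, value) is not None


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def filtered_search(docs, query, predicate, topk=2):
    """Keep only docs whose fields satisfy the predicate, then rank by cosine similarity."""
    candidates = [d for d in docs if predicate(d.fields)]
    candidates.sort(key=lambda d: cosine(d.vector, query), reverse=True)
    return [d.id for d in candidates[:topk]]


docs = [
    Doc("a", [1.0, 0.0], {"price": 20, "brand": "acme-pro"}),
    Doc("b", [0.9, 0.1], {"price": 80, "brand": "acme-lite"}),
    Doc("c", [0.0, 1.0], {"price": 15, "brand": "zenith"}),
]

# Similar in spirit to the expression: price <= 50 and brand like 'acme%'
hits = filtered_search(
    docs,
    [1.0, 0.0],
    lambda f: f["price"] <= 50 and like(f["brand"], "acme%"),
)
print(hits)
```

Only document `a` passes both conditions, so it is the only candidate that gets ranked.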
Sparse Vector
Keyword Search and Hybrid Search
You can perform keyword search, vector search, or hybrid search (keyword + vector) with Vector Retrieval Service, which supports both sparse and dense vectors to balance semantic and keyword relevance.
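One common way to balance the two signals is a weighted sum of the dense and sparse dot products. The sketch below assumes that approach; the `alpha` weight, the function names, and the sample data are hypothetical, and the service may combine the scores differently.

```python
def dense_dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def sparse_dot(a, b):
    """Dot product of two sparse vectors stored as {index: weight} dicts."""
    if len(a) > len(b):
        a, b = b, a  # iterate over the smaller dict
    return sum(w * b[i] for i, w in a.items() if i in b)


def hybrid_score(q_dense, q_sparse, d_dense, d_sparse, alpha=0.5):
    """alpha=1.0 is pure vector search; alpha=0.0 is pure keyword search."""
    return alpha * dense_dot(q_dense, d_dense) + (1 - alpha) * sparse_dot(q_sparse, d_sparse)


query_dense, query_sparse = [1.0, 0.0], {7: 1.0}
docs = {
    "semantic_match": ([1.0, 0.0], {3: 1.0}),  # close in meaning, no shared keyword
    "keyword_match": ([0.0, 1.0], {7: 1.0}),   # shares the keyword, different meaning
}


def rank(alpha):
    """Order document ids by hybrid score, best first."""
    return sorted(
        docs,
        key=lambda k: hybrid_score(query_dense, query_sparse, *docs[k], alpha=alpha),
        reverse=True,
    )


print(rank(0.9))  # favors the semantic match
print(rank(0.1))  # favors the keyword match
```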
Sparse Vector Generator
We recommend using DashText from Vector Retrieval Service for sparse vector encoding. DashText uses the BM25 algorithm to convert raw text into sparse vector data, greatly simplifying keyword-based vector search.
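To show what BM25-based sparse encoding can look like, here is a minimal sketch. It is not DashText's implementation or API: the `BM25Encoder` class, whitespace tokenization, and the `k1` and `b` defaults are assumptions for illustration. Documents are encoded with term-frequency weights and queries with IDF weights, so the sparse dot product of the two equals a BM25 score.

```python
import math
from collections import Counter


class BM25Encoder:
    """Toy BM25 encoder producing sparse vectors as {index: weight} dicts."""

    def __init__(self, corpus, k1=1.2, b=0.75):
        self.k1, self.b = k1, b
        tokenized = [doc.lower().split() for doc in corpus]
        self.n_docs = len(tokenized)
        self.avg_len = sum(len(t) for t in tokenized) / self.n_docs
        self.vocab = {}            # token -> integer index
        self.doc_freq = Counter()  # index -> number of docs containing the token
        for tokens in tokenized:
            for tok in dict.fromkeys(tokens):  # unique tokens, stable order
                idx = self.vocab.setdefault(tok, len(self.vocab))
                self.doc_freq[idx] += 1

    def encode_document(self, text):
        tokens = [t for t in text.lower().split() if t in self.vocab]
        length_norm = 1 - self.b + self.b * len(tokens) / self.avg_len
        return {
            self.vocab[tok]: tf * (self.k1 + 1) / (tf + self.k1 * length_norm)
            for tok, tf in Counter(tokens).items()
        }

    def encode_query(self, text):
        out = {}
        for tok in dict.fromkeys(text.lower().split()):
            if tok in self.vocab:
                idx = self.vocab[tok]
                n = self.doc_freq[idx]
                out[idx] = math.log(1 + (self.n_docs - n + 0.5) / (n + 0.5))
        return out


corpus = ["vector search engine", "keyword search", "vector database"]
encoder = BM25Encoder(corpus)
query_vec = encoder.encode_query("vector engine")


def bm25_score(doc_text):
    doc_vec = encoder.encode_document(doc_text)
    return sum(w * doc_vec.get(i, 0.0) for i, w in query_vec.items())


print(bm25_score(corpus[0]), bm25_score(corpus[1]))
```

In this sample, "engine" appears in fewer documents than "vector", so it receives the larger query weight.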
Scenarios
You can use Vector Retrieval Service APIs to quickly build semantic search services from scratch based on text indexing and vector search capabilities to support generative AI applications like Tongyi Qianwen. These applications can create text-based content (including translation, re-writing, summarization, etc.), write code, and role-play.
Benefits
-
Highly Efficient
You can add, delete, search, and update vector data in real time, and perform incremental or full data synchronization from multiple data sources.
-
Fast and Accurate
You can perform filtered search with a combination of operators, and customize data fields through the Schema Free design to speed up vector search.
Vector Retrieval Service abstracts individual image, video, and text files into high-dimensional feature vectors (embeddings), and then builds an efficient vector index over all the features. Users only need to enter text or upload photos or videos to search for similar files. This multimodal search service greatly improves user experience.
Benefits
-
Flexible
You can set up multiple Collections and Partitions for your data and manage them easily.
-
Schema Free
You can customize data fields and increase the flexibility and accuracy of vector search.
-
Convenient
You can quickly set up a multimodal search service through low-code APIs and easy-to-use SDKs.
You can combine vector search services with Large Language Models (LLMs) to build domain-specific knowledge Q&A systems. First, convert both user input and the content of the knowledge base into high-quality vectors, then transform the matching process into semantic search with DashVector to extract relevant knowledge more accurately and efficiently. Through corresponding prompts, this service can understand user intent and provide answers with information from the knowledge base.
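The retrieve-then-prompt flow described above can be sketched as follows. This is a toy example under stated assumptions: `embed` is a bag-of-words stand-in for a real embedding model, the knowledge base is an in-memory list rather than a DashVector collection, and the final LLM call is omitted. All function names are hypothetical.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words term-count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(question, knowledge_base, topk=1):
    """Return the topk passages most similar to the question."""
    q = embed(question)
    scored = sorted(knowledge_base, key=lambda p: cosine(q, embed(p)), reverse=True)
    return scored[:topk]


def build_prompt(question, passages):
    """Assemble the retrieved passages and the question into a prompt for an LLM."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Answer using only the context below.\nContext:\n{context}\nQuestion: {question}"


knowledge_base = [
    "refunds are processed within seven business days",
    "the warranty covers hardware defects for two years",
]
question = "how long does the warranty last"
prompt = build_prompt(question, retrieve(question, knowledge_base))
print(prompt)
```

In a real system, the prompt would then be sent to an LLM, which answers using the retrieved passage.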
Benefits
-
Cloud Native
The cloud-native system architecture separates computing resources from storage resources, so you can easily scale up and scale out.
-
Easy to Integrate
You can integrate domain-specific knowledge bases with DashVector to provide accurate Q&A services.
-
Large Scope
Vector Retrieval Service supports fast recall of large-scale vector data to improve the accuracy of vector search.
In scenarios such as intelligent search and advertisement push, user insights such as purchase records are transformed into vector data. Vector Retrieval Service then searches the vector database for similar product information and recommends it to potential buyers, improving purchase rates and user experience.
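As a minimal sketch of this idea, the example below averages the vectors of a user's purchased items into a profile vector and recommends the most similar items the user has not bought yet. The averaging strategy, the function names, and the sample catalog are assumptions for illustration, not a description of how the service models users.

```python
import math


def mean_vector(vectors):
    """Element-wise mean of equally sized vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def recommend(purchased_ids, catalog, topk=1):
    """Rank items the user has not purchased by similarity to the user's profile vector."""
    profile = mean_vector([catalog[i] for i in purchased_ids])
    candidates = [i for i in catalog if i not in purchased_ids]
    candidates.sort(key=lambda i: cosine(profile, catalog[i]), reverse=True)
    return candidates[:topk]


catalog = {
    "running_shoes": [0.9, 0.1, 0.0],
    "trail_shoes": [0.8, 0.2, 0.0],
    "sports_socks": [0.7, 0.3, 0.1],
    "coffee_maker": [0.0, 0.1, 0.9],
}
picks = recommend(["running_shoes", "trail_shoes"], catalog)
print(picks)
```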
Benefits
-
High Compatibility
Vector Retrieval Service supports a wide range of data types and various search methods.
-
High Performance
DAMO Academy's vector search engine for large-scale text and vector data and Alibaba Cloud's high-availability architecture provide high performance for various search scenarios.
-
Customizable
You can customize the search distance, and set threshold values for similarities (vector data with a similarity higher than the threshold value will be filtered out).
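The sketch below illustrates selecting a distance metric and applying a similarity threshold. It assumes the threshold acts as a minimum similarity for returned results; the metric names, the `min_similarity` parameter, and the mapping from Euclidean distance to a similarity score are hypothetical and not taken from the service's API.

```python
import math


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def euclidean_similarity(a, b):
    """Map Euclidean distance to a (0, 1] similarity so that larger means closer."""
    return 1.0 / (1.0 + math.dist(a, b))


def dot_product(a, b):
    return sum(x * y for x, y in zip(a, b))


METRICS = {
    "cosine": cosine_similarity,
    "euclidean": euclidean_similarity,
    "dotproduct": dot_product,
}


def search(query, vectors, metric="cosine", min_similarity=None):
    """Score every vector with the chosen metric, optionally apply the threshold, and sort."""
    score = METRICS[metric]
    results = [(vid, score(query, v)) for vid, v in vectors.items()]
    if min_similarity is not None:
        results = [(vid, s) for vid, s in results if s > min_similarity]
    return sorted(results, key=lambda r: r[1], reverse=True)


vectors = {"near": [1.0, 0.1], "far": [-1.0, 0.0]}
query = [1.0, 0.0]
print(search(query, vectors, metric="cosine", min_similarity=0.5))
print(search(query, vectors, metric="euclidean"))
```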