
An Introduction to the Design and Practice of High-Dimensional Vector Retrieval Technology in PostgreSQL

This short article introduces the design and practice of high-dimensional vector retrieval technology in Postgres.

What Is Vector Retrieval (Approximate Nearest Neighbor/ANN)?

Vector retrieval is the process of finding the K points nearest to a given point P among a set of known points. The points can have any number of dimensions (one, two, three, or many more) and are collectively called vectors. By number of dimensions, vectors are commonly described as:

  • High-Dimensional: The number of dimensions is greater than 10 and less than 1,000.
  • Ultra High-Dimensional: The number of dimensions is greater than 1,000.
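
To make the definition concrete, the following is a minimal sketch of exact (brute-force) nearest-K search in Python. It assumes Euclidean distance as the measure of "nearest", which the article does not specify, and the names `euclidean_distance`, `nearest_k`, and `known_points` are hypothetical. Approximate nearest neighbor (ANN) methods return roughly the same answer while avoiding a comparison against every known point.

```python
import heapq
import math
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def euclidean_distance(a: Vector, b: Vector) -> float:
    """Straight-line distance between two vectors of equal dimension."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_k(points: List[Vector], p: Vector, k: int) -> List[Tuple[int, float]]:
    """Return (index, distance) pairs for the k points closest to p, nearest first.

    This scans every known point, so the cost grows linearly with the
    number of points and with the dimension of the vectors.
    """
    scored = ((i, euclidean_distance(v, p)) for i, v in enumerate(points))
    return heapq.nsmallest(k, scored, key=lambda pair: pair[1])


# Illustrative two-dimensional data; real feature vectors have far more dimensions.
known_points = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0), (2.0, 2.0)]
result = nearest_k(known_points, (1.2, 0.9), k=2)
print(result)
```

With this sample data, the two nearest points to P = (1.2, 0.9) are the ones at index 1 and index 3.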

[Figure 1]

Application Scenario 1: Search by Image Recognition

Pailitao, the image search tool next to the search box on Taobao, is a typical example. You take a photo of an item you are interested in, and the tool searches the platform for similar goods. The technology uses multiple deep CNN models to extract features from each image, and the extracted features are high-dimensional vectors (say, 256 dimensions). At search time, the same feature extraction is applied to the photo you took, which yields another 256-dimensional vector. That vector is then used to search the whole library for similar vectors. The following figure shows the process:

[Figure 2]

Application Scenario 2: Recommendation

Personalized recommendation is similar to searching by image. The difference is that recommendation searches for products a user is likely to be interested in based on the user's features. The implementation uses a twin-tower (two-tower) model, which extracts user features and product features separately. In the final search stage, the user features are used as the query against the product feature library. This is the item retrieval step of personalized recommendation. The following figure shows the process:

[Figure 3]
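
The retrieval step described above can be sketched as follows. This is an illustration under stated assumptions, not the production method: it assumes both towers have already produced vectors of the same dimension and that similarity is scored with an inner product, a common choice for two-tower models that the article does not confirm. The function names, the item IDs, and the three-dimensional vectors are all hypothetical.

```python
import heapq
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]


def inner_product(a: Vector, b: Vector) -> float:
    """Similarity score between a user vector and a product vector (assumed metric)."""
    return sum(x * y for x, y in zip(a, b))


def retrieve_items(
    user_vec: Vector, item_library: Dict[str, Vector], k: int
) -> List[Tuple[str, float]]:
    """Return the k products whose vectors score highest against the user vector."""
    scored = (
        (item_id, inner_product(user_vec, vec))
        for item_id, vec in item_library.items()
    )
    return heapq.nlargest(k, scored, key=lambda pair: pair[1])


# Hypothetical output of the product tower: one feature vector per product.
item_library = {
    "running_shoes": (0.9, 0.1, 0.0),
    "coffee_maker": (0.0, 0.2, 0.9),
    "sports_watch": (0.7, 0.6, 0.1),
}

# Hypothetical output of the user tower for a single user.
user_vec = (1.0, 0.5, 0.0)

top_items = retrieve_items(user_vec, item_library, k=2)
print(top_items)
```

For this sample user, the sketch ranks `sports_watch` first and `running_shoes` second. Scanning the whole library like this is exact but slow at scale, which is the cost a vector index is meant to reduce.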

Application Scenario 3: Semantic Retrieval Based on the Deep Model

Semantic retrieval is also widely used. Search on Alipay and Taobao, for example, relies on it. The recently popular Boolean model also applies to these scenarios. A deep model extracts features from the query vocabulary, and vector retrieval then uses those features to find products that match the queried content. The following figure shows the process:

[Figure 4]

The scenarios above show how widely vector retrieval is used in search, recommendation, and related applications. Driven by advances in deep learning, vector retrieval has developed rapidly and seen broad adoption in recent years.


ApsaraDB
