
pgvector usage guide

Updated at: 2025-02-26 03:03

ApsaraDB RDS for PostgreSQL supports the pgvector extension, which introduces a vector data type and enables efficient storage and similarity searches of high-dimensional vectors.

Background

ApsaraDB RDS for PostgreSQL supports the pgvector extension, which enables storage and similarity searches of vector data, providing foundational data support for AI services.

The pgvector extension offers the following advantages:

  • It supports the vector data type, facilitating the storage and querying of vector data.

  • It enables both exact and approximate nearest neighbor (ANN) searches, and it can calculate distances or similarities between vectors by using metrics such as L2 Euclidean distance, cosine similarity, or inner product (a minimal query sketch follows this list). The extension also supports HNSW indexes, parallel index builds for IVFFlat, element-wise multiplication of vectors, the L1 distance function, and sum aggregation.

  • Vectors can be created with up to 16,000 dimensions, and indexes can be built for vectors with up to 2,000 dimensions.
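
The following is a minimal sketch of the distance operators mentioned in the list above, evaluated on literal vectors. It assumes the extension is already installed; in pgvector, <-> calculates the L2 distance, <=> the cosine distance, and <#> the negative inner product.

    -- Minimal sketch: compare two literal two-dimensional vectors with each operator.
    SELECT
      '[1, 1]'::vector <-> '[1.2, 0.8]'::vector AS l2_distance,
      '[1, 1]'::vector <=> '[1.2, 0.8]'::vector AS cosine_distance,
      '[1, 1]'::vector <#> '[1.2, 0.8]'::vector AS negative_inner_product;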

Related concepts and implementation principles

Embedding

Embedding translates high-dimensional data into a lower-dimensional representation. Commonly used in machine learning and natural language processing (NLP), it represents sparse symbols or objects as points in a vector space.

In NLP, word embedding converts words into real-valued vectors, enabling computers to process text-based content more effectively by capturing the semantics and grammatical relationships of words.

Note

For more information, see the official documentation of common embedding tools and libraries.

Implementation principles

  1. Embedding abstracts data such as words, images, audio, and videos from multiple dimensions into vector representations.

  2. The pgvector extension supports vector data, enabling ApsaraDB RDS for PostgreSQL instances to store such data.

  3. The extension facilitates exact and ANN searches on stored vector data.

Consider storing three objects, an apple, a banana, and a cat, in an RDS instance and using the pgvector extension to calculate their similarities. The following steps outline the process:

  1. Translate the apple, banana, and cat into vectors using embeddings. Below is an example of two-dimensional embeddings:

    Apple: embedding[1,1]
    Banana: embedding[1.2,0.8]
    Cat: embedding[6,0.4]
  2. Store the vector data in the RDS instance. For details on storing two-dimensional vector data, see Usage Examples.

    If you plot the apple, banana, and cat in a two-dimensional space, the apple and banana are close together because they are both fruits and share similar attributes. The cat, being an animal, is positioned further away from the fruits.

If you embed the fruits along finer-grained dimensions such as color, origin, and taste, you can obtain more accurate similarity results.
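
As a purely hypothetical sketch of this idea (the table name, dimension meanings, and values below are invented for illustration and are not produced by any real embedding model), a four-dimensional embedding might encode color, origin, sweetness, and texture separately:

    -- Hypothetical 4-dimensional embeddings: [color, origin, sweetness, texture].
    -- The values are illustrative only.
    CREATE TABLE fruit_items (
      id bigserial PRIMARY KEY,
      item text,
      embedding vector(4)
    );

    INSERT INTO fruit_items (item, embedding) VALUES
      ('Apple',  '[0.9, 0.2, 0.7, 0.8]'),
      ('Banana', '[0.1, 0.8, 0.9, 0.1]');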

Scenarios

  • Storing vector data.

  • Performing vector similarity searches.

Prerequisites

Your RDS instance must meet the following criteria:

  • The RDS instance is running PostgreSQL 14 or later.

  • The minor engine version of your RDS instance is 20230430 or later. For RDS PostgreSQL 17, the minor engine version must be 20241030 or later.

Note

For guidance on upgrading the major or minor engine version of your RDS instance, see Upgrade the Major Engine Version or Upgrade the Minor Engine Version.

Plug-in management

  • Creating the plug-in

    CREATE EXTENSION IF NOT EXISTS vector;
  • Deleting the plug-in

    DROP EXTENSION vector;
  • Updating the plug-in

    ALTER EXTENSION vector UPDATE [ TO new_version ];
    Note

    The new_version parameter specifies the version of the pgvector extension. For the latest version number and features, refer to the official pgvector documentation.
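
  • Checking the plug-in version

    To confirm which version of the pgvector extension is installed, for example after an update, you can query the standard PostgreSQL pg_extension catalog. This query is standard PostgreSQL rather than something specific to pgvector and is shown here only as a convenience:

    SELECT extversion FROM pg_extension WHERE extname = 'vector';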

Usage examples

Below is a simple example of using the pgvector extension. For comprehensive usage details, see the official pgvector documentation.

  1. Create a table named items that contains a column of the vector type to store embeddings.

    CREATE TABLE items (
      id bigserial PRIMARY KEY, 
      item text, 
      embedding vector(2)
    );
    Note

    The preceding statement uses two-dimensional vectors, but the pgvector extension supports vectors with up to 16,000 dimensions.

  2. Insert vectors into the created table.

    INSERT INTO
      items (item, embedding)
    VALUES
      ('Apple', '[1, 1]'),
      ('Banana', '[1.2, 0.8]'),
      ('Cat', '[6, 0.4]');
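
    Note that pgvector enforces the dimension declared for the column. As a minimal sketch (the 'Pear' row is hypothetical), inserting a vector whose dimension does not match vector(2) fails with an error similar to "expected 2 dimensions, not 3":

    -- This statement is expected to fail because the column is declared as vector(2).
    INSERT INTO items (item, embedding) VALUES ('Pear', '[1, 1, 1]');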
  3. Use the cosine distance operator <=> to calculate the similarity of the banana to the apple and the cat.

    SELECT
      item,
      1 - (embedding <=> '[1.2, 0.8]') AS cosine_similarity
    FROM
      items
    ORDER BY
      cosine_similarity DESC;
    Note
    • The formula cosine_similarity = 1 - cosine_distance is used to calculate similarities, where a smaller distance indicates a higher similarity.

    • Alternatively, the L2 Euclidean distance operator <-> or the inner product operator <#> (which returns the negative inner product) can be used to calculate distances between vectors.

    Sample output:

     item   | cosine_similarity
    --------+--------------------
     Banana |                  1
     Apple  |  0.980580680748848
     Cat    |  0.867105556566985

    In the output:

    • A value of 1 for the banana indicates an exact match.

    • A value of 0.98 for the apple suggests a high similarity with the banana.

    • A value of approximately 0.87 for the cat shows a lower similarity with the banana.

    Note

    A similarity threshold can be set based on business needs to filter out results with low similarities.
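
    For example, the following is a minimal sketch of threshold-based filtering on the items table above. The 0.9 threshold is an arbitrary value for illustration, not a recommendation; with the sample data, it keeps the banana and the apple and filters out the cat:

    SELECT
      item,
      1 - (embedding <=> '[1.2, 0.8]') AS cosine_similarity
    FROM
      items
    WHERE
      1 - (embedding <=> '[1.2, 0.8]') > 0.9
    ORDER BY
      cosine_similarity DESC;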

  4. Create an index on the vector column to improve the efficiency of vector similarity searches performed by the pgvector extension. The following statements create an index on the embedding column.

    Create an HNSW index:

    CREATE INDEX ON items USING hnsw (embedding vector_cosine_ops) WITH (m = 16, ef_construction = 64);

    Parameter description:

    • m: The maximum number of neighbors for each node in each layer when the HNSW index is built. A larger value produces a denser graph, which usually increases the recall rate but also increases the time required to build and query the index.

    • ef_construction: The size of the candidate set when the HNSW index is built, that is, how many candidate nodes are retained during the build to select the optimal connections. A larger value increases the recall rate but also increases the time required to build and query the index.
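
    At query time, HNSW searches can additionally be tuned with the hnsw.ef_search parameter, which controls the size of the dynamic candidate list during a search (40 by default). This parameter is documented by pgvector rather than in this guide, and the value 100 below is only an illustration; a larger value improves the recall rate at the cost of query speed.

    SET hnsw.ef_search = 100;
    SELECT item FROM items ORDER BY embedding <=> '[1.2, 0.8]' LIMIT 5;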

    Create an IVFFlat index:

    CREATE INDEX ON items USING ivfflat (embedding vector_cosine_ops) WITH (lists = 100);

    Parameter description:

    • items: The name of the table on which the index is created.

    • embedding: The name of the column on which the index is created.

    • vector_cosine_ops: The operator class that the vector index uses. For cosine similarity searches, use vector_cosine_ops. For L2 Euclidean distance, use vector_l2_ops. For inner product similarity, use vector_ip_ops.

    • lists = 100: The lists parameter specifies the number of lists into which the dataset is divided. A larger value divides the dataset into more and smaller subsets, which improves the speed of index searches but may decrease the recall rate.

    Note
    • The recall rate is the ratio of the relevant samples that are retrieved to the total number of relevant samples in a search or classification task. It measures the ability of the system to identify all relevant samples and is an important evaluation metric.

    • Building an index requires a large amount of memory. If the value of the lists parameter exceeds 2,000, an error message ERROR: memory required is xxx MB, maintenance_work_mem is xxx MB is returned. In this case, you must set a larger value for the maintenance_work_mem parameter before you can create an index for the vector data. However, if the value is too large, the instance is at a high risk of out-of-memory (OOM) errors. For more information about how to set the parameter, see Set instance parameters.

    • You must adjust the value of the lists parameter to balance the query speed and recall rate to meet the requirements of specific scenarios.

    To specify the number of lists to search within the index, you can set the ivfflat.probes parameter using one of the following methods. Raising the ivfflat.probes value allows you to search additional lists, enhancing the recall rate and yielding more relevant query results.

    • Session level

      SET ivfflat.probes = 10;
    • Transaction level

      BEGIN;
      SET LOCAL ivfflat.probes = 10;
      SELECT ...;
      COMMIT;

    A higher ivfflat.probes value increases the recall rate but slows down the query speed. Adjust the lists and ivfflat.probes values based on business requirements and dataset characteristics for optimal performance and recall rate.

    Note

    If the ivfflat.probes value equals the lists value specified during index creation, the vector index is ignored and a full table scan is performed, which can decrease search performance.
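
    To check whether a query actually uses the vector index instead of falling back to a full table scan, you can inspect the execution plan with the standard PostgreSQL EXPLAIN statement. The following is a minimal sketch; approximate index scans are only considered for ORDER BY ... LIMIT queries such as this one:

    EXPLAIN
    SELECT item FROM items ORDER BY embedding <=> '[1.2, 0.8]' LIMIT 5;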

Performance data

When creating an index for vector data, it's important to balance query speed and recall rate based on the volume and nature of the business data.

Best practices
