Vector search

Updated at: 2024-10-18 03:13

Practices in pure vector search scenarios using OpenSearch

1. Overview

AI algorithms can represent unstructured data from the physical world, such as speech, images, video, text, and behavior generated by people, things, and scenes, as multi-dimensional vectors. These vectors act like coordinates in a mathematical space and identify entities and the relationships between them. The process of converting unstructured data into vectors is generally called embedding, and unstructured search is the process of searching the generated vectors to find the corresponding entities.
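As a rough illustration of what "searching the generated vectors" means, the following sketch ranks a handful of entities by cosine similarity to a query vector. The entity names and 3-dimensional vectors are hypothetical, and the brute-force scan is only a teaching aid; it is not a description of how OpenSearch builds or queries its vector index.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

def top_k(query, corpus, k=2):
    """Return the k (entity_id, score) pairs most similar to the query."""
    scored = [(entity_id, cosine_similarity(query, vec))
              for entity_id, vec in corpus.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Hypothetical embeddings; real embeddings have far more dimensions.
corpus = {
    "cat_photo": [0.9, 0.1, 0.0],
    "dog_photo": [0.8, 0.3, 0.1],
    "invoice_pdf": [0.0, 0.2, 0.9],
}
results = top_k([1.0, 0.0, 0.0], corpus, k=2)
```

Here `results` lists the two entities whose vectors point in the direction closest to the query, which is the basic idea behind retrieving entities through their embeddings.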

image

Unstructured search is, in essence, vector search. The technology is mainly applied in fields such as facial recognition, recommendation systems, image search, video fingerprinting, voice processing, natural language processing (NLP), and file search. As AI technology is widely adopted and data volumes keep growing, vector search has become an indispensable part of the AI technology chain and a supplement to traditional search technology. Vector search also supports multi-modal search.

To meet the requirements of increasingly diverse and complex multi-modal search scenarios, OpenSearch provides a vector search feature that lets you build a high-performance vector search system in one place.

2. Create an OpenSearch instance

Step 1: Click Buy Now.

Step 2: Configure the specifications and parameters of an instance.

Parameter configuration

  • Product Type: Set this parameter to Pay-as-you-go if you purchase the instance only for testing.

  • Region and Zone: Set this parameter to China (Hangzhou) (customizable).

  • Application Name: Set this parameter to test_vector_opensearch (customizable).

  • Industry Type: Set this parameter to General Industry.

  • Specifications: Set this parameter to Exclusive Computing, and select 30 GB for Storage Capacity and 1000 LCU for Computing Resource. Then, click Buy Now.

Step 3: On the displayed Confirm Order page, select I have read and agree to Open Search (Subscription) on International Site Agreement of Service. Then, click Activate Now to confirm the order.

image

The OpenSearch instance is created.

3. Configure a vector retrieval service instance

On the Configure Application page, set the parameters in the Feature Selection, Application Schema, Index Schema, Data Source, and Complete steps in sequence.

3.1. Application Schema

Find the corresponding application on the Applications page of OpenSearch console and click Configure in the Actions column.

image

Step 1: Create an application schema.

You can manually create an application schema, or create an application schema by connecting to a data source, by uploading templates, or by uploading files. For example, to select MaxCompute as the data source, select Use Data Source for Application Schema Creation Method and then select MaxCompute for Use Data Source. Then, click Create Database.

image

Enter the database connection information.

image

Step 2: Select a table and click OK.

image

Step 3: Select a primary table and a primary key. If you need to join multiple tables, see Join multiple tables.

image

Note: The vector field must be set to the DOUBLE_ARRAY type.

3.2. Index Schema

Step 1: Index fields

After the application schema is configured, the system automatically generates the index fields, analyzers, index tags, and the fields contained in each index.

image

Note: You need to configure the vector index for the vector field (vector_field). You can select dimensions based on your needs. By default, OpenSearch supports 64-, 128-, 256-, 512-, and 1536-dimensional vectors.
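Because the vector index is configured with a fixed dimension, it can help to check your vectors on the client side before importing or querying. The following is a minimal sketch that assumes the default dimension list stated in the note above; `check_dimension` is a hypothetical helper, not part of any OpenSearch SDK.

```python
# Default dimensions listed in the note above; adjust if your instance differs.
SUPPORTED_DIMENSIONS = (64, 128, 256, 512, 1536)

def check_dimension(vector):
    """Return the vector's dimension, or raise if it is not a supported size."""
    dim = len(vector)
    if dim not in SUPPORTED_DIMENSIONS:
        raise ValueError(
            f"vector has {dim} dimensions; expected one of {SUPPORTED_DIMENSIONS}"
        )
    return dim

# Example: a 128-dimensional vector passes the check.
dim = check_dimension([0.0] * 128)
```

Running such a check before data import catches a mismatch between the embedding model's output size and the dimension selected for the vector index.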

Step 2: Attribute field list

image

3.3. Data Source

If you select a MaxCompute data source when you configure the application schema, the corresponding project table is mapped automatically. You only need to specify import conditions as required. By default, data from all partitions of the table is imported.

image

If a field name in the data source table differs from the name in the application schema you configured, click Modify to change the field mapping manually.

image

Confirm and click Finish.

image

3.4. Configuration completed

image

4. Online query

For more information about the vector query syntax, see Vector search.

  • To view the search test page, choose Feature Extensions > Run search tests.

image

# A 1536-dimensional vector is used as an example. The vector is truncated here.
vector_index:'-0.01786,0.03692,0.03710,0.01668,0.03655,-0.03515,0.02017,-0.00653,-0.01419,-0.01708,-0.00091,-0.03528,0.02821,-0.02194,-0.01609,-0.02045,0.02209,0.06413,0.06233,0.03064,-0.00863,-0.06810,0.00729,0.07912,-0.03948,0.06932,0.02051,-0.00688,-0.01138,0.03207,0.03040,-0.00050,0.06220,-0.03895,0.04575,-0.00259,0.04358,0.02027,0.03342,-0.02916,0.04793,-0.02954,0.04327,0.06156,-0.00230,0.00653,0.01515,-0.00287,0.03546,-0.01551,-0.03049,0.07542,-0.01563,0.00680,0.00598,-0.00396,0.00330,0.00359,-0.03395,-0.00825,-0.02175,0.04479,0.04008,0.03558,-0.03011,-0.00015,0.03086,-0.00941,0.03113,0.00758,-0.04333,0.04607,-0.02520,-0.01260,-0.04726,0.00564,-0.02423,-0.00439,-0.02739,-0.01674,0.06426,-0.05995,0.01762,0.04370,0.02211,-0.03174,0.04465,0.00475,-0.03577,0.01111,-0.00963,0.03510,-0.02533,-0.00444,0.00161,0.00561,0.00066,-0.04074,0.00682,0.03293,-0.01630,-0.02575,0.02834,0.02679,-0.04558,0.02395,0.00531,0.01240,0.04064,0.03599,0.00172,0.00413,-0.06839...&sf=0.8'
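Typing a 1536-value clause by hand is error-prone, so the clause is usually generated from the embedding. The following is a minimal sketch that reproduces the shape of the example above: index name, a quoted comma-separated list of values, and an optional `&sf=` suffix. `build_vector_clause` is a hypothetical helper, the five-decimal formatting simply mirrors the example, and `sf` is passed through unchanged; see the vector query syntax reference for its exact meaning and for other supported options.

```python
import math

def build_vector_clause(index_name, vector, sf=None, precision=5):
    """Format a vector query clause shaped like the example above."""
    if not vector:
        raise ValueError("vector must not be empty")
    if not all(math.isfinite(v) for v in vector):
        raise ValueError("vector values must be finite numbers")
    # Comma-separated values with a fixed number of decimal places.
    values = ",".join(f"{v:.{precision}f}" for v in vector)
    # Append the sf option only when the caller provides it.
    suffix = f"&sf={sf}" if sf is not None else ""
    return f"{index_name}:'{values}{suffix}'"

# A 3-value vector keeps the example short; a real query needs as many
# values as the dimension configured for the vector index.
clause = build_vector_clause("vector_index", [-0.01786, 0.03692, 0.0371], sf=0.8)
print(clause)
```

The resulting string can be pasted into the search test page in place of the hand-written clause.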
