Create a table for a DLF data source

This topic describes how to create a table for a Data Lake Formation (DLF) data source.

Prerequisites

You are familiar with DLF. For more information, see What is DLF?
A catalog, a database, and a table have been created in DLF. You specify them when you configure data synchronization.

Add a DLF data source

Log on to the OpenSearch Vector Search Edition console. In the left-side navigation pane, click Instances. On the Instances page, find the desired instance and click its ID. On the instance details page, click Table Management in the left-side navigation pane. On the page that appears, click Add Table.
In the Basic Table Information step, configure the parameters and click Next.
Parameters:

Table Name: the name of the table. You can enter a custom table name.
Data Shards: the number of data shards in the table. If you create multiple index tables in an OpenSearch instance, make sure that the index tables contain the same number of shards. Alternatively, make sure that at least one index table contains one shard and other index tables contain the same number of shards.
Number of Resources for Data Updates: the number of resources that are used for data updates. By default, a free quota of two resources for data updates is provided for each index. Each resource consists of 4 CPU cores and 8 GB of memory. You are charged for resources that exceed the free quota. For more information, see Billing overview of OpenSearch Vector Search Edition.
Scenario Template: the template that is used to create the table. Valid values: Common Template, Vector: Image Search, and Vector: Semantic Search for Text.
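
The shard rule and the free quota described above can be checked before you fill in the form. The following Python sketch is only an illustration: the function names and the dictionary layout are hypothetical and are not part of any OpenSearch SDK. The constants are taken from the parameter descriptions in this topic.

```python
from typing import Dict

# Values stated in the parameter descriptions above.
FREE_UPDATE_RESOURCES = 2      # free quota of resources for data updates per index
CORES_PER_RESOURCE = 4         # CPU cores in one resource
MEMORY_GB_PER_RESOURCE = 8     # GB of memory in one resource


def shard_counts_compatible(shards_by_table: Dict[str, int]) -> bool:
    """Check the shard rule for the index tables of one instance.

    The rule holds if all tables have the same shard count, or if some
    tables have one shard and all remaining tables share one shard count.
    """
    if any(count < 1 for count in shards_by_table.values()):
        raise ValueError("shard counts must be positive integers")
    # Ignore single-shard tables; the rest must agree on one value.
    multi_shard_counts = {c for c in shards_by_table.values() if c != 1}
    return len(multi_shard_counts) <= 1


def update_resource_summary(requested: int) -> Dict[str, int]:
    """Summarize the requested resources for data updates of one index."""
    if requested < 0:
        raise ValueError("requested must not be negative")
    return {
        "billable_resources": max(0, requested - FREE_UPDATE_RESOURCES),
        "total_cpu_cores": requested * CORES_PER_RESOURCE,
        "total_memory_gb": requested * MEMORY_GB_PER_RESOURCE,
    }
```

For example, `shard_counts_compatible({"docs": 1, "images": 4, "text": 4})` returns `True`, and a table with shard counts 2 and 4 in the same instance fails the check. The actual charges for billable resources are described in the billing topic and are not modeled here.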

In the Data Synchronization step, configure the following parameters to add a data source, and click Check to check the data source information. If the data source information passes the check, click Next.

Full Data Source: Select DLF.
Catalog ID: the ID of the DLF catalog that you want to access.
Database: the name of the database in the catalog.
Data Table: the name of the table in the database.
Note
- If you want to use DLF data sources for existing instances, you must first upgrade the engine versions of the instances.
- Only catalogs of the Paimon type are supported.
- For a primary key table in Paimon, you can add, delete, modify, and query data. For an append-only table in Paimon, you can only add data. You cannot modify or delete data in an append-only table.
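
The data source parameters and the Paimon table restrictions in the preceding note can be expressed as a small pre-check. This is a minimal sketch under stated assumptions: the class, the table kind labels `primary_key` and `append_only`, and the operation names are hypothetical and exist only for this example. The console performs its own validation when you click Check.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List


@dataclass(frozen=True)
class DlfSourceSettings:
    """The three values entered in the Data Synchronization step."""
    catalog_id: str
    database: str
    data_table: str

    def missing_fields(self) -> List[str]:
        """Return the names of settings that are empty or only whitespace."""
        return [name for name, value in vars(self).items() if not value.strip()]


# Write operations allowed per Paimon table kind, as described in the note.
WRITE_OPERATIONS: Dict[str, FrozenSet[str]] = {
    "primary_key": frozenset({"add", "delete", "modify"}),
    "append_only": frozenset({"add"}),
}


def is_write_allowed(table_kind: str, operation: str) -> bool:
    """Return True if the write operation is allowed for the table kind."""
    if table_kind not in WRITE_OPERATIONS:
        raise ValueError(f"unknown table kind: {table_kind}")
    return operation in WRITE_OPERATIONS[table_kind]
```

For example, `is_write_allowed("append_only", "delete")` returns `False`, which matches the restriction that data in an append-only table cannot be deleted.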

In the Field Configuration step, configure fields for the table and click Next.
Note
- The primary key field and vector field are required. For the primary key field, you must set the Type parameter to an integer type or the STRING type and select the option button in the Primary Key column. For the vector field, you must set the Type parameter to FLOAT and select the check box in the Vector Field column.
- By default, the vector field is a multi-value field of the FLOAT type.
- If a field does not exist or is empty in the source data, the system automatically sets the field to the default value. By default, a field of the numeric type is set to 0 and a field of the STRING type is set to an empty string. You can also specify custom default values.
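
The default value behavior in the preceding note can be sketched as follows. The function and the `numeric` and `string` kind labels are hypothetical and are used only to illustrate the rule. How the system detects an empty value is not specified in this topic, so the sketch assumes that `None` and an empty string both count as empty.

```python
from typing import Any, Dict, Optional


def fill_defaults(
    record: Dict[str, Any],
    field_kinds: Dict[str, str],
    custom_defaults: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Return a copy of record in which missing or empty fields get defaults.

    field_kinds maps each field name to "numeric" or "string".
    custom_defaults overrides the built-in defaults (0 and "").
    """
    custom_defaults = custom_defaults or {}
    result: Dict[str, Any] = {}
    for name, kind in field_kinds.items():
        value = record.get(name)
        # Assumption: None and "" are treated as empty values.
        if value is None or value == "":
            if name in custom_defaults:
                value = custom_defaults[name]
            else:
                value = 0 if kind == "numeric" else ""
        result[name] = value
    return result
```

For example, a source record that has no `price` field is returned with `price` set to 0, or with the custom default value if one is specified for that field.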
In the Index Schema step, configure the parameters and click Next.

The primary key field and vector field are required. The namespace field is optional and can be left empty.
You can configure only the three fixed fields for the Fields Contained parameter and cannot add fields.
Vector Dimension: the dimension of vectors. Specify a vector dimension based on the vector model that you select.
Distance Type: the type of vector distance. Valid values: SquareEuclidean and InnerProduct. Specify a distance type based on the vector model that you select.
Vector Index Algorithm: the algorithm that is used to create the vector index. Valid values: Qc, Linear, and HNSW. Specify an algorithm based on the vector model that you select.
Real-time Indexing: specifies whether to build real-time indexes for incremental data that is pushed by using API operations. Valid values: true and false. Default value: true.
You can also configure parameters for the advanced configurations of the vector index. For more information, see Common configurations of vector indexes.
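
To help you choose a value for Distance Type, the following sketch computes both measures for two vectors by using their standard mathematical definitions. This topic does not describe how the engine implements them, so treat the code as an illustration only. The helper names are hypothetical, and `dimension` stands for the value of the Vector Dimension parameter.

```python
from typing import Sequence


def check_dimension(vector: Sequence[float], dimension: int) -> None:
    """Raise ValueError if the vector length differs from the configured dimension."""
    if len(vector) != dimension:
        raise ValueError(f"expected {dimension} values, got {len(vector)}")


def square_euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance: the sum of squared differences per component."""
    check_dimension(b, len(a))
    return sum((x - y) ** 2 for x, y in zip(a, b))


def inner_product(a: Sequence[float], b: Sequence[float]) -> float:
    """Inner product: the sum of the products of matching components."""
    check_dimension(b, len(a))
    return sum(x * y for x, y in zip(a, b))
```

For the vectors `[1.0, 2.0]` and `[4.0, 6.0]`, `square_euclidean` returns 25.0 and `inner_product` returns 16.0. A smaller squared Euclidean distance means that the vectors are closer, whereas a larger inner product generally means that they are more similar.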

Confirm the creation. The system then automatically creates the table based on your configurations. You can view the table creation progress on the Change History page.
After the table enters the In Use state, you can perform a query test on the Query Test page.

Precautions

When new data is written to the Paimon table in DLF, OpenSearch automatically creates an index in real time based on the new data. If you manually write data to the Paimon table by calling an API, data inconsistency may occur. Therefore, proceed with caution.
