OpenSearch:Update data

Last Updated:Jun 27, 2024

This topic describes how to use the OpenSearch Vector Search Edition SDKs for Java, Python, and Go to update table data. You can upload, update, and delete documents.

Dependencies

Java in asynchronous mode

<dependency>
  <groupId>com.aliyun</groupId>
  <artifactId>aliyun-sdk-ha3engine-async</artifactId>
  <version>1.1.0</version>
</dependency>

Java

<dependency>
    <groupId>com.aliyun</groupId>
    <artifactId>aliyun-sdk-ha3engine-vector</artifactId>
    <version>1.1.3</version>
</dependency>

Python

#Requires: Python >=3.6
pip install alibabacloud_ha3engine_vector

Go

go get github.com/aliyun/alibabacloud-ha3-go-sdk@v1.1.3-vector

Parameter description

You must specify the following parameters when you use the SDKs: endpoint, instance_id, access_user_name, access_pass_word, and data_source_name.

  • endpoint: the internal or public endpoint.

You can view the endpoints in the Network Information and API Endpoint sections of the Instance Details page.

After you turn on Public Access, you can access the OpenSearch Vector Search Edition instance from your on-premises machine by using a public endpoint. A public endpoint contains the word "public". For more information about how to configure a whitelist of IP addresses, see Instance details.

If you access the OpenSearch Vector Search Edition instance from an Elastic Compute Service (ECS) instance, make sure that both instances use the same vSwitch. You can then access the OpenSearch Vector Search Edition instance by using the API endpoint.

  • instance_id: the ID of the OpenSearch Vector Search Edition instance.

  • access_user_name: the username.

  • access_pass_word: the password.

You can view the username and password in the API Endpoint section of the Instance Details page. The password is specified when you purchase the instance and can be modified.

  • data_source_name: the name of the API data source. The default value is in the format of Instance ID_Table name.

Example: ha-cn-zpr3dgzxg04_test_image_vector.
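The parameter rules above can be sketched as a few small helpers. This is a minimal illustration, not part of any SDK: the function names are hypothetical, the sample endpoint is made up from the instance ID in the example, and stripping an https:// prefix in addition to http:// is an assumption.

```python
def normalize_endpoint(endpoint: str) -> str:
    """Strip a leading http:// (or, by assumption, https://) prefix and a trailing slash."""
    for scheme in ("http://", "https://"):
        if endpoint.startswith(scheme):
            endpoint = endpoint[len(scheme):]
            break
    return endpoint.rstrip("/")


def is_public_endpoint(endpoint: str) -> bool:
    """A public endpoint contains the word "public"."""
    return "public" in normalize_endpoint(endpoint)


def default_data_source_name(instance_id: str, table_name: str) -> str:
    """Build the default data source name in the format <Instance ID>_<Table name>."""
    return f"{instance_id}_{table_name}"


if __name__ == "__main__":
    # Illustrative values only.
    print(normalize_endpoint("http://ha-cn-zpr3dgzxg04.public.ha.aliyuncs.com"))
    print(default_data_source_name("ha-cn-zpr3dgzxg04", "test_image_vector"))
```

The result of default_data_source_name() is the value that the demos below pass as the table name when they push documents.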

Data update demo

Upload a document

Java in asynchronous mode

The demo builds each document as a Map object, calls the add() method to append the Map objects to a list, and then calls the pushDocuments() method to submit all of the documents in a single request.

package com.aliyun.ha3engine;


import java.util.ArrayList;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;

import com.aliyun.ha3engine.async.AsyncClient;
import com.aliyun.ha3engine.async.models.PushDocumentsRequest;
import com.aliyun.ha3engine.async.models.PushDocumentsResponse;
import com.aliyun.sdk.ha3engine.async.core.AsyncConfigInfoProvider;
import com.aliyun.tea.TeaException;

import darabonba.core.client.ClientOverrideConfiguration;


/**
 * @author alibaba
 */
public class PushDoc {

    public static void main(String[] args) {

        try {
            // The username and password that are used to access the instance. You can view the username and password in the API Endpoint section of the Instance Details page.
            AsyncConfigInfoProvider provider = AsyncConfigInfoProvider.create("username", "password");

            // Initialize the asynchronous client.
            AsyncClient client = AsyncClient.builder()
                    .credentialsProvider(provider)
                    .overrideConfiguration(
                            ClientOverrideConfiguration.create()
                                    // The public endpoint of the instance. You can view the public endpoint in the Network Information section of the Instance Details page.
                                    .setEndpointOverride("ha-cn-***********.public.ha.aliyuncs.com")
                                    .setProtocol("http")
                    ).build();

            // The name of the table to which the document is pushed. Format: <Instance ID>_<Table name>.
            String tableName = "<instance_datasource_table_name>";

            // The primary key field of the document whose data is to be pushed.
            String pkField = "<field_pk>";

            // The structure added to specify document operations in the outer structure that is used to push document data. You can specify one or more document operations in the structure.
            ArrayList<Map<String, ?>> documents = new ArrayList<>();

            // The document to be uploaded.
            Map<String, Object> add2Document = new HashMap<>();
            Map<String, Object> add2DocumentFields = new HashMap<>();

            // The content of the document, specified as key-value pairs.
            // The key of the primary key field must be the same as the value of pkField.
            add2DocumentFields.put("<field_pk>", "<field_pk_value>");
            add2DocumentFields.put("<field_map_key_1>", "<field_map_value_1>");
            add2DocumentFields.put("<field_map_key_2>", "<field_map_value_2>");

            // The content can be of multi-value attribute types supported by OpenSearch Vector Search Edition. Set the multi_value parameter to true when you configure an index table.
            ArrayList<Object> addDocumentMultiFields = new ArrayList<>();
            addDocumentMultiFields.add("multi_value_1");
            addDocumentMultiFields.add("multi_value_2");
            add2DocumentFields.put("<multi_value_key>", addDocumentMultiFields);

            // Add the document content to an add2Document structure.
            add2Document.put("fields", add2DocumentFields);
            // Run the add command to upload the document.
            add2Document.put("cmd", "add");
            documents.add(add2Document);

            // Push data.
            PushDocumentsRequest request = PushDocumentsRequest.builder().body(documents).build();
            CompletableFuture<PushDocumentsResponse> responseCompletableFuture = client.pushDocuments(tableName, pkField, request);
            String responseBody = responseCompletableFuture.get().getBody();

            System.out.println("result:" + responseBody);

        } catch (ExecutionException | InterruptedException e) {
            System.out.println(e.getMessage());
        } catch (TeaException e) {
            System.out.println(e.getMessage());
            Map<String, Object> abc = e.getData();
            System.out.println(com.aliyun.teautil.Common.toJSONString(abc));
        }
        
    }
}

Java

package com.aliyun.ha3engine;

import com.aliyun.ha3engine.vector.Client;
import com.aliyun.ha3engine.vector.models.*;
import com.aliyun.tea.TeaException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Map;

/**
 * @author alibaba
 */
public class PushDoc {

  public static void main(String[] args) throws Exception {
    Config config = new Config();

    // The API endpoint of the instance. You can view the API endpoint in the API Endpoint section of the Instance Details page.
    config.setEndpoint("<instance_services_domain>");
    // The ID of the OpenSearch Vector Search Edition instance. You can view the instance ID in the upper-left corner of the Instance Details page. Example: ha-cn-i7*****605.
    config.setInstanceId("<instance_id>");

    // The username. You can view the username in the API Endpoint section of the Instance Details page.
    config.setAccessUserName("<user_name>");
    // The password. You can modify the password in the API Endpoint section of the Instance Details page.
    config.setAccessPassWord("<user_password>");

    Client client = new Client(config);

    // The name of the table to which the document is pushed. Format: <Instance ID>_<Table name>.
    String tableName = "<instance_id>_<table_name>";

    // The primary key field of the document whose data is to be pushed.
    String pkField = "<field_pk>";

    try {
      // The structure added to specify document operations in the outer structure that is used to push document data. You can specify one or more document operations in the structure.
      ArrayList<Map<String, ?>> documents = new ArrayList<>();

      // The document to be uploaded.
      Map<String, Object> add2Document = new HashMap<>();
      Map<String, Object> add2DocumentFields = new HashMap<>();

      // The content of the document, specified as key-value pairs.
      // The key of the primary key field must be the same as the value of pkField.
      add2DocumentFields.put("<field_pk>", "<field_pk_value>");
      add2DocumentFields.put("<field_map_key_1>", "<field_map_value_1>");
      add2DocumentFields.put("<field_map_key_2>", "<field_map_value_2>");

      // The content can be of multi-value attribute types supported by OpenSearch Vector Search Edition. Set the multi_value parameter to true when you configure an index table.
      ArrayList<Object> addDocumentMultiFields = new ArrayList<>();
      addDocumentMultiFields.add("multi_value_1");
      addDocumentMultiFields.add("multi_value_2");
      add2DocumentFields.put("<multi_value_key>", addDocumentMultiFields);

      // Add the document content to an add2Document structure.
      add2Document.put("fields", add2DocumentFields);
      // Run the add command to upload the document.
      add2Document.put("cmd", "add");
      documents.add(add2Document);

      // Push data.
      PushDocumentsRequest request = new PushDocumentsRequest();
      request.setBody(documents);
      PushDocumentsResponse response = client.pushDocuments(
        tableName,
        pkField,
        request
      );
      String responseBody = response.getBody();

      System.out.println("result:" + responseBody);
    } catch (TeaException e) {
      System.out.println(e.getMessage());

      Map<String, Object> abc = e.getData();

      System.out.println(com.aliyun.teautil.Common.toJSONString(abc));
    }
  }
}

Python

# -*- coding: utf-8 -*-

from alibabacloud_ha3engine_vector import models, client
from Tea.exceptions import TeaException, RetryError

Config = models.Config(
    endpoint="<API endpoint>",  # The API endpoint of the instance. You can view the API endpoint in the API Endpoint section of the Instance Details page. You must remove the http:// prefix when you specify the endpoint.
    instance_id="<Instance ID>",  # The ID of the OpenSearch Vector Search Edition instance. You can view the instance ID in the upper-left corner of the Instance Details page. Example: ha-cn-i7*****605.
    protocol="http",
    access_user_name="<Username>",  # The username. You can view the username in the API Endpoint section of the Instance Details page.
    access_pass_word="<Password>",  # The password. You can modify the password in the API Endpoint section of the Instance Details page.
)

# Initialize the engine client.
ha3EngineClient = client.Client(Config)

def push():
    # The name of the table to which the document is pushed. Format: <Instance ID>_<Table name>.
    tableName = "<instance_id>_<table_name>"

    try:
        # The document to be uploaded.
        # If the document already exists, the existing document is deleted and then the specified document is uploaded. 
        # =====================================================
        # The content of the document.
        add2DocumentFields = {
            "id": 1,                          # The ID of the primary key field. The value is of the INT type.
            "name": "Search",                    # The value of this single-value field is of the STRING type.
            "str_arr": "a\x1Db\x1Dc\x1Dd"     # The value of this multi-value field is of the STRING type.
        }
        
        # Add the document content to an add2Document structure.
        add2Document = {
            "fields": add2DocumentFields,
            "cmd": "add"                      # Run the add command to upload the document.
        }

        optionsHeaders = {}
        # The structure added to specify document operations in the outer structure that is used to push document data. You can specify one or more document operations in the structure.
        documentArrayList = []
        documentArrayList.append(add2Document)
        pushDocumentsRequest = models.PushDocumentsRequest(optionsHeaders, documentArrayList)

        # The primary key field of the document whose data is to be pushed.
        pkField = "id"
        # Use the default runtime parameters for the request.
        response = ha3EngineClient.push_documents(tableName, pkField, pushDocumentsRequest)
        print(response.body)
    except TeaException as e:
        print(f"send request with TeaException : {e}")
    except RetryError as e:
        print(f"send request with Connection Exception  : {e}")

if __name__ == "__main__":
    push()
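The Python demo above writes the multi-value STRING field str_arr as a single string whose values are separated by the \x1D character. A minimal sketch of building such a value and wrapping the fields in an add command follows. The helper names are hypothetical, and rejecting values that already contain the separator is an assumption made for safety, not a documented rule.

```python
MULTI_VALUE_SEPARATOR = "\x1D"  # Separator used by the str_arr field in the demo.


def join_multi_value(values):
    """Join the values of a multi-value field into one separator-delimited string."""
    parts = [str(value) for value in values]
    for part in parts:
        if MULTI_VALUE_SEPARATOR in part:
            # Assumption: a value that contains the separator cannot be encoded safely.
            raise ValueError("a value must not contain the \\x1D separator")
    return MULTI_VALUE_SEPARATOR.join(parts)


def build_add_command(fields):
    """Wrap document fields in the structure that the add command expects."""
    return {"fields": dict(fields), "cmd": "add"}


if __name__ == "__main__":
    document = build_add_command({
        "id": 1,
        "name": "Search",
        "str_arr": join_multi_value(["a", "b", "c", "d"]),
    })
    print(document)
```

The dictionary returned by build_add_command() has the same shape as add2Document in the demo and can be appended to the document list in the same way.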

Go

The demo builds each document as a map, appends the maps to a slice, and then calls the PushDocuments() method to submit all of the documents in a single request.

package main

import (
	"fmt"
	"github.com/alibabacloud-go/tea/tea"
	ha3engine "github.com/aliyun/alibabacloud-ha3-go-sdk/client"
)

func main() {
	// Create a Config instance.
	config := &ha3engine.Config{
		// The internal or public endpoint.
		Endpoint: tea.String("<endpoint>"),
		// The ID of the OpenSearch Vector Search Edition instance. You can view the instance ID in the upper-left corner of the Instance Details page. Example: ha-cn-i7*****605.
		InstanceId: tea.String("<InstanceId>"),
		// The username. You can view the username in the API Endpoint section of the Instance Details page.
		AccessUserName: tea.String("<AccessUserName>"),
		// The password. You can modify the password in the API Endpoint section of the Instance Details page.
		AccessPassWord: tea.String("<AccessPassWord>"),
	}

	// Initialize a client for sending requests.
	client, _clientErr := ha3engine.NewClient(config)

	// If an error occurs when the system creates the client, _clientErr and an error message are returned.
	if _clientErr != nil {
		fmt.Println(_clientErr)
		return
	}
	docPush(client)
}

func docPush(client *ha3engine.Client) {
	pushDocumentsRequestModel := &ha3engine.PushDocumentsRequest{}
	// The name of the data source from which you want to push document data. To view the data source name, go to the Instance Details page in the OpenSearch console. In the left-side pane, choose Configuration Center > Data Source. You can view the data source name on the Data Source page.
	dataSourceName := "{instanceId}_api"
	keyField := "id"

	// Build 20 sample documents.
	documents := []map[string]interface{}{}
	for x := 0; x < 20; x++ {
		document := map[string]interface{}{
			"fields": map[string]interface{}{
				"id":          tea.ToString(x),
				"fb_boolean":  false,
				"fb_datetime": "2167747200000",
				"fb_string":   "409a6b18-a10b-409e-af91-07121c45d899",
			},
			"cmd": tea.String("add"),
		}
		documents = append(documents, document)
	}

	// Submit all documents in a single request instead of re-sending the growing slice in each iteration.
	pushDocumentsRequestModel.SetBody(documents)

	// Call the method for sending a request.
	response, _requestErr := client.PushDocuments(tea.String(dataSourceName), tea.String(keyField), pushDocumentsRequestModel)

	// If an error occurs when the system sends the request, _requestErr and an error message are returned.
	if _requestErr != nil {
		fmt.Println(_requestErr)
		return
	}

	// Display the response if no error occurs.
	fmt.Println(response)
}

Note

If you use the add command to push data, a new document that has the same primary key as an existing document overwrites the existing document.
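The overwrite behavior can be pictured with a small local model in which a dictionary keyed by primary key stands in for the table. This is only an illustration of the semantics described in this topic, not how the service is implemented, and apply_commands() is a hypothetical name.

```python
def apply_commands(table, pk_field, commands):
    """Apply add and delete commands to an in-memory table keyed by primary key."""
    for command in commands:
        fields = command["fields"]
        if pk_field not in fields:
            raise KeyError(f"document is missing the primary key field {pk_field!r}")
        pk_value = fields[pk_field]
        if command["cmd"] == "add":
            # The whole document is replaced, not merged with the old one.
            table[pk_value] = dict(fields)
        elif command["cmd"] == "delete":
            table.pop(pk_value, None)
        else:
            raise ValueError(f"unsupported command: {command['cmd']!r}")
    return table


if __name__ == "__main__":
    table = {}
    apply_commands(table, "id", [
        {"fields": {"id": 1, "name": "Search"}, "cmd": "add"},
        {"fields": {"id": 1, "name": "Vector"}, "cmd": "add"},
    ])
    print(table)
```

After both commands run, the model holds one document for primary key 1, and its content comes from the second add command.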

Delete a document

Java in asynchronous mode

package com.aliyun.ha3engine;

import java.util.ArrayList;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;

import com.aliyun.ha3engine.async.AsyncClient;
import com.aliyun.ha3engine.async.models.PushDocumentsRequest;
import com.aliyun.ha3engine.async.models.PushDocumentsResponse;
import com.aliyun.sdk.ha3engine.async.core.AsyncConfigInfoProvider;
import com.aliyun.tea.TeaException;

import darabonba.core.client.ClientOverrideConfiguration;


/**
 * @author alibaba
 */
public class PushDoc {

    public static void main(String[] args) throws Exception {
        try {
            // The username and password that are used to access the instance. You can view the username and password in the API Endpoint section of the Instance Details page.
            AsyncConfigInfoProvider provider = AsyncConfigInfoProvider.create("username", "password");

            // Initialize the asynchronous client.
            AsyncClient client = AsyncClient.builder()
                    .credentialsProvider(provider)
                    .overrideConfiguration(
                            ClientOverrideConfiguration.create()
                                    // The public endpoint of the instance. You can view the public endpoint in the Network Information section of the Instance Details page.
                                    .setEndpointOverride("ha-cn-***********.public.ha.aliyuncs.com")
                                    .setProtocol("http")
                    ).build();

            // The name of the table to which the document is pushed. Format: <Instance ID>_<Table name>.
            String tableName = "<instance_datasource_table_name>";

            // The primary key field of the document whose data is to be pushed.
            String pkField = "<field_pk>";

            // The structure added to specify document operations in the outer structure that is used to push document data. You can specify one or more document operations in the structure.
            ArrayList<Map<String, ?>> documents = new ArrayList<>();

            // The document to be deleted.
            Map<String, Object> deleteDocument = new HashMap<>();
            Map<String, Object> deleteDocumentFields = new HashMap<>();

            // The content of the document, specified as key-value pairs.
            // The key of the primary key field must be the same as the value of pkField.
            deleteDocumentFields.put("<field_pk>", "<field_pk_value>");

            // Add the document content to a deleteDocument structure.
            deleteDocument.put("fields", deleteDocumentFields);
            // Run the delete command to delete the document.
            deleteDocument.put("cmd", "delete");
            documents.add(deleteDocument);

            // Push data.
            PushDocumentsRequest request = PushDocumentsRequest.builder().body(documents).build();
            CompletableFuture<PushDocumentsResponse> responseCompletableFuture = client.pushDocuments(tableName, pkField, request);
            String responseBody = responseCompletableFuture.get().getBody();

            System.out.println("result:" + responseBody);

        } catch (ExecutionException | InterruptedException e) {
            System.out.println(e.getMessage());
        } catch (TeaException e) {
            System.out.println(e.getMessage());
            Map<String, Object> abc = e.getData();
            System.out.println(com.aliyun.teautil.Common.toJSONString(abc));
        }
    }
}

Java

package com.aliyun.ha3engine;

import com.aliyun.ha3engine.vector.Client;
import com.aliyun.ha3engine.vector.models.*;
import com.aliyun.tea.TeaException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Map;

/**
 * @author alibaba
 */
public class PushDoc {

  public static void main(String[] args) throws Exception {
    Config config = new Config();

    // The API endpoint of the instance. You can view the API endpoint in the API Endpoint section of the Instance Details page.
    config.setEndpoint("<instance_services_domain>");
    // The ID of the OpenSearch Vector Search Edition instance. You can view the instance ID in the upper-left corner of the Instance Details page. Example: ha-cn-i7*****605.
    config.setInstanceId("<instance_id>");

    // The username. You can view the username in the API Endpoint section of the Instance Details page.
    config.setAccessUserName("<user_name>");
    // The password. You can modify the password in the API Endpoint section of the Instance Details page.
    config.setAccessPassWord("<user_password>");

    Client client = new Client(config);

    // The name of the table to which the document is pushed. Format: <Instance ID>_<Table name>.
    String tableName = "<instance_id>_<table_name>";

    // The primary key field of the document whose data is to be pushed.
    String pkField = "<field_pk>";

    try {
      // The structure added to specify document operations in the outer structure that is used to push document data. You can specify one or more document operations in the structure.
      ArrayList<Map<String, ?>> documents = new ArrayList<>();

      // The document to be deleted.
      Map<String, Object> delete2Document = new HashMap<>();
      Map<String, Object> delete2DocumentFields = new HashMap<>();

      // The content of the document, specified as key-value pairs.
      // The key of the primary key field must be the same as the value of pkField.
      delete2DocumentFields.put("<field_pk>", "<field_pk_value>");

      // Add the document content to a delete2Document structure.
      delete2Document.put("fields", delete2DocumentFields);
      // Run the delete command to delete the document.
      delete2Document.put("cmd", "delete");
      documents.add(delete2Document);

      // Push data.
      PushDocumentsRequest request = new PushDocumentsRequest();
      request.setBody(documents);
      PushDocumentsResponse response = client.pushDocuments(
        tableName,
        pkField,
        request
      );
      String responseBody = response.getBody();

      System.out.println("result:" + responseBody);
    } catch (TeaException e) {
      System.out.println(e.getMessage());

      Map<String, Object> abc = e.getData();

      System.out.println(com.aliyun.teautil.Common.toJSONString(abc));
    }
  }
}

Python

# -*- coding: utf-8 -*-


from alibabacloud_ha3engine_vector import models, client
from Tea.exceptions import TeaException, RetryError


Config = models.Config(
    endpoint="<API endpoint>",  # The API endpoint of the instance. You can view the API endpoint in the API Endpoint section of the Instance Details page. You must remove the http:// prefix when you specify the endpoint.
    instance_id="<Instance ID>",  # The ID of the OpenSearch Vector Search Edition instance. You can view the instance ID in the upper-left corner of the Instance Details page. Example: ha-cn-i7*****605.
    protocol="http",
    access_user_name="<Username>",  # The username. You can view the username in the API Endpoint section of the Instance Details page.
    access_pass_word="<Password>",  # The password. You can modify the password in the API Endpoint section of the Instance Details page.
)

# Initialize the engine client.
ha3EngineClient = client.Client(Config)


def pushDoc():
    # The name of the table to which the document is pushed. Format: <Instance ID>_<Table name>.
    tableName = "<instance_id>_<table_name>"

    try:
        # The structure added to specify document operations in the outer structure that is used to push document data. You can specify one or more document operations in the structure.
        documentArrayList = []

        # The document to be deleted.
        # If you want to delete a document, you must specify the primary key field of the document.
        # If you build an index by using the multi-level hashing structure, you must specify the primary key field of each level of hash. 
        delete2DocumentFields = {
            "id": 1                 # The ID of the primary key field. The value is of the INT type.
        }
        delete2Document = {
            "fields": delete2DocumentFields, # Add the document content to a delete2Document structure.
            "cmd": "delete"                  # Run the delete command to delete the document.
        }

        optionsHeaders = {}
        documentArrayList.append(delete2Document)
        pushDocumentsRequest = models.PushDocumentsRequest(
            optionsHeaders, documentArrayList
        )
        
        # The primary key field of the document whose data is to be pushed.
        pkField = "id"
        # Use the default runtime parameters for the request.
        response = ha3EngineClient.push_documents(
            tableName, pkField, pushDocumentsRequest
        )
        print(response)

    except TeaException as e:
        print(f"send request with TeaException : {e}")
    except RetryError as e:
        print(f"send request with Connection Exception  : {e}")


if __name__ == "__main__":
    pushDoc()
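Because the document list can hold more than one operation, several documents can be deleted in one request. The following sketch builds such a list from a set of primary key values. build_delete_commands() is a hypothetical helper, and the sketch only constructs the list; sending it would follow the push_documents() call in the demo above.

```python
import json


def build_delete_commands(pk_field, pk_values):
    """Build one delete command per primary key value."""
    if not pk_field:
        raise ValueError("pk_field must not be empty")
    # A delete command only needs the primary key field of the document.
    return [{"fields": {pk_field: value}, "cmd": "delete"} for value in pk_values]


if __name__ == "__main__":
    document_array_list = build_delete_commands("id", [1, 2, 3])
    print(json.dumps(document_array_list, indent=2))
```

The resulting list has the same shape as documentArrayList in the demo, with one entry per document to be deleted.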

Note

  • For more information about the response to a request, see Update data.

  • Do not run the go get github.com/aliyun/alibabacloud-ha3-go-sdk command without a version tag to pull dependencies. The SDK dependencies for OpenSearch Vector Search Edition and OpenSearch Retrieval Engine Edition are published in the same GitHub repository and are distinguished by tag. When you pull dependencies, you must specify the tag that corresponds to the edition of your instance, such as @v1.1.3-vector for OpenSearch Vector Search Edition.