OpenSearch LLM-Based Conversational Search Edition allows you to push multiple documents at a time.
URL
POST /v3/openapi/apps/[app_group_identity]/actions/knowledge-bulk
[app_group_identity] specifies the application that you want to access. Set it to the name of an application that is in service.
The sample URL omits information such as the request headers and encoding method.
The sample URL also omits the endpoint that is used to connect to the OpenSearch application.
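As a rough illustration of how the path above combines with an endpoint, the following Python sketch assembles the full request URL. The endpoint `http://opensearch.example.com` and the application name `my_app` are placeholders, not real values, and `build_bulk_url` is a hypothetical helper rather than part of any OpenSearch SDK.

```python
from urllib.parse import quote


def build_bulk_url(endpoint: str, app_name: str) -> str:
    """Join an endpoint with the knowledge-bulk path for one application."""
    # Percent-encode the application name so that it is safe inside a URL path.
    path = f"/v3/openapi/apps/{quote(app_name, safe='')}/actions/knowledge-bulk"
    return endpoint.rstrip("/") + path


# Placeholder endpoint and application name, for illustration only.
url = build_bulk_url("http://opensearch.example.com", "my_app")
print(url)
```

The request headers that the sample URL omits still have to be added by the caller or by an SDK.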
Request method: POST
Request parameters
Parameter | Type | Location | Description |
app_group_identity | String | Path | The name of the application. |
(request body) | Object | Body | The request body of the API operation that pushes multiple documents at a time. |
Document-related parameters
Parameter | Type | Required | Description |
cmd | String | Yes | The operation to be performed on the document. Valid values for structured documents: ADD and DELETE. |
fields | Map | Yes | A collection of fields. |
Field-related parameters
Parameter | Type | Required | Description |
id | Int | Yes | The primary key ID of the document. |
title | String | No | The title of the document. |
url | String | No | The URL of the document. |
content | String | Yes | The content of the document. |
Sample requests
1. Upload a document to the main table
[
{
"cmd": "ADD",
"fields": {
"id": "13579",
"title": "Something interesting",
"url": "https://www.aliyun.com",
"content": "No, it's not."
}
},
{
"cmd": "DELETE",
"fields": {
"id": "2468"
}
}
]
Upload an unstructured document such as a PDF, DOC, TXT, or HTML file
[
{
"cmd" : "URL/BASE64",
"fields": {
"id": "Optional. The document ID.",
"type": "Optional. The file type such as PDF, DOC, TXT, or HTML.",
"title": "Optional. The file name.",
"content": "URL/Base64-encoded data.",
"url": "Optional. The file URL."
}
}
]
Note: The url parameter supports only URLs generated for Object Storage Service (OSS) objects by calling the generatePresignedUrl method by using the OSS client.
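The unstructured-document request above could be assembled along the following lines. This is a minimal sketch that assumes "URL/BASE64" means the cmd value is either URL or BASE64, and that the content field carries standard Base64 text when cmd is BASE64. Neither assumption is confirmed here, and `make_base64_command` is a hypothetical helper.

```python
import base64
import json


def make_base64_command(data: bytes, file_type: str, title: str) -> dict:
    """Wrap raw file bytes in one upload command (assumed cmd value: BASE64)."""
    return {
        "cmd": "BASE64",
        "fields": {
            "type": file_type,
            "title": title,
            # Standard Base64 alphabet, decoded to str so that it is JSON-serializable.
            "content": base64.b64encode(data).decode("ascii"),
        },
    }


# In practice the bytes would come from a file, for example open(path, "rb").read().
cmd = make_base64_command(b"hello", "TXT", "hello.txt")
body = json.dumps([cmd])
print(body)
```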
2. Upload a document to a custom table
[
{
"cmd" : "ADD/DELETE",
"fields": { # The fields. Specify fields based on the schema of the custom table.
"key1": "value1",
"key2": "value2",
"key3": "value3"
},
"table_name": "The name of the custom table." # The table name. If this parameter is left empty, the operation is performed on the main table.
}
]
cmd: Required. The operation to be performed on the document. Valid values: ADD and DELETE. ADD creates the document. If a document with the same primary key value already exists, the original document is deleted before the new document is created. DELETE deletes the document. If no document with the specified primary key value exists, the operation is still considered successful. We recommend that you operate on multiple documents in a single request, which reduces network round trips and improves processing efficiency.
fields: Required. The fields of the document on which the operation is to be performed. You must specify the primary key field because OpenSearch identifies a document by its primary key value. To delete a document, you need to specify only its primary key field.
In the preceding sample code, the outermost layer is a JSON array that is used to manage multiple documents at a time.
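The rules for cmd, fields, and table_name can be sketched as a small Python helper that builds the outermost JSON array. `make_command` and its `pk` argument are hypothetical names used for illustration; the primary key field of a custom table depends on that table's schema, so `key1` and `my_table` below are placeholders.

```python
import json
from typing import Optional


def make_command(cmd: str, fields: dict, pk: str = "id",
                 table_name: Optional[str] = None) -> dict:
    """Build one document command and enforce the primary key rule."""
    if cmd not in ("ADD", "DELETE"):
        raise ValueError(f"unsupported cmd: {cmd}")
    if pk not in fields:
        raise ValueError(f"primary key field {pk!r} is required")
    if cmd == "DELETE":
        # Only the primary key field is needed to delete a document.
        fields = {pk: fields[pk]}
    command = {"cmd": cmd, "fields": fields}
    if table_name:
        # Leaving table_name out targets the main table.
        command["table_name"] = table_name
    return command


batch = [
    make_command("ADD", {"id": "13579", "content": "No, it's not."}),
    make_command("DELETE", {"id": "2468", "title": "ignored"}),
    make_command("ADD", {"key1": "value1", "key2": "value2"},
                 pk="key1", table_name="my_table"),
]
body = json.dumps(batch, ensure_ascii=False)
print(body)
```

Validating on the client side is a design choice in this sketch: it catches a missing primary key before the request is sent.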
Response parameters
Parameter | Type | Description |
errors | List | The error information. Each entry contains an error code and an error message. |
status | String | The execution result of the request. Valid values: OK and FAIL. A value of OK indicates that the request is successful. A value of FAIL indicates that the request failed. In this case, troubleshoot errors based on the error code. |
request_id | String | The request ID. |
result | Object | The result of the request. In the sample response, this object contains total, success, failure, and failed_ids. This parameter is not returned if the request fails. |
total | Int | The total number of documents that are to be uploaded. |
success | Int | The total number of documents that are successfully uploaded. |
failure | Int | The total number of documents that fail to be uploaded. |
failed_ids | Array | The IDs of documents that fail to be uploaded. |
Sample response
{
"request_id" : "abc123-ABC",
"result" : {
"total": 100,
"success": 50,
"failure": 50,
"failed_ids": [
"id1",
"id2",
"id3",
"..."
]
},
"errors" : [
{
"code" : "The error code that is returned if an error occurs",
"message" : "The error message that is returned if an error occurs"
}
]
}
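A caller might read such a response as follows. This is a sketch that assumes the response shape shown in the sample (a result object with counters and failed_ids, plus an optional errors list); `summarize_response` is a hypothetical helper, and the sample values are made up.

```python
import json


def summarize_response(text: str) -> dict:
    """Extract the fields a caller usually needs to decide whether to retry."""
    data = json.loads(text)
    result = data.get("result")
    if not isinstance(result, dict):
        # Treat a missing or non-object result as "no counters available".
        result = {}
    errors = data.get("errors") or []
    return {
        "request_id": data.get("request_id"),
        "failed_ids": list(result.get("failed_ids", [])),
        "error_codes": [e.get("code") for e in errors],
        "ok": not errors and result.get("failure", 0) == 0,
    }


sample = """
{
  "request_id": "abc123-ABC",
  "result": {"total": 3, "success": 1, "failure": 2, "failed_ids": ["id1", "id2"]},
  "errors": [{"code": "3007", "message": "example message"}]
}
"""
summary = summarize_response(sample)
print(summary)
```

The failed_ids list can then be used to select the documents to resend after the underlying error is resolved.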
Usage notes
When you push data by calling API operations or using OpenSearch SDKs, the field names of the application are not case-sensitive.
The number of push requests that you can send and the size of data that you can push by calling API operations or using OpenSearch SDKs are limited. The limits vary by application. For more information, see Limits.
After you upload data, always check the return values. If an error code is returned, especially 3007, troubleshoot the error based on the error code and try again. Otherwise, data loss may occur. OpenSearch processes data asynchronously. A return value of OK indicates only that OpenSearch received the data, not that the data was processed successfully. If an error occurs during data processing, the error message is displayed in the OpenSearch console. Check error messages at the earliest opportunity.
The size of data in an HTTP POST request is limited. If the size of data before encoding exceeds 2 MB, OpenSearch rejects the request and returns an error.
If the body of an HTTP POST request that pushes data contains Chinese characters, you must encode the body in UTF-8. The MD5 value for the Content-MD5 header must also be calculated from the UTF-8-encoded body. Otherwise, an error is returned and the request fails.
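The size limit and the UTF-8 requirement can be checked on the client before a request is sent. The sketch below rests on several assumptions: it treats 2 MB as 2 * 1024 * 1024 bytes, it measures the UTF-8 JSON body, and it renders the MD5 digest as lowercase hexadecimal. The exact format expected for Content-MD5 is not stated here, so verify it against the signature documentation. `encode_body` and `content_md5` are hypothetical helpers.

```python
import hashlib
import json

# Assumed interpretation of the 2 MB limit.
MAX_BODY_BYTES = 2 * 1024 * 1024


def encode_body(commands: list) -> bytes:
    """Serialize commands to UTF-8 JSON and reject oversized bodies early."""
    body = json.dumps(commands, ensure_ascii=False).encode("utf-8")
    if len(body) > MAX_BODY_BYTES:
        raise ValueError(f"body is {len(body)} bytes, limit is {MAX_BODY_BYTES}")
    return body


def content_md5(body: bytes) -> str:
    """Hash the already UTF-8-encoded bytes (hex output is an assumption)."""
    return hashlib.md5(body).hexdigest()


body = encode_body([{"cmd": "ADD", "fields": {"id": "1", "content": "中文内容"}}])
print(len(body), content_md5(body))
```

Hashing the same bytes that are sent on the wire avoids a mismatch between the header value and the body when the body contains non-ASCII characters.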