AnalyticDB:GetUploadDocumentJob

Last Updated: Dec 20, 2024

Queries the progress and result of an asynchronous document upload job based on the job ID.

Operation description

This operation works together with the UploadDocumentAsync operation. Call UploadDocumentAsync to create an upload job and obtain the job ID, and then call GetUploadDocumentJob to query the execution information of the job.

Note Suggestions:
  • Decide when to treat a document upload job as timed out based on the document complexity and the number of tokens produced by chunking. In most cases, a job that runs for more than 2 hours can be considered timed out.
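
The call-then-poll flow described above can be sketched as a small polling loop. This is a minimal sketch, not SDK code: fetch_job is a hypothetical callable that performs one GetUploadDocumentJob call and returns the Job object as a dict, the 10-second polling interval is an arbitrary choice, and treating Cancelled as a final status is an assumption.

```python
import time
from typing import Any, Callable, Dict

# Statuses assumed to be final; "Cancelled" being final is an assumption.
TERMINAL_STATUSES = {"Success", "Failed", "Cancelled"}
TIMEOUT_SECONDS = 2 * 60 * 60  # the 2-hour guideline from the note above


def wait_for_upload_job(
    fetch_job: Callable[[], Dict[str, Any]],
    timeout_seconds: float = TIMEOUT_SECONDS,
    poll_interval: float = 10.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll until the job reaches a final status or the timeout elapses.

    fetch_job is a hypothetical callable: it should issue one
    GetUploadDocumentJob request and return the "Job" object as a dict.
    """
    deadline = clock() + timeout_seconds
    while True:
        job = fetch_job()
        if job.get("Completed") or job.get("Status") in TERMINAL_STATUSES:
            return job
        if clock() >= deadline:
            raise TimeoutError(
                f"job {job.get('Id')} is still {job.get('Status')} "
                f"after {timeout_seconds} seconds"
            )
        sleep(poll_interval)
```

The clock and sleep parameters exist only so the loop can be exercised without real waiting; in normal use the defaults apply.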

Debugging

You can call this operation directly in OpenAPI Explorer, which saves you from calculating signatures. After a successful call, OpenAPI Explorer can automatically generate SDK code samples.

Authorization information

The following table shows the authorization information for this API operation. You can use this information in the Action element of a policy to grant a RAM user or RAM role the permissions to call the operation. The table columns are described as follows:

  • Operation: the value that you can use in the Action element to specify the operation on a resource.
  • Access level: the access level of each operation. The levels are read, write, and list.
  • Resource type: the type of the resource on which you can authorize the RAM user or the RAM role to perform the operation. Take note of the following items:
    • The required resource types are displayed in bold characters.
    • If the permissions cannot be granted at the resource level, All Resources is used in the Resource type column of the operation.
  • Condition Key: the condition key that is defined by the cloud service.
  • Associated operation: other operations that the RAM user or the RAM role must also have permissions to perform in order to complete this operation.

| Operation | Access level | Resource type | Condition key | Associated operation |
| --- | --- | --- | --- | --- |
| gpdb:GetUploadDocumentJob | create | *Document acs:gpdb:{#regionId}:{#accountId}:document/{#DBInstanceId} | none | none |
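
As an illustration of how the Operation and Resource type values fit into a policy, the following sketch builds a RAM policy document that allows this one operation on one instance. It is a sketch under assumptions: the account ID is a made-up placeholder, the function name is hypothetical, and the policy is only constructed and printed, not attached to any user or role.

```python
import json


def build_get_upload_job_policy(region_id: str, account_id: str, db_instance_id: str) -> dict:
    """Build a policy document that allows gpdb:GetUploadDocumentJob on one instance."""
    # Resource string follows the pattern from the authorization table:
    # acs:gpdb:{#regionId}:{#accountId}:document/{#DBInstanceId}
    resource = f"acs:gpdb:{region_id}:{account_id}:document/{db_instance_id}"
    return {
        "Version": "1",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "gpdb:GetUploadDocumentJob",
                "Resource": resource,
            }
        ],
    }


# The account ID below is a placeholder, not a real account.
policy = build_get_upload_job_policy("cn-hangzhou", "1234567890123456", "gp-xxxxxxxxx")
print(json.dumps(policy, indent=2))
```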

Request parameters

| Parameter | Type | Required | Description | Example |
| --- | --- | --- | --- | --- |
| DBInstanceId | string | Yes | The ID of the instance for which vector engine optimization is enabled. Note: You can call the DescribeDBInstances operation to query the information about all AnalyticDB for PostgreSQL instances within a region, including instance IDs. | gp-xxxxxxxxx |
| Namespace | string | No | The name of the namespace. Default value: public. Note: You can call the CreateNamespace operation to create a namespace and call the ListNamespaces operation to query a list of namespaces. | mynamespace |
| Collection | string | Yes | The name of the document collection. Note: You can call the CreateDocumentCollection operation to create a document collection and call the ListDocumentCollections operation to query a list of document collections. | document |
| RegionId | string | Yes | The region ID of the instance. | cn-hangzhou |
| NamespacePassword | string | Yes | The password of the namespace. Note: The value of this parameter is specified when you call the CreateNamespace operation. | testpassword |
| JobId | string | Yes | The ID of the document upload job. The job ID is returned when you call the UploadDocumentAsync operation. | bf8f7bc4-9276-44f7-9c22-1d06edc8dfd1 |
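
The request parameters above can be assembled and checked before a call is made. The following is a minimal sketch that only builds a parameter dictionary and verifies that the required parameters are present; the function name is hypothetical, and sending the request (for example, through an SDK or OpenAPI Explorer) is deliberately left out.

```python
from typing import Dict, Optional

# Parameters marked "Yes" in the Required column of the table above.
REQUIRED_PARAMETERS = ("DBInstanceId", "Collection", "RegionId", "NamespacePassword", "JobId")


def build_get_upload_document_job_params(
    db_instance_id: str,
    collection: str,
    region_id: str,
    namespace_password: str,
    job_id: str,
    namespace: Optional[str] = None,
) -> Dict[str, str]:
    """Return the request parameters as a dict, rejecting empty required values."""
    params = {
        "DBInstanceId": db_instance_id,
        "Collection": collection,
        "RegionId": region_id,
        "NamespacePassword": namespace_password,
        "JobId": job_id,
    }
    missing = [name for name in REQUIRED_PARAMETERS if not params[name]]
    if missing:
        raise ValueError("missing required parameters: " + ", ".join(missing))
    # Namespace is optional; when it is omitted, the documented default is "public".
    if namespace:
        params["Namespace"] = namespace
    return params


params = build_get_upload_document_job_params(
    db_instance_id="gp-xxxxxxxxx",
    collection="document",
    region_id="cn-hangzhou",
    namespace_password="testpassword",
    job_id="bf8f7bc4-9276-44f7-9c22-1d06edc8dfd1",
    namespace="mynamespace",
)
```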

Response parameters

| Parameter | Type | Description | Example |
| --- | --- | --- | --- |
| (root) | object | | |
| RequestId | string | The request ID. | ABB39CC3-4488-4857-905D-2E4A051D0521 |
| Message | string | The returned message. | success |
| Status | string | The status of the operation. Valid values: success, fail. | success |
| Job | object | The information about the document upload job. | |
| Job.Id | string | The job ID. | 231460f8-75dc-405e-a669-0c5204887e91 |
| Job.Completed | boolean | Indicates whether the job is complete. | false |
| Job.CreateTime | string | The time when the job was created. | 2024-01-08 16:52:04.864664 |
| Job.UpdateTime | string | The time when the job was updated. | 2024-01-08 16:53:04.864664 |
| Job.Status | string | The status of the job. Valid values: Success, Failed (see the Error parameter for failure reasons), Cancelling, Cancelled, Start, Running, Pending. | Running |
| Job.Error | string | The error message. | Failed to connect database. |
| Job.Progress | integer | The progress of the document upload job. Unit: %. A value of 100 indicates that the job is complete. | 20 |
| Job.ErrorCode | string | The error code. | InternalError |
| ChunkResult | object | The chunking result. | |
| ChunkResult.ChunkFileUrl | string | The URL of the file after chunking. The URL is valid for 2 hours. The file is in the JSONL format. Each line is in the {"page_content":"*****", "metadata": {"**":"***","**":"***"}} format. | http://xxx/test.jsonl |
| ChunkResult.PlainChunkFileUrl | string | The URL of the chunked file without metadata. The URL is valid for 2 hours. The file is in the TXT format. Each line is a chunk. The file can be easily used for embedding. | http://xxx/test.txt |
| Usage | object | The number of tokens that are used for document understanding or embedding. | |
| Usage.EmbeddingTokens | integer | The number of tokens that are used for vectorization. Note: A token is the minimum unit for splitting text. A token can be a word, phrase, punctuation mark, or character. | 475 |
| Usage.EmbeddingEntries | integer | The number of embedding entries. | 10 |
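
The file behind ChunkFileUrl is described as JSONL, with one page_content and metadata record per line. The following sketch parses such lines after the file has been downloaded. The download step is omitted, the function name is hypothetical, and the two sample lines are invented for illustration rather than taken from a real job.

```python
import json
from typing import Any, Dict, Iterable, List


def parse_chunk_lines(lines: Iterable[str]) -> List[Dict[str, Any]]:
    """Parse JSONL chunk records of the form {"page_content": ..., "metadata": {...}}."""
    chunks = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines, such as a trailing newline at the end of the file
        record = json.loads(line)
        chunks.append(
            {
                "page_content": record["page_content"],
                "metadata": record.get("metadata", {}),
            }
        )
    return chunks


# Invented sample lines; a real file would be downloaded from ChunkFileUrl
# within the 2-hour validity period of the URL.
sample_lines = [
    '{"page_content": "first chunk", "metadata": {"source": "a.pdf"}}',
    '{"page_content": "second chunk", "metadata": {"source": "a.pdf"}}',
    "",
]
chunks = parse_chunk_lines(sample_lines)
```

The plain-text file behind PlainChunkFileUrl needs no JSON parsing, because each line is already one chunk.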

Examples

Sample success responses

JSON format

{
  "RequestId": "ABB39CC3-4488-4857-905D-2E4A051D0521",
  "Message": "success",
  "Status": "success",
  "Job": {
    "Id": "231460f8-75dc-405e-a669-0c5204887e91",
    "Completed": false,
    "CreateTime": "2024-01-08 16:52:04.864664",
    "UpdateTime": "2024-01-08 16:53:04.864664",
    "Status": "Running",
    "Error": "Failed to connect database.",
    "Progress": 20,
    "ErrorCode": "InternalError"
  },
  "ChunkResult": {
    "ChunkFileUrl": "http://xxx/test.jsonl",
    "PlainChunkFileUrl": "http://xxx/test.txt"
  },
  "Usage": {
    "EmbeddingTokens": 475,
    "EmbeddingEntries": 10
  }
}
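
A response such as the one above can be turned into a one-line summary by checking the top-level Status first and the Job.Status second. This is a minimal sketch with a hypothetical function name; it uses an abridged copy of the sample response and assumes that the ChunkResult and Usage objects are only meaningful after the job succeeds.

```python
import json


def summarize_job_response(response: dict) -> str:
    """Describe the state of an upload job from a GetUploadDocumentJob response."""
    # The top-level Status reports whether the query itself succeeded.
    if response.get("Status") != "success":
        return f"request failed: {response.get('Message')}"
    job = response.get("Job", {})
    status = job.get("Status")
    if status == "Failed":
        return f"job {job.get('Id')} failed: {job.get('ErrorCode')}: {job.get('Error')}"
    if status == "Success":
        # Assumption: chunk URLs and token usage are read only after success.
        url = response.get("ChunkResult", {}).get("ChunkFileUrl")
        tokens = response.get("Usage", {}).get("EmbeddingTokens")
        return f"job {job.get('Id')} succeeded: chunks at {url}, {tokens} embedding tokens"
    return f"job {job.get('Id')} is {status} ({job.get('Progress')}%)"


# Abridged copy of the sample success response.
sample = json.loads("""
{
  "RequestId": "ABB39CC3-4488-4857-905D-2E4A051D0521",
  "Message": "success",
  "Status": "success",
  "Job": {
    "Id": "231460f8-75dc-405e-a669-0c5204887e91",
    "Completed": false,
    "Status": "Running",
    "Progress": 20
  }
}
""")
print(summarize_job_response(sample))
# prints: job 231460f8-75dc-405e-a669-0c5204887e91 is Running (20%)
```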

Error codes

For a list of error codes, see Service error codes.

Change history

| Change time | Summary of changes |
| --- | --- |
| 2024-10-15 | The response structure of the API has changed |
| 2024-01-18 | The response structure of the API has changed |