AnalyticDB:CreateDocumentCollection

Last Updated: Feb 04, 2026

Creates a knowledge base.

RAM authorization

The table below describes the authorization required to call this API. You can define it in a Resource Access Management (RAM) policy. The table's columns are detailed below:

  • Action: The value to use in the Action element of a RAM policy statement to grant permission to perform the operation.

  • API: The API that you can call to perform the action.

  • Access level: The predefined level of access granted for each API. Valid values: create, list, get, update, and delete.

  • Resource type: The type of resource on which the action can be authorized. It indicates whether the action supports resource-level permissions. The specified resource must be compatible with the action; otherwise, the policy has no effect.

    • For APIs with resource-level permissions, required resource types are marked with an asterisk (*). Specify the corresponding Alibaba Cloud Resource Name (ARN) in the Resource element of the policy.

    • For APIs without resource-level permissions, it is shown as All Resources. Use an asterisk (*) in the Resource element of the policy.

  • Condition key: The condition keys defined by the service. The key allows for granular control, applying to either actions alone or actions associated with specific resources. In addition to service-specific condition keys, Alibaba Cloud provides a set of common condition keys applicable across all RAM-supported services.

  • Dependent action: The dependent actions required to run the action. To complete the action, the RAM user or the RAM role must have the permissions to perform all dependent actions.

Action

Access level

Resource type

Condition key

Dependent action

gpdb:CreateDocumentCollection

create

*Collection

acs:gpdb:{#regionId}:{#accountId}:collection/{#DBInstanceId}

None

None
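The authorization row above can be turned into a policy document. The following is a minimal sketch, assuming the general RAM policy layout (a Version of "1" and a Statement list); the helper name build_policy and the account ID used in the example are hypothetical placeholders, not values from this page.

```python
import json


def build_policy(region_id: str, account_id: str, db_instance_id: str) -> dict:
    """Build a policy that allows CreateDocumentCollection on one instance.

    The resource string follows the ARN pattern from the table above:
    acs:gpdb:{#regionId}:{#accountId}:collection/{#DBInstanceId}
    """
    resource = f"acs:gpdb:{region_id}:{account_id}:collection/{db_instance_id}"
    return {
        "Version": "1",  # assumed RAM policy version string
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "gpdb:CreateDocumentCollection",
                "Resource": resource,
            }
        ],
    }


if __name__ == "__main__":
    # Placeholder account ID, for illustration only.
    policy = build_policy("cn-hangzhou", "1234567890123456", "gp-xxxxxxxxx")
    print(json.dumps(policy, indent=2))
```

Replacing the Resource value with an asterisk (*) would grant the action on all resources instead of a single instance.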

Request parameters

Parameter

Type

Required

Description

Example

DBInstanceId

string

Yes

The ID of the AnalyticDB for PostgreSQL instance.

Note

You can call the DescribeDBInstances operation to view details of all AnalyticDB for PostgreSQL instances in a region, including their instance IDs.

gp-xxxxxxxxx

ManagerAccount

string

Yes

The name of the management account that has the rds_superuser permission.

Note

You can create an account in the console by choosing Account Management. You can also call the CreateAccount operation to create an account.

testaccount

ManagerAccountPassword

string

Yes

The password of the management account.

testpassword

Namespace

string

No

The namespace. Default value: public.

Note

You can call the CreateNamespace operation to create a namespace. You can call the ListNamespaces operation to view the list of namespaces.

mynamespace

Collection

string

Yes

The name of the document collection to create.

Note

The name must comply with PostgreSQL object naming rules.

document
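A client-side check can catch an invalid collection name before the request is sent. The sketch below assumes the conservative, unquoted form of a PostgreSQL identifier: it starts with an ASCII letter or underscore, continues with letters, digits, underscores, or dollar signs, and fits in 63 bytes. PostgreSQL itself also accepts non-ASCII letters and quoted identifiers, and whether the service applies exactly these rules is an assumption. The function name is hypothetical.

```python
import re

# Conservative ASCII subset of the unquoted PostgreSQL identifier syntax.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_$]*")

# PostgreSQL truncates identifiers longer than 63 bytes by default.
MAX_IDENTIFIER_BYTES = 63


def is_plain_pg_identifier(name: str) -> bool:
    """Return True if name is a plain, unquoted identifier of legal length."""
    if _IDENTIFIER.fullmatch(name) is None:
        return False
    return len(name.encode("utf-8")) <= MAX_IDENTIFIER_BYTES


if __name__ == "__main__":
    for candidate in ("document", "1document", "my-collection"):
        print(candidate, is_plain_pg_identifier(candidate))
```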

RegionId

string

Yes

The ID of the region where the instance resides.

cn-hangzhou

EmbeddingModel

string

No

The embedding algorithm. Default value: text-embedding-v3.

Note

Supported algorithms:

  • text-embedding-v3 (recommended and default): 1024, 768, or 512 dimensions

  • multimodal-embedding-v1 (recommended): 1024 dimensions, multimodal embedding algorithm

  • text-embedding-v1: 1536 dimensions

  • text-embedding-v2: 1536 dimensions

  • text2vec (not recommended): 1024 dimensions

  • m3e-base (not recommended): 768 dimensions

  • m3e-small (not recommended): 512 dimensions

  • clip-vit-b-32 (not recommended): CLIP ViT-B/32 model, 512 dimensions, image embedding algorithm

  • clip-vit-b-16 (not recommended): CLIP ViT-B/16 model, 512 dimensions, image embedding algorithm

  • clip-vit-l-14 (not recommended): CLIP ViT-L/14 model, 768 dimensions, image embedding algorithm

  • clip-vit-l-14-336px (not recommended): CLIP ViT-L/14@336px model, 768 dimensions, image embedding algorithm

  • clip-rn50 (not recommended): CLIP RN50 model, 1024 dimensions, image embedding algorithm

  • clip-rn101 (not recommended): CLIP RN101 model, 512 dimensions, image embedding algorithm

  • clip-rn50x4 (not recommended): CLIP RN50x4 model, 640 dimensions, image embedding algorithm

  • clip-rn50x16 (not recommended): CLIP RN50x16 model, 768 dimensions, image embedding algorithm

  • clip-rn50x64 (not recommended): CLIP RN50x64 model, 1024 dimensions, image embedding algorithm

text-embedding-v1

Dimension

integer

No

The vector dimension. Default value is the dimension supported by the embedding algorithm.

1024
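Because each embedding algorithm accepts only certain dimensions, it can help to check the EmbeddingModel and Dimension pair locally. The lookup table below is transcribed from the list of supported algorithms above; the function name is hypothetical, and treating an omitted Dimension as always acceptable is an assumption based on the documented default behavior.

```python
from typing import Optional

# Dimensions per embedding algorithm, as listed for the EmbeddingModel parameter.
MODEL_DIMENSIONS = {
    "text-embedding-v3": (1024, 768, 512),
    "multimodal-embedding-v1": (1024,),
    "text-embedding-v1": (1536,),
    "text-embedding-v2": (1536,),
    "text2vec": (1024,),
    "m3e-base": (768,),
    "m3e-small": (512,),
    "clip-vit-b-32": (512,),
    "clip-vit-b-16": (512,),
    "clip-vit-l-14": (768,),
    "clip-vit-l-14-336px": (768,),
    "clip-rn50": (1024,),
    "clip-rn101": (512,),
    "clip-rn50x4": (640,),
    "clip-rn50x16": (768,),
    "clip-rn50x64": (1024,),
}


def dimension_is_supported(model: str, dimension: Optional[int] = None) -> bool:
    """Check a Dimension value against the algorithm's listed dimensions."""
    if model not in MODEL_DIMENSIONS:
        raise ValueError(f"unknown embedding model: {model}")
    if dimension is None:
        # Assumption: an omitted Dimension falls back to the service default.
        return True
    return dimension in MODEL_DIMENSIONS[model]


if __name__ == "__main__":
    print(dimension_is_supported("text-embedding-v3", 768))
    print(dimension_is_supported("text-embedding-v1", 1024))
```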

FullTextRetrievalFields

string

No

The fields used for full-text search. Separate multiple fields with commas (,). Each field must be a key defined in the Metadata parameter.

title,page

Metadata

string

No

The metadata of vector data, formatted as a JSON string in MAP format. Keys represent field names. Values represent data types.

Note

Supported data types:

  • For a complete list, see Data types.

  • The money type is not supported.

Warning

The fields id, vector, doc_name, content, loader_metadata, source, and to_tsvector are reserved. Do not use them.

{"title":"text","page":"int"}
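The Metadata value is a JSON string, and FullTextRetrievalFields must name keys that it defines. The following sketch shows one way to build the string and check a comma-separated field list against it. Both helper names are hypothetical, and the checks cover only the rules stated above (reserved names, no money type, fields must be Metadata keys); the service may enforce more.

```python
import json
from typing import Dict, List

# Field names the documentation marks as reserved.
RESERVED_FIELDS = {
    "id", "vector", "doc_name", "content",
    "loader_metadata", "source", "to_tsvector",
}


def build_metadata(fields: Dict[str, str]) -> str:
    """Serialize a field-name -> data-type map into the Metadata JSON string."""
    clashes = sorted(RESERVED_FIELDS & set(fields))
    if clashes:
        raise ValueError(f"reserved field names used: {clashes}")
    if any(data_type.strip().lower() == "money" for data_type in fields.values()):
        raise ValueError("the money type is not supported")
    return json.dumps(fields, separators=(",", ":"))


def check_field_list(csv_fields: str, fields: Dict[str, str]) -> List[str]:
    """Split a comma-separated field list and verify each name is a Metadata key."""
    names = [name.strip() for name in csv_fields.split(",") if name.strip()]
    missing = [name for name in names if name not in fields]
    if missing:
        raise ValueError(f"fields not defined in Metadata: {missing}")
    return names


if __name__ == "__main__":
    schema = {"title": "text", "page": "int"}
    print(build_metadata(schema))
    print(check_field_list("title,page", schema))
```

The same field-list check would apply to any other parameter that must reference Metadata keys.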

Parser

string

No

The tokenizer used for full-text search. Default value: zh_cn.

zh_cn

Metrics

string

No

The distance metric used to build the vector index.

Valid values:

  • l2: Euclidean distance.

  • ip: Inner product distance.

  • cosine (default): Cosine similarity.

cosine

HnswM

integer

No

The maximum number of neighbors in the HNSW algorithm. The API sets this value automatically based on the vector dimension. Manual configuration is usually unnecessary.

Note

Valid values:

  • AnalyticDB for PostgreSQL 6.0 instance: 1 to 1000.

  • AnalyticDB for PostgreSQL 7.0 instance: 2 to 100. Default value: 16.

Note

We recommend setting this value based on the vector dimension:

  • ≤ 384: 16

  • > 384 and ≤ 768: 32

  • > 768 and ≤ 1024: 64

  • > 1024: 128

64
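The dimension-based recommendation above maps directly to a small lookup. This is an illustrative sketch with a hypothetical function name; it only encodes the recommended values and does not enforce the per-version valid ranges.

```python
def recommended_hnsw_m(dimension: int) -> int:
    """Return the recommended HnswM for a vector dimension.

    Thresholds follow the recommendation table:
    <= 384 -> 16, <= 768 -> 32, <= 1024 -> 64, > 1024 -> 128.
    """
    if dimension <= 0:
        raise ValueError("dimension must be a positive integer")
    if dimension <= 384:
        return 16
    if dimension <= 768:
        return 32
    if dimension <= 1024:
        return 64
    return 128


if __name__ == "__main__":
    for dim in (384, 768, 1024, 1536):
        print(dim, recommended_hnsw_m(dim))
```

Note that the value recommended for dimensions above 1024 (128) falls outside the 2 to 100 range listed for AnalyticDB for PostgreSQL 7.0 instances, so a caller targeting 7.0 may need to handle that case separately.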

HnswEfConstruction

string

No

The candidate set size used when building the HNSW index. Valid values: 4 to 1000. Default value: 64.

Note

This parameter applies only to AnalyticDB for PostgreSQL 7.0 instances. Its value must be at least 2 × HnswM.

128
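The two constraints on HnswEfConstruction (the 4 to 1000 range and the lower bound of twice HnswM) can be checked together before the call. A minimal sketch with a hypothetical function name:

```python
def ef_construction_is_valid(hnsw_m: int, ef_construction: int) -> bool:
    """Check HnswEfConstruction against the documented constraints.

    The value must lie in [4, 1000] and be at least twice HnswM.
    """
    in_range = 4 <= ef_construction <= 1000
    large_enough = ef_construction >= 2 * hnsw_m
    return in_range and large_enough


if __name__ == "__main__":
    print(ef_construction_is_valid(64, 128))  # exactly 2 x HnswM
    print(ef_construction_is_valid(64, 100))  # below 2 x HnswM
```

With the default HnswEfConstruction of 64, the lower bound implies an HnswM of at most 32.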

PqEnable

integer

No

Whether to enable product quantization (PQ) to accelerate indexing. We recommend enabling PQ if your data volume exceeds 500,000 items.

  • 0: Disabled.

  • 1: Enabled (default).

1

ExternalStorage

integer

No

Whether to use memory-mapped files (mmap) to build the HNSW index. Default value: 0. Set this to 1 if you do not need to delete data and require high upload performance.

Valid values:

  • 0: Use segment-page storage to build the index. This mode uses PostgreSQL shared_buffer for caching and supports operations such as deletion and updates.

  • 1: Use mmap to build the index. This mode does not support deletion or updates.

Important

Only AnalyticDB for PostgreSQL 6.0 supports the ExternalStorage parameter. AnalyticDB for PostgreSQL 7.0 does not support it.

0

MetadataIndices

string

No

The scalar index fields. Separate multiple fields with commas (,). Each field must be a key defined in the Metadata parameter.

title

EnableGraph

boolean

No

Whether to enable knowledge graph construction. Default value: false.

Note

Before using this parameter, upgrade your instance to a version that supports the graph engine. During public preview, submit a ticket to request an upgrade.

true

LLMModel

string

No

The large language model (LLM) name.

  • knowledge-extract-standard: Default value.

  • knowledge-extract-mini

Note

This parameter takes effect only when EnableGraph is set to true.

knowledge-extract-standard

Language

string

No

The language used for knowledge graph construction.

  • Simplified Chinese: Simplified Chinese. Default value.

  • English: English.

Note

This parameter takes effect only when EnableGraph is set to true.

Simplified Chinese

EntityTypes

array

No

The list of entity types.

Note

This parameter is required when EnableGraph is set to true.

string

No

The entity type.

地点 (location)

RelationshipTypes

array

No

The list of relationship edge types.

Note

This parameter is required when EnableGraph is set to true.

string

No

The relationship edge type.

发生 (occurs)
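The graph-related parameters depend on one another: LLMModel and Language take effect only when EnableGraph is true, and EntityTypes and RelationshipTypes become required in that case. The sketch below gathers them into one dictionary. The function name is hypothetical, and the default strings are copied from the parameter descriptions above.

```python
from typing import Dict, List, Optional


def graph_params(
    enable_graph: bool,
    entity_types: Optional[List[str]] = None,
    relationship_types: Optional[List[str]] = None,
    llm_model: str = "knowledge-extract-standard",
    language: str = "Simplified Chinese",
) -> Dict[str, object]:
    """Return the graph-related request parameters as a dictionary."""
    if not enable_graph:
        # The other graph parameters have no effect, so they are omitted.
        return {"EnableGraph": False}
    if not entity_types or not relationship_types:
        raise ValueError(
            "EntityTypes and RelationshipTypes are required when EnableGraph is true"
        )
    return {
        "EnableGraph": True,
        "LLMModel": llm_model,
        "Language": language,
        "EntityTypes": list(entity_types),
        "RelationshipTypes": list(relationship_types),
    }


if __name__ == "__main__":
    print(graph_params(True, ["地点"], ["发生"]))
```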

SupportSparse

boolean

No

Whether to support sparse vectors. Default value: false.

true

SparseVectorIndexConfig

object

No

The sparse vector index configuration. If provided, a sparse vector index is created.

HnswM

integer

No

The maximum number of neighbors in the HNSW algorithm. The API sets this value automatically based on the vector dimension. Manual configuration is usually unnecessary.

Note

Valid values:

  • AnalyticDB for PostgreSQL 6.0 instance: 1 to 1000.

  • AnalyticDB for PostgreSQL 7.0 instance: 2 to 100. Default value: 16.

Note

We recommend setting this value based on the vector dimension:

  • ≤ 384: 16

  • > 384 and ≤ 768: 32

  • > 768 and ≤ 1024: 64

  • > 1024: 128

64

HnswEfConstruction

integer

No

The candidate set size used when building the HNSW index. Valid values: 4 to 1000. Default value: 64.

Note

This parameter applies only to AnalyticDB for PostgreSQL 7.0 instances. Its value must be at least 2 × HnswM.

128

SparseRetrievalFields

string

No

The metadata fields used to build sparse vectors. Separate multiple fields with commas (,). Each field must be a key defined in the Metadata parameter.

title,abstract
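Putting the parameters together, a request can be assembled as a plain dictionary in which optional parameters are left out when unset. This is a sketch only: build_request is a hypothetical helper, it does not send anything, and how an SDK or HTTP client serializes nested values such as SparseVectorIndexConfig is not covered here.

```python
from typing import Any, Dict

# Parameters marked "Yes" in the Required column.
REQUIRED_PARAMS = (
    "DBInstanceId",
    "ManagerAccount",
    "ManagerAccountPassword",
    "Collection",
    "RegionId",
)


def build_request(**params: Any) -> Dict[str, Any]:
    """Drop unset (None) parameters and verify the required ones are present."""
    cleaned = {key: value for key, value in params.items() if value is not None}
    missing = [key for key in REQUIRED_PARAMS if key not in cleaned]
    if missing:
        raise ValueError(f"missing required parameters: {missing}")
    return cleaned


if __name__ == "__main__":
    request = build_request(
        DBInstanceId="gp-xxxxxxxxx",
        ManagerAccount="testaccount",
        ManagerAccountPassword="testpassword",
        Collection="document",
        RegionId="cn-hangzhou",
        Namespace=None,  # omitted, so the service default applies
        Metadata='{"title":"text","abstract":"text"}',
        SupportSparse=True,
        SparseRetrievalFields="title,abstract",
        SparseVectorIndexConfig={"HnswM": 64, "HnswEfConstruction": 128},
    )
    print(sorted(request))
```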

Response elements

Element

Type

Description

Example

object

RequestId

string

The ID of the request.

ABB39CC3-4488-4857-905D-2E4A051D0521

Message

string

The response message.

Successful

Status

string

The status of the API execution.

  • success: The operation succeeded.

  • fail: The operation failed.

success

Examples

Success response

JSON format

{
  "RequestId": "ABB39CC3-4488-4857-905D-2E4A051D0521",
  "Message": "Successful",
  "Status": "success"
}
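A caller would typically inspect the Status element to decide whether the collection was created. The sketch below parses a response body and compares Status with the documented value success; the function name is hypothetical, and the sample body is written out here for illustration.

```python
import json


def collection_created(response_body: str) -> bool:
    """Return True when the response reports Status == "success"."""
    data = json.loads(response_body)
    return data.get("Status") == "success"


if __name__ == "__main__":
    sample = json.dumps(
        {
            "RequestId": "ABB39CC3-4488-4857-905D-2E4A051D0521",
            "Message": "Successful",
            "Status": "success",
        }
    )
    print(collection_created(sample))
```

When the check fails, the Message and RequestId elements are the natural values to log for troubleshooting.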

Error codes

See Error Codes for a complete list.

Release notes

See Release Notes for a complete list.