CreateIndex - Alibaba Cloud Model Studio - Alibaba Cloud Documentation Center

Creates an unstructured knowledge base and imports one or more parsed documents into the knowledge base. You cannot create a structured knowledge base by calling an API operation. Use the console instead.

Operation description

You must first upload documents to Data Management and obtain the FileId. The documents are the knowledge source of the knowledge base. For more information, see Import Data.
This operation only initializes a knowledge base creation job. You must also call the SubmitIndexJob operation to complete the job.
This interface is not idempotent.
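
The two-step flow described above can be sketched as follows. This is a minimal illustration, not the official SDK usage: `call` is a hypothetical callable that sends one signed request for the named operation and returns the decoded JSON body, and the `IndexId` parameter passed to SubmitIndexJob is an assumption to be checked against that operation's own reference.

```python
from typing import Any, Callable, Dict, List

# Hypothetical transport: sends one signed API request for the named
# operation and returns the decoded JSON response body as a dict.
ApiCall = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def create_knowledge_base(call: ApiCall, workspace_id: str,
                          name: str, file_ids: List[str]) -> str:
    """Initialize a knowledge base creation job, then submit it."""
    created = call("CreateIndex", {
        "WorkspaceId": workspace_id,
        "Name": name,
        "StructureType": "unstructured",
        "SourceType": "DATA_CENTER_FILE",
        "DocumentIds": file_ids,  # FileIds obtained from Data Management
        "SinkType": "DEFAULT",
    })
    index_id = created["Data"]["Id"]

    # CreateIndex only initializes the job; SubmitIndexJob completes it.
    # The IndexId parameter name is an assumption, not taken from this page.
    call("SubmitIndexJob", {"WorkspaceId": workspace_id, "IndexId": index_id})
    return index_id
```

Because the operation is not idempotent, the sketch deliberately does not retry CreateIndex; a blind retry after a timeout could initialize a second knowledge base.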

Debugging

You can run this operation directly in OpenAPI Explorer, which saves you from calculating signatures. After a successful call, OpenAPI Explorer can automatically generate SDK code samples.

Authorization information

The following table shows the authorization information for this API operation. You can use it in the Action element of a policy to grant a RAM user or RAM role the permissions to call this operation. The columns are described as follows:

Operation: the value that you can use in the Action element to specify the operation on a resource.
Access level: the access level of each operation. The levels are read, write, and list.
Resource type: the type of the resource on which you can authorize the RAM user or the RAM role to perform the operation. Take note of the following items:
- Mandatory resource types are marked with an asterisk (*) prefix.
- If the permissions cannot be granted at the resource level, All Resources is used in the Resource type column of the operation.
Condition Key: the condition key that is defined by the cloud service.
Associated operation: other operations that the RAM user or the RAM role must have permissions to perform to complete the operation. To complete the operation, the RAM user or the RAM role must have the permissions to perform the associated operations.

Operation	Access level	Resource type	Condition key	Associated operation
sfm:CreateIndex	create	All Resources	none	none
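
As an illustration of how the Operation value is used in the Action element, the following sketch builds a policy document as a Python dictionary and serializes it to JSON. The `Version`/`Statement`/`Effect`/`Action`/`Resource` layout follows the general RAM policy structure and should be checked against the RAM documentation before use.

```python
import json

# A sketch of a RAM policy document that allows calling CreateIndex.
# "Resource": "*" mirrors the "All Resources" entry in the table above.
policy = {
    "Version": "1",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "sfm:CreateIndex",
            "Resource": "*",
        }
    ],
}

policy_json = json.dumps(policy, indent=2)
print(policy_json)
```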

Request syntax

POST /{WorkspaceId}/index/create HTTP/1.1
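
The `{WorkspaceId}` placeholder is filled in per request. A minimal sketch of building the path is shown below; the helper name `create_index_path` is hypothetical, and the service endpoint (host) is intentionally left out because this page does not state it.

```python
from urllib.parse import quote


def create_index_path(workspace_id: str) -> str:
    """Return the request path for CreateIndex in the given workspace."""
    if not workspace_id:
        raise ValueError("WorkspaceId is required")
    # Percent-encode the ID so it is always a single, safe path segment.
    return f"/{quote(workspace_id, safe='')}/index/create"
```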

Request parameters

Parameter	Type	Required	Description	Example
WorkspaceId	string	Yes	The ID of the workspace to which the knowledge base belongs. To view the workspace ID, you can click the Workspace Details icon in the upper-left corner on the homepage of the console.	ws_3Nt27MYcoK191ISp
Name	string	Yes	The name of the knowledge base. The name must be 1 to 20 characters in length and can contain characters that Unicode classifies as letters (such as English letters and Chinese characters), digits, colons (:), underscores (_), periods (.), and hyphens (-).
StructureType	string	Yes	The data type of the knowledge base. For more information, see Create a knowledge base. Valid value: unstructured. Note: After a knowledge base is created, its data type cannot be changed. You cannot create a structured knowledge base by calling an API operation. Use the console instead.	unstructured
EmbeddingModelName	string	No	The name of the embedding model. The embedding model converts the original input prompt and knowledge text into numerical vectors for similarity comparison. The default and only model available is DashScope text-embedding-v2. It supports multiple languages including Chinese and English and normalizes the vector results. For more information, see Create a knowledge base. Valid value: text-embedding-v2 The default value is null, which means using the text-embedding-v2 model.	text-embedding-v2
RerankModelName	string	No	The name of the rank model. The rank model is a scoring system outside the knowledge base. It calculates the similarity score between the input question and each text chunk in the knowledge base, ranks the chunks in descending order of score, and returns the top K chunks. For more information, see Create a knowledge base. Valid values: gte-rerank-hybrid, gte-rerank. The default value is empty, which means using the official gte-rerank-hybrid model. Note: If you need only semantic ranking, we recommend that you use gte-rerank. If you need both semantic ranking and text matching to ensure relevance, we recommend that you use gte-rerank-hybrid.	gte-rerank-hybrid
RerankMinScore	double	No	The similarity threshold, which is the lowest similarity score that a chunk can have and still be returned. This parameter is used to filter the text chunks returned by the rank model. For more information, see Create a knowledge base. Valid values: 0.01 to 1.00. Default value: 0.20.	0.20
ChunkSize	integer	No	The estimated length of chunks, which is the maximum number of characters in a chunk. Text that exceeds this limit is split. For more information, see Create a knowledge base. Valid values: 1 to 2048. The default value is empty, which means using the intelligent splitting method. Note: If you specify the `ChunkSize` parameter, you must also specify the `OverlapSize` and `Separator` parameters. If you do not specify these three parameters, the system uses the intelligent splitting method by default.	128
OverlapSize	integer	No	The overlap length. The number of overlapping characters between two consecutive chunks. For more information, see Create a knowledge base. Valid values: 0 to 1024. The default value is empty, which means using the intelligent splitting method.	16
Separator	string	No	The clause identifier. The document is split into chunks based on this identifier. For more information, see Create a knowledge base. You can specify multiple identifiers without adding any other characters to separate them. For example: !,\\n. Valid values: \n (line break), ， (Chinese comma), , (English comma), 。 (Chinese full stop), . (English full stop), ！ (Chinese exclamation point), ! (English exclamation point), ； (Chinese semicolon), ; (English semicolon), ？ (Chinese question mark), ? (English question mark). The default value is empty, which means using the intelligent splitting method.	,
SourceType	string	Yes	The data type of Data Management. For more information, see Create a knowledge base. Valid values: DATA_CENTER_CATEGORY: The category type. Import all documents from one or more categories in Data Center. DATA_CENTER_FILE: The document type. Import one or more documents from Data Center. Note: If this parameter is set to DATA_CENTER_CATEGORY, you must specify the `CategoryIds` parameter. If this parameter is set to DATA_CENTER_FILE, you must specify the `DocumentIds` parameter. Note: To create an empty knowledge base, set this parameter to DATA_CENTER_CATEGORY and specify the ID of an empty category for the `CategoryIds` parameter.	DATA_CENTER_FILE
DocumentIds	array	No	The list of primary key IDs of the documents to be imported into the knowledge base.
	string	No	The primary key ID of the document. To view the ID, you can click the ID icon next to the file name on the Data Management page.	file_9a65732555b54d5ea10796ca5742ba22_XXXXXXXX
CategoryIds	array	No	The list of primary key IDs of the categories to be imported into the knowledge base.
	string	No	The primary key ID of the category. To view the ID, you can click the icon next to the ID category on the Data Management page. All documents in specified categories are imported into the knowledge base.	ca_hiu2383nf934j
DataSource	object	No	Note This parameter is not available. Do not specify this parameter.
CredentialId	string	No	Note This parameter is not available. Do not specify this parameter.
CredentialKey	string	No	Note This parameter is not available. Do not specify this parameter.
Database	string	No	Note This parameter is not available. Do not specify this parameter.
Endpoint	string	No	Note This parameter is not available. Do not specify this parameter.
IsPrivateLink	boolean	No	Note This parameter is not available. Do not specify this parameter.
Region	string	No	Note This parameter is not available. Do not specify this parameter.
SubPath	string	No	Note This parameter is not available. Do not specify this parameter.
SubType	string	No	Note This parameter is not available. Do not specify this parameter.
Table	string	No	Note This parameter is not available. Do not specify this parameter.
Type	string	No	Note This parameter is not available. Do not specify this parameter.
SinkType	string	Yes	The vector storage type of the knowledge base. For more information, see Create a knowledge base. Valid values: DEFAULT: The built-in vector database. ADB: AnalyticDB for PostgreSQL database. If you need advanced features, such as managing, auditing, and monitoring, we recommend that you specify ADB. Note: If you have not used AnalyticDB for PostgreSQL in Model Studio before, go to the Create Knowledge Base page, select ADB-PG as Vector Storage Type, and follow the instructions to grant permissions. If you specify ADB, you must also specify the `SinkInstanceId` and `SinkRegion` parameters.	DEFAULT
SinkInstanceId	string	No	The ID of the vector storage instance. This parameter is available only when SinkType is set to ADB. You can view the ID on the Instances page of AnalyticDB for PostgreSQL.	gp-bp321093j84
SinkRegion	string	No	The region of the vector storage instance. This parameter is available only when SinkType is set to ADB. You can call the DescribeRegions operation to query the most recent region list.	cn-hangzhou
Columns	array<object>	No	Note This parameter is not available. Do not specify this parameter.
	object	No
Column	string	No	Note This parameter is not available. Do not specify this parameter.	source_column_name1
IsRecall	boolean	No	Note This parameter is not available. Do not specify this parameter.	true
IsSearch	boolean	No	Note This parameter is not available. Do not specify this parameter.	true
Name	string	No	Note This parameter is not available. Do not specify this parameter.	index_column_name1
Type	string	No	Note This parameter is not available. Do not specify this parameter.	string
Description	string	No	The description of the knowledge base. The description must be 0 to 1,000 characters in length. This parameter is empty by default.
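
The constraints in the table above can be checked on the client before a request is sent. The following is a minimal sketch under stated assumptions: the function name `validate_create_index_params` is hypothetical, "characters" is taken to mean Python string length (Unicode code points), and the character-set rule for `Name` is not checked. The service remains the authority on validity.

```python
from typing import Any, Dict, List


def validate_create_index_params(p: Dict[str, Any]) -> List[str]:
    """Return human-readable problems; an empty list means none were found."""
    errors: List[str] = []

    # Parameters marked "Yes" in the Required column.
    for key in ("WorkspaceId", "Name", "StructureType", "SourceType", "SinkType"):
        if not p.get(key):
            errors.append(f"{key} is required")

    if len(p.get("Name") or "") > 20:
        errors.append("Name must be 1 to 20 characters in length")
    if p.get("StructureType") and p["StructureType"] != "unstructured":
        errors.append("StructureType must be unstructured")

    score = p.get("RerankMinScore")
    if score is not None and not 0.01 <= score <= 1.0:
        errors.append("RerankMinScore must be between 0.01 and 1.00")

    # Specifying ChunkSize requires OverlapSize and Separator as well.
    if p.get("ChunkSize") is not None:
        if not 1 <= p["ChunkSize"] <= 2048:
            errors.append("ChunkSize must be between 1 and 2048")
        for key in ("OverlapSize", "Separator"):
            if p.get(key) is None:
                errors.append(f"{key} is required when ChunkSize is specified")
    overlap = p.get("OverlapSize")
    if overlap is not None and not 0 <= overlap <= 1024:
        errors.append("OverlapSize must be between 0 and 1024")

    # The source type decides which ID list must be present.
    if p.get("SourceType") == "DATA_CENTER_CATEGORY" and not p.get("CategoryIds"):
        errors.append("CategoryIds is required for DATA_CENTER_CATEGORY")
    if p.get("SourceType") == "DATA_CENTER_FILE" and not p.get("DocumentIds"):
        errors.append("DocumentIds is required for DATA_CENTER_FILE")

    # ADB storage needs the instance ID and region.
    if p.get("SinkType") == "ADB":
        for key in ("SinkInstanceId", "SinkRegion"):
            if not p.get(key):
                errors.append(f"{key} is required when SinkType is ADB")

    if len(p.get("Description") or "") > 1000:
        errors.append("Description must be 0 to 1,000 characters in length")
    return errors
```

Returning a list instead of raising on the first problem lets a caller report every issue at once, which matters here because a failed or repeated CreateIndex call is not idempotent.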

Response parameters

Parameter	Type	Description	Example
	object	Schema of Response
Code	string	HTTP status code	Forbidden
Data	object	The returned data.
Id	string	The primary key ID of the knowledge base, `IndexId`. Note We recommend that you store this ID. It is required for all subsequent API operations related to this knowledge base.	jkurxhju6b
Message	string	The error message.	Invalid input, variable name is missing
RequestId	string	The request ID.	17204B98-7734-4F9A-8464-2446A84821CA
Status	string	The status code.	200
Success	string	Indicates whether the API call is successful. Valid values: true, false	true

Examples

Sample success responses

JSON format

{
  "Code": "Forbidden",
  "Data": {
    "Id": "jkurxhju6b"
  },
  "Message": "Invalid input, variable name is missing",
  "RequestId": "17204B98-7734-4F9A-8464-2446A84821CA",
  "Status": "200",
  "Success": "true"
}
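
A caller typically needs only `Data.Id` from this response. The sketch below shows one way to extract it; the function name `extract_index_id` is hypothetical. Because `Success` is documented as a string, it is compared as text (after trimming whitespace) rather than tested for truthiness.

```python
import json


def extract_index_id(body: str) -> str:
    """Parse a CreateIndex response body and return the knowledge base ID."""
    resp = json.loads(body)
    # Success is a string ("true" or "false"), so a plain truthiness test
    # would treat "false" as success. Compare the text instead.
    if str(resp.get("Success", "")).strip().lower() != "true":
        raise RuntimeError(
            f"CreateIndex failed: {resp.get('Code')}: {resp.get('Message')} "
            f"(RequestId={resp.get('RequestId')})"
        )
    return resp["Data"]["Id"]
```

Store the returned ID, since it is required for subsequent API operations on this knowledge base.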

Error codes

For a list of error codes, see Service error codes.

Change history

Change time	Summary of changes	Operation

No change history
