Large language models (LLMs) lack private and real-time knowledge, which can be solved by Retrieval-Augmented Generation (RAG). RAG retrieves information from external sources based on user input to enhance the accuracy of LLM responses. Alibaba Cloud Model Studio provides the knowledge base feature that uses RAG capabilities to retrieve private and real-time knowledge.
Application without private knowledge base Without a private knowledge base, the LLM cannot accurately answer questions about "Bailian phones". | Application with private knowledge base With a private knowledge base, the LLM can provide accurate answers to questions about "Bailian phones". |
Supported formats
The knowledge base feature supports documents in the following formats:
Unstructured Data (pdf, docx, doc, txt, markdown, pptx, ppt, png, jpg, jpeg, bmp, gif, and others)
Structured Data (xlsx, xls, and others)
The list above is not exhaustive. A complete list of supported types is displayed on the Import Data page of Data Management.
You can import local files (unstructured and structured data), from Object Storage Service OSS (unstructured data) or from ApsaraDB RDS (structured data). Data sources outside Alibaba Cloud, such as GitHub or Notion, are not supported.
Supported models
The following models support knowledge base:
Qwen-Max/Plus/Turbo
Open Source Qwen 2
The list above is not exhaustive and is subject to change. For the latest information, go to the application management page of My Applications. All models in the Select Model drop-down list support RAG.
Create and reference a knowledge base
Step 1: Import data
Before creating a knowledge base, import your documents into Data Management and choose from Unstructured Data and Structured Data.
If you want to build a knowledge base from an RDS data table, go to Step 2: Create a knowledge base.
The choice between structured or unstructured data depends on the format of your documents, see Supported formats.
You can import data by using the console or the API. However, the API supports only unstructured data. For more information about the API, see AddFile.
Unstructured data
Go to the Data Management page of the console and select the Unstructured Data tab.
In the Category Management section on the left, select the desired category for data import.
Select the Default Category or click to create a new category. The number of categories is not limited.
You can upload up to 10,000 documents to each workspace.
Click Import Data to go to the Import Data page.
Configure Document Recognition. Use the default value Intelligent Document Parsing.
The parser can detect and extract text from images within the document to create text summaries. These summaries, along with other content, are segmented and transformed into vectors for knowledge base retrieval.
(Optional) Configure tags for the document.
When calling applications by using API, tags can be specified in the request parameter
tags
to filter related documents, enhancing retrieval efficiency.Click Confirm to initiate the document parsing and importing process. This may take some time.
Document parsing converts the uploaded document into a format that Model Studio can process. During peak periods, this process may take longer.
Once parsing and importing are complete, click Details on the right side of the corresponding document to review the imported content.
Structured data
Go to the Data Management page of the console and select the Structured Data tab.
Create a new data table or select an existing one.
You can create up to 1,000 data tables in each workspace. Each table can contain up to 10,000 rows, including the table header. Exceeding this limit will result in a failed import, so you may need to split the data in advance.
Create a new data table
Click to create a data table.
Enter a name for the data table.
Configure the table by Upload Excel File or Custom Header.
Upload Excel File: Model Studio automatically detects the table header in the uploaded Excel file and create the data table structure accordingly, importing the remaining content as data records.
Custom Header: Column Name and Type is necessary and Description is optional.
NoteOnce the data table is created, you cannot modify the Column Name, Description, or Type.
Make sure the table schema matches the schema of the data to be imported. For example, if the data table to be imported has 2 columns, the structure here must also have 2 fields with corresponding column names. Click New Columns or Delete in the Actions column to adjust the fields.
When you set Type to link, make sure the link directs to an image file that is publicly accessible and valid. Otherwise, the knowledge base cannot recognize the image.
Example link format: https://example.com/downloads/pic.jpg
When creating a knowledge base, the link type field is used to generate an image index. Model Studio accesses the image, extracts its features, and saves them as vectors after image embedding. These vectors are used for similarity comparison during knowledge base retrieval.
Upload your document.
Click to select and upload an Excel document (xlsx or xls format).
The document must have a table header that matches the header structure of the current data table. Otherwise, the import will fail.
Click Preview to review the imported data.
Click Confirm. The new data table appears in the Table Management pane on the left.
Select an existing data table
Select an existing table from the Table Management pane on the left and click Import Data.
Select Upload and Overwrite or Incremental Upload as the Import Type.
You can click Download Template to download a blank document with the table header. Then, insert data to the template and upload it directly.
Click to select and upload an Excel document (xlsx or xls format).
The document must have a table header that matches the header structure of the current data table. Otherwise, the import will fail.
Click Preview to review the imported data.
Step 2: Create a knowledge base
The number of knowledge bases is not limited for both enterprise and individual accounts.
Console
Go to the Knowledge Base Index page. Click Create Knowledge Base.
Data Type: Select Unstructured Data or Structured Data.
After the knowledge base is created, the data type cannot be changed. A single knowledge base cannot support both unstructured and structured data.
The choice between structured or unstructured data depends on the format of your documents. If your documents are in pdf, docx, doc, txt, md, pptx, ppt, png, jpg, jpeg, bmp, gif formats, select Unstructured Data; if your documents are in xlsx, xls formats, select Structured Data.
Configuration Mode: We recommend that you select Recommended, which is the best practice from the Model Studio team. If you select Custom, you can configure the retrieval and recall-related parameters.
After the knowledge base is created, all parameters in Configuration Mode, except for Similarity Threshold, cannot be changed.
Custom Parameter Settings
Multi-round Conversation Rewriting: When enabled, Model Studio automatically adjust the original prompt based on the context to improve retrieval effectiveness.
Embedding Model: Converts the original prompt and knowledge text into numerical vectors for similarity comparison. The DashScope text-embedding-v2 model is the default and only option. The model supports multiple languages, including English and Chinese, and normalizes the vector results.
Rank Model: An external scoring system that calculates the similarity score between the query and each chunk in the knowledge base, sorts them in descending order, and returns the top K chunks with the highest scores. For semantic ranking, select GTE-ReRank. For semantic ranking and text matching features, choose Official Ranking (recommended).
Similarity Threshold: The minimum similarity score required for recalled chunks, used to filter the chunks returned by the Rank model. Lowering this threshold may recall more chunks, including less relevant ones. Increasing it reduces the number of recalled chunks, and may discard relevant ones.
Click Next Step, select the documents or data source to import:
Unstructured Data
Select the files you want to import from Data Management. If no file is available, you must first import your files to Data Management. For more information, see Knowledge base.
Select Category: Import all documents under the selected category. You can select multiple categories at a time.
Select File: Select the files you want to import.
You can select up to 50 documents at a time. Each document can be up to 100 MB in size or contain up to 1,000 pages.
Structured Data
Select the Data Source from Data Management or Associate RDS.
If you import structured data from Data Management, you will need to manually synchronize updates to the data table to the knowledge base later. For more information, see Update the knowledge base.
If you import structured data from ApsaraDB RDS, data updates in the RDS table will be automatically synchronized to the knowledge base (usually within seconds, but slight delays may occur during peak request periods).
Data Management
Select the table you want to import from Data Management. If no table is available, you must first import your table to Data Management. For more information, see Unstructured Data.
Associate RDS
Synchronize data from specific data tables in the RDS instance to your knowledge base.
Instance limitations:
Only RDS instances with MySQL Engine (no version restrictions) are supported. PostgreSQL and other engines are not supported.
No restrictions on instance regions.
Only Basic Edition and High-availability Edition are supported.
Database proxy is not supported.
Database and Table Limitations:
The knowledge base has no limit on the amount of data in the associated RDS database and data table, but the size of each row must be less than 10 MB.
DDL operations on the source table are not recommended after creating the knowledge base (For example, DROP TABLE, RENAME TABLE, TRUNCATE TABLE, ADD COLUMN, DROP COLUMN), because they may cause data synchronization failures between RDS and the knowledge base. For more information, see DDL operations.
To ensure the knowledge base can import data from RDS, you need to configure whitelist for the RDS instance. After the RDS data table is associated, the table header and the first ten rows of data are displayed.
Click Next Step to configure the Data Processing strategy.
Unstructured Data
Metadata Extraction: Metadata consists of additional attributes related to the content of unstructured documents, presented in key-value pairs.
The role of metadata: Metadata provides context for documents, significantly enhancing the precision of knowledge base retrieval. For example, search for "Feature Overview of Product A" in the knowledge base. If all documents include "Feature Overview" but none mention "Product A", the knowledge base may recall numerous unrelated chunks. However, if you associate product name as metadata with all documents and their related chunks, the knowledge base can accurately filter out chunks related to "Product A" and containing "Feature Overview", thereby improving retrieval accuracy and reducing input token consumption.
How to use metadata: When calling the application using API, specify the
metadata_filter
parameter. During knowledge base retrieval, the application will filter documents based on metadata.Note: You cannot configure Metadata Extraction again after the knowledge base is created.
Document Splitting: Select Intelligent Splitting (recommended) or Custom Splitting.
Role of document splitting: The knowledge base splits your documents into chunks and converts these chunks into vectors through the embedding model. The chunks and vectors are then stored in a vector database as key-value pairs. You can view the content of each chunk in the knowledge base. For more information, see View the knowledge base.
Note: You cannot configure Document Splitting after the knowledge base is created. An inappropriate splitting strategy may reduce retrieval and recall performance. For more information, see How to evaluate the quality of chunks?.
Intelligent Splitting: Uses the built-in chunking strategy, evaluated to deliver the best retrieval performance for most documents.
Custom Splitting: If intelligent splitting does not work properly, you can customize the document splitting strategy.
Structured Data
The following index configurations cannot be changed after the knowledge base is created.
Parameter Name
Parameter Description
Used for Retrieval
When enabled, that the knowledge base is allowed to search in this column data.
Used for Model Reply
When enabled, the retrieval results of this column will be used as input information when the LLM generates responses.
Click Import.
API
To build an unstructured knowledge base, use the CreateIndex API.
You can not use the API to create a structured knowledge base. Please create such knowledge bases in the console.
In the request return, the value of
Data.Id
is the primary key ID of the knowledge base. Please keep this value safe, as it will be used for all subsequent API operations related to the knowledge base.In the
StructureType
field, specify the data structure type used to create the knowledge base. For unstructured data, pass in unstructured.In the
RerankModelName
field, specify the name of the ranking model. For official reranking, pass in gte-rerank-hybrid.The rerank model is used to reorder the knowledge text results recalled from the knowledge base based on semantic relevance. Official Reranking is recommended.
In the
SinkType
field, specify the vector storage type of the knowledge base.The built-in vector database can meet basic needs. For advanced features such as management, audit, and monitoring of the database, ADB-PG (AnalyticDB for PostgreSQL) is recommended.
To specify the Built-in vector database, pass in DEFAULT.
To specify the ADB-PG (AnalyticDB for PostgreSQL) database, pass in ADB.
The CreateIndex step initiates the knowledge base construction, but to finalize it, you must invoke the SubmitIndexJob interface. As this task requires some time to complete, you can monitor its progress by calling the GetIndexJobStatus interface. A return status of
Data.Status
as COMPLETED indicates the successful creation of the knowledge base.
Step 3: (Optional) Test the knowledge base
Hit Test is used to evaluate the semantic retrieval performance of a knowledge base under a given Similarity Threshold, for example, to check whether chunks are correctly recalled. The test helps determine whether further adjustment of the similarity threshold is needed to ensure that the LLM can obtain valid knowledge from the knowledge base. To perform a hit test, expand Hit Test (Optional) and take the steps.
Similarity Threshold: The minimum similarity score required for recalled chunks, used to filter the chunks returned by the Rank model. Lowering this threshold may recall more chunks, including less relevant ones. Increasing it reduces the number of recalled chunks, and may discard relevant ones.
Step 4: Associate the knowledge base
Go to My Applications and associate your knowledge base with your Agent Application or Workflow Application. Both applications support multiple knowledge bases simultaneously based on the multi-channel recall strategy.
Multi-channel recall strategy: If the application is associated with three knowledge bases, the system retrieves chunks related to the input from these bases, ranks them with the Rank model, and selects the top K most relevant ones as the reference for the LLM.
Agent Application
Scenario
This is an example of a Q&A agent application based on a knowledge base. The knowledge base effectively provides private and the latest information for the LLM. Such an application is suitable in scenarios such as personal assistants, customer service, and technical support.
Reference the knowledge base in an agent application
Go to My Applications and click Manage on the desired agent application card. Enable Knowledge Base Retrieval Augmentation. The corresponding prompt is automatically filled in Prompt. Click Configure Knowledge Base and add one or more knowledge bases.
Retrieval Configuration (Optional)
Workflow Application
Scenario
This is an example of a Q&A workflow application based on a knowledge base. The execution logic of the process is: First, perform knowledge retrieval in the knowledge base based on user query. The recalled chunks are then passed into the LLM node along with the query for answer generation.
Reference the knowledge base in a workflow application
Go to My Applications and click Manage on the desired workflow application card.
Configure upstream node: Create a Knowledge Base node and connect it to the Start node.
Select query variable: In the Input dropdown list of the Knowledge Base node, select
query
.For Q&A workflow applications, the
sys.query
variable of the Start node is usually selected as the query variable.Select Knowledge Base: In the Select Knowledge Base dropdown list, select the knowledge base to be referenced.
Adjust topK (optional): The K value of the multi-channel Recall Strategy. It specifies the quantity of text chunks that the Rank model passes to the LLM. The value must not exceed the maximum length. Increasing the K value can enhance the precision of LLM responses at the cost of increased token consumption.
Configure downstream node: Create an LLM node and set it as the downstream node of the Knowledge Base node. In the Prompt of the large model node, guide the LLM to refer to the knowledge base.
System Prompt: # Knowledge base Remember the following materials that may help you answer questions:${Retrieval_xxxx.result} User Prompt ${sys.query}
Enter
/
to replace {{Retrieval_xxxx.result}} and {{sys.query}} to the actual variables in your workflow.Click Test or Publish. When the user asks a question, if the knowledge base node matches related chunks, the chunks are filled into the system variable
sys.query
to assist the LLM node in generating a response. If no related chunks are matched, the LLM node directly respond to the system variablesys.query
.