Integrate ApsaraDB RDS for PostgreSQL with RAG

This topic describes how to associate a Retrieval-Augmented Generation (RAG)-based large language model (LLM) chatbot with an ApsaraDB RDS for PostgreSQL instance when you deploy the chatbot. It also describes the basic features of a RAG-based LLM chatbot and the special features provided by ApsaraDB RDS for PostgreSQL.

Background information

Introduction to EAS

Elastic Algorithm Service (EAS) is an online model service platform of PAI that allows you to deploy models as online inference services or AI-powered web applications. EAS provides features such as auto scaling and blue-green deployment. These features reduce the costs of developing stable online model services that can handle a large number of concurrent requests. In addition, EAS provides features such as resource group management and model versioning and capabilities such as comprehensive O&M and monitoring. For more information, see EAS overview.

Introduction to RAG

With the rapid development of AI technology, generative AI has made remarkable achievements in fields such as text generation and image generation. However, as LLMs become widely used, the following inherent limitations have emerged:

Domain knowledge limitations: In most cases, LLMs are trained on large-scale general-purpose datasets. As a result, they struggle to provide in-depth, targeted processing for specialized vertical fields.
Information update delay: The static nature of the training datasets prevents LLMs from accessing and incorporating real-time information and knowledge updates.
Misleading outputs: LLMs are prone to hallucinations, producing outputs that appear plausible but are factually incorrect. This is attributed to factors such as data bias and inherent model limitations.

To address these challenges and improve the capabilities and accuracy of LLMs, RAG was developed. RAG integrates external knowledge bases to significantly mitigate LLM hallucinations and improve the ability of LLMs to access and apply up-to-date knowledge. This allows LLMs to be customized for greater personalization and accuracy.

Introduction to ApsaraDB RDS for PostgreSQL

ApsaraDB RDS supports PostgreSQL. PostgreSQL is fully compatible with SQL and supports a diverse range of data formats such as JSON, IP, and geometric data. In addition to features such as transactions, subqueries, multi-version concurrency control (MVCC), and data integrity checks, ApsaraDB RDS for PostgreSQL provides features such as high availability, backup, and restoration to reduce the O&M workload. For more information about the advanced features of ApsaraDB RDS for PostgreSQL, see What is ApsaraDB RDS for PostgreSQL?

Procedure

EAS provides a self-developed, systematic RAG solution with flexible parameter configurations. You can access RAG services by using a web user interface (UI) or by calling API operations to configure a custom RAG-based LLM chatbot. The technical architecture of RAG focuses on retrieval and generation.

Retrieval: EAS integrates a range of vector databases, including open source Faiss and Alibaba Cloud services such as Milvus, Elasticsearch, Hologres, OpenSearch, and AnalyticDB for PostgreSQL.
Generation: EAS supports various open source models such as Qwen, Meta Llama, Mistral, and Baichuan, and also integrates ChatGPT.

In this example, an ApsaraDB RDS for PostgreSQL instance is used to show how to use EAS and ApsaraDB RDS for PostgreSQL to deploy a RAG-based LLM chatbot by performing the following steps:

1. Prepare a vector database by using ApsaraDB RDS for PostgreSQL
   Create an ApsaraDB RDS for PostgreSQL instance and prepare the configuration items that the RAG-based LLM chatbot requires to associate with the instance.
2. Deploy a RAG-based LLM chatbot and associate it with the ApsaraDB RDS for PostgreSQL instance
   Deploy a RAG-based LLM chatbot on EAS and associate it with the ApsaraDB RDS for PostgreSQL instance.
3. Use the RAG-based LLM chatbot
   Connect to the ApsaraDB RDS for PostgreSQL instance in the RAG-based LLM chatbot, upload business data files, and then perform knowledge Q&A.

Prerequisites

A virtual private cloud (VPC), a vSwitch, and a security group are created. For more information, see Create a VPC with an IPv4 CIDR block and Create a security group.

Precautions

This practice is subject to the maximum number of tokens of an LLM service and is designed to help you understand the basic retrieval feature of a RAG-based LLM chatbot.

The chatbot is limited by the server resource size of the LLM service and the default number of tokens. The conversation length supported by the chatbot is also limited.
If you do not need multi-round conversations, we recommend that you disable the with chat history feature of the chatbot on the WebUI page. This effectively reduces the possibility of reaching the limit. For more information, see How do I disable the with chat history feature of the RAG-based chatbot?

Prepare a vector database by using ApsaraDB RDS for PostgreSQL

Step 1: Create an ApsaraDB RDS for PostgreSQL instance and database

Create an ApsaraDB RDS for PostgreSQL instance.
1. Go to the ApsaraDB RDS buy page of the new version.
2. On the buy page, configure the following parameters. For information about other parameters, see Create an ApsaraDB RDS for PostgreSQL instance.
  - Database Engine: Select PostgreSQL.
  - VPC: Select a created virtual private cloud (VPC).
  - Privileged Account: In the More section, select Configure Now and configure the database account and password.
3. Follow the instructions in the console to complete the payment and activation operations.
Create a database.
1. On the Instances page, find the desired instance and click the name of the instance in the Instance ID/Name column. In the left-side navigation pane of the instance details page, click Databases. On the page that appears, click Create Database.
2. In the Create Database panel, specify a database name for the Database Name parameter and select the created privileged account for the Authorized By parameter. For information about other parameters, see Create a database and an account.
3. After you configure the parameters, click Create.

Step 2: Prepare configuration items

View the connection information about the database.
In the left-side navigation pane of the details page for the ApsaraDB RDS for PostgreSQL instance, click Database Connection. On the page that appears, view the internal endpoint, public endpoint, and port number of the database.
- Use an internal endpoint: The RAG-based LLM chatbot must be in the same VPC as the database.
- Use a public endpoint: When EAS accesses an ApsaraDB RDS for PostgreSQL instance over the Internet, EAS must be able to access the Internet. To ensure that an ApsaraDB RDS for PostgreSQL instance can receive requests from an EAS instance over the Internet, you must apply for a public endpoint for the ApsaraDB RDS for PostgreSQL instance and add the related Elastic IP Address (EIP) or 0.0.0.0/0 to the whitelist. Perform the following steps:
  1. Apply for a public endpoint for an ApsaraDB RDS for PostgreSQL instance. For more information, see Apply for or release a public endpoint.
  2. To enable Internet access for EAS, you must associate a NAT gateway and an EIP with the VPC that is configured when you deploy a RAG-based LLM chatbot. For more information, see Use the SNAT feature of an Internet NAT gateway to access the Internet.
    Note
    The RAG-based LLM chatbot can use the same VPC as the ApsaraDB RDS for PostgreSQL instance or a different VPC.
  3. Add 0.0.0.0/0 or the preceding EIP to the public IP address whitelist of the ApsaraDB RDS for PostgreSQL instance. For more information, see Configure an IP address whitelist.
View the privileged account and password.
In the left-side navigation pane of the details page for the ApsaraDB RDS for PostgreSQL instance, click Accounts. On the page that appears, you can view the privileged account. The password is configured when you create the instance. If you forget the password, you can click Reset Password to change the password.
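
The whitelist rule in the public endpoint steps can be sanity-checked offline. The following Python sketch uses only the standard library to test whether a given EIP is covered by a list of whitelist entries. The function name `is_whitelisted` and the sample addresses (taken from documentation-reserved ranges) are illustrative assumptions and are not part of ApsaraDB RDS or EAS.

```python
import ipaddress


def is_whitelisted(client_ip: str, whitelist: list[str]) -> bool:
    """Return True if client_ip falls inside any whitelist entry.

    Entries may be single addresses ("203.0.113.10") or CIDR blocks
    ("0.0.0.0/0"). This only mirrors the matching logic locally; it does
    not query or modify the whitelist of a real instance.
    """
    ip = ipaddress.ip_address(client_ip)
    for entry in whitelist:
        # strict=False tolerates entries whose host bits are set.
        network = ipaddress.ip_network(entry.strip(), strict=False)
        if ip in network:
            return True
    return False


if __name__ == "__main__":
    eip = "203.0.113.10"  # hypothetical EIP bound to the NAT gateway
    print(is_whitelisted(eip, ["203.0.113.10"]))     # entry for the EIP only
    print(is_whitelisted(eip, ["0.0.0.0/0"]))        # entry that matches all IPv4
    print(is_whitelisted(eip, ["198.51.100.0/24"]))  # unrelated CIDR block
```

Because 0.0.0.0/0 matches every IPv4 address, adding only the EIP is the narrower of the two options when the EIP is known.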

Deploy a RAG-based LLM chatbot and associate it with the ApsaraDB RDS for PostgreSQL instance

Log on to the Platform for AI (PAI) console and select a region. In the left-side navigation pane, choose Model Deployment > Elastic Algorithm Service (EAS). On the page that appears, select a workspace and click Enter Elastic Algorithm Service (EAS).
On the Elastic Algorithm Service (EAS) page, click Deploy Service. In the Scenario-based Model Deployment section, click RAG-based Smart Dialogue Deployment.

On the RAG-based LLM Chatbot Deployment page, configure the key parameters described in the following table. For information about other parameters, see Step 1: Deploy the RAG service.

Category	Parameter	Description
Basic Information	Model Source	Select Open Source Model.
Basic Information	Model Type	Select Qwen1.5-1.8b.
Resource Configuration	Resource Configuration	The system recommends the appropriate resource specifications based on the selected model type. If you use other resource specifications, the model service may fail to start.
Vector Database Settings	Vector Database Type	Select RDS PostgreSQL.
Vector Database Settings	Host Address	The internal endpoint or public endpoint of the ApsaraDB RDS for PostgreSQL instance.
Vector Database Settings	Port	The port number of the ApsaraDB RDS for PostgreSQL instance. Example: 5432.
Vector Database Settings	Database	The name of the database that you created.
Vector Database Settings	Table Name	The name of the table. You can enter a new table name or an existing table name. If you use an existing table name, the table schema must meet the requirements of the RAG-based chatbot. For example, you can enter the name of the table that is automatically created when you deploy the RAG-based chatbot by using EAS.
Vector Database Settings	Account	The privileged account that you created.
Vector Database Settings	Password	The password of the privileged account that you created.
VPC Configuration (Optional)	VPC, vSwitch, Security Group Name	If the host address is an internal endpoint, you must configure the RAG-based chatbot in the same VPC as the ApsaraDB RDS for PostgreSQL instance. If the host address is a public endpoint, you must configure a VPC for the RAG-based chatbot. Make sure that the VPC can access the Internet. For more information, see Use the SNAT feature of an Internet NAT gateway to access the Internet. In addition, you must add the EIP or `0.0.0.0/0` to the public IP address whitelist of the ApsaraDB RDS for PostgreSQL instance. For more information, see Configure an IP address whitelist.

After you configure the parameters, click Deploy.
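
Before you enter the vector database settings in the console, it can help to collect them in one place and check them for obvious mistakes. The following Python sketch is one way to do that with the standard library. The `VectorDbSettings` class, its field names, and the sample values (including the endpoint) are hypothetical and only mirror the parameters in the preceding table. The sketch does not call any EAS API. The `to_uri` method produces a standard PostgreSQL connection URI, which you could pass to a client such as psql to test connectivity separately.

```python
from dataclasses import dataclass
from urllib.parse import quote


@dataclass(frozen=True)
class VectorDbSettings:
    """Hypothetical container for the vector database settings."""
    host: str      # internal or public endpoint of the instance
    port: int      # for example, 5432
    database: str  # database created in Step 1
    table: str     # new or existing table name
    account: str   # privileged account
    password: str  # password of the privileged account

    def validate(self) -> None:
        """Raise ValueError if a field is empty or the port is out of range."""
        for name in ("host", "database", "table", "account", "password"):
            if not getattr(self, name).strip():
                raise ValueError(f"{name} must not be empty")
        if not 1 <= self.port <= 65535:
            raise ValueError("port must be between 1 and 65535")

    def to_uri(self) -> str:
        """Build a PostgreSQL connection URI with percent-encoded credentials."""
        user = quote(self.account, safe="")
        pwd = quote(self.password, safe="")
        db = quote(self.database, safe="")
        return f"postgresql://{user}:{pwd}@{self.host}:{self.port}/{db}"


if __name__ == "__main__":
    settings = VectorDbSettings(
        host="pgm-example.pg.rds.aliyuncs.com",  # placeholder endpoint
        port=5432,
        database="rag_db",
        table="rag_docs",
        account="rag_admin",
        password="p@ss/word",
    )
    settings.validate()
    print(settings.to_uri())
```

Percent-encoding the account and password keeps characters such as `@` and `/` from being misread as URI delimiters.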

Use the RAG-based LLM chatbot

The following section describes how to use a RAG-based LLM chatbot. For more information, see RAG-based LLM chatbot.

1. Connect to the vector database

After you deploy the RAG-based chatbot, click View Web App in the Service Type column to enter the web UI.
Check whether the vector database is connected.
The system recognizes and applies the vector database settings that are configured when you deploy the chatbot. Click Connect PostgreSQL to check whether the ApsaraDB RDS for PostgreSQL instance is connected. If the connection fails, check whether the vector database settings are correct based on Step 2: Prepare configuration items. If the settings are incorrect, modify the configuration items and click Connect PostgreSQL to reconnect to the ApsaraDB RDS for PostgreSQL instance.

2. Upload business data files

Upload your knowledge base files. The system automatically stores the knowledge base in the vector database in the PAI-RAG format for retrieval. You can also use existing knowledge bases in the database, but the knowledge bases must meet the PAI-RAG format requirements. Otherwise, errors may occur during retrieval.

On the Upload tab, configure the chunk parameters.

The following parameters control the granularity of document chunking and whether to enable Q&A extraction.

Parameter	Description
Chunk Size	The size of each chunk. Unit: bytes. Default value: 500.
Chunk Overlap	The overlap between adjacent chunks. Default value: 10.
Process with QA Extraction Model	Specifies whether to extract Q&A information. If you select Yes, the system automatically extracts questions and corresponding answers in pairs after knowledge files are uploaded. This way, more accurate answers are returned in data queries.

On the Files tab or Directory tab, upload one or more business data files. You can also upload a directory that contains the business data files. Supported file types: .txt, .pdf, Excel (.xlsx or .xls), .csv, Word (.docx or .doc), Markdown, and .html. Example: rag_chatbot_test_doc.txt.
Click Upload. The system performs data cleansing and semantic-based chunking on the business data files before uploading the business data files. Data cleansing includes text extraction and hyperlink replacement.
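
To make the Chunk Size and Chunk Overlap parameters more concrete, the following Python sketch splits text into fixed-size windows that share a few characters with their neighbors. This is a simplified stand-in and not the semantic-based chunking that the chatbot performs. The function name `chunk_text` is hypothetical, and the sketch counts characters rather than bytes for simplicity.

```python
def chunk_text(text: str, chunk_size: int = 500, chunk_overlap: int = 10) -> list[str]:
    """Split text into windows of chunk_size characters.

    Each window starts chunk_size - chunk_overlap characters after the
    previous one, so adjacent chunks share chunk_overlap characters.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in the range [0, chunk_size)")

    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window already reaches the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


if __name__ == "__main__":
    sample = "".join(str(i % 10) for i in range(1200))
    pieces = chunk_text(sample)  # defaults mirror the table: 500 and 10
    print([len(p) for p in pieces])
```

A larger overlap reduces the chance that a sentence is cut off at a chunk boundary without context, at the cost of storing more duplicated text.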

3. Perform knowledge Q&A

The RAG-based LLM chatbot inserts the results returned from the vector database and the query into the selected prompt template, and then sends the filled-in prompt to the LLM to generate an answer.
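
The template-filling step described above can be sketched in a few lines of Python. The template text, the `build_prompt` function, and the sample chunk are assumptions made for illustration. The prompt templates that the chatbot actually ships with are selected on the web UI and may differ.

```python
# Hypothetical prompt template with two placeholders.
DEFAULT_TEMPLATE = (
    "Answer the question using only the context below.\n\n"
    "Context:\n{context}\n\n"
    "Question: {question}\n"
    "Answer:"
)


def build_prompt(question: str, retrieved_chunks: list[str],
                 template: str = DEFAULT_TEMPLATE) -> str:
    """Fill the template with numbered retrieved chunks and the user query."""
    context = "\n\n".join(
        f"[{index}] {chunk.strip()}"
        for index, chunk in enumerate(retrieved_chunks, start=1)
    )
    return template.format(context=context, question=question.strip())


if __name__ == "__main__":
    chunks = ["EAS is an online model service platform of PAI. "]
    print(build_prompt("What is EAS?", chunks))
```

Numbering the chunks makes it easier to trace which retrieved passage an answer draws on when you inspect the prompt.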

Special features provided by ApsaraDB RDS for PostgreSQL

View the ApsaraDB RDS for PostgreSQL instance list, select the region in which the instance resides, find the desired instance, and click the name of the instance in the Instance ID/Name column.
In the left-side navigation pane of the instance details page, click Databases. On the page that appears, find the database that you want to manage and click SQL Query in the Actions column.
On the Log on to Database Instance page, enter the privileged account and password that you configured when you created the ApsaraDB RDS for PostgreSQL instance for the Database Account and Database Password parameters, and click Login.
After you log on, query the list of imported business data files in the database.

References

EAS provides simplified deployment methods for typical cutting-edge scenarios of AI-Generated Content (AIGC) and LLM. You can easily deploy model services by using deployment methods such as ComfyUI, Stable Diffusion WebUI, ModelScope, Hugging Face, Triton Inference Server, and TensorFlow Serving. For more information, see Scenario-based deployment.
You can configure various inference parameters on the web UI of a RAG-based LLM chatbot to meet diverse requirements. You can also use the RAG-based LLM chatbot by calling API operations. For more information about implementation details and parameter settings, see RAG-based LLM chatbot.
A RAG-based LLM chatbot can also be associated with other types of vector databases, such as OpenSearch and Elasticsearch. For more information, see Use EAS and Elasticsearch to deploy a RAG-based LLM chatbot or Use EAS and OpenSearch to deploy a RAG-based chatbot.