This article describes how to associate an ApsaraDB RDS for PostgreSQL instance with a Retrieval-Augmented Generation (RAG)-based large language model (LLM) chatbot when you deploy the chatbot. It also describes the basic features of a RAG-based LLM chatbot and the special features provided by ApsaraDB RDS for PostgreSQL.
Elastic Algorithm Service (EAS) is an online model service platform of PAI that allows you to deploy models as online inference services or AI-powered web applications. EAS provides features such as auto scaling and blue-green deployment. These features reduce the costs of developing stable online model services that can handle a large number of concurrent requests. In addition, EAS provides features such as resource group management and model versioning and capabilities such as comprehensive O&M and monitoring. For more information, see EAS overview.
With the rapid development of AI technology, generative AI has made remarkable achievements in fields such as text generation and image generation. However, inherent limitations have gradually emerged as LLMs have become widely used.
To address these challenges and enhance the capabilities and accuracy of LLMs, RAG is developed. RAG integrates external knowledge bases to significantly mitigate the issue of LLM hallucinations and enhance the capabilities of LLMs to access and apply up-to-date knowledge. This enables the customization of LLMs for greater personalization and accuracy.
ApsaraDB RDS supports PostgreSQL. PostgreSQL is fully compatible with SQL and supports a diverse range of data formats such as JSON, IP, and geometric data. In addition to support for features such as transactions, subqueries, multi-version concurrency control (MVCC), and data integrity check, ApsaraDB RDS for PostgreSQL provides a series of features, such as high availability, backup, and restoration, to ease O&M loads. For more information about the advanced features of ApsaraDB RDS for PostgreSQL, see What is ApsaraDB RDS for PostgreSQL?
EAS provides a self-developed, systematic RAG solution with flexible parameter configurations. You can access the RAG service from a web user interface (UI) or by calling API operations to configure a custom RAG-based LLM chatbot. The technical architecture of RAG centers on retrieval and generation.
This example shows how to use EAS and an ApsaraDB RDS for PostgreSQL instance to deploy a RAG-based LLM chatbot by performing the following steps:
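To make the retrieval half of that architecture concrete, the following is a minimal sketch, not the EAS implementation. It assumes that each knowledge chunk is stored next to an embedding vector and that retrieval ranks chunks by cosine similarity to the query embedding. The `store` contents, the three-dimensional vectors, and the function names are all hypothetical and written by hand for illustration.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction, so treat it as unrelated
    return dot / (norm_a * norm_b)


def retrieve(query_vec, store, top_k=2):
    """Return the texts of the top_k chunks most similar to query_vec."""
    ranked = sorted(
        store,
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:top_k]]


# Hypothetical chunks with hand-written embeddings (assumption: a real
# deployment would obtain these vectors from an embedding model).
store = [
    ("EAS supports auto scaling.", [0.9, 0.1, 0.0]),
    ("PostgreSQL supports MVCC.", [0.1, 0.9, 0.0]),
    ("RAG retrieves chunks before generation.", [0.0, 0.2, 0.9]),
]

query_vec = [0.8, 0.2, 0.1]  # hypothetical embedding of the user query
context = retrieve(query_vec, store, top_k=1)
print(context)
```

The generation half then passes the retrieved chunks to the LLM together with the query.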
1. Prepare a vector database by using ApsaraDB RDS for PostgreSQL
Create an ApsaraDB RDS for PostgreSQL instance and prepare the configuration items that the RAG-based LLM chatbot requires to associate with the instance.
2. Deploy a RAG-based LLM chatbot and associate it with the ApsaraDB RDS for PostgreSQL instance
Deploy a RAG-based LLM chatbot on EAS and associate it with the ApsaraDB RDS for PostgreSQL instance.
3. Use the RAG-based LLM chatbot
You can connect to the ApsaraDB RDS for PostgreSQL instance in the RAG-based LLM chatbot, upload business data files, and then perform knowledge Q&A.
A virtual private cloud (VPC), a vSwitch, and a security group are created. For more information, see Create a VPC with an IPv4 CIDR block and Create a security group.
This practice is subject to the maximum number of tokens of an LLM service and is designed to help you understand the basic retrieval feature of a RAG-based LLM chatbot.
1. Create an ApsaraDB RDS for PostgreSQL instance.
a) Go to the ApsaraDB RDS buy page of the new version.
b) On the buy page, configure the following parameters. For information about other parameters, see Create an ApsaraDB RDS for PostgreSQL instance.
c) Follow the instructions in the console to complete the payment and activation operations.
2. Create a database.
a) On the Instances page, find the desired instance and click the name of the instance in the Instance ID/Name column. In the left-side navigation pane of the instance details page, click Databases. On the page that appears, click Create Database.
b) In the Create Database panel, specify a database name for the Database Name parameter and select the created privileged account for the Authorized By parameter. For information about other parameters, see Create a database and an account.
c) After you configure the parameters, click Create.
1. View the connection information about the database.
In the left-side navigation pane of the details page for the ApsaraDB RDS for PostgreSQL instance, click Database Connection. On the page that appears, view the internal endpoint, public endpoint, and port number of the database.
i) Apply for a public endpoint for an ApsaraDB RDS for PostgreSQL instance. For more information, see Apply for or release a public endpoint.
ii) To enable Internet access for EAS, you must associate a NAT gateway and an EIP with the VPC that is configured when you deploy a RAG-based LLM chatbot. For more information, see Use the SNAT feature of an Internet NAT gateway to access the Internet.
Note: The RAG-based LLM chatbot can use the same VPC as the ApsaraDB RDS for PostgreSQL instance or a different VPC.
iii) Add 0.0.0.0/0 or the preceding EIP to the public IP address whitelist of the ApsaraDB RDS for PostgreSQL instance. For more information, see Configure an IP address whitelist.
2. View the privileged account and password.
In the left-side navigation pane of the details page for the ApsaraDB RDS for PostgreSQL instance, click Accounts. On the page that appears, you can view the privileged account. The password is configured when you create the instance. If you forget the password, you can click Reset Password to change the password.
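The configuration items collected above (endpoint, port, database name, privileged account, and password) are the same values that any PostgreSQL client needs. As a sketch only, the following shows how they could be combined into a standard `postgresql://` connection URI, with the account and password percent-encoded so that special characters do not break the URI. The endpoint, database name, account, and password below are hypothetical placeholders, and the helper `build_postgres_uri` is not part of EAS or ApsaraDB RDS.

```python
from urllib.parse import quote


def build_postgres_uri(host, port, database, account, password):
    """Combine the collected configuration items into a postgresql:// URI."""
    # Percent-encode values that may contain reserved characters such as @ or /.
    user = quote(account, safe="")
    pwd = quote(password, safe="")
    db = quote(database, safe="")
    return f"postgresql://{user}:{pwd}@{host}:{port}/{db}"


# All values below are placeholders (assumptions), not real credentials.
uri = build_postgres_uri(
    host="pgm-example.pg.rds.aliyuncs.com",  # hypothetical endpoint
    port=5432,
    database="rag_db",
    account="rag_admin",
    password="p@ss/word",
)
print(uri)
```

Keeping these values in one place also makes it easier to enter them consistently in the Vector Database Settings section when you deploy the chatbot.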
1. Log on to the Platform for AI (PAI) console and select a region. In the left-side navigation pane, choose Model Deployment > Elastic Algorithm Service (EAS). On the page that appears, select a workspace and click Enter Elastic Algorithm Service (EAS).
2. On the Elastic Algorithm Service (EAS) page, click Deploy Service. In the Scenario-based Model Deployment section, click RAG-based Smart Dialogue Deployment.
3. On the RAG-based LLM Chatbot Deployment page, configure the key parameters described in the following table. For information about other parameters, see Step 1: Deploy the RAG service.
Category | Parameter | Description |
---|---|---|
Basic Information | Model Source | Select Open Source Model. |
Basic Information | Model Type | Select Qwen1.5-1.8b. |
Resource Configuration | Resource Configuration | The system recommends the appropriate resource specifications based on the selected model type. If you use other resource specifications, the model service may fail to start. |
Vector Database Settings | Vector Database Type | Select RDS PostgreSQL. |
Vector Database Settings | Host Address | The internal endpoint or public endpoint of the ApsaraDB RDS for PostgreSQL instance. |
Vector Database Settings | Port | The port number of the ApsaraDB RDS for PostgreSQL instance. Example: 5432. |
Vector Database Settings | Database | The name of the database that you created. |
Vector Database Settings | Table Name | The name of the table. You can enter a new table name or an existing table name. If you use an existing table name, the table schema must meet the requirements of the RAG-based chatbot. For example, you can enter the name of the table that is automatically created when you deploy the RAG-based chatbot by using EAS. |
Vector Database Settings | Account | The privileged account that you created. |
Vector Database Settings | Password | The password of the privileged account that you created. |
VPC Configuration (Optional) | VPC, vSwitch, and Security Group Name | If the host address is an internal endpoint, you must configure the RAG-based chatbot in the same VPC as the ApsaraDB RDS for PostgreSQL instance. If the host address is a public endpoint, you must configure a VPC for the RAG-based chatbot. Make sure that the VPC can access the Internet. For more information, see Use the SNAT feature of an Internet NAT gateway to access the Internet. In addition, you must add the EIP or 0.0.0.0/0 to the public IP address whitelist of the ApsaraDB RDS for PostgreSQL instance. For more information, see Configure an IP address whitelist. |
4. After you configure the parameters, click Deploy.
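The whitelist requirement for public endpoints can be illustrated with a short sketch. It assumes that a whitelist is a list of IPv4 addresses or CIDR blocks and checks whether a client address falls inside any entry. The addresses are documentation-range placeholders, and `is_whitelisted` is a hypothetical helper that only models the matching rule, not how ApsaraDB RDS evaluates its whitelist internally.

```python
import ipaddress


def is_whitelisted(client_ip, whitelist):
    """Return True if client_ip matches any address or CIDR block in whitelist."""
    addr = ipaddress.ip_address(client_ip)
    for entry in whitelist:
        # strict=False lets a single address such as "203.0.113.10"
        # be treated as a /32 network.
        network = ipaddress.ip_network(entry, strict=False)
        if addr.version == network.version and addr in network:
            return True
    return False


eip = "203.0.113.10"  # hypothetical EIP of the NAT gateway

print(is_whitelisted(eip, ["0.0.0.0/0"]))               # every IPv4 address matches
print(is_whitelisted(eip, ["203.0.113.10"]))            # only the EIP matches
print(is_whitelisted("198.51.100.7", ["203.0.113.10"]))  # other clients do not match
```

Because 0.0.0.0/0 matches every IPv4 address, adding only the EIP is the narrower of the two options.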
The following section describes how to use a RAG-based LLM chatbot. For more information, see RAG-based LLM chatbot.
The system recognizes and applies the vector database settings that are configured when you deploy the chatbot. Click Connect PostgreSQL to check whether the ApsaraDB RDS for PostgreSQL instance is connected. If the connection fails, check whether the vector database settings are correct based on Step 2: Prepare configuration items. If the settings are incorrect, modify the configuration items and click Connect PostgreSQL to reconnect the ApsaraDB RDS for PostgreSQL instance.
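If the connection fails, one quick check that is independent of the chatbot is whether the host address and port are reachable over TCP from a machine in the same network. The following sketch uses only the Python standard library. It does not authenticate or speak the PostgreSQL protocol, so a successful result only rules out network-level problems such as a wrong endpoint, a closed port, or a missing whitelist entry. The function name `can_reach` and the example endpoint are assumptions.

```python
import socket


def can_reach(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False


# Example call with a hypothetical endpoint:
# can_reach("pgm-example.pg.rds.aliyuncs.com", 5432)
```

If the check fails for an internal endpoint, confirm that the chatbot and the instance are in the same VPC. If it fails for a public endpoint, review the NAT gateway, EIP, and whitelist settings.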
Upload your knowledge base files. The system automatically stores the knowledge base in the PAI-RAG format to the vector database for retrieval. You can also use existing knowledge bases in the database, but the knowledge bases must meet the PAI-RAG format requirements. Otherwise, errors may occur during retrieval.
1. On the Upload tab, configure the chunk parameters.
The following parameters control the granularity of document chunking and whether to enable Q&A extraction.
Parameter | Description |
---|---|
Chunk Size | The size of each chunk. Unit: bytes. Default value: 500. |
Chunk Overlap | The overlap between adjacent chunks. Default value: 10. |
Process with QA Extraction Model | Specifies whether to extract Q&A information. If you select Yes, the system automatically extracts questions and corresponding answers in pairs after knowledge files are uploaded. This way, more accurate answers are returned in data queries. |
2. On the Files tab or Directory tab, upload one or more business data files. You can also upload a directory that contains the business data files. Supported file types: .txt, .pdf, Excel (.xlsx or .xls), .csv, Word (.docx or .doc), Markdown, and .html. Example: rag_chatbot_test_doc.txt.
3. Click Upload. The system performs data cleansing and semantic-based chunking on the business data files before it uploads them. Data cleansing includes text extraction and hyperlink replacement.
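To show how the Chunk Size and Chunk Overlap parameters interact, here is a simplified sketch. The chatbot performs semantic-based chunking, which this example does not reproduce. Instead, it uses a plain fixed-size sliding window and counts characters rather than bytes, purely to illustrate how the two values determine where each chunk starts. The function `chunk_text` is hypothetical.

```python
def chunk_text(text, chunk_size=500, chunk_overlap=10):
    """Split text into fixed-size chunks whose neighbors share chunk_overlap characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in the range [0, chunk_size)")

    step = chunk_size - chunk_overlap  # distance between consecutive chunk starts
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the current chunk already reaches the end of the text
    return chunks


# A 1,200-character sample with the default values from the table above.
sample = "".join(str(i % 10) for i in range(1200))
chunks = chunk_text(sample, chunk_size=500, chunk_overlap=10)
print([len(c) for c in chunks])
```

With the defaults, each chunk starts 490 characters after the previous one, so the last 10 characters of a chunk are repeated at the beginning of the next chunk. A larger overlap preserves more context across chunk boundaries at the cost of storing more duplicated text.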
The RAG-based LLM chatbot enters the results returned from the vector database and the query into the selected prompt template and sends the template to the LLM application to provide an answer.
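The template-filling step described above can be sketched as follows. The template wording, the numbering of the retrieved chunks, and the function `build_prompt` are assumptions for illustration only. The actual prompt templates are the ones that you select in the chatbot.

```python
# Hypothetical prompt template with two placeholders: the retrieved
# context and the user question.
PROMPT_TEMPLATE = (
    "Answer the question using only the context below.\n"
    "Context:\n{context}\n\n"
    "Question: {question}\n"
    "Answer:"
)


def build_prompt(retrieved_chunks, question, template=PROMPT_TEMPLATE):
    """Insert the retrieved chunks and the user query into the template."""
    context = "\n".join(
        f"[{i}] {chunk}" for i, chunk in enumerate(retrieved_chunks, start=1)
    )
    return template.format(context=context, question=question)


prompt = build_prompt(
    ["EAS supports auto scaling.", "EAS supports blue-green deployment."],
    "Which deployment features does EAS provide?",
)
print(prompt)
```

The resulting string is what would be sent to the LLM, which grounds its answer in the retrieved context rather than only in its training data.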