This guide walks you through creating a Retrieval-Augmented Generation (RAG) service with Compute Nest. The service uses Large Language Models (LLMs) deployed on Alibaba Cloud's Platform for AI – Elastic Algorithm Service (PAI-EAS), AnalyticDB for PostgreSQL as the vector store, Gradio for the web UI, and Langchain for orchestration.
Ensure you have an Alibaba Cloud account. If you do not have one yet, sign up here.
Sign in with your Alibaba Cloud credentials, go to Alibaba Cloud -> Console -> Compute Nest, find the GenAI-LLM-RAG service, and press Official Use.
Set up the necessary parameters of the instance:
Deploy a pre-trained LLM on PAI-EAS:
1. The default username is admin. You can choose another username.
2. Create a strong password.
3. You can choose an existing VPC. To create a new VPC instead, activate the slider and enter the related information.
4. Then press Next: Confirm Order.
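Before pressing Next: Confirm Order, it can help to sanity-check the values you plan to enter. The sketch below is purely illustrative: the `InstanceParams` fields, the `vpc_id` value, and the password policy (minimum length plus mixed character classes) are assumptions made for this example and are not the rules enforced by the Compute Nest console.

```python
import re
from dataclasses import dataclass


@dataclass
class InstanceParams:
    """Hypothetical container for the values entered in the console form."""
    username: str = "admin"        # default username mentioned in the guide
    password: str = ""
    use_existing_vpc: bool = True  # False means "create a new VPC" (slider on)
    vpc_id: str = ""               # only needed when reusing an existing VPC


def password_problems(password, min_length=8):
    """Check a password against an ASSUMED policy, not Alibaba Cloud's own."""
    problems = []
    if len(password) < min_length:
        problems.append(f"shorter than {min_length} characters")
    if not re.search(r"[A-Z]", password):
        problems.append("no uppercase letter")
    if not re.search(r"[a-z]", password):
        problems.append("no lowercase letter")
    if not re.search(r"[0-9]", password):
        problems.append("no digit")
    if not re.search(r"[^A-Za-z0-9]", password):
        problems.append("no special character")
    return problems


def validate(params):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if not params.username:
        problems.append("username is empty")
    problems.extend(f"password: {p}" for p in password_problems(params.password))
    if params.use_existing_vpc and not params.vpc_id:
        problems.append("no existing VPC selected")
    return problems


if __name__ == "__main__":
    params = InstanceParams(password="Str0ng!Passw0rd", vpc_id="vpc-example")
    print(validate(params) or "parameters look complete")
```

The console performs its own validation, so treat this only as a local checklist.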
Create a web UI with Gradio:
After checking all the related information, accept the Terms of Service and press Create Now to deploy the service. Deployment takes a while, so wait until all the steps have finished.
Users can ask questions through the Gradio web UI, and the LLM will process and provide answers.
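In a RAG setup, the question typed into the web UI is usually combined with retrieved document passages before it reaches the LLM. The following is a minimal sketch of that prompt-assembly step. The function name `build_rag_prompt`, the prompt wording, and the `max_chunks` limit are assumptions for illustration; the deployed service handles this through Langchain and may format its prompts differently.

```python
def build_rag_prompt(question, retrieved_chunks, max_chunks=3):
    """Combine a user question with retrieved passages into one prompt.

    Hypothetical helper: only the first `max_chunks` passages are used,
    and each one is numbered so the model can tell them apart.
    """
    selected = retrieved_chunks[:max_chunks]
    context = "\n\n".join(
        f"[{i}] {chunk.strip()}" for i, chunk in enumerate(selected, start=1)
    )
    return (
        "Answer the question using only the context below. "
        "If the context is not sufficient, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question.strip()}\n"
        "Answer:"
    )


if __name__ == "__main__":
    chunks = [
        "AnalyticDB for PostgreSQL is used as the vector store.",
        "Gradio provides the web UI.",
    ]
    print(build_rag_prompt("Which component stores the vectors?", chunks))
```

The resulting string is what would be sent to the LLM endpoint. Limiting the number of passages keeps the prompt within the model's context window.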
Users can upload documents, which are converted into vectors and saved in the AnalyticDB for PostgreSQL vector store.
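To make the upload step more concrete, here is a toy sketch of what "converting documents into vectors" involves: split the text into chunks, embed each chunk, store the vectors, and later retrieve the closest chunks for a query. Everything here is an assumption for illustration. The hashing-based `embed` function stands in for a real embedding model, and `InMemoryVectorStore` stands in for the AnalyticDB for PostgreSQL table that the deployed service actually uses.

```python
import hashlib
import math
import re

DIM = 1024  # toy embedding size chosen for this sketch


def embed(text):
    """Toy bag-of-words hashing embedding (stand-in for a real model)."""
    vec = [0.0] * DIM
    for token in re.findall(r"[a-z0-9]+", text.lower()):
        digest = hashlib.md5(token.encode("utf-8")).digest()
        vec[int.from_bytes(digest[:4], "big") % DIM] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    # Normalise so that a dot product equals cosine similarity.
    return [v / norm for v in vec] if norm else vec


def split_into_chunks(document, max_words=50):
    """Split a document into consecutive chunks of at most max_words words."""
    words = document.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


class InMemoryVectorStore:
    """Hypothetical in-memory stand-in for the vector table."""

    def __init__(self):
        self.rows = []  # list of (chunk_text, vector) pairs

    def add_document(self, document, max_words=50):
        for chunk in split_into_chunks(document, max_words):
            self.rows.append((chunk, embed(chunk)))

    def search(self, query, k=1):
        """Return the k stored chunks most similar to the query."""
        q = embed(query)
        scored = [
            (sum(a * b for a, b in zip(q, vec)), chunk) for chunk, vec in self.rows
        ]
        scored.sort(key=lambda item: item[0], reverse=True)
        return [chunk for _, chunk in scored[:k]]


if __name__ == "__main__":
    store = InMemoryVectorStore()
    store.add_document("PAI-EAS hosts the large language model behind an inference endpoint.")
    store.add_document("AnalyticDB for PostgreSQL stores the document vectors.")
    print(store.search("Where are the document vectors stored?"))
```

A production vector store indexes the vectors so that similarity search stays fast at scale. The linear scan above is only meant to show the idea.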
Authorized users can access ECS to make changes or updates to the service.
For more detailed information, check the following documentation:
Additional tutorials:
By following this guide, you should be able to set up a functional RAG service on Compute Nest, leveraging the powerful features of PAI-EAS, AnalyticDB, Gradio, and Langchain.
Streamlined Deployment and Integration of Large Language Models with PAI-EAS
Deploy Your Own AI Chat Buddy - The Qwen Chat Model Deployment with Hugging Face Guide
Farruh - July 18, 2024