
AnalyticDB: Create a dedicated ChatBot using Compute Nest

Last Updated: Feb 03, 2026

A ChatBot is an intelligent conversational system that uses natural language to communicate with people. You can use a ChatBot to build an intelligent customer service system or create an enterprise knowledge base for AI chat. This topic describes how to create a dedicated ChatBot using Compute Nest and an AnalyticDB for PostgreSQL instance.

Billing

When you create the One-stop Enterprise-specific Chatbot Community Edition (Large Language Model + Vector Database) service, the system automatically creates an ECS instance and an AnalyticDB for PostgreSQL instance in elastic storage mode. You are charged for these resources. For more information, see the billing documentation for ECS and AnalyticDB for PostgreSQL.

Benefits

  • Multi-model support: The service supports models such as Qwen-7b, ChatGLM-6b, Llama2-7b, and Llama2-13b. You can switch models after the service is created.

  • GPU cluster management: During testing, you can use GPU-accelerated instances with lower specifications. As your business grows, you can configure dynamic GPU cluster management based on resource usage to reduce GPU overhead.

  • Fine-grained permission controls based on the full database capabilities of AnalyticDB for PostgreSQL: You can customize permission queries using the open source code. The service also supports knowledge base management APIs for AnalyticDB for PostgreSQL, which provides greater flexibility for calling services.

  • API and WebUI availability: The service provides APIs and a WebUI to help you quickly and flexibly integrate the AIGC backend with your applications.

  • Data security: All data, algorithms, and GPU resources are isolated within your environment. This prevents data from leaving your environment and ensures that your core enterprise data is protected from leakage.

RAM user authorization

If you use a RAM user to perform the operations in this topic, you must grant the required permissions to the RAM user in advance. For more information about the RAM permissions that Compute Nest requires and how to grant them, see Grant permissions to a RAM user.


Create a service instance

This topic uses GenAI-LLM-RAG as an example.

  1. Go to the Create Service Instance page. In the Quick Trial section, click GenAI-LLM-RAG.

  2. On the Create Service Instance page, configure the following parameters.

    • Service Instance Name: The name of the service instance. The system generates a random name by default. We recommend that you specify a name that is easy to identify.

    • Region: The region where the service instance, ECS instance, and AnalyticDB for PostgreSQL instance reside.

    • Billing Method Configuration

      • Billing Method: Select Pay-As-You-Go or Subscription as needed. This topic uses pay-as-you-go as an example.

    • ECS Configuration

      • Instance Type: The instance type (specifications) of the ECS instance.

      • Instance Password: The logon password for the ECS instance.

      • Whitelist Settings: The whitelist for the ECS instance. We recommend that you add the IP addresses of the servers that need to access the large language model (LLM) to the whitelist.

    • PAI-EAS Model Configuration

      • Select Large Model: Select a pre-configured large language model. This topic uses llama2-7b as an example.

      • PAI Instance Type: The GPU specifications for the PAI service. Specifications that are out of stock cannot be selected.

    • AnalyticDB PostgreSQL

      • Instance Type: The node specifications of the AnalyticDB for PostgreSQL instance.

      • Segment Storage Size: The storage space of each compute node of the AnalyticDB for PostgreSQL instance, in GB.

      • Database Account Name: The initial account name for the AnalyticDB for PostgreSQL instance.

      • Database Password: The password for the initial account of the AnalyticDB for PostgreSQL instance.

    • Application Configuration

      • Software Logon Name: The logon name for the LLM software, which is used to log on to the Langchain web service.

      • Software Logon Password: The logon password for the LLM software.

    • Zone Configuration

      • vSwitch Zone: The zone where the service instance resides.

    • Network Configuration

      • Create a New VPC: Specify whether to create a new VPC or use an existing one. This topic uses a new VPC as an example.

      • VPC IPv4 CIDR Block: The IPv4 CIDR block of the VPC.

      • vSwitch Subnet CIDR Block: The CIDR block of the vSwitch.

    • Tags and Resource Groups

      • Tag: The tag to attach to the service instance.

      • Resource Group: The resource group to which the service instance belongs. For more information, see What is Resource Management?.

  3. Click Next: Confirm Order.

  4. Review the information in the Dependency Check, Service Instance Information, and Price Preview sections before you proceed.

    Note

    If a role permission is disabled in the Dependency Check section, click Enable Now on the right. After the permission is enabled, click the refresh button in this section.

  5. Click Create Now.

  6. Click View Service.

It takes about 10 minutes to create the service instance. You can view its status on the Service Instance page. The service instance is created when its status changes to Deployed.

Use the ChatBot

Before you use the ChatBot, you must upload files that contain your Q&A data to the knowledge base. The following steps describe how to upload files and use the ChatBot.

  1. On the Service Instance page of the Compute Nest console, click the ID of the target service instance to open the service instance details page.

  2. On the service instance details page, in the Use Now section, click the link to the right of Endpoint.

  3. In the Log On dialog box, enter the Software Logon Name and Software Logon Password that you set when you created the service instance, and then click Log On.

  4. In the upper-right corner of the page, in the Please select a usage mode section, select Knowledge Base Q&A.

  5. In the Configure Knowledge Base section on the right side of the page, under Please select a knowledge base to load, select Create Knowledge Base, enter a name for the new knowledge base, and click Add to Knowledge Base Options.

  6. Set Sentence Length Limit for Text Storage based on your requirements. The recommended value is 500.

  7. Add files to the new knowledge base.

    • The supported upload methods are Upload File, Upload File and URL, and Upload Folder.

    • Supported file formats include PDF, Markdown, TXT, and Word.

    • To delete a file, go to the Delete File interface.

  8. After the upload is complete, you can ask a question in the lower-left corner of the page and click Submit to obtain an answer.
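
The Sentence Length Limit in step 6 controls how uploaded documents are split into chunks before they are embedded and stored in the vector database. The following is a minimal, hypothetical sketch of length-limited chunking; the splitting rule shown here is an assumption for illustration, and the actual segmentation logic of the service may differ.

```python
import re

# Hypothetical sketch of length-limited chunking, similar in spirit to
# the Sentence Length Limit setting. The service's real segmentation
# logic may differ; this only illustrates the idea.
def chunk_text(text, limit=500):
    """Split text into chunks of at most `limit` characters,
    preferring to break at sentence boundaries."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        if current and len(current) + 1 + len(sentence) > limit:
            chunks.append(current)
            current = sentence
        else:
            current = f"{current} {sentence}".strip()
    if current:
        chunks.append(current)
    return chunks

print(chunk_text("First sentence. Second sentence. Third one here.", limit=20))
```

A smaller limit yields more, shorter chunks, which makes retrieval more precise but loses cross-sentence context; the recommended value of 500 is a balance between the two.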

Resource Management

View resources associated with the service instance

  1. On the Service Instance page in the Compute Nest console, click the ID of the target service instance to open the service instance details page.

  2. Click the Resources tab.

AnalyticDB for PostgreSQL resource management

On the Resources page, find the resource whose Product is AnalyticDB for PostgreSQL and click its Resource ID to open the AnalyticDB for PostgreSQL instance management page.

For more information about vector analysis on AnalyticDB for PostgreSQL instances, see the vector analysis documentation for AnalyticDB for PostgreSQL.

If you need additional storage and compute resources, see the instance scaling documentation for AnalyticDB for PostgreSQL to learn how to manage your instance.

View knowledge base data on an AnalyticDB for PostgreSQL instance

  1. On the AnalyticDB for PostgreSQL instance management page, click Log On to Database in the upper-right corner. For more information, see Use DMS to log on to a database.

    Note

    The database account and password are the Database Account Name and Database Password that you specified when you created the service instance.

  2. After you log on, in the Logged-in Instances list on the left, find the target AnalyticDB for PostgreSQL instance and double-click the public schema under the chatglmuser database.

    • The list of knowledge bases is stored in the langchain_collections table.

    • The data for a single knowledge base, including the enterprise knowledge chunks created after a document is uploaded, is stored in a table named after that knowledge base. This includes information such as embeddings, chunks, file metadata, and original file names.

For more information about how to use DMS, see What is Data Management (DMS).
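
Based on the table layout described above, the following sketch builds the SQL statements you might run in DMS against the chatglmuser database. Only the langchain_collections table and the one-table-per-knowledge-base layout come from this topic; the quoting helper and the LIMIT clause are illustrative assumptions.

```python
# Hedged sketch: composing SQL for the knowledge base tables described
# in this topic. Run the resulting statements in DMS or any PostgreSQL
# client connected to the chatglmuser database.
def list_collections_sql():
    # The list of knowledge bases is stored in langchain_collections.
    return "SELECT * FROM public.langchain_collections;"

def collection_preview_sql(knowledge_base, limit=10):
    # Each knowledge base is stored in a table named after it, holding
    # embeddings, chunks, and file metadata. Naive double-quote escaping
    # for the identifier; do not use this for untrusted input.
    safe = knowledge_base.replace('"', '""')
    return f'SELECT * FROM public."{safe}" LIMIT {int(limit)};'

print(list_collections_sql())
print(collection_preview_sql("my_kb"))
```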

PAI-EAS resource management

Enable auto scaling

PAI-EAS provides a wide range of serverless elastic scaling features, including horizontal auto-scaling, scheduled auto-scaling, and elastic resource pools. If your business workload has significant peaks and troughs, you can enable horizontal auto-scaling to prevent resource waste. After you enable this feature, the service automatically adjusts the number of instances to dynamically manage the computing resources of your online service. This ensures business stability and improves resource utilization.

  1. On the Resources page of the service instance, find the resource whose Product is Platform for AI (PAI) and click the Resource ID to open the Service Details page of Platform for AI.

  2. Click the Auto Scaling tab.

  3. In the Elastic Scaling section, click Enable Auto Scaling.

  4. In the Auto Scaling Settings dialog box, configure Minimum Instances, Maximum Instances, and Scaling Metric.

    • If your service has a low request volume and you want it to start on demand and stop when idle, set Minimum Instances to 0, Maximum Instances to 1, and Scaling Metric to QPS-based Scaling Threshold Per Instance. Set the value of QPS-based Scaling Threshold Per Instance to 1. With this configuration, the service automatically stops when there are no service requests and restarts when new requests are received.

    • If your business handles a large daily volume and experiences frequent workload fluctuations, set Minimum Instances to 5, Maximum Instances to 50, and Scaling Metric to QPS-based Scaling Threshold Per Instance. Set the value of QPS-based Scaling Threshold Per Instance to 2. In this scenario, the service automatically scales between 5 and 50 instances based on your business requests.

  5. Click Enable.
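
The two configurations above can be sketched as arithmetic: the target instance count follows total QPS divided by the per-instance QPS threshold, clamped to the minimum and maximum instance counts. The exact PAI-EAS scaling algorithm may differ; this only illustrates how the parameters interact.

```python
import math

# Sketch of the horizontal auto-scaling rule implied by the settings
# above (an assumption, not the exact PAI-EAS algorithm).
def target_instances(total_qps, qps_per_instance, min_instances, max_instances):
    desired = math.ceil(total_qps / qps_per_instance) if total_qps > 0 else 0
    return max(min_instances, min(max_instances, desired))

# On-demand start/stop configuration (min=0, max=1, threshold=1):
print(target_instances(0, 1, 0, 1))    # no requests -> 0 instances (service stops)
print(target_instances(3, 1, 0, 1))    # any traffic -> capped at 1 instance

# High-volume configuration (min=5, max=50, threshold=2):
print(target_instances(40, 2, 5, 50))  # 40 QPS / 2 per instance -> 20 instances
```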

Replace the open source LLM

  1. On the Resources page of the service instance, find the resource whose Product is Platform for AI (PAI) and click its Resource ID to open the Service Details page of Platform for AI.

  2. Click Update Service in the upper-right corner of the page.

  3. On the deployment page, modify the Run Command and the GPU Instance Type. Leave the other parameters at their default values.

    The following list shows the run command and the recommended GPU instance type for each model.

    • llama2-13b
      Run command: python api/api_server.py --port=8000 --model-path=meta-llama/Llama-2-13b-chat-hf --precision=fp16
      Recommended instance type: V100 (gn6e)

    • llama2-7b
      Run command: python api/api_server.py --port=8000 --model-path=meta-llama/Llama-2-7b-chat-hf
      Recommended instance type: GU30, A10

    • chatglm2-6b
      Run command: python api/api_server.py --port=8000 --model-path=THUDM/chatglm2-6b
      Recommended instance type: GU30, A10

    • Qwen-7b
      Run command: python api/api_server.py --port=8000 --model-path=Qwen/Qwen-7B-Chat
      Recommended instance type: GU30, A10

  4. Click Deploy.

  5. In the Deploy Service dialog box, click OK.

FAQ

  • Q: How do I use the vector search APIs?

    A: For more information, see the Java API documentation for vector search.

  • Q: How do I check the deployment progress of a service instance?

    A: After you create a service instance, the Compute Nest service instance is created in about 10 minutes. This process includes initializing the ECS instance and the AnalyticDB for PostgreSQL vector database. The large language model (LLM) is downloaded asynchronously. This process takes about 30 to 60 minutes. To check the model download progress, log on to the ECS instance and view the download logs. After the LLM is downloaded, you can log on to the web UI to access the Chatbot application.

  • Q: After I create a Compute Nest service, how do I log on to the ECS instance?

    A: On the Resources tab of the service instance, find the resource whose Resource Type is ECS instance, and click the Resource ID of the target resource. On the basic information page of the ECS instance, click Remote Connection. For more information about connection methods, see Connect to an instance.

  • Q: How do I restart the Langchain service?

    A: Log on to the ECS instance and run the following command to restart the service:

    systemctl restart langchain-chatglm
  • Q: How do I query Langchain logs?

    A: Log on to the ECS instance and run the following command to view the logs:

    journalctl -ef -u langchain-chatglm
  • Q: Why does the model fail to load after service deployment?

    A: After you enable the service, the system downloads the LLM from Hugging Face to the ECS instance. The download can take a long time (30 to 60 minutes), especially in China regions. After the download is complete, log on to the web UI and try to reload the model.

  • Q: How do I view the details of the deployment code?

    A: For more information, see the langchain-ChatGLM documentation.

  • Q: How do I request backend support from the product team?

    A: You can subscribe to the One-stop Enterprise-specific Chatbot Managed Service to obtain support.

  • Q: Why do I see a blank page when I access the service?

    A: This is a Compute Nest service for the Alibaba Cloud China Website (www.aliyun.com). This issue can occur if you are using a proxy to access the service from a region outside China. To resolve this, disable the proxy before you create and access the service.

  • Q: Where is Langchain deployed on the ECS instance?

    A: Langchain is deployed in the /home/admin/langchain-ChatGLM path.

  • Q: How do I enable the Langchain API?

    A: Run the following commands on the ECS instance to enable the API:

    # Create a systemd unit file for langchain-chatglm-api.
    cp /lib/systemd/system/langchain-chatglm.service /lib/systemd/system/langchain-chatglm-api.service

    # Edit /lib/systemd/system/langchain-chatglm-api.service and change the ExecStart line.
    # For PAI-EAS:
    #   ExecStart=/usr/bin/python3.9 /home/langchain/langchain-ChatGLM/api.py
    # For a single GPU-accelerated instance:
    #   ExecStart=/usr/bin/python3.9 /home/admin/langchain-ChatGLM/api.py

    # Reload systemd and start the API.
    systemctl daemon-reload
    systemctl restart langchain-chatglm-api

    # The API has started successfully when the log shows:
    #   INFO:     Uvicorn running on http://0.0.0.0:7861 (Press CTRL+C to quit)

    # List all APIs:
    curl http://0.0.0.0:7861/openapi.json
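
The curl command above returns the service's OpenAPI schema, which declares every available endpoint. The following sketch shows one way to list those endpoint paths in Python; the sample schema and its path names are made up for illustration, and the host in the commented call is a placeholder for your ECS instance address.

```python
import json
from urllib.request import urlopen

# List the endpoint paths declared in an OpenAPI schema dict.
def list_api_paths(schema):
    return sorted(schema.get("paths", {}).keys())

# Example with a minimal schema of the shape openapi.json returns.
# The path names here are hypothetical.
sample = {"openapi": "3.0.2", "paths": {"/chat": {}, "/knowledge_base/upload": {}}}
print(list_api_paths(sample))  # ['/chat', '/knowledge_base/upload']

# Against a live instance (placeholder host):
# schema = json.load(urlopen("http://0.0.0.0:7861/openapi.json"))
# print(list_api_paths(schema))
```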