Container Compute Service (ACS) provides container computing resources that comply with the container specifications of Kubernetes. ACS provides serverless computing resources, which allow you to run containerized applications with high efficiency. This topic describes how to deploy and expose a containerized generative AI-powered chat application in the ACS console and by using an ACS cluster certificate. This topic also describes how to monitor the application.
Background information
In this topic, the following open source projects are used: RWKV-Runner and ChatGPT-Next-Web. RWKV-Runner is a 0.1-billion-parameter model that provides online inference by using RESTful APIs. ChatGPT-Next-Web is a web UI for chat applications. You can use images to deploy RWKV-Runner and ChatGPT-Next-Web in an ACS cluster to build a generative AI-powered chat application on top of an architecture that decouples the frontend from the backend. After you complete the steps in this topic, a generative AI-powered chat application is created.
For more information about the terms used in Kubernetes, see Course jointly developed by CNCF and Alibaba Cloud on cloud-native technologies.
Procedure
If this is the first time you use ACS, you must activate ACS and grant ACS the permissions to access cloud resources. Then, you can create an ACS cluster and deploy a generative AI-powered application in the cluster.
Step 1: Activate ACS and grant permissions
If this is the first time you use ACS, you must activate ACS and grant ACS the permissions to access cloud resources.
Log on to the ACS console and click Activate.
Go to the ACS activation page and follow the on-screen instructions to activate ACS.
Return to the ACS console and refresh the page. Click Authorize Now.
Go to the ACS authorization page and follow the on-screen instructions to grant permissions to ACS.
After you complete the preceding operations, refresh the ACS console. Then, you can get started with ACS.
Step 2: Create an ACS cluster
This step shows how to configure cluster parameters when you create an ACS cluster.
Log on to the ACS console. In the left-side navigation pane, click Clusters.
In the upper-left corner of the Clusters page, click Create Cluster.
On the Create Cluster page, set the parameters described in the following table.
Use default settings for parameters that are not listed in the table.
| Parameter | Description | Example |
| --- | --- | --- |
| Cluster Name | Enter a name for the cluster. | ACS-Demo |
| Region | Select a region to deploy the cluster. | China (Beijing) |
| VPC | ACS clusters can be deployed only in virtual private clouds (VPCs). You must specify a VPC in the same region as the cluster. Click Create VPC to create a VPC named vpc-acs-demo in the China (Beijing) region. For more information, see Create and manage a VPC. | vpc-acs-demo |
| vSwitch | Select vSwitches for nodes in the cluster to communicate with each other. Click Create vSwitch and create a vSwitch named vswitch-acs-demo in the vpc-acs-demo VPC. Then, select vswitch-acs-demo in the vSwitch list. For more information, see Create and manage a vSwitch. | vswitch-acs-demo |
| API Server Access Settings | Specify whether to expose the Kubernetes API server of the cluster to the Internet. If you want to manage the cluster over the Internet, you must expose the Kubernetes API server with an elastic IP address (EIP). | Select Expose API Server with EIP. |
| Service Discovery | Specify whether to enable service discovery for the cluster. To enable service discovery, select CoreDNS. | Select CoreDNS. |
Click Confirm Order, read and select Terms of Service, and then click Create Cluster.
Note: It takes approximately 10 minutes to create a cluster. After the cluster is created, you can view the cluster on the Clusters page.
Step 3: Deploy RWKV-Runner in the ACS cluster
This step shows how to deploy RWKV-Runner in the ACS cluster by creating a general-purpose Deployment and how to expose RWKV-Runner within the cluster by using RESTful APIs. For more information about the parameters used to create a Deployment, see Create a stateless application by using a Deployment.
Log on to the ACS console. On the Clusters page, click the name of the cluster you created, which is ACS-Demo in this example.
In the left-side navigation pane, go to the Deployments page.
On the Deployments page, click Create from Image.
On the Basic Information wizard page, set Name to rwkv-runner, select General-purpose for Instance type, select default for QoS Type, and then click Next.
On the Container wizard page, configure the container and click Next.
| Parameter | Description | Example |
| --- | --- | --- |
| Image Name | You can enter an untagged image address or click Select images to select the image that you want to use. | registry.cn-beijing.aliyuncs.com/acs-demo-ns/rwkv-runner |
| Select Image Tag | Click Select Image Version and select an image version. | 1.0.0 |
| CPU | Specify the number of vCPUs required by the application. | 1 Core |
| Memory | Specify the amount of memory required by the application. | 2 GiB |
| Port Number | Configure container ports. | Name: runner; Container Port: 8000; Protocol: TCP |
On the Advanced wizard page, click Create on the right side of Services.
In the Create Service dialog box, configure the following parameters and click Create to expose the rwkv-runner application within the cluster by using RESTful APIs.
| Parameter | Description | Example |
| --- | --- | --- |
| Application Name | Enter a name for the Service. | rwkv-runner-svc |
| Type | The type of Service. This parameter specifies how the Service is accessed. | Cluster IP |
| Port Mapping | Specify a Service port and a container port. The container port must be the same as the port that is exposed in the backend pod. | Name: runner; Service Port: 80; Container Port: 8000; Protocol: TCP |
In the lower-right corner of the Advanced wizard page, click Create.
After you create the application, you are directed to the Complete wizard page, which displays the objects of the application. Click View Details to view the details of the application.
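For readers who prefer kubectl, the console settings in this step correspond roughly to the manifest written by the following sketch. It is not the manifest that the console generates. The file name rwkv-runner.yaml, the `app: rwkv-runner` label, the single replica, and setting requests equal to limits are assumptions made for illustration, and the instance type and QoS type selected on the Basic Information wizard page are not expressed here. Applying the file requires kubectl access to the cluster, which is set up in Step 4.

```shell
# Write a Deployment and a ClusterIP Service that mirror the console settings above.
# Assumptions: the file name, the "app: rwkv-runner" label, replicas: 1,
# and requests == limits. Image, ports, CPU, and memory come from the tables.
cat > rwkv-runner.yaml <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: rwkv-runner
  labels:
    app: rwkv-runner
spec:
  replicas: 1
  selector:
    matchLabels:
      app: rwkv-runner
  template:
    metadata:
      labels:
        app: rwkv-runner
    spec:
      containers:
      - name: rwkv-runner
        image: registry.cn-beijing.aliyuncs.com/acs-demo-ns/rwkv-runner:1.0.0
        ports:
        - name: runner
          containerPort: 8000
          protocol: TCP
        resources:
          requests:
            cpu: "1"
            memory: 2Gi
          limits:
            cpu: "1"
            memory: 2Gi
---
apiVersion: v1
kind: Service
metadata:
  name: rwkv-runner-svc
spec:
  type: ClusterIP
  selector:
    app: rwkv-runner
  ports:
  - name: runner
    port: 80
    targetPort: 8000
    protocol: TCP
EOF

# With kubectl configured for the cluster, apply the file:
#   kubectl apply -f rwkv-runner.yaml
echo "wrote rwkv-runner.yaml"
```

Because the Service maps port 80 to container port 8000, other pods in the same namespace can reach the backend at http://rwkv-runner-svc without specifying a port.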
Step 4: Deploy ChatGPT-Next-Web by using the cluster certificate
This step shows how to use the cluster certificate to deploy ChatGPT-Next-Web in the cluster by creating a general-purpose Deployment and how to expose ChatGPT-Next-Web to the Internet. For more information about the parameters used to create a Deployment, see Create a stateless application by using a Deployment.
Log on to the ACS console. On the Clusters page, click the name of the cluster you created, which is ACS-Demo in this example.
On the Cluster Information page, click the Connection Information tab. Obtain the cluster certificate for Internet access and follow the on-screen instructions to save the certificate to your on-premises machine.
Create a file named chat-next-web.yaml and copy the following content to the file.
Run the following command to create the preceding resources in the cluster:
kubectl apply -f chat-next-web.yaml
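To give a sense of what a frontend manifest for this architecture could contain, the following sketch writes a hypothetical stand-in to a separate file named chat-next-web-sketch.yaml. It is not the chat-next-web.yaml used in this topic. Only the Service name chat-frontend-svc and the backend Service name rwkv-runner-svc come from this topic. The image name, the Deployment name and labels, container port 3000, Service port 80, and the `BASE_URL` and `OPENAI_API_KEY` environment variables are assumptions based on the upstream ChatGPT-Next-Web project, and the sketch assumes that both applications run in the same namespace.

```shell
# Hypothetical stand-in for the frontend manifest; NOT the original chat-next-web.yaml.
# Assumptions: image name, container port 3000, env var names, labels, Service port 80.
CHAT_IMAGE="${CHAT_IMAGE:-yidadaa/chatgpt-next-web:latest}"

cat > chat-next-web-sketch.yaml <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: chat-frontend
spec:
  replicas: 1
  selector:
    matchLabels:
      app: chat-frontend
  template:
    metadata:
      labels:
        app: chat-frontend
    spec:
      containers:
      - name: chat-frontend
        image: ${CHAT_IMAGE}
        ports:
        - containerPort: 3000
        env:
        - name: BASE_URL            # backend reached through the Service from Step 3
          value: "http://rwkv-runner-svc"
        - name: OPENAI_API_KEY      # placeholder; assumes the backend does not check it
          value: "none"
---
apiVersion: v1
kind: Service
metadata:
  name: chat-frontend-svc
spec:
  type: LoadBalancer
  selector:
    app: chat-frontend
  ports:
  - port: 80
    targetPort: 3000
EOF

echo "wrote chat-next-web-sketch.yaml"
```

The design point the sketch illustrates is the decoupling described in the background information: the frontend only needs the in-cluster address of the backend Service, while a LoadBalancer Service gives the frontend an external IP address.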
Step 5: Use the cluster certificate to create an initialization Job for the application
This step shows how to use the cluster certificate to create an initialization Job for the RWKV-Runner model. The QoS class of the pods created by the Job is BestEffort. For more information about the parameters used to create a Job, see Create a Job.
Create a file named rwkv-init-job.yaml and copy the following content to the file.
Run the following command to deploy the initialization Job:
kubectl apply -f rwkv-init-job.yaml
Run the following command to check whether the initialization Job is completed:
kubectl get pod
Expected output: the pod created by the Job shows Completed in the STATUS column.
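Instead of reading the STATUS column by eye, the check can be scripted. The following sketch defines a small helper, `job_pods_completed`, that reads the default `kubectl get pod` table on standard input and succeeds only if every pod whose name starts with a given prefix reports Completed. The helper name and the prefix `rwkv-init` in the usage comment are hypothetical; substitute the name of the Job defined in rwkv-init-job.yaml.

```shell
# Succeeds only if at least one pod whose name starts with the given prefix
# is listed and every such pod reports STATUS "Completed".
# Reads `kubectl get pod` output (NAME READY STATUS RESTARTS AGE) on stdin.
job_pods_completed() {
  awk -v prefix="$1" '
    NR > 1 && index($1, prefix) == 1 {
      found = 1
      if ($3 != "Completed") pending = 1
    }
    END { exit ((found && !pending) ? 0 : 1) }
  '
}

# Usage against the cluster (the prefix "rwkv-init" is a hypothetical Job name):
#   kubectl get pod | job_pods_completed rwkv-init && echo "initialization finished"
```

As an alternative that needs no parsing, `kubectl wait --for=condition=complete job/<job-name> --timeout=600s` blocks until the Job finishes or the timeout expires.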
Step 6: Test the application
This step shows how to access the application by using the Service.
Log on to the ACS console. On the Clusters page, click the name of the cluster you created, which is ACS-Demo in this example.
In the left-side navigation pane, go to the Services page.
On the Services page, find the Service you created, which is chat-frontend-svc in this example. Click the IP address in the External IP column to access the application.
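The same check can be made from a terminal. The following sketch defines two hypothetical helpers: `status_ok` decides whether an HTTP status code counts as success, and `http_ok` requests a URL with curl and applies that rule. The usage comment assumes that chat-frontend-svc listens on port 80 and reports its address in the `status.loadBalancer.ingress[0].ip` field; adjust both if the Service is configured differently.

```shell
# True for HTTP status codes from 200 to 399; false for anything else,
# including curl's "000" when no response was received.
status_ok() {
  [ "$1" -ge 200 ] 2>/dev/null && [ "$1" -lt 400 ]
}

# Request the URL, discard the body, and judge the status code.
http_ok() {
  status=$(curl -s -o /dev/null -w '%{http_code}' --max-time 10 "$1") || return 1
  status_ok "$status"
}

# Usage (assumes the Service port is 80):
#   EXTERNAL_IP=$(kubectl get svc chat-frontend-svc \
#     -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
#   http_ok "http://$EXTERNAL_IP/" && echo "frontend reachable"
```

A successful status code only shows that the frontend is serving pages. To confirm that the frontend can reach RWKV-Runner, send a message in the chat UI.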
Release resources
When you use an ACS cluster, you are charged the following fees:
The fee for the computing power used by the workloads in the cluster. The fee is charged by ACS.
The fees for other cloud resources used by the cluster. The fees are charged by Alibaba Cloud services based on their billing rules.
Take note of the following items after you create an ACS cluster.
If you no longer need to use the cluster, delete the cluster and relevant resources. For more information, see Delete an ACS cluster.
If you need to keep the cluster, top up your account once your account balance is less than CNY 100. For more information about the billing rules of Alibaba Cloud services used by ACS, see Billing.
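If you keep the cluster but no longer need the demo workloads, they can be removed with kubectl so that they stop consuming computing power. The following sketch prints the commands by default and runs them only when `DRY_RUN=0` is set. The `run` and `cleanup_demo_workloads` helpers are hypothetical names; the resource and file names are the ones used in this topic, and the sketch assumes that the two YAML files are in the current directory.

```shell
# Print each command by default; set DRY_RUN=0 to actually execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Remove the workloads created in Steps 3 to 5. This does not delete the cluster.
cleanup_demo_workloads() {
  run kubectl delete -f chat-next-web.yaml --ignore-not-found
  run kubectl delete -f rwkv-init-job.yaml --ignore-not-found
  run kubectl delete service rwkv-runner-svc --ignore-not-found
  run kubectl delete deployment rwkv-runner --ignore-not-found
}

cleanup_demo_workloads
```

Review the printed commands first, then rerun with `DRY_RUN=0` to apply them.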