Container Service for Kubernetes: Deploy a FastChat application for AIGC in an ACK Serverless cluster

Last Updated: Aug 28, 2023

This topic describes how to deploy a FastChat application to experience AI-generated content (AIGC) in an ACK Serverless cluster. You can use the Container Service for Kubernetes (ACK) console or kubectl to deploy the application, and then access FastChat through an external endpoint.

Prerequisites

An ACK Serverless cluster is created in the China (Beijing), China (Hangzhou), China (Shanghai), or China (Shenzhen) region and Internet access is enabled for the cluster. For more information, see Create an ACK Serverless cluster.

Introduction to FastChat

FastChat is an open platform for training, serving, and evaluating chatbots based on large language models. It is a distributed multi-model serving system built around large language models such as Vicuna and FastChat-T5, and it provides a web interface and OpenAI-compatible RESTful APIs.

Important
  • Alibaba Cloud does not guarantee the legitimacy, security, or accuracy of the third-party model FastChat. Alibaba Cloud shall not be held liable for any damages caused by the use of FastChat.

  • You must abide by the user agreements, usage specifications, and relevant laws and regulations of FastChat. You shall bear all consequences resulting from the legitimacy and compliance requirements of FastChat.

Step 1: Deploy the FastChat application

You can use the ACK console to deploy the FastChat application, or connect a kubectl client to the ACK Serverless cluster and then create a YAML file to deploy the application.

Use the ACK console

  1. Log on to the ACK console. In the left-side navigation pane, click Clusters.

  2. On the Clusters page, click the name of the cluster that you want to manage and choose Workloads > Deployments in the left-side navigation pane.

  3. On the Deployments page, click Create from Image.

  4. On the Basic Information wizard page, enter an application name, such as fastchat, add annotations, and then click Next.

  5. On the Container wizard page, configure the General, Health Check, and Lifecycle settings, and then click Next.

    Category      Parameter      Example
    General       Image Name     yunqi-registry.cn-shanghai.cr.aliyuncs.com/lab/fastchat
    General       Image Version  v1.1.0
    Health Check  Readiness      Enable the readiness check, use TCP connections, and set the pod port to 7860.
    Lifecycle     Command        Set the container startup command to ["sh","-c","/root/webui.sh"].

  6. On the Advanced wizard page, click Create to the right of Services.

  7. In the Create Service dialog box, configure the Service parameters and click Create to use the Service to expose the FastChat application.

    Parameter       Example
    Name            fastchat-svc
    Type            Select Server Load Balancer, Public Access, and Create SLB Instance.
    Service Port    7860
    Container Port  7860

  8. In the Labels and Annotations section, add the pod annotations in the following table and click Create in the lower-right part of the page.

    Name                                        Value
    k8s.aliyun.com/eci-use-specs                ecs.gn6i-c8g1.2xlarge,ecs.gn5-c8g1.2xlarge,ecs.gn6v-c8g1.8xlarge,ecs.gn6i-c16g1.4xlarge
    k8s.aliyun.com/eci-extra-ephemeral-storage  100Gi

    If the console reports that the application is created, the deployment is complete.

Use kubectl

  1. Connect a kubectl client to the ACK Serverless cluster. For more information, see Obtain the kubeconfig file of a cluster and use kubectl to connect to the cluster.

  2. Create a file named fastchat.yaml and copy the following content to the file:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      labels:
        app: fastchat
      name: fastchat
      namespace: default
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: fastchat
      template:
        metadata:
          labels:
            app: fastchat
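            alibabacloud.com/eci: "true"
          annotations:
            # Candidate GPU-accelerated instance types for the pod.
            k8s.aliyun.com/eci-use-specs: ecs.gn6i-c8g1.2xlarge,ecs.gn5-c8g1.2xlarge,ecs.gn6v-c8g1.8xlarge,ecs.gn6i-c16g1.4xlarge
            # Additional ephemeral storage requested for the pod.
            k8s.aliyun.com/eci-extra-ephemeral-storage: 100Gi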
        spec:
          dnsPolicy: Default
          containers:
          - command:
            - sh
            - -c 
            - "/root/webui.sh"
            image: yunqi-registry.cn-shanghai.cr.aliyuncs.com/lab/fastchat:v1.1.0
            imagePullPolicy: IfNotPresent
            name: fastchat
            ports:
            - containerPort: 7860
              protocol: TCP
            readinessProbe:
              failureThreshold: 3
              initialDelaySeconds: 5
              periodSeconds: 10
              successThreshold: 1
              tcpSocket:
                port: 7860
              timeoutSeconds: 1
            resources:
              requests:
                cpu: "8"
                memory: 16Gi
              limits:
                nvidia.com/gpu: 1
    ---
    apiVersion: v1
    kind: Service
    metadata:
      annotations:
        service.beta.kubernetes.io/alibaba-cloud-loadbalancer-address-type: internet
        service.beta.kubernetes.io/alibaba-cloud-loadbalancer-instance-charge-type: PayByCLCU
      name: fastchat-svc
      namespace: default
    spec:
      externalTrafficPolicy: Local
      ports:
      - port: 7860
        protocol: TCP
        targetPort: 7860
      selector:
        app: fastchat
      type: LoadBalancer
  3. Run the following command to deploy the FastChat application:

    kubectl apply -f fastchat.yaml
  4. Run the following command to query the status of the application:

    kubectl get deployment fastchat

    Expected output:

    NAME       READY   UP-TO-DATE   AVAILABLE   AGE
    fastchat   1/1     1            1           38m
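
The image is large and the pod needs a GPU-accelerated instance, so the Deployment may take a while to become ready. The following is a minimal sketch of two helper functions that wait for the rollout and list the pods. The function names and the default timeout of 15m are assumptions for illustration and are not part of the original procedure. The Deployment name, namespace, and label come from fastchat.yaml.

```shell
# Wait until the fastchat Deployment finishes rolling out.
# $1 is an optional timeout; the default of 15m is an assumed value.
wait_for_fastchat() {
  kubectl rollout status deployment/fastchat --namespace default --timeout="${1:-15m}"
}

# List the pods that belong to the application, including pod IP addresses.
show_fastchat_pods() {
  kubectl get pods --namespace default --selector app=fastchat --output wide
}
```

For example, run wait_for_fastchat 20m and then show_fastchat_pods. If the rollout does not complete in time, kubectl describe pod on the listed pod shows the scheduling and image pull events.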

Step 2: Access the Service

  1. Log on to the ACK console. In the left-side navigation pane, click Clusters.

  2. On the Clusters page, click the name of the cluster that you want to manage and choose Network > Services in the left-side navigation pane.

  3. In the Service list, find the Service named fastchat-svc and click its endpoint in the External Endpoint column, such as 47.107.XX.XX:7860.

    You can then access the FastChat application and experience AIGC.
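
If a kubectl client is connected to the cluster, the external endpoint can also be read from the command line instead of the console. The following sketch assumes that the load balancer publishes an IP address in the status of fastchat-svc and that the web interface answers plain HTTP on port 7860. The function names are hypothetical.

```shell
# Print the external endpoint of fastchat-svc as IP:PORT.
# Returns a non-zero status if no external IP address is assigned yet.
fastchat_endpoint() {
  ip="$(kubectl get service fastchat-svc --namespace default \
    --output jsonpath='{.status.loadBalancer.ingress[0].ip}')" || return 1
  if [ -z "$ip" ]; then
    echo "fastchat-svc has no external IP address yet" >&2
    return 1
  fi
  echo "${ip}:7860"
}

# Send one HTTP request to the web interface and fail on an error response.
check_fastchat() {
  endpoint="$(fastchat_endpoint)" || return 1
  curl --silent --show-error --fail --max-time 10 --output /dev/null "http://${endpoint}/"
}
```

Running fastchat_endpoint prints an address that you can open in a browser. check_fastchat returns zero only if the endpoint responds successfully.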

Step 3: Release resources

To avoid incurring unexpected fees, delete the resources at your earliest convenience after you use the application.

Delete the application and Service

  1. On the Clusters page of the ACK console, click the name of your cluster. In the left-side navigation pane, choose Workloads > Deployments, find the FastChat application, and then choose More > Delete in the Actions column.

  2. In the Confirm dialog box, select Delete Associated Service fastchat-svc and click OK.
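
If you deployed the application with kubectl, the same cleanup can be sketched on the command line. The function name below is hypothetical. The resource names and namespace are the ones used in fastchat.yaml.

```shell
# Delete the fastchat Deployment and then the fastchat-svc Service.
# --ignore-not-found keeps the command from failing if a resource is already gone.
delete_fastchat() {
  kubectl delete deployment fastchat --namespace default --ignore-not-found &&
    kubectl delete service fastchat-svc --namespace default --ignore-not-found
}
```

Alternatively, kubectl delete -f fastchat.yaml removes both resources that the file defines.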

Delete a cluster

ACK Serverless clusters are in public preview and can be used free of charge. If your ACK Serverless clusters use other Alibaba Cloud services, those services charge you separately based on their own billing rules. After you complete the configuration, you can manage the cluster in one of the following ways:

  • If you no longer need the cluster, log on to the ACK console. On the Clusters page, choose More > Delete in the Actions column of the cluster to delete the cluster. In the Delete Cluster dialog box, select Delete ALB Instances Created by the Cluster, Delete Alibaba Cloud DNS PrivateZone instances Created by the Cluster, and I understand the above information and want to delete the specified cluster, and then click OK. For more information about how to delete an ACK Serverless cluster, see Delete a cluster.

  • Continue to use the cluster. For more information about the billing rules of other cloud services that may be used by ACK Serverless Pro clusters, see Cloud service fee.

Contact Us

If you have any questions about enabling AIGC services for ACK, join the DingTalk group 31850017754.