
Container Service for Kubernetes: Use HPA to achieve application auto scaling based on QPS

Last Updated: Jun 28, 2024

If your application needs to dynamically adjust the total amount of computing resources based on the number of requests received per unit of time, you can use the QPS data collected by an Application Load Balancer (ALB) instance to set up auto scaling for the pods of the application.

Before you start

Before you start, we recommend that you read Get started with ALB Ingresses to learn about the basic features of ALB ingresses.

How it works

Queries per second (QPS) is the number of requests received per second. ALB instances can record client access data through Simple Log Service (SLS). Horizontal Pod Autoscaler (HPA) can monitor the QPS data of the service based on these access records and scale the corresponding workloads (such as Deployment and StatefulSet).
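
HPA does not read the QPS value from Simple Log Service directly. A metrics adapter component in the cluster exposes the value through the Kubernetes external metrics API, and HPA queries that API. As a minimal check, assuming such an adapter is already installed in your cluster, you can verify that the external metrics API is served:

  kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1"

If the command returns an APIResourceList instead of an error, HPA can query external metrics such as sls_alb_ingress_qps.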

Prerequisites

An ACK cluster is created, the ALB Ingress controller is installed in the cluster, and the cluster is associated with a Simple Log Service project. You can connect to the cluster by using kubectl. In addition, a component that exposes Simple Log Service metrics through the Kubernetes external metrics API is installed so that HPA can read the QPS data.

Step 1: Create an AlbConfig and associate an SLS project

  1. Check the SLS project associated with the cluster.

    1. Log on to the ACK console. In the left-side navigation pane, click Clusters.

    2. On the Clusters page, find the cluster that you want to manage and click its name. In the left-side pane, click Cluster Information.

    3. On the Cluster Resources tab, find the Log Service Project resource and record the name of the SLS project on the right.

  2. Create an AlbConfig.

    1. Create a file named alb-qps.yaml, copy the following content into it, and specify the SLS project information that you recorded in the accessLogConfig field.

      apiVersion: alibabacloud.com/v1
      kind: AlbConfig
      metadata:
        name: alb-qps
      spec:
        config:
          name: alb-qps
          addressType: Internet
          zoneMappings:
          - vSwitchId: vsw-uf6ccg2a9g71hx8go**** # ID of the virtual switch
          - vSwitchId: vsw-uf6nun9tql5t8nh15****
          accessLogConfig:
            logProject: <LOG_PROJECT> # Name of the log project associated with the cluster
            logStore: <LOG_STORE> # Custom logstore name, must start with "alb_"
        listeners:
          - port: 80
            protocol: HTTP

      The following table describes the fields in the preceding code block:

      Field        Type     Default   Description
      logStore     string   ""        The name of the Simple Log Service Logstore. The name must start with "alb_".
      logProject   string   ""        The name of the Simple Log Service project that is associated with the cluster.

    2. Run the following command to create the AlbConfig:

       kubectl apply -f alb-qps.yaml

      Expected output:

      albconfig.alibabacloud.com/alb-qps created
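
      It can take a while for the ALB instance to be provisioned. As an optional check, you can query the AlbConfig that you created; the exact output columns depend on the version of the ALB Ingress controller:

       kubectl get albconfig alb-qps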

Step 2: Create sample resources

In addition to the AlbConfig, an ALB Ingress requires four resources to work as expected: a Deployment, a Service, an IngressClass, and an Ingress. You can use the following example to quickly create these resources.

  1. Create the qps-quickstart.yaml file with the following content:

    apiVersion: networking.k8s.io/v1
    kind: IngressClass
    metadata:
      name: qps-ingressclass
    spec:
      controller: ingress.k8s.alibabacloud/alb
      parameters:
        apiGroup: alibabacloud.com
        kind: AlbConfig
        name: alb-qps # Same as the name of AlbConfig
    ---
    apiVersion: networking.k8s.io/v1
    kind: Ingress
    metadata:
      name: qps-ingress
    spec:
      ingressClassName: qps-ingressclass # Same as the name of Ingress Class
      rules:
       - host: demo.alb.ingress.top # Replace with your domain name
         http:
          paths:
          - path: /qps
            pathType: Prefix
            backend:
              service:
                name: qps-svc
                port:
                  number: 80
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: qps-svc
      namespace: default
    spec:
      ports:
        - port: 80
          protocol: TCP
          targetPort: 80
      selector:
        app: qps-deploy
      type: NodePort
    ---
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: qps-deploy
      labels:
        app: qps-deploy
    spec:
      replicas: 2
      selector:
        matchLabels:
          app: qps-deploy
      template:
        metadata:
          labels:
            app: qps-deploy
        spec:
          containers:
          - name: qps-container
            image: nginx:1.7.9
            ports:
            - containerPort: 80
  2. Run the following command to create the sample resources:

    kubectl apply -f qps-quickstart.yaml
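
  3. (Optional) Run the following commands to confirm that the resources are created and that the two pods of qps-deploy are running:

    kubectl get deployment qps-deploy
    kubectl get service qps-svc
    kubectl get ingress qps-ingress
    kubectl get pods -l app=qps-deploy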

Step 3: Create an HPA

  1. Create the qps-hpa.yaml file and copy the following content into it. Replace <LOG_PROJECT> and <LOG_STORE> with the SLS project and Logstore that you specified in the AlbConfig:

    apiVersion: autoscaling/v2
    kind: HorizontalPodAutoscaler
    metadata:
      name: qps-hpa
    spec:
      scaleTargetRef:
        apiVersion: apps/v1
        kind: Deployment
        name: qps-deploy # Name of the workload controlled by HPA
      minReplicas: 2 # Minimum number of pods
      maxReplicas: 10 # Maximum number of pods
      metrics:
        - type: External
          external:
            metric:
              name: sls_alb_ingress_qps # Metric name for QPS data, do not modify
              selector:
                matchLabels:
                  sls.project: <LOG_PROJECT> # Name of the log project associated with the cluster
                  sls.logstore: <LOG_STORE> # Custom logstore name
                  sls.ingress.route: default-qps-svc-80 # Path of the service, parameter format is <namespace>-<svc>-<port>
            target:
              type: AverageValue
              averageValue: 2 # Expected target for the metric, in this example, the average QPS for all pods is 2

    The following table describes the fields in the preceding code block:

    Field                   Description
    scaleTargetRef          The workload to be scaled by HPA. This example uses the Deployment named qps-deploy created in Step 2.
    minReplicas             The minimum number of pods that the Deployment can be scaled in to. Set this value to an integer greater than or equal to 1.
    maxReplicas             The maximum number of pods that the Deployment can be scaled out to. This value must be greater than minReplicas.
    external.metric.name    The QPS-based metric for HPA. Do not modify the value.
    sls.project             The SLS project from which the metric is collected. Set the value to the SLS project specified in the AlbConfig.
    sls.logstore            The Logstore from which the metric is collected. Set the value to the Logstore specified in the AlbConfig.
    sls.ingress.route       The path of the Service in the <namespace>-<svc>-<port> format. This example uses the qps-svc Service created in Step 2, so the value is default-qps-svc-80.
    external.target         The target value for the metric. In this example, the expected average QPS per pod is 2. HPA adjusts the number of pods to keep the average QPS close to this value. For a calculation example, see the sketch at the end of this step.

  2. Run the following command to create the HPA:

    kubectl apply -f qps-hpa.yaml
  3. Run the following command to check the HPA deployment status.

    kubectl get hpa

    Expected output:

    NAME      REFERENCE               TARGETS     MINPODS   MAXPODS   REPLICAS   AGE
    qps-hpa   Deployment/qps-deploy   0/2 (avg)   2         10        2          5m41s
  4. Run the following command to view HPA configuration information.

    kubectl describe hpa qps-hpa

    Expected output:

    Name:                                            qps-hpa
    Namespace:                                       default
    Labels:                                          <none>
    Annotations:                                     <none>
    CreationTimestamp:                               ******** # Timestamp of HPA, can be ignored
    Reference:                                       Deployment/qps-deploy
    Metrics:                                         ( current / target )
      "sls_alb_ingress_qps" (target average value):  0 / 2
    Min replicas:                                    2
    Max replicas:                                    10
    Deployment pods:                                 2 current / 2 desired
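
As a rough illustration of how HPA uses the target value (assuming the standard HPA algorithm for averaged external metrics), the desired number of pods is derived from the total metric value and the per-pod target:

  desiredReplicas = ceil(totalQPS / targetAverageQPS)
  # Example: 20 QPS in total with a target of 2 QPS per pod gives ceil(20 / 2) = 10 pods,
  # bounded by minReplicas (2) and maxReplicas (10).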

(Optional) Step 4: Verify auto scaling

  1. Verify application scale-out.

    1. Run the following command to view Ingress information.

      kubectl get ingress

      Expected output:

      NAME            CLASS                HOSTS                  ADDRESS                         PORTS     AGE
      qps-ingress     qps-ingressclass     demo.alb.ingress.top   alb-********.alb.aliyuncs.com   80        10m31s

      Record the values of HOSTS and ADDRESS for use in subsequent steps.

    2. Run the following command to perform stress testing on the application. Replace demo.alb.ingress.top and alb-********.alb.aliyuncs.com with the HOSTS and ADDRESS values obtained in the previous step. To watch the number of replicas change while the test runs, you can use the watch commands shown at the end of this topic.

      ab -r -c 5 -n 10000 -H Host:demo.alb.ingress.top http://alb-********.alb.aliyuncs.com/qps
    3. Run the following command to check the scaling status of the application.

      kubectl get hpa

      Expected output:

      NAME      REFERENCE               TARGETS          MINPODS   MAXPODS   REPLICAS   AGE
      qps-hpa   Deployment/qps-deploy   14375m/2 (avg)   2         10        10         15m

      The result shows that REPLICAS is 10, indicating that as the QPS increases, the application is scaled out to 10 pods, which is the maxReplicas limit.

  2. Verify application scale-in.

    After the stress testing is complete, run the following command to check the scaling status of the application.

    kubectl get hpa

    Expected output:

    NAME      REFERENCE               TARGETS     MINPODS   MAXPODS   REPLICAS   AGE
    qps-hpa   Deployment/qps-deploy   0/2 (avg)   2         10        2          28m

    The result shows that REPLICAS is 2, indicating that after the QPS drops to 0, the application is scaled in to 2 pods. Scale-in may take several minutes because HPA applies a stabilization window before it removes pods.
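
To watch the scaling behavior in real time while the stress test runs, you can keep the following commands running in separate terminal windows. These commands only use resource names that are defined earlier in this topic:

  kubectl get hpa qps-hpa --watch
  kubectl get pods -l app=qps-deploy --watch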
