Container Compute Service:HPA

Last Updated:Nov 04, 2024

You can create an application with Horizontal Pod Autoscaler (HPA) enabled in the Alibaba Cloud Container Compute Service (ACS) console or by using kubectl. This topic describes how to create an application with HPA enabled in an ACS cluster and how to test HPA.

Prerequisites

Create an application with HPA enabled

Use the ACS console

  1. Log on to the ACS console. In the left-side navigation pane, click Clusters.

  2. On the Clusters page, find the cluster that you want to manage and click its ID. In the left-side pane, choose Workloads > Deployments.

  3. On the Deployments tab, click Create from Image.

  4. In the Basic Information step, enter a name for your application, set the parameters, and then click Next.

    • Namespace: Select the namespace to which the application belongs. The default namespace is automatically selected.

    • Name: Enter a name for the application.

    • Replicas: The number of pods that you want to provision for the application. Default value: 2.

    • Type: The type of the resource object. Valid values: Deployment, StatefulSet, Job, and CronJob.

    • Label: The label that you want to add to the application to identify the application.

    • Annotations: The annotations that you want to add to the application.

    • Instance Type: The instance type of the pod. Valid values: General-purpose, BestEffort, and Performance-enhanced.

  5. In the Container step, set the container parameters, select an image, and then configure the required computing resources. Click Next. For more information, see Configure the containers.

    Note

    You must configure the computing resources required by the Deployment. Otherwise, you cannot enable HPA.

  6. In the Access Control section of the Advanced step, click Create to create a Service. For more information, see Advanced settings.

  7. In the Advanced step, select Enable for HPA and configure the scaling threshold and related settings.

    • Metric: Select CPU Usage or Memory Usage. The selected resource type must be the same as the one you specified in the Required Resources parameter.

    • Condition: Specify the resource usage threshold. HPA triggers scaling events when the threshold is exceeded. For more information about the algorithms that are used to perform horizontal pod autoscaling, see Algorithm details.

    • Max. Replicas: Specify the maximum number of pods to which the Deployment can be scaled.

    • Min. Replicas: Specify the minimum number of pods to which the Deployment can be scaled.

  8. In the lower-right corner of the Advanced step, click Create to create the application with HPA enabled.
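The Condition, Max. Replicas, and Min. Replicas settings interact as in the following rough sketch. It assumes the scaling rule given in the upstream Kubernetes documentation, desired = ceil(current replicas × current utilization / target utilization), with the result clamped to the configured minimum and maximum. The desired_replicas function is a hypothetical helper written for illustration; it is not part of kubectl or ACS, and it leaves out details of the real controller such as the tolerance band around the target and the scale-down stabilization window.

```shell
#!/bin/sh
# Hypothetical helper (not part of kubectl or ACS).
# Usage: desired_replicas CURRENT_REPLICAS CURRENT_UTIL_PCT TARGET_UTIL_PCT MIN MAX
desired_replicas() {
  current=$1; usage_pct=$2; target_pct=$3; min=$4; max=$5

  # Integer ceiling division: ceil(a / b) == (a + b - 1) / b for a >= 0, b > 0.
  desired=$(( (current * usage_pct + target_pct - 1) / target_pct ))

  # Clamp the result to the [min, max] range configured for the HPA.
  if [ "$desired" -lt "$min" ]; then desired=$min; fi
  if [ "$desired" -gt "$max" ]; then desired=$max; fi

  echo "$desired"
}

desired_replicas 2 90 50 1 10   # 2 pods at 90% against a 50% threshold: prints 4
desired_replicas 2 10 50 1 10   # 2 pods at 10% against a 50% threshold: prints 1
```

With a threshold of 50%, two pods that average 90% utilization are scaled out to four pods, and two pods that average 10% are scaled in to one pod, never below Min. Replicas or above Max. Replicas.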

Use kubectl

You can also enable HPA by using kubectl. To do so, define a HorizontalPodAutoscaler object in an orchestration template, associate it with the target Deployment, and then apply the template to the cluster.

In the following example, HPA is enabled for an NGINX application.

  1. Create a file named nginx.yaml and copy the following content to the file:

    Example:

    apiVersion: apps/v1 
    kind: Deployment
    metadata:
      name: nginx
      labels:
        app: nginx
    spec:
      replicas: 2
      selector:
        matchLabels:
          app: nginx  
      template:
        metadata:
          labels:
            app: nginx
        spec:
          containers:
          - name: nginx
            image: registry-cn-hangzhou.ack.aliyuncs.com/ack-demo/nginx:1.7.9 # Replace with your own <image_name:tag>.
            ports:
            - containerPort: 80
            resources:
              requests:   # Required for HPA. Utilization is calculated against the requested amount.
                cpu: 500m

  2. Run the following command to create an NGINX application:

    kubectl apply -f nginx.yaml

  3. Create a file named hpa.yaml and copy the following content to the file:

    Use the scaleTargetRef parameter to associate HPA with the nginx Deployment.

    apiVersion: autoscaling/v2
    kind: HorizontalPodAutoscaler
    metadata:
      name: nginx-hpa
    spec:
      scaleTargetRef:
        apiVersion: apps/v1
        kind: Deployment
        name: nginx
      minReplicas: 1  # The minimum number of pods to which the Deployment can be scaled. The value must be an integer greater than or equal to 1.
      maxReplicas: 10 # The maximum number of pods to which the Deployment can be scaled. The value must not be less than the value of minReplicas.
      metrics:
      - type: Resource
        resource:
          name: cpu
          target:
            type: Utilization
            averageUtilization: 50 # The target average CPU utilization, in percent: the average usage across all pods divided by the requested amount.

  4. Run the following command to deploy HPA:

    kubectl apply -f hpa.yaml

  5. After HPA is deployed, run the kubectl describe hpa <HPA name> command to check its status. In this example, the HPA name is nginx-hpa.

    If the events in the output include an entry similar to the following, HPA is running as expected:

    Type    Reason             Age    From                       Message
    ----    ------             ----   ----                       -------
    Normal  SuccessfulRescale  4m53s  horizontal-pod-autoscaler  New size: 1; reason: All metrics below target
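The "All metrics below target" message can be understood through the utilization arithmetic. The following is a minimal sketch, assuming every pod requests the same amount of CPU (500m in nginx.yaml). The average_utilization function is a hypothetical helper, and the per-pod usage figures are invented for illustration; they are not measurements from a real cluster.

```shell
#!/bin/sh
# Hypothetical helper (not part of kubectl or ACS).
# Usage: average_utilization REQUEST_MILLICORES USAGE_MILLICORES...
# Prints the average CPU usage across pods as a percentage of the per-pod
# CPU request, rounded down. At least one usage value must be supplied.
average_utilization() {
  request=$1; shift
  total=0; count=0
  for usage in "$@"; do
    total=$(( total + usage ))
    count=$(( count + 1 ))
  done
  # (total usage) * 100 / (total requested), using integer arithmetic.
  echo $(( total * 100 / (count * request) ))
}

average_utilization 500 5 5       # two idle pods using 5m each: prints 1
average_utilization 500 300 400   # two busy pods: prints 70
```

In the idle case, utilization is about 1%, well below the averageUtilization target of 50 in hpa.yaml, so the Deployment is scaled in from 2 replicas to the minReplicas value of 1. In the busy case, 70% exceeds the target and a scale-out would be expected.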

References

For more information about Cron Horizontal Pod Autoscaler (CronHPA), see CronHPA.