Container Compute Service: Horizontal Pod Autoscaling (HPA)

Last Updated: Feb 02, 2026

In Alibaba Cloud Container Compute Service (ACS), you can use the console or kubectl to create applications that support Horizontal Pod Autoscaling (HPA). HPA automatically adjusts the number of pod replicas based on resource utilization so that the application scales out when load rises and scales in when load falls. This topic explains how to create an HPA-enabled application and verify its autoscaling behavior.

Prerequisites

Create an application that supports HPA

Console

  1. Log on to the ACS console. In the left navigation pane, click Clusters.

  2. On the Clusters page, click the name of the target cluster. In the left navigation pane, choose Workloads > Deployments.

  3. On the Deployments page, click Create with Image.

  4. On the Basic Information page, configure the following parameters and click Next.

    Parameter            Description
    ---------            -----------
    Namespace            Select the namespace where you want to deploy the application. The default namespace is selected by default.
    Application Name     Enter a name for your application.
    Number of Replicas   Specify the initial number of pods for your application. The default is 2.
    Workload             Select the workload type: Stateless, Stateful, Task, or Scheduled Task.
    Label                Add labels to identify and organize your application.
    Annotation           Add annotations to attach metadata to your application.
    Instance Type        Select the pod instance type: General-purpose, BestEffort, or Compute-optimized.

  5. On the Container Configuration page, configure your container settings, select an image, and specify resource requirements. Then click Next. For details, see Container configuration.

    Note

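    You must specify resource requests for the containers in the deployment. HPA calculates utilization as a percentage of the requested resources, so it cannot scale pods whose containers have no resource requests defined.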

  6. On the Advanced Settings page, in the Access Settings section, click Create to configure service settings. For details, see Advanced Configuration.

  7. On the Advanced Settings page, set Metric-based Scaling to Enable and configure the scaling parameters.

  8. Click Create in the lower-right corner. Your HPA-enabled deployment is now created.
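
The note in step 5 follows from how a utilization percentage is derived. The following is a minimal sketch, not the controller's actual code: it assumes a single pod, uses a hypothetical helper name (cpu_utilization_percent), and takes made-up usage and request values in millicores. The real controller aggregates usage and requests across all pods of the workload.

```shell
#!/bin/sh
# Hypothetical helper (illustration only): CPU utilization of one pod,
# expressed as a percentage of its CPU request.
# Arguments are in millicores, for example 500m -> 500.
cpu_utilization_percent() {
  usage_m=$1
  request_m=$2
  # Without a request there is no denominator, so no percentage exists.
  if [ "$request_m" -le 0 ]; then
    echo "undefined: no CPU request set" >&2
    return 1
  fi
  # Integer arithmetic; the fractional part is truncated.
  echo $(( usage_m * 100 / request_m ))
}

# Assumed values: 250m used against a 500m request.
cpu_utilization_percent 250 500

# Assumed values: no request set, so the helper reports an error.
cpu_utilization_percent 250 0 || echo "cannot compute utilization"
```

With a request in place, the first call yields a percentage that can be compared with the scaling target. The second call shows why a missing request leaves HPA with nothing to compare.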

kubectl

You can use kubectl to create an HPA resource from a YAML manifest. The HPA resource references and scales your target deployment.

The following example demonstrates how to configure HPA for an Nginx application.

  1. Create a file named nginx.yaml and add the following deployment manifest.

    This manifest defines an Nginx deployment with CPU resource requests.

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: nginx
      labels:
        app: nginx
    spec:
      replicas: 2
      selector:
        matchLabels:
          app: nginx
      template:
        metadata:
          labels:
            app: nginx
        spec:
          containers:
          - name: nginx
            image: registry-cn-hangzhou.ack.aliyuncs.com/ack-demo/nginx:1.7.9 # Replace with your image
            ports:
            - containerPort: 80
            resources:
              requests:                         # Required for HPA to function
                cpu: 500m

  2. Run the following command to create the Nginx deployment.

    kubectl apply -f nginx.yaml

  3. Create a file named hpa.yaml and add the following HPA manifest.

    The scaleTargetRef field specifies the deployment that HPA monitors and scales. In this example, HPA targets the nginx deployment.

    apiVersion: autoscaling/v2
    kind: HorizontalPodAutoscaler
    metadata:
      name: nginx-hpa
    spec:
      scaleTargetRef:
        apiVersion: apps/v1
        kind: Deployment
        name: nginx
      minReplicas: 1  # Minimum replicas. Must be 1 or greater.
      maxReplicas: 10 # Maximum replicas. Must not be less than minReplicas.
      metrics:
      - type: Resource
        resource:
          name: cpu
          target:
            type: Utilization
            averageUtilization: 50 # Target CPU utilization percentage

  4. Run the following command to create the HPA resource.

    kubectl apply -f hpa.yaml

  5. Run the following command to verify that the HPA is working.

    kubectl describe hpa nginx-hpa

    If the HPA is working, the Events section of the output contains entries similar to the following:

    Type    Reason             Age    From                       Message
    ----    ------             ----   ----                       -------
    Normal  SuccessfulRescale  4m53s  horizontal-pod-autoscaler  New size: 1; reason: All metrics below target
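
The "New size: 1" event above is consistent with the standard HPA scaling rule: desired replicas = ceil(current replicas × current utilization / target utilization), bounded by minReplicas and maxReplicas. The following is a simplified sketch of that rule using the values from hpa.yaml (target 50, minimum 1, maximum 10). The helper name desired_replicas and the utilization figures are assumptions for illustration, and the sketch ignores the controller's tolerance and stabilization behavior.

```shell
#!/bin/sh
# Hypothetical helper (illustration only): simplified HPA replica calculation.
# Usage: desired_replicas CURRENT UTILIZATION TARGET MIN MAX
desired_replicas() {
  current=$1
  utilization=$2   # current average CPU utilization, in percent
  target=$3        # averageUtilization from the HPA manifest
  min=$4           # minReplicas
  max=$5           # maxReplicas
  # Ceiling division with integers: ceil(a / b) = (a + b - 1) / b
  desired=$(( (current * utilization + target - 1) / target ))
  # Clamp the result to the configured bounds.
  if [ "$desired" -lt "$min" ]; then desired=$min; fi
  if [ "$desired" -gt "$max" ]; then desired=$max; fi
  echo "$desired"
}

# Idle deployment: 2 replicas at an assumed 0% utilization -> minReplicas.
desired_replicas 2 0 50 1 10

# Assumed load: 2 replicas at 90% utilization -> scale out.
desired_replicas 2 90 50 1 10

# Assumed heavy load: 4 replicas at 200% utilization -> capped at maxReplicas.
desired_replicas 4 200 50 1 10
```

The first call mirrors the example in this topic: an idle Nginx deployment sits below the 50% target, so the replica count drops from 2 to the minimum of 1.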

Related information

To schedule autoscaling based on time rather than resource metrics, see CronHPA.