Features and uses of HPA

You can create an application with Horizontal Pod Autoscaler (HPA) enabled in the Alibaba Cloud Container Compute Service (ACS) console or by using kubectl. This topic describes how to create an application with HPA enabled in an ACS cluster and how to verify that HPA works as expected.

Prerequisites

An ACS cluster is created. For more information, see Create an ACS cluster.
A kubectl client is connected to the cluster. For more information, see Obtain the kubeconfig file of a cluster and use kubectl to connect to the cluster.

Create an application with HPA enabled

Use the ACS console

Log on to the ACS console. In the left-side navigation pane, click Clusters.
On the Clusters page, find the cluster that you want to manage and click its ID. In the left-side pane, choose Workloads > Deployments.
On the Deployments tab, click Create from Image.

In the Basic Information step, enter a name for your application, set the parameters, and then click Next.

Parameter	Description
Namespace	Select the namespace to which the application belongs. The default namespace is automatically selected.
Name	Enter a name for the application.
Replicas	The number of pods that you want to provision for the application. Default value: 2.
Type	The type of the resource object. Valid values: Deployment, StatefulSet, Job, and CronJob.
Label	The label that you want to add to the application to identify the application.
Annotations	The annotations that you want to add to the application.
Instance Type	The instance type of the pod. Valid values: General-purpose, BestEffort, and Performance-enhanced.

In the Container step, set the container parameters, select an image, and then configure the required computing resources. Click Next. For more information, see Configure the containers.
Note
You must configure the computing resources required by the Deployment. Otherwise, you cannot enable HPA.
In the Access Control section of the Advanced step, click Create to create a Service. For more information, see Advanced settings.
In the Advanced step, select Enable for HPA and configure the scaling threshold and related settings.
- Metric: Select CPU Usage or Memory Usage. The selected resource type must be the same as the one you specified in the Required Resources parameter.
- Condition: Specify the resource usage threshold. HPA triggers scaling events when the threshold is exceeded. For more information about the algorithms that are used to perform horizontal pod autoscaling, see Algorithm details.
- Max. Replicas: Specify the maximum number of pods to which the Deployment can be scaled.
- Min. Replicas: Specify the minimum number of pods to which the Deployment can be scaled.
In the lower-right corner of the Advanced step, click Create to create the application with HPA enabled.
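The scaling threshold, Max. Replicas, and Min. Replicas settings interact as follows. The Python sketch below is based on the formula in the upstream Kubernetes documentation, desiredReplicas = ceil(currentReplicas × currentMetricValue / desiredMetricValue), with the upstream default tolerance of 0.1. It is a simplified illustration, not the controller's implementation: the function name and parameters are hypothetical, and it is an assumption that ACS uses the same defaults.

```python
import math

# Simplified sketch of the HPA replica calculation described in the upstream
# Kubernetes documentation. `desired_replicas` is a hypothetical helper.
def desired_replicas(current_replicas, current_utilization, target_utilization,
                     min_replicas, max_replicas, tolerance=0.1):
    ratio = current_utilization / target_utilization
    if abs(ratio - 1.0) <= tolerance:
        # Close enough to the target: keep the current replica count.
        desired = current_replicas
    else:
        desired = math.ceil(current_replicas * ratio)
    # The result is always bounded by Min. Replicas and Max. Replicas.
    return max(min_replicas, min(max_replicas, desired))

# Two pods at 100% utilization with a 50% threshold: scale out.
print(desired_replicas(2, 100, 50, min_replicas=1, max_replicas=10))
# Two idle pods: the formula yields 0, which is raised to Min. Replicas.
print(desired_replicas(2, 0, 50, min_replicas=1, max_replicas=10))
```

The sketch ignores details such as pods that are not ready, missing metrics, and stabilization windows, all of which the real controller takes into account.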

Use kubectl

You can also define HPA in an orchestration template, associate it with the Deployment for which you want to enable HPA, and then run kubectl commands to deploy it.

In the following example, HPA is enabled for an NGINX application.

Create a file named nginx.yaml and copy the following content to the file:

apiVersion: apps/v1 
kind: Deployment
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx  
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: registry-cn-hangzhou.ack.aliyuncs.com/ack-demo/nginx:1.7.9 # Replace with your own <image_name:tag>.
        ports:
        - containerPort: 80
        resources:
          requests: # Required. HPA calculates utilization relative to the requested amount.
            cpu: 500m

Run the following command to create an NGINX application:
```
kubectl apply -f nginx.yaml
```
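Because HPA cannot be enabled unless the Deployment declares the required computing resources, it can help to check a manifest before you apply it. The following Python sketch is illustrative only: `containers_missing_request` is a hypothetical helper, and the dictionary mirrors the relevant part of nginx.yaml rather than being loaded from the file.

```python
# Illustrative sketch: find containers that lack a resource request.
# `containers_missing_request` is a hypothetical helper, not part of kubectl or ACS.
def containers_missing_request(deployment: dict, resource: str = "cpu") -> list:
    """Return the names of containers that do not request `resource`."""
    pod_spec = deployment["spec"]["template"]["spec"]
    missing = []
    for container in pod_spec.get("containers", []):
        requests = container.get("resources", {}).get("requests", {})
        if resource not in requests:
            missing.append(container["name"])
    return missing

# The relevant part of nginx.yaml, written as a Python dictionary.
nginx_deployment = {
    "spec": {
        "template": {
            "spec": {
                "containers": [
                    {"name": "nginx", "resources": {"requests": {"cpu": "500m"}}}
                ]
            }
        }
    }
}

print(containers_missing_request(nginx_deployment))            # [] because the CPU request is set
print(containers_missing_request(nginx_deployment, "memory"))  # ['nginx'] because no memory request is set
```

The second call shows that this Deployment would need a memory request before a memory-based HPA metric could be used with it.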

Create a file named hpa.yaml and copy the following content to the file. The scaleTargetRef parameter associates HPA with the nginx Deployment:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: nginx-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx
  minReplicas: 1  # The minimum number of pods to which the Deployment can be scaled. The value must be an integer greater than or equal to 1.
  maxReplicas: 10 # The maximum number of pods to which the Deployment can be scaled. The value cannot be less than the value of minReplicas.
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50 # The target average CPU utilization in percent, which is the ratio of the average CPU usage to the requested CPU.

Run the following command to deploy HPA:
```
kubectl apply -f hpa.yaml
```
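The averageUtilization value in hpa.yaml is relative to the CPU request in nginx.yaml, so a target of 50 with a request of 500m corresponds to an average usage of 250m per pod. The following Python sketch works through that arithmetic. Both helpers are hypothetical, and the parser handles only plain cores (such as "1" or "0.5") and millicores (such as "500m").

```python
# Illustrative sketch: how a CPU utilization percentage relates to the request.
# Both helpers are hypothetical and are not part of kubectl or ACS.
def cpu_to_millicores(quantity: str) -> int:
    """Convert "500m" or "0.5" style CPU quantities to millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return round(float(quantity) * 1000)

def average_utilization(pod_usages: list, request: str) -> float:
    """Average usage across pods as a percentage of the per-pod request."""
    usages = [cpu_to_millicores(usage) for usage in pod_usages]
    average_usage = sum(usages) / len(usages)
    return average_usage / cpu_to_millicores(request) * 100

print(average_utilization(["250m", "250m"], "500m"))  # at the 50% target
print(average_utilization(["400m", "600m"], "500m"))  # twice the 50% target
```

This sketch assumes that every pod has the same request, as is the case for the replicas of a single Deployment.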

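After HPA is deployed, run the kubectl describe hpa <HPA name> command to view the status of HPA. In this example, the HPA name is nginx-hpa.

If the Events section of the output contains an entry similar to the following, HPA is running as expected: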

  Type    Reason             Age    From                       Message
  ----    ------             ----   ----                       -------
  Normal  SuccessfulRescale  4m53s  horizontal-pod-autoscaler  New size: 1; reason: All metrics below target
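If you want to check this event in a script, you can extract the new replica count and the reason from the message. The following Python sketch is one way to do that. `parse_rescale_message` is a hypothetical helper, and its pattern is derived only from the sample message above, which is not a documented, stable format.

```python
import re

# Hypothetical helper: the pattern matches the message format in the sample
# output and may need to be adjusted for other Kubernetes versions.
RESCALE_PATTERN = re.compile(r"New size: (\d+); reason: (.+)")

def parse_rescale_message(line: str):
    """Return (new_size, reason) for a SuccessfulRescale event line, else None."""
    match = RESCALE_PATTERN.search(line)
    if match is None:
        return None
    return int(match.group(1)), match.group(2).strip()

event = ("Normal  SuccessfulRescale  4m53s  horizontal-pod-autoscaler  "
         "New size: 1; reason: All metrics below target")
print(parse_rescale_message(event))  # (1, 'All metrics below target')
```

In the sample output, the new size is 1, which matches the minReplicas value in hpa.yaml: the NGINX application receives no traffic, so its CPU utilization stays below the 50% target.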

References

For more information about Cron Horizontal Pod Autoscaler (CronHPA), see CronHPA.