Alibaba Cloud Container Compute Service (ACS) enables you to create applications that support Horizontal Pod Autoscaling (HPA) through the console or kubectl. HPA automatically adjusts the number of pod replicas based on resource utilization, ensuring your application scales to meet demand. This topic explains how to create an HPA-enabled application and verify its autoscaling behavior.
Prerequisites
You have created an ACS cluster.
Create an application that supports HPA
Console
Log on to the ACS console. In the left navigation pane, click Clusters.
On the Deployments page, click Create with Image.
On the Basic Information page, configure the following parameters, and then click Next.
Parameter | Description
Namespace | Select the namespace where you want to deploy the application. The default namespace is selected by default.
Application Name | Enter a name for your application.
Number of Replicas | Specify the initial number of pods for your application. The default is 2.
Workload | Select the workload type: Stateless, Stateful, Task, or Scheduled Task.
Label | Add labels to identify and organize your application.
Annotation | Add annotations to attach metadata to your application.
Instance Type | Select the pod instance type: General-purpose, BestEffort, or Compute-optimized.
On the Container Configuration page, configure your container settings, select an image, and specify resource requirements. Then click Next. For details, see Container configuration.
Note: You must specify resource requests for the deployment. HPA calculates utilization as a percentage of the requested resources, so it cannot scale containers that have no resource requests defined.
On the Advanced Settings page, in the Access Settings section, click Create to configure service settings. For details, see Advanced Configuration.
On the Advanced Settings page, set Metric-based Scaling to Enable and configure the scaling parameters.
Click Create in the lower-right corner. Your HPA-enabled deployment is now created.
kubectl
You can use kubectl to create an HPA resource from a YAML manifest. The HPA resource references and scales your target deployment.
The following example demonstrates how to configure HPA for an Nginx application.
Create a file named nginx.yaml and add the following deployment manifest.
This manifest defines an Nginx deployment with CPU resource requests.
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: nginx
      labels:
        app: nginx
    spec:
      replicas: 2
      selector:
        matchLabels:
          app: nginx
      template:
        metadata:
          labels:
            app: nginx
        spec:
          containers:
          - name: nginx
            image: registry-cn-hangzhou.ack.aliyuncs.com/ack-demo/nginx:1.7.9 # Replace with your image
            ports:
            - containerPort: 80
            resources:
              requests: # Required for HPA to function
                cpu: 500m

Run the following command to create the Nginx deployment.
    kubectl apply -f nginx.yaml

Create a file named hpa.yaml and add the following HPA manifest.
The scaleTargetRef field specifies the deployment that HPA monitors and scales. In this example, HPA targets the nginx deployment.
    apiVersion: autoscaling/v2
    kind: HorizontalPodAutoscaler
    metadata:
      name: nginx-hpa
    spec:
      scaleTargetRef:
        apiVersion: apps/v1
        kind: Deployment
        name: nginx
      minReplicas: 1 # Minimum replicas. Must be 1 or greater.
      maxReplicas: 10 # Maximum replicas. Cannot be less than minReplicas.
      metrics:
      - type: Resource
        resource:
          name: cpu
          target:
            type: Utilization
            averageUtilization: 50 # Target average CPU utilization, as a percentage of the CPU request

Run the following command to create the HPA resource.
    kubectl apply -f hpa.yaml

Run the following command to verify that HPA is working correctly.
    kubectl describe hpa nginx-hpa

If HPA is running correctly, the Events section of the output is similar to the following:
    Type    Reason             Age    From                       Message
    ----    ------             ----   ----                       -------
    Normal  SuccessfulRescale  4m53s  horizontal-pod-autoscaler  New size: 1; reason: All metrics below target
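
The rescale event follows from how HPA derives the replica count from the utilization target. The following is a minimal sketch, assuming the standard Kubernetes formula desiredReplicas = ceil(currentReplicas × currentUtilization / targetUtilization), clamped to the minReplicas and maxReplicas range. It ignores the controller's tolerance band and stabilization windows, so treat it as an approximation. The function names cpu_utilization and desired_replicas are hypothetical helpers for illustration, not part of kubectl or ACS.

```shell
#!/bin/sh
# Hypothetical helpers that approximate the HPA calculation.
# They are illustrative only and are not part of kubectl or ACS.

# cpu_utilization USAGE_MILLICORES REQUEST_MILLICORES
# Utilization is expressed as a percentage of the container's CPU request.
cpu_utilization() {
  echo $(( $1 * 100 / $2 ))
}

# desired_replicas CURRENT_REPLICAS CURRENT_UTIL TARGET_UTIL MIN MAX
# Computes ceil(current * util / target), then clamps to [MIN, MAX].
desired_replicas() {
  current=$1
  util=$2
  target=$3
  min=$4
  max=$5
  # Integer ceiling division.
  desired=$(( (current * util + target - 1) / target ))
  if [ "$desired" -lt "$min" ]; then
    desired=$min
  fi
  if [ "$desired" -gt "$max" ]; then
    desired=$max
  fi
  echo "$desired"
}

# A pod using 450m of CPU with a 500m request runs at 90% utilization.
cpu_utilization 450 500          # prints 90

# Two idle pods (about 0% utilization), target 50%, range 1 to 10.
desired_replicas 2 0 50 1 10     # prints 1

# Two pods at 90% utilization, target 50%, range 1 to 10.
desired_replicas 2 90 50 1 10    # prints 4
```

With the example manifests, two idle Nginx pods report utilization far below the 50% target, so the computed value falls to the minReplicas floor. This is consistent with the "New size: 1" event in the sample output.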