
Container Service for Kubernetes:Configure auto scaling for ACK Serverless clusters

Last Updated:Oct 16, 2023

Container Service for Kubernetes (ACK) Serverless provides powerful scaling capabilities based on elastic container instances. An ACK Serverless cluster can scale out to multiple times its original size within a short time period based on the predefined scaling policy, and can quickly scale in when demand for computing power drops, which helps you reduce costs. This topic describes how to directly control the number of pods in an ACK Serverless cluster and how to configure a load-aware scaling policy.

Important

Completing this tutorial costs about USD 0.5 (based on an instance uptime of 30 minutes). We recommend that you release the resources promptly after you complete the tutorial.

Prerequisites

A web application is deployed. For more information, see Deploy a web application by using NGINX.

Step 1: Install metrics-server

  1. Log on to the ACK console. In the left-side navigation pane, click Clusters.

  2. On the Clusters page, click the name of the cluster that you want to manage and choose Operations > Add-ons in the left-side navigation pane.

  3. Click the Logs and Monitoring tab, find the metrics-server card, and then click Install in the lower-right corner of the card.

Step 2: Scale pods

Use the console

  1. Log on to the ACK console. In the left-side navigation pane, click Clusters.

  2. On the Clusters page, click the name of the cluster that you want to manage and choose Workloads > Deployments in the left-side navigation pane.

  3. Click the Deployment named nginx-deploy to go to the details page.

  4. Click Scale in the upper-right part of the page. In the Scale dialog box, set Desired Number of Pods to 10 and click OK.

    After the page is refreshed, nine new pods are created. This indicates that pods are scaled out.

  5. Repeat Step 4 to change the number of pods to 1.

    After the page is refreshed, the number of pods drops to 1. This indicates that pods are scaled in.

Use kubectl

  1. Run the following command to query the details of the Deployment:

    kubectl get deploy

    Expected output:

    NAME           READY   UP-TO-DATE   AVAILABLE   AGE
    nginx-deploy   1/1     1            1           9m32s
  2. Run the following command to scale the number of pods to 10:

    kubectl scale deploy nginx-deploy --replicas=10

    Expected output:

    deployment.extensions/nginx-deploy scaled
  3. Run the following command to query the pods that are created:

    kubectl get pod

    Expected output:

    NAME                            READY   STATUS    RESTARTS   AGE
    nginx-deploy-55d8dcf755-8jlz2   1/1     Running   0          39s
    nginx-deploy-55d8dcf755-9jbzk   1/1     Running   0          39s
    nginx-deploy-55d8dcf755-bqhcz   1/1     Running   0          38s
    nginx-deploy-55d8dcf755-bxk8n   1/1     Running   0          38s
    nginx-deploy-55d8dcf755-cn6x9   1/1     Running   0          38s
    nginx-deploy-55d8dcf755-jsqjn   1/1     Running   0          38s
    nginx-deploy-55d8dcf755-lhp8l   1/1     Running   0          38s
    nginx-deploy-55d8dcf755-r2clb   1/1     Running   0          38s
    nginx-deploy-55d8dcf755-rchhq   1/1     Running   0          10m
    nginx-deploy-55d8dcf755-xspnt   1/1     Running   0          38s
  4. Run the following command to scale the number of pods to 1:

    kubectl scale deploy nginx-deploy --replicas=1

    Expected output:

    deployment.extensions/nginx-deploy scaled
  5. Run the following command to query the pods that are created:

    kubectl get pod

    Expected output:

    NAME                            READY   STATUS    RESTARTS   AGE
    nginx-deploy-55d8dcf755-bqhcz   1/1     Running   0          1m
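
Under the hood, kubectl scale simply updates the .spec.replicas field of the Deployment. The declarative equivalent is to set the field in the manifest and re-apply it. A minimal sketch, in which the labels and image are illustrative assumptions that should match your actual nginx-deploy Deployment:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deploy
spec:
  replicas: 10          # the field that kubectl scale modifies
  selector:
    matchLabels:
      app: nginx        # assumed label; match your Deployment
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx    # assumed image
```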

Step 3: Configure load-aware scaling

Use the console

  1. Click the Pod Scaling tab on the Deployment details page. Then, click Create next to HPA.

  2. In the Create dialog box, configure the following parameters and click OK.

    Parameter          Example
    Name               nginx-deploy
    Metric             CPU Usage, with a threshold of 20%
    Max. Containers    10
    Min. Containers    1

Use kubectl

  1. Run the following command to create a scaling policy based on the CPU usage metric.

    After the scaling policy is applied, the Deployment scales the number of pods within a range of 1 to 10 to maintain the average CPU usage of the containers at 20%.

    kubectl autoscale deployment nginx-deploy --cpu-percent=20 --min=1 --max=10

    Expected output:

    horizontalpodautoscaler.autoscaling/nginx-deploy autoscaled
  2. Run the following command to query the scaling policy:

    kubectl get hpa

    Expected output:

    NAME           REFERENCE                 TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
    nginx-deploy   Deployment/nginx-deploy   0%/20%    1         10        1          35s
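
The imperative kubectl autoscale command above can also be expressed as a declarative manifest and applied with kubectl apply -f. A sketch using the standard autoscaling/v2 API, mirroring the parameters in this tutorial:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: nginx-deploy
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx-deploy
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 20   # target average CPU usage of 20%
```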

Step 4 (optional): Test the scaling policy

You can test the scaling policy by increasing the load on the containers in the cluster. Perform the following steps:

  1. Log on to the ACK console. In the left-side navigation pane, click Clusters.

  2. On the Clusters page, click the name of the cluster that you want to manage and click Cluster Information in the left-side navigation pane.

  3. Click the Pods tab on the Deployment details page. Then, choose Terminal > cpu-usage in the Actions column of the first pod to connect to the pod.

  4. Run the following command in the terminal that appears. The command creates an infinite loop that drives the CPU usage of the container to 100%.

    while : ; do : ; done
  5. Go back to the console and click the Pods tab. Then, click the icon in the Monitor column to view the CPU loads of the containers. Wait 1 minute and click the icon again to check the CPU loads.

  6. Click the refresh icon in the upper-right part of the page to update the Deployment information. The page shows that four new pods are created.

    The CPU usage of one pod reaches 100%, while the CPU usage of the other four pods is 0%. The average CPU usage across the five pods is therefore 20%, which matches the scaling threshold. The CPU usage stabilizes after the cluster is scaled out.

  7. Go back to the terminal page in Step 3 and press Ctrl + C to end the loop. This reduces the CPU usage to 0%.

    Note

    If you have closed the terminal page, you can run the top command to view the infinite loop process and run the kill -9 <PID> command to kill the process.

  8. Go back to the console, wait 5 to 10 minutes, and then click the refresh icon. The number of pods drops to 1.
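
The scale-out observed above follows the standard HPA formula: desiredReplicas = ceil(currentReplicas × currentMetricValue / targetMetricValue). A quick sanity check of this tutorial's numbers (one pod pegged at 100% CPU against the 20% target from Step 3), sketched in shell:

```shell
current_replicas=1
current_cpu=100   # one pod running the infinite loop at 100% CPU
target_cpu=20     # the HPA threshold configured in Step 3

# Integer ceiling division: ceil(current_replicas * current_cpu / target_cpu)
desired=$(( (current_replicas * current_cpu + target_cpu - 1) / target_cpu ))
echo "$desired"   # 5 pods in total, that is, four new pods
```

This matches the four new pods observed in the console after the load increases.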

Step 5: Release resources

If you no longer need the cluster, perform the following steps to release resources:

  1. Log on to the ACK console. In the left-side navigation pane, click Clusters.

  2. On the Clusters page, choose More > Delete in the Actions column of the cluster.

  3. In the Delete Cluster dialog box, select Delete ALB Instances Created by the Cluster, Delete Alibaba Cloud DNS PrivateZone instances Created by the Cluster, and I understand the above information and want to delete the specified cluster, and then click OK.