You can use the node auto scaling feature to have Container Service for Kubernetes (ACK) automatically scale nodes when the resources in a cluster are insufficient for pod scheduling. The feature is suitable for small-scale scaling activities and for workloads that require only one scaling activity at a time. For example, it suits a cluster that contains fewer than 20 node pools with auto scaling enabled, or node pools with auto scaling enabled that each contain fewer than 100 nodes.
Before you start
To better work with the node auto scaling feature, we recommend that you read the Overview of node scaling topic and pay attention to the following items:
How node auto scaling works and its features
Use scenarios of node auto scaling
Usage notes for node auto scaling
Prerequisites
A Container Service for Kubernetes (ACK) managed cluster or ACK dedicated cluster is created. For more information, see Create an ACK managed cluster and Create an ACK dedicated cluster.
Elastic Scaling Service (ESS) has been activated.
Step 1: Enable node auto scaling
Before you use node auto scaling, you must enable and configure this feature on the node pools page in the ACK console. When you configure this feature, set Node Scaling Method to Auto Scaling.
Log on to the ACK console. In the left-side navigation pane, click Clusters.
On the Clusters page, find the cluster that you want to manage and click its name. In the left-side navigation pane, choose .
On the Node Pools page, click Enable next to Node Scaling.
If this is the first time you use this feature, follow the on-screen instructions to activate Auto Scaling and complete authorization. Skip this step if you have already completed authorization.
For an ACK managed cluster, authorize ACK to use the AliyunCSManagedAutoScalerRole for accessing your cloud resources.
For an ACK dedicated cluster, authorize ACK to use the KubernetesWorkerRole and AliyunCSManagedAutoScalerRolePolicy for scaling management. The following figure shows the console page on which you can make the authorization when you enable Node Scaling.
In the Node Scaling Configuration panel, set Node Scaling Method to Auto Scaling, configure scaling parameters, and then click OK.
Parameter | Description |
Node Pool Scale-out Policy | Random Policy: randomly scales out a node pool when multiple scalable node pools exist. Default Policy: scales out the node pool that wastes the least resources when multiple scalable node pools exist. Priority-based Policy: scales out node pools based on their scale-out priorities when multiple scalable node pools exist. You can specify a scale-out priority for a node pool only after the node pool is created. |
Scan Interval | The interval at which the cluster is evaluated for scaling. Default value: 60s. |
The autoscaler triggers scale-out activities based on the actual scheduling status. You only need to configure scale-in conditions.
Important
Elastic Compute Service (ECS) nodes: The autoscaler performs scale-in activities only when the Scale-in Threshold, Defer Scale-in For, and Cooldown conditions are met.
GPU-accelerated nodes: The autoscaler performs scale-in activities only when the GPU Scale-in Threshold, Defer Scale-in For, and Cooldown conditions are met.
Parameter | Description |
Allow Scale-in | Specifies whether to allow scale-in activities. When this switch is turned off, the scale-in configuration does not take effect. Proceed with caution. |
Scale-in Threshold | The ratio of the resources requested on a node to the resource capacity of the node, for nodes in a node pool that has node auto scaling enabled. A scale-in activity is performed only when both the CPU and memory utilization of a node are lower than the Scale-in Threshold. |
GPU Scale-in Threshold | The scale-in threshold for GPU-accelerated nodes. A scale-in activity is performed only when the CPU, memory, and GPU utilization of a node are all lower than the GPU Scale-in Threshold. |
Defer Scale-in For | The interval between the time when the scale-in threshold is reached and the time when the scale-in activity (reducing the number of nodes) starts. Unit: minutes. Default value: 10. Important: The autoscaler performs scale-in activities only when Scale-in Threshold is configured and the Defer Scale-in For condition is met. |
Cooldown | After the autoscaler performs a scale-out activity, it waits for a cooldown period before it can perform a scale-in activity. During the cooldown period, the autoscaler cannot perform scale-in activities but can still check whether nodes meet the scale-in conditions. After the cooldown period ends, a node is removed if it meets the scale-in conditions and the waiting period specified by Defer Scale-in For has ended. For example, suppose that Cooldown is set to 10 minutes and Defer Scale-in For is set to 5 minutes. The autoscaler cannot perform scale-in activities within the 10 minutes after a scale-out activity, but it still checks the scale-in conditions during that time. When the cooldown period ends, the nodes that meet the scale-in conditions are removed after 5 minutes. |
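The way the scale-in parameters combine can be sketched in a few lines of Python. This is only an illustration of the descriptions above, not the autoscaler's implementation. The names `ScaleInConfig`, `request_ratio`, `meets_threshold`, and `earliest_removal_minute` are hypothetical, and the threshold value 0.5 is an illustrative choice rather than a documented default. The timing follows the Cooldown example literally (the Defer Scale-in For wait is counted from the end of the cooldown period, or from the moment the node drops below the threshold if that is later), which is an assumption about behavior that the description does not fully specify.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ScaleInConfig:
    allow_scale_in: bool = True     # Allow Scale-in switch
    threshold: float = 0.5          # Scale-in Threshold (illustrative, not a documented default)
    defer_minutes: float = 10.0     # Defer Scale-in For (documented default: 10)
    cooldown_minutes: float = 10.0  # Cooldown (illustrative value)


def request_ratio(requested: float, capacity: float) -> float:
    """Ratio of the resources requested on a node to the node's capacity."""
    return requested / capacity


def meets_threshold(cpu_ratio: float, mem_ratio: float, cfg: ScaleInConfig) -> bool:
    """Both CPU and memory ratios must be below the Scale-in Threshold."""
    return cpu_ratio < cfg.threshold and mem_ratio < cfg.threshold


def earliest_removal_minute(
    below_since: float, last_scale_out: float, cfg: ScaleInConfig
) -> Optional[float]:
    """Earliest minute at which a node that stays below the threshold is removed.

    Assumption: the defer wait starts at the later of the cooldown end and the
    time the node first fell below the threshold.
    """
    if not cfg.allow_scale_in:
        return None  # scale-in settings have no effect when the switch is off
    cooldown_end = last_scale_out + cfg.cooldown_minutes
    return max(below_since, cooldown_end) + cfg.defer_minutes


# Cooldown = 10 minutes, Defer Scale-in For = 5 minutes, scale-out at minute 0.
cfg = ScaleInConfig(defer_minutes=5, cooldown_minutes=10)
print(earliest_removal_minute(below_since=3, last_scale_out=0, cfg=cfg))
```

With these inputs the sketch yields minute 15, which matches the example: 10 minutes of cooldown followed by the 5-minute deferral.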
Step 2: Configure a node pool that has auto scaling enabled
The node auto scaling feature scales only nodes in node pools that have auto scaling enabled. Therefore, after you configure node auto scaling, you need to configure at least one node pool that has auto scaling enabled. You can create a node pool that has auto scaling enabled or enable auto scaling for an existing node pool.
The following table describes the key parameters. The term "node pool" in the following section refers to a node pool that has auto scaling enabled. For more information, see Create a node pool and Modify a node pool.
Parameter | Description |
Auto Scaling | Specify whether to enable auto scaling. This feature provides cost-effective computing resource scaling based on resource demand and scaling policies. For more information, see Auto scaling overview. Before you enable this feature, you need to enable node auto scaling for the node pool. For more information, see Step 1: Enable node auto scaling. |
Instance-related parameters | Select the ECS instances used by the worker node pool based on instance types or attributes. You can filter ECS instances by attributes such as vCPU, memory, instance family, and architecture. When the node pool is scaled out, ECS instances of the selected instance types are created, and the scaling policy of the node pool determines which instance types are used to create new nodes. If you select only one instance type, fluctuations in ECS instance stock affect the scaling success rate, so we recommend that you select multiple instance types. If you select only GPU-accelerated instances, you can select Enable GPU Sharing on demand. For more information, see cGPU overview. |
Instances | The number of instances in the node pool, excluding existing instances in the cluster. By default, the minimum number of instances is 0. If you specify one or more instances, the system adds the instances to the node pool. When a scale-out activity is triggered, the instances in the node pool are added to the associated cluster. |
Operating System | When you enable auto scaling, you can select an image based on Alibaba Cloud Linux, Windows, or Windows Core. If you select an image based on Windows or Windows Core, the system automatically adds the |
Node Label | Node labels are automatically added to nodes that are added to the cluster by scale-out activities. Important: Auto scaling can recognize node labels and taints only after they are mapped to node pool tags. A node pool can have only a limited number of tags. Therefore, you must keep the total number of ECS tags, taints, and node labels of a node pool that has auto scaling enabled to fewer than 12. |
Scaling Policy | |
Scaling Mode | You can select Standard or Swift. |
Taints | After you add taints to a node, ACK no longer schedules pods to the node unless the pods tolerate the taints. |
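The tag limit described for the Node Label parameter is easy to exceed when labels and taints are added over time. A minimal sketch of a pre-check follows. The function name `within_tag_budget` and the sample labels, tags, and taints are hypothetical, and the check only counts items. It does not call any ACK API.

```python
from typing import Dict, List, Tuple

# The total number of ECS tags, taints, and node labels must stay below this value.
MAX_MAPPED_ITEMS = 12


def within_tag_budget(
    ecs_tags: Dict[str, str],
    node_labels: Dict[str, str],
    taints: List[Tuple[str, str, str]],  # (key, value, effect)
) -> bool:
    """Return True if the combined count is fewer than 12."""
    total = len(ecs_tags) + len(node_labels) + len(taints)
    return total < MAX_MAPPED_ITEMS


# Hypothetical node pool settings: 2 tags + 3 labels + 1 taint = 6 items.
ecs_tags = {"team": "data", "env": "prod"}
node_labels = {"workload": "batch", "zone": "a", "tier": "spot"}
taints = [("dedicated", "batch", "NoSchedule")]
print(within_tag_budget(ecs_tags, node_labels, taints))
```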
After you create a node pool that has auto scaling enabled, you can refer to Step 1: Enable node auto scaling and select Priority-based Policy on demand. The valid values of priorities are integers from 1 to 100.
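The three scale-out policies from Step 1 differ only in how one node pool is picked when several could be scaled out. The following Python sketch illustrates that difference under stated assumptions. The `NodePool` type, the `wasted_fraction` measure (the average unused share of CPU and memory on one new node), and the rule that a larger priority value wins are all hypothetical choices made for illustration. The text above specifies only that priorities are integers from 1 to 100.

```python
import random
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class NodePool:
    name: str
    node_cpu: float       # vCPUs provided by one new node
    node_mem_gib: float   # memory (GiB) provided by one new node
    priority: int = 1     # scale-out priority, integer from 1 to 100

    def __post_init__(self) -> None:
        if not 1 <= self.priority <= 100:
            raise ValueError("priority must be an integer from 1 to 100")


def wasted_fraction(pool: NodePool, cpu_request: float, mem_request_gib: float) -> float:
    """Average unused share of CPU and memory if the request lands on one new node."""
    cpu_waste = 1 - cpu_request / pool.node_cpu
    mem_waste = 1 - mem_request_gib / pool.node_mem_gib
    return (cpu_waste + mem_waste) / 2


def choose_pool(
    pools: List[NodePool], policy: str, cpu_request: float, mem_request_gib: float
) -> Optional[NodePool]:
    # Only pools whose nodes can fit the pending request are candidates.
    fitting = [
        p for p in pools
        if p.node_cpu >= cpu_request and p.node_mem_gib >= mem_request_gib
    ]
    if not fitting:
        return None
    if policy == "random":
        return random.choice(fitting)
    if policy == "default":  # least wasted resources
        return min(fitting, key=lambda p: wasted_fraction(p, cpu_request, mem_request_gib))
    if policy == "priority":  # assumption: a larger value means a higher priority
        return max(fitting, key=lambda p: p.priority)
    raise ValueError(f"unknown policy: {policy}")


pools = [
    NodePool("small", node_cpu=4, node_mem_gib=8, priority=10),
    NodePool("large", node_cpu=16, node_mem_gib=64, priority=50),
]
print(choose_pool(pools, "default", cpu_request=2, mem_request_gib=4).name)
print(choose_pool(pools, "priority", cpu_request=2, mem_request_gib=4).name)
```

For a pending request of 2 vCPUs and 4 GiB, the default policy in this sketch selects `small` because it leaves less capacity unused, while the priority-based policy selects `large` because of its higher priority value.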
Step 3: (Optional) Verify node auto scaling
After you complete the preceding configuration, you can use the node auto scaling feature. To verify the configuration, check that the node pool shows auto scaling as enabled and that cluster-autoscaler is installed in the cluster.
Auto scaling is enabled for the node pool
The Node Pools page displays node pools with auto scaling enabled.
cluster-autoscaler is installed
In the left-side navigation pane of the details page, choose .
Select the kube-system namespace. The cluster-autoscaler component is displayed.
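The same check can be scripted instead of performed in the console. The sketch below parses the JSON printed by `kubectl get deployments -n kube-system -o json` and looks for the component by name. It assumes that the component runs as a Deployment named `cluster-autoscaler`, which is not confirmed above. The helper name `ready_replicas` and the sample JSON are hypothetical.

```python
import json
from typing import Optional


def ready_replicas(deployments_json: str, name: str = "cluster-autoscaler") -> Optional[int]:
    """Return the ready replica count of the named Deployment, or None if it is absent."""
    for item in json.loads(deployments_json).get("items", []):
        if item.get("metadata", {}).get("name") == name:
            # readyReplicas is omitted from the status when no replica is ready.
            return item.get("status", {}).get("readyReplicas", 0)
    return None


# Trimmed, made-up sample of the command output, for illustration only.
sample = json.dumps({
    "items": [
        {"metadata": {"name": "coredns"}, "status": {"readyReplicas": 2}},
        {"metadata": {"name": "cluster-autoscaler"}, "status": {"readyReplicas": 1}},
    ]
})

count = ready_replicas(sample)
if count is None:
    print("cluster-autoscaler not found in kube-system")
else:
    print(f"cluster-autoscaler found, ready replicas: {count}")
```

In practice, the JSON string would come from the saved output of the `kubectl` command rather than from the inline sample.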