If your cluster is large, requires faster resource scaling, or needs to scale resources automatically across multiple instance types and zones, node auto scaling may not meet your requirements. In this scenario, we recommend that you use node instant scaling. A cluster is considered large if any node pool that has auto scaling enabled contains more than 100 nodes, or if more than 20 node pools in the cluster have auto scaling enabled. The node instant scaling feature lowers the technical barrier for developers, improves scaling efficiency, and reduces O&M effort.
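The size criteria above can be written as a small check. The following is a minimal sketch: the function name and the input shape (one node count per node pool that has auto scaling enabled) are assumptions made for illustration and are not part of any ACK API.

```python
from typing import Sequence

# Thresholds taken from the definition of a large cluster in this topic.
MAX_NODES_PER_AUTOSCALED_POOL = 100
MAX_AUTOSCALED_POOLS = 20


def is_large_cluster(autoscaled_pool_sizes: Sequence[int]) -> bool:
    """Return True if the cluster counts as large.

    autoscaled_pool_sizes holds the node count of each node pool that has
    auto scaling enabled (hypothetical input shape, for illustration only).
    """
    too_many_pools = len(autoscaled_pool_sizes) > MAX_AUTOSCALED_POOLS
    has_big_pool = any(
        size > MAX_NODES_PER_AUTOSCALED_POOL for size in autoscaled_pool_sizes
    )
    return too_many_pools or has_big_pool
```

For example, a cluster with a single auto-scaled node pool of 101 nodes counts as large, while one with 20 small auto-scaled node pools does not.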
Before you start
To make better use of the node instant scaling feature, we recommend that you read Overview of node scaling before you start and pay attention to the following items:
How node instant scaling works
Benefits of node instant scaling
Use scenarios of node instant scaling
Usage notes for node instant scaling
Prerequisites
A Container Service for Kubernetes (ACK) managed cluster or ACK dedicated cluster that runs Kubernetes 1.24 or later has been created. For more information, see Create an ACK managed cluster, Create an ACK dedicated cluster, and Update an ACK cluster.
You have been added to the whitelist for node instant scaling. If you are not on the whitelist, submit a ticket and describe your business scenario in the ticket.
Elastic Scaling Service (ESS) has been activated.
If your node pool has auto scaling enabled and Scaling Mode is not set to Swift, the node instant scaling feature is compatible with the original semantics and behavior of the node pool configuration. In addition, this feature can be seamlessly enabled for all types of pods. If Scaling Mode is set to Swift, the node instant scaling feature is incompatible with the node pool.
Step 1: Enable node instant scaling
Before you use node instant scaling, you must enable auto scaling on the Node Pools page. When you configure node instant scaling, select Auto Scaling as the Node Scaling Method.
Log on to the ACK console. In the left-side navigation pane, click Clusters.
On the Clusters page, find the cluster that you want to manage and click its name. In the left-side navigation pane, choose .
On the Node Pools page, click Enable next to Node Scaling.
If this is the first time you use auto scaling, complete authorization as prompted. If you have already completed the authorization, skip this step.
For an ACK managed cluster, authorize ACK to use the AliyunCSManagedAutoScalerRole for accessing your cloud resources.
For an ACK dedicated cluster, authorize ACK to use the KubernetesWorkerRole and AliyunCSManagedAutoScalerRolePolicy for scaling management. The following figure shows the console page on which you can make the authorization when you enable Node Scaling.
In the Node Scaling Configuration panel, set Node Scaling Method to Auto Scaling, configure scaling parameters, and then click OK.
Scale-out activities are triggered automatically based on the actual scheduling status. You only need to configure the scale-in conditions.
Parameter | Description |
Scale-in Threshold | The ratio of the resources requested on a node to the resource capacity of the node, for nodes in a node pool that has node instant scaling enabled. A scale-in activity is performed only when both the CPU utilization and the memory utilization of a node are lower than the Scale-in Threshold. |
GPU Scale-in Threshold | The scale-in threshold for GPU-accelerated nodes. A scale-in activity is performed only when the CPU, memory, and GPU utilization of a node are all lower than this threshold. |
Defer Scale-in For | The interval between the time when the scale-in threshold is reached and the time when the scale-in activity (removal of nodes) starts. Unit: minutes. Default value: 10. Important: The autoscaler performs scale-in activities only when Scale-in Threshold is configured and the Defer Scale-in For condition is met. |
Cooldown | The period that the autoscaler waits after a scale-out activity before it can perform a scale-in activity. The autoscaler cannot perform scale-in activities within the cooldown period but still checks whether nodes meet the scale-in conditions. After the cooldown period ends, a node is removed if it meets the scale-in conditions and the waiting period specified by Defer Scale-in For has ended. For example, if Cooldown is set to 10 minutes and Defer Scale-in For is set to 5 minutes, the autoscaler performs no scale-in activities during the 10 minutes that follow a scale-out activity, although it continues to check the nodes. When the cooldown period ends, nodes that meet the scale-in conditions are removed after 5 minutes. |
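The scale-in conditions described above can be sketched as follows. This is an illustration under stated assumptions, not the autoscaler's actual implementation: the NodeUsage type and both function names are hypothetical, the threshold is assumed to be a fraction between 0 and 1, and the timing helper follows the worked example literally by counting the Defer Scale-in For wait from the end of the cooldown period.

```python
from dataclasses import dataclass


@dataclass
class NodeUsage:
    """Requested resources and capacity of one node (hypothetical type)."""
    cpu_requested: float
    cpu_capacity: float
    mem_requested: float
    mem_capacity: float


def below_scale_in_threshold(node: NodeUsage, threshold: float) -> bool:
    """True if both the CPU and memory request ratios are below the threshold.

    Assumption: the threshold is expressed as a fraction such as 0.5.
    """
    cpu_ratio = node.cpu_requested / node.cpu_capacity
    mem_ratio = node.mem_requested / node.mem_capacity
    return cpu_ratio < threshold and mem_ratio < threshold


def earliest_removal_minute(cooldown_min: float, defer_min: float) -> float:
    """Minutes after a scale-out at which an eligible node can be removed.

    Assumption: mirrors the worked example, in which the Defer Scale-in For
    wait is counted from the end of the cooldown period.
    """
    return cooldown_min + defer_min
```

With Cooldown set to 10 and Defer Scale-in For set to 5, `earliest_removal_minute(10, 5)` returns 15, which matches the example: 10 minutes of cooldown followed by a 5-minute wait.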
Step 2: Create a node pool that has auto scaling enabled
The node instant scaling feature scales only nodes in node pools that have auto scaling enabled. Therefore, after configuring node instant scaling, you also need to configure at least one node pool that has auto scaling enabled.
Create a new node pool that has auto scaling enabled. For more information, see Create a node pool.
Enable auto scaling for an existing node pool. For more information, see Modify a node pool.
Note: When you configure an existing node pool, make sure that Expected Number of Nodes is set to null. To verify this, go to the node pool details page and find the parameter on the Overview tab. Alternatively, call the DescribeClusterNodePoolDetail API and check whether desired_size is nil.
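The API check mentioned in the note can be sketched as a small helper that inspects an already parsed DescribeClusterNodePoolDetail response. The helper name is hypothetical, and the placement of desired_size under a scaling_group object is an assumption; adjust the lookup to the response you actually receive.

```python
from typing import Any, Mapping


def expected_nodes_is_unset(nodepool_detail: Mapping[str, Any]) -> bool:
    """Return True if desired_size is absent or null in the parsed response.

    Assumption: desired_size sits under a "scaling_group" object. The top
    level is checked as a fallback. A JSON null is parsed as None.
    """
    scaling_group = nodepool_detail.get("scaling_group") or {}
    if "desired_size" in scaling_group:
        return scaling_group["desired_size"] is None
    return nodepool_detail.get("desired_size") is None
```

A value of 0 is treated as set, because only a missing or null value means that no expected number of nodes is configured.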
We recommend that you configure a diverse range of instance types for your node pool. This can be achieved by specifying multiple instance types, using generic instance types, and configuring multiple availability zones. These methods help ensure a sufficient inventory of instance types and the successful execution of node scaling activities.
Step 3: (Optional) Verify node instant scaling
After you complete the preceding configuration, you can use the node instant scaling feature. To verify the setup, check that the node pool shows auto scaling as enabled and that Node Auto Scaling is installed in the cluster.
Auto scaling is enabled for the node pool
The Node Pools page shows that auto scaling is enabled for the node pool.
Node Auto Scaling is installed
On the Clusters page, find the cluster that you want to manage and click its name. In the left-side navigation pane, choose .
On the Add-ons page, the ACK GOATScaler component displays Installed.
Introduction to key events related to node instant scaling
The node instant scaling feature generates the following key events. You can use these events to understand the internal status of node instant scaling.
Event name | Event object | Description |
ProvisionNode | pod | The node instant scaling feature triggers a node scale-out activity. |
ProvisionNodeFailed | pod | The node instant scaling feature fails to trigger a node scale-out activity. |
ResetPod | pod | The node instant scaling feature re-adds pods that meet the scale-out conditions and have triggered scale-out activities but are still in the Unschedulable state to the scale-out list. |
InstanceInventoryStatusChanged | ACKNodePool | The supply status of an instance specification in the configured zone changes. The format is For more information, see View the health status of node instant scaling. |
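To watch for these events, you can filter an exported event list. The sketch below assumes that events were exported as JSON (for example with `kubectl get events -A -o json`, whose output wraps the events in an `items` list) and that the event names in the table appear in the `reason` field of each event; that mapping is an assumption. The function name is hypothetical.

```python
from typing import Any, Dict, Iterable, List

# Event names listed in the table above.
INSTANT_SCALING_EVENTS = {
    "ProvisionNode",
    "ProvisionNodeFailed",
    "ResetPod",
    "InstanceInventoryStatusChanged",
}


def instant_scaling_events(events: Iterable[Dict[str, Any]]) -> List[Dict[str, str]]:
    """Select events whose reason matches one of the names above.

    Assumption: the event name is carried in the "reason" field.
    """
    selected = []
    for event in events:
        reason = event.get("reason", "")
        if reason not in INSTANT_SCALING_EVENTS:
            continue
        involved = event.get("involvedObject", {})
        selected.append({
            "reason": reason,
            "kind": involved.get("kind", ""),
            "name": involved.get("name", ""),
            "message": event.get("message", ""),
        })
    return selected
```

Pass the `items` list of the exported JSON to the function; events with other reasons are ignored.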
Introduction to node instant scaling labels
The node instant scaling feature maintains the following labels. Do not modify these labels manually. Otherwise, exceptions may occur.
Node labels
Node label | Description |
goatscaler.io/managed:true | Identifies nodes that are managed by the node instant scaling feature. The node instant scaling feature periodically checks whether nodes that have this label meet the scale-in conditions. |
k8s.aliyun.com: true | Identifies nodes that are managed by the node instant scaling feature. The node instant scaling feature periodically checks whether nodes that have this label meet the scale-in conditions. |
goatscaler.io/provision-task-id:{task-id} | Indicates the ID of a scale-out task run by the node instant scaling feature so that you can trace the source of the nodes that are added to the cluster. |
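Because the labels are read-only from your point of view, a typical use is inspection. The sketch below groups managed nodes by the scale-out task that added them, working on node objects parsed from JSON (for example the `items` list from `kubectl get nodes -o json`). The function name is hypothetical, and the sketch only reads labels; it never changes them.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, List

MANAGED_LABEL = "goatscaler.io/managed"
TASK_ID_LABEL = "goatscaler.io/provision-task-id"


def managed_nodes_by_task(nodes: Iterable[Dict[str, Any]]) -> Dict[str, List[str]]:
    """Group the names of managed nodes by provision task ID.

    Nodes without the task ID label are grouped under an empty string.
    """
    grouped: Dict[str, List[str]] = defaultdict(list)
    for node in nodes:
        metadata = node.get("metadata", {})
        labels = metadata.get("labels") or {}
        if labels.get(MANAGED_LABEL) != "true":
            continue  # not managed by node instant scaling
        grouped[labels.get(TASK_ID_LABEL, "")].append(metadata.get("name", ""))
    return dict(grouped)
```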
Node taints
Node taint | Description |
goatscaler.io/node-terminating | Nodes that have this taint are scaled in by the node instant scaling feature. |
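The taint can be used to spot nodes that are currently being scaled in. The following sketch checks the `spec.taints` list of node objects parsed from JSON; the function name is hypothetical.

```python
from typing import Any, Dict, Iterable, List

TERMINATING_TAINT = "goatscaler.io/node-terminating"


def terminating_nodes(nodes: Iterable[Dict[str, Any]]) -> List[str]:
    """Return the names of nodes that carry the node-terminating taint."""
    names = []
    for node in nodes:
        taints = node.get("spec", {}).get("taints") or []
        if any(taint.get("key") == TERMINATING_TAINT for taint in taints):
            names.append(node.get("metadata", {}).get("name", ""))
    return names
```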
Pod annotations
Pod annotation | Description |
goatscaler.io/provision-task-id | Indicates the ID of a scale-out task that is created by the node instant scaling feature for the current pod. The node instant scaling feature does not add another node for a pod that has this annotation and waits for the current node to launch. |
goatscaler.io/reschedule-deadline | Indicates the time that the node instant scaling feature waits for a pod to be scheduled to a node. If this time is exceeded and the pod is still unschedulable, the node instant scaling feature re-adds the pod to the scale-out list. |
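To see which pods are waiting for a node that is already being provisioned, you can read these annotations from pod objects parsed from JSON. The sketch below is illustrative and the function name is hypothetical. The value of goatscaler.io/reschedule-deadline is returned as-is, because its format is not described in this topic.

```python
from typing import Any, Dict, Iterable, Optional

TASK_ID_ANNOTATION = "goatscaler.io/provision-task-id"
DEADLINE_ANNOTATION = "goatscaler.io/reschedule-deadline"


def pods_awaiting_nodes(pods: Iterable[Dict[str, Any]]) -> Dict[str, Dict[str, Optional[str]]]:
    """Map "namespace/name" to the scale-out annotations of each waiting pod.

    Only pods that carry the provision-task-id annotation are included.
    """
    waiting: Dict[str, Dict[str, Optional[str]]] = {}
    for pod in pods:
        metadata = pod.get("metadata", {})
        annotations = metadata.get("annotations") or {}
        task_id = annotations.get(TASK_ID_ANNOTATION)
        if task_id is None:
            continue
        key = f'{metadata.get("namespace", "")}/{metadata.get("name", "")}'
        waiting[key] = {
            "task_id": task_id,
            # Raw annotation value; the format is not specified in this topic.
            "reschedule_deadline": annotations.get(DEADLINE_ANNOTATION),
        }
    return waiting
```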
What to do next
View health status of node instant scaling
The node instant scaling feature can dynamically select instance types and zones based on the inventory status of Elastic Compute Service (ECS) instances. To monitor the health of the instance types in a node pool, obtain suggestions for optimizing the instance configuration, and make sure that node scaling activities can be performed, check the ConfigMap of the node pool. The ConfigMap lets you assess the inventory health of the node pool and proactively analyze and adjust its instance types.
For more information, see View the health status of node instant scaling.
Update Node Auto Scaling
We recommend that you update Node Auto Scaling at your earliest convenience to use the latest features and optimizations. For more information, see Manage components.