FAQs about node instant scaling

This topic provides solutions to some frequently asked questions (FAQs) about node instant scaling in Container Service for Kubernetes (ACK).

Index

Category	Subcategory	Issue

Scaling behavior of node instant scaling	Known limitations
	Scale-out behavior	What resource types can node instant scaling simulate?
		Can node instant scaling adjust to the appropriate instance type in the node pool based on requests received by pods?
		How does node instant scaling choose by default when multiple instance types are configured in the node pool?
		How does node instant scaling detect changes in instance type inventory in the node pool?
		How can I optimize node pool configuration to avoid scale-out failures due to insufficient inventory?
		Why does node instant scaling fail to add nodes after a scale-out activity is triggered?
		How do I add custom resources to node pools that have node instant scaling enabled?
	Scale-in behavior	Why does node instant scaling fail to remove nodes after a scale-in is triggered?
		What types of pods can prevent node instant scaling from removing nodes?
Custom scaling behavior	Use pods to control scaling	How does node instant scaling use pods to control node scale-in activities?
Custom scaling behavior	Use nodes to control scaling	How do I specify the nodes that I want to delete during the scale-in activities of node instant scaling?
		How do I prevent node instant scaling from removing nodes?
		Can node instant scaling only scale in empty nodes?
Node instant scaling component		Are there any operations that trigger the automatic update of the node instant scaling component?
		Why does node scaling still fail after I complete role authorization in the ACK managed cluster?

Known limitations

Inaccurate node resource estimation

Due to resource consumption by the underlying system of Elastic Compute Service (ECS), available memory on instances may be less than the specifications defined for the instance type. For more information, see Why does a purchased instance have a memory size that is different from the memory size defined in the instance type? Consequently, the node instant scaling component cannot guarantee 100% accuracy in resource estimation for schedulable node capacity. When you configure pod requests, take note of the following:

Total resource requests (including CPU, memory, and disk) cannot exceed the instance type specifications. We recommend keeping total requests below 70% of the node’s allocatable resources.
Resource checks by the node instant scaling component account only for Kubernetes pods, including pending pods and DaemonSet pods. For static pods, which are not managed by a DaemonSet, you must reserve resources manually.
For resource-demanding pods, such as those requesting more than 70% of node capacity, validate schedulability on nodes of the same instance type before deployment.
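
As a rough illustration of the 70% guideline above, the following sketch compares a request with a node's allocatable amount. The function name and the sample values are assumptions made for this example; take real allocatable values from the node itself, for example from the output of kubectl describe node.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if the request is below 70% of allocatable.
# Both arguments are integers in the same unit (for example, MiB or millicores).
within_recommended_ratio() {
  local request="$1" allocatable="$2"
  # Integer form of: request / allocatable < 70 / 100
  [ $(( request * 100 )) -lt $(( allocatable * 70 )) ]
}

# Assumed sample values: a pod requesting 4096 MiB on a node
# with 7000 MiB of allocatable memory.
if within_recommended_ratio 4096 7000; then
  echo "4096 MiB is below 70% of allocatable"
else
  echo "4096 MiB reaches or exceeds 70%; validate schedulability first"
fi
```

The check uses integer arithmetic only, so it works in plain shell without external tools.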

Limited simulatable resource types

The node instant scaling component supports only specific resource types for scaling simulations. For details, see What resource types can node instant scaling simulate?

Scale-out behavior

What resource types can node instant scaling simulate?

The following resource types are supported:

cpu
memory
ephemeral-storage 
aliyun.com/gpu-mem # Only supports shared GPU
nvidia.com/gpu

Can node instant scaling adjust to the appropriate instance type in the node pool based on requests received by pods?

Yes. For example, suppose that a node pool with Auto Scaling enabled is configured with two instance types, 4 cores with 8 GB of memory and 12 cores with 48 GB of memory, and a pod requests 2 cores. During a scale-out, node instant scaling prefers the 4-core, 8 GB instance type for the pod. If the 4-core, 8 GB node is later upgraded to 8 cores and 16 GB, node instant scaling automatically runs the pod on the 8-core, 16 GB node.

How does node instant scaling choose by default when multiple instance types are configured in the node pool?

Based on the instance types configured in the node pool, node instant scaling periodically excludes instance types with insufficient inventory. It then sorts the remaining types by the number of CPU cores and checks each one to see if it meets the resource requests of unschedulable pods. Once an instance type meets the requirements, node instant scaling prioritizes it and does not check the remaining types.
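
The selection order described above can be sketched roughly as follows. This is a simplified illustration, not the component's actual implementation: the function name pick_instance_type, the input format, the sample instance type names, and the ascending sort order are all assumptions, and only CPU and memory are compared.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the selection order.
# stdin lines: "<name> <cpu_cores> <memory_gib> <in_stock: yes|no>"
# Prints the first in-stock type, in ascending CPU order (an assumption),
# whose CPU and memory cover the request. Prints nothing if none fits.
pick_instance_type() {
  local need_cpu="$1" need_mem="$2"
  awk '$4 == "yes"' | sort -k2,2n -k3,3n |
    while read -r name cpu mem _stock; do
      if [ "$cpu" -ge "$need_cpu" ] && [ "$mem" -ge "$need_mem" ]; then
        echo "$name"
        break   # stop at the first match; remaining types are not checked
      fi
    done
}

# Made-up instance types; the pending pod requests 2 cores and 4 GiB.
printf '%s\n' \
  'type-12c48g 12 48 yes' \
  'type-4c8g 4 8 yes' \
  'type-2c4g 2 4 no' |
  pick_instance_type 2 4
# prints: type-4c8g
```

In this example the 2-core type is skipped because it is out of stock, so the 4-core type is the first candidate that satisfies the request.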

How does node instant scaling detect changes in instance type inventory in the node pool?

Node instant scaling provides health metrics that are periodically updated to reflect inventory changes in the node pool that has Auto Scaling enabled. When the inventory status of an instance type changes, node instant scaling sends a Kubernetes Event named InstanceInventoryStatusChanged. You can subscribe to this event notification to monitor the inventory health of the node pool, assess its status, and analyze or adjust the instance type configuration in advance. For more information, see View the health status of node instant scaling.

How can I optimize node pool configuration to avoid scale-out failures due to insufficient inventory?

Consider the following suggestions to expand the range of instance type options:

Configure multiple optional instance types for the node pool, or use generalized configurations.
Configure multiple zones for the node pool.

Why does node instant scaling fail to add nodes after a scale-out activity is triggered?

Check for the following issues:

Instance types configured in the node pool have insufficient inventory.
The instance types configured in the node pool cannot meet the resource requests from the pods. Some resources provided by the specified ECS instance type are reserved or occupied for the following purposes:
- Resources are used for virtualization or occupied by the operating system during instance creation. For more information, see Why does a purchased instance have a memory size that differs from the memory size defined in the instance type?
- ACK needs to occupy some resources to run Kubernetes components and system processes such as kubelet, kube-proxy, Terway, and the container runtime. For more information, see Resource reservation policy.
- By default, system components are installed on each node. Therefore, the requested pod resources must be less than the resource capacity of the instance type.
The Resource Access Management (RAM) role lacks permissions to manage the Kubernetes cluster. For more information, see Enable node instant scaling.
The node pool with Auto Scaling enabled fails to scale out.

To ensure the accuracy of subsequent scaling and the stability of the system, the node instant scaling component does not perform any scaling operations until it resolves issues with abnormal nodes.

How do I add custom resources to node pools that have node instant scaling enabled?

You can add ECS tags with the following prefix to node pools that have node instant scaling enabled. These tags allow the node instant scaling component to recognize the custom resources available in these node pools, or to identify the exact amount of a specific resource type.

Note

The version of the node instant scaling component must be 0.2.18 or later. To update it, see Manage components.

goatscaler.io/node-template/resource/{RESOURCE_NAME}:{RESOURCE_SIZE}

Example:

goatscaler.io/node-template/resource/hugepages-1Gi:2Gi
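
To make the {RESOURCE_NAME}:{RESOURCE_SIZE} layout concrete, the following sketch splits a tag of this form into its two parts. The helper name parse_resource_tag is an assumption for illustration; the sketch only demonstrates the format and does not reflect how the component itself reads tags.

```shell
#!/usr/bin/env bash
# Hypothetical helper: split a node-template tag into resource name and size.
parse_resource_tag() {
  local tag="$1" prefix='goatscaler.io/node-template/resource/'
  case "$tag" in
    "$prefix"*:*) ;;   # has the expected prefix and a colon separator
    *) echo "not a node-template resource tag: $tag" >&2; return 1 ;;
  esac
  local rest="${tag#"$prefix"}"
  # Split on the last colon: name before it, size after it.
  echo "name=${rest%:*} size=${rest##*:}"
}

parse_resource_tag 'goatscaler.io/node-template/resource/hugepages-1Gi:2Gi'
# prints: name=hugepages-1Gi size=2Gi
```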

Scale-in behavior

Why does node instant scaling fail to remove nodes after a scale-in is triggered?

Check for the following issues:

Only scaling in empty nodes is enabled, but the node being removed is not empty.
The requested resource threshold of each pod is higher than the specified scale-in threshold.
Pods in the kube-system namespace are running on the node.
A scheduling policy forces the pods to run on the current node. Therefore, the pods cannot be scheduled to other nodes.
PodDisruptionBudget is set for the pods on the node and the minimum value of PodDisruptionBudget has been reached.
The node was added recently. Node instant scaling does not perform scale-in operations on a new node within 10 minutes after the node is added.

What types of pods can prevent node instant scaling from removing nodes?

If a pod is not created by a native Kubernetes controller, such as a Deployment, ReplicaSet, Job, or StatefulSet, or if the pods on a node cannot be safely terminated or migrated, node instant scaling may not remove the node.

Use pods to control scaling

How does node instant scaling use pods to control node scale-in activities?

You can use the pod annotation goatscaler.io/safe-to-evict to specify whether a pod will prevent node instant scaling from scaling in a node.

To prevent the node from being scaled in: Add the annotation "goatscaler.io/safe-to-evict": "false" to the pod.
To allow the node to be scaled in: Add the annotation "goatscaler.io/safe-to-evict": "true" to the pod.
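
One possible way to set this annotation from the command line is sketched below. The wrapper name set_safe_to_evict and the sample pod name and namespace are assumptions for illustration. By default the script only prints the kubectl command; set KUBECTL=kubectl to run it against a cluster.

```shell
#!/usr/bin/env bash
# Dry run by default: print the command instead of executing it.
KUBECTL="${KUBECTL:-echo kubectl}"

# Hypothetical wrapper: mark a pod as safe ("true") or unsafe ("false") to evict.
set_safe_to_evict() {
  local pod="$1" namespace="$2" value="$3"
  $KUBECTL annotate pod "$pod" -n "$namespace" --overwrite \
    "goatscaler.io/safe-to-evict=$value"
}

# Assumed pod name and namespace.
set_safe_to_evict my-pod default false
# With the default dry run, prints:
# kubectl annotate pod my-pod -n default --overwrite goatscaler.io/safe-to-evict=false
```

An annotation added directly to a running pod is lost when the pod is recreated. For pods managed by a controller such as a Deployment, set the annotation in the pod template instead.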

Use nodes to control scaling

How do I specify the nodes that I want to delete during the scale-in activities of node instant scaling?

You can add the taint goatscaler.io/force-to-delete:true:NoSchedule to the nodes that you want to delete. After you add this taint, node instant scaling will execute the delete operation without checking the pod status or whether the pod has been evicted from the drained node. Use this feature with caution, because it may result in service interruptions or data loss.
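
A minimal sketch of adding this taint with kubectl follows. Note that kubectl expects taints in the form key=value:effect, so the taint above is written with an equals sign between the key and the value. The wrapper name force_delete_node and the sample node name are assumptions. Because the operation is destructive, the script only prints the command by default; set KUBECTL=kubectl to run it against a cluster.

```shell
#!/usr/bin/env bash
# Dry run by default: print the command instead of executing it.
KUBECTL="${KUBECTL:-echo kubectl}"

# Hypothetical wrapper: taint a node so that it is deleted during scale-in.
force_delete_node() {
  local node="$1"
  # kubectl taint syntax: key=value:effect
  $KUBECTL taint nodes "$node" goatscaler.io/force-to-delete=true:NoSchedule
}

# Assumed node name.
force_delete_node my-node
# With the default dry run, prints:
# kubectl taint nodes my-node goatscaler.io/force-to-delete=true:NoSchedule
```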

How do I prevent node instant scaling from removing nodes?

To prevent node instant scaling from removing a node, add the annotation "goatscaler.io/scale-down-disabled": "true" to the node. Run the following command to add the annotation:

kubectl annotate node <nodename> goatscaler.io/scale-down-disabled=true

Can node instant scaling only scale in empty nodes?

You can configure whether to scale in only empty nodes at the node or cluster level, or both. If both are configured, the node-level setting takes precedence.

Node level: Add the label goatscaler.io/scale-down-only-empty:true or goatscaler.io/scale-down-only-empty:false to the node to enable or disable this feature, respectively.
Cluster level: On the Add-ons page in the Container Service Management Console, find the node instant scaling component, and configure ScaleDownOnlyEmptyNodes as true or false to enable or disable this feature as prompted.
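
For the node-level setting, the label can be applied with kubectl, which uses the key=value form for labels. The sketch below is one way to do this; the wrapper name set_scale_down_only_empty and the sample node name are assumptions. By default the script only prints the command; set KUBECTL=kubectl to run it against a cluster.

```shell
#!/usr/bin/env bash
# Dry run by default: print the command instead of executing it.
KUBECTL="${KUBECTL:-echo kubectl}"

# Hypothetical wrapper: enable ("true") or disable ("false") empty-only
# scale-in for a single node.
set_scale_down_only_empty() {
  local node="$1" enabled="$2"
  $KUBECTL label node "$node" --overwrite \
    "goatscaler.io/scale-down-only-empty=$enabled"
}

# Assumed node name.
set_scale_down_only_empty my-node true
# With the default dry run, prints:
# kubectl label node my-node --overwrite goatscaler.io/scale-down-only-empty=true
```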

The node instant scaling component

Are there any operations that trigger the automatic update of the node instant scaling component?

No. Except during system maintenance and upgrades, ACK does not automatically update the node instant scaling component. You need to update it manually on the Add-ons page in the Container Service Management Console.

Why does node scaling still fail after I complete role authorization in the ACK managed cluster?

This may occur because addon.aliyuncsmanagedautoscalerrole.token is missing from the Secret in the kube-system namespace of the cluster. If this token is missing, use one of the following methods to add the token:

Submit a ticket for technical support.
Manually add the AliyunCSManagedAutoScalerRolePolicy permission: By default, ACK assumes the worker RAM role to use the relevant capabilities. Use the following steps to manually assign the AliyunCSManagedAutoScalerRolePolicy permission to the worker role:
1. On the Clusters page, find the cluster that you want to manage and click the name of the cluster. In the left-side pane, click Cluster Information.
2. On the Clusters page, find the cluster that you want to manage and click its name. In the left-side navigation pane, choose Nodes > Node Pools.
3. On the Node Pools page, click Enable next to Node Scaling.
4. Authorize the KubernetesWorkerRole role and the AliyunCSManagedAutoScalerRolePolicy system policy as prompted. The following figure shows the console page on which you can complete the authorization:
5. To apply the new RAM policy, manually restart the cluster-autoscaler or ack-goatscaler Deployment in the kube-system namespace. The cluster-autoscaler Deployment manages node auto scaling, while the ack-goatscaler Deployment handles node instant scaling.
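
The restart in the last step can be done in several ways. One option, sketched below, is kubectl rollout restart. The wrapper name restart_scaler is an assumption for illustration. By default the script only prints the command; set KUBECTL=kubectl to run it against a cluster.

```shell
#!/usr/bin/env bash
# Dry run by default: print the command instead of executing it.
KUBECTL="${KUBECTL:-echo kubectl}"

# Hypothetical wrapper: restart a scaling Deployment in kube-system.
# Pass cluster-autoscaler (node auto scaling) or ack-goatscaler
# (node instant scaling).
restart_scaler() {
  local deployment="$1"
  $KUBECTL rollout restart deployment "$deployment" -n kube-system
}

restart_scaler ack-goatscaler
# With the default dry run, prints:
# kubectl rollout restart deployment ack-goatscaler -n kube-system
```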