This topic provides answers to some frequently asked questions about node auto scaling.
Limits
Unable to accurately estimate node available resources
The available memory of an Elastic Compute Service (ECS) instance is less than the memory size defined in the instance type because the underlying system consumes some resources. For more information, see Why does a purchased instance have a memory size that is different from the memory size defined in the instance type? Due to this constraint, the schedulable resources estimated by cluster-autoscaler may be greater than the resources that are actually schedulable on the node, and cannot be estimated with full accuracy. When you configure the resource requests of pods, take note of the following items.
The resources requested by pods must be less than the resources defined in the instance type, including CPU, memory, and disk space. We recommend that the resources requested by pods do not exceed 70% of the computing resources provided by the node, as shown in the example after this list.
cluster-autoscaler checks only the resource requests of pending pods and pods created by DaemonSets when cluster-autoscaler determines whether the nodes in your cluster can provide sufficient resources for pod scheduling. If the nodes in your cluster have static pods that are not created by DaemonSets, we recommend that you reserve resources for the pods.
If the resources requested by the pods on a node exceed 70% of the computing resources provided by the node, we recommend that you check whether the pods can be scheduled to another node of the same instance type.
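The following is a minimal sketch of a pod whose requests stay below roughly 70% of the node capacity, assuming a node pool whose instance type provides 4 vCores and 16 GiB of memory; the pod name, image, and sizes are placeholders for illustration only.
apiVersion: v1
kind: Pod
metadata:
  name: sized-for-node            # placeholder name
spec:
  containers:
  - name: app
    image: nginx:1.25             # example image
    resources:
      requests:
        cpu: "2"                  # about 50% of an assumed 4-vCore instance type
        memory: 10Gi              # below 70% of an assumed 16 GiB instance type
      limits:
        cpu: "2"
        memory: 10Gi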
Limited scheduling policies are supported
cluster-autoscaler supports only a limited number of scheduling policies to determine whether unschedulable pods can be scheduled to a node pool for which auto-scaling is enabled. For more information, see What scheduling policies does cluster-autoscaler use to determine whether unschedulable pods can be scheduled to a node pool for which auto-scaling is enabled?
You cannot specify which instance type to scale out in a node pool that has multiple instance types
If multiple instance types are configured for your node pool, you cannot specify an instance type for scaling. cluster-autoscaler evaluates the resource capacity of the scaling group based on the least amount of resources that the scaling group can provide. For more information, see How does cluster-autoscaler evaluate the resource capacity of a scaling group that uses multiple types of instances?
Pods that depend on a specific zone cannot be scaled out in a multi-zone node pool
If your node pool spans multiple zones and a pod can run only in a specific zone, for example, because the pod uses a PVC that references a volume in that zone or has a nodeSelector that matches only nodes in that zone, cluster-autoscaler may fail to create a node in the specified zone. For more information about why cluster-autoscaler fails to add nodes, see Why does cluster-autoscaler fail to add nodes after a scale-out activity is triggered?
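For example, a pod similar to the following sketch can be scheduled only in one zone, so cluster-autoscaler must find a node pool that can create nodes in that zone; the pod name, image, and zone ID are placeholders.
apiVersion: v1
kind: Pod
metadata:
  name: zone-bound-pod                          # placeholder name
spec:
  nodeSelector:
    topology.kubernetes.io/zone: cn-hangzhou-k  # assumed zone ID for illustration
  containers:
  - name: app
    image: nginx:1.25                           # example image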
Scale-out behavior
What scheduling policies does cluster-autoscaler use to determine whether unschedulable pods can be scheduled to a node pool for which node auto scaling is enabled?
The following list describes the scheduling policies used by cluster-autoscaler:
PodFitsResources
GeneralPredicates
PodToleratesNodeTaints
MaxGCEPDVolumeCount
NoDiskConflict
CheckNodeCondition
CheckNodeDiskPressure
CheckNodeMemoryPressure
CheckNodePIDPressure
CheckVolumeBinding
MaxAzureDiskVolumeCount
MaxEBSVolumeCount
ready
NoVolumeZoneConflict
What resources can cluster-autoscaler simulate?
cluster-autoscaler can simulate and evaluate the following resources:
cpu
memory
sigma/eni
ephemeral-storage
aliyun.com/gpu-mem (shared GPUs only)
nvidia.com/gpu
For more information, see the How do I add custom resources to node pools for which auto-scaling is enabled? section of this topic.
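As an illustration, the following sketch requests one of the extended resources listed above so that cluster-autoscaler can simulate it against a GPU-capable node pool; the pod name and image are placeholders.
apiVersion: v1
kind: Pod
metadata:
  name: gpu-consumer                              # placeholder name
spec:
  containers:
  - name: cuda
    image: nvidia/cuda:12.2.0-base-ubuntu22.04    # example image
    resources:
      limits:
        nvidia.com/gpu: 1                         # extended resource that cluster-autoscaler can simulate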
Why does cluster-autoscaler fail to add nodes after a scale-out activity is triggered?
Check whether the following scenarios exist:
The instance types in the scaling group cannot fulfill the resource requests of pods. Resources provided by ECS instance types comply with the ECS specifications. ACK reserves a certain amount of node resources to run Kubernetes components and system processes. This ensures that the OS kernel, system services, and Kubernetes daemons can run as expected. However, this causes the amount of allocatable resources of a node to differ from the resource capacity of the node. For more information, see Resource reservation policy.
Resources are used for virtualization or consumed by the operating system during instance creation. For more information, see Why does a purchased instance have a memory size that is different from the memory size defined in the instance type?
Resources are consumed to run components such as kubelet, kube-proxy, Terway, and container runtime. For more information, see Resource reservation policy.
By default, system components are installed for each node. Therefore, the requested pod resources must be less than the resource capacity of the instance type.
Cross-zone scale-out activities cannot be triggered for pods that have limits on zones.
The Resource Access Management (RAM) role does not have the permissions to manage the Kubernetes cluster. You must configure RAM roles for each Kubernetes cluster that is involved in the scale-out activity. For more information about RAM roles, see Prerequisites.
The following issues occur when you enable node auto scaling:
The instance fails to be added to the cluster and a timeout error occurs.
The node is in the NotReady state and a timeout error occurs.
To ensure that nodes can be accurately scaled, cluster-autoscaler does not perform scaling activities until the abnormal nodes are fixed.
How does cluster-autoscaler evaluate the resource capacity of a scaling group that uses multiple types of instances?
For a scaling group that uses multiple types of instances, Auto Scaling evaluates the resource capacity of the scaling group based on the least amount of resources that the scaling group can provide.
For example, a scaling group uses two instance types. One instance type provides 4 vCores and 32 GB of memory, and the other provides 8 vCores and 16 GB of memory. In this scenario, Auto Scaling assumes that the scaling group can add instances that each provide 4 vCores and 16 GB of memory. If a pending pod requests more than 4 vCores or more than 16 GB of memory, the pod is not scheduled.
You still need to take resource reservation into consideration after you specify multiple instance types for a scaling group. For more information, see Why does cluster-autoscaler fail to add nodes after a scale-out activity is triggered?
How do I choose between multiple node pools for which auto scaling is enabled when I perform a scaling activity?
When pods cannot be scheduled to nodes, the auto scaling component simulates the scheduling of the pods based on the configurations of scaling groups. The configurations include labels, taints, and instance specifications. If a scaling group meets the requirements, this scaling group is selected for the scale-out activity. If multiple node pools for which auto-scaling is enabled meet the scheduling conditions during the simulation, cluster-autoscaler applies the least-waste principle. The node pool that has the least resources left after nodes are added to the cluster is selected.
How do I add custom resources to node pools for which auto-scaling is enabled?
To enable cluster-autoscaler to identify custom resources provided by a node pool for which auto scaling is enabled, or to identify the amounts of specific resource types provided by the node pool, you must add an ECS tag in the following format to the node pool.
k8s.io/cluster-autoscaler/node-template/resource/{Resource name}:{Resource amount}
Example:
k8s.io/cluster-autoscaler/node-template/resource/hugepages-1Gi:2Gi
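With that ECS tag added to the node pool, a pending pod that requests the declared resource can trigger a scale-out, as in the following sketch; the pod name and image are placeholders, and the hugepages amount matches the tag above.
apiVersion: v1
kind: Pod
metadata:
  name: hugepages-consumer        # placeholder name
spec:
  containers:
  - name: app
    image: nginx:1.25             # example image
    resources:
      requests:
        hugepages-1Gi: 2Gi        # matches the amount declared in the ECS tag
        memory: 2Gi               # hugepages requests must be accompanied by a memory or CPU request
      limits:
        hugepages-1Gi: 2Gi        # hugepages requests must equal the limits
        memory: 2Gi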
Scale-in behavior
Why does cluster-autoscaler fail to remove nodes after a scale-in activity is triggered?
Check whether the following scenarios exist:
The ratio of the resources requested by the pods on the node to the resource capacity of the node is higher than the specified scale-in threshold.
Pods that belong to the kube-system namespace are running on the node.
A scheduling policy forces the pods to run on the current node. In this case, the pods cannot be scheduled to other nodes.
PodDisruptionBudget is set for the pods on the node and the minimum value required by the PodDisruptionBudget has been reached (see the example after this list).
For more information, see the autoscaler FAQ.
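For reference, the following is a minimal PodDisruptionBudget sketch; if evicting the pods on a node would drop the number of available replicas below minAvailable, cluster-autoscaler does not remove the node. The names and label selector are placeholders.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: app-pdb                   # placeholder name
spec:
  minAvailable: 2                 # scale-in is blocked if eviction would violate this budget
  selector:
    matchLabels:
      app: my-app                 # placeholder label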
How do I enable or disable pod eviction for a DaemonSet pod?
cluster-autoscaler decides whether to evict DaemonSet pods based on the Evict DaemonSet Pods setting. The setting takes effect on all DaemonSet pods in the cluster. For more information, see Step 1: Enable node auto scaling. To enable pod eviction for a DaemonSet pod, add the "cluster-autoscaler.kubernetes.io/enable-ds-eviction":"true" annotation to the configuration of the pod. To disable pod eviction for a DaemonSet pod, add the "cluster-autoscaler.kubernetes.io/enable-ds-eviction":"false" annotation to the configuration of the pod.
If the DaemonSet pod eviction feature is disabled, DaemonSet pods that have the annotation are evicted only if the node hosts pods other than DaemonSet pods. If you want to use the annotation to evict a node that hosts only DaemonSet pods, you must first enable the DaemonSet pod eviction feature.
You must add the preceding annotation to the DaemonSet pod instead of the DaemonSet.
This annotation does not take effect on pods that are not created by DaemonSets.
By default, cluster-autoscaler does not delay other tasks when it evicts DaemonSet pods. DaemonSet pod eviction is performed in parallel with other tasks. If you want cluster-autoscaler to wait until all DaemonSet pods are evicted, add the "cluster-autoscaler.kubernetes.io/wait-until-evicted":"true" annotation to the pod configuration.
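Because the annotations must be set on the DaemonSet pods rather than on the DaemonSet object itself, they are typically added to the pod template, as in the following sketch; the DaemonSet name, labels, and image are placeholders.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: example-agent                 # placeholder name
spec:
  selector:
    matchLabels:
      app: example-agent
  template:
    metadata:
      labels:
        app: example-agent
      annotations:
        cluster-autoscaler.kubernetes.io/enable-ds-eviction: "true"   # evict these DaemonSet pods during scale-in
        cluster-autoscaler.kubernetes.io/wait-until-evicted: "true"   # optional: wait until eviction completes
    spec:
      containers:
      - name: agent
        image: nginx:1.25             # example image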
What types of pods can prevent cluster-autoscaler from removing nodes?
If pods are not created by native Kubernetes controllers, such as Deployments, ReplicaSets, Jobs, and StatefulSets, or if pods on a node cannot be safely terminated or migrated, cluster-autoscaler may prevent the node from being removed. For more information, see What types of pods can prevent CA from removing a node?
Extended support
Does cluster-autoscaler support CRD?
cluster-autoscaler supports only standard Kubernetes objects and does not support Kubernetes CustomResourceDefinitions (CRDs).
Use pods to manage scaling
How do I set a scale-out delay in cluster-autoscaler for unschedulable pods?
You can add the cluster-autoscaler.kubernetes.io/pod-scale-up-delay annotation to set a scale-out delay for all pods. If pods are still unschedulable after the delay ends, cluster-autoscaler may add nodes to schedule the pods. Example: "cluster-autoscaler.kubernetes.io/pod-scale-up-delay": "600s".
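For example, the following sketch delays scale-out for a pod by 600 seconds; the pod name and image are placeholders.
apiVersion: v1
kind: Pod
metadata:
  name: delayed-scale-out           # placeholder name
  annotations:
    cluster-autoscaler.kubernetes.io/pod-scale-up-delay: "600s"   # wait 600s before a scale-out is triggered for this pod
spec:
  containers:
  - name: app
    image: nginx:1.25               # example image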
How do I use pod annotations to allow cluster-autoscaler to remove the node that hosts the pod or prevent cluster-autoscaler from removing the node that hosts the pod?
You can use pod annotations to allow cluster-autoscaler to remove the node that hosts the pod or to prevent cluster-autoscaler from removing the node.
To prevent cluster-autoscaler from removing the node that hosts a pod, add the "cluster-autoscaler.kubernetes.io/safe-to-evict": "false" annotation to the pod configuration. To allow cluster-autoscaler to remove the node that hosts a pod, add the "cluster-autoscaler.kubernetes.io/safe-to-evict": "true" annotation to the pod configuration.
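For example, the following sketch prevents cluster-autoscaler from removing the node that hosts the pod; set the annotation value to "true" instead to allow removal. The pod name and image are placeholders.
apiVersion: v1
kind: Pod
metadata:
  name: keep-hosting-node           # placeholder name
  annotations:
    cluster-autoscaler.kubernetes.io/safe-to-evict: "false"   # the hosting node is not removed during scale-in
spec:
  containers:
  - name: app
    image: nginx:1.25               # example image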
Use nodes to manage scaling
How do I prevent cluster-autoscaler from removing nodes?
To prevent a node from being removed by cluster-autoscaler, add the "cluster-autoscaler.kubernetes.io/scale-down-disabled": "true" annotation to the node configuration. Run the following command to add the annotation:
kubectl annotate node <nodename> cluster-autoscaler.kubernetes.io/scale-down-disabled=true
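To allow the node to be removed again, you can delete the annotation by using the standard kubectl syntax for removing an annotation (a trailing hyphen after the key):
kubectl annotate node <nodename> cluster-autoscaler.kubernetes.io/scale-down-disabled-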
Questions related to cluster-autoscaler
How do I update cluster-autoscaler to the latest version?
For a cluster for which node auto scaling is enabled, perform the following steps to update cluster-autoscaler:
Log on to the ACK console. In the left-side navigation pane, click Clusters.
On the Clusters page, find the cluster that you want to manage and click its name. In the left-side navigation pane, choose Nodes > Node Pools.
Click Edit on the right side of Node Scaling, and then click OK at the bottom of the panel to update cluster-autoscaler to the latest version.
What operations can trigger the system to automatically update cluster-autoscaler?
To ensure that the configurations of cluster-autoscaler are up to date and its version is compatible with the cluster, the following operations can trigger the system to automatically update cluster-autoscaler:
The auto scaling configuration is updated.
Node pools for which auto scaling is enabled are created, deleted, or updated.
The cluster is updated.
Why does node scaling still fail after I complete role authorization in the ACK managed cluster?
This issue may be caused by the absence of the addon.aliyuncsmanagedautoscalerrole.token key in the Secret in the kube-system namespace of the cluster. If this token is missing, use one of the following methods to add it:
Submit a ticket for technical support.
Manually add the AliyunCSManagedAutoScalerRolePolicy permission: By default, ACK assumes the worker RAM role to use the relevant capabilities. Use the following steps to manually assign the AliyunCSManagedAutoScalerRolePolicy permission to the worker role:
Log on to the ACK console. On the Clusters page, find the cluster that you want to manage and click its name. In the left-side navigation pane, choose Nodes > Node Pools.
On the Node Pools page, click Enable next to Node Scaling.
Authorize the KubernetesWorkerRole role and attach the AliyunCSManagedAutoScalerRolePolicy system policy as prompted.
To apply the new RAM policy, manually restart the cluster-autoscaler or ack-goatscaler Deployment in the kube-system namespace. cluster-autoscaler manages node auto scaling, and ack-goatscaler handles node instant scaling.
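For example, assuming the Deployment is named cluster-autoscaler (replace the name with ack-goatscaler if your cluster uses node instant scaling), you can restart it with the following command:
kubectl -n kube-system rollout restart deployment cluster-autoscaler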