Container Service for Kubernetes: Overview of node scaling

Last Updated: Jul 19, 2024

When the size of your Container Service for Kubernetes (ACK) cluster cannot meet the pod scheduling requirements, you can use the node scaling features of ACK to automatically scale node resources. ACK provides the node auto scaling and node instant scaling solutions. Compared with node auto scaling, node instant scaling is more efficient and easier to use.

Before you begin

To better use the node scaling solutions of ACK and choose the solution that best suits your business, we recommend that you read this topic before you enable the node scaling feature.

Before you read this topic, we recommend that you understand the concepts of manual scaling, auto scaling, horizontal scaling, and vertical scaling. For more information, see the Kubernetes official documentation.

How node scaling works

Node scaling in Kubernetes differs from the traditional scaling model that is based on resource utilization thresholds. Typically, you need to rethink how node scaling works after you migrate your business from data centers or other orchestration systems to Kubernetes.

The traditional scaling model works based on resource utilization. For example, a cluster contains three nodes. When the CPU utilization and memory usage of the nodes exceed the thresholds, the system adds new nodes to the cluster. However, this model has the following issues.

How are scaling thresholds determined?

The resource utilization of hot nodes in the cluster is usually higher than that of other nodes.

  • If resource scaling is triggered based on the average resource utilization of nodes in the cluster, the high utilization of hot nodes is averaged out by the other nodes. Consequently, resources cannot be scaled out promptly when the resource utilization of hot nodes exceeds the threshold.

  • If resource scaling is triggered based on the highest resource utilization, resource waste usually occurs. This adversely affects the entire cluster.

How are loads reduced after nodes are added?

In a Kubernetes cluster, pods are the smallest deployable units for applications. Pods are deployed on different nodes. When the resource utilization of a pod is high, even if the node or cluster that hosts the pod is scaled out, the number of pods deployed for the application and the resource limits of the pods remain unchanged. In this case, the load on the original node cannot be distributed to the newly added nodes.

How is node scaling triggered and performed?

If resource scaling is triggered based on resource utilization, pods with large resource requests but low resource usage may be evicted. When the cluster contains a large number of such pods, the schedulable resources in the cluster are exhausted. Consequently, some pods become unschedulable.

ACK uses the node scaling (resource layer) and workload scaling (scheduling layer) models to fix the preceding issues. The workload scaling model triggers pod (schedulable unit) scaling based on resource utilization, and the node scaling model provisions or reclaims node resources based on whether pods can be scheduled. The following sections describe node scaling in detail.

How are scale-out activities triggered?

The node scaling model listens to pods that fail to be scheduled to determine whether a scale-out activity is needed. When pods fail to be scheduled due to insufficient resources, the node scaling model simulates pod scheduling, selects a node pool that has the auto scaling feature enabled and can provide the required resources to host these pods, and adds nodes from the node pool to the cluster.
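
For example, the following Deployment is a minimal sketch (the name, image, and resource values are hypothetical) whose pods request more resources than the existing nodes can provide. The pods remain in the Pending state with FailedScheduling events, which is the signal that the node scaling model listens to before it simulates scheduling and scales out a node pool.

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: demo-app                # hypothetical workload name
    spec:
      replicas: 10
      selector:
        matchLabels:
          app: demo-app
      template:
        metadata:
          labels:
            app: demo-app
        spec:
          containers:
          - name: demo-app
            image: nginx:1.25       # any application image
            resources:
              requests:
                cpu: "2"            # if the existing nodes cannot provide 2 vCPUs and 4 GiB
                memory: 4Gi         # per pod, the pods become unschedulable and a scale-out
                                    # activity is triggered for a matching node pool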

Note

The scheduling simulation treats each node pool with the auto scaling feature enabled as an abstracted node. The instance types specified in the configuration of the node pool are abstracted into the CPU capacity, memory capacity, and GPU capacity of the node. In addition, the labels and taints of the node pool are mapped to the labels and taints of the node. The scheduler adds the abstracted node to the schedulable list during the scheduling simulation. When the scheduling conditions are met, the scheduler calculates the required number of nodes and adds nodes in the node pool to the cluster.

How are scale-in activities triggered?

The node scaling model scales in nodes only in node pools that have the auto scaling feature enabled. It cannot manage static nodes, such as nodes that do not belong to a node pool with the auto scaling feature enabled. The node scaling model matches each node against the scale-in conditions. When the resource utilization of a node is lower than the scale-in threshold, a scale-in activity is triggered. Then, the node scaling model simulates pod eviction on the node to check whether the node can be drained. Nodes that host non-DaemonSet pods in the kube-system namespace or pods protected by PodDisruptionBudgets are skipped, and other candidate nodes are evaluated instead. A node is drained before it is removed. After the pods on the node are evicted to other nodes, the node is removed.
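
For example, a PodDisruptionBudget such as the following (the names are hypothetical) limits how many pods of an application can be evicted at the same time. If draining a node would violate the budget, the node is skipped during scale-in.

    apiVersion: policy/v1
    kind: PodDisruptionBudget
    metadata:
      name: demo-app-pdb            # hypothetical name
    spec:
      minAvailable: 2               # at least 2 pods of the application must remain available
      selector:
        matchLabels:
          app: demo-app             # selects the pods of the application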

How is a node pool selected when multiple node pools with the auto scaling feature enabled exist?

Choosing among multiple node pools with the auto scaling feature enabled is equivalent to choosing among multiple abstracted nodes, and the same scheduling policy is applied. Scores are assigned to the node pools with the auto scaling feature enabled. The autoscaler first matches the abstracted nodes against the scheduling policy and then selects nodes based on affinity policies, such as node affinity.

If no proper node can be selected based on the preceding policies, the node auto scaling feature filters nodes based on the least-waste principle. The key idea of the least-waste principle is to identify the node that has the least idle resources after the scale-out activity.
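
For example, suppose a pending pod requests 2 vCPUs and 4 GiB of memory, and two node pools can host it: one provisions 4 vCPU, 8 GiB nodes and the other provisions 8 vCPU, 16 GiB nodes. After the simulated scale-out, the smaller node leaves 2 vCPUs and 4 GiB idle, whereas the larger node leaves 6 vCPUs and 12 GiB idle. Under the least-waste principle, the node pool with the 4 vCPU, 8 GiB instance type is selected. (The numbers are illustrative only.)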

Note

When a GPU-accelerated node pool and a CPU node pool both have the auto scaling feature enabled and both meet the scale-out conditions, the CPU node pool is prioritized.

By default, the node instant scaling feature compares the inventory and costs of different scale-out solutions and selects the solution with the highest inventory and lowest costs.

How can I improve the success rate of auto scaling?

The success rate of auto scaling depends on the following factors:

  • Whether the scheduling conditions are met

    After you create a node pool with the auto scaling feature enabled, confirm the pod scheduling policy that suits the node pool. If you cannot confirm the policy, configure a nodeSelector that selects the label of the node pool and perform a scheduling simulation (see the example after this list).

  • Whether resources are sufficient

    After the scheduling simulation is complete, the system automatically selects the node pool with the auto scaling feature enabled and adds nodes in the node pool to the cluster. However, the inventory of Elastic Compute Service (ECS) instance types specified in the node pool configuration affects the success rate of the scale-out activity. Therefore, we recommend that you specify multiple instance types across different zones to improve the success rate.
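
For example, if you configure a node label such as workload-type=auto-scaling on the node pool (the label key and value here are hypothetical), you can add a matching nodeSelector to the pod spec so that pending pods can be matched to the node pool during the scheduling simulation:

    apiVersion: v1
    kind: Pod
    metadata:
      name: demo-pod                    # hypothetical pod for illustration
    spec:
      nodeSelector:
        workload-type: auto-scaling     # hypothetical label configured on the node pool
      containers:
      - name: app
        image: nginx:1.25
        resources:
          requests:
            cpu: "1"
            memory: 2Gi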

How can I accelerate auto scaling?

  • Method 1: Use the swift mode to accelerate auto scaling. After a node pool with the auto scaling feature enabled warms up by completing a scale-out activity and a scale-in activity, the node pool runs in swift mode. For more information, see Enable node auto scaling.

  • Method 2: Use a custom image based on Alibaba Cloud Linux 3 to improve the efficiency of resource delivery at the Infrastructure as a Service (IaaS) layer by 50%. For more information, see Create custom images.

Scaling solutions: node auto scaling and node instant scaling

The node scaling model scales resources at the resource layer. When the size of a cluster cannot meet pod scheduling requirements, the model scales node resources. ACK provides two node scaling solutions.

Introduction

Important
  • The scaling statistics used in this topic are theoretical values measured when auto scaling is implemented based on custom images. The actual statistics in your business environment prevail. For more information about custom images, see Create custom images.

  • To enable node instant scaling, make sure that you are added to the whitelist. If you are not in the whitelist, submit a ticket and describe your business scenario in the ticket.

Solution 1: node auto scaling

  • Component: cluster-autoscaler

  • Description: The component checks and updates the cluster status in a round-robin manner. When scaling conditions are met, the component automatically scales nodes.

Solution 2: node instant scaling

  • Component: ACK GOATScaler

  • Description: An event-driven node autoscaler. In large clusters or consecutive scale-out scenarios, the component ensures efficient resource delivery. A cluster is considered large if a node pool with the auto scaling feature enabled in the cluster contains more than 100 nodes or more than 20 node pools in the cluster have the auto scaling feature enabled. The scaling speed (the time from the initial scheduling failure to successful scheduling) is 45 seconds, the success rate is 99%, and the resource fragment reduction is 30%. This component is more extensible when it is used with custom scaling policies.

Solution comparison

If a node pool in your cluster has the auto scaling feature enabled and does not use the swift scaling mode, the node instant scaling feature is compatible with the original semantics and behavior of the node pool configuration. In addition, you can seamlessly enable the node instant scaling feature for all types of pods. This section compares node instant scaling with node auto scaling.

Scaling speed and efficiency

  • Node auto scaling: A scaling activity requires 60 seconds in standard mode and 50 seconds in swift mode. When the duration of a scaling activity reaches 1 minute, auto scaling encounters a performance bottleneck. The efficiency of auto scaling also fluctuates based on the number of node pools and the scaling scenario. For example, if the number of node pools exceeds 100, the duration of a scaling activity increases to 100 to 150 seconds. This solution uses the round-robin mode and relies on cluster status updates. The minimum scaling latency is 5 seconds.

  • Node instant scaling: You can use event-driven scaling together with Alibaba Cloud ContainerOS to accelerate resource scaling. In this case, each scaling activity requires 35 to 55 seconds. The duration of a scaling activity does not significantly increase when the number of node pools or pods increases. Therefore, this solution is suitable for scenarios that require fast scaling. This solution is event-driven, and the scaling latency is 1 to 3 seconds.

Resource scaling success rate

  • Node auto scaling: The inventory of cloud resources changes frequently. Due to issues such as the use of different instance types and insufficient inventory, the success rate of node auto scaling is approximately 97%. Scale-out activities are performed based on the instance types specified in the node pool configuration. When multiple instance types are specified, the instance type with the lowest specifications is preferred during a scale-out activity. The autoscaler periodically retries when resource scaling fails.

  • Node instant scaling: You can configure an automatic inventory selection policy that filters out instance types with insufficient inventory from thousands of Alibaba Cloud instance types based on predefined filter conditions and priorities, selects the optimal instance type, and adds other instance types that meet the scale-out conditions when all of the specified instance types are out of stock. This greatly simplifies O&M work and increases the success rate of auto scaling to 99%. You can specify multiple instance types for scale-out activities. The autoscaler can generate alerts when the specified instance types are insufficient.

Use and O&M

Compared with node auto scaling, node instant scaling is easier to use in the following aspects:

  • Node pool configuration maintenance: The node instant scaling solution can automatically select instances across instance types and zones based on instance attributes to host pending pods. However, when node auto scaling is used, you need to manually maintain node pool configurations to ensure that all pods can be scheduled. Therefore, when the pod configuration is updated, you must update the node pool configuration accordingly.

  • Node O&M: Exceptions that occur during scaling activities are reported through pod events. This allows developers to focus on pod lifecycle management.

  • Feature extension: Both solutions can be used with the Descheduler to prepare elastic resources. The node instant scaling solution is non-intrusive and allows you to define custom actions in resource provisioning policies and node lifecycle management to support secondary development.

Scheduling policy

In addition to the scheduling features provided by the node auto scaling solution, the node instant scaling solution also supports the following features:

  • Topology: This feature can meet cross-zone high availability requirements (see the example after this list).

  • Pod Disruption Budgets: This feature can limit the number of pods that encounter voluntary disruptions at the same time in an application that has multiple pods.
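
For example, the following Deployment (the names and values are hypothetical) uses a topology spread constraint to distribute replicas evenly across zones; node instant scaling takes such constraints into account when it selects zones for new nodes:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: demo-app                                   # hypothetical name
    spec:
      replicas: 6
      selector:
        matchLabels:
          app: demo-app
      template:
        metadata:
          labels:
            app: demo-app
        spec:
          topologySpreadConstraints:
          - maxSkew: 1
            topologyKey: topology.kubernetes.io/zone   # spread pods evenly across zones
            whenUnsatisfiable: DoNotSchedule
            labelSelector:
              matchLabels:
                app: demo-app
          containers:
          - name: demo-app
            image: nginx:1.25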

The node instant scaling solution can reduce the resource fragment rate to 30% by using the Bin Packing and PreBind (custom feature) policies.

Limits of node instant scaling

Before you use the node instant scaling solution, take note of the following limits:

  • Node instant scaling does not support the swift mode.

  • A node pool can contain up to 180 nodes per scale-out batch.

  • The following custom settings are not supported:

    • Disable scale-in

    • Custom GPU scale-in threshold

    • Custom scale-out policy

Suggestions on selecting a node scaling solution

After you read the Solution comparison and Limits of node instant scaling sections, if your business is not sensitive to scaling speed, scaling success rate, and O&M costs, or if it cannot comply with the limits of node instant scaling, you can use node auto scaling. If your business has the following requirements, we recommend that you use node instant scaling:

  • When the size of your cluster grows, the efficiency of node auto scaling is severely compromised. If you use large clusters, choose node instant scaling because the cluster size has minor impacts on its scaling efficiency. A cluster is considered large if a node pool with the auto scaling feature enabled in the cluster contains more than 100 nodes or more than 20 node pools in the cluster have the auto scaling feature enabled.

  • Your business requires faster resource scaling speed. A scaling activity in standard mode of node auto scaling requires 60 seconds and a scaling activity in standard mode of node instant scaling requires only 45 seconds.

  • The number of scale-out batches is uncontrollable. A node pool is usually involved in multiple consecutive scale-out activities. In consecutive scaling scenarios, the performance of node auto scaling is compromised and the scaling efficiency frequently fluctuates. However, node instant scaling still requires about 45 seconds to complete a scaling activity.

Usage notes

Quotas and limits

  • You can add up to 200 custom route entries to a route table of a virtual private cloud (VPC). To increase the quota limit, log on to the Quota Center console and submit an application. For more information about the quotas of other resources and how to increase the quota limits, see the Dependent cloud service quotas section of the "Quotas and limits" topic.

  • We recommend that you properly configure the maximum number of nodes in a node pool with the auto scaling feature enabled. Make sure that the dependent resources and quotas are sufficient for the specified number of nodes, such as the VPC CIDR blocks and vSwitches. Otherwise, scale-out activities may fail. For more information about the maximum number of nodes supported by a node pool with the auto scaling feature enabled, see Enable node auto scaling. For more information about how to plan an ACK network, see Plan the network of an ACK cluster.

  • The node scaling feature does not support subscription nodes. If you want to create a node pool with the auto scaling feature enabled, do not set the billing method of the node pool to subscription. If you want to enable the auto scaling feature for an existing node pool, make sure that the node pool does not have subscription nodes.

Maintenance of dependent resources

If elastic IP addresses (EIPs) are associated with ECS nodes added by the node scaling feature, do not directly delete the ECS nodes in the ECS console. Otherwise, the EIPs cannot be automatically released.

References