When the capacity of your Container Service for Kubernetes (ACK) cluster cannot meet pod scheduling requirements, you can use the node scaling features of ACK to automatically scale node resources. ACK provides two solutions: node auto scaling and node instant scaling. Compared with node auto scaling, node instant scaling is more efficient and easier to use.
Before you begin
To better use the node scaling solutions of ACK and choose the solution that best suits your business, we recommend that you read this topic before you enable the node scaling feature.
Before you read this topic, we recommend that you understand the terms manual scaling, auto scaling, horizontal scaling, and vertical scaling. For more information, see the official Kubernetes documentation.
How node scaling works
Node scaling in Kubernetes differs from the traditional scaling model that is based on resource utilization thresholds. Node scaling is typically an issue that you need to resolve after you migrate your business from data centers or other orchestration systems to Kubernetes.
The traditional scaling model works based on resource utilization. For example, a cluster contains three nodes. When the CPU utilization and memory usage of the nodes exceed the thresholds, the system adds new nodes to the cluster. However, this model has the following issues.
ACK uses the node scaling (resource layer) and workload scaling (scheduling layer) models to fix the preceding issue. The workload scaling model triggers pod (schedulable unit) scaling based on resource utilization. The following sections describe node scaling in detail.
Scaling solutions: node auto scaling and node instant scaling
The node scaling model scales resources at the resource layer. When the size of a cluster cannot meet pod scheduling requirements, the model scales node resources. ACK provides two node scaling solutions.
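To make the two triggers concrete, the following Python sketch contrasts a utilization-threshold check with a scheduling-based check. It is a simplified illustration, not the logic of cluster-autoscaler or ACK GOATScaler: the `Node` fields, the function names, and the 80% threshold are assumptions, and only CPU is considered, whereas a real scheduler also evaluates memory, affinity, taints, and other constraints.

```python
from dataclasses import dataclass


@dataclass
class Node:
    cpu_capacity: float   # allocatable CPU cores on the node
    cpu_requested: float  # sum of CPU requests of pods already on the node
    cpu_used: float       # CPU cores actually in use


def threshold_scale_out(nodes, threshold=0.8):
    """Traditional model: scale out when overall CPU utilization exceeds a threshold."""
    used = sum(n.cpu_used for n in nodes)
    capacity = sum(n.cpu_capacity for n in nodes)
    return used / capacity > threshold


def scheduling_scale_out(nodes, pending_pod_requests):
    """Node scaling model: scale out when a pending pod fits on no existing node."""
    return any(
        all(n.cpu_capacity - n.cpu_requested < request for n in nodes)
        for request in pending_pod_requests
    )


nodes = [Node(4, 3.5, 1.0), Node(4, 3.0, 0.8), Node(4, 3.5, 1.2)]
print(threshold_scale_out(nodes))          # False: utilization is only 25%
print(scheduling_scale_out(nodes, [2.0]))  # True: no node has 2 unrequested cores
```

In this example, the utilization-based check sees no reason to add a node, while the scheduling-based check does, because the pending pod requests more CPU than any node has left to allocate.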
Introduction
The scaling statistics used in this topic are theoretical values, and auto scaling is implemented based on custom images. The actual values depend on your business environment. For more information about custom images, see Create custom images.
Solution | Component | Description |
Solution 1: node auto scaling | cluster-autoscaler | The component checks and updates cluster status in a round-robin manner. When scaling conditions are met, the component automatically scales nodes. |
Solution 2: node instant scaling | ACK GOATScaler | An event-driven node autoscaler. In large clusters or consecutive scale-out scenarios, the component can ensure efficient resource delivery. A cluster is considered large if a node pool with the auto scaling feature enabled in the cluster contains more than 100 nodes or more than 20 node pools in the cluster have the auto scaling feature enabled. The scaling speed (time taken from the initial scheduling failure to a successful scheduling), success rate, and resource fragment reduction of this component are 45 seconds, 99%, and 30%, respectively. This component is more extensible if it is used with custom scaling policies. |
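The definition of a large cluster in the preceding table can be expressed as a simple check. The following Python sketch is illustrative only: the function name and the input format (a list that contains the node count of each node pool with the auto scaling feature enabled) are assumptions and are not part of any ACK API.

```python
def is_large_cluster(autoscaled_pool_sizes):
    """Return True if the cluster is large by the definition in this topic.

    autoscaled_pool_sizes: node count of each node pool that has the
    auto scaling feature enabled (hypothetical input format).
    """
    too_many_pools = len(autoscaled_pool_sizes) > 20
    oversized_pool = any(size > 100 for size in autoscaled_pool_sizes)
    return too_many_pools or oversized_pool


print(is_large_cluster([30, 120]))   # True: one node pool has more than 100 nodes
print(is_large_cluster([10] * 21))   # True: more than 20 node pools use auto scaling
print(is_large_cluster([100] * 20))  # False: neither limit is exceeded
```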
Solution comparison
If a node pool in your cluster has the auto scaling feature enabled and does not use the swift scaling mode, the node instant scaling feature is compatible with the original semantics and behavior of the node pool configuration. In addition, you can seamlessly enable the node instant scaling feature for all types of pods. This section compares node instant scaling with node auto scaling.
Benefit | Node auto scaling | Node instant scaling |
Scaling speeds and efficiency | A scaling activity requires 60 seconds in standard mode and 50 seconds in swift mode. | You can use event-driven scaling together with Alibaba Cloud ContainerOS to accelerate resource scaling. In this case, each scaling activity requires 35 to 55 seconds. |
| When the duration of a scaling activity reaches 1 minute, auto scaling encounters a performance bottleneck. The efficiency of auto scaling fluctuates based on the number of node pools and scaling scenarios. For example, if the number of node pools exceeds 100, the duration of a scaling activity is increased to 100 to 150 seconds. | The duration of a scaling activity does not significantly increase when the number of node pools or pods increases. Therefore, this solution is suitable for scenarios that require fast scaling. |
| This solution uses the round-robin mode and relies on cluster status updates. The minimum latency of auto scaling is 5 seconds. | This solution is event-driven. The latency of auto scaling is 1 to 3 seconds. |
Resource scaling success rate | The inventory of cloud resources changes frequently. Due to issues such as the use of different instance types and insufficient inventory, the success rate of node auto scaling is approximately 97%. | You can configure an auto inventory selection policy that uses predefined filter conditions and priorities to filter out instance types with insufficient inventory from the thousands of Alibaba Cloud instance types. The policy then selects the optimal instance type, or adds other instance types that meet the scale-out conditions when all of the specified instance types have insufficient inventory. This greatly simplifies O&M and increases the success rate of auto scaling to 99%. |
| Scale-out activities are performed based on the instance types specified in the node pool configuration. When multiple instance types are specified, the instance type with the lowest specifications is preferably selected during a scale-out activity. | You can specify multiple instance types for scale-out activities. |
| The autoscaler periodically retries when resource scaling fails. | The autoscaler can generate alerts when the specified instance types are insufficient. |
Use and O&M | Compared with node auto scaling, node instant scaling is easier to use in the following aspects:
| |
Scheduling policy | In addition to the scheduling features provided by the node auto scaling solution, the node instant scaling solution also supports the following features:
| |
The node instant scaling solution can reduce the resource fragment rate by 30% by using the Bin Packing and PreBind (custom feature) policies. |
Limits of node instant scaling
When you use the node instant scaling solution, you also need to understand the limits of node instant scaling.
Node instant scaling does not support the swift mode.
A node pool can contain up to 180 nodes per scale-out batch.
Scale-in cannot be disabled for a specific cluster.
Note: To disable scale-in for a specific node, see How do I prevent node instant scaling from removing nodes?
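The batch limit has a direct effect on large scale-out requests. The following Python sketch shows the arithmetic under the assumption that a request for one node pool is split into as many full batches of 180 nodes as possible, followed by one smaller batch. The function name is hypothetical, and how the component actually forms batches is not described in this topic.

```python
MAX_NODES_PER_BATCH = 180  # per-node-pool batch limit stated in this topic


def scale_out_batches(nodes_needed, batch_limit=MAX_NODES_PER_BATCH):
    """Split a scale-out request for one node pool into batch sizes."""
    if nodes_needed < 0:
        raise ValueError("nodes_needed must not be negative")
    full_batches, remainder = divmod(nodes_needed, batch_limit)
    return [batch_limit] * full_batches + ([remainder] if remainder else [])


print(scale_out_batches(400))  # [180, 180, 40]
```

Under this assumption, a request for 400 nodes in one node pool needs at least three scale-out batches.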
Suggestions on selecting a node scaling solution
After you read the Solution comparison and Limits of node instant scaling sections, if your business is not sensitive to the scaling speeds, scaling success rate, and O&M costs and cannot comply with the limits of node instant scaling, you can use node auto scaling. If your business has the following requirements, we recommend that you use node instant scaling:
When the size of your cluster grows, the efficiency of node auto scaling is severely compromised. If you use large clusters, choose node instant scaling because the cluster size has minor impacts on its scaling efficiency. A cluster is considered large if a node pool with the auto scaling feature enabled in the cluster contains more than 100 nodes or more than 20 node pools in the cluster have the auto scaling feature enabled.
Your business requires faster resource scaling speed. A scaling activity in standard mode of node auto scaling requires 60 seconds and a scaling activity in standard mode of node instant scaling requires only 45 seconds.
The number of scale-out batches is uncontrollable. A node pool is usually involved in multiple consecutive scale-out activities. In consecutive scaling scenarios, the performance of node auto scaling is compromised and the scaling efficiency frequently fluctuates. However, node instant scaling still requires about 45 seconds to complete a scaling activity.
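The suggestions above can be summarized as a small decision helper. The following Python sketch is only one way to encode them: the function and parameter names are hypothetical, and the rule that a conflicting limit takes precedence over the other criteria is an assumption that you should adapt to your own priorities.

```python
def suggest_solution(large_cluster, needs_fast_scaling, consecutive_scale_outs,
                     requires_swift_mode=False, must_disable_cluster_scale_in=False):
    """Suggest a node scaling solution based on the criteria in this topic."""
    # Limits of node instant scaling: no swift mode, and scale-in cannot be
    # disabled for a specific cluster.
    if requires_swift_mode or must_disable_cluster_scale_in:
        return "node auto scaling"
    # Any of the listed requirements favors node instant scaling.
    if large_cluster or needs_fast_scaling or consecutive_scale_outs:
        return "node instant scaling"
    return "node auto scaling"


print(suggest_solution(large_cluster=True, needs_fast_scaling=False,
                       consecutive_scale_outs=False))  # node instant scaling
print(suggest_solution(large_cluster=True, needs_fast_scaling=True,
                       consecutive_scale_outs=True,
                       requires_swift_mode=True))      # node auto scaling
```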
Usage notes
Quotas and limits
You can add up to 200 custom route entries to a route table of a virtual private cloud (VPC). To increase the quota limit, log on to the Quota Center console and submit an application. For more information about the quotas of other resources and how to increase the quota limits, see the Dependent cloud service quotas section of the "Quotas and limits" topic.
We recommend that you properly configure the maximum number of nodes in a node pool with the auto scaling feature enabled. Make sure that the dependent resources and quotas are sufficient for the specified number of nodes, such as the VPC CIDR blocks and vSwitches. Otherwise, scale-out activities may fail. For more information about the maximum number of nodes supported by a node pool with the auto scaling feature enabled, see Enable node auto scaling. For more information about how to plan an ACK network, see Plan the network of an ACK cluster.
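As a rough planning aid, you can compare the maximum number of nodes with the number of IP addresses that the vSwitches of the node pool can provide. The following Python sketch uses only the standard `ipaddress` module. The function names are hypothetical, and both the number of reserved addresses per vSwitch (4 here) and the number of IP addresses consumed per node (1 here) are assumptions that you must verify against your own network plan. The sketch also ignores addresses that are already in use.

```python
import ipaddress


def vswitch_ip_capacity(cidrs, reserved_per_vswitch=4):
    """Sum the usable IP addresses across vSwitch CIDR blocks.

    reserved_per_vswitch is an assumed value; check the VPC documentation.
    """
    return sum(
        max(ipaddress.ip_network(cidr).num_addresses - reserved_per_vswitch, 0)
        for cidr in cidrs
    )


def max_nodes_fits(max_nodes, cidrs, ips_per_node=1, reserved_per_vswitch=4):
    """Check whether the vSwitches can supply enough IP addresses for max_nodes."""
    return max_nodes * ips_per_node <= vswitch_ip_capacity(cidrs, reserved_per_vswitch)


cidrs = ["192.168.0.0/24", "192.168.1.0/24"]
print(vswitch_ip_capacity(cidrs))  # 504
print(max_nodes_fits(500, cidrs))  # True
print(max_nodes_fits(600, cidrs))  # False
```

If pods also draw IP addresses from the same vSwitches, increase `ips_per_node` accordingly before you rely on the result.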
The node scaling feature does not support subscription nodes. If you want to create a node pool with the auto scaling feature enabled, do not set the billing method of the node pool to subscription. If you want to enable the auto scaling feature for an existing node pool, make sure that the node pool does not have subscription nodes.
Maintenance of dependent resources
If elastic IP addresses (EIPs) are associated with ECS nodes added by the node scaling feature, do not directly delete the ECS nodes in the ECS console. Otherwise, the EIPs cannot be automatically released.
References
For more information about node auto scaling and node instant scaling and their differences, see Enable node auto scaling and Enable node instant scaling.
For more information about the frequently asked questions (FAQ) about node scaling, see FAQs about node auto scaling.