How does Auto Scaling work? - Auto Scaling - Alibaba Cloud Documentation Center

This topic describes the Auto Scaling workflow and how to configure scaling modes. It also includes workflow diagrams for Auto Scaling.

Auto Scaling manages scaling groups containing Elastic Compute Service (ECS) instances and elastic container instances in the same way. This topic uses ECS instances to illustrate the Auto Scaling workflow. For more information about ECS instances and elastic container instances, see What is ECS? and What is Elastic Container Instance?

How Auto Scaling works

The following figure shows how Auto Scaling adds ECS instances.

In this example, a web application with a three-tier architecture is used. ECS instances process requests, as indicated by the dotted line box on the right side of the previous figure. In the architecture, the Server Load Balancer (SLB) instance at the top layer forwards client requests to ECS instances in the scaling group at the middle layer. The ECS instances process the requests, while ApsaraDB RDS instances at the bottom layer store the service data.

You can use Auto Scaling to adjust the number of ECS instances at the middle layer based on your business requirements. The following procedure describes how Auto Scaling adjusts the number of ECS instances:

Auto Scaling executes scaling activities when the conditions specified in the scaling modes are met. The following table describes the supported scaling modes. For information about how to configure scaling modes, see Configure scaling modes.

You can combine the scaling modes described in the following table based on your business requirements. For example, if your workload increases significantly at 12:00 p.m. daily, you can configure a scheduled task to automatically create 20 ECS instances at that time. Because 20 instances may turn out to be more or fewer than the actual load requires, you can combine scheduled mode with other scaling modes, such as dynamic mode and custom mode, to close the gap between the created instances and your needs.

Scaling mode	Description	User guide	API references
Fixed-quantity mode	After you configure the Minimum Number of Instances parameter during scaling group creation, Auto Scaling will automatically add ECS instances to meet the specified minimum if the total number of ECS instances falls below this minimum. After you configure the Maximum Number of Instances parameter during scaling group creation, Auto Scaling will automatically remove ECS instances if the total number exceeds the specified maximum, reducing the instance count to the set value. If you configure the Expected Number of Instances parameter during scaling group creation, Auto Scaling will automatically adjust the number of ECS instances in the scaling group to match the specified value.	Configure scaling groups	CreateScalingGroup
Health mode	If you enable the health check feature during scaling group creation, Auto Scaling will monitor ECS instances at specified intervals. If an ECS instance is found to be unhealthy, Auto Scaling will remove it from the scaling group. Note: The health check feature is an integral component of scaling groups. If an SLB instance with the health check feature enabled is attached to your scaling group, the health check features of both the scaling group and the SLB instance will be active simultaneously. The SLB instance can be either a Classic Load Balancer (CLB) instance or an Application Load Balancer (ALB) instance.	Configure scaling groups	CreateScalingGroup
Scheduled mode	You can create a scheduled task to automatically execute a scaling rule at a designated time point.	Configure a scheduled task	CreateScheduledTask
Custom mode	You can manually perform scaling actions by executing scaling rules or adding, removing, or deleting ECS instances.	Configure scaling rules; Manually configure instances for a scaling group	ExecuteScalingRule, AttachInstances, DetachInstances, RemoveInstances
Dynamic mode	You can create an event-triggered task based on a performance metric monitored by CloudMonitor, such as CPU utilization. When the metric value of a scaling group meets the alert condition, an alert is triggered, and the corresponding scaling rule is executed. For example, if the average CPU utilization of all ECS instances in a scaling group exceeds 60%, the alert is triggered, and the scaling action is performed.	Manage event-triggered tasks	CreateAlarm
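The fixed-quantity behavior in the table can be illustrated with a minimal sketch. It is not the Auto Scaling implementation. The function name `target_instance_count` and its parameters are hypothetical, and the check that the expected number lies between the minimum and maximum is an assumption.

```python
from typing import Optional


def target_instance_count(current: int, minimum: int, maximum: int,
                          expected: Optional[int] = None) -> int:
    """Return the instance count that a fixed-quantity check would converge to."""
    if minimum > maximum:
        raise ValueError("minimum must not exceed maximum")
    if expected is not None:
        # Assumption: the expected number must itself lie within [minimum, maximum].
        if not minimum <= expected <= maximum:
            raise ValueError("expected must lie between minimum and maximum")
        # With an expected number set, the group is adjusted to match it.
        return expected
    # Without an expected number, only the two bounds are enforced.
    return max(minimum, min(current, maximum))


print(target_instance_count(current=1, minimum=2, maximum=10))    # 2: raised to the minimum
print(target_instance_count(current=12, minimum=2, maximum=10))   # 10: reduced to the maximum
print(target_instance_count(current=5, minimum=2, maximum=10, expected=8))  # 8
```

The difference between the returned value and `current` is the number of instances that would be added (positive) or removed (negative).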

Auto Scaling calls the ExecuteScalingRule API operation to execute scaling activities. This API operation must include the unique identifier of the scaling rule that you want to execute. Example: ari:acs:ess:cn-hangzhou:140692647406****:scalingrule/asr-bp1dvirgwkoowxk7****.
- If you create a scaling rule in the Auto Scaling console, you can find the scaling rule in the scaling rule list and click the ID of the scaling rule in the Scaling Rule ID/Name column. On the page that appears, you can view the unique identifier of the scaling rule. Example: asr-bp14u7kzh8442w9z****. For more information about how to create scaling rules, see Configure scaling rules.
- If you create a scaling rule by calling an API operation, you can call the DescribeScalingRules API operation to query the unique identifier of the scaling rule.
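The example identifier above has a colon-separated layout that ends with a `scalingrule/<rule ID>` segment. The following sketch splits such a string into its parts. The field names (`service`, `region`, `account_id`, and so on) are assumptions inferred from the example alone, not a documented grammar, and `parse_scaling_rule_ari` is a hypothetical helper.

```python
from typing import NamedTuple


class ScalingRuleAri(NamedTuple):
    # Field names are inferred from the example identifier (assumption).
    service: str
    region: str
    account_id: str
    resource_type: str
    rule_id: str


def parse_scaling_rule_ari(ari: str) -> ScalingRuleAri:
    """Split an identifier shaped like the example in this topic."""
    parts = ari.split(":")
    if len(parts) != 6 or parts[0] != "ari" or parts[1] != "acs":
        raise ValueError(f"unexpected identifier layout: {ari!r}")
    _, _, service, region, account_id, resource = parts
    # The last segment looks like "scalingrule/asr-...".
    resource_type, separator, rule_id = resource.partition("/")
    if not separator or not rule_id:
        raise ValueError(f"missing rule ID in: {ari!r}")
    return ScalingRuleAri(service, region, account_id, resource_type, rule_id)


parsed = parse_scaling_rule_ari(
    "ari:acs:ess:cn-hangzhou:140692647406****:scalingrule/asr-bp1dvirgwkoowxk7****"
)
print(parsed.region)   # cn-hangzhou
print(parsed.rule_id)  # asr-bp1dvirgwkoowxk7****
```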
Auto Scaling uses the unique identifier to retrieve information about the scaling rule, including the associated scaling group and scaling configuration. It then initiates a scaling activity based on the information.
1. Auto Scaling uses the unique identifier to retrieve information about the scaling rule and its associated scaling group. It then determines the required number of ECS instances based on your business needs. Additionally, Auto Scaling can query information about the SLB instance and the ApsaraDB RDS instances to which the ECS instances will be attached.
2. Auto Scaling retrieves the scaling configuration details of the scaling group, including the vCPUs, memory size, and bandwidth required to create ECS instances.
3. Auto Scaling initiates a scaling activity based on the required number of ECS instances, the instance configuration source, and the SLB instance and ApsaraDB RDS instances to which the ECS instances will be attached.
During scaling, Auto Scaling automatically creates ECS instances and attaches them to the SLB instance and the ApsaraDB RDS instances.
1. Auto Scaling provisions the required number of ECS instances based on the instance configuration source.
2. The private IP addresses of the ECS instances are added to the ApsaraDB RDS instance whitelists, and the ECS instances are registered as backend servers for the SLB instance.
After the scaling activity is complete, Auto Scaling activates the cooldown period for the scaling group.
The scaling group can process new scaling requests only after the cooldown period ends.
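The scale-out steps and the cooldown period described above can be sketched as a small in-memory simulation. This is an illustration under stated assumptions, not the service's implementation. `ScalingGroupSim`, the generated instance IDs, and the `10.0.0.x` addresses are all hypothetical, and the activity is treated as instantaneous so the cooldown starts at the moment of the request.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ScalingGroupSim:
    """Toy model of a scaling group with an attached SLB and RDS instance."""
    cooldown_seconds: float
    instances: List[str] = field(default_factory=list)
    rds_whitelist: List[str] = field(default_factory=list)
    slb_backends: List[str] = field(default_factory=list)
    cooldown_until: float = 0.0

    def scale_out(self, count: int, now: float) -> bool:
        """Add `count` instances unless the group is still cooling down."""
        if now < self.cooldown_until:
            return False  # new scaling requests wait until the cooldown ends
        for _ in range(count):
            index = len(self.instances) + 1
            instance_id = f"i-sim-{index}"    # hypothetical instance ID
            private_ip = f"10.0.0.{index}"    # hypothetical private IP address
            self.instances.append(instance_id)     # provision the instance
            self.rds_whitelist.append(private_ip)  # allow access to the database
            self.slb_backends.append(instance_id)  # register as a backend server
        # The activity is complete, so the cooldown period begins.
        self.cooldown_until = now + self.cooldown_seconds
        return True


group = ScalingGroupSim(cooldown_seconds=300)
print(group.scale_out(2, now=0))    # True: two instances are added
print(group.scale_out(1, now=100))  # False: still inside the cooldown period
print(group.scale_out(1, now=300))  # True: the cooldown period has ended
print(len(group.instances))         # 3
```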

Configure scaling modes

Auto Scaling automatically adjusts the number of ECS instances in your scaling group based on your configurations, adding or removing instances as needed. You can configure scaling modes described in the following table in Auto Scaling to execute scaling activities.

Scaling mode	Configuration method	Description
Fixed-quantity mode	Scaling group + Instance configuration source^①	The scaling effect in fixed-quantity mode depends on the settings of the following parameters: Minimum Number of Instances, Maximum Number of Instances, and (optional) Expected Number of Instances.
Health mode	Scaling group + Instance configuration source^①	You must turn on Instance Health Check for the scaling group.
Scheduled mode	Scaling group + Instance configuration source + Scaling rule + Scheduled task^②	The scaling effect in scheduled mode depends on the configurations of scheduled tasks.
Dynamic mode	Scaling group + Instance configuration source + Scaling rule + Event-triggered task^③	The scaling effect in dynamic mode depends on the configurations of event-triggered tasks.
Custom mode	Custom configuration method	In this mode, you can manually add, remove, or delete ECS instances. You can also manually execute scaling rules.
Multi-mode	Combination of the preceding configuration methods	The scaling effect in the multi-mode depends on the scaling modes that are included. The scaling modes operate independently, without any priority. Auto Scaling applies the configurations of the first triggered scaling mode. For example, when using scheduled and dynamic modes together, you must create a scheduled task and an event-triggered task. If the scheduled task is triggered before the event-triggered task, Auto Scaling will execute the scheduled task first.
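The multi-mode rule, in which modes have no priority and whichever is triggered first runs first, can be sketched as ordering by trigger time. The function `execution_order` and the tuple layout are hypothetical. How the service resolves two triggers that fire at exactly the same time is not stated in this topic, so the sketch simply keeps their input order.

```python
from typing import List, Tuple


def execution_order(triggers: List[Tuple[float, str]]) -> List[str]:
    """Order triggered modes by trigger time only; no mode has priority."""
    # sorted() is stable, so triggers with equal times keep their input order.
    return [mode for _, mode in sorted(triggers, key=lambda item: item[0])]


# The scheduled task fires at t=700 and the event-triggered task at t=720.
print(execution_order([(720.0, "dynamic"), (700.0, "scheduled")]))
# ['scheduled', 'dynamic']
```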

The following table describes each configuration method.

No.	Configuration method	Description
①	Scaling group + Instance configuration source	Create a scaling group, configure an instance configuration source for it, and then enable both the instance configuration source and the scaling group. Auto Scaling can only scale instances after the preceding operations are complete. The scaling group and instance configuration source are essential components of the basic configuration unit.
②	Scaling group + Instance configuration source + Scaling rule + Scheduled task	In addition to the basic configuration unit in method ①, create a scaling rule and a scheduled task. Auto Scaling initiates the scheduled task to execute the scaling rule.
③	Scaling group + Instance configuration source + Scaling rule + Event-triggered task	In addition to the basic configuration unit in method ①, create a scaling rule and an event-triggered task. Auto Scaling initiates the event-triggered task to execute the scaling rule.
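As an illustration of method ③, the following sketch shows an event-triggered task that executes a scaling rule when the average CPU utilization across the group exceeds a threshold, as in the 60% example for dynamic mode. The function `evaluate_cpu_alarm` and the `execute_rule` callback are hypothetical stand-ins. In the real service, CloudMonitor supplies the metric and Auto Scaling executes the rule.

```python
from statistics import mean
from typing import Callable, Sequence


def evaluate_cpu_alarm(cpu_percentages: Sequence[float], threshold: float,
                       execute_rule: Callable[[], None]) -> bool:
    """Run `execute_rule` if the average CPU utilization exceeds `threshold`."""
    if not cpu_percentages:
        return False  # no instances reporting, so nothing to evaluate
    if mean(cpu_percentages) > threshold:
        execute_rule()  # the alert is triggered and the scaling rule is executed
        return True
    return False


executed = []
# The average of 55, 70, and 65 is above 60, so the rule runs.
print(evaluate_cpu_alarm([55, 70, 65], 60, lambda: executed.append("scale-out")))  # True
# The average of 40, 50, and 60 is 50, so the rule does not run.
print(evaluate_cpu_alarm([40, 50, 60], 60, lambda: executed.append("scale-out")))  # False
print(executed)  # ['scale-out']
```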

Workflows

Auto Scaling enables you to associate your scaling group with one or more SLB and ApsaraDB RDS instances. When a client sends a request from a mobile device or PC, the associated SLB instance forwards the request to an ECS instance in the scaling group. The ECS instance processes the request, and the ApsaraDB RDS instance stores the application data.

Auto Scaling adjusts the number of ECS instances in the scaling group based on your business requirements and the configured scaling modes. The following figures show the scaling and elastic recovery (health check) workflows.