Auto Scaling is a cloud service that automatically adds or removes instances based on workload changes and scaling policies. You can use Auto Scaling to ensure sufficient computing resources, prevent idle resources, and reduce costs.
The following video uses Elastic Compute Service (ECS) instances as an example to describe how to use Auto Scaling.
Why Auto Scaling
When your business demand grows, Auto Scaling helps add ECS instances or elastic container instances to your scaling group to provide sufficient computing power. When your business demand drops, Auto Scaling helps remove ECS instances or elastic container instances from your scaling group to minimize resource costs. Therefore, Auto Scaling can automatically adjust the number of instances of a specific type in your scaling group to meet your fluctuating business demand.
The following table describes the Auto Scaling benefits.
Benefit | Description |
Automation | |
Cost-effectiveness | If you use Auto Scaling, you do not need to invest much time and energy into adjusting the number of instances on which your application runs, or prepare instances before traffic spikes occur. Auto Scaling ensures you have no idle resources. Auto Scaling automatically scales instances to help you reduce resource costs. Auto Scaling automatically monitors the changes in metric values or in the expected number of instances in your scaling group. The default statistical period is 1 minute. If Auto Scaling detects that the metric values are not within the allowed range, Auto Scaling immediately triggers scaling events. The following factors may affect the scaling speed: |
High availability | You can use Auto Scaling to check the health status of ECS instances or elastic container instances to ensure the high availability of your application. Auto Scaling automatically checks the health status of your instances. If Auto Scaling detects that an instance is unhealthy, Auto Scaling automatically adds an instance to replace the unhealthy instance. The new instance and the unhealthy instance must be of the same type. |
Flexibility and intelligence | |
Easy auditing | Auto Scaling logs the details of each scaling event and monitors scaling groups. You can use the logs to identify and resolve issues. |
For more information, see Benefits.
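The monitoring and health-check behavior described in the table above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how Auto Scaling is implemented: the names `MetricRange`, `decide_scaling`, and `plan_replacements` are hypothetical, and the sketch assumes one metric value is evaluated per statistical period.

```python
from dataclasses import dataclass


@dataclass
class MetricRange:
    """Allowed range for a monitored metric (hypothetical structure)."""
    lower: float
    upper: float


def decide_scaling(metric_value: float, allowed: MetricRange) -> str:
    """Classify the metric value of one statistical period.

    Assumption: a value above the range calls for a scale-out and a value
    below it calls for a scale-in; the real service evaluates its own rules.
    """
    if metric_value > allowed.upper:
        return "scale_out"
    if metric_value < allowed.lower:
        return "scale_in"
    return "none"


def plan_replacements(instances: dict[str, tuple[str, bool]]) -> list[str]:
    """Return the instance type to launch for each unhealthy instance.

    `instances` maps an instance ID to (instance_type, is_healthy). Each
    replacement has the same type as the unhealthy instance it replaces.
    """
    return [itype for itype, healthy in instances.values() if not healthy]


if __name__ == "__main__":
    cpu_range = MetricRange(lower=20.0, upper=60.0)
    print(decide_scaling(75.0, cpu_range))
    print(plan_replacements({"i-1": ("ecs", True), "i-2": ("ecs", False)}))
```

The two functions are kept separate because metric-driven scaling changes the size of the group, whereas health-driven replacement keeps the size constant.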
Features
Auto Scaling supports only the scale-in and scale-out of ECS instances and elastic container instances. Auto Scaling does not support the configuration adjustment of an ECS instance or elastic container instance. To modify the configurations of an ECS instance or an elastic container instance (such as the number of vCPUs, memory size, and bandwidth), you can activate Alibaba Cloud Operation Orchestration Service (OOS). For more information, see What is OOS?
Auto Scaling can automatically scale the required number of ECS instances or elastic container instances. The following table describes the main components of Auto Scaling.
Component | Description |
Scaling group | A scaling group contains instances of the same type that you can use in similar business scenarios. You can configure a scaling group to specify the type of instances that provide computing power. You can also specify the instance configuration source, the maximum and minimum numbers of instances, and the Classic Load Balancer (CLB) instances or the Application Load Balancer (ALB) server groups with which you want to associate the scaling group. If you have multiple business scenarios, you can create multiple scaling groups. Auto Scaling automatically allocates computing resources to each scaling group based on your configurations. |
Instance configuration source | An instance configuration source specifies information about the template that is used to manage your ECS instances or elastic container instances. Auto Scaling uses the template of the ECS type to create ECS instances and the template of the Elastic Container Instance type to create elastic container instances during scale-out events. |
Scaling rule | A scaling rule is used to trigger a scaling event. For example, you can create a scaling rule that triggers a scale-out event in which an ECS instance or elastic container instance is added. You can manually execute a scaling rule. You can also create an event-triggered task or a scheduled task to automatically execute a scaling rule. Scaling rules help change the maximum or minimum number of instances that are allowed or required in your scaling group. |
Event-triggered task | Auto Scaling is integrated into CloudMonitor to monitor the metrics of your scaling group in real time. If the monitored metrics reach the specified thresholds, the specified scaling rules are triggered. |
Scheduled task | You can create scheduled tasks to automatically execute scaling rules at the specified points in time. |
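To make the relationship between a scaling group and a scaling rule concrete, the following Python sketch models a group with minimum and maximum instance counts and a rule that adds or removes instances. The class and field names are hypothetical, and clamping the result to the group's bounds is an assumption of this sketch rather than a statement about how Auto Scaling resolves such cases.

```python
from dataclasses import dataclass


@dataclass
class ScalingGroup:
    """Hypothetical model of a scaling group and its size limits."""
    name: str
    min_size: int
    max_size: int
    current: int


@dataclass
class ScalingRule:
    """Hypothetical rule: a positive adjustment adds instances, a negative one removes them."""
    adjustment: int


def execute_rule(group: ScalingGroup, rule: ScalingRule) -> int:
    """Apply the rule and return how many instances were actually added or removed."""
    target = group.current + rule.adjustment
    # Assumption: the group never leaves its configured [min_size, max_size] range.
    target = max(group.min_size, min(group.max_size, target))
    changed = target - group.current
    group.current = target
    return changed


if __name__ == "__main__":
    web = ScalingGroup(name="web", min_size=1, max_size=5, current=4)
    print(execute_rule(web, ScalingRule(adjustment=3)), web.current)
```

An event-triggered task or a scheduled task would call `execute_rule` automatically, and a manual execution would call it directly.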
Auto Scaling can run only after you configure and enable a scaling group and specify the instance configuration source for the scaling group. You can specify other configurations based on your business requirements. The following figure shows how to use Auto Scaling.
Auto Scaling provides the following features to meet different business requirements:
If a scaling event is successful, fails, or is rejected, Auto Scaling sends notifications based on rules. The following table describes the rules.
Rule | Description |
 | Auto Scaling sends notifications by using text messages, internal messages, and emails. |
 | Auto Scaling sends notifications to CloudMonitor or Simple Message Queue (SMQ, formerly MNS). If you use SMQ, notifications are sent to the specified SMQ topic or SMQ queue. You are charged when you use SMQ. For more information about the pricing of SMQ, see Billing. |
Auto Scaling provides features to help you manage instances in a scaling group. The following table describes the features.
Feature | Description |
Lifecycle hook | A lifecycle hook is used to manage the lifecycle of ECS instances or elastic container instances in a scaling group. During a scaling event, a lifecycle hook can be triggered to switch the status of ECS instances or elastic container instances to Pending Add or Pending Remove. You can perform operations on the instances until the lifecycle hook times out. |
Manual addition or removal of instances | Auto Scaling supports manually adding or removing ECS instances, elastic container instances, or Alibaba Cloud-managed third-party instances to or from scaling groups. |
Rolling update | If your scaling group is of the ECS type, you can use the rolling update feature to manage the ECS instances. You can create rolling update tasks to update the configurations of multiple ECS instances at the same time. For example, you can update images, run scripts, or install OOS packages on ECS instances that are in the In Service state. |
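The lifecycle hook behavior can be pictured as a small state machine. The Python sketch below is illustrative only: `HookedInstance`, `trigger_hook`, and `advance` are hypothetical names, "Removed" is a made-up end state, and the sketch assumes that the scaling event simply continues once the hook times out, which the real service may handle differently depending on configuration.

```python
from dataclasses import dataclass

# "Pending Add", "Pending Remove", and "In Service" come from the text above;
# "Removed" is a hypothetical end state used only in this sketch.
NEXT_STATE = {"Pending Add": "In Service", "Pending Remove": "Removed"}


@dataclass
class HookedInstance:
    instance_id: str
    state: str       # "Pending Add" or "Pending Remove" while the hook is active
    deadline: float  # time, in seconds, at which the lifecycle hook times out


def trigger_hook(instance_id: str, scale_out: bool, now: float, timeout: float) -> HookedInstance:
    """Put an instance into the pending state that matches the scaling event."""
    state = "Pending Add" if scale_out else "Pending Remove"
    return HookedInstance(instance_id, state, now + timeout)


def advance(instance: HookedInstance, now: float) -> str:
    """Move the instance on once the hook has timed out; otherwise keep waiting."""
    if instance.state in NEXT_STATE and now >= instance.deadline:
        instance.state = NEXT_STATE[instance.state]
    return instance.state


if __name__ == "__main__":
    inst = trigger_hook("i-example", scale_out=True, now=0.0, timeout=300.0)
    print(advance(inst, now=100.0))  # still within the hook window
    print(advance(inst, now=300.0))  # hook has timed out
```

The window between `trigger_hook` and the deadline is where custom operations, such as warming a cache or draining connections, would run.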
Scenarios
Auto Scaling provides various scaling features that you can use in the following business scenarios:
Your workload fluctuations can be predicted.
For example, a video production company whose workload fluctuations can be predicted uses Auto Scaling to create scheduled tasks. Auto Scaling automatically adds an ECS instance or elastic container instance to help the company handle traffic spikes that occur at 20:00:00 every Friday.
Your workload fluctuations cannot be predicted.
For example, a video production company whose workload fluctuations cannot be predicted uses Auto Scaling to create event-triggered tasks and monitor the CPU utilization of instances. If the CPU utilization exceeds 60%, Auto Scaling automatically adds an ECS instance or elastic container instance to help the company handle traffic spikes.
For more information, see Scenarios.
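The two scenarios above correspond to two kinds of trigger conditions. The following Python sketch expresses each as a predicate, using the values from the examples (20:00:00 every Friday, and CPU utilization above 60%). The function names are hypothetical, and in practice these conditions are configured as scheduled tasks and event-triggered tasks rather than written as code.

```python
from datetime import datetime


def scheduled_task_due(now: datetime) -> bool:
    """Predictable workload: fire at 20:00:00 every Friday.

    datetime.weekday() returns 0 for Monday, so Friday is 4.
    """
    return now.weekday() == 4 and (now.hour, now.minute, now.second) == (20, 0, 0)


def event_task_due(cpu_utilization_percent: float, threshold: float = 60.0) -> bool:
    """Unpredictable workload: fire when CPU utilization exceeds the threshold."""
    return cpu_utilization_percent > threshold


if __name__ == "__main__":
    # 5 January 2024 is a Friday.
    print(scheduled_task_due(datetime(2024, 1, 5, 20, 0, 0)))
    print(event_task_due(72.5))
```

When either predicate is true, the associated scaling rule would be executed to add an ECS instance or elastic container instance.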
How it works
Auto Scaling triggers scaling events based on scaling modes to add or remove ECS instances or elastic container instances to or from scaling groups. ECS instances or elastic container instances are used to process client requests. You can use Auto Scaling to add or remove instances based on changing business demands. For more information, see How Auto Scaling works.
Billing rules
Auto Scaling is free of charge. However, you are charged for using ECS instances, elastic container instances, ApsaraDB RDS instances, SLB instances (including ALB instances, ALB server groups, and NLB server groups), and SMQ topics or queues in the Auto Scaling console. For more information, see Billing overview.
Usage methods
Auto Scaling console: a web page that supports interactive operations.
API: a remote procedure call (RPC) API that supports GET and POST requests. For more information about API operations, see List of operations by function. If you want to call the Auto Scaling API, use one of the following common developer tools:
Alibaba Cloud CLI: a flexible and scalable management tool based on Alibaba Cloud APIs. You can use the CLI to encapsulate Alibaba Cloud native APIs and develop custom features.
OpenAPI Explorer: a tool that allows you to obtain API operations, call API operations online, and dynamically generate SDK sample code.
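In an RPC-style API, the operation to call is named by a request parameter rather than by the URL path. The Python sketch below composes such a GET URL to show the shape of a request. It is a hedged illustration only: the helper `build_rpc_query` is hypothetical, the endpoint, version, operation, and region shown are assumed example values that should be checked against the list of operations, and the URL is unsigned, so it omits the authentication parameters that a real call requires.

```python
from urllib.parse import urlencode


def build_rpc_query(endpoint: str, action: str, version: str, **params: str) -> str:
    """Compose an unsigned RPC-style GET URL in which `Action` names the operation."""
    query = {"Action": action, "Version": version, "Format": "JSON", **params}
    return f"https://{endpoint}/?{urlencode(query)}"


if __name__ == "__main__":
    # All values below are assumptions for illustration, not verified settings.
    url = build_rpc_query(
        "ess.aliyuncs.com",
        "DescribeScalingGroups",
        "2014-08-28",
        RegionId="cn-hangzhou",
    )
    print(url)
```

In practice, the Alibaba Cloud CLI or an SDK generated in OpenAPI Explorer handles signing and parameter encoding, so hand-built URLs like this one are useful mainly for understanding the request format.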
Related services
Service | Description |
Elastic Compute Service (ECS) | A ready-to-use and scalable IaaS-level service provided by Alibaba Cloud. |
Elastic Container Instance | An agile and secure serverless container runtime service provided by Alibaba Cloud. Elastic container instances are scalable and secure for your business system, and can help reduce resource and O&M costs. |
ApsaraDB RDS | A secure, reliable, cost-effective, and scalable online database service that helps you resolve database O&M issues. |
Server Load Balancer (SLB) | A load balancing service that distributes network traffic on demand. You can use SLB to eliminate single points of failure in application systems and improve application availability. SLB provides the following types of load balancers: ALB, NLB, and CLB. |
CloudMonitor | A service that monitors Alibaba Cloud resources and Internet applications. CloudMonitor helps you fully understand the usage of Alibaba Cloud resources and the status of your business. This way, you can handle faulty resources at the earliest opportunity to ensure that your business runs as expected. |
Simple Message Queue (SMQ, formerly MNS) | An efficient, reliable, secure, convenient, and scalable lightweight messaging service. SMQ allows developers to transfer data and messages between distributed components of applications to build loosely coupled systems. |