What is Auto Scaling?

Auto Scaling is a cloud service that automatically adjusts the number of instances based on workload demands and scaling policies. It helps ensure adequate computing resources, minimize idle capacity, and reduce costs.

Why choose Auto Scaling?

Auto Scaling automatically adds ECS instances or elastic container instances to your scaling group, or removes them from it, based on changes in business demand. This ensures adequate computing power when demand increases and minimizes resource costs when demand decreases. As a result, the capacity of your scaling group stays aligned with your business needs.

The following table describes the Auto Scaling benefits.

Benefit	Description
Automation	During scale-out events, Auto Scaling automatically launches instances of a specific type in your scaling group and attaches them to the relevant Server Load Balancer (SLB) instances. It also adds the private IP addresses of these instances to the IP address whitelist of the associated ApsaraDB RDS instances. During scale-in events, Auto Scaling automatically removes instances of a specific type from the scaling group and detaches them from the associated SLB instances. It also removes the private IP addresses of the removed instances from the IP address whitelists of the associated ApsaraDB RDS instances.
Cost-effectiveness	With Auto Scaling, you do not need to manually adjust the number of instances or provision them in advance for traffic spikes. Auto Scaling keeps the instance count matched to demand, which reduces idle resources and optimizes resource costs. It monitors metric changes, or changes in the expected number of instances, within your scaling group over a statistical period that defaults to one minute. If Auto Scaling detects that a metric value falls outside the allowed range, it immediately triggers scaling actions. Two factors may affect scaling speed: (1) instance startup time, which is the time from instance creation until the instance can provide services as expected; and (2) the number of instances pending addition: if 1,000 or fewer instances are waiting to be added to your scaling group, Auto Scaling can complete the process within 1 minute.
High availability	You can use Auto Scaling to monitor the health of ECS instances or elastic container instances so that your application remains highly available. Auto Scaling automatically checks instance health and replaces any instance that it detects as unhealthy with a new instance of the same type.
Flexibility and intelligence	You can configure Auto Scaling to scale instances of your chosen type (ECS instances or elastic container instances). To address complex business requirements, Auto Scaling offers the following scaling modes: fixed-number, health, scheduled, dynamic, and custom. The dynamic mode supports integration with external monitoring systems through API operations. You can select an instance creation template to provision instances based on your business needs, which improves the success rate of scale-out events. Auto Scaling also supports multiple scaling policies for use in various business scenarios.
Easy auditing	Auto Scaling logs the details of each scaling event and monitors scaling groups, which helps you identify and resolve issues efficiently.
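The health-check behavior described in the High availability row can be illustrated with a minimal sketch. This is a local simulation and not the Auto Scaling implementation: the `Instance` class, the `replace_unhealthy` function, and the `new-N` instance IDs are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class Instance:
    # Hypothetical model of an instance in a scaling group.
    instance_id: str
    instance_type: str
    healthy: bool = True


_new_ids = count(1)  # source of made-up IDs for replacement instances


def replace_unhealthy(instances):
    """Return a list in which every unhealthy instance is swapped
    for a fresh, healthy instance of the same type."""
    result = []
    for inst in instances:
        if inst.healthy:
            result.append(inst)
        else:
            result.append(Instance(f"new-{next(_new_ids)}", inst.instance_type))
    return result


group = [
    Instance("i-a", "ecs.g6.large"),
    Instance("i-b", "ecs.g6.large", healthy=False),
]
group = replace_unhealthy(group)
```

After the call, the group still holds two instances, all healthy, and the replacement keeps the instance type of the instance it replaced.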

For more information, see Benefits.

Feature description

Auto Scaling scales the number of ECS instances and elastic container instances in or out. It does not adjust the configuration of individual ECS instances or elastic container instances. To modify the configuration of an ECS instance or an elastic container instance (such as vCPUs, memory size, or bandwidth), activate Alibaba Cloud Operation Orchestration Service (OOS). For more information, see What is OOS?

Auto Scaling dynamically adjusts the number of ECS instances or elastic container instances to meet your business requirements. The following table describes the main components of Auto Scaling.

Component	Description
Scaling group	A scaling group contains identical instances, designed for use in similar business scenarios. You can configure a scaling group to define the instance types for computing power. Additionally, you can specify the instance configuration source, the minimum and maximum instance counts, and the associated Classic Load Balancer (CLB) or Application Load Balancer (ALB) server groups. For multiple business scenarios, you can create separate scaling groups. Auto Scaling will automatically allocate computing resources to each group based on your configurations.
Instance configuration source	An instance configuration source defines the template used to manage your ECS instances or elastic container instances. Auto Scaling uses the ECS template to create ECS instances and the Elastic Container Instance template to create elastic container instances during scale-out events.
Scaling rule	A scaling rule triggers scaling actions, such as adding an ECS instance or an elastic container instance. Scaling rules can be executed manually, or configured to run automatically through event-triggered or scheduled tasks. Scaling rules enable dynamic adjustment of the minimum and maximum instance limits for your scaling group based on specific triggers.
Event-triggered task	Auto Scaling is integrated with CloudMonitor to track your scaling group metrics in real time. When the monitored metrics reach the defined thresholds, the corresponding scaling rules are executed.
Scheduled task	You can create scheduled tasks to automatically execute scaling rules at designated time points.
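The relationship between a scaling group and a scaling rule can be sketched as follows. This is a simplified model under stated assumptions: the `ScalingGroup` class and its `apply_rule` method are hypothetical, a rule is represented as a signed change in instance count, and the result is assumed to be clamped to the minimum and maximum instance counts of the group.

```python
from dataclasses import dataclass


@dataclass
class ScalingGroup:
    # Hypothetical model: bounds and current instance count of a scaling group.
    min_size: int
    max_size: int
    current: int

    def apply_rule(self, adjustment: int) -> int:
        """Apply a scaling rule given as a signed instance delta and
        clamp the result to [min_size, max_size] (assumed behavior)."""
        target = self.current + adjustment
        self.current = max(self.min_size, min(self.max_size, target))
        return self.current


group = ScalingGroup(min_size=2, max_size=10, current=4)
after_scale_out = group.apply_rule(+3)    # 4 + 3 = 7, within bounds
after_big_scale_out = group.apply_rule(+5)  # 7 + 5 = 12, clamped to 10
after_scale_in = group.apply_rule(-20)    # 10 - 20 = -10, clamped to 2
```

In this model, an event-triggered task or a scheduled task would simply decide when `apply_rule` is called; the rule itself stays the same.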

Auto Scaling can only be activated after you configure and enable a scaling group, and specify the instance configuration source for the scaling group. Additional configurations can be specified based on your business requirements. The following figure shows how to use Auto Scaling.

Auto Scaling also provides the following features to address various business requirements:

Auto Scaling sends notifications based on rules when a scaling event is successful, fails, or is rejected. The following table describes the rules.

Rule	Description
Regular notification rule	Auto Scaling sends notifications by text message, internal message, and email.
Advanced notification rule	Auto Scaling sends notifications to CloudMonitor or SMQ (formerly MNS). If you use SMQ, notifications are sent to the specified SMQ topic or SMQ queue. You are charged when you use SMQ. For more information about the pricing of SMQ, see Billing.

Auto Scaling also provides features to help you manage instances in a scaling group. The following table describes the features.

Feature	Description
Lifecycle hook	A lifecycle hook manages the lifecycle of ECS instances or elastic container instances within a scaling group. It is triggered during a scaling event to change the status of instances to "Pending Add" or "Pending Remove." Operations can be performed on the instances until the lifecycle hook times out.
Manual instance management	Auto Scaling allows for the manual addition or removal of ECS instances, elastic container instances, or Alibaba Cloud-managed third-party instances in scaling groups.
Rolling update	If your scaling group is of the ECS type, you can use the rolling update feature to manage ECS instances. This allows you to update configurations across multiple instances simultaneously, such as updating images, running scripts, or installing OOS packages on instances in the "In Service" state.
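The lifecycle hook row can be read as a small state machine. The sketch below is illustrative only: `HookedInstance` and `tick` are hypothetical names, time is passed in explicitly instead of being measured, and the code assumes that an instance in the "Pending Add" state moves to "In Service" when the hook times out. The action that Auto Scaling actually takes at timeout may be configurable.

```python
from dataclasses import dataclass


@dataclass
class HookedInstance:
    # Hypothetical model of an instance held by a scale-out lifecycle hook.
    instance_id: str
    state: str = "Pending Add"
    waited_seconds: int = 0


def tick(inst: HookedInstance, elapsed: int, hook_timeout: int) -> str:
    """Advance the wait clock by `elapsed` seconds. Once the accumulated
    wait reaches `hook_timeout`, release the instance into service."""
    if inst.state != "Pending Add":
        return inst.state  # the hook no longer holds this instance
    inst.waited_seconds += elapsed
    if inst.waited_seconds >= hook_timeout:
        inst.state = "In Service"
    return inst.state


inst = HookedInstance("i-1")
state_during_hook = tick(inst, elapsed=100, hook_timeout=300)
state_after_timeout = tick(inst, elapsed=200, hook_timeout=300)
```

While the instance is still in "Pending Add", custom operations such as installing software can run; in this sketch that window closes after 300 seconds.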

Scenarios

Auto Scaling provides multiple scaling features suitable for the following business scenarios:

- Workload fluctuations can be predicted.
  For example, a video production company can use Auto Scaling to create scheduled tasks based on predictable traffic patterns. Auto Scaling automatically provisions an ECS instance or elastic container instance to handle traffic spikes, such as those that occur every Friday at 20:00:00.
- Workload fluctuations cannot be predicted.
  For example, a video production company with unpredictable traffic patterns can use Auto Scaling to create event-triggered tasks that monitor CPU utilization. When the CPU utilization exceeds 60%, Auto Scaling automatically adds an ECS instance or elastic container instance to handle the traffic spike.
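The two scenarios above differ only in what triggers the scaling rule. A minimal sketch of both trigger checks follows. The function names are hypothetical, and averaging the CPU samples over the statistical period is an assumption made for this example; the actual aggregation depends on how the event-triggered task is configured.

```python
from datetime import datetime
from typing import Sequence

CPU_THRESHOLD = 60.0  # percent, taken from the example above


def scheduled_task_due(now: datetime) -> bool:
    """True exactly at Friday 20:00:00 (weekday() == 4 means Friday)."""
    return now.weekday() == 4 and (now.hour, now.minute, now.second) == (20, 0, 0)


def event_task_triggered(cpu_samples: Sequence[float]) -> bool:
    """True when the average CPU utilization over the period exceeds
    the threshold. An empty period never triggers."""
    if not cpu_samples:
        return False
    return sum(cpu_samples) / len(cpu_samples) > CPU_THRESHOLD


friday_evening = datetime(2024, 1, 5, 20, 0, 0)    # 5 January 2024 is a Friday
saturday_evening = datetime(2024, 1, 6, 20, 0, 0)
```

For example, `scheduled_task_due(friday_evening)` is true and `scheduled_task_due(saturday_evening)` is false, while `event_task_triggered([55.0, 70.0, 65.0])` is true because the average exceeds 60%.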

For more information, see Scenarios.

How it works

Auto Scaling automatically adjusts the number of ECS instances or elastic container instances in scaling groups based on defined scaling modes. These instances handle client requests. Auto Scaling enables dynamic scaling, adding or removing instances to meet fluctuating business demands. For more information, see How Auto Scaling works.

Billing rules

Auto Scaling is free of charge. However, you will incur charges for the following resources used in the Auto Scaling console: ECS instances, elastic container instances, ApsaraDB RDS instances, SLB instances (including CLB instances, ALB server groups, and NLB server groups), and SMQ topics or queues. For more information, see Billing overview.

Usage methods

Auto Scaling console: a web page that supports interactive operations.
API: a remote procedure call (RPC) API that supports GET and POST requests. For more information about API operations, see List of operations by function. If you want to call the Auto Scaling API, use one of the following common developer tools:
- Alibaba Cloud CLI: a flexible and scalable management tool based on Alibaba Cloud APIs. You can use the CLI to encapsulate Alibaba Cloud native APIs and develop custom features.
- OpenAPI Explorer: a tool that efficiently indexes API operations, allows online API calls, and dynamically generates SDK sample code.
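To illustrate what an RPC-style GET request looks like, the following sketch assembles a request URL from an action name and its parameters. It is an assumption-laden illustration, not a working client: the endpoint, the `DescribeScalingGroups` action, the version string, and the region ID are example values that should be checked against the list of operations, and the `build_rpc_url` helper is hypothetical. The sketch omits the authentication and signature parameters that a real call requires, so the service would reject this URL as is.

```python
from urllib.parse import urlencode


def build_rpc_url(endpoint: str, action: str, params: dict) -> str:
    """Build an unsigned RPC-style GET URL: the operation is selected by
    the Action query parameter, and all other inputs are query parameters."""
    query = {"Action": action, **params}
    # Sort the parameters so that the output is deterministic.
    return f"https://{endpoint}/?{urlencode(sorted(query.items()))}"


url = build_rpc_url(
    "ess.aliyuncs.com",          # assumed endpoint, for illustration only
    "DescribeScalingGroups",     # assumed operation name
    {"RegionId": "cn-hangzhou", "Version": "2014-08-28"},  # assumed values
)
```

In practice, Alibaba Cloud CLI or an SDK generated by OpenAPI Explorer handles signing and request construction, so building URLs by hand is rarely necessary.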

Related services

Service	Description
Elastic Compute Service (ECS)	A scalable, ready-to-use IaaS computing service offered by Alibaba Cloud.
Elastic Container Instance	An agile and secure serverless container runtime service provided by Alibaba Cloud. Elastic container instances are scalable, secure, and cost-effective, helping reduce resource and O&M costs for your business system.
ApsaraDB RDS	A secure, reliable, cost-effective, and scalable online database service designed to address database O&M challenges.
Server Load Balancer (SLB)	A load balancing service that distributes network traffic dynamically to enhance application availability and eliminate single points of failure. SLB provides the following types of load balancers: ALB, NLB, and CLB.
CloudMonitor	A service that monitors Alibaba Cloud resources and Internet applications. CloudMonitor provides comprehensive insights into resource usage and business status, enabling you to quickly identify and address any issues to ensure smooth business operations.
Simple Message Queue (SMQ, formerly MNS)	A lightweight, scalable messaging service that enables efficient, reliable, and secure data transfer between distributed application components, supporting the development of loosely coupled systems.