What is Auto Scaling?

Auto Scaling, also known as Elastic Scaling Service (ESS), automatically adjusts your compute resources (instances) based on policies that you define. This helps you handle fluctuations in application traffic, improve resource utilization, and lower costs.

Why use Auto Scaling?

When business demand increases, Auto Scaling automatically adds more instances of a specified type—such as Elastic Compute Service (ECS) instances or Elastic Container Instance (ECI) instances—to ensure sufficient computing power. When demand decreases, it automatically removes instances to save costs.

Auto Scaling provides the following benefits:

Benefit	Description
Automation	Scale-out: Automatically creates instances of the specified type and adds them to a load balancer. It also automatically associates them with ApsaraDB RDS instances. Scale-in: Automatically removes instances of the specified type and detaches them from the load balancer. It also automatically disassociates them from ApsaraDB RDS instances.
Cost savings	You no longer need to manually adjust resources, pre-provision capacity, or worry about promptly releasing idle resources. Auto Scaling performs scaling tasks at the right time to reduce your total cost of ownership. By default, Auto Scaling checks the relevant scaling metrics (or the expected number of instances) once per minute. If a metric crosses your specified threshold, a scaling activity is triggered immediately. The response time of Auto Scaling depends on two factors: (1) the startup time of the scaled instances, which is the time from when an instance is created until its operating system is ready, and (2) the number of instances to be scaled. For up to 1,000 instances, scaling activities are typically completed within one minute.
High availability	You no longer need to worry about the health of your ECS or ECI instances. Auto Scaling provides a health check feature that automatically replaces unhealthy instances—those not in a running state—with new ones to maintain availability.
Flexibility and intelligence	Supports specifying the instance type: ECS or ECI. Supports multiple scaling modes to handle diverse scenarios. Supported modes include fixed-number, health, scheduled, dynamic, and custom modes. The dynamic mode supports integration with external monitoring systems via an API. Supports flexible instance templates to increase the success rate of instance creation. Supports various scaling policies for different business scenarios.
Easy auditing	Auto Scaling records every scaling activity and provides monitoring for scaling groups to help you quickly diagnose issues.

For more information, see Benefits.
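
The health check behavior described in the table can be pictured with a minimal Python sketch. It is an illustration only, not the service's implementation: the `Instance` class, the status strings, and the `launch_instance` helper are hypothetical names introduced here.

```python
from dataclasses import dataclass
from itertools import count
from typing import List


@dataclass
class Instance:
    instance_id: str
    status: str  # illustrative values such as "Running" or "Stopped"


_new_ids = count(1)


def launch_instance() -> Instance:
    """Hypothetical stand-in for creating an instance from a template."""
    return Instance(instance_id=f"i-new-{next(_new_ids)}", status="Running")


def replace_unhealthy(instances: List[Instance]) -> List[Instance]:
    """Keep running instances; swap every other instance for a new one."""
    return [
        inst if inst.status == "Running" else launch_instance()
        for inst in instances
    ]


group = [Instance("i-a", "Running"), Instance("i-b", "Stopped")]
healed = replace_unhealthy(group)
print([inst.instance_id for inst in healed])
```

The group size stays the same after the replacement, which is the point of the feature: availability is maintained without manual intervention.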

Features

Auto Scaling only supports increasing or decreasing the number of ECS or ECI instances. It does not support changing the configuration of individual instances, such as their CPU, memory, or bandwidth. To adjust these configurations, you can use CloudOps Orchestration Service (OOS). For more information, see What is OOS?

Auto Scaling can automatically create or remove ECS or ECI instances based on your business needs. You need to configure the following main components.

Component	Description
Scaling group	A scaling group contains identical instances, designed for use in similar business scenarios. You can configure a scaling group to define the instance types for computing power. Additionally, you can specify the instance configuration source, the minimum and maximum instance counts, and the associated Classic Load Balancer (CLB) or Application Load Balancer (ALB) server groups. If you have multiple application scenarios, you can create multiple scaling groups. Auto Scaling adjusts the compute capacity for each scaling group independently based on your configuration.
Instance configuration source	Manages the templates used to create ECS or ECI instances. During a scale-out event, Auto Scaling uses an ECS-type template to create ECS instances and an ECI-type template to create ECI instances.
Scaling rule	Triggers a scaling activity, such as adding one ECS or ECI instance. You can execute a scaling rule manually or trigger it by using an event-triggered task or a scheduled task. You can also use scaling rules to intelligently set the boundary values (maximum and minimum number of instances) for a scaling group.
Event-triggered task	Uses CloudMonitor to monitor various metrics of a scaling group in real time. When a metric meets the configured threshold, Auto Scaling executes the corresponding scaling rule.
Scheduled task	Executes a scaling rule at a specified time.
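
To make the relationship between a scaling group's instance limits and a scaling rule concrete, here is a small Python sketch. The class and function names are hypothetical. It also assumes that a request exceeding the limits is clamped to the nearest bound. The text above does not say whether the service clamps or rejects such a request, so treat that as an assumption of this sketch.

```python
from dataclasses import dataclass


@dataclass
class ScalingGroup:
    min_size: int  # minimum number of instances
    max_size: int  # maximum number of instances
    current: int   # current number of instances


def apply_rule(group: ScalingGroup, adjustment: int) -> int:
    """Apply an 'add or remove N instances' rule within the group's bounds.

    Returns the number of instances actually added (positive) or
    removed (negative). Clamping is an assumption of this sketch.
    """
    target = max(group.min_size, min(group.max_size, group.current + adjustment))
    delta = target - group.current
    group.current = target
    return delta


group = ScalingGroup(min_size=1, max_size=5, current=4)
print(apply_rule(group, +3))   # 1: only one more instance fits under max_size
print(apply_rule(group, -10))  # -4: the group never drops below min_size
```

The same rule could be invoked manually, by a scheduled task, or by an event-triggered task. Only the trigger differs.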

Among the components listed above, you must configure and enable a scaling group and its instance configuration source for Auto Scaling to work. Other components can be configured as needed. The workflow of Auto Scaling is shown in the following figure.

Auto Scaling also provides other features to meet your needs in different scenarios:

When a scaling activity succeeds, fails, or is rejected, Auto Scaling supports sending notifications through the following methods.

Rule	Description
Regular notification rule	Supports sending notifications by using SMS, internal message, and email.
Advanced notification rule	Supports sending messages to CloudMonitor system events or Simple Message Queue (SMQ), formerly known as MNS. SMQ offers two service models: topics and queues. SMQ is a paid service. For pricing details, see Billing overview.

When you manage instances within a scaling group, Auto Scaling also supports the following features.

Feature	Description
Lifecycle hook	A tool for managing the lifecycle of ECS or ECI instances within a scaling group. When Auto Scaling automatically performs a scale-out or scale-in activity, it can trigger a lifecycle hook to place the affected instances in a pending state. This gives you a custom period to perform operations on the instances before the hook times out and the activity continues.
Manual instance management	Lets you manually add instances to or remove instances from a scaling group. The instances can be ECS instances, ECI instances, or managed instances.
Rolling update	Rolling updates are available for ECS-type scaling groups. You can use this feature to update the configurations of ECS instances in batches. You can update the image, run scripts, or install OOS packages on all ECS instances in the `In Service` state within the scaling group.
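
The lifecycle hook row above describes a waiting period that ends when the hook times out. The Python sketch below models that timing with plain numbers instead of a real clock. All names are hypothetical, and the option to finish early (`completed_at`) is an assumption added for illustration, because the text above mentions only the timeout.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LifecycleHook:
    timeout_s: float  # length of the custom waiting period, in seconds


@dataclass
class PendingInstance:
    instance_id: str
    entered_at: float                     # when the instance entered the pending state
    completed_at: Optional[float] = None  # when custom operations finished (assumed feature)


def hook_released(hook: LifecycleHook, inst: PendingInstance, now: float) -> bool:
    """Return True once the scaling activity may continue for this instance."""
    finished_early = inst.completed_at is not None and inst.completed_at <= now
    timed_out = now - inst.entered_at >= hook.timeout_s
    return finished_early or timed_out


hook = LifecycleHook(timeout_s=300)
inst = PendingInstance("i-a", entered_at=0.0)
print(hook_released(hook, inst, now=120.0))  # False: still inside the waiting period
print(hook_released(hook, inst, now=300.0))  # True: the hook has timed out
```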

Use cases

Auto Scaling is ideal for scenarios with fluctuating business volumes:

For predictable changes in business volume.
For example, a video streaming company experiences a surge in traffic every Friday at 8:00 PM when a popular show airs. You can create a scheduled task to automatically add one ECS or ECI instance at that time every week.
For unpredictable changes in business volume.
For example, a live streaming company has traffic that is difficult to predict. You can create an event-triggered task to automatically add one ECS or ECI instance whenever CPU utilization exceeds 60%.

For more information, see Use cases.
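
The two examples above reduce to a time calculation and a threshold comparison. The following Python sketch shows both. The function names are hypothetical, and the code only models the trigger conditions without calling Auto Scaling.

```python
from datetime import datetime, timedelta


def next_friday_8pm(now: datetime) -> datetime:
    """Return the next Friday 20:00 strictly after `now`."""
    days_ahead = (4 - now.weekday()) % 7  # Monday is 0, Friday is 4
    candidate = (now + timedelta(days=days_ahead)).replace(
        hour=20, minute=0, second=0, microsecond=0
    )
    if candidate <= now:
        # It is already Friday 20:00 or later, so use next week's slot.
        candidate += timedelta(days=7)
    return candidate


def should_scale_out(cpu_utilization_percent: float, threshold: float = 60.0) -> bool:
    """Event-triggered condition: CPU utilization exceeds the threshold."""
    return cpu_utilization_percent > threshold


print(next_friday_8pm(datetime(2024, 1, 3, 12, 0)))  # 2024-01-05 20:00:00
print(should_scale_out(72.5))                        # True
```

A scheduled task corresponds to the first function (the trigger depends only on time), and an event-triggered task corresponds to the second (the trigger depends on a monitored metric).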

How it works

Auto Scaling executes a scaling activity based on the configured scaling mode to add instances to or remove instances from a scaling group. For more information, see Working principle.

Billing

Auto Scaling itself is free of charge. However, you are charged for the resources you use, such as ECS instances, ECI instances, RDS instances, Server Load Balancer (SLB) services (such as CLB instances, ALB server groups, or Network Load Balancer (NLB) server groups), and SMQ. For more information, see Billing overview.

How to use Auto Scaling

Auto Scaling console: A web-based interface for interactive operations.
API: Supports RPC-style APIs with GET and POST requests. For more information about the API, see List of operations by function. The following are common developer tools for calling Auto Scaling APIs:
- Alibaba Cloud CLI: A flexible and extensible management tool built on Alibaba Cloud APIs. You can use the CLI to encapsulate native Alibaba Cloud APIs and extend their functionality.
- OpenAPI Explorer: Provides services for quickly searching for interfaces, making online API calls, and dynamically generating SDK example code.
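
As a rough picture of what "RPC-style" means, the Python sketch below builds the URL for a GET request in which the operation name travels as a query parameter. The endpoint `ess.aliyuncs.com`, the action `DescribeScalingGroups`, the version `2014-08-28`, and the region ID are assumptions that do not appear in this document, so confirm them against the API reference. The sketch also omits the authentication and signature parameters that a real call requires, so the resulting URL is not a working request.

```python
from urllib.parse import urlencode


def build_rpc_url(endpoint: str, action: str, version: str, **params: str) -> str:
    """Build an unsigned RPC-style GET URL (illustration only)."""
    query = {"Action": action, "Version": version, "Format": "JSON", **params}
    # Sort parameters by name so that the output is deterministic.
    return f"https://{endpoint}/?{urlencode(sorted(query.items()))}"


# All argument values below are assumed for illustration.
url = build_rpc_url(
    "ess.aliyuncs.com",
    "DescribeScalingGroups",
    "2014-08-28",
    RegionId="cn-hangzhou",
)
print(url)
```

In practice, the Alibaba Cloud CLI or an SDK generated by OpenAPI Explorer handles request signing for you, which is why those tools are the recommended way to call the API.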

Related services

Service	Description
Elastic Compute Service (ECS)	An Infrastructure as a Service (IaaS) cloud computing service that lets you provision and use compute resources on demand and enables Auto Scaling.
Elastic Container Instance	An agile and secure serverless container service. Using ECI as a container runtime environment provides greater elasticity and security for your applications while reducing usage and O&M costs.
ApsaraDB RDS	A secure, stable, reliable, cost-effective, and scalable online database service that simplifies database operations.
Server Load Balancer (SLB)	A service that distributes incoming traffic on demand to eliminate single points of failure and improve application availability. It includes three product types: ALB, NLB, and CLB.
CloudMonitor	A service for monitoring Alibaba Cloud resources and internet applications. It helps you understand your resource usage and business status on Alibaba Cloud, so you can address failures promptly and ensure business continuity.
Simple Message Queue (SMQ, formerly MNS)	An efficient, reliable, secure, and scalable lightweight message queue service. It helps developers build loosely coupled systems by enabling data and message exchange between distributed application components.