Auto Scaling, also known as Elastic Scaling Service (ESS), automatically adjusts your compute resources (instances) based on policies that you define. This helps you handle fluctuations in application traffic, improve resource utilization, and lower costs.
Why use Auto Scaling?
When business demand increases, Auto Scaling automatically adds more instances of a specified type—such as Elastic Compute Service (ECS) instances or Elastic Container Instance (ECI) instances—to ensure sufficient computing power. When demand decreases, it automatically removes instances to save costs.
Auto Scaling provides the following benefits:
Benefit | Description |
Automation |
|
Cost savings | You no longer need to manually adjust resources, pre-provision capacity, or worry about promptly releasing idle resources. Auto Scaling performs scaling tasks at the right time to reduce your total cost of ownership. By default, Auto Scaling checks the relevant scaling metrics (or the expected number of instances) once per minute. If a metric crosses the threshold that you specified, a scaling activity is triggered immediately. The response time of Auto Scaling depends on the following factors:
|
High availability | You no longer need to worry about the health of your ECS or ECI instances. Auto Scaling provides a health check feature that automatically replaces unhealthy instances—those not in a running state—with new ones to maintain availability. |
Flexibility and intelligence |
|
Easy auditing | Auto Scaling records every scaling activity and provides monitoring for scaling groups to help you quickly diagnose issues. |
For more information, see Benefits.
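The health check behavior listed under High availability can be pictured with a minimal sketch. It is an illustration only, not the Auto Scaling API: the instance IDs, the state strings, and the `replace_unhealthy` helper are hypothetical, and the only rule taken from the text is that an instance that is not in a running state counts as unhealthy and is replaced.

```python
def replace_unhealthy(instances: dict[str, str]) -> dict[str, str]:
    """Return a new mapping in which every non-running instance is replaced.

    `instances` maps a hypothetical instance ID to its state. Any state
    other than "running" is treated as unhealthy.
    """
    healthy = {iid: state for iid, state in instances.items() if state == "running"}
    unhealthy = [iid for iid in instances if iid not in healthy]
    for iid in unhealthy:
        # Stand-in for launching a fresh instance; real IDs are assigned by the service.
        healthy[f"{iid}-replacement"] = "running"
    return healthy


fleet = {"i-a": "running", "i-b": "stopped", "i-c": "running"}
fleet_after = replace_unhealthy(fleet)
```

The group keeps the same number of instances after the check, which is the point of replacing unhealthy instances rather than only removing them.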
Features
Auto Scaling only supports increasing or decreasing the number of ECS or ECI instances. It does not support changing the configuration of individual instances, such as their CPU, memory, or bandwidth. To adjust these configurations, you can use CloudOps Orchestration Service (OOS). For more information, see What is OOS?
Auto Scaling can automatically create or remove ECS or ECI instances based on your business needs. You need to configure the following main components.
Feature | Description |
Scaling group | A scaling group contains instances with the same configuration that serve similar business scenarios. When you configure a scaling group, you define the instance types that provide its computing power, the instance configuration source, the minimum and maximum instance counts, and the associated Classic Load Balancer (CLB) or Application Load Balancer (ALB) server groups. If you have multiple application scenarios, you can create multiple scaling groups. Auto Scaling adjusts the compute capacity of each scaling group independently based on your configuration. |
Instance configuration source | Manages the templates used to create ECS or ECI instances. During a scale-out event, Auto Scaling uses an ECS-type template to create ECS instances and an ECI-type template to create ECI instances. |
Scaling rule | Triggers a scaling activity, such as adding one ECS or ECI instance. You can execute a scaling rule manually or trigger it by using an event-triggered task or a scheduled task. You can also use scaling rules to intelligently set the boundary values (maximum and minimum number of instances) for a scaling group. |
Event-triggered task | Uses CloudMonitor to monitor various metrics of a scaling group in real time. When a metric meets the configured threshold, Auto Scaling executes the corresponding scaling rule. |
Scheduled task | Executes a scaling rule at a specified time. |
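How a scaling rule interacts with the minimum and maximum instance counts of a scaling group can be sketched as follows. This is a simplified in-memory model under stated assumptions, not the service's implementation: the `ScalingGroup` class, its field names, and the generated instance names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ScalingGroup:
    """Hypothetical in-memory model of a scaling group (not the Auto Scaling API)."""
    min_size: int
    max_size: int
    instances: list = field(default_factory=list)

    def apply_rule(self, adjustment: int) -> int:
        """Apply a rule that adds (positive) or removes (negative) instances.

        The target count is clamped to [min_size, max_size]. Returns the
        number of instances actually added (positive) or removed (negative).
        """
        current = len(self.instances)
        target = max(self.min_size, min(self.max_size, current + adjustment))
        delta = target - current
        if delta > 0:
            # Scale out: stand-in for creating instances from the configuration source.
            self.instances.extend(f"instance-{current + i}" for i in range(delta))
        elif delta < 0:
            # Scale in: drop the most recently added instances.
            del self.instances[delta:]
        return delta


group = ScalingGroup(min_size=1, max_size=3)
added = group.apply_rule(+5)     # asks for 5, capped by max_size
removed = group.apply_rule(-10)  # asks to remove 10, floored by min_size
```

In this sketch a request for five instances adds only three, and a request to remove ten leaves one instance, because the group boundaries take precedence over the rule.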
Among the components listed above, you must configure and enable a scaling group and its instance configuration source for Auto Scaling to work. Other components can be configured as needed. The workflow of Auto Scaling is shown in the following figure.
Auto Scaling also provides other features to meet your needs in different scenarios:
When a scaling activity succeeds, fails, or is rejected, Auto Scaling can send notifications through the following methods.
Method
Description
Sends notifications by SMS, internal message, and email.
Sends messages to CloudMonitor system events or to Simple Message Queue (SMQ), formerly known as MNS. SMQ provides two service models: topics and queues. SMQ is a paid service. For pricing details, see Billing overview.
When you manage instances within a scaling group, Auto Scaling also supports the following features.
Feature
Description
A tool for managing the lifecycle of ECS or ECI instances within a scaling group. When Auto Scaling automatically performs a scale-out or scale-in activity, it can trigger a lifecycle hook to place the affected instances in a pending state. This gives you a custom period to perform operations on the instances before the hook times out and the activity continues.
Lets you manually add instances to or remove instances from a scaling group. The instances can be ECS instances, ECI instances, or managed instances.
Rolling updates are available for ECS-type scaling groups. You can use this feature to update the configurations of ECS instances in batches. You can update the image, run scripts, or install OOS packages on all ECS instances in the In Service state within the scaling group.
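The lifecycle hook described above can be modeled as a simple waiting window. The sketch below is illustrative only: the `LifecycleHook` class, the state strings, and the `work_done` flag are hypothetical, and releasing the hook early when custom work finishes is an assumption that the text does not state. The text only says that the activity continues after the hook times out.

```python
from dataclasses import dataclass


@dataclass
class LifecycleHook:
    """Hypothetical model: holds an instance in a pending state for a fixed window."""
    timeout_seconds: int

    def state(self, elapsed_seconds: int, work_done: bool = False) -> str:
        """Return "pending" while waiting and "continue" once the hook is released."""
        # Assumption: finishing the custom work releases the hook before the timeout.
        if work_done or elapsed_seconds >= self.timeout_seconds:
            return "continue"
        return "pending"


hook = LifecycleHook(timeout_seconds=300)
early = hook.state(elapsed_seconds=10)                      # still inside the window
expired = hook.state(elapsed_seconds=300)                   # window has elapsed
finished = hook.state(elapsed_seconds=10, work_done=True)   # assumed early release
```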
Use cases
Auto Scaling is ideal for scenarios with fluctuating business volumes:
For predictable changes in business volume.
For example, a video streaming company experiences a surge in traffic every Friday at 8:00 PM when a popular show airs. You can create a scheduled task to automatically add one ECS or ECI instance at that time every week.
For unpredictable changes in business volume.
For example, a live streaming company has traffic that is difficult to predict. You can create an event-triggered task to automatically add one ECS or ECI instance whenever CPU utilization exceeds 60%.
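The two trigger conditions in these examples can be written as small predicates. This is a sketch of the decision logic only, assuming local time and a single CPU reading. The function names and the `CPU_THRESHOLD` constant are hypothetical and are not part of the Auto Scaling API.

```python
from datetime import datetime

CPU_THRESHOLD = 60.0  # percent, taken from the live streaming example


def scheduled_task_due(now: datetime) -> bool:
    """True on Fridays at 20:00 (datetime.weekday() returns 4 for Friday)."""
    return now.weekday() == 4 and now.hour == 20 and now.minute == 0


def event_task_fires(cpu_utilization: float) -> bool:
    """True when CPU utilization strictly exceeds the threshold."""
    return cpu_utilization > CPU_THRESHOLD


friday_evening = datetime(2024, 5, 3, 20, 0)    # a Friday
saturday_evening = datetime(2024, 5, 4, 20, 0)  # the following day
```

In both use cases, a task that fires would execute a scaling rule that adds one ECS or ECI instance. The scheduled task depends only on the clock, while the event-triggered task depends only on the monitored metric.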
For more information, see Use cases.
How it works
Auto Scaling executes a scaling activity based on the configured scaling mode to add instances to or remove instances from a scaling group. For more information, see Working principle.
Billing
Auto Scaling itself is free of charge. However, you are charged for the resources you use, such as ECS instances, ECI instances, RDS instances, Server Load Balancer (SLB) services (such as CLB instances, ALB server groups, or Network Load Balancer (NLB) server groups), and SMQ. For more information, see Billing overview.
How to use Auto Scaling
Auto Scaling console: A web-based interface for interactive operations.
API: Supports RPC-style APIs with GET and POST requests. For more information about the API, see List of operations by function. The following are common developer tools for calling Auto Scaling APIs:
Alibaba Cloud CLI: A flexible and extensible management tool built on Alibaba Cloud APIs. You can use the CLI to encapsulate native Alibaba Cloud APIs and extend their functionality.
OpenAPI Explorer: Lets you quickly search for API operations, call them online, and dynamically generate SDK sample code.
Related services
Service | Description |
ECS | An Infrastructure as a Service (IaaS) cloud computing service that lets you provision and use compute resources on demand and enables Auto Scaling. |
ECI | An agile and secure serverless container service. Using ECI as a container runtime environment provides greater elasticity and security for your applications while reducing usage and O&M costs. |
RDS | A secure, stable, reliable, cost-effective, and scalable online database service that simplifies database operations. |
SLB | A service that distributes incoming traffic on demand to eliminate single points of failure and improve application availability. It includes three product types: ALB, NLB, and CLB. |
CloudMonitor | A service for monitoring Alibaba Cloud resources and internet applications. It helps you understand your resource usage and business status on Alibaba Cloud, so you can address failures promptly and ensure business continuity. |
SMQ | An efficient, reliable, secure, and scalable lightweight message queue service. It helps developers build loosely coupled systems by enabling data and message exchange between distributed application components. |