Auto Scaling is a cloud service that automatically adjusts the number of instances based on workload demands and scaling policies. It helps ensure adequate computing resources, minimize idle capacity, and reduce costs.
Why choose Auto Scaling?
Auto Scaling automatically adds or removes ECS instances or elastic container instances to or from your scaling group based on changes in business demand. This ensures adequate computing power when demand increases and minimizes resource costs when demand decreases. Therefore, Auto Scaling can ensure that your scaling group always aligns with your business needs.
The following table describes the Auto Scaling benefits.
Benefit | Description |
Automation | |
Cost-effectiveness | With Auto Scaling, you do not need to manually adjust the number of instances or prepare instances in advance for traffic spikes. Auto Scaling keeps the right number of instances in service without idle resources, which optimizes resource costs. It monitors metric changes or fluctuations in the expected number of instances within your scaling group over a defined statistical period, which defaults to one minute. If Auto Scaling detects that metric values fall outside the allowed range, it immediately triggers scaling actions. The following factors may affect scaling speed: |
High availability | You can use Auto Scaling to monitor the health of ECS instances or elastic container instances so that your application remains highly available. Auto Scaling automatically checks instance health and replaces any instance that is detected as unhealthy with a new instance of the same type. |
Flexibility and intelligence | |
Easy auditing | Auto Scaling logs the details of each scaling event and monitors scaling groups, which helps you identify and resolve issues efficiently. |
For more information, see Benefits.
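The high availability behavior described in the table can be illustrated with a small model. This is only a sketch, not Auto Scaling's implementation: the `Instance` class, the `replace_unhealthy` helper, the instance IDs, and the instance type string are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class Instance:
    instance_id: str
    instance_type: str
    healthy: bool = True


# Hypothetical ID generator for replacement instances.
_new_ids = count(1)


def replace_unhealthy(instances):
    """Return a new list in which each unhealthy instance is swapped for a
    fresh, healthy instance of the same type."""
    result = []
    for inst in instances:
        if inst.healthy:
            result.append(inst)
        else:
            new_id = f"i-new-{next(_new_ids)}"
            result.append(Instance(new_id, inst.instance_type))
    return result


# Illustrative group: one healthy and one unhealthy instance.
group = [
    Instance("i-001", "ecs.g6.large"),
    Instance("i-002", "ecs.g6.large", healthy=False),
]
group = replace_unhealthy(group)
print([(i.instance_id, i.instance_type, i.healthy) for i in group])
```

The group size stays the same after the check; only the unhealthy member is exchanged.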
Feature description
Auto Scaling scales the number of ECS instances and elastic container instances in and out. It does not adjust the configuration of individual ECS instances or elastic container instances. To modify the configuration of an ECS instance or an elastic container instance (such as vCPUs, memory size, or bandwidth), activate Alibaba Cloud Operation Orchestration Service (OOS). For more information, see What is OOS?
Auto Scaling dynamically adjusts the number of ECS instances or elastic container instances to meet your business requirements. The following table describes the main components of Auto Scaling.
Component | Description |
Scaling group | A scaling group contains identical instances that serve similar business scenarios. In a scaling group, you define the instance types that provide computing power, the instance configuration source, the minimum and maximum instance counts, and the associated Classic Load Balancer (CLB) instances or Application Load Balancer (ALB) server groups. For multiple business scenarios, you can create separate scaling groups. Auto Scaling automatically allocates computing resources to each group based on your configurations. |
Instance configuration source | An instance configuration source defines the template used to manage your ECS instances or elastic container instances. During scale-out events, Auto Scaling uses the ECS template to create ECS instances and the Elastic Container Instance template to create elastic container instances. |
Scaling rule | A scaling rule triggers scaling actions, such as adding an ECS instance or an elastic container instance. Scaling rules can be executed manually, or configured to run automatically through event-triggered or scheduled tasks. Scaling rules enable dynamic adjustment of the minimum and maximum instance limits for your scaling group based on specific triggers. |
Event-triggered task | Auto Scaling is integrated with CloudMonitor to track your scaling group metrics in real time. When the monitored metrics reach the defined thresholds, the corresponding scaling rules are executed. |
Scheduled task | You can create scheduled tasks to automatically execute scaling rules at designated points in time. |
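As a rough illustration of how a scaling group's minimum and maximum instance counts interact with a scaling rule, the sketch below models a rule as a simple change in instance count. The `ScalingGroup` class and its `apply_rule` method are hypothetical, and clamping the result to the configured bounds is an assumption about the behavior, not a description of the service's internals.

```python
from dataclasses import dataclass


@dataclass
class ScalingGroup:
    min_size: int   # minimum instance count of the group
    max_size: int   # maximum instance count of the group
    current: int    # instances currently in the group

    def apply_rule(self, delta: int) -> int:
        """Apply a rule that adds (delta > 0) or removes (delta < 0) instances.

        Assumption: the resulting count is clamped to [min_size, max_size].
        """
        target = self.current + delta
        self.current = max(self.min_size, min(self.max_size, target))
        return self.current


group = ScalingGroup(min_size=1, max_size=5, current=2)
print(group.apply_rule(+2))   # 4
print(group.apply_rule(+3))   # 5 (capped at max_size)
print(group.apply_rule(-10))  # 1 (floored at min_size)
```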
Auto Scaling can only be activated after you configure and enable a scaling group, and specify the instance configuration source for the scaling group. Additional configurations can be specified based on your business requirements. The following figure shows how to use Auto Scaling.
Auto Scaling also provides the following features to address various business requirements:
Auto Scaling sends notifications based on rules when a scaling event is successful, fails, or is rejected. The following table describes the rules.
Rule | Description |
| Auto Scaling sends notifications by text message, internal message, and email. |
| Auto Scaling sends notifications to CloudMonitor or SMQ (formerly MNS). If you use SMQ, notifications are sent to the specified SMQ topic or SMQ queue. You are charged when you use SMQ. For more information about the pricing of SMQ, see Billing. |
Auto Scaling also provides features to help you manage instances in a scaling group. The following table describes the features.
Feature | Description |
Lifecycle hook | A lifecycle hook manages the lifecycle of ECS instances or elastic container instances within a scaling group. It is triggered during a scaling event and changes the status of the affected instances to "Pending Add" or "Pending Remove." You can perform operations on the instances until the lifecycle hook times out. |
Manual addition or removal of instances | You can manually add ECS instances, elastic container instances, or Alibaba Cloud-managed third-party instances to scaling groups, or remove them from scaling groups. |
Rolling update | If your scaling group is of the ECS type, you can use the rolling update feature to manage ECS instances. This allows you to update configurations across multiple instances simultaneously, such as updating images, running scripts, or installing OOS packages on instances in the "In Service" state. |
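The lifecycle hook behavior described above (a pending status that lasts until the hook times out) can be sketched as a small state model. The `HookedInstance` class, its field names, and the 600-second timeout are hypothetical. The status an instance takes after the timeout is not stated in this topic, so "In Service" and "Removed" are assumptions used only to complete the example.

```python
from dataclasses import dataclass

PENDING_ADD = "Pending Add"        # status during a scale-out hook (from the text)
PENDING_REMOVE = "Pending Remove"  # status during a scale-in hook (from the text)


@dataclass
class HookedInstance:
    instance_id: str
    pending_status: str  # PENDING_ADD or PENDING_REMOVE
    hook_started: float  # seconds on any consistent clock
    timeout: float       # hook timeout in seconds

    def hook_active(self, now: float) -> bool:
        """True while the hook has not timed out and custom operations may run."""
        return now - self.hook_started < self.timeout

    def status_at(self, now: float) -> str:
        """Pending status while the hook is active; afterwards an assumed
        follow-up status ("In Service" after scale-out, "Removed" after scale-in)."""
        if self.hook_active(now):
            return self.pending_status
        return "In Service" if self.pending_status == PENDING_ADD else "Removed"


inst = HookedInstance("i-001", PENDING_ADD, hook_started=0.0, timeout=600.0)
print(inst.status_at(30.0))   # Pending Add
print(inst.status_at(600.0))  # In Service
```

The window in which `hook_active` returns `True` is where custom operations, such as installing software before an instance starts serving traffic, would run.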
Scenarios
Auto Scaling provides multiple scaling features suitable for the following business scenarios:
Workload fluctuations can be predicted.
For example, a video production company can use Auto Scaling to create scheduled tasks based on predictable traffic patterns. Auto Scaling automatically provisions an ECS instance or elastic container instance to handle traffic spikes, such as those occurring every Friday at 20:00:00.
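To make the weekly schedule in this example concrete, the following sketch computes the next Friday 20:00:00 trigger time from a given moment. The function name `next_friday_20` is hypothetical, and time zones are ignored for simplicity. The sketch only shows the date arithmetic, not how scheduled tasks are configured in Auto Scaling.

```python
from datetime import datetime, timedelta

FRIDAY = 4  # datetime.weekday(): Monday is 0, so Friday is 4


def next_friday_20(now: datetime) -> datetime:
    """Return the first Friday 20:00:00 that is strictly after `now`."""
    days_ahead = (FRIDAY - now.weekday()) % 7
    candidate = (now + timedelta(days=days_ahead)).replace(
        hour=20, minute=0, second=0, microsecond=0
    )
    if candidate <= now:
        # This week's slot has already passed (or is exactly now).
        candidate += timedelta(days=7)
    return candidate


# 2024-01-03 is a Wednesday, so the next trigger is Friday 2024-01-05 at 20:00:00.
print(next_friday_20(datetime(2024, 1, 3, 12, 0, 0)))
```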
Workload fluctuations cannot be predicted.
For example, a video production company with unpredictable traffic patterns can use Auto Scaling to create event-triggered tasks that monitor CPU utilization. When the CPU utilization exceeds 60%, Auto Scaling automatically adds an ECS instance or elastic container instance to handle the traffic spike.
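The threshold check in this example can be sketched as follows. The function name `instances_to_add` is hypothetical. Averaging the samples collected during one statistical period is an assumption made for illustration, because the aggregation method is not specified here.

```python
from statistics import mean

CPU_THRESHOLD = 60.0  # percent, taken from the example above


def instances_to_add(cpu_samples, threshold=CPU_THRESHOLD):
    """Return 1 if the average CPU utilization over the period exceeds the
    threshold, otherwise 0. An empty period triggers nothing."""
    if not cpu_samples:
        return 0
    return 1 if mean(cpu_samples) > threshold else 0


print(instances_to_add([55.0, 70.0, 72.0]))  # 1 (average is above 60)
print(instances_to_add([40.0, 50.0, 60.0]))  # 0 (average is 50)
```

Because the comparison is strict, an average of exactly 60% does not trigger a scale-out in this sketch.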
For more information, see Scenarios.
How it works
Auto Scaling automatically adjusts the number of ECS instances or elastic container instances in scaling groups based on defined scaling modes. These instances handle client requests. Auto Scaling enables dynamic scaling, adding or removing instances to meet fluctuating business demands. For more information, see How Auto Scaling works.
Billing rules
Auto Scaling is free of charge. However, you are charged for the following resources that you use together with Auto Scaling: ECS instances, elastic container instances, ApsaraDB RDS instances, SLB instances (including CLB instances, ALB server groups, and NLB server groups), and SMQ topics or queues. For more information, see Billing overview.
Usage methods
Auto Scaling console: a web page that supports interactive operations.
API: a remote procedure call (RPC) API that supports GET and POST requests. For more information about API operations, see List of operations by function. If you want to call the Auto Scaling API, use one of the following common developer tools:
Alibaba Cloud CLI: a flexible and scalable management tool based on Alibaba Cloud APIs. You can use the CLI to encapsulate Alibaba Cloud native APIs and develop custom features.
OpenAPI Explorer: a tool that efficiently indexes API operations, allows online API calls, and dynamically generates SDK sample code.
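For readers curious about what the developer tools handle on their behalf, the sketch below signs the query parameters of an RPC-style request using the HMAC-SHA1 query-signing scheme commonly documented for Alibaba Cloud RPC APIs. Treat every specific here as an assumption to verify against the API reference: the parameter names, the `DescribeScalingGroups` action, the `2014-08-28` version string, and the placeholder credentials are illustrative, and the helper names `percent_encode` and `sign_rpc_request` are hypothetical. The code only computes a signature and sends nothing over the network.

```python
import base64
import hashlib
import hmac
from urllib.parse import quote


def percent_encode(value) -> str:
    # RFC 3986 style encoding: only unreserved characters stay unescaped.
    return quote(str(value), safe="~")


def sign_rpc_request(method: str, params: dict, access_key_secret: str) -> str:
    """Return a Base64-encoded HMAC-SHA1 signature over the sorted,
    percent-encoded query parameters (assumed RPC signing scheme)."""
    canonical = "&".join(
        f"{percent_encode(k)}={percent_encode(v)}" for k, v in sorted(params.items())
    )
    string_to_sign = f"{method}&{percent_encode('/')}&{percent_encode(canonical)}"
    digest = hmac.new(
        (access_key_secret + "&").encode("utf-8"),
        string_to_sign.encode("utf-8"),
        hashlib.sha1,
    ).digest()
    return base64.b64encode(digest).decode("ascii")


# Placeholder values for illustration only; none of these are real credentials.
example_params = {
    "Action": "DescribeScalingGroups",
    "Version": "2014-08-28",
    "RegionId": "cn-hangzhou",
    "Format": "JSON",
    "AccessKeyId": "example-key-id",
    "SignatureMethod": "HMAC-SHA1",
    "SignatureVersion": "1.0",
    "SignatureNonce": "example-nonce",
    "Timestamp": "2024-01-01T00:00:00Z",
}
signature = sign_rpc_request("GET", example_params, "example-secret")
print(signature)
```

In practice, Alibaba Cloud CLI, the SDKs, and OpenAPI Explorer take care of signing, which is one reason the list above recommends them over hand-built requests.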
Related services
Service | Description |
ECS | A scalable, ready-to-use IaaS computing service offered by Alibaba Cloud. |
Elastic Container Instance | An agile and secure serverless container runtime service provided by Alibaba Cloud. Elastic container instances are scalable, secure, and cost-effective, helping reduce resource and O&M costs for your business system. |
ApsaraDB RDS | A secure, reliable, cost-effective, and scalable online database service designed to address database O&M challenges. |
SLB | A load balancing service that distributes network traffic dynamically to enhance application availability and eliminate single points of failure. SLB provides the following types of load balancers: ALB, NLB, and CLB. |
CloudMonitor | A service that monitors Alibaba Cloud resources and Internet applications. CloudMonitor provides comprehensive insights into resource usage and business status, enabling you to quickly identify and address issues to ensure smooth business operations. |
SMQ | A lightweight, scalable messaging service that enables efficient, reliable, and secure data transfer between distributed application components, supporting the development of loosely coupled systems. |