Auto Scaling: What is Auto Scaling?

Last Updated: Oct 31, 2024

Auto Scaling is a cloud service that automatically adds or removes instances based on workload changes and scaling policies. You can use Auto Scaling to ensure sufficient computing resources, prevent idle resources, and reduce costs.

The following video uses Elastic Compute Service (ECS) instances as an example to describe how to use Auto Scaling.

Why Auto Scaling

When business demand grows, Auto Scaling adds ECS instances or elastic container instances to your scaling group to provide sufficient computing power. When business demand drops, Auto Scaling removes ECS instances or elastic container instances from your scaling group to minimize resource costs. In this way, Auto Scaling adjusts the number of instances of a specific type in your scaling group to match fluctuating business demand.

The following table describes the Auto Scaling benefits.

Benefit

Description

Automation

  • During scale-out events, Auto Scaling automatically creates instances of a specific type in your scaling group. Then it attaches the created instances to the Server Load Balancer (SLB) instances that are associated with your scaling group. Auto Scaling also automatically adds the private IP addresses of the created instances to the IP address whitelists of the associated ApsaraDB RDS instances.

  • During scale-in events, Auto Scaling automatically removes instances of a specific type from your scaling group. Then it detaches the removed instances from the associated SLB instances. Auto Scaling also automatically removes the private IP addresses of the removed instances from the IP address whitelists of the associated ApsaraDB RDS instances.

Cost-effectiveness

With Auto Scaling, you do not need to spend time manually adjusting the number of instances on which your application runs or preparing instances before traffic spikes occur. Auto Scaling scales instances automatically, which helps prevent idle resources and reduce resource costs.

Auto Scaling automatically monitors the changes in metric values or in the expected number of instances in your scaling group. The default statistical period is 1 minute. If Auto Scaling detects that the metric values are not within the allowed range, Auto Scaling immediately triggers scaling events. The following factors may affect the scaling speed:

  • The startup time of instances that are waiting to be added to your scaling group. The startup time of an instance is the period from the time the instance is created to the time it can provide services as expected.

  • The number of instances that are waiting to be added to your scaling group. If the number of instances that are waiting to be added to your scaling group is less than or equal to 1,000, Auto Scaling can complete scaling within 1 minute.

High availability

Auto Scaling automatically checks the health status of the ECS instances or elastic container instances in your scaling group to ensure the high availability of your application. If Auto Scaling detects an unhealthy instance, it automatically adds a new instance of the same type to replace the unhealthy one.

Flexibility and intelligence

  • You can use Auto Scaling to scale ECS instances or elastic container instances based on your business requirements.

  • Auto Scaling provides multiple scaling modes to help you meet the requirements of complex business scenarios. The scaling modes include the fixed-number mode, health mode, scheduled mode, dynamic mode, and custom mode. The dynamic mode allows you to interconnect Auto Scaling with external monitoring systems by using API operations.

  • Auto Scaling allows you to select a template to create instances based on your business requirements. This improves the success rate of scale-out events.

  • Auto Scaling also supports multiple scaling policies that you can use in various business scenarios.

Easy auditing

Auto Scaling logs the details of each scaling event and monitors scaling groups. You can use the logs to identify and resolve issues.

For more information, see Benefits.
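
The health check behavior described under High availability can be illustrated with a minimal Python sketch. This is a simplified model, not the Auto Scaling API: the Instance class, the replace_unhealthy function, and the instance IDs are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class Instance:
    instance_id: str
    instance_type: str  # hypothetical label, for example "ECS"
    healthy: bool = True


# Hypothetical ID generator for replacement instances.
_new_ids = count(1)


def replace_unhealthy(instances):
    """Return a list in which every unhealthy instance is replaced
    by a new, healthy instance of the same type."""
    result = []
    for inst in instances:
        if inst.healthy:
            result.append(inst)
        else:
            # The replacement keeps the type of the unhealthy instance.
            result.append(Instance(f"new-{next(_new_ids)}", inst.instance_type))
    return result


group = [Instance("i-1", "ECS"), Instance("i-2", "ECS", healthy=False)]
group = replace_unhealthy(group)
```

After the call, the group still contains two instances, and the unhealthy one has been swapped for a new instance of the same type.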

Features

Auto Scaling supports only the scale-in and scale-out of ECS instances and elastic container instances. Auto Scaling does not support the configuration adjustment of an ECS instance or elastic container instance. To modify the configurations of an ECS instance or an elastic container instance (such as the number of vCPUs, memory size, and bandwidth), you can activate Alibaba Cloud Operation Orchestration Service (OOS). For more information, see What is OOS?

Auto Scaling can automatically scale the required number of ECS instances or elastic container instances. The following table describes the main components of Auto Scaling.

Component

Description

Scaling group

A scaling group contains instances of the same type that you can use in similar business scenarios. You can configure a scaling group to specify the type of instances that provide computing power. You can also specify the instance configuration source, the maximum and minimum numbers of instances, and the Classic Load Balancer (CLB) instances or the Application Load Balancer (ALB) server groups with which you want to associate the scaling group. If you have multiple business scenarios, you can create multiple scaling groups. Auto Scaling automatically allocates computing resources to each scaling group based on your configurations.

Instance configuration source

An instance configuration source specifies information about the template that is used to manage your ECS instances or elastic container instances. Auto Scaling uses the template of the ECS type to create ECS instances and the template of the Elastic Container Instance type to create elastic container instances during scale-out events.

Scaling rule

A scaling rule is used to trigger a scaling event. For example, you can create a scaling rule that triggers a scale-out event in which an ECS instance or elastic container instance is added. You can manually execute a scaling rule. You can also create an event-triggered task or a scheduled task to automatically execute a scaling rule. Scaling rules help change the maximum or minimum number of instances that are allowed or required in your scaling group.

Event-triggered task

Auto Scaling integrates with CloudMonitor to monitor the metrics of your scaling group in real time. When a monitored metric reaches its specified threshold, the associated scaling rule is triggered.

Scheduled task

You can create scheduled tasks to automatically execute scaling rules at the specified points in time.
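
The way a scaling group, a scaling rule, and an event-triggered task fit together can be sketched in Python. This is a simplified model under stated assumptions, not the Auto Scaling API: all class and field names are hypothetical, and the sketch assumes that a rule adjusts the current instance count and that the result is kept within the group's minimum and maximum numbers of instances.

```python
from dataclasses import dataclass


@dataclass
class ScalingGroup:
    min_size: int   # minimum number of instances
    max_size: int   # maximum number of instances
    current: int    # current number of instances


@dataclass
class ScalingRule:
    adjustment: int  # positive adds instances, negative removes them

    def execute(self, group: ScalingGroup) -> int:
        target = group.current + self.adjustment
        # Assumption: the result stays within the group's limits.
        group.current = max(group.min_size, min(group.max_size, target))
        return group.current


@dataclass
class EventTriggeredTask:
    threshold: float   # metric threshold, for example CPU utilization in %
    rule: ScalingRule

    def on_metric(self, value: float, group: ScalingGroup) -> bool:
        """Execute the rule when the metric reaches the threshold."""
        if value >= self.threshold:
            self.rule.execute(group)
            return True
        return False


group = ScalingGroup(min_size=1, max_size=3, current=2)
task = EventTriggeredTask(threshold=60.0, rule=ScalingRule(adjustment=1))

fired_low = task.on_metric(45.0, group)   # below the threshold, no scaling
fired_high = task.on_metric(75.0, group)  # threshold reached, one instance added
task.on_metric(90.0, group)               # already at max_size, count is capped
```

Keeping the rule separate from the task mirrors the component table: the same rule could also be executed manually or by a scheduled task.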

Auto Scaling can run only after you configure and enable a scaling group and specify the instance configuration source for the scaling group. You can specify other configurations based on your business requirements. The following figure shows how to use Auto Scaling.

[Figure: How to use Auto Scaling]

Auto Scaling provides the following features to meet different business requirements:

  • If a scaling event is successful, fails, or is rejected, Auto Scaling sends notifications based on rules. The following table describes the rules.

    Rule

    Description

    Regular notification rule

    Auto Scaling sends notifications by using text messages, internal messages, and emails.

    Advanced notification rule

    Auto Scaling sends notifications to CloudMonitor or Simple Message Queue (SMQ, formerly MNS). If you use SMQ, notifications are sent to the specified SMQ topic or SMQ queue. You are charged when you use SMQ. For more information about the pricing of SMQ, see Billing.

  • Auto Scaling provides features to help you manage instances in a scaling group. The following table describes the features.

    Feature

    Description

    Lifecycle hook

    A lifecycle hook is used to manage the lifecycle of ECS instances or elastic container instances in a scaling group. During a scaling event, a lifecycle hook can be triggered to switch the status of ECS instances or elastic container instances to Pending Add or Pending Remove. You can perform operations on the instances until the lifecycle hook times out.

    Manual instance management

    Auto Scaling supports manually adding or removing ECS instances, elastic container instances, or Alibaba Cloud-managed third-party instances to or from scaling groups.

    Rolling update

    If your scaling group is of the ECS type, you can use the rolling update feature to manage the ECS instances. You can create rolling update tasks to update the configurations of multiple ECS instances at the same time. For example, you can update images, run scripts, or install OOS packages on ECS instances that are in the In Service state.
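
The lifecycle hook behavior described in the table can be modeled with a short Python sketch. The LifecycleHook class and its method are hypothetical and are not part of the Auto Scaling API. The sketch covers only a scale-out event and assumes that the instance moves to the In Service state once the hook times out, which the text above does not state explicitly.

```python
from dataclasses import dataclass


@dataclass
class LifecycleHook:
    timeout_seconds: int  # how long the instance is held for custom operations

    def status_at(self, elapsed_seconds: int) -> str:
        """Return the instance status a given time after the scale-out event starts."""
        if elapsed_seconds < self.timeout_seconds:
            # The hook is active: custom operations can run on the instance.
            return "Pending Add"
        # Assumption: after the timeout, the instance enters service.
        return "In Service"


hook = LifecycleHook(timeout_seconds=300)
status_start = hook.status_at(0)
status_after_timeout = hook.status_at(300)
```

The window before the timeout is where operations such as installing software or warming up a cache would run.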

Scenarios

Auto Scaling provides various scaling features that you can use in the following business scenarios:

  • Your workload fluctuations can be predicted.

    For example, a video production company whose workload fluctuations can be predicted uses Auto Scaling to create scheduled tasks. Auto Scaling automatically adds an ECS instance or elastic container instance to help the company handle traffic spikes that occur at 20:00:00 every Friday.

  • Your workload fluctuations cannot be predicted.

    For example, a video production company whose workload fluctuations cannot be predicted uses Auto Scaling to create event-triggered tasks and monitor the CPU utilization of instances. If the CPU utilization exceeds 60%, Auto Scaling automatically adds an ECS instance or elastic container instance to help the company handle traffic spikes.

For more information, see Scenarios.
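
For the predictable-workload scenario, the trigger time of a weekly scheduled task (20:00:00 every Friday in the example above) can be computed as in the following Python sketch. The function name is hypothetical, and the sketch only illustrates the scheduling arithmetic, not how Auto Scaling implements scheduled tasks.

```python
from datetime import datetime, timedelta

FRIDAY = 4  # datetime.weekday() numbers Monday as 0


def next_friday_2000(now: datetime) -> datetime:
    """Return the next occurrence of Friday 20:00:00 strictly after `now`."""
    days_ahead = (FRIDAY - now.weekday()) % 7
    candidate = (now + timedelta(days=days_ahead)).replace(
        hour=20, minute=0, second=0, microsecond=0
    )
    if candidate <= now:
        # It is already Friday 20:00 or later, so use next week's slot.
        candidate += timedelta(days=7)
    return candidate


# Thursday, Oct 31, 2024 at noon: the next trigger is the following day.
trigger = next_friday_2000(datetime(2024, 10, 31, 12, 0))
```

A scheduled task would execute its scaling rule at the returned time, so the added instance is ready before the expected traffic spike.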

How it works

Auto Scaling triggers scaling events based on scaling modes to add or remove ECS instances or elastic container instances to or from scaling groups. ECS instances or elastic container instances are used to process client requests. You can use Auto Scaling to add or remove instances based on changing business demands. For more information, see How Auto Scaling works.

Billing rules

Auto Scaling is free of charge. However, you are charged for using ECS instances, elastic container instances, ApsaraDB RDS instances, SLB instances (including ALB instances, ALB server groups, and NLB server groups), and SMQ topics or queues in the Auto Scaling console. For more information, see Billing overview.

Usage methods

  • Auto Scaling console: a web page that supports interactive operations.

  • API: a remote procedure call (RPC) API that supports GET and POST requests. For more information about API operations, see List of operations by function. If you want to call the Auto Scaling API, use one of the following common developer tools:

    • Alibaba Cloud CLI: a flexible and scalable management tool based on Alibaba Cloud APIs. You can use the CLI to encapsulate Alibaba Cloud native APIs and develop custom features.

    • OpenAPI Explorer: a tool that allows you to obtain API operations, call API operations online, and dynamically generate SDK sample code.

Related services

Service

Description

ECS

A ready-to-use and scalable IaaS-level service provided by Alibaba Cloud.

Elastic Container Instance

An agile and secure serverless container runtime service provided by Alibaba Cloud. Elastic container instances are scalable and secure for your business system, and can help reduce resource and O&M costs.

ApsaraDB RDS

A secure, reliable, cost-effective, and scalable online database service that helps you resolve database O&M issues.

SLB

A load balancing service that distributes network traffic on demand. You can use SLB to eliminate single points of failure in application systems and improve application availability. SLB provides the following types of load balancers: ALB, NLB, and CLB.

CloudMonitor

A service that monitors Alibaba Cloud resources and Internet applications. CloudMonitor helps you fully understand the usage of Alibaba Cloud resources and the status of your business. This way, you can handle faulty resources at the earliest opportunity to ensure that your business runs as expected.

SMQ

An efficient, reliable, secure, convenient, and scalable lightweight messaging service. SMQ allows developers to transfer data and messages between distributed components of applications to build loosely coupled systems.