User guide for scaling groups - Auto Scaling - Alibaba Cloud Documentation Center

A scaling group is the fundamental unit of Auto Scaling. It manages service instances of the same type that serve similar business purposes. You can use a scaling group to quickly scale out the instances in a cluster. You can also use a scaling group to dynamically adjust the number of instances based on your business requirements, which can significantly reduce costs.

Benefits

Rapid scale-out capability and guarantee of high service availability
You can use a scaling group to efficiently expand service clusters and improve service availability.
Cost control
Scaling out service clusters increases the resources that you manage, which can raise operational costs. However, your business may not always run at full capacity. In this case, you can use the elasticity of cloud computing to release resources when the demand for computing resources decreases, which helps control costs.

Supported scaling solutions

Solution 1: Maintenance of a fixed number of available instances

Scenario: High availability maintenance without cluster scaling
Implementation method: Enable the Instance Health Check and Expected Number of Instances features.
After you enable the Instance Health Check feature for your scaling group, Auto Scaling automatically removes unhealthy instances from the scaling group. If the current number of instances in your scaling group is less than the expected number of instances, Auto Scaling automatically triggers a scale-out event to maintain a fixed number of available instances in the scaling group.
Example
For example, you enable the Expected Number of Instances feature for your scaling group and specify 10 as the expected number. If the actual number of instances in the scaling group is less than 10, Auto Scaling automatically triggers a scale-out event to increase the actual number to 10.
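The behavior described in Solution 1 can be sketched in a few lines of Python. This is an illustrative model only, not the Auto Scaling implementation: the function name `reconcile`, the instance IDs, and the health map are hypothetical.

```python
from __future__ import annotations


def reconcile(expected: int, instance_health: dict[str, bool]) -> tuple[list[str], int]:
    """Return (unhealthy instances to remove, number of instances to add).

    Hypothetical model: unhealthy instances are removed first, then the
    group is scaled out until it holds `expected` healthy instances.
    """
    unhealthy = sorted(i for i, ok in instance_health.items() if not ok)
    healthy_count = len(instance_health) - len(unhealthy)
    to_add = max(expected - healthy_count, 0)
    return unhealthy, to_add


# Ten instances, two of which fail the health check (made-up IDs).
health = {f"i-{n}": n not in (2, 5) for n in range(10)}
print(reconcile(10, health))  # (['i-2', 'i-5'], 2)
```

In this sketch, the group never scales in on its own: if more healthy instances exist than expected, the number to add is simply zero.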

Solution 2: Regularly scheduled autoscaling

Scenario: Stable resource utilization
Implementation method: Create scheduled tasks to enable regular autoscaling.
When resource utilization in the cluster increases, you can execute a scheduled task to trigger a scale-out event. When resource utilization in the cluster decreases, you can execute a scheduled task to trigger a scale-in event. For more information, see Scale ECS instances by triggering scheduled tasks.
Example
For example, your cluster experiences an increase in traffic every evening at 19:00 and a decrease every morning at 01:00. To handle the fluctuations in business demand, you can create the following scheduled tasks:
- Increased traffic: You can enable a scheduled task to increase the number of service replicas every evening at 19:00. This improves the capability of the cluster to handle the increased traffic.
- Decreased traffic: You can enable a scheduled task to decrease the number of service replicas every morning at 01:00. This improves resource utilization and maximizes cost efficiency.
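The effect of the two scheduled tasks above can be modeled as a lookup of the most recent task for a given time of day. The following Python sketch is an assumption-laden illustration: the replica counts (10 and 4) and the names `SCHEDULE` and `replicas_at` are hypothetical and do not come from the Auto Scaling API.

```python
from datetime import time

# (trigger time, replica count after the task runs); counts are made-up values.
SCHEDULE = [(time(19, 0), 10), (time(1, 0), 4)]


def replicas_at(now: time, schedule=SCHEDULE) -> int:
    """Return the replica count set by the most recent scheduled task."""
    ordered = sorted(schedule)
    # Before the first task of the day, the last task of the previous day applies.
    current = ordered[-1][1]
    for at, count in ordered:
        if at <= now:
            current = count
    return current


print(replicas_at(time(20, 30)))  # 10
print(replicas_at(time(0, 30)))   # 10 (still scaled out from 19:00)
print(replicas_at(time(9, 0)))    # 4
```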

Solution 3: Autoscaling based on resource utilization thresholds

Scenario: Sudden fluctuations in workloads
Implementation method:
Trigger scaling events when resource utilization exceeds or falls below the specified threshold
You can create event-triggered tasks to trigger scaling events. When resource utilization exceeds or falls below the specified threshold, the event-triggered tasks are automatically executed to trigger scaling events.
Add or remove instances after alerts are triggered
When you create an event-triggered task in your scaling group, you must configure a simple scaling rule in the task. The simple scaling rule specifies the action to add or remove a specific number of instances when the event-triggered task is executed.
Effect description
If you configure a simple scaling rule, you can directly add or remove a specific number of instances or allow Auto Scaling to maintain the desired number of instances. Examples:
When the average CPU utilization exceeds 80%, the event-triggered task executes the simple scaling rule to add N instances.
When the average CPU utilization falls below 70%, the event-triggered task executes the simple scaling rule to remove N instances.
For more information, see Scale ECS instances by triggering event-triggered tasks.
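As a rough illustration of the simple scaling rule example above, the following Python sketch adds or removes a fixed number of instances when the average CPU utilization crosses the 80% and 70% thresholds. The function name, the parameters, and the clamping of the result to minimum and maximum group sizes are assumptions made for this sketch, not a description of how Auto Scaling is implemented.

```python
def apply_simple_rule(current: int, avg_cpu: float, n: int,
                      min_size: int, max_size: int,
                      high: float = 80.0, low: float = 70.0) -> int:
    """Return the instance count after one evaluation of the rule."""
    if avg_cpu > high:
        target = current + n      # scale out by N instances
    elif avg_cpu < low:
        target = current - n      # scale in by N instances
    else:
        target = current          # between the thresholds: no change
    # Assumed here: the group size stays within [min_size, max_size].
    return max(min_size, min(target, max_size))


print(apply_simple_rule(5, 85.0, n=2, min_size=1, max_size=6))  # 6
print(apply_simple_rule(5, 75.0, n=2, min_size=1, max_size=6))  # 5
print(apply_simple_rule(5, 60.0, n=2, min_size=1, max_size=6))  # 3
```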
Add or remove instances based on resource utilization tiers
When you create an event-triggered task in your scaling group, you can configure a step scaling rule in the task. This enables autoscaling based on predefined resource utilization tiers when the event-triggered task is executed.
Important
Scaling groups of the Elastic Container Instance type do not support step scaling rules.
Effect description
If you configure a step scaling rule, you can enable autoscaling based on the adjustment steps predefined in the rule. Examples:
When the average CPU utilization falls between 60% and 70%, the event-triggered task executes the step scaling rule to remove one instance.
When the average CPU utilization falls between 30% and 60%, the event-triggered task executes the step scaling rule to remove three instances.
When the average CPU utilization falls below 30%, the event-triggered task executes the step scaling rule to remove five instances.
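The tiers in the step scaling example above can be expressed as a small lookup table. The following Python sketch is illustrative only. The names `STEPS` and `instances_to_remove` are hypothetical, and the choice to treat each lower bound as inclusive and each upper bound as exclusive is an assumption, because the example does not state how boundary values are handled.

```python
# (lower bound, upper bound, instances to remove); bounds are CPU percentages.
STEPS = [
    (60.0, 70.0, 1),
    (30.0, 60.0, 3),
    (0.0, 30.0, 5),
]


def instances_to_remove(avg_cpu: float) -> int:
    """Return how many instances the matching tier removes, or 0 if none match."""
    for lower, upper, count in STEPS:
        if lower <= avg_cpu < upper:   # assumed: lower inclusive, upper exclusive
            return count
    return 0


print(instances_to_remove(65.0))  # 1
print(instances_to_remove(45.0))  # 3
print(instances_to_remove(10.0))  # 5
print(instances_to_remove(75.0))  # 0
```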
Maintain the desired resource utilization
You can create a target tracking scaling rule in your scaling group to maintain the desired resource utilization.
Example
You create a target tracking scaling rule in a scaling group of the Elastic Compute Service (ECS) type and specify 80% as the desired average CPU utilization. In this case, Auto Scaling dynamically adds or removes instances to maintain the average CPU utilization at 80%.
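To build intuition for target tracking, the following Python sketch uses a common proportional estimate: the number of instances is scaled by the ratio of the observed metric to the target. This formula is an assumption chosen for illustration. This guide does not describe the algorithm that Auto Scaling actually uses, and the function name `desired_capacity` is hypothetical.

```python
import math


def desired_capacity(current: int, avg_cpu: float, target: float = 80.0) -> int:
    """Estimate the instance count that brings average CPU close to `target`.

    Assumes the total load stays constant and is spread evenly across instances.
    """
    return max(1, math.ceil(current * avg_cpu / target))


print(desired_capacity(4, 100.0))  # 5: utilization above target, scale out
print(desired_capacity(10, 40.0))  # 5: utilization below target, scale in
print(desired_capacity(4, 80.0))   # 4: already at target
```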
Differences between the implementation methods
- The implementation method based on simple scaling rules or step scaling rules is more flexible and customizable. You control the number of instances to add or remove when event-triggered tasks are executed. Step scaling rules also let you scale instances based on resource utilization tiers.
- The implementation method based on target tracking scaling rules is simpler. You only need to specify the desired resource utilization.

Solution 4: Custom scaling

If none of the preceding solutions meets your business requirements, you can configure a custom scaling solution.

You can manually execute scaling rules or modify the number of instances to trigger scaling events. For more information, see Manually scale ECS instances with a few clicks.

Note

Custom scaling supports API calls. You can call API operations to configure custom scaling solutions based on your business requirements.

Solution 5: Predictive scaling

Auto Scaling can also automatically make adjustments to meet predicted resource demands.

You can first test a predictive scaling rule by enabling only prediction and assessing the accuracy and relevance of the predictions. If the results are satisfactory, you can enable both prediction and scaling to automatically generate predictive tasks and scale instances based on scheduled plans. For more information, see View the prediction of a predictive scaling rule.

Usage notes

Before you use a scaling group, make sure that the instances on which you deploy your business support horizontal scaling.

Auto Scaling horizontally scales instances. We recommend that you consider the potential impact of horizontal scaling on your business.

Data consistency
If your database is deployed on the instances in a scaling group, data inconsistency may occur after Auto Scaling scales out the instances. To resolve this issue, we recommend that you change your architecture by deploying the database separately and allowing all instances to access the same database. This keeps the service instances stateless.
Data security
Instances in scaling groups are automatically created and released. If you store data on the instances, make sure that you perform data backup operations to secure your data.

How do I use scaling groups?

Getting started

Advanced requirements

Business deployment: Automatically deploy business software packages on new instances

Enable automatic deployment by using images equipped with software packages
- Scaling groups of the ECS type
  Build a custom image that is equipped with your software packages, and modify the instance configuration source to use the image.
- Scaling groups of the Elastic Container Instance type
  Build a Docker image for your business, and modify the instance configuration source to use the image.
Automatically run the deployment script upon instance startup
- Custom instance user data
  If your scaling group is of the ECS type, you can enable the Instance User Data feature. You can include scripts in the custom user data to deploy service software packages. For more information, see Use the Instance User Data feature to automatically configure ECS instances.
- Lifecycle hook
  If your scaling group is of the ECS type, you can enable the Lifecycle Hook feature. Lifecycle hooks allow you to deploy service software packages on instances before the instances are added to the scaling group after scale-out events are triggered. For more information, see Automatically execute scripts on ECS instances.

Rolling update: Update instance images or run scripts

You can update instance images or batch execute scripts on multiple instances by using the Rolling Update feature. For more information, see Rolling update.

Association with cloud databases: Allow new instances to access databases

You can configure an identical security group for all instances in a scaling group. You can also add the private IP addresses of new instances in a scaling group to the IP address whitelists of the cloud databases associated with the scaling group. This allows for access from the new instances to the cloud databases.

References

Associate instances in a scaling group with cloud databases

Association with load balancers: Configure an access entry point for instances in a scaling group

If your instance cluster uses a load balancer as the access entry point, you can associate the load balancer with the scaling group that manages the instances. After you associate a load balancer with a scaling group, new instances in the scaling group are automatically added to the backend server groups of the load balancer.

References

Attach or detach SLB instances to or from scaling groups

Perform custom operations during scaling

You can use lifecycle hooks to put instances into a Pending state and perform custom operations on the instances, such as mounting File Storage NAS file systems, binding elastic IP addresses (EIPs), and executing custom scripts.

References

Design a scale-in policy

When your business has lower workloads, Auto Scaling automatically scales in your resources to minimize costs. During a scale-in process, you may have questions about controlling the scale-in frequency, gracefully scaling in instances, and selecting which instances to scale in.

References

Scale-in guide

Optimize resource costs

If you use a scaling group, you can create preemptible instances and enable a cost optimization policy to optimize resource costs.

References

For information about how to use preemptible instances to optimize resource costs, see Save your money with Auto Scaling.
For information about how to use multiple instance types and a cost optimization policy to optimize resource costs, see Combine a cost optimization policy with the selection of multiple instance types.

Improve the disaster recovery capability and scale-out success rate

Scale-out failures may occur due to insufficient resources in a single zone. To resolve this issue, you can specify multiple zones and instance types to reduce the risk of such failures. You can also enable a balanced distribution policy in your scaling group to implement multi-zone disaster recovery.
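The idea behind a balanced distribution across zones can be sketched as an even split of the desired instance count. The following Python example is a simplified illustration, not the Auto Scaling policy itself: the function name `distribute` and the zone names are hypothetical, and the sketch ignores per-zone resource availability.

```python
from __future__ import annotations


def distribute(total: int, zones: list[str]) -> dict[str, int]:
    """Split `total` instances across zones so that counts differ by at most one."""
    base, extra = divmod(total, len(zones))
    # The first `extra` zones receive one additional instance.
    return {zone: base + (1 if i < extra else 0) for i, zone in enumerate(zones)}


print(distribute(7, ["zone-a", "zone-b", "zone-c"]))
# {'zone-a': 3, 'zone-b': 2, 'zone-c': 2}
```

With an even split like this, the loss of a single zone affects only about one third of the instances in a three-zone group.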

References

For information about how to use a balanced distribution policy to implement multi-zone disaster recovery, see Use a balanced distribution policy to deploy high-availability computing clusters.
For information about how to specify multiple instance types to reduce the risk of scale-out failures caused by insufficient resources, see Create a multi-instance type scaling group by using a launch template.

Scale Kubernetes nodes

You can use scaling groups to enable autoscaling of Kubernetes nodes.

References

Automatic scaling of Kubernetes nodes
