The scaling group is a fundamental element of Auto Scaling. A scaling group manages service instances that are of the same type and serve similar business purposes. You can use a scaling group to accelerate horizontal expansion of instances in a cluster. You can also use a scaling group to dynamically scale the number of instances based on your business requirements, which allows for significant cost savings.
Benefits
Rapid scale-out capability and guarantee of high service availability
You can use a scaling group to efficiently expand service clusters and improve service availability.
Cost control
Horizontally scaling service clusters can lead to higher operational costs due to increased resource management. However, your business may not always operate at full capacity. In this case, you can use the elastic capabilities of cloud computing to reduce resource investment when computing resource requirements decrease, which helps manage costs.
Supported scaling solutions
Solution 1: Maintenance of a fixed number of available instances
Scenario: High availability maintenance without cluster scaling
Implementation method: Enable the Instance Health Check and Expected Number of Instances features.
After you enable the Instance Health Check feature for your scaling group, Auto Scaling automatically removes unhealthy instances from the scaling group. If the current number of instances in your scaling group is less than the expected number of instances, Auto Scaling automatically triggers a scale-out event to maintain a fixed number of available instances in the scaling group.
Example
You enable the Expected Number of Instances feature for your scaling group and specify 10 as the expected number. If the actual number of instances in the scaling group drops below 10, for example because unhealthy instances are removed, Auto Scaling automatically triggers a scale-out event to increase the actual number to 10.
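The behavior described above can be sketched as a small reconciliation step. This is only an illustration of the logic, not the Auto Scaling implementation: the function name `reconcile` and the `instance_health` mapping are hypothetical names introduced for this sketch.

```python
def reconcile(instance_health, expected_count):
    """Illustrative sketch only; names and data shapes are assumptions.

    instance_health maps an instance ID to True (healthy) or False (unhealthy).
    Returns the instances to remove and the number of instances to add so that
    the group ends up with expected_count healthy instances.
    """
    # Unhealthy instances are removed from the scaling group.
    to_remove = sorted(i for i, healthy in instance_health.items() if not healthy)
    healthy_count = len(instance_health) - len(to_remove)
    # A scale-out is needed only when the group is below the expected number.
    to_add = max(0, expected_count - healthy_count)
    return to_remove, to_add


# Ten instances, two of which fail the health check.
group = {f"i-{n:02d}": True for n in range(10)}
group["i-03"] = False
group["i-07"] = False
removed, added = reconcile(group, expected_count=10)
```

In this sketch, the two unhealthy instances are removed and two new instances are requested, which brings the group back to 10 available instances.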
Solution 2: Regularly scheduled autoscaling
Scenario: Stable resource utilization
Implementation method: Create scheduled tasks to enable regular autoscaling.
When resource utilization in the cluster increases, you can execute a scheduled task to trigger a scale-out event. When resource utilization in the cluster decreases, you can execute a scheduled task to trigger a scale-in event. For more information, see Scale ECS instances by triggering scheduled tasks.
Example
Your cluster experiences an increase in traffic every evening at 19:00 and a decrease every morning at 01:00. To handle the fluctuations in business demand, you can create the following scheduled tasks:
Increased traffic: You can enable a scheduled task to increase the number of service replicas every evening at 19:00. This improves the capability of the cluster to handle the increased traffic.
Decreased traffic: You can enable a scheduled task to decrease the number of service replicas every morning at 01:00. This improves resource utilization and maximizes cost efficiency.
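The two scheduled tasks above can be modeled as a daily schedule that determines which capacity is in effect at a given time. The following is a minimal sketch. The capacities (10 and 4) and the names `SCHEDULED_TASKS` and `capacity_at` are assumptions made for illustration and do not come from Auto Scaling.

```python
from datetime import time

# Hypothetical schedule: (trigger time, number of instances after the task runs).
SCHEDULED_TASKS = [
    (time(19, 0), 10),  # scale out for the evening traffic increase
    (time(1, 0), 4),    # scale in after traffic decreases
]


def capacity_at(now, tasks=SCHEDULED_TASKS):
    """Return the capacity set by the most recent scheduled task at time `now`."""
    ordered = sorted(tasks)
    # Before the first task of the day, the last task of the previous day applies.
    capacity = ordered[-1][1]
    for trigger_time, task_capacity in ordered:
        if now >= trigger_time:
            capacity = task_capacity
    return capacity
```

With this schedule, `capacity_at(time(20, 0))` and `capacity_at(time(0, 30))` both fall in the evening window, while `capacity_at(time(12, 0))` falls in the scaled-in window.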
Solution 3: Autoscaling based on resource utilization thresholds
Scenario: Sudden fluctuations in workloads
Implementation method:
Trigger scaling events when resource utilization exceeds or falls below the specified threshold
You can create event-triggered tasks to trigger scaling events. When resource utilization exceeds or falls below the specified threshold, the event-triggered tasks are automatically executed to trigger scaling events.
Maintain the desired resource utilization
You can create a target tracking scaling rule in your scaling group to maintain the desired resource utilization.
Example
You create a target tracking scaling rule in a scaling group of the Elastic Compute Service (ECS) type and specify 80% as the desired average CPU utilization. In this case, Auto Scaling dynamically adds or removes instances to maintain the average CPU utilization at 80%.
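To make the idea of target tracking concrete, the following sketch estimates how many instances are needed to bring the average CPU utilization back to the target. It assumes that load is spread evenly across instances, so the required capacity is proportional to the ratio of the observed utilization to the target. This is a common approximation and not necessarily the algorithm that Auto Scaling uses. The function name, parameters, and group size limits are hypothetical.

```python
import math


def target_tracking_capacity(current_instances, average_cpu,
                             target_cpu=80.0, min_size=1, max_size=100):
    """Estimate the instance count that keeps average CPU near target_cpu.

    Assumption: total load stays constant and is evenly distributed, so
    desired = current * (observed utilization / target utilization).
    """
    if current_instances <= 0:
        raise ValueError("current_instances must be positive")
    desired = math.ceil(current_instances * average_cpu / target_cpu)
    # Keep the result within the (hypothetical) group size limits.
    return max(min_size, min(max_size, desired))
```

For example, under these assumptions a group of 10 instances at 96% average CPU utilization would grow to 12 instances, and the same group at 40% would shrink to 5 instances.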
Differences between the implementation methods
The implementation method based on simple scaling rules or step scaling rules provides more flexibility and customization. You can control the number of instances to add or remove after event-triggered tasks are triggered. This implementation method also allows for instance scaling based on changes in resource utilization tiers.
The implementation method based on target tracking scaling rules is simpler. You only need to focus on the desired resource utilization.
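The tier-based behavior of a step scaling rule can be sketched as a lookup from utilization tiers to adjustments. The thresholds and adjustment sizes below are hypothetical values chosen for illustration, as are the names `STEPS` and `step_adjustment`.

```python
# Hypothetical tiers: (CPU utilization lower bound in percent, instances to add).
STEPS = [
    (70.0, 1),
    (80.0, 2),
    (90.0, 4),
]


def step_adjustment(average_cpu, steps=STEPS):
    """Return the number of instances to add for the highest tier reached."""
    # Check the highest tier first so that only one tier applies.
    for lower_bound, adjustment in sorted(steps, reverse=True):
        if average_cpu >= lower_bound:
            return adjustment
    # Utilization is below every tier, so no scaling is needed.
    return 0
```

In contrast to a target tracking scaling rule, this approach requires you to choose each tier and adjustment yourself, which is where the additional flexibility comes from.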
Solution 4: Custom scaling
If none of the preceding solutions meets your business requirements, you can configure a custom scaling solution.
You can manually execute scaling rules or modify the number of instances to trigger scaling events. For more information, see Manually scale ECS instances with a few clicks.
Custom scaling supports API calls. You can call API operations to configure custom scaling solutions based on your business requirements.
Solution 5: Predictive scaling
Auto Scaling can also automatically make adjustments to meet predicted resource demands.
This solution allows you to perform initial testing of predictive scaling rules by enabling only prediction to assess accuracy and relevance. If the results are satisfactory, you can enable both prediction and scaling to automatically generate predictive tasks and scale instances based on scheduled plans. For more information, see View the prediction of a predictive scaling rule.
Usage notes
Before you use a scaling group, make sure that the instances on which you deploy your business support horizontal scaling.
Auto Scaling horizontally scales instances. We recommend that you consider the potential impact of horizontal scaling on your business.
Data consistency
If your database is deployed on instances, data inconsistency may occur after you use Auto Scaling to horizontally scale out instances. To resolve this issue, we recommend that you change your architectural design by separately deploying the database and allowing all instances to access the same database. This helps achieve service statelessness.
Data security
Instances in scaling groups are automatically created and released. If you store data on the instances, make sure that you perform data backup operations to secure your data.