Auto Scaling: Create a scaling group of the Elastic Container Instance type

Last Updated: Oct 23, 2024

A scaling group is a collection of Elastic Compute Service (ECS) instances or elastic container instances that can be used in similar business scenarios. If you run your container applications on elastic container instances, you can activate Auto Scaling to automatically scale the number of elastic container instances based on your business requirements. This ensures smooth service delivery and maximizes cost efficiency.

Procedure

Alibaba Cloud provides three methods to create a scaling group of the Elastic Container Instance type. Choose the method that fits your business requirements and follow the corresponding instructions in this topic.

Method 1: Create a scaling group based on the configurations of an existing instance

If you want to create a scaling group of the Elastic Container Instance type based on the configurations of an existing elastic container instance, perform the following steps:

  1. Go to the Create Scaling Group page.

    1. Log on to the Auto Scaling console.

      Note

      The first time you log on to Auto Scaling, follow the on-screen instructions to activate Auto Scaling and grant the required permissions. For more information, see Service-linked role.

    2. In the top navigation bar, select the region where Auto Scaling is activated.

    3. In the left-side navigation pane, click Scaling Groups.

    4. On the Scaling Groups page, click Create to go to the Create Scaling Group page.

  2. Click the Create by Form tab and follow the on-screen instructions to create a scaling group.

    The following table describes the required parameters for creating a scaling group based on the configurations of an existing elastic container instance. For information about other parameters, see Parameters.

    Required parameters

    Parameter

    Description

    Scaling Group Name

    Enter a name for the scaling group. The name must meet the requirements displayed on the Auto Scaling console.

    Type

    Specify the type of the scaling group. In this example, ECI is used.

    Instance Configuration Source

    Specify an instance configuration source. In this example, Select Existing Instance is used. Auto Scaling creates elastic container instances in the scaling group based on the configurations of the selected instance.

    Minimum Number of Instances

    Specify the lower limit of the number of elastic container instances in the scaling group. If the actual number of elastic container instances is less than the lower limit, Auto Scaling triggers a scale-out event to add elastic container instances to the scaling group until the lower limit is reached.

    Maximum Number of Instances

    Specify the upper limit of the number of elastic container instances in the scaling group. If the actual number of elastic container instances exceeds the upper limit, Auto Scaling triggers a scale-in event to remove excess elastic container instances from the scaling group.

    Default Cooldown Time (Seconds)

    Specify a default cooldown period for the scaling group. Unit: seconds. Default value: 300. For more information, see Cooldown period.

    VPC

    Specify a virtual private cloud (VPC) for the scaling group. All elastic container instances in the scaling group communicate with each other over the VPC. When you create a scaling group based on the configurations of an existing elastic container instance, the VPC parameter is automatically configured as the VPC of the elastic container instance. You can also modify the VPC parameter based on your business requirements.

    Warning

    After you create the scaling group, you can no longer modify the VPC parameter.

    vSwitch

    After you select a VPC for the scaling group, you must select one or more vSwitches of the VPC for the scaling group. All elastic container instances in the scaling group communicate with each other by using the selected vSwitches.

    Important

    We recommend that you select multiple vSwitches. If you select only one vSwitch, scale-out failures caused by insufficient resources in a single zone may occur.

  3. Click Create.

Note

  • If you create a scaling group based on the configurations of an existing elastic container instance, a scaling configuration is automatically created. You can manage the scaling configuration based on your business requirements.

  • If you want the scaling group to provide services immediately after the creation is complete, enable the scaling group. For information about how to enable a scaling group, see Enable or disable scaling groups.

Method 2: Create a scaling group from scratch

If you want to configure an instance configuration source after you create a scaling group, perform the following steps:

  1. Go to the Create Scaling Group page.

    1. Log on to the Auto Scaling console.

      Note

      The first time you log on to Auto Scaling, follow the on-screen instructions to activate Auto Scaling and grant the required permissions. For more information, see Service-linked role.

    2. In the top navigation bar, select the region where Auto Scaling is activated.

    3. In the left-side navigation pane, click Scaling Groups.

    4. On the Scaling Groups page, click Create to go to the Create Scaling Group page.

  2. Click the Create by Form tab and follow the on-screen instructions to create a scaling group.

    The following table describes the required parameters for creating a scaling group from scratch. For information about other parameters, see Parameters.

    Required parameters

    Parameter

    Description

    Scaling Group Name

    Enter a name for the scaling group. The name must meet the requirements displayed on the Auto Scaling console.

    Type

    Specify the type of the scaling group. In this example, ECI is used.

    Note

    This topic describes only how to create a scaling group of the Elastic Container Instance type. For information about how to create a scaling group of the ECS type, see Create a scaling group of the ECS type.

    Instance Configuration Source

    Specify an instance configuration source. In this example, Create from Scratch is used, which means that you configure an instance configuration source after you create the scaling group. For more information, see Overview.

    Minimum Number of Instances

    Specify the lower limit of the number of elastic container instances in the scaling group. If the actual number of elastic container instances is less than the lower limit, Auto Scaling triggers a scale-out event to add elastic container instances to the scaling group until the lower limit is reached.

    Maximum Number of Instances

    Specify the upper limit of the number of elastic container instances in the scaling group. If the actual number of elastic container instances exceeds the upper limit, Auto Scaling triggers a scale-in event to remove excess elastic container instances from the scaling group.

    Default Cooldown Time (Seconds)

    Specify a default cooldown period for the scaling group. Unit: seconds. Default value: 300. For more information, see Cooldown period.

    VPC

    Specify a VPC for the scaling group. All elastic container instances in the scaling group communicate with each other over the VPC.

    Warning

    After you create the scaling group, you can no longer modify the VPC parameter.

    vSwitch

    After you select a VPC for the scaling group, you must select one or more vSwitches of the VPC for the scaling group. All elastic container instances in the scaling group communicate with each other by using the selected vSwitches.

    Important

    We recommend that you select multiple vSwitches. If you select only one vSwitch, scale-out failures caused by insufficient resources in a single zone may occur.

  3. Click Create.

Note

After you create the scaling group from scratch, you can follow the on-screen instructions to immediately create a scaling configuration. You can also create a scaling configuration for the scaling group later. For more information, see Create a scaling configuration of the Elastic Container Instance type.

Method 3: Create a scaling group by using a Kubernetes YAML file

If you want to create and manage a scaling group by using a Kubernetes YAML file, perform the following steps:

Use the Auto Scaling console

In this example, the nginx:latest image is used to show how to create a scaling group of the Elastic Container Instance type in the Auto Scaling console by using a Kubernetes YAML file.

  1. Go to the Create Scaling Group page.

    1. Log on to the Auto Scaling console.

      Note

      The first time you log on to Auto Scaling, follow the on-screen instructions to activate Auto Scaling and grant the required permissions. For more information, see Service-linked role.

    2. In the top navigation bar, select the region where Auto Scaling is activated.

    3. In the left-side navigation pane, click Scaling Groups.

    4. On the Scaling Groups page, click Create to go to the Create Scaling Group page.

  2. Click the Create by YAML File tab and write the YAML file.

    Sample code:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: nginx-demo
      annotations:
        # The name of the scaling group.
        k8s.aliyun.com/ess-scaling-group-name: use-yaml-create-scaling-group
        # The lower limit of the number of elastic container instances in the scaling group.
        k8s.aliyun.com/ess-scaling-group-min-size: '0'
        # The upper limit of the number of elastic container instances in the scaling group.
        k8s.aliyun.com/ess-scaling-group-max-size: '5'
    spec:
      selector:
        matchLabels:
          app: nginx-demo
      # The expected number of elastic container instances in the scaling group.
      replicas: 1
      template:
        metadata:
          labels:
            app: nginx-demo
          annotations:
            # Specifies whether to automatically create and bind elastic IP addresses (EIPs) to elastic container instances.
            k8s.aliyun.com/eci-with-eip: 'true'
            # The IDs of the vSwitches. You can specify up to eight vSwitches in the same VPC. Separate multiple vSwitches with commas (,).
            k8s.aliyun.com/eci-vswitch: vsw-bp******1,vsw-bp******2,vsw-bp******3,vsw-bp******4
            # The IDs of the security groups. You can specify up to five security groups in the same VPC. Separate multiple security groups with commas (,).
            k8s.aliyun.com/eci-security-group: sg-bp******1,sg-bp******2
        spec:
          containers:
            - name: nginx
              # The container image.
              image: nginx:latest
              ports:
                - containerPort: 80
                  name: http
                - containerPort: 443
                  name: https
              resources:
                requests:
                  memory: 0.05Gi
                  cpu: 50m
                limits:
                  memory: 1Gi
                  cpu: '1'
    

    The following table describes the parameters used in the preceding sample code. For more information about the supported parameters, see YAML parameters.

    Parameters in the sample code

    Parameter

    Description

    Example

    k8s.aliyun.com/ess-scaling-group-name

    The name of the scaling group.

    use-yaml-create-scaling-group

    k8s.aliyun.com/ess-scaling-group-min-size

    The lower limit of the number of elastic container instances in the scaling group.

    0

    k8s.aliyun.com/ess-scaling-group-max-size

    The upper limit of the number of elastic container instances in the scaling group.

    5

    k8s.aliyun.com/eci-with-eip

    Specifies whether to automatically assign EIPs to elastic container instances. If you want to assign EIPs to elastic container instances, set the value to true.

    true

    k8s.aliyun.com/eci-vswitch

    The IDs of the vSwitches. You can specify up to eight vSwitches in the same VPC. Separate multiple vSwitches with commas (,).

    Important

    If you specify no VPC or vSwitch, Auto Scaling automatically creates and uses a default VPC and vSwitch. For more information, see Default VPCs and default vSwitches.

    vsw-bp******1,vsw-bp******2,vsw-bp******3

    k8s.aliyun.com/eci-security-group

    The IDs of the security groups. You can specify up to five security groups in the same VPC. Separate multiple security groups with commas (,).

    Important

    If you specify multiple security groups, make sure that the security groups belong to the same VPC.

    sg-bp******1,sg-bp******2

  3. Click Create.
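The limits described in the parameter table can be checked locally before the YAML file is submitted. The following is a minimal sketch, not part of Auto Scaling: the function name `check_scaling_annotations` is hypothetical, the manifest is assumed to be already loaded into a Python dictionary, and the check that `replicas` lies between the minimum and maximum sizes is an assumption rather than a documented rule.

```python
MIN_KEY = "k8s.aliyun.com/ess-scaling-group-min-size"
MAX_KEY = "k8s.aliyun.com/ess-scaling-group-max-size"
VSWITCH_KEY = "k8s.aliyun.com/eci-vswitch"
SECURITY_GROUP_KEY = "k8s.aliyun.com/eci-security-group"


def _split_ids(value: str) -> list:
    """Split a comma-separated annotation value into non-empty IDs."""
    return [item.strip() for item in value.split(",") if item.strip()]


def check_scaling_annotations(deployment: dict) -> list:
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    group = deployment["metadata"]["annotations"]
    pod = deployment["spec"]["template"]["metadata"].get("annotations", {})

    # The annotation values are quoted strings in the YAML file.
    min_size = int(group[MIN_KEY])
    max_size = int(group[MAX_KEY])
    replicas = deployment["spec"]["replicas"]

    if min_size > max_size:
        problems.append("min-size is greater than max-size")
    # Assumption: the expected number should fall within the limits.
    elif not min_size <= replicas <= max_size:
        problems.append("replicas is outside the min-size to max-size range")

    if len(_split_ids(pod.get(VSWITCH_KEY, ""))) > 8:
        problems.append("more than eight vSwitches")
    if len(_split_ids(pod.get(SECURITY_GROUP_KEY, ""))) > 5:
        problems.append("more than five security groups")
    return problems
```

An empty result only means that the documented count limits are respected. It does not confirm that the vSwitches and security groups exist or belong to the same VPC.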

Use Alibaba Cloud CLI

You can run commands by using Alibaba Cloud Command Line Interface (CLI) to create and manage a scaling group. In this example, the nginx:latest image is used to show how to create a scaling group of the Elastic Container Instance type by submitting a YAML file in Alibaba Cloud CLI.

Important

Before you proceed, make sure that Alibaba Cloud CLI is installed and the required credentials and environment variables are configured. For more information, see What is Alibaba Cloud CLI?

  1. Create a file named use-yaml-create-scaling-group.yaml and add the following content to the file:

    Sample code:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: nginx-demo
      annotations:
        # The name of the scaling group.
        k8s.aliyun.com/ess-scaling-group-name: use-yaml-create-scaling-group
        # The lower limit of the number of elastic container instances in the scaling group.
        k8s.aliyun.com/ess-scaling-group-min-size: '0'
        # The upper limit of the number of elastic container instances in the scaling group.
        k8s.aliyun.com/ess-scaling-group-max-size: '5'
    spec:
      selector:
        matchLabels:
          app: nginx-demo
      # The expected number of elastic container instances in the scaling group.
      replicas: 1
      template:
        metadata:
          labels:
            app: nginx-demo
          annotations:
            # Specifies whether to automatically create and bind elastic IP addresses (EIPs) to elastic container instances.
            k8s.aliyun.com/eci-with-eip: 'true'
            # The IDs of the vSwitches. You can specify up to eight vSwitches in the same VPC. Separate multiple vSwitches with commas (,).
            k8s.aliyun.com/eci-vswitch: vsw-bp******1,vsw-bp******2,vsw-bp******3,vsw-bp******4
            # The IDs of the security groups. You can specify up to five security groups in the same VPC. Separate multiple security groups with commas (,).
            k8s.aliyun.com/eci-security-group: sg-bp******1,sg-bp******2
        spec:
          containers:
            - name: nginx
              # The container image.
              image: nginx:latest
              ports:
                - containerPort: 80
                  name: http
                - containerPort: 443
                  name: https
              resources:
                requests:
                  memory: 0.05Gi
                  cpu: 50m
                limits:
                  memory: 1Gi
                  cpu: '1'
    

    The following table describes the parameters used in the preceding sample code. For more information about the supported parameters, see YAML parameters.

    Parameters in the sample code

    Parameter

    Description

    Example

    k8s.aliyun.com/ess-scaling-group-name

    The name of the scaling group.

    use-yaml-create-scaling-group

    k8s.aliyun.com/ess-scaling-group-min-size

    The lower limit of the number of elastic container instances in the scaling group.

    0

    k8s.aliyun.com/ess-scaling-group-max-size

    The upper limit of the number of elastic container instances in the scaling group.

    5

    k8s.aliyun.com/eci-with-eip

    Specifies whether to automatically assign EIPs to elastic container instances. If you want to assign EIPs to elastic container instances, set the value to true.

    true

    k8s.aliyun.com/eci-vswitch

    The IDs of the vSwitches. You can specify up to eight vSwitches in the same VPC. Separate multiple vSwitches with commas (,).

    Important

    If you specify no VPC or vSwitch, Auto Scaling automatically creates and uses a default VPC and vSwitch. For more information, see Default VPCs and default vSwitches.

    vsw-bp******1,vsw-bp******2,vsw-bp******3

    k8s.aliyun.com/eci-security-group

    The IDs of the security groups. You can specify up to five security groups in the same VPC. Separate multiple security groups with commas (,).

    Important

    If you specify multiple security groups, make sure that the security groups belong to the same VPC.

    sg-bp******1,sg-bp******2

  2. In the directory of the use-yaml-create-scaling-group.yaml file, run the following command to create a scaling group:

    Important
    • In this example, the China (Hangzhou) region is used. You can modify the --RegionId parameter in the command based on your business requirements.

    • In this example, the command is run by using Alibaba Cloud CLI to call the ApplyScalingGroup operation. For information about the ApplyScalingGroup operation, see ApplyScalingGroup.

    aliyun ess ApplyScalingGroup --RegionId cn-hangzhou --Content "$(cat use-yaml-create-scaling-group.yaml)" --version 2022-02-22 --method POST --force
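If the command is run from a script instead of an interactive shell, the same arguments can be assembled programmatically. The following sketch only builds the argument list and uses the flags shown in the preceding command. The helper name `build_apply_command` is hypothetical, and the sketch assumes that Alibaba Cloud CLI is installed and configured as described earlier.

```python
from pathlib import Path


def build_apply_command(yaml_path: str, region_id: str = "cn-hangzhou") -> list:
    """Build the argument list for the ApplyScalingGroup call shown above."""
    # Shell command substitution, $(cat file), strips trailing newlines,
    # so the same is done here to pass identical content.
    content = Path(yaml_path).read_text(encoding="utf-8").rstrip("\n")
    return [
        "aliyun", "ess", "ApplyScalingGroup",
        "--RegionId", region_id,
        "--Content", content,
        "--version", "2022-02-22",
        "--method", "POST",
        "--force",
    ]


# Example (not executed here):
#   import subprocess
#   subprocess.run(build_apply_command("use-yaml-create-scaling-group.yaml"), check=True)
```

Passing the arguments as a list avoids shell quoting problems when the YAML content contains quotation marks or other special characters.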

Parameters

Basic parameters for creating a scaling group

Parameter

Description

Scaling Group Name

The name of a scaling group must be 2 to 64 characters in length, and can contain letters, digits, periods (.), underscores (_), and hyphens (-). The name must start with a letter or a digit.

Type

The type of instances that provide computing power in the scaling group. During scaling events, Auto Scaling adds or removes only instances of this type. Valid values:

  • ECS: ECS instances.

  • ECI: elastic container instances.

Instance Configuration Source

The instance configuration source. Auto Scaling creates instances in the scaling group based on the specified source. Valid values:

  • Launch Templates: A launch template contains information such as the key pair, Resource Access Management (RAM) role, instance type, and network settings. A launch template does not contain passwords. The Launch Templates setting is available only if you set the Type parameter to ECS.

    Scale-out failures caused by insufficient resources may often occur if you specify only one instance type. You can configure the Extend Launch Template parameter to specify multiple instance types to improve the success rate of scale-out events. For more information, see Create a multi-instance type scaling group by using a launch template.

  • Select Existing Instance: You must select an existing instance. Auto Scaling extracts the basic configurations of the instance to create a default scaling configuration.

    Important
    • The basic configurations that are extracted from an existing ECS instance include the instance type, network type, security group, and base image. Take note that the instance logon password and tags are not extracted. The base image is the image of the existing ECS instance. The base image does not include instance data such as application data. If you want the scaling configuration created from an existing ECS instance to contain the system configurations and instance data of the ECS instance, you must create a custom image from the ECS instance.

  • Create from Scratch: You can configure an instance configuration source after you create the scaling group. The instance configuration source can be a scaling configuration or a launch template. The steps that you must perform to create a scaling configuration vary based on the value of the Type parameter. For more information, see Create a scaling configuration of the ECS type and Create a scaling configuration of the Elastic Container Instance type.

Note

If you create a scaling group based on an existing ECS instance created in the ECS console, Auto Scaling automatically fills the instance configuration source and network type of the scaling group. We recommend that you retain the settings.

Suspend Process

You can suspend processes before you perform specific operations. For example, you can suspend the health check process before you stop an instance. This way, if the health check fails, the instance is not removed from the scaling group. You can suspend the following processes in a scaling group:

  • Scale-out: If you suspend a process of this type, Auto Scaling rejects all scale-out requests.

  • Scale-in: If you suspend a process of this type, Auto Scaling rejects all scale-in requests.

  • Health Check: If you suspend a process of this type, Auto Scaling suspends health checks and does not remove unhealthy instances.

  • Scheduled Task: If you suspend a process of this type, Auto Scaling does not execute the scaling rule that is associated with a scheduled task when the execution time of the scheduled task arrives.

  • Event-triggered Task: If you suspend a process of this type, Auto Scaling does not execute the scaling rule that is associated with an event-triggered task when the event-triggered task enters the Alert state.

For more information, see Suspend and resume scaling group processes.

Deletion Protection

After you enable this feature, you cannot delete the scaling group in the Auto Scaling console or by calling an API operation. This helps prevent scaling groups from being accidentally deleted.

Instance Health Check

After you enable this feature, Auto Scaling checks the status of instances on a regular basis. If Auto Scaling detects that an instance does not run as expected, Auto Scaling considers the instance unhealthy and removes the instance from the scaling group. For more information, see Instance lifecycles. Valid values:

  • Disable: disables Instance Health Check.

  • Instance Status Check: enables Instance Health Check.

  • Load Balancer Health Check: enables Instance Health Check and uses the health check results of Application Load Balancer (ALB) or Network Load Balancer (NLB) instances as the basis for checking the health status of instances in the scaling group. The health check results of Classic Load Balancer (CLB) instances cannot be used to check the health status of instances in the scaling group.

Maximum Life Span of Instance (Seconds)

The maximum lifespan of each instance in the scaling group. When the lifespan of an instance in the scaling group exceeds the value of this parameter, Auto Scaling automatically creates a new instance to replace the instance.

Note

This parameter is available only if you set the Type parameter to ECS.
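The naming rule given for the Scaling Group Name parameter at the start of this table can be checked before a name is entered in the console or written to a YAML file. The following sketch is an illustration only: the function name is hypothetical, and "letters" is assumed to mean ASCII letters, which may be narrower than what the console accepts.

```python
import re

# One leading letter or digit, followed by 1 to 63 characters from the allowed
# set, gives a total length of 2 to 64 characters.
_SCALING_GROUP_NAME = re.compile(r"[A-Za-z0-9][A-Za-z0-9._-]{1,63}")


def is_valid_scaling_group_name(name: str) -> bool:
    """Check a name against the documented length and character rules."""
    return _SCALING_GROUP_NAME.fullmatch(name) is not None
```

For example, `use-yaml-create-scaling-group` passes this check, whereas a name that starts with a hyphen or is only one character long does not.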

Parameters for configuring instance numbers

Parameter

Description

Minimum Number of Instances

The lower limit of the number of instances in the scaling group. If the actual number of instances drops below the lower limit, Auto Scaling automatically adds instances to the scaling group until the actual number reaches the lower limit.

Maximum Number of Instances

The upper limit of the number of instances in the scaling group. If the actual number of instances exceeds the upper limit, Auto Scaling automatically removes instances from the scaling group until the actual number equals the upper limit. For information about scale-in policies, see Combine scaling policies and scale-in policies.

Expected Number of Instances

The desired number of instances in the scaling group. If you enable this feature, Auto Scaling automatically ensures that the actual number of instances in the scaling group equals the desired number. For more information, see Expected number of instances.
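The way the three parameters in this table interact can be pictured as a small calculation. The following sketch is not how Auto Scaling is implemented; it only restates the descriptions above. The function name is hypothetical, and clamping an expected number that falls outside the limits is an assumption made for illustration.

```python
from typing import Optional


def instances_to_adjust(current: int, min_size: int, max_size: int,
                        expected: Optional[int] = None) -> int:
    """Return how many instances to add (positive) or remove (negative)."""
    if min_size > max_size:
        raise ValueError("min_size must not exceed max_size")
    # Without an expected number, the current count only has to stay within
    # the limits. With one, the group converges on that number.
    target = current if expected is None else expected
    # Assumption: a target outside the limits is clamped into them.
    target = max(min_size, min(max_size, target))
    return target - current
```

With limits of 1 and 5, a group that holds 0 instances needs 1 more, and a group that holds 7 instances needs 2 removed.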

Parameters for triggering scale-out or scale-in events

Important

If your scaling group is of the Elastic Container Instance type, you cannot modify the default settings of the following parameters: Scaling Policy, Scale-In Policy, and Instance Reclaim Mode. Default values of the parameters:

  • Scaling Policy: Cost Optimization Policy

  • Scale-In Policy: Instances Created From Earliest Scaling Configuration as the first scale-in step and Earliest Instances as the second scale-in step

  • Instance Reclaim Mode: Release

Parameter

Description

Scaling Policy

By default, Auto Scaling triggers scaling events in a scaling group based on the specified order (priority policy) of vSwitches. You can set the Scaling Policy parameter to other policies based on your business requirements.

Important

You can configure the Scaling Policy parameter only for scaling groups whose Type parameter is set to ECS and Network Type parameter is set to VPC. If your scaling group is of the ECI type, only the priority policy is supported.

  • Priority policy (default)

    This policy adds or removes Elastic Compute Service (ECS) instances based on the specified vSwitches. If Auto Scaling cannot create ECS instances in the zone where the vSwitch with the highest priority resides, Auto Scaling creates ECS instances in the zone where the vSwitch with the next highest priority resides.

  • Balanced distribution policy

    This policy ensures disaster recovery. If you want to evenly distribute ECS instances across the zones of your scaling group after scaling events are complete, use this policy. If ECS instances are not evenly distributed across multiple zones due to insufficient resources, execute the balanced distribution policy to evenly redistribute instances across the zones. For more information, see Rebalance the distribution of ECS instances.

  • Cost optimization policy

    If you prioritize cost in your decision-making, use this policy. When a scale-out event occurs, Auto Scaling preferentially creates ECS instances by using the instance type that has the lowest-priced vCPU. If multiple preemptible instance types are specified in your scaling configuration, Auto Scaling preferentially creates preemptible instances. If Auto Scaling fails to create preemptible instances due to insufficient resources, Auto Scaling attempts to create pay-as-you-go instances. When a scale-in event occurs, Auto Scaling preferentially removes ECS instances of the instance type that has the highest-priced vCPU from your scaling group.

  • Custom combination policy

    If you use this policy, you can adjust the ratio of pay-as-you-go instances to preemptible instances, balance instance distribution across zones, and specify instance types.

For information about the custom combination policy, see Configure a scaling policy.

Scale-In Policy

When a scale-in request is triggered, Auto Scaling removes instances from the scaling group based on the steps defined in the scale-in policy. This parameter is displayed only if you set the Type parameter to ECS. Valid values:

  • Instances Created From Earliest Scaling Configuration: Auto Scaling removes the instances created from the earliest scaling configuration from the scaling group. No scaling configuration or launch template is associated with instances that are manually added to the scaling group. Therefore, instances that are manually added are not removed first. If more instances need to be removed from the scaling group after Auto Scaling removes all instances with which the earliest scaling configuration or the earliest launch template is associated, Auto Scaling removes manually added instances at random.

    Important
    • The scaling configuration in the Instances Created From Earliest Scaling Configuration setting can be a scaling configuration or a launch template.

    • If the scaling configuration is a launch template, the point in time at which the launch template is applied in the scaling group matters. Example:

      If you apply a launch template of version 2 but subsequently roll back the launch template to version 1, the earliest scaling configuration in this case is the launch template of version 2.

  • Earliest Instances: Auto Scaling removes the instances that were created at the earliest point in time from the scaling group.

  • Most Recent Instances: Auto Scaling removes the instances that were created at the most recent point in time from the scaling group.

  • Custom Policy: Auto Scaling removes instances from the scaling group based on a custom policy. The custom policy contains a service, version, and function.

If more than one instance meets the scale-in requirements when you set the Scale-In Policy parameter to Instances Created From Earliest Scaling Configuration or Custom Policy, you can proceed to configure the Then Remove parameter. Valid values of the Then Remove parameter:

  • No Policy: Auto Scaling stops removing instances from the scaling group even if multiple instances meet the scale-in requirements.

  • Earliest Instances: Auto Scaling removes the instances that were created at the earliest point in time among the remaining instances from the scaling group.

  • Most Recent Instances: Auto Scaling removes the instances that were created at the most recent point in time among the remaining instances from the scaling group.

Note

The value of the Scaling Policy parameter also affects the manner in which instances are removed from scaling groups. For more information, see Combine scaling policies and scale-in policies.

Instance Reclaim Mode

After an instance is removed from the scaling group, Auto Scaling reclaims the instance based on the value of this parameter. Valid values:

Note

This parameter is available only if you set the Type parameter to ECS and the Network Type parameter to VPC. By default, elastic container instances are released after they are removed from scaling groups.

  • Release: releases instances that are removed from the scaling group. In this case, no resources are retained. If a scale-out request is triggered, Auto Scaling creates new instances and adds the instances to the scaling group.

  • Economical Mode: stops instances that are removed from the scaling group in Economical Mode. In this case, you are still charged for resources that are retained. If a scale-out request is triggered, Auto Scaling preferentially adds the instances that are stopped in Economical Mode to the scaling group. If the number of instances that are in Economical Mode does not meet the scale-out requirement, Auto Scaling creates and adds new instances. The Economical Mode setting helps improve the efficiency of scaling. For more information, see Use the Economical Mode feature to scale instances faster.

    Important
    • Data that is stored on instances may be lost when the instances are reclaimed. To prevent data loss, do not store application data or logs on instances.

    • In the following scenarios, instances that are stopped in Economical Mode may be released:

      • If you manually change the value of the Maximum Number of Instances parameter for the scaling group and the number of instances in all states in the scaling group is greater than the new value of the Maximum Number of Instances parameter, Auto Scaling preferentially releases instances that are stopped in Economical Mode.

      • If your Alibaba Cloud account has overdue payments or insufficient resources, instances that are stopped in Economical Mode may fail to be added to the scaling group during a scale-out event. In this case, Auto Scaling may release the failed instances.

    • For more information about Economical Mode, see Economical mode.

  • Forcibly Release: In this mode, Auto Scaling forcibly releases instances that are in the Running state when scale-in events are triggered.

    Warning

    Forced release is equivalent to a power outage. Forced release may cause ephemeral data on instances to be irrecoverably lost. Exercise caution when you specify this setting.

  • Forcibly Recycle: In this mode, Auto Scaling forcibly shuts down instances that are in the Running state when scale-in events are triggered.

    Warning

    Forced shutdown is equivalent to a power outage. Forced shutdown may cause ephemeral data on instances to be irrecoverably lost. Exercise caution when you specify this setting.
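The scale-out order described for Economical Mode (reuse stopped instances first, then create new ones) can be sketched as follows. This is a minimal illustration, not the Auto Scaling implementation: the function name plan_scale_out and the instance IDs are hypothetical, and the sketch does not call any Alibaba Cloud API.

```python
def plan_scale_out(needed: int, stopped_instances: list) -> tuple:
    """Return (stopped instances to reuse, number of new instances to create).

    Assumption: stopped instances are reused in list order. The document only
    states that stopped instances are preferred over new ones.
    """
    if needed < 0:
        raise ValueError("needed must be non-negative")
    reused = stopped_instances[:needed]   # prefer instances stopped in Economical Mode
    to_create = needed - len(reused)      # create new instances for the remainder
    return reused, to_create


# Hypothetical instance IDs for illustration only.
reused, to_create = plan_scale_out(3, ["eci-a", "eci-b"])
print(reused, to_create)
```

With two stopped instances and a scale-out request for three, both stopped instances are reused and one new instance is created.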

Parameters for configuring a network

Important

When you set the Instance Configuration Source parameter to Launch Templates or Select Existing Instance, the following parameters are automatically filled based on the selected launch template or existing instance: Network Type, VPC, and vSwitch. You can also reconfigure these parameters based on your business requirements.

Parameter

Description

Network Type

Specify the network type for the scaling group that you want to create. Valid values: VPC and Classic Network.

Warning

After you create the scaling group, you cannot change the value of the Network Type parameter.

Important

We recommend that you select VPC. Scaling groups that reside in VPCs support more flexible scaling policies, instance reclaim modes, and load balancing mechanisms. For more information about VPCs, see What is a VPC?

VPC

This parameter is displayed only if you set the Network Type parameter to VPC. After you select a VPC from the VPC drop-down list, all ECS instances in this scaling group reside in the VPC.

Warning

After you create the scaling group, you cannot change the value of the VPC parameter.

vSwitch

After you select a VPC for the scaling group, you must select one or more vSwitches of the VPC for the scaling group. All ECS instances in the scaling group communicate with each other by using the selected vSwitches.

Important

We recommend that you select multiple vSwitches. If you select only one vSwitch, scale-out activities are more likely to fail when resources in the zone of that vSwitch are insufficient. If you select multiple vSwitches, you can set the Scaling Policy parameter to Balanced Distribution Policy to manage the distribution of ECS instances across multiple zones.

Parameters for associating a scaling group with other cloud services

Parameter

Description

Associate with ApsaraDB RDS, Redis, or MongoDB

If ECS instances in the scaling group need to access a database such as ApsaraDB RDS, Redis, or MongoDB, you can associate the scaling group with the database by configuring this parameter. You can select one of the following association modes based on the type of the database that you want to use: IP Whitelist Mode and Security Group Mode.

Associate CLB Instance

After you associate a CLB instance with a scaling group, Auto Scaling adds the instances in the scaling group to the backend server groups of the CLB instance as backend servers. Then, the CLB instance forwards requests to the backend servers.

The following types of server groups are supported:

  • Default server group: the group of instances that are used to receive requests. If you do not specify a vServer group or a primary/secondary server group for a listener, requests are forwarded to the instances in the default server group.

  • vServer group: If you want to forward requests to backend servers that are not in the default server group or configure domain name- or URL-based routing methods, you can use vServer groups.

If you specify the default server group and multiple vServer groups at the same time, the instances are added to all the specified server groups.

Note

An upper limit is imposed on the number of CLB instances and server groups that you can associate with a scaling group. To view the quota or request a quota increase, go to Quota Center.

Associate ALB and NLB Server Groups

Important

This parameter is available only if you set the Network Type parameter to VPC.

After you associate an ALB or NLB server group with a scaling group, Auto Scaling adds instances in the scaling group to the ALB or NLB server group as backend servers. Then, the ALB or NLB instance forwards requests to the backend servers. You must specify the port number and weight for each backend server. By default, the weight of a backend server is 50. If you increase the weight of a server, the number of requests that are forwarded to the server increases. If you set the weight of a backend server to 0, no requests are forwarded to the server.

If you associate multiple ALB or NLB server groups with the same scaling group, Auto Scaling adds instances in the scaling group to all the ALB or NLB server groups at the same time.

Note

An upper limit is imposed on the number of ALB or NLB server groups that you can associate with a scaling group. To view the quota or request a quota increase, go to Quota Center.
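To illustrate how backend server weights affect request distribution for ALB and NLB server groups, the following sketch computes the share of requests that each server would be expected to receive. It assumes a weighted scheduling algorithm in which the share is proportional to the weight. The actual distribution depends on the scheduling algorithm configured for the server group, and the function and server names are hypothetical.

```python
def expected_traffic_share(weights: dict) -> dict:
    """Map each backend server to its expected fraction of requests.

    Assumption: traffic is split in proportion to weight, so a server
    with a weight of 0 receives no requests.
    """
    total = sum(weights.values())
    if total == 0:
        return {server: 0.0 for server in weights}
    return {server: weight / total for server, weight in weights.items()}


# Two servers at the default weight of 50 and one server drained with weight 0.
shares = expected_traffic_share({"server-a": 50, "server-b": 50, "server-c": 0})
print(shares)
```

Under this assumption, raising the weight of server-a from 50 to 100 while server-b stays at 50 would raise the expected share of server-a from one half to two thirds.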

Other parameters

Parameter

Description

Tag

You can add tags to scaling groups for easy search and aggregation. For more information, see Tags.

Note

The tags apply only to the scaling group. If you want to add tags to an instance in the scaling group, configure the tags in the scaling configuration or the launch template based on which the instance is created.

Tags Propagated to Instances During Scale-out

After you add one or more tags to the scaling group, you can continue to specify whether to propagate the tags to instances in the scaling group during scale-out events.

Add Existing Instance

This parameter is available only if you set the Type parameter to ECS and set the Instance Configuration Source parameter to Launch Templates or Select Existing Instance.

If you configure the Expected Number of Instances and Add Existing Instance parameters at the same time, the value of the Expected Number of Instances parameter automatically increases. For example, if you set the Expected Number of Instances parameter to 1 and select two existing ECS instances from the Add Existing Instance drop-down list when you create the scaling group, the value of the Expected Number of Instances parameter automatically increases from 1 to 3.

If you want to use the scaling group to manage the instance lifecycles, you can select Enable the scaling group to manage the instance lifecycle.

  • If you enable this feature, Auto Scaling may remove instances that are considered unhealthy from the scaling group. Auto Scaling may also release the instances that you manually remove from the scaling group.

  • If you do not enable this feature, Auto Scaling does not release instances that are removed from the scaling group.

Note

You can add subscription instances to the scaling group, but you cannot enable the scaling group to manage the lifecycles of the subscription instances.

Create Regular Rule

When a scaling event succeeds, fails, or is rejected, Auto Scaling notifies you by text message, internal message, or email based on the rule you set. For more information, see Create a regular notification rule.

Resource Group

You can add scaling groups to resource groups. Then, you can manage scaling groups by resource group. This facilitates resource isolation and permission control. For more information, see Use resource groups to manage scaling groups in a fine-grained manner.

Synchronize Alert Rule to CloudMonitor

You can enable or disable this feature only when you create a scaling group. After you enable this feature, Alibaba Cloud creates and associates a CloudMonitor application group with the scaling group. The alert rules of the scaling group are synced and displayed in the CloudMonitor console.

YAML fields

Supported Kubernetes YAML fields

When you use a Kubernetes Deployment to deploy a scaling group of the Elastic Container Instance type, you can configure only the following YAML fields:

Note

A YAML file typically consists of the kind, metadata, and spec fields. For more information about the YAML file structure, see the sample Deployment on the official Kubernetes website.

  • kind: the resource type. Set the value to Deployment.

  • metadata.name: the resource name. This field does not take effect on the scaling group that you want to create. You can use the k8s.aliyun.com/ess-scaling-group-name annotation to specify a name for the scaling group.

  • spec.replicas: the number of pod replicas, which is the same as the expected number of elastic container instances in the scaling group that you want to create.

  • spec.template.spec: the pod configurations. The following table describes the supported features.

    Feature

    YAML field

    Description

    DNS

    dnsPolicy

    The Domain Name System (DNS) policy.

    dnsConfig.nameservers

    The IP addresses of the DNS servers.

    dnsConfig.searches

    The search domains of the DNS servers.

    dnsConfig.options.name

    The option key.

    dnsConfig.options.value

    The option value.

    Container

    containers.name

    The container name.

    containers.image

    The container image.

    containers.command

    The container startup command.

    containers.args

    The startup arguments of the container.

    containers.imagePullPolicy

    The image pulling policy of the container.

    containers.stdin

    Specifies whether to allocate buffer resources to stdin.

    containers.stdinOnce

    Specifies whether to allocate one-time buffer resources to stdin.

    containers.tty

    Specifies whether to allocate a teletypewriter (TTY) to each container.

    containers.ports

    containerPort

    The port number.

    protocol

    The protocol of the port. Valid values: TCP and UDP.

    containers.env

    name

    The name of the environment variable.

    value

    The value of the environment variable.

    containers.resources

    requests.cpu

    The requested CPU resources.

    requests.memory

    The requested memory resources.

    limits.cpu

    The upper limit of CPU usage.

    limits.memory

    The upper limit of memory usage.

    limits.nvidia.com/gpu

    The requested GPU resources. You can add annotations to the metadata section in the pod configuration file to specify GPU specifications.

    Then, you must add the nvidia.com/gpu field to the resources section in which you define the configurations of containers.

    containers.securityContext

    runAsUser

    The ID of the user who runs the container.

    readOnlyRootFilesystem

    Specifies whether the root file system on which the container runs is read-only.

    capabilities.add

    Adds specific permissions to processes running in the container.

    containers.volumeMounts

    name

    The volume that you want to mount to the container. The value of this field must match the custom name of the volume that you want to mount.

    mountPath

    The mount path of the volume in the container.

    mountPropagation

    The mount propagation settings of the container.

    readOnly

    Specifies whether the volume is mounted in read-only mode. Valid values:

    • true: The volume is mounted in read-only mode.

    • false: The volume is mounted in read/write mode.

    Default value: false.

    subPath

    The sub-path of the volume.

    containers.livenessProbe

    • initialDelaySeconds

    • periodSeconds

    • successThreshold

    • timeoutSeconds

    • failureThreshold

    • exec.command

    • tcpSocket.port

    • httpGet.scheme

    • httpGet.port

    • httpGet.path

    The configurations for liveness, readiness, and startup probes.

    containers.readinessProbe

    • initialDelaySeconds

    • periodSeconds

    • successThreshold

    • timeoutSeconds

    • failureThreshold

    • exec.command

    • tcpSocket.port

    • httpGet.scheme

    • httpGet.port

    • httpGet.path

    Init container

    initContainers.name

    The name of the init container.

    initContainers.image

    The image of the init container.

    initContainers.command

    The startup command of the init container.

    initContainers.args

    The startup arguments of the init container.

    initContainers.imagePullPolicy

    The image pulling policy of the init container.

    initContainers.env

    name

    The name of the environment variable used by the init container.

    value

    The value of the environment variable used by the init container.

    initContainers.resources

    requests.cpu

    The CPU resources requested by the init container.

    requests.memory

    The memory resources requested by the init container.

    limits.cpu

    The upper limit of CPU utilization for the init container.

    limits.memory

    The upper limit of memory usage for the init container.

    limits.nvidia.com/gpu

    The upper limit of GPU usage for the init container.

    initContainers.securityContext

    capabilities.add

    Adds specific permissions to specific processes running in the init container.

    initContainers.volumeMounts

    name

    The volume that you want to mount to the init container. The value of this field must match the custom name of the volume that you want to mount.

    mountPath

    The mount path of the volume in the init container.

    mountPropagation

    The mount propagation settings of the init container.

    readOnly

    Specifies whether the volume is mounted in read-only mode. Valid values:

    • true: The volume is mounted in read-only mode.

    • false: The volume is mounted in read/write mode.

    Default value: false.

    subPath

    The sub-path of the volume.

    Volume

    volumes.nfs

    name

    The custom name of the volume.

    server

    The endpoint of the Network File System (NFS) server, which is the same as the mount target of the NAS file system.

    path

    The path to the NFS volume.

    readOnly

    Specifies whether the volume is read-only.

    volumes.emptyDir

    sizeLimit

    The size of the emptyDir volume. Unit: Gi or Mi.

    medium

    The storage medium of the emptyDir volume. Valid values:

    • If you leave this field empty, the node file system is used as the storage medium.

    • If you set the value to Memory, the memory is used as the storage medium.

    By default, this field is left empty.

    volumes.flexVolume

    driver

    The driver name of the Flex volume.

    options

    The options of the Flex volume. Each option is a key-value pair in the JSON format.

    If you want to mount a Flex volume, specify the options in the {"volumeId":"d-2zehdahrwoa7srg****","performanceLevel": "PL2"} format.

    Graceful shutdown

    terminationGracePeriodSeconds

    The buffer period during which a program handles operations before the program is stopped. Unit: seconds.
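Because only the fields in the preceding table are supported, it can help to check a pod specification before you use it. The following sketch loads the field names from the table into Python sets and reports any key that is not listed. The set names, the function name, and the sample values are hypothetical. The check covers only top-level pod fields and first-level container fields, and is not an official validator.

```python
# Field names copied from the table above; the set names are hypothetical.
POD_SPEC_FIELDS = {
    "dnsPolicy", "dnsConfig", "containers", "initContainers",
    "volumes", "terminationGracePeriodSeconds",
}
CONTAINER_FIELDS = {
    "name", "image", "command", "args", "imagePullPolicy", "stdin",
    "stdinOnce", "tty", "ports", "env", "resources", "securityContext",
    "volumeMounts", "livenessProbe", "readinessProbe",
}
INIT_CONTAINER_FIELDS = {
    "name", "image", "command", "args", "imagePullPolicy", "env",
    "resources", "securityContext", "volumeMounts",
}


def unsupported_fields(pod_spec: dict) -> list:
    """List pod-spec keys that do not appear in the supported-field table."""
    problems = [key for key in pod_spec if key not in POD_SPEC_FIELDS]
    for kind, allowed in (("containers", CONTAINER_FIELDS),
                          ("initContainers", INIT_CONTAINER_FIELDS)):
        for index, container in enumerate(pod_spec.get(kind, [])):
            problems += [f"{kind}[{index}].{key}"
                         for key in container if key not in allowed]
    return problems


# Illustrative pod spec (spec.template.spec); image and values are placeholders.
pod_spec = {
    "dnsPolicy": "Default",
    "terminationGracePeriodSeconds": 30,
    "containers": [{
        "name": "web",
        "image": "nginx:latest",
        "ports": [{"containerPort": 80, "protocol": "TCP"}],
        "resources": {"requests": {"cpu": "1", "memory": "2Gi"}},
        "lifecycle": {},      # not listed in the table
    }],
    "nodeSelector": {},       # not listed in the table
}

print(unsupported_fields(pod_spec))
```

For this sample, the check reports nodeSelector and containers[0].lifecycle, because neither field appears in the table.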

Extended annotations

When you use a Kubernetes Deployment file in the YAML format to deploy a scaling group of the Elastic Container Instance type, you can extend only specific annotations. The following tables describe the extended annotations.

Extended annotations in metadata

Annotation

Description

Example

k8s.aliyun.com/ess-scaling-group-name

The name of the scaling group.

ess-group-test

k8s.aliyun.com/ess-scaling-group-min-size

The lower limit of the number of instances in the scaling group. Default value: 0.

0

k8s.aliyun.com/ess-scaling-group-max-size

The upper limit of the number of instances in the scaling group. Default value: max(replicas, 30).

20
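The defaults in the preceding table can be expressed as a short calculation. The following sketch derives the minimum and maximum number of instances from spec.replicas and the metadata annotations. The function name is hypothetical, and the sketch assumes that annotation values are passed as strings, as Kubernetes requires for annotations.

```python
MIN_KEY = "k8s.aliyun.com/ess-scaling-group-min-size"
MAX_KEY = "k8s.aliyun.com/ess-scaling-group-max-size"


def scaling_group_bounds(replicas: int, annotations: dict) -> tuple:
    """Return (min size, max size) using the documented defaults.

    Defaults from the table: min size is 0, max size is max(replicas, 30).
    """
    min_size = int(annotations.get(MIN_KEY, 0))
    max_size = int(annotations.get(MAX_KEY, max(replicas, 30)))
    return min_size, max_size


# Example metadata for a Deployment; the resource name "demo" is a placeholder.
metadata = {
    "name": "demo",
    "annotations": {
        "k8s.aliyun.com/ess-scaling-group-name": "ess-group-test",
        MAX_KEY: "20",
    },
}

print(scaling_group_bounds(2, metadata["annotations"]))
print(scaling_group_bounds(50, {}))
```

When neither annotation is set, a Deployment with 50 replicas has a default upper limit of 50, and a Deployment with 2 replicas has a default upper limit of 30.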

Extended annotations in spec.template.spec

For information about more annotations, see Pod annotations.

Annotation

Example

Description

k8s.aliyun.com/eci-ntp-server

100.100.*.*

The IP address of the Network Time Protocol (NTP) server.

k8s.aliyun.com/eci-use-specs

2-4Gi

The specification of elastic container instances. You can specify multiple specifications. For more information, see Create pods by specifying multiple specifications.

k8s.aliyun.com/eci-vswitch

vsw-bp1xpiowfm5vo8o3c****

The ID of the vSwitch. You can specify multiple vSwitch IDs to ensure that elastic container instances can be created in zones where resources are sufficient.

k8s.aliyun.com/eci-security-group

sg-bp1dktddjsg5nktv****

The ID of the security group. The following requirements must be met:

  • You can specify up to five security groups.

  • If you specify multiple security groups, the security groups must belong to the same VPC.

  • If you specify multiple security groups, the security groups must be of the same type.

k8s.aliyun.com/eci-sls-enable

"false"

Specifies whether to collect logs for a pod. If you do not want to collect logs for a specific pod when you use custom resource definitions (CRDs) of Simple Log Service to collect logs, you can set this annotation to false to disable the log collection feature. This prevents waste of resources when the system automatically creates Logtail.

k8s.aliyun.com/eci-spot-strategy

SpotAsPriceGo

The bidding policy for preemptible instances. You can configure this annotation based on your business requirements. Valid values:

  • SpotWithPriceLimit: The instances are created as preemptible instances that have a maximum hourly price. In this case, you must also configure the k8s.aliyun.com/eci-spot-price-limit annotation.

  • SpotAsPriceGo: The instances are created as preemptible instances for which the market price at the time of purchase is automatically used as the bid price.

k8s.aliyun.com/eci-spot-price-limit

"0.5"

The maximum hourly price of preemptible instances. The value of this annotation can be accurate up to three decimal places. If you set the k8s.aliyun.com/eci-spot-strategy annotation to SpotWithPriceLimit, this annotation takes effect.

k8s.aliyun.com/eci-with-eip

"true"

Specifies whether to automatically create and allocate an EIP to each elastic container instance.

k8s.aliyun.com/eci-data-cache-bucket

default

The bucket of data caches. You must configure this annotation when you create pods from data caches.

k8s.aliyun.com/eci-data-cache-pl

PL1

The performance level (PL) of the disk that is created from data caches. By default, an Enterprise SSD (ESSD) of PL1 is used.

k8s.aliyun.com/eci-data-cache-provisionedIops

"40000"

The provisioned read/write IOPS for the ESSD AutoPL disk. Valid values: 0 to min{50000, 1000 x Capacity - Baseline IOPS}, where Baseline IOPS = min{1800 + 50 x Capacity, 50000}. For more information, see ESSD AutoPL disks.

If you add this annotation, the disk that is created from data caches must be an ESSD AutoPL disk.

k8s.aliyun.com/eci-data-cache-burstingEnabled

"true"

Specifies whether to enable the Burst feature for the ESSD AutoPL disk. For more information, see ESSD AutoPL disks.

If you add this annotation, the disk that is created from data caches must be an ESSD AutoPL disk.

k8s.aliyun.com/eci-custom-tags

"env:test,name:alice"

The tag strings. You can bind up to three tags to each elastic container instance. Separate the tag key and the tag value with a colon (:). Separate multiple tags with commas (,).
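Several constraints in the preceding table can be checked before a Deployment is submitted. The following sketch covers three of them: the valid values of the spot strategy, the price limit that SpotWithPriceLimit requires (up to three decimal places), and the format and count of custom tags. The function name and error messages are hypothetical, and the sketch is not an official validator.

```python
from decimal import Decimal, InvalidOperation

SPOT_STRATEGY = "k8s.aliyun.com/eci-spot-strategy"
SPOT_PRICE_LIMIT = "k8s.aliyun.com/eci-spot-price-limit"
CUSTOM_TAGS = "k8s.aliyun.com/eci-custom-tags"


def check_pod_annotations(annotations: dict) -> list:
    """Return a list of problems found in the pod annotations."""
    errors = []

    strategy = annotations.get(SPOT_STRATEGY)
    if strategy is not None and strategy not in ("SpotWithPriceLimit", "SpotAsPriceGo"):
        errors.append(f"unknown spot strategy: {strategy}")
    if strategy == "SpotWithPriceLimit":
        price = annotations.get(SPOT_PRICE_LIMIT)
        if price is None:
            errors.append(f"SpotWithPriceLimit requires {SPOT_PRICE_LIMIT}")
        else:
            try:
                value = Decimal(price)
            except InvalidOperation:
                errors.append(f"price limit is not a number: {price}")
            else:
                # An exponent below -3 means more than three decimal places.
                if not value.is_finite() or value.as_tuple().exponent < -3:
                    errors.append(f"price limit allows at most three decimal places: {price}")

    tags = annotations.get(CUSTOM_TAGS)
    if tags is not None:
        pairs = tags.split(",")          # tags are separated by commas
        if len(pairs) > 3:
            errors.append("at most three custom tags are allowed")
        for pair in pairs:
            key, separator, _value = pair.partition(":")   # key:value
            if not separator or not key:
                errors.append(f"malformed tag: {pair!r}")

    return errors


print(check_pod_annotations({
    SPOT_STRATEGY: "SpotWithPriceLimit",
    SPOT_PRICE_LIMIT: "0.5",
    CUSTOM_TAGS: "env:test,name:alice",
}))
```

The sample annotations use the example values from the table and pass all three checks, so the function returns an empty list.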

For more information about annotations, see Pod annotations.
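The valid range of the k8s.aliyun.com/eci-data-cache-provisionedIops annotation depends on the disk capacity. The following sketch evaluates the formula from the annotation table. It assumes that Capacity is expressed in GiB, which the table does not state, and it clamps negative results to 0 as an added safeguard. The function name is hypothetical.

```python
def max_provisioned_iops(capacity_gib: int) -> int:
    """Upper bound of the provisioned IOPS for an ESSD AutoPL disk.

    Formula from the annotation table:
      Baseline IOPS = min{1800 + 50 x Capacity, 50000}
      Upper bound   = min{50000, 1000 x Capacity - Baseline IOPS}
    Assumption: capacity is in GiB; negative results are clamped to 0.
    """
    baseline_iops = min(1800 + 50 * capacity_gib, 50000)
    return max(0, min(50000, 1000 * capacity_gib - baseline_iops))


for capacity in (40, 44, 100):
    print(capacity, max_provisioned_iops(capacity))
```

Under these assumptions, the example value of "40000" from the table is within range only for disks of 44 GiB or larger, because a 40 GiB disk has an upper bound of 36200.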