
Container Service for Kubernetes: Implement rapid elastic scaling across multiple zones

Last Updated: Feb 28, 2026

High-availability architectures rely on distributing workloads evenly across multiple zones. This topic describes how to use the ack-autoscaling-placeholder component to achieve rapid, zone-aware elastic scaling in ACK clusters, ensuring that scale-out events add nodes to the correct zones simultaneously.

How it works

The problem

When a node pool is configured with vSwitches from multiple zones in a single scaling group, the Cluster Autoscaler cannot determine which specific zone needs new nodes. As a result, scaled-out instances may concentrate in one zone instead of being distributed evenly. This defeats the purpose of multi-zone deployment.


The solution

ACK addresses this problem with the ack-autoscaling-placeholder component, which uses resource redundancy to turn multi-zone elastic scaling into concurrent, directed scale-out of per-zone node pools. For more information, see Use ack-autoscaling-placeholder to scale pods within seconds. The mechanism works in three stages:

  1. Create a node pool per zone with a zone label. Each node pool is assigned a label that identifies the zone it belongs to.

  2. Deploy placeholder pods using nodeSelector. The ack-autoscaling-placeholder component schedules a placeholder pod to each zone based on the zone label. These placeholder pods use a PriorityClass with a lower weight than application pods.

  3. Application pods preempt placeholder pods. When application pods are pending, they replace the lower-priority placeholder pods on existing nodes. The displaced placeholder pods then become pending themselves. Because the placeholder pods use nodeSelector-based scheduling (rather than antiAffinity), the Cluster Autoscaler can identify the exact zone each pending placeholder needs and triggers directed scale-out to the correct node pools concurrently.

Scaling flow

The following figure shows how simultaneous scaling across two zones works with this architecture.

  1. The ack-autoscaling-placeholder component creates a placeholder pod in each zone. Placeholder pods have a lower scheduling priority than actual application pods.

  2. When application pods enter the Pending state, they quickly preempt the placeholder pods and are deployed on existing nodes in each zone. The preempted placeholder pods then enter the Pending state.

  3. Because the placeholder pods are scheduled using nodeSelector, the Cluster Autoscaler can scale out to the corresponding zones concurrently.
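Putting the stages together, the pod that each placeholder Deployment produces looks roughly like the following sketch. The field values are taken from the examples later in this topic; the generated pod name and the memory unit are assumptions, and the actual manifest rendered by the chart may differ.

```yaml
# Illustrative sketch of a placeholder pod, not the chart's exact output.
apiVersion: v1
kind: Pod
metadata:
  name: ack-place-holder-i-xxxxx              # Hypothetical generated name.
spec:
  priorityClassName: default-priority-class   # Value -1: any application pod outranks it.
  nodeSelector:
    available_zone: "i"                       # Pins the pod to Zone I. When this pod is
                                              # preempted and becomes Pending, the Cluster
                                              # Autoscaler knows exactly which zone's node
                                              # pool to grow.
  containers:
  - name: placeholder
    image: registry-vpc.cn-beijing.aliyuncs.com/acs/pause:3.1   # Pause container: reserves
    resources:                                                  # resources but does no work.
      requests:
        cpu: 3500m
        memory: 6Gi                           # Assumed GiB.
```

Because the reservation is expressed with nodeSelector rather than anti-affinity, each Pending placeholder maps to exactly one zone, which is what makes concurrent, directed scale-out possible.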

Step 1: Create a node pool for each zone and configure a custom node label

  1. Log on to the ACK console. In the left-side navigation pane, click Clusters.

  2. On the Clusters page, find the cluster that you want to manage and click its name. In the left-side navigation pane, choose Nodes > Node Pools.

  3. Click Create Node Pool and complete the node pool configuration as prompted. This example creates a node pool named auto-zone-I with autoscaling enabled in Zone I; the following table describes only the key parameters. For more information, see Create and manage a node pool. The node pool is created when its status in the node pool list changes to Active.

    Parameter        Description
    Node Pool Name   auto-zone-I
    Scaling Mode     Select Auto to enable autoscaling.
    vSwitch          Select a vSwitch in Zone I.
    Node Labels      Set the Key of the node label to available_zone and the Value to i.
  4. Repeat the preceding steps to create a node pool with autoscaling enabled for each zone that requires autoscaling.

    Node pools across multiple zones

Verification: In the node pool list, confirm that each zone-specific node pool has an Active status and that the zone label is correctly applied.
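You can also check the zone labels from the command line. The following sketch assumes kubectl access to the cluster and the label key and value configured in this step:

```shell
# Show the zone label as a column for all nodes (requires kubectl access).
kubectl get nodes -L available_zone

# List only the nodes that carry the Zone I label from this step.
kubectl get nodes -l available_zone=i
```

A mismatch between this label key and the nodeSelector key used by the placeholder Deployments in Step 2 (for example, a typo) would leave the placeholder pods permanently Pending, so verify the spelling carefully.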

Step 2: Deploy ack-autoscaling-placeholder and configure placeholder Deployments

Deploy the component

  1. In the left-side navigation pane of the ACK console, choose Marketplace > Marketplace.

  2. Find and click ack-autoscaling-placeholder. On the ack-autoscaling-placeholder page, click Deploy.

  3. Select a cluster from the Cluster drop-down list and a namespace from the Namespace drop-down list, and then click Next. Select a chart version from the Chart Version drop-down list, configure the parameters, and then click OK. After the component is deployed, choose Applications > Helm in the left-side navigation pane and confirm that the application is in the Deployed state.

Update the Helm release with placeholder Deployments

  1. In the left-side navigation pane of the details page, choose Applications > Helm.

  2. On the Helm page, click Update in the Actions column of ack-autoscaling-placeholder-default.

  3. In the Update Release panel, update the YAML file based on the following example, and then click OK. Define one placeholder Deployment per zone; this example creates placeholder Deployments in Zones I, K, and H. Each entry in the deployments list follows the same structure: the first entry below is fully annotated, and the subsequent entries differ only in the name and nodeSelector values, as summarized in the following table. After the update succeeds, a placeholder Deployment exists for each zone.

    Field          Zone I                  Zone K                  Zone H
    name           ack-place-holder-I      ack-place-holder-K      ack-place-holder-H
    nodeSelector   {"available_zone":i}    {"available_zone":k}    {"available_zone":h}
       deployments:
       # --- Zone I placeholder ---
       - affinity: {}
         annotations: {}
         containers:
         - image: registry-vpc.cn-beijing.aliyuncs.com/acs/pause:3.1
           imagePullPolicy: IfNotPresent
           name: placeholder
           resources:
             requests:
               cpu: 3500m     # CPU request for each placeholder pod.
               memory: 6Gi    # Memory request for each placeholder pod. Include a unit such as Gi; a bare number is interpreted as bytes.
         imagePullSecrets: {}
         labels: {}
         name: ack-place-holder-I             # Deployment name. Use a unique suffix per zone.
         nodeSelector: {"available_zone":i}   # Must match the node label key and value from Step 1.
         replicaCount: 10                     # Number of placeholder pods per zone.
         tolerations: []

       # --- Zone K placeholder (same structure, different zone) ---
       - affinity: {}
         annotations: {}
         containers:
         - image: registry-vpc.cn-beijing.aliyuncs.com/acs/pause:3.1
           imagePullPolicy: IfNotPresent
           name: placeholder
           resources:
             requests:
               cpu: 3500m
               memory: 6Gi
         imagePullSecrets: {}
         labels: {}
         name: ack-place-holder-K
         nodeSelector: {"available_zone":k}
         replicaCount: 10
         tolerations: []

       # --- Zone H placeholder (same structure, different zone) ---
       - affinity: {}
         annotations: {}
         containers:
         - image: registry-vpc.cn-beijing.aliyuncs.com/acs/pause:3.1
           imagePullPolicy: IfNotPresent
           name: placeholder
           resources:
             requests:
               cpu: 3500m
               memory: 6Gi
         imagePullSecrets: {}
         labels: {}
         name: ack-place-holder-H
         nodeSelector: {"available_zone":h}
         replicaCount: 10
         tolerations: []

       fullnameOverride: ""
       nameOverride: ""
       podSecurityContext: {}
       priorityClassDefault:
         enabled: true
         name: default-priority-class
         value: -1

    Placeholder Deployments created

Verification: Choose Applications > Helm and confirm that ack-autoscaling-placeholder-default shows the Deployed state with the updated configuration.
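The capacity reserved per zone follows directly from the values above: replicaCount placeholder pods, each requesting 3500m of CPU and (assuming GiB) 6 GiB of memory. A quick back-of-the-envelope check:

```shell
# Back-of-the-envelope headroom per zone, using the example values above.
replicas=10          # replicaCount per placeholder Deployment.
cpu_millicores=3500  # CPU request per placeholder pod.
mem_gib=6            # Memory request per placeholder pod (assumed GiB).
echo "CPU reserved per zone: $((replicas * cpu_millicores))m"   # 35000m = 35 cores
echo "Memory reserved per zone: $((replicas * mem_gib))Gi"      # 60Gi
```

Note that each placeholder pod must fit on a single node: an instance type with fewer than 4 vCPUs cannot satisfy a 3500m request, and the pod would stay Pending. Size the requests to match both the expected burst and the node pool's instance types.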

Step 3: Create a PriorityClass for the workload

The workload PriorityClass must have a higher value than the placeholder PriorityClass (-1) so that application pods can preempt placeholder pods. You have two options:

  • Option A: Named PriorityClass -- Assign it explicitly to specific workloads via priorityClassName.

  • Option B: Global PriorityClass -- Applies automatically to all pods that do not specify a PriorityClass.

Option A: Named PriorityClass

  1. Create a file named priorityClass.yaml and copy the following content to the file:

       apiVersion: scheduling.k8s.io/v1
       kind: PriorityClass
       metadata:
         name: high-priority
       value: 1000000              # The priority value. Must be higher than the default priority value (-1) of the placeholder pods created in Step 2.
       globalDefault: false
       description: "This priority class should be used for XYZ service pods only."

Option B: Global PriorityClass

If you do not need a separate PriorityClass for each pod, you can configure a global PriorityClass as the default. After this takes effect, pods without a specified PriorityClass automatically adopt this priority value, and preemption takes effect automatically.

   apiVersion: scheduling.k8s.io/v1
   kind: PriorityClass
   metadata:
     name: global-high-priority
   value: 1                              # The priority value. Must be higher than the default priority value (-1) of the placeholder pods created in Step 2.
   globalDefault: true
   description: "Global default priority class applied to pods that do not specify a PriorityClass."

Apply the PriorityClass

  1. Create the PriorityClass:

       kubectl apply -f priorityClass.yaml

     Expected output:

       priorityclass.scheduling.k8s.io/high-priority created

Verification: Run kubectl get priorityclass and confirm that high-priority (or global-high-priority) appears in the list with the correct priority value.
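Preemption hinges on a single comparison: the scheduler may evict a running pod only if the pending pod's priority value is strictly higher. With the values used in this topic (Option B's value of 1 also satisfies the same condition):

```shell
# The preemption condition, with the priority values used in this topic.
workload_priority=1000000   # high-priority PriorityClass (Step 3, Option A).
placeholder_priority=-1     # default-priority-class from Step 2.
if [ "$workload_priority" -gt "$placeholder_priority" ]; then
  echo "workload pods can preempt placeholder pods"
fi
```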

Step 4: Create a workload

This example uses Zone I.

  1. Create a file named workload.yaml and copy the following content to the file:

       apiVersion: apps/v1
       kind: Deployment
       metadata:
         name: placeholder-test
         labels:
           app: nginx
       spec:
         replicas: 1
         selector:
           matchLabels:
             app: nginx
         template:
           metadata:
             labels:
               app: nginx
           spec:
             nodeSelector:                        # Rules used to select nodes.
               available_zone: "i"                # Must match the node label from Step 1.
             priorityClassName: high-priority     # The PriorityClass configured in Step 3. Optional if global configuration is enabled.
             containers:
             - name: nginx
               image: nginx:1.7.9
               ports:
               - containerPort: 80
               resources:
                 requests:
                   cpu: 3                         # The CPU request of the workload.
                   memory: 5Gi                    # The memory request of the workload. Include a unit such as Gi.
  2. Deploy the workload:

       kubectl apply -f workload.yaml

     Expected output:

       deployment.apps/placeholder-test created

     After the deployment, on the Workloads > Pods page, you can see that the workload pod has a higher PriorityClass value than the placeholder pod and preempts it. The evicted placeholder pod becomes Pending, which triggers directed scale-out by the Cluster Autoscaler; once the new node is ready, the placeholder pod runs on it, reserving capacity for the next scaling event. Choose Nodes > Nodes. On the Nodes page, you can confirm that the workload pod runs on the node that previously hosted the placeholder pod.

    Workload deployed

    Workload pod placement

Verification: On the Workloads > Pods page, confirm that the workload pod is Running and that the displaced placeholder pod has triggered node scale-out in the correct zone.
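To exercise the flow at a larger scale, you can scale the example workload beyond a single replica and watch both the placeholder eviction and the directed node scale-out. This is a sketch: the deployment name, labels, and zone are the ones from this topic, and the commands require kubectl access to the configured cluster.

```shell
# Sketch: trigger preemption at scale and observe directed scale-out.
# Assumes the cluster, workload, and labels configured in this topic.
kubectl scale deployment placeholder-test --replicas=5

# The preempted placeholder pods should appear as Pending:
kubectl get pods --all-namespaces -o wide | grep place-holder

# New nodes appear in the labeled zone once the Cluster Autoscaler reacts:
kubectl get nodes -l available_zone=i
```

With equivalent workloads labeled for the other zones, the same preemption happens in each zone at once, and the Cluster Autoscaler scales the per-zone node pools concurrently.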