
Container Service for Kubernetes:Spread Elastic Container Instance-based pods across zones and configure affinities

Last Updated:May 06, 2024

High availability and high performance are essential to distributed jobs. In an ACK Pro cluster, you can use Kubernetes-native scheduling semantics to spread distributed jobs across zones for high availability. You can also use Kubernetes-native scheduling semantics to deploy distributed jobs in specific zones based on affinity settings for high performance.

Prerequisites

  • An ACK Pro cluster that runs Kubernetes 1.20 or later is created. For more information, see Create an ACK managed cluster.

  • The kube-scheduler component in the ACK Pro cluster meets the requirements in the following table.

    ACK Pro cluster Kubernetes version | kube-scheduler version (minimum requirement)
    1.28                               | 1.28.3-aliyun-5.9
    1.26                               | 1.26.3-aliyun-5.9
    1.24                               | 1.24.6-aliyun-5.9
    1.22                               | 1.22.15-aliyun-5.9
    Earlier than 1.22                  | Not supported

    Note

    If the Kubernetes version of the cluster is earlier than 1.22, update the cluster before you spread Elastic Container Instance-based pods across zones and configure affinities. For more information, see Update an ACK cluster. If the cluster version meets the requirement but the kube-scheduler version does not, update kube-scheduler. For more information, see kube-scheduler.

  • When you schedule Elastic Container Instance-based pods, make sure that you have specified the desired vSwitches in the eci-profile ConfigMap. For more information, see Configure an eci-profile.

  • Elastic Container Instance-based pods can be spread across zones only when the nodeAffinity, podAffinity, or topologySpreadConstraints parameter is configured for the pods, or when the pods match an existing resource policy.

    Note

    If you want to schedule pods to virtual nodes that use the ARM architecture, you must configure the tolerations parameter to tolerate the taints of the nodes.

  • The cluster has virtual node scheduling enabled, and the Kubernetes version and component version of the cluster meet the requirements.
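The vSwitch prerequisite above can be illustrated with a minimal sketch of the eci-profile ConfigMap in the kube-system namespace. The vSwitch IDs below are placeholders, and the exact data keys should be verified against the eci-profile documentation:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: eci-profile      # managed ConfigMap; edit the existing one rather than recreating it
  namespace: kube-system
data:
  # Placeholder vSwitch IDs. Each vSwitch belongs to exactly one zone, so
  # listing vSwitches in three zones allows pods to be spread across three zones.
  vSwitchIds: "vsw-aaaaaaaa,vsw-bbbbbbbb,vsw-cccccccc"
```

Because each vSwitch maps to a single zone, the zones available for topology spread are exactly the zones of the vSwitches listed here.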

Precautions

  • When you spread Elastic Container Instance-based pods across zones, you can set the topologyKey parameter only to topology.kubernetes.io/zone.

  • When you spread Elastic Container Instance-based pods across zones, you cannot use annotations to specify the priorities of the vSwitches to which the pods are scheduled. If vSwitch priorities are specified for pods, the pods cannot be spread across zones.

  • Elastic Container Instance-based pods for which the FastFailed mode is enabled cannot be spread across zones.

Terms

For more information about pod topology spread and affinities, see Pod topology spread constraints and Assign pods to nodes. For more information about terms, see NodeAffinity, PodAffinity, and maxSkew. For example, if maxSkew is set to 1, the number of pods in any two zones can differ by at most 1.

Spread Elastic Container Instance-based pods across zones and configure affinities

The following examples show how to spread Elastic Container Instance-based pods across zones and configure affinities in an ACK Pro cluster that runs Kubernetes 1.22.

Example 1: Use topology spread constraints to spread Elastic Container Instance-based pods across zones

  1. Add a topology spread constraint to the configuration of a workload.

    Specify a topology spread constraint in the spec of a pod or in the pod template of a workload, such as a Deployment or Job. The following code block shows the format:

      topologySpreadConstraints:
        - maxSkew: <integer>
          minDomains: <integer> # This parameter is optional and is in the Beta phase in Kubernetes 1.25 and later. 
          topologyKey: <string>
          whenUnsatisfiable: <string>
          labelSelector: <object>
          matchLabelKeys: <list> # This parameter is optional and is in the Beta phase in Kubernetes 1.27 and later. 
          nodeAffinityPolicy: [Honor|Ignore] # This parameter is optional and is in the Beta phase in Kubernetes 1.26 and later. 
          nodeTaintsPolicy: [Honor|Ignore] # This parameter is optional and is in the Beta phase in Kubernetes 1.26 and later.

    In this example, a Deployment whose pods are evenly distributed to multiple zones is created. The following code block shows the YAML template of the Deployment:


    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: with-pod-topology-spread
      labels:
        app: with-pod-topology-spread
    spec:
      replicas: 10
      selector:
        matchLabels:
          app: with-pod-topology-spread
      template:
        metadata:
          labels:
            app: with-pod-topology-spread
        spec:
          affinity:
            nodeAffinity:
              preferredDuringSchedulingIgnoredDuringExecution:
              - weight: 1
                preference:
                  matchExpressions:
                  - key: type
                    operator: NotIn
                    values:
                    - virtual-kubelet
          topologySpreadConstraints:
            - maxSkew: 1
              topologyKey: topology.kubernetes.io/zone
              whenUnsatisfiable: DoNotSchedule
              labelSelector:
                matchLabels:
                  app: with-pod-topology-spread
          tolerations:
            - key: "virtual-kubelet.io/provider"
              operator: "Exists"
              effect: "NoSchedule"
          containers:
          - name: with-pod-topology-spread
            image: registry.k8s.io/pause:2.0
            resources:
              requests:
                cpu: "1"
                memory: "256Mi"

    The following describes the key parameters in the preceding template.

    preferredDuringSchedulingIgnoredDuringExecution:
    - weight: 1
      preference:
        matchExpressions:
        - key: type
          operator: NotIn
          values:
          - virtual-kubelet

    The configuration specifies that the pods are preferentially scheduled to Elastic Compute Service (ECS) nodes.

    For more information about the parameters, see Node affinity.

    topologySpreadConstraints:
      - maxSkew: 1
        topologyKey: topology.kubernetes.io/zone
        whenUnsatisfiable: DoNotSchedule
        labelSelector:
          matchLabels:
            app: with-pod-topology-spread

    The configuration specifies that the pods are evenly deployed across multiple zones.

    For more information about the parameters, see topologySpreadConstraints field.

    tolerations:
      - key: "virtual-kubelet.io/provider"
        operator: "Exists"
        effect: "NoSchedule"

    The toleration allows the pods to tolerate the taint of virtual nodes so that kube-scheduler can schedule the pods to virtual nodes.

    For more information about the parameters, see Taints and Tolerations.

    Note

    If you want to schedule pods to ARM-based virtual nodes, you must add a toleration to the pods to tolerate the taint of the ARM-based virtual nodes.

  2. Create a workload.

    Create a file named deployment.yaml and copy the preceding YAML template to the file. Then, run the following command to create a Deployment in the cluster:

    kubectl apply -f deployment.yaml
  3. Verify the scheduling result of the workload.

    • Run the following command to query the nodes on which the Deployment deploys the pods:

      kubectl get po -lapp=with-pod-topology-spread -ocustom-columns=NAME:.metadata.name,NODE:.spec.nodeName --no-headers | grep -v "<none>"
    • Run the following command to query the number of pods that are created by the Deployment in each zone:

      kubectl get po -lapp=with-pod-topology-spread -ocustom-columns=NODE:.spec.nodeName --no-headers | grep -v "<none>" | xargs -I {} kubectl get no {} -ojson | jq '.metadata.labels["topology.kubernetes.io/zone"]' | sort | uniq -c
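As the note in step 1 mentions, scheduling to ARM-based virtual nodes additionally requires tolerating their taint. The following is a hedged sketch of the tolerations section; the kubernetes.io/arch=arm64 taint key is an assumption here, so verify the actual taint with kubectl describe node before relying on it:

```yaml
tolerations:
  - key: "virtual-kubelet.io/provider"
    operator: "Exists"
    effect: "NoSchedule"
  # Assumed taint on ARM-based virtual nodes; confirm the real key, value,
  # and effect on your cluster before using this toleration.
  - key: "kubernetes.io/arch"
    operator: "Equal"
    value: "arm64"
    effect: "NoSchedule"
```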

Example 2: Configure pod affinities and node affinities to deploy pods in specific zones

  1. Add affinities to the configuration of a workload.

    In this example, a Deployment whose pods are deployed in a single zone is created. The following code block shows the YAML template of the Deployment:


    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: with-affinity
      labels:
        app: with-affinity
    spec:
      replicas: 3
      selector:
        matchLabels:
          app: with-affinity
      template:
        metadata:
          labels:
            app: with-affinity
        spec:
          affinity:
            podAffinity:
              requiredDuringSchedulingIgnoredDuringExecution:
              - labelSelector:
                  matchExpressions:
                  - key: app
                    operator: In
                    values:
                    - with-affinity
                topologyKey: topology.kubernetes.io/zone
            nodeAffinity:
              preferredDuringSchedulingIgnoredDuringExecution:
              - weight: 1
                preference:
                  matchExpressions:
                  - key: type
                    operator: NotIn
                    values:
                    - virtual-kubelet
          tolerations:
            - key: "virtual-kubelet.io/provider"
              operator: "Exists"
              effect: "NoSchedule"
          containers:
          - name: with-affinity
            image: registry.k8s.io/pause:2.0

    The following describes the key parameters in the preceding template.

    podAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: app
            operator: In
            values:
            - with-affinity
        topologyKey: topology.kubernetes.io/zone

    The configuration specifies that all pods are deployed in a single zone.

    For more information about the parameters, see Inter-pod affinity and anti-affinity.

    nodeAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 1
        preference:
          matchExpressions:
          - key: type
            operator: NotIn
            values:
            - virtual-kubelet

    The configuration specifies that the pods are preferentially scheduled to ECS nodes.

    For more information about the parameters, see Node affinity.

    tolerations:
      - key: "virtual-kubelet.io/provider"
        operator: "Exists"
        effect: "NoSchedule"

    The toleration allows the pods to tolerate the taint of virtual nodes so that kube-scheduler can schedule the pods to virtual nodes.

    For more information about the parameters, see Taints and Tolerations.

    If you want to deploy the pods in a specific zone, delete the podAffinity parameter and add the following constraint to the nodeAffinity parameter. The following configuration specifies that the pods must be deployed in Beijing Zone A (cn-beijing-a).

    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      - matchExpressions:
        - key: topology.kubernetes.io/zone
          operator: In
          values:
          - cn-beijing-a
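Putting the pieces together, after you remove the podAffinity parameter and pin the pods to a zone, the affinity section would look like the following sketch (cn-beijing-a is the example zone from above; replace it with a zone that has a vSwitch configured in the eci-profile ConfigMap):

```yaml
affinity:
  nodeAffinity:
    # Hard requirement: schedule only into Beijing Zone A.
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      - matchExpressions:
        - key: topology.kubernetes.io/zone
          operator: In
          values:
          - cn-beijing-a
    # Soft preference: favor ECS nodes over virtual nodes within that zone.
    preferredDuringSchedulingIgnoredDuringExecution:
    - weight: 1
      preference:
        matchExpressions:
        - key: type
          operator: NotIn
          values:
          - virtual-kubelet
```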
  2. Create a workload.

    Create a file named deployment.yaml and copy the preceding YAML template to the file. Then, run the following command to create a Deployment in the cluster:

    kubectl apply -f deployment.yaml
  3. Verify the scheduling result of the workload.

    • Run the following command to query the nodes on which the Deployment deploys the pods:

      kubectl get po -lapp=with-affinity -ocustom-columns=NAME:.metadata.name,NODE:.spec.nodeName --no-headers | grep -v "<none>"
    • Run the following command to query the number of pods that are created by the Deployment in each zone:

      kubectl get po -lapp=with-affinity -ocustom-columns=NODE:.spec.nodeName --no-headers | grep -v "<none>" | xargs -I {} kubectl get no {} -ojson | jq '.metadata.labels["topology.kubernetes.io/zone"]' | sort | uniq -c

Strict Elastic Container Instance-based pod topology spread

By default, when you force the system to spread Elastic Container Instance-based pods across zones, kube-scheduler evenly distributes the pods of a workload across all zones. However, Elastic Container Instance-based pods may fail to be created in some zones, which breaks the even distribution. The following figure shows the scheduling result when the maxSkew parameter is set to 1. For more information about maxSkew, see maxSkew.

[Figure: pods scheduled evenly across Zone A, Zone B, and Zone C with maxSkew set to 1]

If the Elastic Container Instance-based pods in Zone B and Zone C fail to be created, two Elastic Container Instance-based pods run in Zone A, and no Elastic Container Instance-based pod runs in Zone B or Zone C. This violates the constraint specified by the maxSkew parameter.

In an ACK Pro cluster, you can enable strict Elastic Container Instance-based pod topology spread to ensure that pods are strictly spread across zones. After you enable strict Elastic Container Instance-based pod topology spread, kube-scheduler first schedules a pod to each of Zone A, Zone B, and Zone C. kube-scheduler does not schedule pending pods until the scheduled pods are created, as shown in the following figure.

[Figure: one pod scheduled to each of Zone A, Zone B, and Zone C; pending pods are not scheduled until the scheduled pods are created]

Even if Pod A1 is created, pending pods are not scheduled. This is because if the pod in Zone B or Zone C fails to be created, the constraint specified by the maxSkew parameter is violated. After Pod B1 is created, kube-scheduler schedules a pod to Zone C. Pods with green shading are created.

[Figure: after Pod B1 is created, kube-scheduler schedules a pod to Zone C; created pods are shaded green]

If you want to disable strict Elastic Container Instance-based pod topology spread, set the whenUnsatisfiable parameter to ScheduleAnyway. For more information, see Spread constraint definition.
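For example, replacing DoNotSchedule with ScheduleAnyway in the topology spread constraint from Example 1 turns the zone spread into a best-effort preference, so pods are still scheduled even when an even distribution cannot be satisfied:

```yaml
topologySpreadConstraints:
  - maxSkew: 1
    topologyKey: topology.kubernetes.io/zone
    # ScheduleAnyway makes the spread a soft preference instead of a hard
    # requirement, which disables strict topology spread.
    whenUnsatisfiable: ScheduleAnyway
    labelSelector:
      matchLabels:
        app: with-pod-topology-spread
```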