Container Service for Kubernetes: Schedule pods based on cache affinity

Last Updated: Aug 30, 2024

Fluid allows you to schedule pods based on cache affinity. This way, you can deploy application pods to the nodes on which the cached data is stored, the nodes in the zone where the cached data is located, or the nodes in the region where the cached data is located. This improves data access efficiency.

Limits

Prerequisites

Feature description

Fluid uses a mutating webhook to inject cache affinity rules into pod specifications. When you create a pod, Fluid can inject cache affinity rules at different levels (node, zone, and region) into the pod specification. kube-scheduler then preferentially schedules the pod to the nodes on which the cached data is stored, the nodes in the zone where the cached data is located, or the nodes in the region where the cached data is located.

Important

If the spec.affinity or spec.nodeSelector parameter is already specified in the pod specification, Fluid does not inject cache affinity rules into the pod specification.
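
The following manifest is a minimal sketch of a pod that Fluid would leave unchanged under the rule above. The pod name nginx-pinned and the disktype: ssd node label are hypothetical and used only for illustration; demo-dataset is the Dataset used in the examples later in this topic.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx-pinned                 # hypothetical name
  labels:
    fuse.serverful.fluid.io/inject: "true"
spec:
  # spec.nodeSelector is already set, so Fluid does not inject
  # cache affinity rules into this pod specification.
  nodeSelector:
    disktype: ssd                    # hypothetical node label
  containers:
    - name: nginx
      image: registry.openanolis.cn/openanolis/nginx:1.14.1-8.6
      volumeMounts:
        - mountPath: /data
          name: data-vol
  volumes:
    - name: data-vol
      persistentVolumeClaim:
        claimName: demo-dataset
```

To let Fluid inject cache affinity rules into such a pod, leave both spec.affinity and spec.nodeSelector unset.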

Configure the scheduling policy

Default configurations

Fluid supports the following levels of cache affinity scheduling: node, zone, and region. To check the scheduling policy of your cluster, run the following command:

kubectl get cm -n fluid-system webhook-plugins -oyaml

Expected output:

apiVersion: v1
data:
  pluginsProfile: |
    pluginConfig:
    - args: |
        preferred:
          # fluid existed node affinity, the name can not be modified.
          - name: fluid.io/node
            weight: 100
          # runtime worker's zone label name, can be changed according to k8s environment.
          - name: topology.kubernetes.io/zone
            weight: 50
          # runtime worker's region label name, can be changed according to k8s environment.
          - name: topology.kubernetes.io/region
            weight: 20
        # used when app pod with label fluid.io/dataset.{dataset name}.sched set true
        required:
          - fluid.io/node
      name: NodeAffinityWithCache
    plugins:
      serverful:
        withDataset:
        - RequireNodeWithFuse
        - NodeAffinityWithCache
        - MountPropagationInjector
        withoutDataset:
        - PreferNodesWithoutCache
      serverless:
        withDataset:
        - FuseSidecar
        withoutDataset: []

The following table describes the parameters in the pluginsProfile section of the preceding ConfigMap.

| Parameter | Description |
| --- | --- |
| fluid.io/node | A name predefined by Fluid. The name cannot be changed. When this entry is present, Fluid injects a node-specific cache affinity rule into the pod specification. The rule prefers the nodes on which the cached data is stored. The default rule weight is 100. |
| topology.kubernetes.io/zone | The node label that identifies the zone of the runtime workers. You can change the label name to match your Kubernetes environment. When this entry is present, Fluid injects a zone-specific cache affinity rule into the pod specification. The rule prefers the nodes in the zone where the cached data is located. The default rule weight is 50. |
| topology.kubernetes.io/region | The node label that identifies the region of the runtime workers. You can change the label name to match your Kubernetes environment. When this entry is present, Fluid injects a region-specific cache affinity rule into the pod specification. The rule prefers the nodes in the region where the cached data is located. The default rule weight is 20. |
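
To make the effect of the weights concrete, the fragment below sketches the preferred terms that a pod might carry when all three default levels apply. It assumes a Dataset named demo-dataset in the default namespace whose cache is pinned to one zone and one region; <ZONE_ID> and <REGION_ID> are placeholders, and the exact terms that Fluid injects depend on how the Dataset and runtime are configured.

```yaml
# Sketch only: terms assumed to be injected under the default weights.
affinity:
  nodeAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
    - weight: 100                    # from the fluid.io/node entry
      preference:
        matchExpressions:
        - key: fluid.io/s-default-demo-dataset
          operator: In
          values: ["true"]
    - weight: 50                     # from the topology.kubernetes.io/zone entry
      preference:
        matchExpressions:
        - key: topology.kubernetes.io/zone
          operator: In
          values: ["<ZONE_ID>"]
    - weight: 20                     # from the topology.kubernetes.io/region entry
      preference:
        matchExpressions:
        - key: topology.kubernetes.io/region
          operator: In
          values: ["<REGION_ID>"]
```

kube-scheduler adds up the weights of all preferred terms that a node satisfies. Under these assumptions, a node that stores the cached data matches all three terms (100 + 50 + 20 = 170), a node in the same zone without the cache matches two (50 + 20 = 70), and a node elsewhere in the region matches one (20). This sum is only one input to the final node score, so other scheduler scoring plugins can still influence the placement.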

Custom configurations

ACK may use other node labels to identify the topological information of nodes in ACK clusters. To configure Fluid to inject custom affinity rules based on specific node labels into pod specifications, perform the following steps:

  1. Run the following command to modify the webhook-plugins ConfigMap:

    kubectl edit -n fluid-system cm webhook-plugins
  2. Modify the webhook-plugins ConfigMap based on the following sample code.

    • To stop using a topology level, delete or comment out the corresponding label entry based on your business requirements. For more information, see Example 1: Ignore node-specific cache affinity rules.

    • To add a custom affinity rule, add a node label (such as <topology_key>) and set the rule weight (such as <topology_weight>). For more information, see Example 2: Add node pool-specific cache affinity rules.

    apiVersion: v1
    data:
      pluginsProfile: |
        pluginConfig:
        - args: |
            preferred:
              # fluid existed node affinity, the name can not be modified.
              - name: fluid.io/node
                weight: 100
              # runtime worker's zone label name, can be changed according to k8s environment.
              - name: topology.kubernetes.io/zone
                weight: 50
              # runtime worker's region label name, can be changed according to k8s environment.
              - name: topology.kubernetes.io/region
                weight: 20
              - name: <topology_key>
                weight: <topology_weight>
            # used when app pod with label fluid.io/dataset.{dataset name}.sched set true
            required:
              - fluid.io/node
          name: NodeAffinityWithCache
        plugins:
          serverful:
            withDataset:
            - RequireNodeWithFuse
            - NodeAffinityWithCache
            - MountPropagationInjector
            withoutDataset:
            - PreferNodesWithoutCache
          serverless:
            withDataset:
            - FuseSidecar
            withoutDataset: []

    Example 1: Ignore node-specific cache affinity rules

    apiVersion: v1
    data:
      pluginsProfile: |
        pluginConfig:
        - args: |
            preferred:
              # fluid existed node affinity, the name can not be modified.
              #- name: fluid.io/node
              #  weight: 100
              # runtime worker's zone label name, can be changed according to k8s environment.
              - name: topology.kubernetes.io/zone
                weight: 50
              # runtime worker's region label name, can be changed according to k8s environment.
              - name: topology.kubernetes.io/region
                weight: 20
            # used when app pod with label fluid.io/dataset.{dataset name}.sched set true
            required:
              - fluid.io/node
          name: NodeAffinityWithCache
        plugins:
          serverful:
            withDataset:
            - RequireNodeWithFuse
            - NodeAffinityWithCache
            - MountPropagationInjector
            withoutDataset:
            - PreferNodesWithoutCache
          serverless:
            withDataset:
            - FuseSidecar
            withoutDataset: []

    Example 2: Add node pool-specific cache affinity rules

    apiVersion: v1
    data:
      pluginsProfile: |
        pluginConfig:
        - args: |
            preferred:
              # fluid existed node affinity, the name can not be modified.
              - name: fluid.io/node
                weight: 100
              # node pool label name, added in this example.
              - name: alibabacloud.com/nodepool-id
                weight: 80
              # runtime worker's zone label name, can be changed according to k8s environment.
              - name: topology.kubernetes.io/zone
                weight: 50
              # runtime worker's region label name, can be changed according to k8s environment.
              - name: topology.kubernetes.io/region
                weight: 20
            # used when app pod with label fluid.io/dataset.{dataset name}.sched set true
            required:
              - fluid.io/node
          name: NodeAffinityWithCache
        plugins:
          serverful:
            withDataset:
            - RequireNodeWithFuse
            - NodeAffinityWithCache
            - MountPropagationInjector
            withoutDataset:
            - PreferNodesWithoutCache
          serverless:
            withDataset:
            - FuseSidecar
            withoutDataset: []
  3. Run the following command to restart Fluid Webhook and apply the changes:

    kubectl rollout restart deployment -n fluid-system fluid-webhook

Examples

Example 1: Schedule a pod based on a node-specific cache affinity rule

  1. Create a Secret.

    apiVersion: v1
    kind: Secret
    metadata:
      name: mysecret
    stringData:
      fs.oss.accessKeyId: <ACCESS_KEY_ID>
      fs.oss.accessKeySecret: <ACCESS_KEY_SECRET>
  2. Create a Dataset and a Runtime object.

    Important

    In this example, a JindoRuntime is created. To use other cache runtimes, see Use EFC to accelerate access to NAS or CPFS. For more information about how to use JindoFS to accelerate access to Object Storage Service (OSS), see Use JindoFS to accelerate access to OSS.

    apiVersion: data.fluid.io/v1alpha1
    kind: Dataset
    metadata:
      name: demo-dataset
    spec:
      mounts:
        - mountPoint: oss://<oss_bucket>/<bucket_dir>
          options:
            fs.oss.endpoint: <oss_endpoint>
          name: hadoop
          path: "/"
          encryptOptions:
            - name: fs.oss.accessKeyId
              valueFrom:
                secretKeyRef:
                  name: mysecret
                  key: fs.oss.accessKeyId
            - name: fs.oss.accessKeySecret
              valueFrom:
                secretKeyRef:
                  name: mysecret
                  key: fs.oss.accessKeySecret
    ---
    apiVersion: data.fluid.io/v1alpha1
    kind: JindoRuntime
    metadata:
      name: demo-dataset
    spec:
      replicas: 2
      tieredstore:
        levels:
          - mediumtype: MEM
            path: /dev/shm
            quota: 10G
            high: "0.99"
            low: "0.8"
  3. Create an application pod.

    apiVersion: v1
    kind: Pod
    metadata:
      name: nginx
      labels:
        fuse.serverful.fluid.io/inject: "true"
    spec:
      containers:
        - name: nginx
          image: registry.openanolis.cn/openanolis/nginx:1.14.1-8.6
          volumeMounts:
            - mountPath: /data
              name: data-vol
      volumes:
        - name: data-vol
          persistentVolumeClaim:
            claimName: demo-dataset

    The following parameters are used to enable pod scheduling based on a node-specific cache affinity rule.

    | Parameter | Description |
    | --- | --- |
    | fuse.serverful.fluid.io/inject: "true" | Enables Fluid to inject cache affinity rules into the pod specification. |
    | claimName | The persistent volume claim (PVC) that is mounted to the pod. The PVC is automatically created by Fluid and named after the Dataset that you created. |

  4. Check the affinity settings in the pod specification.

    kubectl get pod nginx -oyaml

    Expected output:

    apiVersion: v1
    kind: Pod
    metadata:
      labels:
        fuse.serverful.fluid.io/inject: "true"
      name: nginx
      namespace: default
      ...
    spec:
      affinity:
        nodeAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - preference:
              matchExpressions:
              - key: fluid.io/s-default-demo-dataset
                operator: In
                values:
                - "true"
            weight: 100
    

    A node-specific cache affinity rule (fluid.io/s-default-demo-dataset) is injected into the pod specification. The rule weight (100 in this example) is the weight configured for fluid.io/node in the webhook-plugins ConfigMap.
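
The same label can presumably be applied to pods that are managed by a controller. The sketch below assumes that the Fluid webhook handles such pods the same way as the standalone pod in this example, because admission webhooks act on each pod creation request. The Deployment name, the app: nginx label, and the replica count are hypothetical.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx                        # hypothetical name
spec:
  replicas: 2                        # hypothetical replica count
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx                   # hypothetical selector label
        # The label must be on the pod template so that it appears
        # on every pod that the Deployment creates.
        fuse.serverful.fluid.io/inject: "true"
    spec:
      containers:
        - name: nginx
          image: registry.openanolis.cn/openanolis/nginx:1.14.1-8.6
          volumeMounts:
            - mountPath: /data
              name: data-vol
      volumes:
        - name: data-vol
          persistentVolumeClaim:
            claimName: demo-dataset
```

In this sketch the label is set under spec.template.metadata.labels rather than on the Deployment itself, because the affinity rules are injected into pod specifications.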

Example 2: Schedule a pod based on a zone-specific cache affinity rule

  1. Create a Secret.

    apiVersion: v1
    kind: Secret
    metadata:
      name: mysecret
    stringData:
      fs.oss.accessKeyId: <ACCESS_KEY_ID>
      fs.oss.accessKeySecret: <ACCESS_KEY_SECRET>
  2. Create a Dataset and a Runtime object.

    Important

    In this example, a JindoRuntime is created. To use other cache runtimes, see Use EFC to accelerate access to NAS or CPFS. For more information about how to use JindoFS to accelerate access to Object Storage Service (OSS), see Use JindoFS to accelerate access to OSS.

    apiVersion: data.fluid.io/v1alpha1
    kind: Dataset
    metadata:
      name: demo-dataset
    spec:
      nodeAffinity:
        required:
          nodeSelectorTerms:
            - matchExpressions:
                - key: topology.kubernetes.io/zone
                  operator: In
                  values:
                  - "<ZONE_ID>" # e.g. cn-beijing-i
      mounts:
        - mountPoint: oss://<oss_bucket>/<bucket_dir>
          options:
            fs.oss.endpoint: <oss_endpoint>
          name: hadoop
          path: "/"
          encryptOptions:
            - name: fs.oss.accessKeyId
              valueFrom:
                secretKeyRef:
                  name: mysecret
                  key: fs.oss.accessKeyId
            - name: fs.oss.accessKeySecret
              valueFrom:
                secretKeyRef:
                  name: mysecret
                  key: fs.oss.accessKeySecret
    ---
    apiVersion: data.fluid.io/v1alpha1
    kind: JindoRuntime
    metadata:
      name: demo-dataset
    spec:
      replicas: 2
      master:
        nodeSelector:
          topology.kubernetes.io/zone: <ZONE_ID> # e.g. cn-beijing-i
      tieredstore:
        levels:
          - mediumtype: MEM
            path: /dev/shm
            quota: 10G
            high: "0.99"
            low: "0.8"

    To schedule a pod based on a zone-specific cache affinity rule, you must explicitly specify the zone in which the cached data is located. In the preceding code block, the zone is specified by the topology.kubernetes.io/zone label in the nodeAffinity.required.nodeSelectorTerms parameter of the Dataset. Replace <ZONE_ID> with the ID of your zone, such as cn-beijing-i.

  3. Create an application pod.

    apiVersion: v1
    kind: Pod
    metadata:
      name: nginx
      labels:
        fuse.serverful.fluid.io/inject: "true"
    spec:
      containers:
        - name: nginx
          image: registry.openanolis.cn/openanolis/nginx:1.14.1-8.6
          volumeMounts:
            - mountPath: /data
              name: data-vol
      volumes:
        - name: data-vol
          persistentVolumeClaim:
            claimName: demo-dataset

    The following parameters are used to enable pod scheduling based on a zone-specific cache affinity rule.

    | Parameter | Description |
    | --- | --- |
    | fuse.serverful.fluid.io/inject: "true" | Enables Fluid to inject cache affinity rules into the pod specification. |
    | claimName | The PVC that is mounted to the pod. The PVC is automatically created by Fluid and named after the Dataset that you created. |

  4. Check the affinity settings in the pod specification.

    kubectl get pod nginx -oyaml

    Expected output:

    apiVersion: v1
    kind: Pod
    metadata:
      labels:
        fuse.serverful.fluid.io/inject: "true"
      name: nginx
      namespace: default
      ...
    spec:
      affinity:
        nodeAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - preference:
              matchExpressions:
              - key: fluid.io/s-default-demo-dataset
                operator: In
                values:
                - "true"
            weight: 100
          - preference:
              matchExpressions:
              - key: topology.kubernetes.io/zone
                operator: In
                values:
                - <ZONE_ID> # e.g. cn-beijing-i
            weight: 50
    ...

    A node-specific cache affinity rule (fluid.io/s-default-demo-dataset) and a zone-specific cache affinity rule (topology.kubernetes.io/zone) are injected into the pod specification. The rule weights (100 and 50 in this example) are the weights configured for fluid.io/node and topology.kubernetes.io/zone in the webhook-plugins ConfigMap.

Example 3: Force pod scheduling based on a node-specific cache affinity rule

  1. Create a Secret.

    apiVersion: v1
    kind: Secret
    metadata:
      name: mysecret
    stringData:
      fs.oss.accessKeyId: <ACCESS_KEY_ID>
      fs.oss.accessKeySecret: <ACCESS_KEY_SECRET>
  2. Create a Dataset and a Runtime object.

    Important

    In this example, a JindoRuntime is created. To use other cache runtimes, see Use EFC to accelerate access to NAS or CPFS. For more information about how to use JindoFS to accelerate access to Object Storage Service (OSS), see Use JindoFS to accelerate access to OSS.

    apiVersion: data.fluid.io/v1alpha1
    kind: Dataset
    metadata:
      name: demo-dataset
    spec:
      mounts:
        - mountPoint: oss://<oss_bucket>/<bucket_dir>
          options:
            fs.oss.endpoint: <oss_endpoint>
          name: hadoop
          path: "/"
          encryptOptions:
            - name: fs.oss.accessKeyId
              valueFrom:
                secretKeyRef:
                  name: mysecret
                  key: fs.oss.accessKeyId
            - name: fs.oss.accessKeySecret
              valueFrom:
                secretKeyRef:
                  name: mysecret
                  key: fs.oss.accessKeySecret
    ---
    apiVersion: data.fluid.io/v1alpha1
    kind: JindoRuntime
    metadata:
      name: demo-dataset
    spec:
      replicas: 2
      tieredstore:
        levels:
          - mediumtype: MEM
            path: /dev/shm
            quota: 10G
            high: "0.99"
            low: "0.8"
  3. Create an application pod.

    apiVersion: v1
    kind: Pod
    metadata:
      name: nginx
      labels:
        fuse.serverful.fluid.io/inject: "true"
        fluid.io/dataset.demo-dataset.sched: required
    spec:
      containers:
        - name: nginx
          image: registry.openanolis.cn/openanolis/nginx:1.14.1-8.6
          volumeMounts:
            - mountPath: /data
              name: data-vol
      volumes:
        - name: data-vol
          persistentVolumeClaim:
            claimName: demo-dataset

    The following parameters are used to force pod scheduling based on a node-specific cache affinity rule.

    | Parameter | Description |
    | --- | --- |
    | fuse.serverful.fluid.io/inject: "true" | Enables Fluid to inject cache affinity rules into the pod specification. |
    | fluid.io/dataset.<dataset_name>.sched: required | Forces node-specific cache affinity for the <dataset_name> Dataset. Fluid injects a required node-specific affinity rule for that Dataset instead of a preferred one. |
    | claimName | The PVC that is mounted to the pod. The PVC is automatically created by Fluid and named after the Dataset that you created. |

  4. Check the affinity settings in the pod specification.

    kubectl get pod nginx -oyaml

    Expected output:

    apiVersion: v1
    kind: Pod
    metadata:
      labels:
        fluid.io/dataset.demo-dataset.sched: required
        fuse.serverful.fluid.io/inject: "true"
      name: nginx
      namespace: default
      ...
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: fluid.io/s-default-demo-dataset
                operator: In
                values:
                - "true"

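    A forced node-specific cache affinity rule (fluid.io/s-default-demo-dataset) is injected into the pod specification. Because the rule is a requiredDuringSchedulingIgnoredDuringExecution rule, kube-scheduler schedules the pod only to nodes that have this label. If no such node can host the pod, the pod remains in the Pending state.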