Fluid allows you to schedule pods based on cache affinity. Application pods are preferentially deployed to the nodes on which the cached data is stored, or to nodes in the same zone or region as the cached data. This improves data access efficiency.
Limits
This feature is supported only by ACK Pro clusters.
This feature is incompatible with Elastic Container Instance-based scheduling or priority-based resource scheduling.
Prerequisites
An ACK Pro cluster that runs Kubernetes 1.18 or later is created. For more information, see Create an ACK Pro cluster.
The cloud-native AI suite and ack-fluid 1.0.6 or later are deployed in the cluster. For more information, see Deploy the cloud-native AI suite.
Important: If you have already installed open source Fluid, uninstall Fluid and deploy the ack-fluid component.
A kubectl client is connected to the cluster. For more information, see Obtain the kubeconfig file of a cluster and use kubectl to connect to the cluster.
Feature description
Fluid can inject cache affinity rules into pod specifications based on mutating webhooks. When you create a pod, you can configure Fluid to inject different levels of cache affinity rules into the pod specification. This way, kube-scheduler preferentially schedules the pod to the nodes on which the cached data is stored, the nodes in the zone where the cached data is located, or the nodes in the region where the cached data is located.
If the spec.affinity or spec.nodeSelector parameter is already specified in the pod specification, Fluid does not inject cache affinity rules into the pod specification.
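For example, a pod like the following (hypothetical pod name, selector, and image) already specifies spec.nodeSelector, so Fluid leaves its scheduling constraints untouched even though the inject label is set:

```yaml
# Hypothetical example: Fluid skips cache affinity injection for this pod
# because spec.nodeSelector is already set by the user.
apiVersion: v1
kind: Pod
metadata:
  name: my-app            # hypothetical name
  labels:
    fuse.serverful.fluid.io/inject: "true"
spec:
  nodeSelector:
    disktype: ssd         # user-defined selector; Fluid will not override it
  containers:
  - name: app
    image: nginx          # placeholder image
```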
Configure the scheduling policy
Default configurations
Fluid supports the following levels of cache affinity scheduling: node, zone, and region. To check the scheduling policy of your cluster, run the following command:
kubectl get cm -n fluid-system webhook-plugins -oyaml
Expected output:
```yaml
apiVersion: v1
data:
  pluginsProfile: |
    pluginConfig:
    - args: |
        preferred:
        # fluid existed node affinity, the name can not be modified.
        - name: fluid.io/node
          weight: 100
        # runtime worker's zone label name, can be changed according to k8s environment.
        - name: topology.kubernetes.io/zone
          weight: 50
        # runtime worker's region label name, can be changed according to k8s environment.
        - name: topology.kubernetes.io/region
          weight: 20
        # used when app pod with label fluid.io/dataset.{dataset name}.sched set true
        required:
        - fluid.io/node
      name: NodeAffinityWithCache
    plugins:
      serverful:
        withDataset:
        - RequireNodeWithFuse
        - NodeAffinityWithCache
        - MountPropagationInjector
        withoutDataset:
        - PreferNodesWithoutCache
      serverless:
        withDataset:
        - FuseSidecar
        withoutDataset: []
```
The following table describes the parameters in the pluginsProfile section of the preceding ConfigMap.
| Parameter | Description |
| --- | --- |
| fluid.io/node | A parameter predefined by Fluid. After this parameter is enabled, Fluid automatically injects a node-specific cache affinity rule into the pod specification. The rule prefers the nodes on which the cached data is stored. The rule weight is 100. |
| topology.kubernetes.io/zone | A Kubernetes cluster parameter that specifies a zone-specific cache affinity rule. After this parameter is enabled, Fluid automatically injects a zone-specific cache affinity rule into the pod specification. The rule prefers the zone in which the cached data is located. The rule weight is 50. |
| topology.kubernetes.io/region | A Kubernetes cluster parameter that specifies a region-specific cache affinity rule. After this parameter is enabled, Fluid automatically injects a region-specific cache affinity rule into the pod specification. The rule prefers the region in which the cached data is located. The rule weight is 20. |
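As a rough illustration of how these weighted entries map to Kubernetes scheduling preferences (this is a sketch, not Fluid's actual webhook code; the dataset label key and the placeholder topology values are assumptions), the translation into preferredDuringSchedulingIgnoredDuringExecution terms looks like this:

```python
# Illustrative sketch only -- not Fluid's actual webhook implementation.
# Maps the NodeAffinityWithCache "preferred" args to Kubernetes
# nodeAffinity preference entries.

preferred = [
    {"name": "fluid.io/node", "weight": 100},
    {"name": "topology.kubernetes.io/zone", "weight": 50},
    {"name": "topology.kubernetes.io/region", "weight": 20},
]

def build_preferences(preferred, dataset_label_key):
    """Build preferredDuringSchedulingIgnoredDuringExecution entries.

    The fluid.io/node entry is rewritten to the per-Dataset node label
    (e.g. fluid.io/s-default-demo-dataset, an assumed example name);
    topology keys are kept as-is, matched against the zone/region of the
    runtime workers (placeholder values here).
    """
    prefs = []
    for item in preferred:
        if item["name"] == "fluid.io/node":
            key, values = dataset_label_key, ["true"]
        else:
            key, values = item["name"], ["<topology-value>"]
        prefs.append({
            "weight": item["weight"],
            "preference": {
                "matchExpressions": [
                    {"key": key, "operator": "In", "values": values}
                ]
            },
        })
    return prefs

prefs = build_preferences(preferred, "fluid.io/s-default-demo-dataset")
```

The entry with the highest weight (the node-level rule) dominates scoring, so kube-scheduler falls back to zone- and then region-level placement only when no cache node can accept the pod.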
Custom configurations
ACK may use other node labels to identify the topological information of nodes in ACK clusters. To configure Fluid to inject custom affinity rules based on specific node labels into pod specifications, perform the following steps:
Run the following command to modify the webhook-plugins ConfigMap:
kubectl edit -n fluid-system cm webhook-plugins
Modify the webhook-plugins ConfigMap based on the following sample code.
- You can delete existing labels that identify the topological information of the cluster based on your business requirements. For more information, see Example 1: Ignore node affinities.
- You can add a custom affinity rule based on a specific node label (such as <topology_key>) and set the rule weight (such as <topology_weight>). For more information, see Example 2: Add the node pool affinity.
```yaml
apiVersion: v1
data:
  pluginsProfile: |
    pluginConfig:
    - args: |
        preferred:
        # fluid existed node affinity, the name can not be modified.
        - name: fluid.io/node
          weight: 100
        # runtime worker's zone label name, can be changed according to k8s environment.
        - name: topology.kubernetes.io/zone
          weight: 50
        # runtime worker's region label name, can be changed according to k8s environment.
        - name: topology.kubernetes.io/region
          weight: 20
        - name: <topology_key>
          weight: <topology_weight>
        # used when app pod with label fluid.io/dataset.{dataset name}.sched set true
        required:
        - fluid.io/node
      name: NodeAffinityWithCache
    plugins:
      serverful:
        withDataset:
        - RequireNodeWithFuse
        - NodeAffinityWithCache
        - MountPropagationInjector
        withoutDataset:
        - PreferNodesWithoutCache
      serverless:
        withDataset:
        - FuseSidecar
        withoutDataset: []
```
Run the following command to restart Fluid Webhook and apply the changes:
kubectl rollout restart deployment -n fluid-system fluid-webhook
Examples
Example 1: Schedule a pod based on a node-specific cache affinity rule
Create a Secret.
```yaml
apiVersion: v1
kind: Secret
metadata:
  name: mysecret
stringData:
  fs.oss.accessKeyId: <ACCESS_KEY_ID>
  fs.oss.accessKeySecret: <ACCESS_KEY_SECRET>
```
Create a Dataset and a Runtime object.
Important: In this example, a JindoRuntime is created. To use other cache runtimes, see Use EFC to accelerate access to NAS or CPFS. For more information about how to use JindoFS to accelerate access to Object Storage Service (OSS), see Use JindoFS to accelerate access to OSS.
```yaml
apiVersion: data.fluid.io/v1alpha1
kind: Dataset
metadata:
  name: demo-dataset
spec:
  mounts:
  - mountPoint: oss://<oss_bucket>/<bucket_dir>
    options:
      fs.oss.endpoint: <oss_endpoint>
    name: hadoop
    path: "/"
    encryptOptions:
    - name: fs.oss.accessKeyId
      valueFrom:
        secretKeyRef:
          name: mysecret
          key: fs.oss.accessKeyId
    - name: fs.oss.accessKeySecret
      valueFrom:
        secretKeyRef:
          name: mysecret
          key: fs.oss.accessKeySecret
---
apiVersion: data.fluid.io/v1alpha1
kind: JindoRuntime
metadata:
  name: demo-dataset
spec:
  replicas: 2
  tieredstore:
    levels:
    - mediumtype: MEM
      path: /dev/shm
      quota: 10G
      high: "0.99"
      low: "0.8"
```
Create an application pod.
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx
  labels:
    fuse.serverful.fluid.io/inject: "true"
spec:
  containers:
  - name: nginx
    image: registry.openanolis.cn/openanolis/nginx:1.14.1-8.6
    volumeMounts:
    - mountPath: /data
      name: data-vol
  volumes:
  - name: data-vol
    persistentVolumeClaim:
      claimName: demo-dataset
```
The following parameters are used to enable pod scheduling based on a node-specific cache affinity rule.
| Parameter | Description |
| --- | --- |
| fuse.serverful.fluid.io/inject: "true" | Enables Fluid to inject cache affinity rules into the pod specification. |
| claimName | The persistent volume claim (PVC) that is mounted to the pod. The PVC is automatically created by Fluid and named after the Dataset that you created. |
Check the affinity settings in the pod specification.
kubectl get pod nginx -oyaml
Expected output:
```yaml
apiVersion: v1
kind: Pod
metadata:
  labels:
    fuse.serverful.fluid.io/inject: "true"
  name: nginx
  namespace: default
  ...
spec:
  affinity:
    nodeAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - preference:
          matchExpressions:
          - key: fluid.io/s-default-demo-dataset
            operator: In
            values:
            - "true"
        weight: 100
```
A node-specific cache affinity rule (fluid.io/s-default-demo-dataset) is injected into the pod specification. The rule weight depends on the configurations of the node topological parameters in the scheduling policy.
Example 2: Schedule a pod based on a zone-specific cache affinity rule
Create a Secret.
```yaml
apiVersion: v1
kind: Secret
metadata:
  name: mysecret
stringData:
  fs.oss.accessKeyId: <ACCESS_KEY_ID>
  fs.oss.accessKeySecret: <ACCESS_KEY_SECRET>
```
Create a Dataset and a Runtime object.
Important: In this example, a JindoRuntime is created. To use other cache runtimes, see Use EFC to accelerate access to NAS or CPFS. For more information about how to use JindoFS to accelerate access to Object Storage Service (OSS), see Use JindoFS to accelerate access to OSS.
```yaml
apiVersion: data.fluid.io/v1alpha1
kind: Dataset
metadata:
  name: demo-dataset
spec:
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: topology.kubernetes.io/zone
          operator: In
          values:
          - "<ZONE_ID>" # e.g. cn-beijing-i
  mounts:
  - mountPoint: oss://<oss_bucket>/<bucket_dir>
    options:
      fs.oss.endpoint: <oss_endpoint>
    name: hadoop
    path: "/"
    encryptOptions:
    - name: fs.oss.accessKeyId
      valueFrom:
        secretKeyRef:
          name: mysecret
          key: fs.oss.accessKeyId
    - name: fs.oss.accessKeySecret
      valueFrom:
        secretKeyRef:
          name: mysecret
          key: fs.oss.accessKeySecret
---
apiVersion: data.fluid.io/v1alpha1
kind: JindoRuntime
metadata:
  name: demo-dataset
spec:
  replicas: 2
  master:
    nodeSelector:
      topology.kubernetes.io/zone: <ZONE_ID> # e.g. cn-beijing-i
  tieredstore:
    levels:
    - mediumtype: MEM
      path: /dev/shm
      quota: 10G
      high: "0.99"
      low: "0.8"
```
To schedule a pod based on a zone-specific cache affinity rule, you must explicitly specify the zone in which the cached data is located. In the preceding code block, the topology.kubernetes.io/zone=<ZONE_ID> label (for example, cn-beijing-i) is specified in the nodeAffinity.required.nodeSelectorTerms parameter.

Create an application pod.
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx
  labels:
    fuse.serverful.fluid.io/inject: "true"
spec:
  containers:
  - name: nginx
    image: registry.openanolis.cn/openanolis/nginx:1.14.1-8.6
    volumeMounts:
    - mountPath: /data
      name: data-vol
  volumes:
  - name: data-vol
    persistentVolumeClaim:
      claimName: demo-dataset
```
The following parameters are used to enable pod scheduling based on a zone-specific cache affinity rule.
| Parameter | Description |
| --- | --- |
| fuse.serverful.fluid.io/inject: "true" | Enables Fluid to inject cache affinity rules into the pod specification. |
| claimName | The PVC that is mounted to the pod. The PVC is automatically created by Fluid and named after the Dataset that you created. |
Check the affinity settings in the pod specification.
kubectl get pod nginx -oyaml
Expected output:
```yaml
apiVersion: v1
kind: Pod
metadata:
  labels:
    fuse.serverful.fluid.io/inject: "true"
  name: nginx
  namespace: default
  ...
spec:
  affinity:
    nodeAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - preference:
          matchExpressions:
          - key: fluid.io/s-default-demo-dataset
            operator: In
            values:
            - "true"
        weight: 100
      - preference:
          matchExpressions:
          - key: topology.kubernetes.io/zone
            operator: In
            values:
            - <ZONE_ID> # e.g. cn-beijing-i
        weight: 50
  ...
```
A node-specific cache affinity rule (fluid.io/s-default-demo-dataset) and a zone-specific cache affinity rule (topology.kubernetes.io/zone) are injected into the pod specification. The rule weights depend on the configurations of the node topological parameters in the scheduling policy.
Example 3: Force pod scheduling based on a node-specific cache affinity rule
Create a Secret.
```yaml
apiVersion: v1
kind: Secret
metadata:
  name: mysecret
stringData:
  fs.oss.accessKeyId: <ACCESS_KEY_ID>
  fs.oss.accessKeySecret: <ACCESS_KEY_SECRET>
```
Create a Dataset and a Runtime object.
Important: In this example, a JindoRuntime is created. To use other cache runtimes, see Use EFC to accelerate access to NAS or CPFS. For more information about how to use JindoFS to accelerate access to Object Storage Service (OSS), see Use JindoFS to accelerate access to OSS.
```yaml
apiVersion: data.fluid.io/v1alpha1
kind: Dataset
metadata:
  name: demo-dataset
spec:
  mounts:
  - mountPoint: oss://<oss_bucket>/<bucket_dir>
    options:
      fs.oss.endpoint: <oss_endpoint>
    name: hadoop
    path: "/"
    encryptOptions:
    - name: fs.oss.accessKeyId
      valueFrom:
        secretKeyRef:
          name: mysecret
          key: fs.oss.accessKeyId
    - name: fs.oss.accessKeySecret
      valueFrom:
        secretKeyRef:
          name: mysecret
          key: fs.oss.accessKeySecret
---
apiVersion: data.fluid.io/v1alpha1
kind: JindoRuntime
metadata:
  name: demo-dataset
spec:
  replicas: 2
  tieredstore:
    levels:
    - mediumtype: MEM
      path: /dev/shm
      quota: 10G
      high: "0.99"
      low: "0.8"
```
Create an application pod.
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx
  labels:
    fuse.serverful.fluid.io/inject: "true"
    fluid.io/dataset.demo-dataset.sched: required
spec:
  containers:
  - name: nginx
    image: registry.openanolis.cn/openanolis/nginx:1.14.1-8.6
    volumeMounts:
    - mountPath: /data
      name: data-vol
  volumes:
  - name: data-vol
    persistentVolumeClaim:
      claimName: demo-dataset
```
The following parameters are used to force pod scheduling based on a node-specific cache affinity rule.
| Parameter | Description |
| --- | --- |
| fuse.serverful.fluid.io/inject: "true" | Enables Fluid to inject cache affinity rules into the pod specification. |
| fluid.io/dataset.<dataset_name>.sched: required | Specifies the Dataset named <dataset_name> for which a forced (required) node-specific affinity rule is injected. |
| claimName | The PVC that is mounted to the pod. The PVC is automatically created by Fluid and named after the Dataset that you created. |
Check the affinity settings in the pod specification.
kubectl get pod nginx -oyaml
Expected output:
```yaml
apiVersion: v1
kind: Pod
metadata:
  labels:
    fluid.io/dataset.demo-dataset.sched: required
    fuse.serverful.fluid.io/inject: "true"
  name: nginx
  namespace: default
  ...
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: fluid.io/s-default-demo-dataset
            operator: In
            values:
            - "true"
```
A forced node-specific cache affinity rule (fluid.io/s-default-demo-dataset) is injected into the pod specification.