
Container Service for Kubernetes: Enable the distributed caching feature of the CNFS client

Last Updated: Nov 01, 2024

The Container Network File System (CNFS) client allows you to access data through multiple connections, cache metadata, and cache data in a distributed manner to increase read speeds. The CNFS client also supports performance monitoring and quality of service (QoS). This topic describes how to enable the distributed caching feature of the CNFS client and use it to increase read speeds.

Prerequisites

  • Alibaba Cloud Linux 2 with a kernel version from 4.19.91-23 to 4.19.91-26 is used. The distributed caching feature supports only Alibaba Cloud Linux 2.

  • A Container Service for Kubernetes (ACK) cluster that runs Kubernetes 1.20 or later is created. The Container Storage Interface (CSI) plug-in is used as the volume plug-in. For more information, see Create an ACK managed cluster.

  • The versions of csi-plugin and csi-provisioner are v1.22.11-abbb810e-aliyun or later. For more information about how to update csi-plugin and csi-provisioner, see Install and update the CSI plug-in.

  • The version of storage-operator is v1.22.86-041b094-aliyun or later. For more information about how to update storage-operator, see Manage components.

  • A kubectl client is connected to the cluster. For more information, see Obtain the kubeconfig file of a cluster and use kubectl to connect to the cluster.
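
You can verify the component versions before you proceed. The following commands are a minimal sketch: the first lists the images of the CSI pods (the image tags contain the csi-plugin and csi-provisioner versions), and the second shows the storage-operator image. The Deployment name storage-operator is an assumption and may differ in your cluster.

    # List the images of the CSI pods; the image tags carry the component versions.
    kubectl -n kube-system get pods -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.containers[*].image}{"\n"}{end}' | grep csi-

    # Show the storage-operator image (the Deployment name is an assumption).
    kubectl -n kube-system get deployment storage-operator -o jsonpath='{.spec.template.spec.containers[0].image}'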

Key performance indicators of the distributed caching feature

| Indicator | Benchmarking scenario | Distributed caching disabled | Distributed caching enabled |
| --- | --- | --- | --- |
| Read and write performance on metadata | Duration of traversing one million directories | 18 minutes | < 60 seconds |
| | Duration of creating a 4 KB file | 3,000 microseconds | < 200 microseconds |
| | Duration of reading a 4 KB file for the second time | 400 microseconds | < 100 microseconds |
| Read and write throughput | Read and write throughput of a single node | 200 to 500 MB/s | > 800 MB/s |
| Overall performance in comprehensive scenarios | Duration of extracting 5,000 images, each 150 KB in size | 52 seconds | About 15 seconds |
| | Duration of creating a Redis project | 27 seconds | About 21 seconds |

Important

The values provided in the preceding table are for reference only. The actual values depend on your operating environment.

  • Note ①: The benchmark tests were run on an Elastic Compute Service (ECS) instance of the ecs.hfg6.4xlarge type. The benchmark data may vary based on the environment.

  • Note ②: The read and write throughput of a node is affected by the bandwidth of the ECS instance and the type of the File Storage NAS (NAS) file system.

Step 1: Mount a NAS file system that has distributed caching enabled

  1. Run the following command to create the csi-plugin ConfigMap in the cluster and install the CNFS client:

    cat <<EOF | kubectl apply -f -
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: csi-plugin
      namespace: kube-system
    data:
      cnfs-client-properties: |
        nas-efc=true
      nas-efc-cache: |
        enable=true
        container-number=3
        volume-type=memory
        volume-size=15Gi
      node-selector: |
        cache=true
    EOF

    | Parameter | Description |
    | --- | --- |
    | nas-efc-cache.enable | Specifies whether to enable distributed caching. If you set this parameter to true, the distributed caching feature is enabled to increase read speeds. |
    | nas-efc-cache.container-number | Required if you enable distributed caching. The number of containers that the DaemonSet creates for the caching feature. If the caching feature reaches a performance bottleneck, increase the value of this parameter. |
    | nas-efc-cache.volume-type | Required if you enable distributed caching. The medium used by the emptyDir volumes of the pods created by the DaemonSet. Valid values: Disk and Memory. |
    | nas-efc-cache.volume-size | Required if you enable distributed caching. The size of the cache volume. Unit: GiB. |
    | cnfs-client-properties | The dependency of the caching feature. To enable the distributed caching feature, specify nas-efc=true. |
    | node-selector | The label based on which the pods created by the DaemonSet for the distributed caching feature are scheduled. If you do not specify this parameter, the pods are scheduled to all nodes in the cluster. |

    Important
    • If you set volume-type to Disk or Memory, the data disk or memory resources of the node are consumed. Make sure that this does not adversely affect your workloads.

    • In this example, the DaemonSet for the distributed caching feature creates three containers, each of which is mounted with a 5 GiB tmpfs volume. The containers can be scheduled only to nodes that have the cache=true label (a labeling command is shown below).

    After the ConfigMap is configured, the system automatically deploys the DaemonSet and Service based on the ConfigMap.
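
    In this example, node-selector is set to cache=true, so the cache pods can run only on nodes that carry this label. You can add the label with a standard kubectl command; <node-name> is a placeholder for one of your node names:

    # Label a node so that the caching DaemonSet can schedule pods to it.
    kubectl label node <node-name> cache=true

    # Confirm which nodes carry the label.
    kubectl get nodes -l cache=true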

  2. Run the following command to restart the csi-plugin pods so that the dependencies are installed:

    kubectl get pod -nkube-system -owide | grep csi-plugin | awk '{print $1}' | xargs kubectl -nkube-system delete pod
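
    When the csi-plugin pods are recreated and return to the Running state, the dependencies are installed. A quick way to watch this, reusing the pod listing from the command above:

    kubectl get pod -nkube-system -owide | grep csi-plugin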
  3. Run the following command to enable the distributed caching feature of the CNFS client:

    Create a NAS file system that is managed by CNFS and a StatefulSet that mounts the CNFS-accelerated NAS volume as a dynamically provisioned volume. The busybox image is used in this example. After the pod is launched, it runs the dd command to write a 1 GB file to the /data path. The file is used to check whether the distributed caching feature takes effect.

    cat << EOF | kubectl apply -f -
    apiVersion: storage.alibabacloud.com/v1beta1
    kind: ContainerNetworkFileSystem
    metadata:
      name: cnfs-nas-filesystem
    spec:
      description: "cnfs"
      type: nas
      reclaimPolicy: Retain
      parameters:
        filesystemType: standard
        storageType: Capacity
        protocolType: NFS
        encryptType: None
        enableTrashCan: "true"
        trashCanReservedDays: "5"
        useClient: "EFCClient" # Use the EFC client to mount a NAS file system that has distributed caching enabled. 
    ---
    apiVersion: storage.k8s.io/v1
    kind: StorageClass
    metadata:
      name: alibabacloud-cnfs-nas-sc
    mountOptions:
      - g_tier_EnableClusterCache=true              # Configure caching settings when mounting the volume. 
      - g_tier_EnableClusterCachePrefetch=true      # Configure prefetching when mounting the volume. 
    parameters:
      volumeAs: subpath
      containerNetworkFileSystem: cnfs-nas-filesystem
      path: "/"
    provisioner: nasplugin.csi.alibabacloud.com
    reclaimPolicy: Retain
    allowVolumeExpansion: true
    ---
    apiVersion: apps/v1
    kind: StatefulSet
    metadata:
      name: cnfs-nas-sts
      labels:
        app: busybox
    spec:
      serviceName: "busybox"
      replicas: 1
      selector:
        matchLabels:
          app: busybox
      template:
        metadata:
          labels:
            app: busybox
        spec:
          containers:
          - name: busybox
            image: busybox
            command: ["/bin/sh"]
            args: ["-c", "dd if=/dev/zero of=/data/1G.tmpfile bs=1G count=1;sleep 3600;"]
            volumeMounts:
            - mountPath: "/data"
              name: www
      volumeClaimTemplates:
      - metadata:
          name: www
        spec:
          accessModes: [ "ReadWriteOnce" ]
          storageClassName: "alibabacloud-cnfs-nas-sc"
          resources:
            requests:
              storage: 50Gi
    EOF
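
    Before you check the mount, you can confirm that the volume is provisioned and the pod is running. Following the StatefulSet volume claim naming convention, the claim created from the www template for the first replica is expected to be named www-cnfs-nas-sts-0:

    # Check that the dynamically provisioned NAS volume is bound.
    kubectl get pvc www-cnfs-nas-sts-0

    # Check that the StatefulSet pod is running.
    kubectl get pod cnfs-nas-sts-0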
  4. Run the following command to check whether the NAS volume that has distributed caching enabled is mounted:

    kubectl exec cnfs-nas-sts-0 -- mount | grep /data

    Expected output:

    xxx.cn-xxx.nas.aliyuncs.com:/nas-6b9d1397-6542-4410-816b-4dfd0633****:2fMaQdxU on /data type alifuse.aliyun-alinas-eac (rw,nosuid,nodev,relatime,user_id=0,group_id=0,default_permissions,allow_other)

    A mount target is displayed in the output. This indicates that the NAS volume that has distributed caching enabled is mounted.

  5. Run the following command to check whether the DaemonSet for the distributed caching feature is launched:

    kubectl get ds/cnfs-cache-ds -n kube-system -owide

    Expected output:

    NAME             DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE   CONTAINERS              IMAGES                                                         SELECTOR
    cnfs-cache-ds   3         3         3       3            3           <none>          19d   alinas-dadi-container   registry-vpc.cn-shenzhen.aliyuncs.com/acs/nas-cache:20220420   app=cnfs-cache-ds

    In this example, the cluster contains three nodes. The three pods that are created by the DaemonSet are ready. This indicates that the DaemonSet for the distributed caching feature is launched.

  6. Run the following command to check whether the distributed caching Service can discover the backend pods:

    kubectl get ep cnfs-cache-ds-service -n kube-system -owide

    Expected output:

    NAME                     ENDPOINTS                                          AGE
    cnfs-cache-ds-service   10.19.1.130:6500,10.19.1.40:6500,10.19.1.66:6500   19d

    The output shows that the Service has discovered the backend pods. The endpoints of the pods are 10.19.1.130, 10.19.1.40, and 10.19.1.66. The port is 6500.
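
    You can also inspect the Service itself to confirm that it exposes port 6500:

    kubectl get svc cnfs-cache-ds-service -n kube-system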

Step 2: Verify the caching feature

  1. Run the following command to copy the 1 GB tmpfile file from the /data path to the / directory and check the amount of time consumed:

    kubectl exec cnfs-nas-sts-0 -- time cp /data/1G.tmpfile /

    Expected output:

    real    0m 5.66s
    user    0m 0.00s
    sys     0m 0.75s

    The output shows that the first copy takes about 5.7 seconds because the file has not been cached yet.

  2. Run the following command multiple times to check the amount of time required to copy the file after it is cached:

    kubectl exec cnfs-nas-sts-0 -- time cp /data/1G.tmpfile /

    Expected output:

    real    0m 0.79s
    user    0m 0.00s
    sys     0m 0.58s

    The output shows that after the file is cached, the time required to copy it drops from about 5.7 seconds to about 0.8 seconds, a six- to seven-fold speedup.
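
    To repeat the measurement without typing the command several times, you can wrap it in a small shell loop. The first iteration reads the file from NAS and warms the cache; subsequent iterations are served from the distributed cache:

    for i in 1 2 3; do
      kubectl exec cnfs-nas-sts-0 -- time cp /data/1G.tmpfile /
    done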