The Container Network File System (CNFS) client allows you to access data over multiple connections, cache metadata, and cache data in a distributed manner to increase read speeds. The CNFS client also supports performance monitoring and quality of service (QoS). This topic describes how to enable the distributed caching feature of the CNFS client and use it to increase read speeds.
Prerequisites
Alibaba Cloud Linux 2 with a kernel version from 4.19.91-23 to 4.19.91-26 is used. The distributed caching feature is supported on Alibaba Cloud Linux 2.
A Container Service for Kubernetes (ACK) cluster that runs Kubernetes 1.20 or later is created. The Container Storage Interface (CSI) plug-in is used as the volume plug-in. For more information, see Create an ACK managed cluster.
The versions of csi-plugin and csi-provisioner are v1.22.11-abbb810e-aliyun or later. For more information about how to update csi-plugin and csi-provisioner, see Install and update the CSI plug-in.
The version of storage-operator is v1.22.86-041b094-aliyun or later. For more information about how to update storage-operator, see Manage components.
A kubectl client is connected to the cluster. For more information, see Obtain the kubeconfig file of a cluster and use kubectl to connect to the cluster.
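To check the installed component versions, you can inspect the container images of the components. The following commands assume the default ACK component names (the csi-plugin DaemonSet and the csi-provisioner Deployment in the kube-system namespace):

```bash
# Print the images (and therefore the versions) of csi-plugin and csi-provisioner
kubectl -n kube-system get ds csi-plugin -o jsonpath='{.spec.template.spec.containers[*].image}'; echo
kubectl -n kube-system get deploy csi-provisioner -o jsonpath='{.spec.template.spec.containers[*].image}'; echo
```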
Key performance indicators of the distributed caching feature
| Indicator | Benchmarking scenario① | Distributed caching disabled | Distributed caching enabled |
| --- | --- | --- | --- |
| Read and write performance on metadata | Duration of traversing one million directories | 18 minutes | < 60 seconds |
| Read and write performance on metadata | Duration of creating a 4 KB file | 3,000 microseconds | < 200 microseconds |
| Read and write performance on metadata | Duration of reading a 4 KB file for the second time | 400 microseconds | < 100 microseconds |
| Read and write throughput | Read and write throughput of a single node② | 200 to 500 MB/s | > 800 MB/s |
| Overall performance in comprehensive scenarios | Duration of extracting 5,000 images of 150 KB each | 52 seconds | About 15 seconds |
| Overall performance in comprehensive scenarios | Duration of creating a Redis project | 27 seconds | About 21 seconds |
The values in the preceding table are reference values. The actual values depend on your operating environment.
Note ①: The type of Elastic Compute Service (ECS) instance that is used to run the benchmark test is ecs.hfg6.4xlarge. The benchmark data may vary based on the environment.
Note ②: The bandwidth of the ECS instance and the type of the File Storage NAS (NAS) file system impact the read and write throughput of the node.
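As a rough illustration of the metadata scenarios (not the exact benchmark methodology), you can time the same operations on a mounted volume. The /data path is an assumption; replace it with your mount path:

```bash
# Scenario: traverse a directory tree (metadata-read performance)
time ls -R /data > /dev/null

# Scenario: create a 4 KB file (metadata-write performance)
time dd if=/dev/zero of=/data/4k.tmpfile bs=4k count=1 conv=fsync
```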
Step 1: Mount a NAS file system that has distributed caching enabled
Run the following command to create and deploy the ConfigMap of csi-plugin in the cluster and install the CNFS client:
```yaml
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: ConfigMap
metadata:
  name: csi-plugin
  namespace: kube-system
data:
  cnfs-client-properties: |
    nas-efc=true
  nas-efc-cache: |
    enable=true
    container-number=3
    volume-type=memory
    volume-size=15Gi
  node-selector: |
    cache=true
EOF
```
| Parameter | Description |
| --- | --- |
| nas-efc-cache.enable | Specifies whether to enable distributed caching. If you specify nas-efc-cache.enable=true, the distributed caching feature is enabled to increase read speeds. |
| nas-efc-cache.container-number | This parameter is required if you enable distributed caching. It specifies the number of containers that the DaemonSet creates for the caching feature. If the caching feature reaches a performance bottleneck, you can increase the value of this parameter. |
| nas-efc-cache.volume-type | This parameter is required if you enable distributed caching. It specifies the medium that is used by the emptyDir volume of the pods created by the DaemonSet. Valid values: Disk and Memory. |
| nas-efc-cache.volume-size | This parameter is required if you enable distributed caching. It specifies the total size of the cache volume. Unit: GiB. |
| cnfs-client-properties | The dependency of the caching feature. If you want to enable the distributed caching feature, specify cnfs-client-properties.nas-efc=true. |
| node-selector | Schedules the pods that the DaemonSet creates for the distributed caching feature based on node labels. If you do not specify this parameter, the DaemonSet pods are scheduled to every node in the cluster. |
Important: After you set the medium to disk or memory, the data disk or memory resources of the node are consumed. Make sure that this does not adversely affect your workloads.
In this example, the DaemonSet for the distributed caching feature creates three containers, each of which is mounted with a 5 GiB tmpfs volume. The containers can be scheduled only to nodes that have the cache=true label, which you can add as shown below.
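For example, the following standard kubectl command adds the label to a node (the node name is illustrative; replace it with a node in your cluster):

```bash
kubectl label node cn-hangzhou.192.168.0.100 cache=true
```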
After the ConfigMap is configured, the system automatically deploys the DaemonSet and Service based on the ConfigMap.
Run the following command to restart the csi-plugin pods so that the caching dependencies are installed:
```bash
kubectl get pod -n kube-system -o wide | grep csi-plugin | awk '{print $1}' | xargs kubectl -n kube-system delete pod
```
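You can then confirm that the csi-plugin pods have been recreated and are Running, for example:

```bash
kubectl -n kube-system get pod -o wide | grep csi-plugin
```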
Run the following command to enable the distributed caching feature of the CNFS client: create a NAS file system that is managed by CNFS, and deploy a StatefulSet that mounts the CNFS-accelerated NAS volume as a dynamically provisioned volume. The busybox image is used in this example. After the pod is launched, it runs the dd command to write a file of 1 GB in size to the /data path. The file is later used to check whether the distributed caching feature takes effect.
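The manifest for this step is not included above. The following is a minimal sketch of what it can look like; the ContainerNetworkFileSystem fields, the nasplugin.csi.alibabacloud.com provisioner parameters, the g_tier_EnableClusterCache mount option, and all resource names are assumptions drawn from the ACK CSI documentation and should be verified against your cluster:

```yaml
cat <<EOF | kubectl apply -f -
apiVersion: storage.alibabacloud.com/v1beta1
kind: ContainerNetworkFileSystem
metadata:
  name: cnfs-nas-filesystem
spec:
  description: "CNFS-managed NAS file system"
  type: nas
  reclaimPolicy: Retain
  parameters:
    useClient: EFCClient            # assumption: use the CNFS (EFC) client instead of the plain NFS client
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: cnfs-nas-sc
provisioner: nasplugin.csi.alibabacloud.com
reclaimPolicy: Delete
allowVolumeExpansion: true
mountOptions:
  - g_tier_EnableClusterCache=true  # assumption: mount option that turns on the distributed cache
parameters:
  volumeAs: subpath
  containerNetworkFileSystem: cnfs-nas-filesystem
  path: "/"
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: cnfs-nas-sts
spec:
  serviceName: cnfs-nas-sts
  replicas: 1
  selector:
    matchLabels:
      app: cnfs-nas-sts
  template:
    metadata:
      labels:
        app: cnfs-nas-sts
    spec:
      containers:
      - name: busybox
        image: busybox
        command: ["/bin/sh", "-c"]
        # Write a 1 GB file to /data at startup; it is used later to verify the cache.
        args: ["dd if=/dev/zero of=/data/1G.tmpfile bs=1M count=1024; tail -f /dev/null"]
        volumeMounts:
        - name: data
          mountPath: /data
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: ["ReadWriteMany"]
      storageClassName: cnfs-nas-sc
      resources:
        requests:
          storage: 50Gi
EOF
```

The StatefulSet pod is named cnfs-nas-sts-0, which matches the commands used in the verification steps below.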
Run the following command to check whether the NAS volume that has distributed caching enabled is mounted:

```bash
kubectl exec cnfs-nas-sts-0 -- mount | grep /data
```
Expected output:
```
xxx.cn-xxx.nas.aliyuncs.com:/nas-6b9d1397-6542-4410-816b-4dfd0633****:2fMaQdxU on /data type alifuse.aliyun-alinas-eac (rw,nosuid,nodev,relatime,user_id=0,group_id=0,default_permissions,allow_other)
```
A mount target is displayed in the output. This indicates that the NAS volume that has distributed caching enabled is mounted.
Run the following command to check whether the DaemonSet for the distributed caching feature is launched:
```bash
kubectl get ds/cnfs-cache-ds -n kube-system -o wide
```
Expected output:
```
NAME            DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE   CONTAINERS              IMAGES                                                         SELECTOR
cnfs-cache-ds   3         3         3       3            3           <none>          19d   alinas-dadi-container   registry-vpc.cn-shenzhen.aliyuncs.com/acs/nas-cache:20220420   app=cnfs-cache-ds
```
In this example, the cluster contains three nodes. The three pods that are created by the DaemonSet are ready. This indicates that the DaemonSet for the distributed caching feature is launched.
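To see which nodes the cache pods run on, you can query by the app=cnfs-cache-ds label shown in the SELECTOR column of the preceding output, for example:

```bash
kubectl -n kube-system get pod -l app=cnfs-cache-ds -o wide
```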
Run the following command to check whether the distributed caching Service can discover the backend pods:
```bash
kubectl get ep cnfs-cache-ds-service -n kube-system -o wide
```
Expected output:
```
NAME                    ENDPOINTS                                          AGE
cnfs-cache-ds-service   10.19.1.130:6500,10.19.1.40:6500,10.19.1.66:6500   19d
```
The output shows that the Service has discovered the backend pods. The endpoints of the pods are 10.19.1.130, 10.19.1.40, and 10.19.1.66. The port is 6500.
Step 2: Verify the caching feature
Run the following command to copy the tmpfile file from the /data path to the / directory, and check the amount of time that is consumed. The size of the file is 1 GB.

```bash
kubectl exec cnfs-nas-sts-0 -- time cp /data/1G.tmpfile /
```
Expected output:
```
real    0m 5.66s
user    0m 0.00s
sys     0m 0.75s
```
The output shows that the first copy takes about 5.7 seconds. The cache is not hit on the first read, so this is comparable to the performance with distributed caching disabled.
Run the following command multiple times to check the amount of time that is required for copying the file:
```bash
kubectl exec cnfs-nas-sts-0 -- time cp /data/1G.tmpfile /
```
Expected output:
```
real    0m 0.79s
user    0m 0.00s
sys     0m 0.58s
```
The output shows that after the file has been accessed once and cached, copying it takes about 0.8 seconds, six to seven times faster than the first, uncached read.