You can use Fluid to accelerate data access in ACK Serverless clusters. All Fluid components, including the Fluid controllers and the cache runtime engine, as well as your application, can be deployed in an ACK Serverless cluster. This topic describes how to accelerate Jobs in ACK Serverless clusters.
Prerequisites
An ACK Serverless cluster is created and the Kubernetes version of the cluster is 1.18 or later. CoreDNS is installed in the cluster. For more information, see Create an ACK Serverless cluster.
A kubectl client is connected to the ACK cluster. For more information, see Connect to an ACK cluster by using kubectl.
Object Storage Service (OSS) is activated and a bucket is created. For more information, see Activate OSS and Create buckets.
Limits
This feature is mutually exclusive with the virtual node-based pod scheduling feature of ACK Serverless clusters. For more information about the virtual node-based pod scheduling feature, see Enable the virtual node-based pod scheduling policy for an ACK cluster.
Deploy the control plane components of Fluid
If you have installed the open-source version of Fluid, you must uninstall it before you can install the ack-fluid component.
Deploy the control plane components of Fluid.
Log on to the ACK console. In the left-side navigation pane, click Clusters.
On the Clusters page, find the cluster that you want to manage and click its name. In the left-side navigation pane, choose Applications > Helm.
On the Helm page, click Deploy.
On the Basic Information wizard page, configure the parameters and click Next.
The following table describes some parameters that are displayed.
Parameter
Description
Source
Select Marketplace.
Chart
Find and click ack-fluid.
Note: The default release name of the ack-fluid chart is ack-fluid and the default namespace is fluid-system. If you specify a different release name or namespace, the Confirm message appears after you click Next. Click Yes to use the default release name and namespace.
On the Parameters wizard page, click OK.
Run the following command to check whether Fluid is deployed:
kubectl get pod -n fluid-system
Expected output:
NAME                                 READY   STATUS    RESTARTS   AGE
dataset-controller-d99998f79-dgkmh   1/1     Running   0          2m48s
fluid-webhook-55c6d9d497-dmrzb       1/1     Running   0          2m49s
The output indicates that Fluid is deployed. The following content describes the control plane components of Fluid:
Dataset Controller: manages the lifecycle of the Dataset objects that are referenced by Fluid. Dataset objects are custom resource (CR) objects.
Fluid Webhook: performs sidecar injection on pods that need to access data. This makes data access transparent to users in serverless scenarios.
Note: In addition to the preceding components, the control plane of Fluid also includes controllers that manage the lifecycle of cache runtimes, such as the JindoFS, JuiceFS, and Alluxio runtimes. The controller for a cache runtime is deployed only after the runtime is used for the first time.
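If you want to check which controllers are currently running, you can list the Deployments in the fluid-system namespace. This is an optional, read-only check:
kubectl get deployments -n fluid-system
After you create a cache runtime object for the first time, an additional controller Deployment for that runtime (for example, a Deployment with a name similar to jindoruntime-controller) appears in the output.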
Examples of accelerating data access in an ACK Serverless cluster
Step 1: Upload the test dataset to the OSS bucket
Create a test dataset. In this example, a test dataset of 2 GB in size is used.
Upload the test dataset to the OSS bucket that you created.
You can use the ossutil tool provided by OSS to upload data. For more information, see Install ossutil.
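If you do not have a test dataset at hand, the following commands show one possible way to generate a 2 GB file and upload it by using ossutil. The file name test.data is an example, and the bucket name and path are placeholders that you must replace with your own values. This sketch assumes that ossutil is installed and configured with valid credentials:
# Generate a 2 GB file to use as the test dataset.
dd if=/dev/zero of=./test.data bs=1M count=2048
# Upload the file to the OSS bucket.
ossutil cp ./test.data oss://<bucket_name>/<bucket_path>/test.data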
Step 2: Create a Dataset object and a Runtime object
After you upload the test dataset to OSS, you can use Fluid to claim the dataset. Create a Dataset object (CR object) and a Runtime object (CR object).
The Dataset object is used to specify the URL of the test dataset that is uploaded to OSS.
The Runtime object is used to define and configure the cache system that is used.
Run the following command to create a Secret that stores the AccessKey pair used to access the OSS bucket. Replace <access_key_id> and <access_key_secret> with your AccessKey ID and AccessKey secret:
kubectl create secret generic oss-access-key \
  --from-literal=fs.oss.accessKeyId=<access_key_id> \
  --from-literal=fs.oss.accessKeySecret=<access_key_secret>
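To confirm that the Secret is created and contains both keys, you can optionally describe it. The key values are not displayed in the output:
kubectl describe secret oss-access-key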
Create a file named dataset.yaml and copy the following content to the file. The file is used to create a Dataset object and a Runtime object.
In this topic, the JindoRuntime is used to interface with JindoFS.
apiVersion: data.fluid.io/v1alpha1
kind: Dataset
metadata:
  name: demo-dataset
spec:
  mounts:
    - mountPoint: oss://<bucket_name>/<bucket_path>
      name: demo
      path: /
      options:
        fs.oss.endpoint: oss-<region>.aliyuncs.com  # The endpoint of the OSS bucket.
      encryptOptions:
        - name: fs.oss.accessKeyId
          valueFrom:
            secretKeyRef:
              name: oss-access-key
              key: fs.oss.accessKeyId
        - name: fs.oss.accessKeySecret
          valueFrom:
            secretKeyRef:
              name: oss-access-key
              key: fs.oss.accessKeySecret
---
apiVersion: data.fluid.io/v1alpha1
kind: JindoRuntime
metadata:
  name: demo-dataset
spec:
  # The number of worker pods to be created in the JindoFS cluster.
  replicas: 2
  worker:
    podMetadata:
      annotations:
        # Disable the virtual node-based pod scheduling policy.
        alibabacloud.com/burst-resource: eci_only
        # The type of instance that is used to run the worker pods.
        k8s.aliyun.com/eci-use-specs: <eci_instance_spec>
        # Use an image cache to accelerate pod creation.
        k8s.aliyun.com/eci-image-cache: "true"
  tieredstore:
    levels:
      # Specify 10 GiB of memory as the cache for each worker pod.
      - mediumtype: MEM
        volumeType: emptyDir
        path: /dev/shm
        quota: 10Gi
        high: "0.99"
        low: "0.99"
The following table describes some of the parameters.
Parameter
Description
mountPoint
The path to which the underlying file system (UFS) is mounted. The format of the path is oss://<oss_bucket>/<bucket_dir>. Do not include endpoint information in the path. If you use only one mount point, you can set path to /.
options
Information about the endpoint of the OSS bucket. You can specify a public or internal endpoint.
fs.oss.endpoint
The public or internal endpoint of the OSS bucket. You can specify the internal endpoint of the bucket to enhance data security. However, if you specify the internal endpoint, make sure that your cluster is deployed in the same region as the OSS bucket. For example, if your OSS bucket resides in the China (Hangzhou) region, the public endpoint of the bucket is oss-cn-hangzhou.aliyuncs.com and the internal endpoint is oss-cn-hangzhou-internal.aliyuncs.com.
fs.oss.accessKeyId
The AccessKey ID that is used to access the bucket.
fs.oss.accessKeySecret
The AccessKey secret that is used to access the bucket.
replicas
The number of worker pods that are created by the JindoRuntime. This parameter determines the maximum cache capacity that the distributed cache runtime can provide.
worker.podMetadata.annotations
The annotations of the worker pods. In this example, the annotations specify the instance type of the worker pods and enable an image cache.
tieredstore.levels
The cache configuration. The quota field specifies the maximum cache size for each worker pod.
tieredstore.levels.mediumtype
The cache type. Valid values: HDD, SSD, and MEM. For more information about the recommended mediumtype configurations, see Policy 2: Select proper cache media.
tieredstore.levels.volumeType
The volume type of the cache medium. Valid values: emptyDir and hostPath. Default value: hostPath. If you use memory or local system disks as the cache medium, we recommend that you use the emptyDir type to avoid leaving residual cache data on the node and to ensure node availability. If you use local data disks as the cache medium, you can use the hostPath type and set the path field to the mount path of the data disk on the host. For more information about the recommended volumeType configurations, see Policy 2: Select proper cache media.
tieredstore.levels.path
The path of the cache. You can specify only one path.
tieredstore.levels.quota
The maximum cache size. For example, a value of 100Gi indicates that the maximum cache size is 100 GiB.
tieredstore.levels.high
The upper watermark of the storage usage.
tieredstore.levels.low
The lower watermark of the storage usage.
Run the following command to create the Dataset object and the JindoRuntime object:
kubectl create -f dataset.yaml
Run the following command to check whether the Dataset object is created:
It requires 1 to 2 minutes to create the Dataset object and the JindoRuntime object. After the objects are created, you can query information about the cache system and cached data.
kubectl get dataset demo-dataset
Expected output:
NAME           UFS TOTAL SIZE   CACHED   CACHE CAPACITY   CACHED PERCENTAGE   PHASE   AGE
demo-dataset   1.16GiB          0.00B    20.00GiB         0.0%                Bound   2m58s
The output shows information about the Dataset object that you created in Fluid. The following table describes the parameters in the output.
Parameter
Description
UFS TOTAL SIZE
The size of the dataset that is uploaded to OSS.
CACHED
The size of the cached data.
CACHE CAPACITY
The total cache size.
CACHED PERCENTAGE
The percentage of the cached data in the dataset.
PHASE
The status of the Dataset object. If the value is Bound, the Dataset object is created.
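In addition to the Dataset object, you can optionally check the status of the cache runtime and the PersistentVolumeClaim (PVC) that Fluid creates with the same name as the Dataset object. The Job in Step 4 mounts this PVC. In the output of the first command, the worker phase is expected to be Ready:
kubectl get jindoruntime demo-dataset
kubectl get pvc demo-dataset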
(Optional) Step 3: Preheat data
Data prefetching can significantly accelerate first-time data access. We recommend that you use this feature if you access the data for the first time.
Create a file named dataload.yaml based on the following content:
apiVersion: data.fluid.io/v1alpha1
kind: DataLoad
metadata:
  name: data-warmup
spec:
  dataset:
    name: demo-dataset
    namespace: default
  loadMetadata: true
Run the following command to create a DataLoad object:
kubectl create -f dataload.yaml
Run the following command to query the status of the DataLoad object:
kubectl get dataload data-warmup
Expected output:
NAME          DATASET        PHASE      AGE   DURATION
data-warmup   demo-dataset   Complete   99s   58s
The DURATION column in the output shows that data prefetching took 58 seconds.
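To confirm that the data is cached, you can run the following command again. Compared with the output in Step 2, the CACHED and CACHED PERCENTAGE values are expected to be close to the UFS TOTAL SIZE value and 100%:
kubectl get dataset demo-dataset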
Step 4: Create a Job to test data access acceleration
You can create applications to test whether data access is accelerated by JindoFS, or submit machine learning Jobs to use relevant features. This section describes how to use a Job to access the data stored in OSS.
Create a file named job.yaml and copy the following content to the file:
apiVersion: batch/v1
kind: Job
metadata:
  name: demo-app
spec:
  template:
    metadata:
      labels:
        alibabacloud.com/fluid-sidecar-target: eci
      annotations:
        # Disable the virtual node-based pod scheduling policy.
        alibabacloud.com/burst-resource: eci_only
        # Select an instance type for the pod.
        k8s.aliyun.com/eci-use-specs: ecs.g7.4xlarge
    spec:
      containers:
        - name: demo
          image: anolis-registry.cn-zhangjiakou.cr.aliyuncs.com/openanolis/nginx:1.14.1-8.6
          command:
            - /bin/bash
          args:
            - -c
            - du -sh /data && time cp -r /data/ /tmp
          volumeMounts:
            - mountPath: /data
              name: demo
      restartPolicy: Never
      volumes:
        - name: demo
          persistentVolumeClaim:
            claimName: demo-dataset
  backoffLimit: 4
Run the following command to create a Job:
kubectl create -f job.yaml
Run the following command to query the log of the pod that is created by the Job. Replace demo-app-jwktf with the actual name of the pod in your cluster:
kubectl logs demo-app-jwktf -c demo
Expected output:
1.2G	/data

real	0m0.992s
user	0m0.004s
sys	0m0.674s
The real field in the output shows that it took 0.992 seconds (0m0.992s) to copy the data.
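The pod name generated by a Job is random. If you do not want to look up the pod name first, you can optionally query the log by using the job-name label that Kubernetes automatically adds to the pods of a Job:
kubectl logs -l job-name=demo-app -c demo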
Step 5: Clear data
After you test data access acceleration, clear the relevant data at the earliest opportunity.
Run the following command to delete the pods of the Job:
kubectl delete job demo-app
Run the following command to delete the Dataset object and the components of the corresponding cache runtime:
kubectl delete dataset demo-dataset
Important: It requires about 1 minute to delete the components. Before you perform the following step, make sure that the components are deleted.
Run the following command to delete the control plane components of Fluid:
kubectl get deployments.apps -n fluid-system | awk 'NR>1 {print $1}' | xargs kubectl scale deployments -n fluid-system --replicas=0
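You can optionally verify that the scale-down takes effect by listing the pods in the fluid-system namespace. The command is expected to return no running pods:
kubectl get pod -n fluid-system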
To enable data access acceleration again, you must run the following command to create the control plane components of Fluid before you create a Dataset object and a Runtime object:
kubectl scale -n fluid-system deployment dataset-controller --replicas=1
kubectl scale -n fluid-system deployment fluid-webhook --replicas=1