Container Service for Kubernetes: Configure GPU sharing without GPU memory isolation

Last Updated: Oct 29, 2024

In some scenarios, you may need to use GPU sharing without GPU memory isolation. For example, some applications, such as Java applications, allow you to specify the maximum amount of GPU memory that they can use. If these applications also run with the GPU memory isolation module provided by GPU sharing, the two memory limits conflict. To avoid this problem, you can choose not to install the GPU memory isolation module on nodes that are configured for GPU sharing. This topic describes how to configure GPU sharing without GPU memory isolation.

Prerequisites

Step 1: Create a node pool

  1. Log on to the ACK console. In the left-side navigation pane, click Clusters.

  2. On the Clusters page, click the name of the cluster that you want to manage and choose Nodes > Node Pools in the left-side navigation pane.

  3. In the upper-right corner of the Node Pools page, click Create Node Pool.

  4. In the Create Node Pool dialog box, configure the node pool and click Confirm Order. The following list describes the key parameters. For more information about other parameters, see Create a node pool.

    • Instance Type: Set Architecture to GPU-accelerated and select multiple GPU-accelerated instance types. In this example, instance types that use the V100 GPU are selected.

    • Expected Nodes: Specify the initial number of nodes in the node pool. If you do not want to create nodes in the node pool, set this parameter to 0.

    • Node Label: Click the add icon, set Key to ack.node.gpu.schedule, and set Value to share. This label enables GPU sharing and scheduling on the nodes in the node pool.

    For more information about node labels, see Labels for enabling GPU scheduling policies.
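
    If you manage nodes with kubectl, the same label can also be applied to an existing GPU-accelerated node. The following command is a minimal sketch; <NODE_NAME> is a placeholder for the name of your node:

    # Label an existing node to enable GPU sharing without memory isolation.
    kubectl label node <NODE_NAME> ack.node.gpu.schedule=share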

Step 2: Submit a job

  1. Log on to the ACK console. In the left-side navigation pane, click Clusters.

  2. On the Clusters page, find the cluster that you want to manage and click its name. In the left-side pane, choose Workloads > Jobs.

  3. Click Create from YAML in the upper-right part of the page, copy the following content to the Template section, and then click Create:

    apiVersion: batch/v1
    kind: Job
    metadata:
      name: tensorflow-mnist-share
    spec:
      parallelism: 1
      template:
        metadata:
          labels:
            app: tensorflow-mnist-share
        spec:
          containers:
          - name: tensorflow-mnist-share
            image: registry.cn-beijing.aliyuncs.com/ai-samples/gpushare-sample:tensorflow-1.5
            command:
            - python
            - tensorflow-sample-code/tfjob/docker/mnist/main.py
            - --max_steps=100000
            - --data_dir=tensorflow-sample-code/data
            resources:
              limits:
                aliyun.com/gpu-mem: 4 # Request 4 GiB of GPU memory.
            workingDir: /root
          restartPolicy: Never

    YAML template description:

    • This YAML template defines a TensorFlow MNIST job. The job creates one pod, and the pod requests 4 GiB of GPU memory.

    • The aliyun.com/gpu-mem: 4 resource limit is added to request 4 GiB of GPU memory.
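
    If you prefer to submit the Job from the command line, the following sketch assumes that the YAML template above is saved as tensorflow-mnist-share.yaml (the file name is only an example) and that kubectl is configured for the cluster:

    # Submit the Job defined in the YAML template above.
    kubectl apply -f tensorflow-mnist-share.yaml
    # Check the pod created by the Job. The app label matches the YAML template.
    kubectl get pods -l app=tensorflow-mnist-share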

Step 3: Verify the configuration

  1. On the Clusters page, find the cluster that you want to manage and click its name. In the left-side pane, choose Workloads > Pods.

  2. Click Terminal in the Actions column of the pod that you created, such as tensorflow-mnist-share-***, to log on to the pod and run the following command:

    nvidia-smi

    Expected output:

    Wed Jun 14 06:45:56 2023
    +-----------------------------------------------------------------------------+
    | NVIDIA-SMI 515.105.01   Driver Version: 515.105.01   CUDA Version: 11.7     |
    |-------------------------------+----------------------+----------------------+
    | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
    | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
    |                               |                      |               MIG M. |
    |===============================+======================+======================|
    |   0  Tesla V100-SXM2...  On   | 00000000:00:09.0 Off |                    0 |
    | N/A   35C    P0    59W / 300W |    334MiB / 16384MiB |      0%      Default |
    |                               |                      |                  N/A |
    +-------------------------------+----------------------+----------------------+
    
    +-----------------------------------------------------------------------------+
    | Processes:                                                                  |
    |  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
    |        ID   ID                                                   Usage      |
    |=============================================================================|
    +-----------------------------------------------------------------------------+

    In this example, a V100 GPU is used. The output indicates that the pod can use all memory provided by the GPU, which is 16,384 MiB in size. This means that GPU sharing is implemented without GPU memory isolation. If GPU memory isolation is enabled, the memory size displayed in the output will equal the amount of memory requested by the pod, which is 4 GiB in this example.
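
    You can also run the same check without the console terminal. The following command is a sketch that assumes kubectl access to the cluster; <POD_NAME> is a placeholder for the actual name of the pod:

    # Run nvidia-smi inside the pod created by the Job.
    kubectl exec -it <POD_NAME> -- nvidia-smi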

    The pod determines the amount of GPU memory that it can use based on the following environment variables:

    ALIYUN_COM_GPU_MEM_CONTAINER=4 # The amount of GPU memory that the pod can use, in GiB.
    ALIYUN_COM_GPU_MEM_DEV=16 # The total memory size of each GPU, in GiB.

    To calculate the ratio of the GPU memory that the pod can use to the total GPU memory, use the following formula:

    percentage = ALIYUN_COM_GPU_MEM_CONTAINER / ALIYUN_COM_GPU_MEM_DEV = 4 / 16 = 0.25
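
    For example, you can print these variables and compute the ratio from the pod's terminal. The following is a minimal sketch that uses awk only to perform the division:

    echo $ALIYUN_COM_GPU_MEM_CONTAINER # 4
    echo $ALIYUN_COM_GPU_MEM_DEV # 16
    # Divide the requested GPU memory by the total GPU memory.
    awk "BEGIN {print $ALIYUN_COM_GPU_MEM_CONTAINER / $ALIYUN_COM_GPU_MEM_DEV}" # 0.25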