Container Service for Kubernetes: How to enable GPU acceleration for DirectX in Windows containers

Last Updated: Nov 28, 2024

GPUs provide higher parallel computing power than CPUs for workloads on Windows nodes and can accelerate operations by orders of magnitude. This reduces costs and improves throughput. Windows containers support GPU acceleration for Direct eXtension (DirectX) and all the frameworks that are built on top of DirectX. This topic describes how to install the DirectX device plug-in on Windows nodes and how to enable GPU acceleration for DirectX.

Prerequisites

Introduction

DirectX is a collection of APIs that improves execution efficiency and enhances 3D graphics and sound effects for Windows-based games and multimedia programs. It provides designers with a common hardware driver standard, which simplifies installation and setup. DirectX allows you to use GPUs to handle parallel and compute-intensive tasks. It also reduces overhead and optimizes the use of GPUs as parallel processors.

Step 1: Create an elastic Windows node pool with GPU acceleration

Create a standard Windows node pool

  1. Activate the GRID driver with a license. You can install a GRID driver in the following two ways:

  2. Create a Windows node pool that meets the following requirements. For more information, see Create a Windows node pool.

Create an elastic Windows node pool

By default, ACK supports only ECS public images as node images. To create an elastic Windows node, you must use a custom image. Perform the following steps:

  1. Submit a ticket to request a shared Windows image with a GRID driver that has an activated license. Only Windows Server 2019 and Windows Server 2022 are supported. In the ticket, specify the Windows version and any other requirements that you have.

  2. Create a Windows node pool that meets the following requirements. For more information, see Create a Windows node pool.

    1. Instance type: GPU-accelerated compute-optimized instance types or vGPU-accelerated instance types. For more information about supported instance types, see GPU-accelerated compute-optimized instance families or vGPU-accelerated instance families.

    2. Operating system: Select the OS based on your business requirements. Example: Windows Server 2022.

    3. Custom image: Select the image that you requested.
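After the node pool is created, it can help to confirm that the new Windows nodes have joined the cluster and carry the labels that the DaemonSet in Step 2 selects on. The following is a minimal sketch rather than an official procedure. The function name list_directx_nodes is hypothetical, the label keys are taken from the DaemonSet manifest in this topic, and whether these labels are set automatically on your nodes is an assumption that you should verify in your own cluster.

```shell
# Hypothetical helper: list Windows nodes together with the two labels
# that the DirectX device plug-in DaemonSet requires in its nodeAffinity.
list_directx_nodes() {
  kubectl get nodes \
    -l kubernetes.io/os=windows \
    -L windows.alibabacloud.com/deployment-topology \
    -L windows.alibabacloud.com/directx-supported
}
```

Call list_directx_nodes after the node pool is ready. According to the nodeAffinity rules in Step 2, a node is eligible for the plug-in only if the two label columns show 2.0 and true.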

Step 2: Install the DirectX device plug-in on Windows nodes

Deploy the DirectX device plug-in as a DaemonSet on Windows nodes.

  1. Create a file named directx-device-plugin-windows.yaml and copy the following code to the file:

    apiVersion: apps/v1
    kind: DaemonSet
    metadata:
      labels:
        k8s-app: directx-device-plugin-windows
      name: directx-device-plugin-windows
      namespace: kube-system
    spec:
      revisionHistoryLimit: 10
      selector:
        matchLabels:
          k8s-app: directx-device-plugin-windows
      template:
        metadata:
          annotations:
            scheduler.alpha.kubernetes.io/critical-pod: ""
          labels:
            k8s-app: directx-device-plugin-windows
        spec:
          tolerations:
            - operator: Exists
          # since 1.18, we can specify "hostNetwork: true" for Windows workloads, so we can deploy an application without NetworkReady.
          hostNetwork: true
          affinity:
            nodeAffinity:
              requiredDuringSchedulingIgnoredDuringExecution:
                nodeSelectorTerms:
                  - matchExpressions:
                      - key: type
                        operator: NotIn
                        values:
                          - virtual-kubelet
                      - key: beta.kubernetes.io/os
                        operator: In
                        values:
                          - windows
                      - key: windows.alibabacloud.com/deployment-topology
                        operator: In
                        values:
                          - "2.0"
                      - key: windows.alibabacloud.com/directx-supported
                        operator: In
                        values:
                          - "true"
                  - matchExpressions:
                      - key: type
                        operator: NotIn
                        values:
                          - virtual-kubelet
                      - key: kubernetes.io/os
                        operator: In
                        values:
                          - windows
                      - key: windows.alibabacloud.com/deployment-topology
                        operator: In
                        values:
                          - "2.0"
                      - key: windows.alibabacloud.com/directx-supported
                        operator: In
                        values:
                          - "true"
          containers:
            - name: directx
              command:
                - pwsh.exe
                - -NoLogo
                - -NonInteractive
                - -File
                - entrypoint.ps1
              # Modify the region information in the image address below according to the region of your cluster.
              image: registry-cn-hangzhou-vpc.ack.aliyuncs.com/acs/directx-device-plugin-windows:v1.0.0
              imagePullPolicy: IfNotPresent
              volumeMounts:
                - name: host-binary
                  mountPath: c:/host/opt/bin
                - name: wins-pipe
                  mountPath: \\.\pipe\rancher_wins
          volumes:
            - name: host-binary
              hostPath:
                path: c:/opt/bin
                type: DirectoryOrCreate
            - name: wins-pipe
              hostPath:
                path: \\.\pipe\rancher_wins

  2. Run the following command to deploy the directx-device-plugin-windows.yaml file and install the DirectX device plug-in:

    kubectl create -f directx-device-plugin-windows.yaml
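If your cluster is not in the cn-hangzhou region, the image address must be adjusted before the manifest is created, as the comment in the manifest notes. The following sketch shows one way to do this from the command line and then wait for the DaemonSet pods to become ready. Both function names are hypothetical, cn-beijing is only an illustrative region ID, and the 10-minute default timeout is an assumption.

```shell
# Hypothetical helper: rewrite the region segment of the ACK registry host.
# Reads a manifest on stdin and writes the adjusted manifest to stdout.
# Usage: set_image_region <region-id> < manifest.yaml
set_image_region() {
  sed -E "s#registry-[a-z0-9-]+-vpc\.ack\.aliyuncs\.com#registry-$1-vpc.ack.aliyuncs.com#g"
}

# Hypothetical helper: block until every DaemonSet pod is ready,
# or fail after the given timeout (assumed default: 10 minutes).
wait_for_directx_plugin() {
  kubectl -n kube-system rollout status \
    daemonset/directx-device-plugin-windows --timeout="${1:-10m}"
}

# Example (cn-beijing is only an illustrative region ID):
#   set_image_region cn-beijing < directx-device-plugin-windows.yaml | kubectl create -f -
#   wait_for_directx_plugin
```

Piping the rewritten manifest to kubectl leaves the file on disk unchanged, so the same file can be reused for clusters in other regions.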

Step 3: Deploy a Windows workload that has GPU acceleration enabled for DirectX

The DirectX device plug-in automatically adds the class/<interface class GUID> device to Windows containers so that they can access DirectX services on the Elastic Compute Service (ECS) host. For more information, see Devices in containers on Windows.

Add the following resources field (the lines marked with +) to the container of the Windows workload that requires GPU acceleration, and then redeploy the workload:

spec:
  ...
  template:
    ...
    spec:
      ...
      containers:
        - name: gpu-user
          ...
+         resources:
+           limits:
+             windows.alibabacloud.com/directx: "1"
+           requests:
+             windows.alibabacloud.com/directx: "1"
Important

The preceding configuration neither allocates all GPU resources on the ECS host to the container nor prevents other applications from accessing the GPUs on the ECS host. Instead, GPU resources are dynamically scheduled between the ECS host and containers. This means that you can run multiple Windows containers on the ECS host and each container can use DirectX hardware acceleration.

For more information about GPU acceleration in Windows containers, see GPU acceleration in Windows containers.
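For a workload that is already running, the same resources field can be added without editing the manifest by hand. The following sketch assumes that the workload is a Deployment, which this topic does not state. The function name add_directx_resource and its arguments are hypothetical. It uses a strategic merge patch, which merges containers by name, so other fields of the container are left unchanged. Because the patch changes the pod template, the Deployment rolls out new pods.

```shell
# Hypothetical helper: add the DirectX extended resource to one container
# of an existing Deployment by using a strategic merge patch.
# Usage: add_directx_resource <deployment> <container> [namespace]
add_directx_resource() {
  patch=$(printf '{"spec":{"template":{"spec":{"containers":[{"name":"%s","resources":{"limits":{"windows.alibabacloud.com/directx":"1"},"requests":{"windows.alibabacloud.com/directx":"1"}}}]}}}}' "$2")
  kubectl -n "${3:-default}" patch deployment "$1" --patch "$patch"
}

# Example (my-app is a placeholder Deployment name):
#   add_directx_resource my-app gpu-user
```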

Step 4: Verify whether GPU acceleration is enabled for the Windows workload

Run a sample Job to verify that the DirectX device plug-in is deployed on the Windows nodes and that GPU acceleration works for a Windows workload.

  1. Create a file named gpu-job-windows.yaml and copy the following code to the file:

    apiVersion: batch/v1
    kind: Job
    metadata:
      labels:
        k8s-app: gpu-job-windows
      name: gpu-job-windows
      namespace: default
    spec:
      parallelism: 1
      completions: 1
      backoffLimit: 3
      manualSelector: true
      selector:
        matchLabels:
          k8s-app: gpu-job-windows
      template:
        metadata:
          labels:
            k8s-app: gpu-job-windows
        spec:
          restartPolicy: Never
          affinity:
            nodeAffinity:
              requiredDuringSchedulingIgnoredDuringExecution:
                nodeSelectorTerms:
                  - matchExpressions:
                      - key: type
                        operator: NotIn
                        values:
                          - virtual-kubelet
                      - key: beta.kubernetes.io/os
                        operator: In
                        values:
                          - windows
                  - matchExpressions:
                      - key: type
                        operator: NotIn
                        values:
                          - virtual-kubelet
                      - key: kubernetes.io/os
                        operator: In
                        values:
                          - windows
          tolerations:
            - key: os
              value: windows
          containers:
            - name: gpu
              # Modify the region information in the image address below according to the region of your cluster.
              image: registry-cn-hangzhou-vpc.ack.aliyuncs.com/acs/sample-gpu-windows:v1.0.0
              imagePullPolicy: IfNotPresent
              resources:
                limits:
                  windows.alibabacloud.com/directx: "1"
                requests:
                  windows.alibabacloud.com/directx: "1"
    Note
    • Image registry-{region}-vpc.ack.aliyuncs.com/acs/sample-gpu-windows is a sample image for GPU acceleration in Windows containers provided by ACK. This image is built on top of Microsoft Windows. For more information, see microsoft-windows.

    • In this example, WinMLRunner is used to generate simulated input data. After GPU acceleration is enabled for the gpu-job-windows task, 100 evaluations are performed based on the Tiny YOLOv2 model to output the final performance data. Actual results may vary depending on your operating environment.

    • The image is 15.3 GB in size, so pulling it may take a long time when you deploy the application.

  2. Run the following command to deploy gpu-job-windows.yaml and create the sample application:

    kubectl create -f gpu-job-windows.yaml

  3. Run the following command to view the log of the gpu-job-windows Job:

    kubectl logs -f job/gpu-job-windows

    Expected output:

    INFO: Executing model of "tinyyolov2-7" 100 times within GPU driver ...
    
    Created LearningModelDevice with GPU: NVIDIA GRID T4-8Q
    Loading model (path = c:\data\tinyyolov2-7\model.onnx)...
    =================================================================
    Name: Example Model
    Author: OnnxMLTools
    Version: 0
    Domain: onnxconverter-common
    Description: The Tiny YOLO network from the paper 'YOLO9000: Better, Faster, Stronger' (2016), arXiv:1612.08242
    Path: c:\data\tinyyolov2-7\model.onnx
    Support FP16: false
    
    Input Feature Info:
    Name: image
    Feature Kind: Image (Height: 416, Width:  416)
    
    Output Feature Info:
    Name: grid
    Feature Kind: Float

    The line Created LearningModelDevice with GPU: NVIDIA GRID T4-8Q in the output shows that GPU acceleration is enabled for the gpu-job-windows application.
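The check above can also be scripted, for example in a pipeline that validates a new node pool. The following is a minimal sketch under stated assumptions: both function names are hypothetical, the 60-minute default timeout is a guess that leaves room for pulling the 15.3 GB image, and the check only looks for the GPU device line that appears in the expected output above.

```shell
# Hypothetical helper: succeed only if the log text on stdin contains the
# WinMLRunner line that names the GPU device.
log_shows_gpu() {
  grep -q 'Created LearningModelDevice with GPU'
}

# Hypothetical helper: wait for the sample Job to complete, then check its log.
# Usage: verify_gpu_job [timeout]   (assumed default: 60m)
verify_gpu_job() {
  kubectl wait --for=condition=complete job/gpu-job-windows \
    --timeout="${1:-60m}" || return 1
  kubectl logs job/gpu-job-windows | log_shows_gpu
}

# Example:
#   verify_gpu_job && echo "DirectX GPU acceleration detected"
```

The sketch addresses the Job as job/gpu-job-windows because the pod that the Job creates has a generated name suffix.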