Container Service for Kubernetes: Add a GPU-accelerated node

Last Updated: Nov 13, 2024

Container Service for Kubernetes (ACK) Edge clusters let you manage on-premises GPU resources through edge node pools. This topic describes how to add a GPU-accelerated node to an edge node pool in an ACK Edge cluster.

Prerequisites

Limits

  • Make sure that your cluster has a sufficient node quota. To increase the quota, submit an application in the Quota Center console. For more information about the quota limits of ACK Edge clusters, see Quotas and limits.

  • Adding a GPU-accelerated node requires access to some endpoints. Configure the security group on the node side so that it does not block this access. For more information, see Configure endpoints and IP routing for edge nodes.

  • When you add a GPU-accelerated node, select a supported GPU model from the following table. For more information about how to add a GPU-accelerated node, see Procedure. If your GPU model is not listed, submit a ticket for support.

    System architecture   GPU model           Edge Kubernetes cluster version
    AMD64/x86_64          Nvidia_Tesla_T4     ≥1.16.9-aliyunedge.1
    AMD64/x86_64          Nvidia_Tesla_P4     ≥1.16.9-aliyunedge.1
    AMD64/x86_64          Nvidia_Tesla_P100   ≥1.16.9-aliyunedge.1
    AMD64/x86_64          Nvidia_Tesla_V100   ≥1.18.8-aliyunedge.1
    AMD64/x86_64          Nvidia_Tesla_A10    ≥1.20.11-aliyunedge.1
    AMD64/x86_64          Nvidia_L40          ≥1.26.3-aliyun.1
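
The support matrix can be checked programmatically before you start the procedure. The following is a minimal Python sketch, not part of any ACK tooling: the names MIN_VERSION, parse_version, and is_supported are hypothetical, and the comparison looks only at the numeric major.minor.patch prefix, ignoring the build suffix (such as -aliyunedge.1) as a simplifying assumption.

```python
import re

# Minimum edge Kubernetes version per GPU model, copied from the table above.
MIN_VERSION = {
    "Nvidia_Tesla_T4": "1.16.9",
    "Nvidia_Tesla_P4": "1.16.9",
    "Nvidia_Tesla_P100": "1.16.9",
    "Nvidia_Tesla_V100": "1.18.8",
    "Nvidia_Tesla_A10": "1.20.11",
    "Nvidia_L40": "1.26.3",
}


def parse_version(version):
    """Return (major, minor, patch) from a string such as '1.20.11-aliyunedge.1'."""
    match = re.match(r"v?(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.groups())


def is_supported(gpu_model, cluster_version):
    """True if the model is in the table and the cluster meets its minimum version."""
    minimum = MIN_VERSION.get(gpu_model)
    if minimum is None:
        # Unlisted models require a support ticket, so treat them as unsupported here.
        return False
    # Compare as integer tuples; a plain string comparison would rank '1.9' above '1.16'.
    return parse_version(cluster_version) >= parse_version(minimum)


if __name__ == "__main__":
    print(is_supported("Nvidia_Tesla_V100", "1.20.11-aliyunedge.1"))
```

Comparing integer tuples instead of raw strings keeps a version such as 1.9.x from sorting above 1.16.x.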

Procedure

  1. Log on to the ACK console. In the left-side navigation pane, click Clusters.

  2. On the Clusters page, find the cluster that you want to manage and click its name. In the left-side navigation pane, choose Nodes > Node Pools.

  3. On the Node Pools page, find the node pool that you want to manage and choose More > Add Existing Node in the Actions column.

  4. On the Select Existing ECS Instance wizard page, choose Manual and select an existing instance.

  5. Click Next Step to go to the Specify Instance Information wizard page. You can set the parameters that are used to add the node. For more information about the parameters, see Parameters.

    Note
    • You must configure the gpuVersion parameter in the script to connect the node to the cloud. For more information about the supported GPU models, see Limits.

    • After you configure the parameters, the connection tool automatically installs nvidia-containerd-runtime. For more information, see nvidia-containerd-runtime.

  6. After you set the parameters, click Next Step. On the Complete wizard page, click Copy to copy the script. Then, run the script on the edge node that you want to add.

    If the following result is returned, the node is added to the cluster.

    接入成功 (Chinese for "connected successfully")
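
After the script reports success, you may also want to confirm from the cluster side that the node is Ready and exposes GPUs. The following Python sketch reads the JSON that `kubectl get nodes -o json` produces and lists Ready nodes with allocatable GPUs. It assumes that GPUs are advertised under the resource name nvidia.com/gpu, as the NVIDIA device plugin for Kubernetes does; this topic does not confirm that for ACK Edge. The function name gpu_nodes and the node names in the sample are hypothetical.

```python
import json


def gpu_nodes(node_list, resource="nvidia.com/gpu"):
    """Map node name -> allocatable GPU count for Ready nodes that advertise the resource.

    node_list is the parsed output of `kubectl get nodes -o json`.
    The default resource name is an assumption, not taken from this topic.
    """
    result = {}
    for node in node_list.get("items", []):
        status = node.get("status", {})
        ready = any(
            cond.get("type") == "Ready" and cond.get("status") == "True"
            for cond in status.get("conditions", [])
        )
        count = int(status.get("allocatable", {}).get(resource, "0"))
        if ready and count > 0:
            result[node["metadata"]["name"]] = count
    return result


# Hypothetical sample that mimics the structure of the kubectl JSON output.
SAMPLE = json.loads("""
{
  "items": [
    {"metadata": {"name": "edge-gpu-node-1"},
     "status": {"conditions": [{"type": "Ready", "status": "True"}],
                "allocatable": {"cpu": "8", "nvidia.com/gpu": "1"}}},
    {"metadata": {"name": "edge-cpu-node-1"},
     "status": {"conditions": [{"type": "Ready", "status": "True"}],
                "allocatable": {"cpu": "4"}}}
  ]
}
""")

if __name__ == "__main__":
    print(gpu_nodes(SAMPLE))
```

To run this against a real cluster, save the output of `kubectl get nodes -o json` to a file and pass the result of `json.load` to gpu_nodes instead of SAMPLE.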

References

  • If you encounter problems when you add edge nodes, see Diagnose edge node problems.

  • For more information about how to remove an edge node, see Remove edge nodes.

  • ACK Edge clusters support edge node autonomy. Edge node autonomy ensures that applications on an edge node can still run as expected when the edge node is disconnected from the cloud. For more information, see Configure edge node autonomy.