Container Service for Kubernetes (ACK) Edge clusters facilitate the management of on-premises GPU resources within edge node pools. This topic describes how to add a GPU-accelerated node to an edge node pool in an ACK Edge cluster.
Prerequisites
An ACK Edge cluster is created. For more information, see Create an ACK Edge cluster in the console.
A GPU driver is installed in the cluster before the node is added. For more information about driver versions, see NVIDIA driver versions supported by ACK.
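As a quick pre-check, the driver installation can be confirmed on the node by querying `nvidia-smi`. The following is a minimal sketch, assuming `nvidia-smi` is on the node's PATH once the driver is installed; the helper names `parse_gpu_query` and `installed_gpus` are illustrative and are not part of any ACK tooling.

```python
import shutil
import subprocess
from typing import List, Tuple


def parse_gpu_query(output: str) -> List[Tuple[str, str]]:
    """Parse lines of the form 'GPU name, driver version' into tuples."""
    gpus = []
    for line in output.splitlines():
        line = line.strip()
        if "," not in line:
            continue  # skip blank or unexpected lines
        name, _, driver = line.rpartition(",")
        gpus.append((name.strip(), driver.strip()))
    return gpus


def installed_gpus() -> List[Tuple[str, str]]:
    """Return (GPU name, driver version) pairs, or [] if no driver is found."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,driver_version", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0:
        return []
    return parse_gpu_query(result.stdout)
```

Calling `installed_gpus()` on the node returns an empty list when the driver is missing or not working, which is a signal to install the driver before you add the node.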
Limits
Make sure that your cluster has a sufficient node quota. To add more nodes, submit an application in the Quota Center console. For more information about the quota limits of ACK Edge clusters, see Quotas and limits.
Adding a GPU-accelerated node requires access to certain endpoints. Configure the security group on the node side so that access to these endpoints is not blocked. For more information, see Configure endpoints and IP routing for edge nodes.
When adding a GPU-accelerated node, select a supported GPU model from the following table. For more information about how to add a GPU-accelerated node, see Procedure. If your GPU model is not listed, submit a ticket for support.
| System architecture | GPU model | Edge Kubernetes cluster version |
| --- | --- | --- |
| AMD64/x86_64 | Nvidia_Tesla_T4 | ≥1.16.9-aliyunedge.1 |
| AMD64/x86_64 | Nvidia_Tesla_P4 | ≥1.16.9-aliyunedge.1 |
| AMD64/x86_64 | Nvidia_Tesla_P100 | ≥1.16.9-aliyunedge.1 |
| AMD64/x86_64 | Nvidia_Tesla_V100 | ≥1.18.8-aliyunedge.1 |
| AMD64/x86_64 | Nvidia_Tesla_A10 | ≥1.20.11-aliyunedge.1 |
| AMD64/x86_64 | Nvidia_L40 | ≥1.26.3-aliyun.1 |
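The compatibility check described by the list of supported models can be sketched in code. This example copies the minimum versions listed above and compares them with a given cluster version. It is only an illustration: the function names are hypothetical, and it assumes that versions can be ordered by their numeric `major.minor.patch` parts followed by the trailing build number.

```python
import re
from typing import Tuple

# Minimum Edge Kubernetes cluster version per GPU model, copied from the list above.
MIN_CLUSTER_VERSION = {
    "Nvidia_Tesla_T4": "1.16.9-aliyunedge.1",
    "Nvidia_Tesla_P4": "1.16.9-aliyunedge.1",
    "Nvidia_Tesla_P100": "1.16.9-aliyunedge.1",
    "Nvidia_Tesla_V100": "1.18.8-aliyunedge.1",
    "Nvidia_Tesla_A10": "1.20.11-aliyunedge.1",
    "Nvidia_L40": "1.26.3-aliyun.1",
}

_VERSION_RE = re.compile(r"^v?(\d+)\.(\d+)\.(\d+)(?:-[A-Za-z]+\.(\d+))?$")


def version_key(version: str) -> Tuple[int, int, int, int]:
    """Turn '1.16.9-aliyunedge.1' into (1, 16, 9, 1) for ordering.

    Assumption: the text part of the suffix is ignored and only the
    numeric parts are compared.
    """
    match = _VERSION_RE.match(version.strip())
    if match is None:
        raise ValueError(f"unrecognized version format: {version!r}")
    major, minor, patch, build = match.groups()
    return (int(major), int(minor), int(patch), int(build or 0))


def is_supported(gpu_model: str, cluster_version: str) -> bool:
    """Return True if the model is listed and the cluster version is new enough."""
    minimum = MIN_CLUSTER_VERSION.get(gpu_model)
    if minimum is None:
        return False  # model not listed: submit a ticket for support
    return version_key(cluster_version) >= version_key(minimum)
```

For example, `is_supported("Nvidia_Tesla_V100", "1.16.9-aliyunedge.1")` evaluates to `False` because the V100 requires version 1.18.8-aliyunedge.1 or later.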
Procedure
Log on to the ACK console. In the left-side navigation pane, click Clusters.
On the Clusters page, find the cluster that you want to manage and click its name. In the left-side navigation pane, go to the Node Pools page.
On the Node Pools page, find the node pool that you want to manage and, in the Actions column, choose the option to add an existing node.
On the Select Existing ECS Instance wizard page, choose Manual and select an existing instance.
Click Next Step to go to the Specify Instance Information wizard page. You can set the parameters that are used to add the node. For more information about the parameters, see Parameters.
Note: You must configure the gpuVersion parameter in the script to connect the node to the cloud. For more information about the supported GPU models, see Limits. After you configure the parameters, the connection tool automatically installs nvidia-containerd-runtime. For more information, see nvidia-containerd-runtime.
After you set the parameters, click Next Step. On the Complete wizard page, click Copy to copy the script, paste it on the edge node that you want to add, and then run the script on that node.
If the following result is returned, the node is added to the cluster.
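Separately from the script output, the registration can be cross-checked from the cluster side. The sketch below is illustrative only. It assumes that `kubectl` is configured for the cluster and that the GPU is advertised under the resource name `nvidia.com/gpu`, which is the name used by the NVIDIA device plugin; your cluster may expose a different resource name. The function names are hypothetical.

```python
import json
import subprocess
from typing import Dict

# Assumption: GPUs are exposed under the NVIDIA device plugin resource name.
GPU_RESOURCE = "nvidia.com/gpu"


def ready_gpu_nodes(node_list: dict) -> Dict[str, int]:
    """Map each Ready node that reports allocatable GPUs to its GPU count."""
    found = {}
    for node in node_list.get("items", []):
        status = node.get("status", {})
        ready = any(
            cond.get("type") == "Ready" and cond.get("status") == "True"
            for cond in status.get("conditions", [])
        )
        gpus = int(status.get("allocatable", {}).get(GPU_RESOURCE, "0"))
        if ready and gpus > 0:
            found[node["metadata"]["name"]] = gpus
    return found


def fetch_nodes() -> dict:
    """Return the parsed output of 'kubectl get nodes -o json'."""
    result = subprocess.run(
        ["kubectl", "get", "nodes", "-o", "json"],
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(result.stdout)


def main() -> None:
    """Print every Ready node that advertises at least one GPU."""
    for name, count in ready_gpu_nodes(fetch_nodes()).items():
        print(f"{name}: {count} GPU(s) allocatable")
```

Run `main()` from a machine that has `kubectl` access to the cluster. If the new node is missing from the output, it is either not yet Ready or not yet reporting GPU resources.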
References
If you have any problems when you add edge nodes, see Diagnose edge node problems.
For more information about how to remove an edge node, see Remove edge nodes.
ACK Edge clusters support edge node autonomy. Edge node autonomy ensures that applications on an edge node can still run as expected when the edge node is disconnected from the cloud. For more information, see Configure edge node autonomy.