Before you install an NVIDIA driver on a node in a Container Service for Kubernetes (ACK) cluster, make sure that ACK supports the driver version. This topic describes the NVIDIA driver versions supported by ACK.
The following table describes the NVIDIA driver versions supported by ACK.
If the version of Alibaba Cloud Linux 3 that you use is 3.7 or later, use an NVIDIA driver that is released after October 2022. Alibaba Cloud Linux 3.7 was released on May 15, 2023.
For more information about the release notes for Alibaba Cloud Linux 3, see Release notes for Alibaba Cloud Linux 3.
For more information about driver versions and release dates, see the NVIDIA official website.
The XID 119 or XID 120 error occasionally occurs with NVIDIA driver versions 510 and later. If you encounter either error, see What do I do if the XID 119 or XID 120 error occurs and the GPU cannot be found?
ACK occasionally updates the default driver versions for different cluster versions. This may lead to changes in driver versions for newly added GPU nodes in your cluster. To avoid this, we recommend that you specify a driver version for your cluster node pool. For more information about how to configure node pool labels to specify the GPU driver version, see Specify an NVIDIA driver version for nodes by adding a label.
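As a minimal sketch of what pinning a driver version through a node pool label could look like, the helper below builds the label as a key-value pair. The label key `ack.aliyun.com/nvidia-driver-version` is an assumption here and should be verified against Specify an NVIDIA driver version for nodes by adding a label. The function name `driver_version_label` is hypothetical and not part of any ACK SDK.

```python
import re
from typing import Dict

# Assumption: this is the node pool label key that ACK reads to select the
# driver version. Confirm it in the ACK documentation before relying on it.
DRIVER_LABEL_KEY = "ack.aliyun.com/nvidia-driver-version"


def driver_version_label(version: str) -> Dict[str, str]:
    """Return a node pool label that pins the NVIDIA driver version.

    Only the format is checked (for example "535.161.07" or "418.87.01").
    Whether ACK supports the version is not checked here.
    """
    if not re.fullmatch(r"\d+(\.\d+){1,2}", version):
        raise ValueError(f"unexpected driver version format: {version!r}")
    return {DRIVER_LABEL_KEY: version}


if __name__ == "__main__":
    print(driver_version_label("535.161.07"))
```

The returned dictionary can be supplied wherever node pool labels are configured. Keeping the version string in one place makes it easier to review when ACK changes its defaults.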
Kubernetes version | Default NVIDIA driver version | Custom NVIDIA driver version | Supported NVIDIA driver version |
1.30 and later | 535.161.07 | Supported | |
1.28 | 535.161.07 | Supported | |
1.26 | 535.161.07 | Supported | |
1.24 | 535.161.07 | Supported | |
1.22 | 535.161.07 | Supported | |
1.20 | 535.161.07 | Supported | |
1.18.8 | 418.181.07 | Supported | |
1.16.9 | 418.181.07 | Supported | |
1.16.6 | 418.87.01 | Not supported | |
1.14.8 | 418.181.07 | Supported | |
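The default driver versions in the table can be looked up programmatically, for example to detect when a new GPU node might receive a different driver than expected. The following is a rough sketch that encodes only the rows listed above. The function name `default_driver` is hypothetical, and Kubernetes versions that the table does not list return `None` rather than a guess.

```python
import re
from typing import Optional

# Rows from the table that are keyed by minor version.
_BY_MINOR = {
    (1, 28): "535.161.07",
    (1, 26): "535.161.07",
    (1, 24): "535.161.07",
    (1, 22): "535.161.07",
    (1, 20): "535.161.07",
}

# Rows from the table that name an exact patch version.
_BY_PATCH = {
    (1, 18, 8): "418.181.07",
    (1, 16, 9): "418.181.07",
    (1, 16, 6): "418.87.01",
    (1, 14, 8): "418.181.07",
}


def default_driver(k8s_version: str) -> Optional[str]:
    """Return the default NVIDIA driver version for a Kubernetes version,
    or None if the table has no row for it."""
    m = re.match(r"v?(\d+)\.(\d+)(?:\.(\d+))?", k8s_version)
    if m is None:
        raise ValueError(f"cannot parse Kubernetes version: {k8s_version!r}")
    major, minor = int(m.group(1)), int(m.group(2))
    if (major, minor) >= (1, 30):  # the "1.30 and later" row
        return "535.161.07"
    if (major, minor) in _BY_MINOR:
        return _BY_MINOR[(major, minor)]
    if m.group(3) is None:
        return None
    return _BY_PATCH.get((major, minor, int(m.group(3))))


if __name__ == "__main__":
    for v in ("1.31.1", "1.28.3", "1.16.6", "1.29"):
        print(v, default_driver(v))
```

Because ACK occasionally updates these defaults, a lookup like this is a snapshot of the table and should be refreshed whenever the table changes.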