Elastic GPU Service: Automatically install or load the Tesla driver when you create a GPU-accelerated instance

Last Updated: Oct 14, 2024

In general-purpose computing and graphics acceleration scenarios, GPU-accelerated instances can provide enhanced computing and graphics rendering capabilities after you install the NVIDIA Tesla driver on the instances. You can configure parameters to automatically install or load the Tesla driver when you create a GPU-accelerated instance. You can also manually install the Tesla driver after a GPU-accelerated instance is created. This topic describes how to automatically install or load the Tesla driver when you create a GPU-accelerated instance.

Driver installation methods

The following table describes the methods that can be used to automatically install or load the Tesla driver. You can choose a method based on the performance requirements in general-purpose computing and graphics acceleration scenarios.

| Method | Description | References |
| --- | --- | --- |
| Public image | When you create a GPU-accelerated instance, select a public image and select Auto-install GPU Driver. | Automatically install the driver by using a public image |
| Automatic installation script | When you create a GPU-accelerated instance, do not select Auto-install GPU Driver in the Image section. Instead, enter an automatic installation script in the User Data field to install the Tesla driver. | Install the driver by using an automatic installation script |

Automatically install the driver by using a public image

You can select Auto-install GPU Driver only for specific Linux public images. If you use a public image and select Auto-install GPU Driver, the system automatically installs the Tesla driver when you create the GPU-accelerated instance.

  1. Go to the instance buy page in the Elastic Compute Service (ECS) console.

  2. Click the Custom Launch tab.

  3. Configure parameters for the instance based on your business requirements. The parameters include Billing Method, Region, Network and Zone, Instance Type, and Image.

    This section describes how to configure the Instance Type and Image parameters. For more information about other parameters, see Parameter settings. The following table lists the instance families of GPU-accelerated instances for which you can install the Tesla driver when you create the instances, the supported image versions, and the corresponding driver versions.

    Note

    The Tesla driver is used to drive physical GPUs and can be used together with the CUDA and cuDNN libraries to improve GPU utilization. The CUDA and cuDNN libraries are installed together with the Tesla driver. To keep your system up-to-date, we recommend that you use the latest versions of the Tesla driver, CUDA library, and cuDNN library.

    | Instance family | Public image version | Tesla driver version | CUDA library version | cuDNN library version |
    | --- | --- | --- | --- | --- |
    | gn7e, gn7s, gn7i, gn6v, gn6i, gn6e, gn5, and gn5i; ebmgn7e, ebmgn7i, ebmgn6v, ebmgn6i, ebmgn6e, and ebmgn5i | Alibaba Cloud Linux 2 and Alibaba Cloud Linux 3; Ubuntu 22.04, Ubuntu 20.04, and Ubuntu 18.04; CentOS 8.x and CentOS 7.x. Note: ebmgn7e does not support images of Ubuntu 18.04. | 550.90.07 | 12.4.1 | 9.2.0.82 |
    | gn7e, gn7s, gn7i, gn6v, gn6i, gn6e, gn5, and gn5i; ebmgn7e, ebmgn7i, ebmgn6v, ebmgn6i, ebmgn6e, and ebmgn5i | Alibaba Cloud Linux 2 and Alibaba Cloud Linux 3; Ubuntu 20.04 and Ubuntu 18.04; CentOS 8.x and CentOS 7.x. Note: ebmgn7e does not support images of Ubuntu 18.04. | 535.154.05 | 12.1.1 | 8.9.7.29 |
    | gn7e, gn7s, gn7i, gn6v, gn6i, gn6e, gn5, and gn5i; ebmgn7, ebmgn7i, ebmgn7e, ebmgn6v, ebmgn6i, ebmgn6e, and ebmgn5i | Alibaba Cloud Linux 2 and Alibaba Cloud Linux 3; Ubuntu 20.04 and Ubuntu 18.04; CentOS 8.x and CentOS 7.x | 525.105.17 | 12.0.1 | 8.9.1.23 |
    | gn7i, gn7e, gn7s, gn6v, gn6i, gn6e, gn5, and gn5i; ebmgn7, ebmgn7i, ebmgn7e, ebmgn6v, ebmgn6i, ebmgn6e, and ebmgn5i | Alibaba Cloud Linux 2 and Alibaba Cloud Linux 3; Ubuntu 20.04, Ubuntu 18.04, and Ubuntu 16.04; CentOS 8.x and CentOS 7.x; Debian 10.10 | 470.161.03 | 11.4.1 | 8.2.4 |
    | gn7, gn7i, gn7e, gn6v, gn6i, gn6e, gn5, and gn5i; ebmgn7, ebmgn7i, ebmgn7e, ebmgn6v, ebmgn6i, ebmgn6e, and ebmgn5i | Alibaba Cloud Linux 2; Ubuntu 20.04, Ubuntu 18.04, and Ubuntu 16.04; CentOS 8.x and CentOS 7.x | 460.91.03 | 11.2.2 | 8.1.1 |
    | gn7, gn7e, gn6v, gn6i, gn6e, gn5, and gn5i; ebmgn7, ebmgn7e, ebmgn6v, ebmgn6i, ebmgn6e, and ebmgn5i | Alibaba Cloud Linux 2; Ubuntu 20.04, Ubuntu 18.04, and Ubuntu 16.04; CentOS 8.x and CentOS 7.x | 460.91.03 | 11.0.2 | 8.1.1, 8.0.4 |
    | gn6v, gn6i, gn6e, gn5, and gn5i; ebmgn6v, ebmgn6i, ebmgn6e, and ebmgn5i | Alibaba Cloud Linux 2; Ubuntu 18.04 and Ubuntu 16.04; CentOS 8.x and CentOS 7.x | 460.91.03 | 10.2.89 | 8.1.1, 8.0.4, 7.6.5 |
    | gn6v, gn6i, gn6e, gn5, and gn5i; ebmgn6v, ebmgn6i, ebmgn6e, and ebmgn5i | Ubuntu 18.04 and Ubuntu 16.04; CentOS 7.x | 450.80.02, 440.64.00 | 10.1.168 | 8.0.4, 7.6.5, 7.5.0 |
    | gn6v, gn6i, gn6e, gn5, and gn5i; ebmgn6v, ebmgn6i, ebmgn6e, and ebmgn5i | Ubuntu 18.04 and Ubuntu 16.04; CentOS 7.x | 450.80.02, 440.64.00 | 10.0.130 | 7.6.5, 7.5.0, 7.4.2, 7.3.1 |

    Important

    In this example, a gn7i instance is used. On the Public Images tab of the Image section, select a Linux distribution and version, such as Alibaba Cloud Linux 3.2104 LTS 64-bit. Then, select Auto-install GPU Driver, and select a CUDA library version, driver version, and cuDNN library version. This way, the system automatically installs the Tesla driver when you create the GPU-accelerated instance.

    After the instance is created or started, take note of the following information about the Tesla driver:

    The system requires approximately 10 to 20 minutes to automatically install the Tesla driver. The duration varies based on the private bandwidth and the number of vCPUs supported by the instance type. To view the installation process, you can connect to the instance. You can also check the installation log in /root/auto_install/auto_install.log after the installation is complete. The following table describes the information displayed during the installation process.

    | Installation state | Displayed information |
    | --- | --- |
    | Installing | The installation progress bar appears. |
    | Installed | The installation result ALL INSTALL OK appears. |
    | Installation failed | The installation result INSTALL FAIL appears. |

    Important

    Do not perform operations on the instance during the installation process, because the GPU is unavailable until the installation is complete. If specific GPU-related software fails to be automatically installed, the instance may become unavailable.

  4. Follow the on-screen instructions to complete the payment.
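The log check described in step 3 can be scripted. The following is a minimal sketch, not an official tool. It assumes that the log file contains the same result strings that are displayed during installation (ALL INSTALL OK and INSTALL FAIL). The function name install_status is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: report the auto-install state based on the log file.
# Assumption: the log contains the result strings shown on screen
# ("ALL INSTALL OK" or "INSTALL FAIL").
install_status() {
    # Use the log path from this topic unless another path is passed in.
    log_file="${1:-/root/auto_install/auto_install.log}"

    if [ ! -f "$log_file" ]; then
        echo "no-log"
    elif grep -q "ALL INSTALL OK" "$log_file"; then
        echo "installed"
    elif grep -q "INSTALL FAIL" "$log_file"; then
        echo "failed"
    else
        echo "installing"
    fi
}
```

Calling install_status without an argument reads the default log path. After the state changes to installed, running nvidia-smi on the instance is a common way to confirm that the driver is loaded.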

Install the driver by using an automatic installation script

If you do not select Auto-install GPU Driver in the Image section when you create a GPU-accelerated instance, you can enter an automatic installation script in the field in the User Data part to install the Tesla driver.

Parameters in an automatic installation script

If you use an automatic installation script, you must modify the following parameters based on your business requirements.

Change the versions of the Tesla driver, CUDA library, and cuDNN library based on the instance family and image that you use. For more information about the supported versions, see the table in the Automatically install the driver by using a public image section.

In this example, the Tesla driver version is changed to 470.161.03, the CUDA library version is changed to 11.4.1, and the cuDNN library version is changed to 8.2.4. Sample code:

DRIVER_VERSION="470.161.03"
CUDA_VERSION="11.4.1"
CUDNN_VERSION="8.2.4"
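Because the three versions must match a combination listed in the table, one way to avoid a mismatch is to derive the driver and cuDNN versions from the CUDA version. The following sketch covers only the four newest combinations in the table. The function name select_versions is hypothetical and is not part of the automatic installation script.

```shell
#!/bin/sh
# Hypothetical helper: set a matching driver and cuDNN version for a given
# CUDA version. The combinations are copied from the version table in this
# topic; only the four newest rows are covered.
select_versions() {
    CUDA_VERSION="$1"
    case "$CUDA_VERSION" in
        12.4.1) DRIVER_VERSION="550.90.07";  CUDNN_VERSION="9.2.0.82" ;;
        12.1.1) DRIVER_VERSION="535.154.05"; CUDNN_VERSION="8.9.7.29" ;;
        12.0.1) DRIVER_VERSION="525.105.17"; CUDNN_VERSION="8.9.1.23" ;;
        11.4.1) DRIVER_VERSION="470.161.03"; CUDNN_VERSION="8.2.4" ;;
        *)
            echo "CUDA version not covered by this sketch: $CUDA_VERSION" >&2
            return 1
            ;;
    esac
}
```

For example, select_versions 11.4.1 sets DRIVER_VERSION to 470.161.03 and CUDNN_VERSION to 8.2.4. The instance family and image must still be checked against the table.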

Procedure

  1. Go to the instance buy page in the ECS console.

  2. Click the Custom Launch tab.

  3. Configure parameters for the instance based on your business requirements. The parameters include Billing Method, Region, Network and Zone, Instance Type, Image, and User Data.

    For more information about the parameters, see Parameter settings.

  4. In the field in the User Data part of the Advanced Settings(Optional) section, enter the automatic installation script that you prepared.

    For more information about how to prepare the script, see Parameters in an automatic installation script.

    In this example, the script uses the .run installation package to install modules, such as the Tesla driver. Sample script:

    #!/bin/sh
    
    # Specify the versions to install.
    DRIVER_VERSION="550.90.07"
    CUDA_VERSION="12.4.1"
    CUDNN_VERSION="9.2.0.82"
    IS_INSTALL_eRDMA="FALSE"
    IS_INSTALL_RDMA="FALSE"
    INSTALL_DIR="/root/auto_install"
    
    # Use the .run installation packages to install the driver and CUDA.
    auto_install_script="auto_install_v4.0.sh"
    
    # Build the script download URL from the source address in the instance metadata.
    script_download_url=$(curl http://100.100.100.200/latest/meta-data/source-address | head -1)"/opsx/ecs/linux/binary/script/${auto_install_script}"
    echo "$script_download_url"
    
    # Recreate the installation directory, download the script, and run it.
    rm -rf "$INSTALL_DIR"
    mkdir -p "$INSTALL_DIR"
    cd "$INSTALL_DIR" && wget -t 10 --timeout=10 "$script_download_url" && bash "${INSTALL_DIR}/${auto_install_script}" "$DRIVER_VERSION" "$CUDA_VERSION" "$CUDNN_VERSION" "$IS_INSTALL_RDMA" "$IS_INSTALL_eRDMA"

  5. Follow the on-screen instructions to complete the payment.

    Note
    • If you call the RunInstances operation to create a GPU-accelerated instance, you can install the Tesla driver only by using the UserData parameter to upload the automatic installation script. For more information, see RunInstances.

    • If the system does not automatically install the Tesla driver when you create a GPU-accelerated instance, you can run an automatic installation script after the instance is created to install software, such as the Tesla driver. To install software, you must log on to the instance by using SSH, create a file on the instance, copy your automatic installation script to the instance, and then run the script as a shell script. For more information about how to connect to an instance, see ECS instance connection method overview.
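When the script is passed through the UserData parameter of RunInstances, the value is expected to be Base64-encoded. Confirm the exact requirements in the RunInstances reference. The following is a minimal sketch of the encoding step, assuming that the script was saved to a local file. The function name encode_user_data is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: Base64-encode a script file as a single line so that
# the result can be used as the UserData value.
encode_user_data() {
    # tr removes the line wraps that some base64 implementations insert.
    base64 < "$1" | tr -d '\n'
}
```

For example, encode_user_data auto_install.sh prints the encoded script, where auto_install.sh is an assumed local file name.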

References

If the system does not automatically install or load the Tesla driver when you create a GPU-accelerated compute-optimized instance in general-purpose computing and graphics acceleration scenarios, you must install the driver after the instance is created. For more information, see the following topics: