Elastic GPU Service FAQ

Last Updated: Sep 09, 2024

This topic provides answers to some frequently asked questions about Elastic GPU Service to help you troubleshoot related issues.

Item

FAQ

GPU-accelerated instances

GPUs

Tesla or GRID drivers

GPU monitoring

How do I view GPU monitoring data?

Others

How do I install cGPU?

Do GPU-accelerated instances support Android emulators?

Only the following GPU-accelerated compute-optimized Elastic Compute Service (ECS) Bare Metal Instance families support Android emulators:

ebmgn7e, ebmgn7i, ebmgn7, ebmgn6ia, ebmgn6e, ebmgn6v, and ebmgn6i.
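As a quick illustration, the family list above can be turned into a lookup. This is only a sketch: it assumes instance type names follow the `ecs.<family>.<size>` pattern, and the helper names `instance_family` and `supports_android_emulator` are hypothetical, not part of any Alibaba Cloud SDK.

```python
# Families listed above as supporting Android emulators.
ANDROID_EMULATOR_FAMILIES = frozenset({
    "ebmgn7e", "ebmgn7i", "ebmgn7", "ebmgn6ia", "ebmgn6e", "ebmgn6v", "ebmgn6i",
})


def instance_family(instance_type: str) -> str:
    """Return the family part of a type name such as 'ecs.ebmgn7e.32xlarge'."""
    parts = instance_type.strip().lower().split(".")
    # Assumption: type names look like 'ecs.<family>.<size>'.
    if len(parts) != 3 or parts[0] != "ecs":
        raise ValueError(f"unexpected instance type format: {instance_type!r}")
    return parts[1]


def supports_android_emulator(instance_type: str) -> bool:
    """Check the instance type's family against the list in this FAQ."""
    return instance_family(instance_type) in ANDROID_EMULATOR_FAMILIES
```

For example, `supports_android_emulator("ecs.ebmgn7e.32xlarge")` evaluates to `True`, while a type from a family that is not in the list evaluates to `False`.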

Can I change the instance types of GPU-accelerated instances?

You can change the instance types of only specific GPU-accelerated instances.

For more information, see Instance types and families that support instance type changes.

Can regular ECS instance families be upgraded or changed to GPU-accelerated instance families?

No, regular ECS instance families cannot be upgraded or changed to GPU-accelerated instance families.

For more information, see Instance types and families that support instance type changes.

How do I transmit data between GPU-accelerated instances and regular ECS instances?

GPU-accelerated instances and regular ECS instances can transmit data to each other without additional connectivity configurations.

In addition to GPU acceleration, GPU-accelerated instances provide the same performance level as regular ECS instances. By default, GPU-accelerated instances and regular ECS instances in the same security group can exchange data over the internal network, and you do not need to configure network connectivity.
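To confirm that two instances can actually reach each other over the internal network, a simple TCP probe is often enough. The following is a minimal sketch using only the Python standard library. The function name `can_reach` is hypothetical, and the probe only tells you whether a TCP port accepts connections. It does not inspect security group rules.

```python
import socket


def can_reach(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

Run from the GPU-accelerated instance, a call such as `can_reach("172.16.0.10", 22)` (a made-up private IP address and the SSH port) would report whether the regular ECS instance accepts connections on that port.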

What are the differences between GPUs and CPUs?

The following table describes the differences between GPUs and CPUs.

| Item | GPU | CPU |
| --- | --- | --- |
| ALU | A large number of arithmetic logic units (ALUs) suited to large-scale parallel computing. | A small number of powerful ALUs. |
| Logic control unit | Simple logic control units. | Complex logic control units. |
| Cache | A small cache that serves threads and is not used to store accessed data. | A large cache that stores data to speed up data access and reduce latency. |
| Response mode | GPUs collect tasks and then process them in batches. | CPUs respond to each task in real time. |
| Scenario | Compute-intensive, high-throughput scenarios in which many threads run in parallel on highly similar tasks. | Serial computing scenarios that involve complex logic and require fast responses. |

Why am I unable to view GPUs by running the nvidia-smi command after I purchase GPU-accelerated instances?

Cause: This issue may occur if you have not installed or failed to install Tesla or GRID drivers on the GPU-accelerated instances.

Solution: Install drivers based on the instance families of the GPU-accelerated instances to use the high-performance features of the instances. The following section describes the drivers that you can install and how to install the drivers:

How do I view the details of the GPUs that are used by GPU-accelerated instances?

The method that you use to view the details of the GPUs depends on the operating system of your GPU-accelerated instances:

  • If your GPU-accelerated instances run Linux, run the nvidia-smi command to view the details of the GPUs.

  • If your GPU-accelerated instances run Windows, view the details of the GPUs in Device Manager.

Note

If you want to view information about the GPUs, such as the idle rate, utilization, temperature, and power, go to the CloudMonitor console. For more information, see GPU monitoring.
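On Linux, the nvidia-smi output can also be collected from a script. The sketch below assumes that the driver is installed and that nvidia-smi is on the PATH. It relies on the `--query-gpu` and `--format=csv,noheader,nounits` options of nvidia-smi. The field selection and the helper names `parse_gpu_csv` and `query_gpus` are illustrative choices, not something this FAQ prescribes.

```python
import csv
import io
import shutil
import subprocess

# Fields requested from nvidia-smi; adjust to the details you need.
QUERY_FIELDS = ["name", "memory.total", "utilization.gpu", "temperature.gpu", "power.draw"]


def parse_gpu_csv(text: str) -> list:
    """Turn nvidia-smi CSV output (one line per GPU) into a list of dicts."""
    rows = csv.reader(io.StringIO(text), skipinitialspace=True)
    return [
        dict(zip(QUERY_FIELDS, (cell.strip() for cell in row)))
        for row in rows
        if row  # skip blank lines
    ]


def query_gpus() -> list:
    """Run nvidia-smi and return one dict of details per GPU."""
    if shutil.which("nvidia-smi") is None:
        raise RuntimeError("nvidia-smi not found; is the Tesla or GRID driver installed?")
    result = subprocess.run(
        ["nvidia-smi", f"--query-gpu={','.join(QUERY_FIELDS)}", "--format=csv,noheader,nounits"],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_gpu_csv(result.stdout)
```

Calling `query_gpus()` on an instance without a working driver raises an error instead of returning an empty list, which makes a missing driver easier to spot.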

Which drivers do I need to install on vGPU-accelerated instances?

You must install GRID drivers on vGPU-accelerated instances.

In general-purpose computing or graphics acceleration scenarios, you can load the GRID driver when you create a GPU-accelerated instance. You can also install the GRID driver by using Cloud Assistant after you create the GPU-accelerated instance.

Which drivers do I need to install when I use tools such as OpenGL and Direct3D for graphics computing on GPU-accelerated compute-optimized instances?

Install drivers on GPU-accelerated compute-optimized instances based on the operating system types of the instances.

Why does the CUDA version that nvidia-smi reports differ from the version I selected when I created the GPU-accelerated instance?

The nvidia-smi command displays the highest CUDA version that the installed driver on the GPU-accelerated instance supports, not the CUDA version that you selected when you created the GPU-accelerated instance. The two values can therefore differ without indicating a problem.
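To see both values side by side, you can compare the version in the nvidia-smi header with the version that the CUDA toolkit compiler reports through `nvcc --version`. The sketch below only parses text that you pass in. The helper names are hypothetical, and the regular expressions assume the usual `CUDA Version: X.Y` and `release X.Y` wording in the two outputs.

```python
import re
from typing import Optional


def driver_cuda_version(nvidia_smi_output: str) -> Optional[str]:
    """Extract the driver-supported CUDA version from nvidia-smi output."""
    match = re.search(r"CUDA Version:\s*([\d.]+)", nvidia_smi_output)
    return match.group(1) if match else None


def toolkit_cuda_version(nvcc_output: str) -> Optional[str]:
    """Extract the installed toolkit version from `nvcc --version` output."""
    match = re.search(r"release\s+([\d.]+)", nvcc_output)
    return match.group(1) if match else None
```

If the first function returns a higher version than the second, that matches the behavior described above: the driver supports a newer CUDA version than the toolkit that is installed.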

What do I do if a black screen appears on a VNC client when I use the VNC client to connect to a GPU-accelerated Windows instance on which a GRID driver is installed?

  • Cause: After you install a GRID driver on a GPU-accelerated Windows instance, the GRID driver controls the display output of the virtual machine. The Virtual Network Computing (VNC) client can no longer obtain the display output that is processed by the integrated GPU on the instance, so a black screen appears on the VNC client. This behavior is expected.

  • Solution: Connect to the GPU-accelerated Windows instance by using Workbench. For more information, see Connect to a Windows instance by using a password or key.

How do I obtain GRID licenses?

You can obtain GRID licenses based on the operating system types of your GPU-accelerated instances.

How do I view GPU monitoring data?

You can view GPU monitoring data in the CloudMonitor console or by calling the DescribeMetricList operation. For more information, see GPU monitoring.
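For the API route, the request to DescribeMetricList is a set of key-value parameters, and the response carries the data points as a JSON-encoded string. The sketch below only builds and parses those structures and does not call the service. Several details are assumptions to verify against the CloudMonitor API reference: the `acs_ecs_dashboard` namespace, the `instanceId` dimension key, and the millisecond timestamps. The metric name is left to the caller because this FAQ does not list GPU metric names.

```python
import json
import time
from typing import Optional


def build_metric_request(instance_id: str, metric_name: str,
                         minutes: int = 10, period: int = 60,
                         now_ms: Optional[int] = None) -> dict:
    """Build DescribeMetricList parameters for the last `minutes` minutes."""
    end_ms = int(time.time() * 1000) if now_ms is None else now_ms
    start_ms = end_ms - minutes * 60 * 1000
    return {
        "Namespace": "acs_ecs_dashboard",  # assumption: ECS metric namespace
        "MetricName": metric_name,          # look up the GPU metric name in the docs
        "Dimensions": json.dumps([{"instanceId": instance_id}]),  # assumed key
        "Period": str(period),
        "StartTime": str(start_ms),
        "EndTime": str(end_ms),
    }


def parse_datapoints(response: dict) -> list:
    """Decode the Datapoints field, assumed to be a JSON-encoded list."""
    raw = response.get("Datapoints") or "[]"
    return json.loads(raw)
```

The resulting dictionary can be passed to whichever SDK or HTTP client you use to call the operation.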

How do I install cGPU?

Whether you are an enterprise user or an individual user, we recommend that you install and use cGPU through the Docker runtime environment of Container Service for Kubernetes (ACK). For more information, see Configure the GPU sharing component.