
Elastic Compute Service: Elastic GPU Service instance families (gn, vgn, and sgn series)

Last Updated: Dec 19, 2024

Elastic GPU Service is an elastic computing service provided by Alibaba Cloud that offers ready-to-use, scalable GPU computing resources. It combines the computing power of GPUs and CPUs to address the challenges of scenarios such as AI, high-performance computing, and professional graphics and image processing.

Background information

Before you read further in this topic, you must be familiar with the following information:

After you determine an instance type for your use case, you may need to learn about the following information:

  • Regions in which the instance type is available for purchase. Instance types that are available for purchase vary based on the region. You can go to the Instance Types Available for Each Region page to view the instance types available for purchase in each region. Alternatively, you can call the DescribeRegions and DescribeZones operations to query the available regions and the zones in a specific region.

  • Estimated instance costs. You can calculate the price of instances that use different billing methods in the Price Calculator. You can also call the DescribePrice operation to query the most recent prices of ECS resources.

  • Instructions for purchasing an instance. You can go to the ECS instance buy page to place a purchase order for instances.
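The API operations named in the list above can also be invoked from a terminal. The following is a minimal sketch that only assembles argument lists for the `aliyun` command-line tool and prints them. It assumes the CLI is installed and configured with credentials. The helper names and the region ID `cn-hangzhou` are illustrative placeholders and do not come from this topic. Verify the parameter names against the ECS API reference before use.

```python
import shlex


def describe_regions_cmd() -> list:
    """Argument list for querying the available regions."""
    return ["aliyun", "ecs", "DescribeRegions"]


def describe_zones_cmd(region_id: str) -> list:
    """Argument list for querying the zones in one region."""
    return ["aliyun", "ecs", "DescribeZones", "--RegionId", region_id]


def describe_price_cmd(region_id: str, instance_type: str) -> list:
    """Argument list for querying the price of one instance type.

    ResourceType=instance is assumed here because the query is about
    an ECS instance rather than a disk or bandwidth resource.
    """
    return [
        "aliyun", "ecs", "DescribePrice",
        "--RegionId", region_id,
        "--ResourceType", "instance",
        "--InstanceType", instance_type,
    ]


if __name__ == "__main__":
    # Placeholder region; replace it with a region that offers the type.
    region = "cn-hangzhou"
    for cmd in (
        describe_regions_cmd(),
        describe_zones_cmd(region),
        describe_price_cmd(region, "ecs.sgn7i-vws-m2.xlarge"),
    ):
        # Print the command instead of running it, so that the sketch
        # works without credentials.
        print(shlex.join(cmd))
```

To run a command for real, the same list can be passed to `subprocess.run`. The sketch prints the commands so that you can review them first.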

You may be concerned about the following information:

  • Retired instance families. If you cannot find an instance type in this topic, the instance type may be in a retired instance family. For information about retired instance families, see Retired instance families.

  • Supported instance type changes. Before you change the instance type of an instance, check whether the instance type can be changed and identify compatible instance types. For more information, see Instance types and families that support instance type changes.

vGPU-accelerated instance family

GPU-accelerated compute-optimized instance family

Not recommended instance families (If the following instance families are sold out, we recommend that you use the instance families in the preceding columns.)

sgn7i-vws, vGPU-accelerated instance family with shared CPUs

  • Introduction:

    • This instance family uses the third-generation SHENLONG architecture to provide predictable and consistent ultra-high performance. This instance family utilizes fast path acceleration on chips to improve storage performance, network performance, and computing stability by an order of magnitude. This way, data storage and model loading can be performed more quickly.

    • Instances of this instance family share CPU and network resources to maximize the utilization of underlying resources. Each instance has exclusive access to its memory and GPU memory to provide data isolation and performance assurance.

      Note

      If you want to use exclusive CPU resources, select the vgn7i-vws instance family.

    • This instance family comes with an NVIDIA GRID vWS license and provides certified graphics acceleration capabilities for Computer Aided Design (CAD) software to meet the requirements of professional graphic design. Instances of this instance family can serve as lightweight GPU-accelerated compute-optimized instances to reduce the costs of small-scale AI inference tasks.

  • Supported scenarios:

    • Concurrent AI inference tasks that require high-performance CPUs, memory, and GPUs, such as image recognition, speech recognition, and behavior identification

    • Compute-intensive graphics processing tasks that require high-performance 3D graphics virtualization capabilities, such as remote graphic design and cloud gaming

    • 3D modeling in fields that require the use of Ice Lake processors, such as animation and film production, cloud gaming, and mechanical design

  • Compute:

    • Uses NVIDIA A10 GPUs that have the following features:

      • Innovative NVIDIA Ampere architecture

      • Support for acceleration features, such as vGPU, RTX, and TensorRT, to provide diversified business support

    • Uses 2.9 GHz Intel® Xeon® Scalable (Ice Lake) processors that deliver an all-core turbo frequency of 3.5 GHz.

  • Storage:

    • Is an instance family in which all instances are I/O optimized.

    • Supports Enterprise SSDs (ESSDs) and ESSD AutoPL disks.

  • Network:

    • Supports IPv4 and IPv6. For information about IPv6 communication, see IPv6 communication.

    • Provides high network performance based on large computing capacity.

The sgn7i-vws instance family includes the following instance types: ecs.sgn7i-vws-m2.xlarge, ecs.sgn7i-vws-m4.2xlarge, ecs.sgn7i-vws-m8.4xlarge, ecs.sgn7i-vws-m2s.xlarge, ecs.sgn7i-vws-m4s.2xlarge, and ecs.sgn7i-vws-m8s.4xlarge. The following table describes the specifications of each instance type in this instance family. For information about the metrics of instance types, see Instance type metrics.

| Instance type | vCPUs | Memory (GiB) | GPUs | GPU memory | Network baseline/burst bandwidth (Gbit/s) | Packet forwarding rate (pps) | NIC queues | ENIs | Private IPv4 addresses per ENI | IPv6 addresses per ENI |
|---|---|---|---|---|---|---|---|---|---|---|
| ecs.sgn7i-vws-m2.xlarge | 4 | 15.5 | NVIDIA A10 * 1/12 | 24GB * 1/12 | 1.5/5 | 500,000 | 4 | 2 | 2 | 1 |
| ecs.sgn7i-vws-m4.2xlarge | 8 | 31 | NVIDIA A10 * 1/6 | 24GB * 1/6 | 2.5/10 | 1,000,000 | 4 | 4 | 6 | 1 |
| ecs.sgn7i-vws-m8.4xlarge | 16 | 62 | NVIDIA A10 * 1/3 | 24GB * 1/3 | 5/20 | 2,000,000 | 8 | 4 | 10 | 1 |
| ecs.sgn7i-vws-m2s.xlarge | 4 | 8 | NVIDIA A10 * 1/12 | 24GB * 1/12 | 1.5/5 | 500,000 | 4 | 2 | 2 | 1 |
| ecs.sgn7i-vws-m4s.2xlarge | 8 | 16 | NVIDIA A10 * 1/6 | 24GB * 1/6 | 2.5/10 | 1,000,000 | 4 | 4 | 6 | 1 |
| ecs.sgn7i-vws-m8s.4xlarge | 16 | 32 | NVIDIA A10 * 1/3 | 24GB * 1/3 | 5/20 | 2,000,000 | 8 | 4 | 10 | 1 |

Note

The GPU column in the preceding table indicates the GPU model and GPU slicing information for each instance type. Each GPU can be sliced into multiple GPU partitions, and each GPU partition can be allocated as a vGPU to an instance. Example:

NVIDIA A10 * 1/12. NVIDIA A10 is the GPU model. 1/12 indicates that a GPU is sliced into 12 GPU partitions, and each GPU partition can be allocated as a vGPU to an instance.
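The slicing notation can be turned into numbers with a few lines of code. The following is a minimal sketch that parses cells written as in the GPUs and GPU memory columns and computes the GPU memory available to one vGPU. It assumes that the fraction applies to GPU memory in the same way that the GPU memory column suggests. The function names are illustrative.

```python
from fractions import Fraction


def parse_spec(cell: str):
    """Split a cell such as 'NVIDIA A10 * 1/12' into (label, share)."""
    label, _, share = cell.rpartition("*")
    return label.strip(), Fraction(share.strip())


def vgpu_memory_gb(gpu_memory_cell: str) -> Fraction:
    """GPU memory of one vGPU for a cell such as '24GB * 1/12'."""
    label, share = parse_spec(gpu_memory_cell)
    total_gb = Fraction(label.upper().removesuffix("GB"))
    return total_gb * share


def partitions_per_gpu(gpu_cell: str) -> Fraction:
    """Number of equal partitions that one physical GPU is sliced into."""
    _, share = parse_spec(gpu_cell)
    return 1 / share


if __name__ == "__main__":
    for gpu_cell, mem_cell in [
        ("NVIDIA A10 * 1/12", "24GB * 1/12"),
        ("NVIDIA A10 * 1/6", "24GB * 1/6"),
        ("NVIDIA A10 * 1/3", "24GB * 1/3"),
    ]:
        model, _ = parse_spec(gpu_cell)
        print(model, partitions_per_gpu(gpu_cell), "partitions,",
              vgpu_memory_gb(mem_cell), "GB per vGPU")
```

Under this reading, a 1/12 slice of a 24 GB GPU corresponds to 2 GB of GPU memory per instance, a 1/6 slice to 4 GB, and a 1/3 slice to 8 GB.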

vgn7i-vws, vGPU-accelerated instance family

  • Introduction:

    • This instance family uses the third-generation SHENLONG architecture to provide predictable and consistent ultra-high performance. This instance family utilizes fast path acceleration on chips to improve storage performance, network performance, and computing stability by an order of magnitude. This way, data storage and model loading can be performed more quickly.

    • This instance family comes with an NVIDIA GRID vWS license and provides certified graphics acceleration capabilities for CAD software to meet the requirements of professional graphic design. Instances of this instance family can serve as lightweight GPU-accelerated compute-optimized instances to reduce the costs of small-scale AI inference tasks.

  • Supported scenarios:

    • Concurrent AI inference tasks that require high-performance CPUs, memory, and GPUs, such as image recognition, speech recognition, and behavior identification

    • Compute-intensive graphics processing tasks that require high-performance 3D graphics virtualization capabilities, such as remote graphic design and cloud gaming

    • 3D modeling in fields that require the use of Ice Lake processors, such as animation and film production, cloud gaming, and mechanical design

  • Compute:

    • Uses NVIDIA A10 GPUs that have the following features:

      • Innovative NVIDIA Ampere architecture

      • Support for acceleration features, such as vGPU, RTX, and TensorRT, to provide diversified business support

    • Uses 2.9 GHz Intel® Xeon® Scalable (Ice Lake) processors that deliver an all-core turbo frequency of 3.5 GHz.

  • Storage:

    • Is an instance family in which all instances are I/O optimized.

    • Supports ESSDs and ESSD AutoPL disks.

  • Network:

    • Supports IPv4 and IPv6. For information about IPv6 communication, see IPv6 communication.

    • Provides high network performance based on large computing capacity.

The vgn7i-vws instance family includes the following instance types: ecs.vgn7i-vws-m4.xlarge, ecs.vgn7i-vws-m8.2xlarge, ecs.vgn7i-vws-m12.3xlarge, and ecs.vgn7i-vws-m24.7xlarge. The following table describes the specifications of each instance type in this instance family. For information about the metrics of instance types, see Instance type metrics.

| Instance type | vCPUs | Memory (GiB) | GPUs | GPU memory | Network baseline bandwidth (Gbit/s) | Packet forwarding rate (pps) | NIC queues | ENIs | Private IPv4 addresses per ENI | IPv6 addresses per ENI |
|---|---|---|---|---|---|---|---|---|---|---|
| ecs.vgn7i-vws-m4.xlarge | 4 | 30 | NVIDIA A10 * 1/6 | 24GB * 1/6 | 3 | 1,000,000 | 4 | 4 | 10 | 1 |
| ecs.vgn7i-vws-m8.2xlarge | 10 | 62 | NVIDIA A10 * 1/3 | 24GB * 1/3 | 5 | 2,000,000 | 8 | 6 | 10 | 1 |
| ecs.vgn7i-vws-m12.3xlarge | 14 | 93 | NVIDIA A10 * 1/2 | 24GB * 1/2 | 8 | 3,000,000 | 8 | 6 | 15 | 1 |
| ecs.vgn7i-vws-m24.7xlarge | 30 | 186 | NVIDIA A10 * 1 | 24GB * 1 | 16 | 6,000,000 | 12 | 8 | 30 | 1 |

Note

The GPU column in the preceding table indicates the GPU model and GPU slicing information for each instance type. Each GPU can be sliced into multiple GPU partitions, and each GPU partition can be allocated as a vGPU to an instance. Example:

NVIDIA A10 * 1/6. NVIDIA A10 is the GPU model. 1/6 indicates that a GPU is sliced into six GPU partitions, and each GPU partition can be allocated as a vGPU to an instance.

vgn6i-vws, vGPU-accelerated instance family

Important
  • In light of the NVIDIA GRID driver upgrade, Alibaba Cloud upgrades the vgn6i instance family to the vgn6i-vws instance family. The vgn6i-vws instance family uses the latest NVIDIA GRID driver and provides an NVIDIA GRID vWS license. To apply for free images for which the NVIDIA GRID driver is pre-installed, submit a ticket.

  • To use other public images or custom images that do not contain an NVIDIA GRID driver, submit a ticket to apply for the GRID driver file and install the NVIDIA GRID driver. Alibaba Cloud does not charge additional license fees for the GRID driver.

  • Supported scenarios:

    • Real-time rendering for cloud gaming

    • Real-time rendering for Augmented Reality (AR) and Virtual Reality (VR) applications

    • AI (deep learning and machine learning) inference for elastic Internet service deployment

    • Deep learning education environments

    • Deep learning modeling and experimentation environments

  • Compute:

    • Uses NVIDIA T4 GPUs.

    • Uses vGPUs.

      • Supports the 1/4 and 1/2 compute capacity of NVIDIA Tesla T4 GPUs.

      • Supports 4 GB and 8 GB of GPU memory.

    • Offers a CPU-to-memory ratio of 1:5.

    • Uses 2.5 GHz Intel® Xeon® Platinum 8163 (Skylake) processors.

  • Storage:

    • Is an instance family in which all instances are I/O optimized.

    • Supports standard SSDs and ultra disks.

  • Network:

    • Supports IPv4 and IPv6. For information about IPv6 communication, see IPv6 communication.

    • Provides high network performance based on large computing capacity.

The vgn6i-vws instance family includes the following instance types: ecs.vgn6i-m4-vws.xlarge, ecs.vgn6i-m8-vws.2xlarge, and ecs.vgn6i-m16-vws.5xlarge. The following table describes the specifications of each instance type in this instance family. For information about the metrics of instance types, see Instance type metrics.

| Instance type | vCPUs | Memory (GiB) | GPUs | GPU memory | Network baseline bandwidth (Gbit/s) | Packet forwarding rate (pps) | NIC queues | ENIs | Private IPv4 addresses per ENI | IPv6 addresses per ENI |
|---|---|---|---|---|---|---|---|---|---|---|
| ecs.vgn6i-m4-vws.xlarge | 4 | 23 | NVIDIA T4 * 1/4 | 16GB * 1/4 | 2 | 500,000 | 4/2 | 3 | 10 | 1 |
| ecs.vgn6i-m8-vws.2xlarge | 10 | 46 | NVIDIA T4 * 1/2 | 16GB * 1/2 | 4 | 800,000 | 8/2 | 4 | 10 | 1 |
| ecs.vgn6i-m16-vws.5xlarge | 20 | 92 | NVIDIA T4 * 1 | 16GB * 1 | 7.5 | 1,200,000 | 6 | 4 | 10 | 1 |

Note

The GPU column in the preceding table indicates the GPU model and GPU slicing information for each instance type. Each GPU can be sliced into multiple GPU partitions, and each GPU partition can be allocated as a vGPU to an instance. Example:

NVIDIA T4 * 1/4. NVIDIA T4 is the GPU model. 1/4 indicates that a GPU is sliced into four GPU partitions, and each GPU partition can be allocated as a vGPU to an instance.

gn8v, GPU-accelerated compute-optimized instance family

This instance family is available only in specific regions, including regions outside China. To use the instance family, contact Alibaba Cloud sales personnel.

  • Introduction: This instance family is an 8th-generation GPU-accelerated compute-optimized instance family provided by Alibaba Cloud for AI model training and the inference tasks of ultra-large models. This instance family consists of multiple instance types that provide one, two, four, or eight GPUs per instance.

  • Supported scenarios:

    • Multi-GPU parallel inference computing for large language models (LLMs) that have more than 70 billion parameters

    • Traditional AI model training and autonomous driving training, for which each GPU delivers computing power of up to 39.5 TFLOPS in the single-precision floating-point format (FP32)

    • Small and medium-sized model training scenarios that leverage the NVLink connections among the eight GPUs

  • Benefits and positioning:

    • High-speed and large-capacity GPU memory: Each GPU is equipped with 96 GB of HBM3E memory and delivers up to 4 TB/s of memory bandwidth, which greatly accelerates model training and inference.

    • High bandwidth between GPUs: Multiple GPUs are interconnected by using 900 GB/s NVLink connections. The efficiency of multi-GPU training and inference is much higher than that of previous generations of GPU-accelerated instances.

    • Quantization of large models: This instance family supports computing power in the 8-bit floating point format (FP8) and optimizes computing power for large-scale parameter training and inference. This significantly improves the computing speed of training and inference and reduces memory usage.

    • High security: This instance family supports confidential computing capabilities that cover the full link of model inference tasks. The capabilities include CPU-based Intel Trust Domain Extensions (TDX) confidential computing and GPU-based NVIDIA Confidential Computing (CC). The confidential computing capabilities ensure the security of user inference data and enterprise models in model inference and training.

  • Compute:

    • Uses the latest Cloud Infrastructure Processing Unit (CIPU) 1.0 processors.

      • Decouples computing capabilities from storage capabilities, allowing you to flexibly select storage resources based on your business requirements.

      • Provides bare metal capabilities to support peer-to-peer (P2P) communication between GPU-accelerated instances.

    • Uses the 4th-generation Intel Xeon Scalable processors that deliver a base frequency of up to 2.8 GHz and an all-core turbo frequency of up to 3.1 GHz.

  • Storage:

    • Is an instance family in which all instances are I/O optimized.

    • Supports ESSDs, ESSD AutoPL disks, and elastic ephemeral disks (EEDs).

  • Network:

    • Supports IPv4 and IPv6. For information about IPv6 communication, see IPv6 communication.

    • Supports the Jumbo Frames feature. For more information, see Jumbo Frames.

    • Provides ultra-high network performance with a packet forwarding rate of up to 30,000,000 pps (for instances equipped with eight GPUs).

    • Supports elastic RDMA interfaces (ERIs).

      Note

      For information about how to use ERIs, see Configure eRDMA on an enterprise-level instance.
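The 70-billion-parameter scenario and the FP8 support described above can be related to the 96 GB of memory per GPU with a rough estimate. The following is a minimal sketch that counts model weights only. It ignores the KV cache, activations, and framework overhead, so actual requirements are higher. The helper names are illustrative, and the bytes-per-parameter values are the standard sizes of the listed formats.

```python
import math

# Storage size of one parameter in each numeric format.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "fp8": 1}

# GPU counts per instance stated for the gn8v instance family.
GPU_COUNTS = (1, 2, 4, 8)


def weight_footprint_gb(params_billion: float, fmt: str) -> float:
    """Weights-only footprint in decimal GB (1e9 params x bytes / 1e9)."""
    return params_billion * BYTES_PER_PARAM[fmt]


def min_gpus_for_weights(params_billion: float, fmt: str,
                         gpu_memory_gb: float = 96.0) -> int:
    """Smallest number of GPUs whose combined memory holds the weights."""
    return math.ceil(weight_footprint_gb(params_billion, fmt) / gpu_memory_gb)


def smallest_offered_count(required: int):
    """Smallest offered GPU count that covers the requirement, if any."""
    for count in GPU_COUNTS:
        if count >= required:
            return count
    return None  # more GPUs than a single instance provides


if __name__ == "__main__":
    for fmt in ("fp32", "fp16", "fp8"):
        need = min_gpus_for_weights(70, fmt)
        print(fmt, weight_footprint_gb(70, fmt), "GB ->", need,
              "GPU(s), offered count:", smallest_offered_count(need))
```

Under these assumptions, the weights of a 70-billion-parameter model take about 140 GB in FP16, which exceeds one 96 GB GPU, and about 70 GB in FP8, which fits on one GPU before any runtime memory is counted.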

The gn8v instance family includes the following instance types: ecs.gn8v.4xlarge, ecs.gn8v.6xlarge, ecs.gn8v-2x.8xlarge, ecs.gn8v-4x.8xlarge, ecs.gn8v-2x.12xlarge, ecs.gn8v-8x.16xlarge, ecs.gn8v-4x.24xlarge, and ecs.gn8v-8x.48xlarge. The following table describes the specifications of each instance type in this instance family. For information about the metrics of instance types, see Instance type metrics.

| Instance type | vCPUs | Memory (GiB) | GPU memory | Network baseline bandwidth (Gbit/s) | ENIs | NIC queues per primary ENI | Private IPv4 addresses per ENI | IPv6 addresses per ENI | Maximum disks | Disk baseline IOPS | Disk baseline bandwidth (Gbit/s) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| ecs.gn8v.4xlarge | 16 | 96 | 96GB * 1 | 12 | 8 | 16 | 30 | 30 | 17 | 100,000 | 0.75 |
| ecs.gn8v.6xlarge | 24 | 128 | 96GB * 1 | 15 | 8 | 24 | 30 | 30 | 17 | 120,000 | 0.937 |
| ecs.gn8v-2x.8xlarge | 32 | 192 | 96GB * 2 | 20 | 8 | 32 | 30 | 30 | 25 | 200,000 | 1.25 |
| ecs.gn8v-4x.8xlarge | 32 | 384 | 96GB * 4 | 20 | 8 | 32 | 30 | 30 | 25 | 200,000 | 1.25 |
| ecs.gn8v-2x.12xlarge | 48 | 256 | 96GB * 2 | 25 | 8 | 48 | 30 | 30 | 33 | 300,000 | 1.50 |
| ecs.gn8v-8x.16xlarge | 64 | 768 | 96GB * 8 | 32 | 8 | 64 | 30 | 30 | 33 | 360,000 | 2.5 |
| ecs.gn8v-4x.24xlarge | 96 | 512 | 96GB * 4 | 50 | 15 | 64 | 30 | 30 | 49 | 500,000 | 3 |
| ecs.gn8v-8x.48xlarge | 192 | 1,024 | 96GB * 8 | 100 | 15 | 64 | 50 | 50 | 65 | 1,000,000 | 6 |

gn8is, GPU-accelerated compute-optimized instance family

This instance family is available only in specific regions, including regions outside China. To use the instance family, contact Alibaba Cloud sales personnel.

  • Introduction: This instance family is an 8th-generation GPU-accelerated compute-optimized instance family provided by Alibaba Cloud in response to recent developments in generative AI. This instance family consists of multiple instance types that provide one, two, four, or eight GPUs per instance and have different CPU-to-GPU ratios to fit various use cases.

  • Benefits and positioning:

    • Graphics processing: This instance family uses high-frequency 5th-generation Intel Xeon Scalable processors to provide sufficient CPU capacity for smooth graphics rendering and design in 3D modeling scenarios.

    • Inference tasks: This instance family uses innovative GPUs, each with 48 GB of memory, which accelerate inference tasks and support the FP8 floating-point format. You can use this instance family together with Container Service for Kubernetes (ACK) to support the inference of various AI-generated content (AIGC) models, especially inference tasks for LLMs that have fewer than 70 billion parameters.

  • Supported scenarios:

    • Animation, special effects for film and television, and rendering

    • Generation of AIGC images and inference of LLMs

    • Other general-purpose AI recognition, image recognition, and speech recognition scenarios

  • Compute:

    • Uses innovative GPUs that have the following features:

      • Support for acceleration features, such as TensorRT, and the FP8 floating-point format to improve LLM inference performance.

      • Up to 48 GB of memory per GPU and support for the inference of 70B or larger LLMs on a single instance with multiple GPUs.

      • Improved graphics processing capabilities. For example, after you install a GRID driver on a gn8is instance by using Cloud Assistant or an Alibaba Cloud Marketplace image, the instance can provide twice the graphics processing performance of a 7th-generation instance.

    • Uses the latest high-frequency Intel® Xeon® processors that deliver an all-core turbo frequency of 3.9 GHz to meet complex 3D modeling requirements.

  • Storage:

    • Is an instance family in which all instances are I/O optimized.

    • Supports ESSDs, ESSD AutoPL disks, and EEDs.

  • Network:

The gn8is instance family includes the following instance types: ecs.gn8is.2xlarge, ecs.gn8is.4xlarge, ecs.gn8is-2x.8xlarge, ecs.gn8is-4x.16xlarge, and ecs.gn8is-8x.32xlarge. The following table describes the specifications of each instance type in this instance family. For information about the metrics of instance types, see Instance type metrics.

| Instance type | vCPUs | Memory (GiB) | GPU memory | Network baseline bandwidth (Gbit/s) | ENIs | NIC queues per primary ENI | Private IPv4 addresses per ENI | IPv6 addresses per ENI | Maximum disks | Disk baseline IOPS | Disk baseline bandwidth (Gbit/s) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| ecs.gn8is.2xlarge | 8 | 64 | 48GB * 1 | 8 | 4 | 8 | 15 | 15 | 17 | 60,000 | 0.75 |
| ecs.gn8is.4xlarge | 16 | 128 | 48GB * 1 | 16 | 8 | 16 | 30 | 30 | 17 | 120,000 | 1.25 |
| ecs.gn8is-2x.8xlarge | 32 | 256 | 48GB * 2 | 32 | 8 | 32 | 30 | 30 | 33 | 250,000 | 2 |
| ecs.gn8is-4x.16xlarge | 64 | 512 | 48GB * 4 | 64 | 8 | 64 | 30 | 30 | 33 | 450,000 | 4 |
| ecs.gn8is-8x.32xlarge | 128 | 1,024 | 48GB * 8 | 100 | 15 | 64 | 50 | 50 | 65 | 900,000 | 8 |

gn7e, GPU-accelerated compute-optimized instance family

Features:

  • Introduction:

    • This instance family allows you to select instance types that provide different numbers of GPUs and CPUs to meet your business requirements in AI use cases.

    • This instance family uses the third-generation SHENLONG architecture and doubles the average bandwidths of virtual private clouds (VPCs), networks, and disks compared with instance families of the previous generation.

  • Supported scenarios:

    • Small- and medium-scale AI training

    • High-performance computing (HPC) workloads accelerated by using Compute Unified Device Architecture (CUDA)

    • AI inference tasks that require high GPU processing capabilities or large amounts of GPU memory

    • Deep learning applications, such as training applications of AI algorithms used in image classification, autonomous vehicles, and speech recognition

    • Scientific computing applications that require robust GPU computing capabilities, such as computational fluid dynamics, computational finance, molecular dynamics, and environmental analytics

    Important

    When you use AI training services that feature a high communication load, such as transformer models, you must enable NVLink for GPU-to-GPU communication. Otherwise, data may be damaged due to unpredictable failures that are caused by large-scale data transmission over Peripheral Component Interconnect Express (PCIe) links. If you do not understand the topology of the communication links that are used for AI training services, submit a ticket to obtain technical support.

  • Storage:

    • Is an instance family in which all instances are I/O optimized.

    • Supports ESSDs and ESSD AutoPL disks.

  • Network:

    • Supports IPv4 and IPv6. For information about IPv6 communication, see IPv6 communication.

    • Provides high network performance based on large computing capacity.
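Regarding the NVLink requirement in the Important note above, one way to see which GPU pairs are connected by NVLink is to inspect the matrix that `nvidia-smi topo -m` prints. The following is a minimal sketch that parses such a matrix and lists the GPU pairs whose link label does not start with `NV`. The sample text is hypothetical and is not captured from a gn7e instance. The function name is illustrative.

```python
import re

# Newer nvidia-smi versions may wrap header cells in ANSI escape codes.
ANSI = re.compile(r"\x1b\[[0-9;]*m")


def non_nvlink_pairs(topo_text: str):
    """Return (gpu_a, gpu_b, link) for GPU pairs not linked by NVLink.

    In the matrix, 'NV<n>' marks a connection over n NVLinks, 'X' marks
    the GPU itself, and labels such as PIX, PXB, PHB, NODE, and SYS mark
    paths that traverse PCIe.
    """
    lines = [ANSI.sub("", ln) for ln in topo_text.splitlines() if ln.strip()]
    gpus = re.findall(r"GPU\d+", lines[0])  # column order from the header
    pairs = []
    for ln in lines[1:]:
        cells = ln.split()
        if cells[0] not in gpus:
            continue  # skip NIC rows and the legend
        row = gpus.index(cells[0])
        links = cells[1:1 + len(gpus)]
        for col, link in enumerate(links):
            # Report each unordered pair once (upper triangle only).
            if col > row and not link.startswith("NV"):
                pairs.append((cells[0], gpus[col], link))
    return pairs


# Hypothetical matrix for illustration only.
SAMPLE = """\
        GPU0    GPU1    GPU2    CPU Affinity
GPU0     X      NV12    SYS     0-31
GPU1    NV12     X      SYS     0-31
GPU2    SYS     SYS      X      32-63
"""

if __name__ == "__main__":
    for a, b, link in non_nvlink_pairs(SAMPLE):
        print(f"{a} <-> {b} uses {link}, not NVLink")
```

An empty result suggests that all GPU pairs on the instance are linked by NVLink. Any pair reported here would communicate over PCIe, which is the situation that the note warns about.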

The gn7e instance family includes the following instance types: ecs.gn7e-c16g1.4xlarge, ecs.gn7e-c16g1.8xlarge, ecs.gn7e-c16g1.16xlarge, and ecs.gn7e-c16g1.32xlarge. The following table describes the specifications of each instance type in this instance family. For information about the metrics of instance types, see Instance type metrics.

| Instance type | vCPUs | Memory (GiB) | GPU memory | Network baseline bandwidth (Gbit/s) | Packet forwarding rate (pps) | NIC queues | ENIs | Private IPv4 addresses per ENI | IPv6 addresses per ENI |
|---|---|---|---|---|---|---|---|---|---|
| ecs.gn7e-c16g1.4xlarge | 16 | 125 | 80GB * 1 | 8 | 3,000,000 | 8 | 8 | 10 | 1 |
| ecs.gn7e-c16g1.8xlarge | 32 | 250 | 80GB * 2 | 16 | 6,000,000 | 16 | 8 | 10 | 1 |
| ecs.gn7e-c16g1.16xlarge | 64 | 500 | 80GB * 4 | 32 | 12,000,000 | 32 | 8 | 10 | 1 |
| ecs.gn7e-c16g1.32xlarge | 128 | 1,000 | 80GB * 8 | 64 | 24,000,000 | 32 | 16 | 15 | 1 |

gn7i, GPU-accelerated compute-optimized instance family

  • Introduction: This instance family uses the third-generation SHENLONG architecture to provide predictable and consistent ultra-high performance. This instance family utilizes fast path acceleration on chips to improve storage performance, network performance, and computing stability by an order of magnitude.

  • Supported scenarios:

    • Concurrent AI inference tasks that require high-performance CPUs, memory, and GPUs, such as image recognition, speech recognition, and behavior identification

    • Compute-intensive graphics processing tasks that require high-performance 3D graphics virtualization capabilities, such as remote graphic design and cloud gaming

  • Compute:

    • Uses NVIDIA A10 GPUs that have the following features:

      • Innovative NVIDIA Ampere architecture

      • Support for acceleration features, such as RTX and TensorRT

    • Uses 2.9 GHz Intel® Xeon® Scalable (Ice Lake) processors that deliver an all-core turbo frequency of 3.5 GHz.

    • Provides up to 752 GiB of memory, which is much larger than the memory sizes of the gn6i instance family.

  • Storage:

    • Is an instance family in which all instances are I/O optimized.

    • Supports ESSDs and ESSD AutoPL disks.

  • Network:

    • Supports IPv4 and IPv6. For information about IPv6 communication, see IPv6 communication.

    • Provides high network performance based on large computing capacity.

The gn7i instance family includes the following instance types: ecs.gn7i-c8g1.2xlarge, ecs.gn7i-c16g1.4xlarge, ecs.gn7i-c32g1.8xlarge, ecs.gn7i-c32g1.16xlarge, ecs.gn7i-c32g1.32xlarge, ecs.gn7i-c48g1.12xlarge, ecs.gn7i-c56g1.14xlarge, ecs.gn7i-2x.8xlarge, ecs.gn7i-4x.8xlarge, ecs.gn7i-4x.16xlarge, ecs.gn7i-8x.32xlarge, and ecs.gn7i-8x.16xlarge. The following table describes the specifications of each instance type in this instance family. For information about the metrics of instance types, see Instance type metrics.

| Instance type | vCPUs | Memory (GiB) | GPUs | GPU memory | Network baseline bandwidth (Gbit/s) | Packet forwarding rate (pps) | NIC queues | ENIs | Private IPv4 addresses per ENI | IPv6 addresses per ENI |
|---|---|---|---|---|---|---|---|---|---|---|
| ecs.gn7i-c8g1.2xlarge | 8 | 30 | NVIDIA A10 * 1 | 24GB * 1 | 16 | 1,600,000 | 8 | 4 | 15 | 15 |
| ecs.gn7i-c16g1.4xlarge | 16 | 60 | NVIDIA A10 * 1 | 24GB * 1 | 16 | 3,000,000 | 8 | 8 | 30 | 30 |
| ecs.gn7i-c32g1.8xlarge | 32 | 188 | NVIDIA A10 * 1 | 24GB * 1 | 16 | 6,000,000 | 12 | 8 | 30 | 30 |
| ecs.gn7i-c32g1.16xlarge | 64 | 376 | NVIDIA A10 * 2 | 24GB * 2 | 32 | 12,000,000 | 16 | 15 | 30 | 30 |
| ecs.gn7i-c32g1.32xlarge | 128 | 752 | NVIDIA A10 * 4 | 24GB * 4 | 64 | 24,000,000 | 32 | 15 | 30 | 30 |
| ecs.gn7i-c48g1.12xlarge | 48 | 310 | NVIDIA A10 * 1 | 24GB * 1 | 16 | 9,000,000 | 16 | 8 | 30 | 30 |
| ecs.gn7i-c56g1.14xlarge | 56 | 346 | NVIDIA A10 * 1 | 24GB * 1 | 16 | 12,000,000 | 16 | 12 | 30 | 30 |
| ecs.gn7i-2x.8xlarge | 32 | 128 | NVIDIA A10 * 2 | 24GB * 2 | 16 | 6,000,000 | 16 | 8 | 30 | 30 |
| ecs.gn7i-4x.8xlarge | 32 | 128 | NVIDIA A10 * 4 | 24GB * 4 | 16 | 6,000,000 | 16 | 8 | 30 | 30 |
| ecs.gn7i-4x.16xlarge | 64 | 256 | NVIDIA A10 * 4 | 24GB * 4 | 32 | 12,000,000 | 32 | 8 | 30 | 30 |
| ecs.gn7i-8x.32xlarge | 128 | 512 | NVIDIA A10 * 8 | 24GB * 8 | 64 | 24,000,000 | 32 | 16 | 30 | 30 |
| ecs.gn7i-8x.16xlarge | 64 | 256 | NVIDIA A10 * 8 | 24GB * 8 | 32 | 12,000,000 | 32 | 8 | 30 | 30 |

Important

You can change the following instance types only to ecs.gn7i-c8g1.2xlarge or ecs.gn7i-c16g1.4xlarge: ecs.gn7i-2x.8xlarge, ecs.gn7i-4x.8xlarge, ecs.gn7i-4x.16xlarge, ecs.gn7i-8x.32xlarge, and ecs.gn7i-8x.16xlarge.
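The restriction above can be encoded as a small pre-check before you request an instance type change. The following is a minimal sketch that covers only the rule stated in this note. For source instance types that the note does not mention, the function returns `None` to signal that the compatibility topic referenced earlier must be consulted. The function name is illustrative.

```python
from typing import Optional

# Source types restricted by the note above.
RESTRICTED_SOURCES = {
    "ecs.gn7i-2x.8xlarge",
    "ecs.gn7i-4x.8xlarge",
    "ecs.gn7i-4x.16xlarge",
    "ecs.gn7i-8x.32xlarge",
    "ecs.gn7i-8x.16xlarge",
}

# The only targets that the note allows for those sources.
ALLOWED_TARGETS = {
    "ecs.gn7i-c8g1.2xlarge",
    "ecs.gn7i-c16g1.4xlarge",
}


def change_allowed(source: str, target: str) -> Optional[bool]:
    """Apply the gn7i rule; return None if the rule does not cover source."""
    if source not in RESTRICTED_SOURCES:
        return None  # not covered by this note; check the compatibility topic
    return target in ALLOWED_TARGETS


if __name__ == "__main__":
    print(change_allowed("ecs.gn7i-4x.8xlarge", "ecs.gn7i-c8g1.2xlarge"))
    print(change_allowed("ecs.gn7i-4x.8xlarge", "ecs.gn7i-c32g1.8xlarge"))
    print(change_allowed("ecs.gn7i-c32g1.8xlarge", "ecs.gn7i-c16g1.4xlarge"))
```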

gn7s, GPU-accelerated compute-optimized instance family

To use the gn7s instance family, submit a ticket.

  • Introduction:

    • This instance family uses the latest Intel Ice Lake processors and NVIDIA A30 GPUs that are based on NVIDIA Ampere architecture. You can select instance types that comprise appropriate mixes of GPUs and vCPUs to meet your business requirements in AI scenarios.

    • This instance family uses the third-generation SHENLONG architecture and doubles the average bandwidths of VPCs, networks, and disks compared with instance families of the previous generation.

  • Supported scenarios: concurrent AI inference tasks that require high-performance CPUs, memory, and GPUs, such as image recognition, speech recognition, and behavior identification.

  • Compute:

    • Uses NVIDIA A30 GPUs that have the following features:

      • Innovative NVIDIA Ampere architecture

      • Support for the multi-instance GPU (MIG) feature and acceleration features (based on second-generation Tensor cores) to provide diversified business support

    • Uses 2.9 GHz Intel® Xeon® Scalable (Ice Lake) processors that deliver an all-core turbo frequency of 3.5 GHz.

    • Provides significantly larger memory sizes than instance families of the previous generation.

  • Storage:

    • Is an instance family in which all instances are I/O optimized.

    • Supports ESSDs and ESSD AutoPL disks.

  • Network:

    • Supports IPv4 and IPv6. For information about IPv6 communication, see IPv6 communication.

    • Provides high network performance based on large computing capacity.

The gn7s instance family includes the following instance types: ecs.gn7s-c8g1.2xlarge, ecs.gn7s-c16g1.4xlarge, ecs.gn7s-c32g1.8xlarge, ecs.gn7s-c32g1.16xlarge, ecs.gn7s-c32g1.32xlarge, ecs.gn7s-c48g1.12xlarge, and ecs.gn7s-c56g1.14xlarge. The following table describes the specifications of each instance type in this instance family. For information about the metrics of instance types, see Instance type metrics.

| Instance type | vCPUs | Memory (GiB) | GPUs | GPU memory | Network baseline bandwidth (Gbit/s) | Packet forwarding rate (pps) | Private IPv4 addresses per ENI | IPv6 addresses per ENI | NIC queues | ENIs |
|---|---|---|---|---|---|---|---|---|---|---|
| ecs.gn7s-c8g1.2xlarge | 8 | 60 | NVIDIA A30 * 1 | 24GB * 1 | 16 | 6,000,000 | 5 | 1 | 12 | 8 |
| ecs.gn7s-c16g1.4xlarge | 16 | 120 | NVIDIA A30 * 1 | 24GB * 1 | 16 | 6,000,000 | 5 | 1 | 12 | 8 |
| ecs.gn7s-c32g1.8xlarge | 32 | 250 | NVIDIA A30 * 1 | 24GB * 1 | 16 | 6,000,000 | 5 | 1 | 12 | 8 |
| ecs.gn7s-c32g1.16xlarge | 64 | 500 | NVIDIA A30 * 2 | 24GB * 2 | 32 | 12,000,000 | 5 | 1 | 16 | 15 |
| ecs.gn7s-c32g1.32xlarge | 128 | 1,000 | NVIDIA A30 * 4 | 24GB * 4 | 64 | 24,000,000 | 10 | 1 | 32 | 15 |
| ecs.gn7s-c48g1.12xlarge | 48 | 380 | NVIDIA A30 * 1 | 24GB * 1 | 16 | 6,000,000 | 8 | 1 | 12 | 8 |
| ecs.gn7s-c56g1.14xlarge | 56 | 440 | NVIDIA A30 * 1 | 24GB * 1 | 16 | 6,000,000 | 8 | 1 | 12 | 8 |

gn7, GPU-accelerated compute-optimized instance family

  • Supported scenarios:

    • Deep learning applications, such as training applications of AI algorithms used in image classification, autonomous vehicles, and speech recognition

    • Scientific computing applications that require robust GPU computing capabilities, such as computational fluid dynamics, computational finance, molecular dynamics, and environmental analytics

  • Storage:

    • Is an instance family in which all instances are I/O optimized.

    • Supports ESSDs and ESSD AutoPL disks.

  • Network:

    • Supports IPv4 and IPv6. For information about IPv6 communication, see IPv6 communication.

    • Provides high network performance based on large computing capacity.

The gn7 instance family includes the following instance types: ecs.gn7-c12g1.3xlarge, ecs.gn7-c13g1.13xlarge, and ecs.gn7-c13g1.26xlarge. The following table describes the specifications of each instance type in this instance family. For information about the metrics of instance types, see Instance type metrics.

| Instance type | vCPUs | Memory (GiB) | GPU memory | Network baseline bandwidth (Gbit/s) | Packet forwarding rate (pps) | NIC queues | ENIs | Private IPv4 addresses per ENI | IPv6 addresses per ENI |
|---|---|---|---|---|---|---|---|---|---|
| ecs.gn7-c12g1.3xlarge | 12 | 94 | 40GB * 1 | 4 | 2,500,000 | 4 | 8 | 10 | 1 |
| ecs.gn7-c13g1.13xlarge | 52 | 378 | 40GB * 4 | 16 | 9,000,000 | 16 | 8 | 30 | 30 |
| ecs.gn7-c13g1.26xlarge | 104 | 756 | 40GB * 8 | 30 | 18,000,000 | 16 | 15 | 10 | 1 |

gn6i, GPU-accelerated compute-optimized instance family

  • Supported scenarios:

    • AI (deep learning and machine learning) inference for computer vision, speech recognition, speech synthesis, natural language processing (NLP), machine translation, and recommendation systems

    • Real-time rendering for cloud gaming

    • Real-time rendering for AR and VR applications

    • Graphics workstations or graphics-heavy computing

    • GPU-accelerated databases

    • High-performance computing

  • Compute:

    • Uses NVIDIA T4 GPUs that have the following features:

      • Innovative NVIDIA Turing architecture

      • 16 GB of memory (320 GB/s bandwidth) per GPU

      • 2,560 CUDA cores per GPU

      • Up to 320 Turing Tensor cores per GPU

      • Mixed-precision Tensor cores that support 65 FP16 TFLOPS, 130 INT8 TOPS, and 260 INT4 TOPS

    • Offers a CPU-to-memory ratio of 1:4.

    • Uses 2.5 GHz Intel® Xeon® Platinum 8163 (Skylake) processors.

  • Storage:

    • All instances in this family are I/O optimized.

    • Supports ESSDs, ESSD AutoPL disks, standard SSDs, and ultra disks.

  • Network:

    • Supports IPv4 and IPv6. For information about IPv6 communication, see IPv6 communication.

    • Provides high network performance based on large computing capacity.

The gn6i instance family includes the following instance types: ecs.gn6i-c4g1.xlarge, ecs.gn6i-c8g1.2xlarge, ecs.gn6i-c16g1.4xlarge, ecs.gn6i-c24g1.6xlarge, ecs.gn6i-c40g1.10xlarge, ecs.gn6i-c24g1.12xlarge, and ecs.gn6i-c24g1.24xlarge. The following table describes the specifications of each instance type in this instance family. For information about the metrics of instance types, see Instance type metrics.

Instance types

| Instance type | vCPUs | Memory (GiB) | GPUs | GPU memory | Network baseline bandwidth (Gbit/s) | Packet forwarding rate (pps) | Disk baseline IOPS | NIC queues | ENIs | Private IPv4 addresses per ENI | IPv6 addresses per ENI |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ecs.gn6i-c4g1.xlarge | 4 | 15 | NVIDIA T4 * 1 | 16GB * 1 | 4 | 500,000 | None | 2 | 2 | 10 | 1 |
| ecs.gn6i-c8g1.2xlarge | 8 | 31 | NVIDIA T4 * 1 | 16GB * 1 | 5 | 800,000 | None | 2 | 2 | 10 | 1 |
| ecs.gn6i-c16g1.4xlarge | 16 | 62 | NVIDIA T4 * 1 | 16GB * 1 | 6 | 1,000,000 | None | 4 | 3 | 10 | 1 |
| ecs.gn6i-c24g1.6xlarge | 24 | 93 | NVIDIA T4 * 1 | 16GB * 1 | 7.5 | 1,200,000 | None | 6 | 4 | 10 | 1 |
| ecs.gn6i-c40g1.10xlarge | 40 | 155 | NVIDIA T4 * 1 | 16GB * 1 | 10 | 1,600,000 | None | 16 | 10 | 10 | 1 |
| ecs.gn6i-c24g1.12xlarge | 48 | 186 | NVIDIA T4 * 2 | 16GB * 2 | 15 | 2,400,000 | None | 12 | 6 | 10 | 1 |
| ecs.gn6i-c24g1.24xlarge | 96 | 372 | NVIDIA T4 * 4 | 16GB * 4 | 30 | 4,800,000 | 250,000 | 24 | 8 | 10 | 1 |
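
As an illustration of how the gn6i specifications can be used for sizing, the following Python sketch picks the smallest gn6i instance type that meets a GPU, vCPU, and memory requirement. The rows are copied from the gn6i table above. The `InstanceType` tuple and the `smallest_fit` helper are hypothetical names introduced for this example and are not part of any Alibaba Cloud SDK.

```python
from typing import NamedTuple, Optional, Sequence


class InstanceType(NamedTuple):
    name: str
    vcpus: int
    memory_gib: int
    gpus: int
    bandwidth_gbit: float  # network baseline bandwidth in Gbit/s


# Rows copied from the gn6i table in this topic.
GN6I = [
    InstanceType("ecs.gn6i-c4g1.xlarge", 4, 15, 1, 4),
    InstanceType("ecs.gn6i-c8g1.2xlarge", 8, 31, 1, 5),
    InstanceType("ecs.gn6i-c16g1.4xlarge", 16, 62, 1, 6),
    InstanceType("ecs.gn6i-c24g1.6xlarge", 24, 93, 1, 7.5),
    InstanceType("ecs.gn6i-c40g1.10xlarge", 40, 155, 1, 10),
    InstanceType("ecs.gn6i-c24g1.12xlarge", 48, 186, 2, 15),
    InstanceType("ecs.gn6i-c24g1.24xlarge", 96, 372, 4, 30),
]


def smallest_fit(
    types: Sequence[InstanceType],
    min_gpus: int = 1,
    min_vcpus: int = 1,
    min_memory_gib: int = 0,
) -> Optional[InstanceType]:
    """Return the smallest type that satisfies all minimums, or None."""
    candidates = [
        t
        for t in types
        if t.gpus >= min_gpus
        and t.vcpus >= min_vcpus
        and t.memory_gib >= min_memory_gib
    ]
    # "Smallest" here means fewest GPUs first, then fewest vCPUs.
    return min(candidates, key=lambda t: (t.gpus, t.vcpus), default=None)


choice = smallest_fit(GN6I, min_gpus=2)
print(choice.name if choice else "no gn6i type fits")
```

The ordering used for "smallest" is a design choice for this sketch. Price and regional availability are not modeled here and should be checked separately.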

gn6e, GPU-accelerated compute-optimized instance family

  • Supported scenarios:

    • Deep learning applications, such as training and inference applications of AI algorithms used in image classification, autonomous vehicles, and speech recognition

    • Scientific computing applications, such as computational fluid dynamics, computational finance, molecular dynamics, and environmental analytics

  • Compute:

    • Uses NVIDIA V100 GPUs that each have 32 GB of GPU memory and support NVLink.

    • Uses NVIDIA V100 GPUs (SXM2-based) that have the following features:

      • Innovative NVIDIA Volta architecture

      • 32 GB of HBM2 memory (900 GB/s bandwidth) per GPU

      • 5,120 CUDA cores per GPU

      • 640 Tensor cores per GPU

      • Up to six NVLink bidirectional connections per GPU, each of which provides a bandwidth of 25 GB/s in each direction for a total bandwidth of 300 GB/s (6 × 25 × 2 = 300)

    • Offers a CPU-to-memory ratio of 1:8.

    • Uses 2.5 GHz Intel® Xeon® Platinum 8163 (Skylake) processors.

  • Storage:

    • All instances in this family are I/O optimized.

    • Supports ESSDs, ESSD AutoPL disks, standard SSDs, and ultra disks.

  • Network:

    • Supports IPv4 and IPv6. For information about IPv6 communication, see IPv6 communication.

    • Provides high network performance based on large computing capacity.
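
The aggregate NVLink figure in the Compute list above is the product of the link count, the per-link bandwidth in one direction, and the two directions. The following minimal Python sketch reproduces that arithmetic. The function name is hypothetical, and the values are in the same bandwidth unit that the list quotes.

```python
def nvlink_aggregate_bandwidth(
    links: int, per_link_per_direction: float, directions: int = 2
) -> float:
    """Total bandwidth across all links, counting both directions."""
    return links * per_link_per_direction * directions


# Six bidirectional links at 25 per direction, as stated in the list above.
total = nvlink_aggregate_bandwidth(links=6, per_link_per_direction=25)
print(total)  # 300
```

The same function shows how the aggregate shrinks if fewer links are usable, for example four links instead of six.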

The gn6e instance family includes the following instance types: ecs.gn6e-c12g1.3xlarge, ecs.gn6e-c12g1.6xlarge, ecs.gn6e-c12g1.12xlarge, and ecs.gn6e-c12g1.24xlarge. The following table describes the specifications of each instance type in this instance family. For information about the metrics of instance types, see Instance type metrics.

Instance types

| Instance type | vCPUs | Memory (GiB) | GPUs | GPU memory | Network baseline bandwidth (Gbit/s) | Packet forwarding rate (pps) | NIC queues | ENIs | Private IPv4 addresses per ENI | IPv6 addresses per ENI |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ecs.gn6e-c12g1.3xlarge | 12 | 92 | NVIDIA V100 * 1 | 32GB * 1 | 5 | 800,000 | 8 | 6 | 10 | 1 |
| ecs.gn6e-c12g1.6xlarge | 24 | 182 | NVIDIA V100 * 2 | 32GB * 2 | 8 | 1,200,000 | 8 | 8 | 20 | 1 |
| ecs.gn6e-c12g1.12xlarge | 48 | 368 | NVIDIA V100 * 4 | 32GB * 4 | 16 | 2,400,000 | 8 | 8 | 20 | 1 |
| ecs.gn6e-c12g1.24xlarge | 96 | 736 | NVIDIA V100 * 8 | 32GB * 8 | 32 | 4,800,000 | 16 | 8 | 20 | 1 |
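
The GPU memory column gives per-GPU memory multiplied by the GPU count. The following Python sketch, with hypothetical helper names, parses cells of that form and computes the total GPU memory for each gn6e instance type. The cell values are copied from the gn6e table above.

```python
import re


def total_gpu_memory_gb(cell: str) -> int:
    """Parse a 'GPU memory' cell such as '32GB * 8' and return the total in GB."""
    match = re.fullmatch(r"\s*(\d+)\s*GB\s*\*\s*(\d+)\s*", cell)
    if match is None:
        raise ValueError(f"unrecognized GPU memory cell: {cell!r}")
    per_gpu_gb, gpu_count = map(int, match.groups())
    return per_gpu_gb * gpu_count


# 'GPU memory' cells copied from the gn6e table in this topic.
GN6E_GPU_MEMORY = {
    "ecs.gn6e-c12g1.3xlarge": "32GB * 1",
    "ecs.gn6e-c12g1.6xlarge": "32GB * 2",
    "ecs.gn6e-c12g1.12xlarge": "32GB * 4",
    "ecs.gn6e-c12g1.24xlarge": "32GB * 8",
}

totals = {name: total_gpu_memory_gb(cell) for name, cell in GN6E_GPU_MEMORY.items()}
for name, total in totals.items():
    print(f"{name}: {total} GB total GPU memory")
```

The total is a sum across separate GPUs. Whether a single workload can use all of it depends on how the application distributes work across GPUs.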

gn6v, GPU-accelerated compute-optimized instance family

  • Supported scenarios:

    • Deep learning applications, such as training and inference applications of AI algorithms used in image classification, autonomous vehicles, and speech recognition

    • Scientific computing applications, such as computational fluid dynamics, computational finance, molecular dynamics, and environmental analytics

  • Compute:

    • Uses NVIDIA V100 GPUs.

    • Uses NVIDIA V100 GPUs (SXM2-based) that have the following features:

      • Innovative NVIDIA Volta architecture

      • 16 GB of HBM2 memory (900 GB/s bandwidth) per GPU

      • 5,120 CUDA cores per GPU

      • 640 Tensor cores per GPU

      • Up to six NVLink bidirectional connections per GPU, each of which provides a bandwidth of 25 GB/s in each direction for a total bandwidth of 300 GB/s (6 × 25 × 2 = 300)

    • Offers a CPU-to-memory ratio of 1:4.

    • Uses 2.5 GHz Intel® Xeon® Platinum 8163 (Skylake) processors.

  • Storage:

    • All instances in this family are I/O optimized.

    • Supports ESSDs, ESSD AutoPL disks, standard SSDs, and ultra disks.

  • Network:

    • Supports IPv4 and IPv6. For information about IPv6 communication, see IPv6 communication.

    • Provides high network performance based on large computing capacity.

The gn6v instance family includes the following instance types: ecs.gn6v-c8g1.2xlarge, ecs.gn6v-c8g1.4xlarge, ecs.gn6v-c8g1.8xlarge, ecs.gn6v-c8g1.16xlarge, and ecs.gn6v-c10g1.20xlarge. The following table describes the specifications of each instance type in this instance family. For information about the metrics of instance types, see Instance type metrics.

Instance types

| Instance type | vCPUs | Memory (GiB) | GPUs | GPU memory | Network baseline bandwidth (Gbit/s) | Packet forwarding rate (pps) | Disk baseline IOPS | NIC queues | ENIs | Private IPv4 addresses per ENI | IPv6 addresses per ENI |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ecs.gn6v-c8g1.2xlarge | 8 | 32 | NVIDIA V100 * 1 | 16GB * 1 | 2.5 | 800,000 | None | 4 | 4 | 10 | 1 |
| ecs.gn6v-c8g1.4xlarge | 16 | 64 | NVIDIA V100 * 2 | 16GB * 2 | 5 | 1,000,000 | None | 4 | 8 | 20 | 1 |
| ecs.gn6v-c8g1.8xlarge | 32 | 128 | NVIDIA V100 * 4 | 16GB * 4 | 10 | 2,000,000 | None | 8 | 8 | 20 | 1 |
| ecs.gn6v-c8g1.16xlarge | 64 | 256 | NVIDIA V100 * 8 | 16GB * 8 | 20 | 2,500,000 | None | 16 | 8 | 20 | 1 |
| ecs.gn6v-c10g1.20xlarge | 82 | 336 | NVIDIA V100 * 8 | 16GB * 8 | 32 | 4,500,000 | 250,000 | 16 | 8 | 20 | 1 |

gn5, GPU-accelerated compute-optimized instance family

  • Supported scenarios:

    • Deep learning

    • Scientific computing applications, such as computational fluid dynamics, computational finance, genomics, and environmental analytics

    • Server-side GPU compute workloads, such as high-performance computing, rendering, and multimedia encoding and decoding

  • Compute:

    • Uses NVIDIA P100 GPUs.

    • Offers multiple CPU-to-memory ratios.

    • Uses 2.5 GHz Intel® Xeon® E5-2682 v4 (Broadwell) processors.

  • Storage:

    • Supports high-performance local Non-Volatile Memory Express (NVMe) SSDs.

    • All instances in this family are I/O optimized.

    • Supports standard SSDs and ultra disks.

  • Network:

    • Supports only IPv4.

    • Provides high network performance based on large computing capacity.

The gn5 instance family includes the following instance types: ecs.gn5-c4g1.xlarge, ecs.gn5-c8g1.2xlarge, ecs.gn5-c4g1.2xlarge, ecs.gn5-c8g1.4xlarge, ecs.gn5-c28g1.7xlarge, ecs.gn5-c8g1.8xlarge, ecs.gn5-c28g1.14xlarge, and ecs.gn5-c8g1.14xlarge. The following table describes the specifications of each instance type in this instance family. For information about the metrics of instance types, see Instance type metrics.

Instance types

| Instance type | vCPUs | Memory (GiB) | Local storage (GiB) | GPUs | GPU memory | Network baseline bandwidth (Gbit/s) | Packet forwarding rate (pps) | NIC queues | ENIs | Private IPv4 addresses per ENI |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ecs.gn5-c4g1.xlarge | 4 | 30 | 440 | NVIDIA P100 * 1 | 16GB * 1 | 3 | 300,000 | 1 | 3 | 10 |
| ecs.gn5-c8g1.2xlarge | 8 | 60 | 440 | NVIDIA P100 * 1 | 16GB * 1 | 3 | 400,000 | 1 | 4 | 10 |
| ecs.gn5-c4g1.2xlarge | 8 | 60 | 880 | NVIDIA P100 * 2 | 16GB * 2 | 5 | 1,000,000 | 2 | 4 | 10 |
| ecs.gn5-c8g1.4xlarge | 16 | 120 | 880 | NVIDIA P100 * 2 | 16GB * 2 | 5 | 1,000,000 | 4 | 8 | 20 |
| ecs.gn5-c28g1.7xlarge | 28 | 112 | 440 | NVIDIA P100 * 1 | 16GB * 1 | 5 | 1,000,000 | 8 | 8 | 20 |
| ecs.gn5-c8g1.8xlarge | 32 | 240 | 1,760 | NVIDIA P100 * 4 | 16GB * 4 | 10 | 2,000,000 | 8 | 8 | 20 |
| ecs.gn5-c28g1.14xlarge | 56 | 224 | 880 | NVIDIA P100 * 2 | 16GB * 2 | 10 | 2,000,000 | 14 | 8 | 20 |
| ecs.gn5-c8g1.14xlarge | 54 | 480 | 3,520 | NVIDIA P100 * 8 | 16GB * 8 | 25 | 4,000,000 | 14 | 8 | 20 |
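
The gn5 family is described as offering multiple CPU-to-memory ratios. The following Python sketch makes that visible by dividing the memory of each instance type by its vCPU count, using the values from the gn5 table above. The dictionary and function names are hypothetical and exist only for this illustration.

```python
# (vCPUs, memory in GiB) copied from the gn5 table in this topic.
GN5 = {
    "ecs.gn5-c4g1.xlarge": (4, 30),
    "ecs.gn5-c8g1.2xlarge": (8, 60),
    "ecs.gn5-c4g1.2xlarge": (8, 60),
    "ecs.gn5-c8g1.4xlarge": (16, 120),
    "ecs.gn5-c28g1.7xlarge": (28, 112),
    "ecs.gn5-c8g1.8xlarge": (32, 240),
    "ecs.gn5-c28g1.14xlarge": (56, 224),
    "ecs.gn5-c8g1.14xlarge": (54, 480),
}


def memory_per_vcpu(vcpus: int, memory_gib: int) -> float:
    """Memory in GiB available per vCPU, rounded to two decimals."""
    return round(memory_gib / vcpus, 2)


ratios = {name: memory_per_vcpu(*spec) for name, spec in GN5.items()}
for name, ratio in ratios.items():
    print(f"{name}: {ratio} GiB per vCPU")

# Distinct ratios found across the family.
print(sorted(set(ratios.values())))
```

Grouping instance types by this ratio can help when a workload is bound by host memory rather than by GPU count.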

gn5i, GPU-accelerated compute-optimized instance family

  • Supported scenarios: server-side GPU compute workloads, such as deep learning inference and multimedia encoding and decoding.

  • Compute:

    • Uses NVIDIA P4 GPUs.

    • Offers a CPU-to-memory ratio of 1:4.

    • Uses 2.5 GHz Intel® Xeon® E5-2682 v4 (Broadwell) processors.

  • Storage:

    • All instances in this family are I/O optimized.

    • Supports standard SSDs and ultra disks.

  • Network:

    • Supports IPv4 and IPv6. For information about IPv6 communication, see IPv6 communication.

    • Provides high network performance based on large computing capacity.

The gn5i instance family includes the following instance types: ecs.gn5i-c2g1.large, ecs.gn5i-c4g1.xlarge, ecs.gn5i-c8g1.2xlarge, ecs.gn5i-c16g1.4xlarge, ecs.gn5i-c16g1.8xlarge, and ecs.gn5i-c28g1.14xlarge. The following table describes the specifications of each instance type in this instance family. For information about the metrics of instance types, see Instance type metrics.

Instance types

| Instance type | vCPUs | Memory (GiB) | GPUs | GPU memory | Network baseline bandwidth (Gbit/s) | Packet forwarding rate (pps) | NIC queues | ENIs | Private IPv4 addresses per ENI | IPv6 addresses per ENI |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ecs.gn5i-c2g1.large | 2 | 8 | NVIDIA P4 * 1 | 8GB * 1 | 1 | 100,000 | 2 | 2 | 6 | 1 |
| ecs.gn5i-c4g1.xlarge | 4 | 16 | NVIDIA P4 * 1 | 8GB * 1 | 1.5 | 200,000 | 2 | 3 | 10 | 1 |
| ecs.gn5i-c8g1.2xlarge | 8 | 32 | NVIDIA P4 * 1 | 8GB * 1 | 2 | 400,000 | 4 | 4 | 10 | 1 |
| ecs.gn5i-c16g1.4xlarge | 16 | 64 | NVIDIA P4 * 1 | 8GB * 1 | 3 | 800,000 | 4 | 8 | 20 | 1 |
| ecs.gn5i-c16g1.8xlarge | 32 | 128 | NVIDIA P4 * 2 | 8GB * 2 | 6 | 1,200,000 | 8 | 8 | 20 | 1 |
| ecs.gn5i-c28g1.14xlarge | 56 | 224 | NVIDIA P4 * 2 | 8GB * 2 | 10 | 2,000,000 | 14 | 8 | 20 | 1 |
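
The ENIs and Private IPv4 addresses per ENI columns can be combined into a rough upper bound on the private IPv4 addresses that an instance can hold. The following Python sketch does this for the gn5i family, using values copied from the gn5i table above. It assumes that every ENI can be filled to its per-ENI limit. The names are hypothetical, and account-level quotas or other limits that are not covered in this topic may lower the real figure.

```python
# (ENIs, private IPv4 addresses per ENI) copied from the gn5i table in this topic.
GN5I_ENI_LIMITS = {
    "ecs.gn5i-c2g1.large": (2, 6),
    "ecs.gn5i-c4g1.xlarge": (3, 10),
    "ecs.gn5i-c8g1.2xlarge": (4, 10),
    "ecs.gn5i-c16g1.4xlarge": (8, 20),
    "ecs.gn5i-c16g1.8xlarge": (8, 20),
    "ecs.gn5i-c28g1.14xlarge": (8, 20),
}


def max_private_ipv4(enis: int, ipv4_per_eni: int) -> int:
    """Upper bound, assuming every ENI is filled to its per-ENI limit."""
    return enis * ipv4_per_eni


upper_bounds = {
    name: max_private_ipv4(enis, per_eni)
    for name, (enis, per_eni) in GN5I_ENI_LIMITS.items()
}
for name, bound in upper_bounds.items():
    print(f"{name}: up to {bound} private IPv4 addresses")
```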