Alibaba Cloud Elastic GPU Service instance families (gn, vgn, and sgn series)

Elastic GPU Service provides on-demand GPU-accelerated computing capabilities with auto scaling. As part of the Alibaba Cloud elastic computing family, Elastic GPU Service combines GPU and CPU computing power to meet your requirements in scenarios such as artificial intelligence (AI), high-performance computing (HPC), and professional graphics and image processing.

Note

View instance availability by region: Instance types may vary by region. We recommend that you check the purchase availability in each region.
View instance type selection guide: First, determine which instance families are suitable for your business scenario. Then, use this topic to select a specific instance type.
View instance metric descriptions: Read this topic to understand the metrics for instance types.
Use the ECS Price Calculator: You can use the price calculator to estimate instance fees.

vGPU-accelerated	GPU-accelerated compute-optimized	Not recommended (If the following instance families are sold out, use the instance families in the preceding columns)
sgn8ia, vGPU-accelerated instance family; sgn7i-vws, vGPU-accelerated instance family with shared CPUs; vgn7i-vws, vGPU-accelerated instance family; vgn6i-vws, vGPU-accelerated instance family	gn8v and gn8v-tee, GPU-accelerated compute-optimized instance families; gn8is, GPU-accelerated compute-optimized instance family; gn7e, GPU-accelerated compute-optimized instance family; gn7i, GPU-accelerated compute-optimized instance family; gn7, GPU-accelerated compute-optimized instance family; gn6i, GPU-accelerated compute-optimized instance family; gn6e, GPU-accelerated compute-optimized instance family; gn6v, GPU-accelerated compute-optimized instance family	gn7s, GPU-accelerated compute-optimized instance family; gn5, GPU-accelerated compute-optimized instance family; gn5i, GPU-accelerated compute-optimized instance family

sgn8ia, vGPU-accelerated instance family

Introduction:
- Powered by the third-generation SHENLONG architecture to provide stable and predictable high performance. Chip-level acceleration significantly improves storage performance, network performance, and computing stability. This lets you store data and load models faster.
- Includes the NVIDIA GRID Virtual Workstation (vWS) software license. This provides certified graphics acceleration for various professional computer-aided design (CAD) applications to meet professional graphic design requirements. The instances can also be used as lightweight GPU-accelerated compute-optimized instances to reduce the costs of small-scale AI inference.
Use cases:
- Concurrent AI inference tasks that require high-performance CPUs, memory, and GPUs, such as image recognition, speech recognition, and behavior identification
- Compute-intensive graphics processing tasks that require high-performance 3D graphics virtualization capabilities, such as remote graphic design and cloud gaming
- 3D modeling in fields that require the use of AMD Genoa processors with high clock speeds, such as animation and film production, cloud gaming, and mechanical design
Compute:
- Uses NVIDIA Lovelace GPUs that have the following features:
  - Large GPU memory and multiple GPU slicing solutions
  - Support for acceleration features, such as vGPU, RTX, and TensorRT, to provide diversified business support
- Uses AMD Genoa processors that deliver a clock speed of 3.4 GHz to 3.75 GHz to provide high computing power for 3D modeling.
Storage:
- Is an instance family in which all instances are I/O optimized.
- These instances support the NVMe protocol. For more information, see Overview of the NVMe protocol.
- Supported disk types: ESSDs, ESSD AutoPL disks, and Regional ESSDs. For more information about disks, see Block storage overview.
Network:
- Supports IPv4 and IPv6. For more information about IPv6 communication, see IPv6 communication.
- Network performance increases with the instance size.

sgn8ia instance types

Instance type	vCPUs	Memory (GiB)	GPU memory	Network baseline bandwidth (Gbit/s)	Packet forwarding rate (pps)	NIC queues	ENIs	Private IPv4/IPv6 addresses per ENI	Maximum disks	Disk baseline IOPS	Disk baseline BPS (MB/s)
ecs.sgn8ia-m2.xlarge	4	16	2 GB	2.5	1,000,000	4	4	15/15	9	30,000	244
ecs.sgn8ia-m4.2xlarge	8	32	4 GB	4	1,600,000	8	4	15/15	9	45,000	305
ecs.sgn8ia-m8.4xlarge	16	64	8 GB	7	2,000,000	16	8	30/30	17	60,000	427
ecs.sgn8ia-m16.8xlarge	32	128	16 GB	10	3,000,000	32	8	30/30	33	80,000	610
ecs.sgn8ia-m24.12xlarge	48	192	24 GB	16	4,500,000	48	8	30/30	33	120,000	1000
ecs.sgn8ia-m48.24xlarge	96	384	48 GB	32	9,000,000	64	15	30/30	33	240,000	2000

Note

The columns related to GPUs in the preceding table are for vGPUs that are sliced by using the vGPU slicing technology.
The memory and GPU memory of an sgn8ia instance are exclusive to the instance. The CPUs of the instance are shared resources with an overcommit ratio of approximately 1:1.5. If you have special requirements for the CPU computing power, we recommend that you use GPU-accelerated dedicated instance families, such as gn7i GPU-accelerated compute-optimized instances.
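As an illustration of how the sgn8ia table can be used for sizing, the following minimal sketch picks the smallest instance type that meets a vCPU, memory, and GPU memory requirement. The `SGN8IA_TYPES` list copies three columns from the table. The `pick_instance_type` helper and its parameters are hypothetical names used only for this example and are not part of any Alibaba Cloud SDK. The sketch does not account for the shared-CPU overcommit ratio described in the note.

```python
from typing import NamedTuple, Optional


class InstanceType(NamedTuple):
    name: str
    vcpus: int
    memory_gib: int
    gpu_memory_gb: int


# Values copied from the sgn8ia table (vCPUs, memory, and vGPU memory columns),
# ordered from smallest to largest.
SGN8IA_TYPES = [
    InstanceType("ecs.sgn8ia-m2.xlarge", 4, 16, 2),
    InstanceType("ecs.sgn8ia-m4.2xlarge", 8, 32, 4),
    InstanceType("ecs.sgn8ia-m8.4xlarge", 16, 64, 8),
    InstanceType("ecs.sgn8ia-m16.8xlarge", 32, 128, 16),
    InstanceType("ecs.sgn8ia-m24.12xlarge", 48, 192, 24),
    InstanceType("ecs.sgn8ia-m48.24xlarge", 96, 384, 48),
]


def pick_instance_type(
    min_vcpus: int, min_memory_gib: int, min_gpu_memory_gb: int
) -> Optional[InstanceType]:
    """Return the smallest type that meets all three minimums, or None."""
    for candidate in SGN8IA_TYPES:
        if (
            candidate.vcpus >= min_vcpus
            and candidate.memory_gib >= min_memory_gib
            and candidate.gpu_memory_gb >= min_gpu_memory_gb
        ):
            return candidate
    return None


# Example: a workload that needs 8 vCPUs, 32 GiB of memory, and 6 GB of GPU memory.
choice = pick_instance_type(8, 32, 6)
print(choice.name if choice else "no matching sgn8ia instance type")
```

In this example the GPU memory requirement, not the vCPU count, determines the result, because the 8-vCPU type provides only 4 GB of GPU memory.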

sgn7i-vws, vGPU-accelerated instance family with shared CPUs

Introduction:
- This instance family uses the third-generation SHENLONG architecture to provide predictable and consistent ultra-high performance. This instance family utilizes fast path acceleration on chips to improve storage performance, network performance, and computing stability by an order of magnitude. This way, data storage and model loading can be performed more quickly.
- Instances of this instance family share CPU and network resources to maximize the utilization of underlying resources. Each instance has exclusive access to its memory and GPU memory to provide data isolation and performance assurance.
  Note
  If you want to use exclusive CPU resources, select the vgn7i-vws instance family.
- This instance family comes with an NVIDIA GRID vWS license and provides certified graphics acceleration capabilities for CAD software to meet the requirements of professional graphic design. Instances of this instance family can serve as lightweight GPU-accelerated compute-optimized instances to reduce the costs of small-scale AI inference tasks.
Use cases:
- Concurrent AI inference tasks that require high-performance CPUs, memory, and GPUs, such as image recognition, speech recognition, and behavior identification
- Compute-intensive graphics processing tasks that require high-performance 3D graphics virtualization capabilities, such as remote graphic design and cloud gaming
- 3D modeling in fields that require the use of Ice Lake processors, such as animation and film production, cloud gaming, and mechanical design
Compute:
- Uses NVIDIA A10 GPUs that have the following features:
  - Innovative NVIDIA Ampere architecture
  - Support for acceleration features, such as vGPU, RTX, and TensorRT, to provide diversified business support
- Uses 2.9 GHz Intel® Xeon® Scalable (Ice Lake) processors that deliver an all-core turbo frequency of 3.5 GHz.
Storage:
- Is an instance family in which all instances are I/O optimized.
- Supported disk types: ESSDs, ESSD AutoPL disks, and Regional ESSDs. For more information, see Elastic Block Storage Overview.
Network:
- Supports IPv4 and IPv6. For information about IPv6 communication, see IPv6 communication.
- Provides high network performance based on large computing capacity.

sgn7i-vws includes the instance types and metric data listed in the following table.

Instance type	vCPUs	Memory (GiB)	GPUs	GPU memory	Network baseline/burst bandwidth (Gbit/s)	Packet forwarding rate (pps)	NIC queues	ENIs	Private IPv4 addresses per ENI	IPv6 addresses per ENI
ecs.sgn7i-vws-m2.xlarge	4	15.5	NVIDIA A10 × 1/12	24 GB × 1/12	1.5/5	500,000	4	2	2	1
ecs.sgn7i-vws-m4.2xlarge	8	31	NVIDIA A10 × 1/6	24 GB × 1/6	2.6/10	1,000,000	4	4	6	1
ecs.sgn7i-vws-m8.4xlarge	16	62	NVIDIA A10 × 1/3	24 GB × 1/3	5/20	2,000,000	8	4	10	1
ecs.sgn7i-vws-m2s.xlarge	4	8	NVIDIA A10 × 1/12	24 GB × 1/12	1.5/5	500,000	4	2	2	1
ecs.sgn7i-vws-m4s.2xlarge	8	16	NVIDIA A10 × 1/6	24 GB × 1/6	2.6/10	1,000,000	4	4	6	1
ecs.sgn7i-vws-m8s.4xlarge	16	32	NVIDIA A10 × 1/3	24 GB × 1/3	5/20	2,000,000	8	4	10	1

Note

The GPU column in the preceding table indicates the GPU model and GPU slicing information for each instance type. Each GPU can be sliced into multiple GPU partitions, and each GPU partition can be allocated as a vGPU to an instance. Example:

NVIDIA A10 × 1/12: NVIDIA A10 is the GPU model. 1/12 indicates that a GPU is sliced into 12 GPU partitions, and each GPU partition can be allocated as a vGPU to an instance.
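The slicing notation can be turned into concrete numbers with simple arithmetic. The following minimal sketch computes the GPU memory of one vGPU and the number of partitions per physical GPU from a value such as 24 GB × 1/12. The function names are hypothetical and exist only for this illustration.

```python
from fractions import Fraction


def vgpu_memory_gb(gpu_memory_gb: int, slice_fraction: str) -> Fraction:
    """Return the GPU memory of one vGPU, for example 24 GB with slice "1/12"."""
    return gpu_memory_gb * Fraction(slice_fraction)


def partitions_per_gpu(slice_fraction: str) -> int:
    """Return how many vGPUs of this size one physical GPU is sliced into."""
    return int(1 / Fraction(slice_fraction))


# Slices used by the sgn7i-vws instance types in the preceding table.
for fraction in ("1/12", "1/6", "1/3"):
    print(
        fraction,
        "->", vgpu_memory_gb(24, fraction), "GB per vGPU,",
        partitions_per_gpu(fraction), "partitions per GPU",
    )
```

Using `Fraction` keeps the arithmetic exact, so 24 GB × 1/12 evaluates to exactly 2 GB rather than a rounded floating-point value.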

vgn7i-vws, vGPU-accelerated instance family

Introduction:
- This instance family uses the third-generation SHENLONG architecture to provide predictable and consistent ultra-high performance. This instance family utilizes fast path acceleration on chips to improve storage performance, network performance, and computing stability by an order of magnitude. This way, data storage and model loading can be performed more quickly.
- This instance family comes with an NVIDIA GRID vWS license and provides certified graphics acceleration capabilities for CAD software to meet the requirements of professional graphic design. Instances of this instance family can serve as lightweight GPU-accelerated compute-optimized instances to reduce the costs of small-scale AI inference tasks.
Use cases:
- Concurrent AI inference tasks that require high-performance CPUs, memory, and GPUs, such as image recognition, speech recognition, and behavior identification
- Compute-intensive graphics processing tasks that require high-performance 3D graphics virtualization capabilities, such as remote graphic design and cloud gaming
- 3D modeling in fields that require the use of Ice Lake processors, such as animation and film production, cloud gaming, and mechanical design
Compute:
- Uses NVIDIA A10 GPUs that have the following features:
  - Innovative NVIDIA Ampere architecture
  - Support for acceleration features, such as vGPU, RTX, and TensorRT, to provide diversified business support
- Uses 2.9 GHz Intel® Xeon® Scalable (Ice Lake) processors that deliver an all-core turbo frequency of 3.5 GHz.
Storage:
- Is an instance family in which all instances are I/O optimized.
- Supported disk types: ESSDs, ESSD AutoPL disks, and Regional ESSDs. For more information, see Elastic Block Storage Overview.
Network:
- Supports IPv4 and IPv6. For information about IPv6 communication, see IPv6 communication.
- Provides high network performance based on large computing capacity.

vgn7i-vws includes the instance types and metric data listed in the following table.

Instance type	vCPUs	Memory (GiB)	GPUs	GPU memory	Network baseline bandwidth (Gbit/s)	Packet forwarding rate (pps)	NIC queues	ENIs	Private IPv4 addresses per ENI	IPv6 addresses per ENI
ecs.vgn7i-vws-m4.xlarge	4	30	NVIDIA A10 × 1/6	24 GB × 1/6	3	1,000,000	4	4	10	1
ecs.vgn7i-vws-m8.2xlarge	10	62	NVIDIA A10 × 1/3	24 GB × 1/3	5	2,000,000	8	6	10	1
ecs.vgn7i-vws-m12.3xlarge	14	93	NVIDIA A10 × 1/2	24 GB × 1/2	8	3,000,000	8	6	15	1
ecs.vgn7i-vws-m24.7xlarge	30	186	NVIDIA A10 × 1	24 GB × 1	16	6,000,000	12	8	30	1

Note

NVIDIA A10 × 1/6: NVIDIA A10 is the GPU model. 1/6 indicates that a GPU is sliced into six GPU partitions, and each GPU partition can be allocated as a vGPU to an instance.

vgn6i-vws, vGPU-accelerated instance family

Important

Following the NVIDIA GRID driver upgrade, Alibaba Cloud has upgraded the vgn6i instance family to the vgn6i-vws instance family. The vgn6i-vws instance family uses the latest NVIDIA GRID driver and provides an NVIDIA GRID vWS license. To apply for free images in which the NVIDIA GRID driver is pre-installed, submit a ticket.
To use other public images or custom images that do not contain an NVIDIA GRID driver, submit a ticket to apply for the GRID driver file, and then install the driver. Alibaba Cloud does not charge additional license fees for the GRID driver.

Use cases:
- Real-time rendering for cloud gaming
- Real-time rendering for Augmented Reality (AR) and Virtual Reality (VR) applications
- AI (deep learning and machine learning) inference for elastic Internet service deployment
- Teaching environments for deep learning
- Model experimentation environments for deep learning
Compute:
- Uses NVIDIA T4 GPUs.
- Uses vGPUs.
  - Supports the 1/4 and 1/2 compute capacity of NVIDIA Tesla T4 GPUs.
  - Supports 4 GB and 8 GB of GPU memory.
- Offers a CPU-to-memory ratio of 1:5.
- Uses 2.5 GHz Intel® Xeon® Platinum 8163 (Skylake) processors.
Storage:
- Is an instance family in which all instances are I/O optimized.
- Supported disk types: ESSDs, ESSD AutoPL disks, Regional ESSDs, standard SSD, and ultra disk. For more information, see Elastic Block Storage Overview.
Network:
- Supports IPv4 and IPv6. For information about IPv6 communication, see IPv6 communication.
- Provides high network performance based on large computing capacity.

vgn6i-vws includes the instance types and metric data listed in the following table.

Instance type	vCPUs	Memory (GiB)	GPUs	GPU memory	Network baseline bandwidth (Gbit/s)	Packet forwarding rate (pps)	NIC queues	ENIs	Private IPv4 addresses per ENI	IPv6 addresses per ENI
ecs.vgn6i-m4-vws.xlarge	4	23	NVIDIA T4 × 1/4	16 GB × 1/4	2	500,000	4/2	3	10	1
ecs.vgn6i-m8-vws.2xlarge	10	46	NVIDIA T4 × 1/2	16 GB × 1/2	4	800,000	8/2	4	10	1
ecs.vgn6i-m16-vws.5xlarge	20	92	NVIDIA T4 × 1	16 GB × 1	7.5	1,200,000	6	4	10	1

Note

NVIDIA T4 × 1/4: NVIDIA T4 is the GPU model. 1/4 indicates that a GPU is sliced into four GPU partitions, and each GPU partition can be allocated as a vGPU to an instance.

gn8v and gn8v-tee, GPU-accelerated compute-optimized instance families

The gn8v and gn8v-tee instance families are available only in specific regions, including regions outside China. To use the instance families, contact Alibaba Cloud sales personnel.

Introduction:
- gn8v: This is an eighth-generation GPU-accelerated compute-optimized instance family that Alibaba Cloud provides for AI model training and for inference with ultra-large language models (LLMs). The family consists of multiple instance types that provide one, two, four, or eight GPUs per instance.
- gn8v-tee: To meet security requirements for LLM training and inference, Alibaba Cloud provides this eighth-generation instance family, which is based on gn8v and adds the confidential computing feature. This instance family encrypts data during GPU computing to ensure user data security.
Use cases:
- Multi-GPU parallel inference computing for LLMs that have more than 70 billion parameters
- Traditional AI model training and autonomous driving training, for which each GPU delivers computing power of up to 39.5 TFLOPS in the single-precision floating-point format (FP32)
- Small and medium-sized model training scenarios that leverage the NVLink connections among the eight GPUs
Benefits and positioning:
- High-speed and large-capacity GPU memory: Each GPU is equipped with 96 GB of HBM3 memory and delivers up to 4 TB/s of memory bandwidth, which greatly accelerates model training and inference.
- High bandwidth between GPUs: Multiple GPUs are interconnected by using 900 GB/s NVLink connections. The efficiency of multi-GPU training and inference is much higher than that of previous generations of GPU-accelerated instances.
- Quantization of LLMs: This instance family supports computing power in the 8-bit floating point format (FP8) and optimizes computing power for large-scale parameter training and inference. This significantly improves the computing speed of training and inference and reduces memory usage.
- (Only for the gn8v-tee instance family) High security: The gn8v-tee instance family supports confidential computing capabilities that cover the full link of model inference tasks. The capabilities include CPU-based Intel Trust Domain Extensions (TDX) confidential computing and GPU-based NVIDIA Confidential Computing (CC). The confidential computing capabilities ensure the security of user inference data and enterprise models in model inference and training.
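To make the memory benefit of FP8 concrete, the following rough sketch estimates how much GPU memory the weights of a model occupy in FP16 and in FP8, and how many 96 GB GPUs are needed to hold them. It is a back-of-the-envelope calculation under stated assumptions: it counts weights only (2 bytes per parameter for FP16, 1 byte for FP8) and ignores the KV cache, activations, and framework overhead, so real deployments need additional headroom. The function names are hypothetical.

```python
import math

GPU_MEMORY_GB = 96  # per-GPU memory stated for gn8v in this topic

# Bytes needed to store one parameter in each format.
BYTES_PER_PARAM = {"fp16": 2, "fp8": 1}


def weight_memory_gb(num_params: int, dtype: str) -> float:
    """Return the memory (in GB, 1 GB = 1e9 bytes) needed for the weights only."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


def min_gpus_for_weights(
    num_params: int, dtype: str, gpu_memory_gb: int = GPU_MEMORY_GB
) -> int:
    """Return the minimum number of GPUs whose combined memory holds the weights."""
    return math.ceil(weight_memory_gb(num_params, dtype) / gpu_memory_gb)


params_70b = 70_000_000_000
for dtype in ("fp16", "fp8"):
    print(
        dtype,
        weight_memory_gb(params_70b, dtype), "GB of weights,",
        "at least", min_gpus_for_weights(params_70b, dtype), "GPU(s)",
    )
```

Under these assumptions, halving the bytes per parameter halves the weight footprint, which is why FP8 support reduces memory usage for large models.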
Compute:
- Uses the latest Cloud Infrastructure Processing Unit (CIPU) 1.0 processors.
  - Decouples computing capabilities from storage capabilities, allowing you to flexibly select storage resources based on your business requirements.
  - Provides bare metal capabilities to support peer-to-peer (P2P) communication between GPU-accelerated instances.
- Uses the 4th-generation Intel Xeon Scalable processors that deliver a base frequency of up to 2.8 GHz and an all-core turbo frequency of up to 3.1 GHz.
Storage:
- Is an instance family in which all instances are I/O optimized.
- These instances support the NVMe protocol. For more information, see Overview of the NVMe protocol.
- Supports elastic ephemeral disks, ESSDs, ESSD AutoPL disks, and Regional ESSDs. For information about disks, see Overview of Block Storage.
Network:
- Supports IPv4 and IPv6. For more information about IPv6 communication, see IPv6 communication.
- These instances support jumbo frames. For more information, see Jumbo frames.
- Provides ultra-high network performance with a packet forwarding rate of up to 30,000,000 pps (for instances equipped with eight GPUs).
- Supports elastic RDMA interfaces (ERIs).
  Note
  For information about how to use ERIs, see Enable eRDMA on an enterprise-level instance.
Security: The gn8v instance family supports the virtual Trusted Platform Module (vTPM) feature. The gn8v-tee instance family does not support this feature. For more information, see Overview.

gn8v includes the instance types and metric data listed in the following table.

Instance type	vCPUs	Memory (GiB)	GPU memory	Network baseline bandwidth (Gbit/s)	ENIs	NIC queues per primary ENI	Private IPv4 addresses per ENI	IPv6 addresses per ENI	Maximum cloud disks	Disk baseline IOPS	Disk baseline bandwidth (GB/s)
ecs.gn8v.4xlarge	16	96	96 GB × 1	12	8	16	30	30	17	100,000	0.75
ecs.gn8v.6xlarge	24	128	96 GB × 1	15	8	24	30	30	17	120,000	0.937
ecs.gn8v-2x.8xlarge	32	192	96 GB × 2	20	8	32	30	30	25	200,000	1.25
ecs.gn8v-4x.8xlarge	32	384	96 GB × 4	20	8	32	30	30	25	200,000	1.25
ecs.gn8v-2x.12xlarge	48	256	96 GB × 2	25	8	48	30	30	33	300,000	1.50
ecs.gn8v-8x.16xlarge	64	768	96 GB × 8	32	8	64	30	30	33	360,000	2.5
ecs.gn8v-4x.24xlarge	96	512	96 GB × 4	50	15	64	30	30	49	500,000	3
ecs.gn8v-8x.48xlarge	192	1024	96 GB × 8	100	15	64	50	50	65	1,000,000	6

gn8v-tee includes the instance types and metric data listed in the following table.

Instance type	vCPUs	Memory (GiB)	GPU memory	Network baseline bandwidth (Gbit/s)	ENIs	NIC queues per primary ENI	Private IPv4 addresses per ENI	IPv6 addresses per ENI	Maximum cloud disks	Disk baseline IOPS	Disk baseline bandwidth (GB/s)
ecs.gn8v-tee.4xlarge	16	96	96 GB × 1	12	8	16	30	30	17	100,000	0.75
ecs.gn8v-tee.6xlarge	24	128	96 GB × 1	15	8	24	30	30	17	120,000	0.937
ecs.gn8v-tee-8x.16xlarge	64	768	96 GB × 8	32	8	64	30	30	33	360,000	2.5
ecs.gn8v-tee-8x.48xlarge	192	1024	96 GB × 8	100	15	64	50	50	65	1,000,000	6

Note

The gn8v-tee instance family only supports Alibaba Cloud Linux 3 images. If you use a custom image built from Alibaba Cloud Linux 3 to create an instance, ensure that the kernel version is at least 5.10.134-18.
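The kernel requirement can be checked programmatically before you build a custom image. The following minimal sketch compares a kernel release string against the 5.10.134-18 minimum. It assumes that the release string starts with the pattern `major.minor.patch-build` (for example, the output of `uname -r`); the sample strings and function names are illustrative assumptions.

```python
import re

# Minimum kernel version required for gn8v-tee custom images, as stated above.
MIN_KERNEL = (5, 10, 134, 18)


def parse_kernel_release(release: str) -> tuple:
    """Extract (major, minor, patch, build) from a kernel release string."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)-(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return tuple(int(part) for part in match.groups())


def kernel_is_supported(release: str) -> bool:
    """Return True if the release is at least 5.10.134-18."""
    return parse_kernel_release(release) >= MIN_KERNEL


# Hypothetical release strings used only to exercise the check.
for sample in ("5.10.134-18.al8.x86_64", "5.10.134-16.al8.x86_64"):
    print(sample, "supported:", kernel_is_supported(sample))
```

Comparing integer tuples instead of raw strings avoids ordering mistakes such as treating build 9 as newer than build 18.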

gn8is, GPU-accelerated compute-optimized instance family

This instance family is available only in specific regions, including regions outside China. To use the instance family, contact Alibaba Cloud sales personnel.

Introduction: gn8is is the eighth-generation GPU-accelerated compute-optimized instance family from Alibaba Cloud, developed in response to the growth of AI-generated content (AIGC) services. It uses the latest NVIDIA L20 GPUs and provides 1-GPU, 2-GPU, 4-GPU, and 8-GPU instance types, along with instance types with different CPU-to-GPU ratios, to meet various application requirements.
Benefits and positioning:
- Graphics processing: This instance family uses 4th-generation Intel Xeon Scalable high-frequency processors to provide sufficient CPU computing power for 3D modeling scenarios, which makes graphics rendering and design smoother.
- Inference tasks: This instance family uses the new NVIDIA L20 GPU, which provides 48 GB of GPU memory per GPU to accelerate inference tasks. It supports the FP8 floating-point format and can be used with Container Service for Kubernetes (ACK) to flexibly support inference for various AIGC models. It is especially suitable for inference with LLMs that have fewer than 70 billion parameters.
Use cases:
- Animation, special effects for film and television, and rendering
- Generation of AIGC images and inference of LLMs
- Other general-purpose AI recognition, image recognition, and speech recognition scenarios

Compute:
- Uses the new NVIDIA L20 enterprise-grade GPUs that have the following features:
  - Support for acceleration features, such as TensorRT, and the FP8 floating-point format to improve LLM inference performance.
  - Up to 48 GB of memory per GPU and support for the inference of 70B or larger LLMs on a single instance with multiple GPUs.
  - Improved graphics processing capabilities. For example, after you install a GRID driver on a gn8is instance by using Cloud Assistant or an Alibaba Cloud Marketplace image, the instance can provide graphics processing performance twice that of a 7th-generation instance.

Key parameters of NVIDIA L20:

GPU architecture	GPU memory	Compute performance	Video encoding/decoding capabilities	Inter-card connection
NVIDIA Ada Lovelace	Capacity: 48 GB; Bandwidth: 864 GB/s	FP64: N/A; FP32: 59.3 TFLOPS; FP16/BF16: 119 TFLOPS; FP8/INT8: 237 TFLOPS	3 × Video Encoder (+AV1); 3 × Video Decoder; 4 × JPEG Decoder	PCIe interface: PCIe Gen4 x16; Bandwidth: 64 GB/s

- Uses the latest high-frequency Intel® Xeon® processors that deliver an all-core turbo frequency of 3.9 GHz to meet complex 3D modeling requirements.

Storage:
- Is an instance family in which all instances are I/O optimized.
- These instances support the NVMe protocol. For more information, see Overview of the NVMe protocol.
- Supports elastic ephemeral disks, Enterprise SSDs (ESSDs), ESSD AutoPL disks, and Regional ESSDs. For information about disks, see Overview of Block Storage.
Network:
- Supports IPv4 and IPv6. For more information about IPv6 communication, see IPv6 communication.
- Supports ERIs.
  Note
  For information about how to use ERIs, see Enable eRDMA on an enterprise-level instance.
Security: These instances support the vTPM feature. For more information, see Overview of trusted computing.

gn8is includes the instance types and metric data listed in the following table.

Instance type	vCPUs	Memory (GiB)	GPU	GPU memory	Network baseline bandwidth (Gbit/s)	ENIs	NIC queues per primary ENI	Private IPv4 addresses per ENI	IPv6 addresses per ENI	Maximum cloud disks	Disk baseline IOPS	Disk baseline bandwidth (GB/s)
ecs.gn8is.2xlarge	8	64	L20 × 1	48 GB × 1	8	4	8	15	15	17	60,000	0.75
ecs.gn8is.4xlarge	16	128	L20 × 1	48 GB × 1	16	8	16	30	30	17	120,000	1.25
ecs.gn8is-2x.8xlarge	32	256	L20 × 2	48 GB × 2	32	8	32	30	30	33	250,000	2
ecs.gn8is-4x.16xlarge	64	512	L20 × 4	48 GB × 4	64	8	64	30	30	33	450,000	4
ecs.gn8is-8x.32xlarge	128	1024	L20 × 8	48 GB × 8	100	15	64	50	50	65	900,000	8

gn7e, GPU-accelerated compute-optimized instance family

Features

Introduction:
- This instance family allows you to select instance types that provide different numbers of GPUs and CPUs to meet your business requirements in AI use cases.
- This instance family uses the third-generation SHENLONG architecture and doubles the average bandwidths of virtual private clouds (VPCs), networks, and disks compared with instance families of the previous generation.
Use cases:
- Small- and medium-scale AI training
- High-performance computing (HPC) business accelerated by using Compute Unified Device Architecture (CUDA)
- AI inference tasks that require high GPU processing capabilities or large amounts of GPU memory
- Deep learning applications, such as training applications of AI algorithms used in image classification, autonomous vehicles, and speech recognition
- Scientific computing applications that require robust GPU computing capabilities, such as computational fluid dynamics, computational finance, molecular dynamics, and environmental analytics
Important
When you use AI training services that feature a high communication load, such as transformer models, you must enable NVLink for GPU-to-GPU communication. Otherwise, data may be damaged due to unpredictable failures that are caused by large-scale data transmission over Peripheral Component Interconnect Express (PCIe) links. If you do not understand the topology of the communication links that are used for AI training services, submit a ticket to obtain technical support.
Storage:
- Is an instance family in which all instances are I/O optimized.
- Supported disk types: ESSDs, ESSD AutoPL disks, and Regional ESSDs. For more information, see Elastic Block Storage Overview.
Network:
- Supports IPv4 and IPv6. For information about IPv6 communication, see IPv6 communication.
- Provides high network performance based on large computing capacity.

gn7e includes the instance types and metric data listed in the following table.

Instance type	vCPUs	Memory (GiB)	GPU memory	Network baseline bandwidth (Gbit/s)	Packet forwarding rate (pps)	NIC queues	ENIs	Private IPv4 addresses per ENI	IPv6 addresses per ENI
ecs.gn7e-c16g1.4xlarge	16	125	80 GB × 1	8	3,000,000	8	8	10	1
ecs.gn7e-c16g1.8xlarge	32	250	80 GB × 2	16	6,000,000	16	8	10	1
ecs.gn7e-c16g1.16xlarge	64	500	80 GB × 4	32	12,000,000	32	8	10	1
ecs.gn7e-c16g1.32xlarge	128	1000	80 GB × 8	64	24,000,000	32	16	15	1
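When you compare instance types that provide different numbers of GPUs, it can help to normalize the table values per GPU. The following minimal sketch does this for the gn7e rows. The GPU count is taken from the multiplier in the GPU memory column, and the helper name is hypothetical.

```python
# (instance type, vCPUs, memory in GiB, GPU count, baseline bandwidth in Gbit/s)
# copied from the gn7e table; the GPU count is the multiplier in the
# GPU memory column (for example, 80 GB × 2 means two GPUs).
GN7E_ROWS = [
    ("ecs.gn7e-c16g1.4xlarge", 16, 125, 1, 8),
    ("ecs.gn7e-c16g1.8xlarge", 32, 250, 2, 16),
    ("ecs.gn7e-c16g1.16xlarge", 64, 500, 4, 32),
    ("ecs.gn7e-c16g1.32xlarge", 128, 1000, 8, 64),
]


def per_gpu_resources(vcpus: int, memory_gib: int, gpus: int, bandwidth_gbit: int):
    """Return (vCPUs, memory in GiB, bandwidth in Gbit/s) available per GPU."""
    return (vcpus / gpus, memory_gib / gpus, bandwidth_gbit / gpus)


for name, vcpus, memory_gib, gpus, bandwidth in GN7E_ROWS:
    print(name, per_gpu_resources(vcpus, memory_gib, gpus, bandwidth))
```

For the rows in this table, the per-GPU values are the same for every instance type, which indicates that vCPUs, memory, and baseline bandwidth scale linearly with the number of GPUs in this family.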

gn7i, GPU-accelerated compute-optimized instance family

Introduction: This instance family uses the third-generation SHENLONG architecture to provide predictable and consistent ultra-high performance. This instance family utilizes fast path acceleration on chips to improve storage performance, network performance, and computing stability by an order of magnitude.
Use cases:
- Concurrent AI inference tasks that require high-performance CPUs, memory, and GPUs, such as image recognition, speech recognition, and behavior identification
- Compute-intensive graphics processing tasks that require high-performance 3D graphics virtualization capabilities, such as remote graphic design and cloud gaming
Compute:
- Uses NVIDIA A10 GPUs that have the following features:
  - Innovative NVIDIA Ampere architecture
  - Support for acceleration features, such as RTX and TensorRT
- Uses 2.9 GHz Intel® Xeon® Scalable (Ice Lake) processors that deliver an all-core turbo frequency of 3.5 GHz.
- Provides up to 752 GiB of memory, which is much larger than the memory sizes of the gn6i instance family.
Storage:
- Is an instance family in which all instances are I/O optimized.
- Supported disk types: ESSDs, ESSD AutoPL disks, and Regional ESSDs. For more information, see Elastic Block Storage Overview.
Network:
- Supports IPv4 and IPv6. For information about IPv6 communication, see IPv6 communication.
- Provides high network performance based on large computing capacity.

gn7i includes the instance types and metric data listed in the following table.

Instance type	vCPUs	Memory (GiB)	GPUs	GPU memory	Network baseline bandwidth (Gbit/s)	Packet forwarding rate (pps)	NIC queues	ENIs	Private IPv4 addresses per ENI	IPv6 addresses per ENI
ecs.gn7i-c8g1.2xlarge	8	30	NVIDIA A10 × 1	24 GB × 1	16	1,600,000	8	4	15	15
ecs.gn7i-c16g1.4xlarge	16	60	NVIDIA A10 × 1	24 GB × 1	16	3,000,000	8	8	30	30
ecs.gn7i-c32g1.8xlarge	32	188	NVIDIA A10 × 1	24 GB × 1	16	6,000,000	12	8	30	30
ecs.gn7i-c32g1.16xlarge	64	376	NVIDIA A10 × 2	24 GB × 2	32	12,000,000	16	15	30	30
ecs.gn7i-c32g1.32xlarge	128	752	NVIDIA A10 × 4	24 GB × 4	64	24,000,000	32	15	30	30
ecs.gn7i-c48g1.12xlarge	48	310	NVIDIA A10 × 1	24 GB × 1	16	9,000,000	16	8	30	30
ecs.gn7i-c56g1.14xlarge	56	346	NVIDIA A10 × 1	24 GB × 1	16	10,000,000	16	8	30	30
ecs.gn7i-2x.8xlarge	32	128	NVIDIA A10 × 2	24 GB × 2	16	6,000,000	16	8	30	30
ecs.gn7i-4x.8xlarge	32	128	NVIDIA A10 × 4	24 GB × 4	32	6,000,000	16	8	30	30
ecs.gn7i-4x.16xlarge	64	256	NVIDIA A10 × 4	24 GB × 4	64	12,000,000	32	8	30	30
ecs.gn7i-8x.32xlarge	128	512	NVIDIA A10 × 8	24 GB × 8	64	24,000,000	32	16	30	30
ecs.gn7i-8x.16xlarge	64	256	NVIDIA A10 × 8	24 GB × 8	32	12,000,000	32	8	30	30

Important

You can change the following instance types only to ecs.gn7i-c8g1.2xlarge or ecs.gn7i-c16g1.4xlarge: ecs.gn7i-2x.8xlarge, ecs.gn7i-4x.8xlarge, ecs.gn7i-4x.16xlarge, ecs.gn7i-8x.32xlarge, and ecs.gn7i-8x.16xlarge.
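If you automate instance type changes, the restriction above can be encoded as a simple lookup. The following sketch is a hypothetical helper, not an Alibaba Cloud API. It returns `True` or `False` for the source instance types that the note covers and `None` for any other source type, because the note does not describe those cases.

```python
from typing import Optional

# Source instance types that the note restricts.
RESTRICTED_SOURCES = {
    "ecs.gn7i-2x.8xlarge",
    "ecs.gn7i-4x.8xlarge",
    "ecs.gn7i-4x.16xlarge",
    "ecs.gn7i-8x.32xlarge",
    "ecs.gn7i-8x.16xlarge",
}

# The only targets allowed for the restricted source types.
ALLOWED_TARGETS = {
    "ecs.gn7i-c8g1.2xlarge",
    "ecs.gn7i-c16g1.4xlarge",
}


def change_allowed(source: str, target: str) -> Optional[bool]:
    """Check a planned instance type change against the gn7i restriction.

    Returns None when the source type is not covered by the note.
    """
    if source not in RESTRICTED_SOURCES:
        return None
    return target in ALLOWED_TARGETS


print(change_allowed("ecs.gn7i-2x.8xlarge", "ecs.gn7i-c8g1.2xlarge"))
print(change_allowed("ecs.gn7i-2x.8xlarge", "ecs.gn7i-4x.8xlarge"))
```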

gn7s, GPU-accelerated compute-optimized instance family

To use the gn7s instance family, submit a ticket to apply.

Introduction:
- This instance family uses the latest Intel Ice Lake processors and NVIDIA A30 GPUs that are based on NVIDIA Ampere architecture. You can select instance types that comprise appropriate mixes of GPUs and vCPUs to meet your business requirements in AI scenarios.
- This instance family uses the third-generation SHENLONG architecture and doubles the average bandwidths of VPCs, networks, and disks compared with instance families of the previous generation.
Use cases: concurrent AI inference tasks that require high-performance CPUs, memory, and GPUs, such as image recognition, speech recognition, and behavior identification.
Compute:
- Uses NVIDIA A30 GPUs that have the following features:
  - Innovative NVIDIA Ampere architecture
  - Support for the multi-instance GPU (MIG) feature and acceleration features (based on second-generation Tensor cores) to provide diversified business support
- Uses 2.9 GHz Intel® Xeon® Scalable (Ice Lake) processors that deliver an all-core turbo frequency of 3.5 GHz.
- Provides significantly more memory than instance families of the previous generation.
Storage:
- Is an instance family in which all instances are I/O optimized.
- Supported disk types: ESSDs, ESSD AutoPL disks, and Regional ESSDs. For more information, see Elastic Block Storage Overview.
Network:
- Supports IPv4 and IPv6. For information about IPv6 communication, see IPv6 communication.
- Provides high network performance based on large computing capacity.

gn7s includes the instance types and metric data listed in the following table.

Instance type	vCPUs	Memory (GiB)	GPUs	GPU memory	Network baseline bandwidth (Gbit/s)	Packet forwarding rate (pps)	Private IPv4 addresses per ENI	IPv6 addresses per ENI	NIC queues	ENIs
ecs.gn7s-c8g1.2xlarge	8	60	NVIDIA A30 × 1	24 GB × 1	16	1,600,000	5	1	8	4
ecs.gn7s-c16g1.4xlarge	16	120	NVIDIA A30 × 1	24 GB × 1	16	3,000,000	5	1	8	8
ecs.gn7s-c32g1.8xlarge	32	250	NVIDIA A30 × 1	24 GB × 1	16	6,000,000	5	1	12	8
ecs.gn7s-c32g1.16xlarge	64	500	NVIDIA A30 × 2	24 GB × 2	32	12,000,000	5	1	16	15
ecs.gn7s-c32g1.32xlarge	128	1000	NVIDIA A30 × 4	24 GB × 4	64	24,000,000	10	1	32	15
ecs.gn7s-c48g1.12xlarge	48	380	NVIDIA A30 × 1	24 GB × 1	16	9,000,000	8	1	16	8
ecs.gn7s-c56g1.14xlarge	56	440	NVIDIA A30 × 1	24 GB × 1	16	10,000,000	8	1	16	8

gn7, GPU-accelerated compute-optimized instance family

Use cases:
- Deep learning applications, such as training applications of AI algorithms used in image classification, autonomous vehicles, and speech recognition
- Scientific computing applications that require robust GPU computing capabilities, such as computational fluid dynamics, computational finance, molecular dynamics, and environmental analytics

Storage:
- Is an instance family in which all instances are I/O optimized.
- Supported disk types: ESSDs, ESSD AutoPL disks, and Regional ESSDs. For more information, see Elastic Block Storage Overview.
Network:
- Supports IPv4 and IPv6. For information about IPv6 communication, see IPv6 communication.
- Provides high network performance based on large computing capacity.

gn7 includes the instance types and metric data listed in the following table.

Instance type	vCPUs	Memory (GiB)	GPU memory	Network baseline bandwidth (Gbit/s)	Packet forwarding rate (pps)	NIC queues	ENIs	Private IPv4 addresses per ENI	IPv6 addresses per ENI
ecs.gn7-c12g1.3xlarge	12	94	40 GB × 1	4	2,500,000	4	8	10	1
ecs.gn7-c13g1.13xlarge	52	378	40 GB × 4	16	9,000,000	16	8	30	30
ecs.gn7-c13g1.26xlarge	104	756	40 GB × 8	30	18,000,000	16	15	10	1

gn6i, GPU-accelerated compute-optimized instance family

Use cases:
- AI (deep learning and machine learning) inference for computer vision, speech recognition, speech synthesis, natural language processing (NLP), machine translation, and recommendation systems
- Real-time rendering for cloud gaming
- Real-time rendering for AR and VR applications
- Graphics workstations or graphics-heavy computing
- GPU-accelerated databases
- High-performance computing
Compute:
- Uses NVIDIA T4 GPUs that have the following features:
  - Innovative NVIDIA Turing architecture
  - 16 GB of memory (320 GB/s bandwidth) per GPU
  - 2,560 CUDA cores per GPU
  - Up to 320 Turing Tensor cores per GPU
  - Mixed-precision Tensor cores that support 65 FP16 TFLOPS, 130 INT8 TOPS, and 260 INT4 TOPS
- Offers a CPU-to-memory ratio of 1:4.
- Uses 2.5 GHz Intel® Xeon® Platinum 8163 (Skylake) processors.
Storage:
- Is an instance family in which all instances are I/O optimized.
- Supports ESSDs, ESSD AutoPL disks, standard SSDs, and ultra disks. For information about disks, see Overview of Block Storage.
Network:
- Supports IPv4 and IPv6. For information about IPv6 communication, see IPv6 communication.
- Provides high network performance based on large computing capacity.

gn6i includes the instance types and metric data listed in the following table.

Instance type	vCPUs	Memory (GiB)	GPUs	GPU memory	Network baseline bandwidth (Gbit/s)	Packet forwarding rate (pps)	Disk baseline IOPS	NIC queues	ENIs	Private IPv4 addresses per ENI	IPv6 addresses per ENI
ecs.gn6i-c4g1.xlarge	4	15	NVIDIA T4 × 1	16 GB × 1	4	2,500,000	None	2	2	10	1
ecs.gn6i-c8g1.2xlarge	8	31	NVIDIA T4 × 1	16 GB × 1	5	2,500,000	None	2	2	10	1
ecs.gn6i-c16g1.4xlarge	16	62	NVIDIA T4 × 1	16 GB × 1	6	2,500,000	None	4	3	10	1
ecs.gn6i-c24g1.6xlarge	24	93	NVIDIA T4 × 1	16 GB × 1	7.5	2,500,000	None	6	4	10	1
ecs.gn6i-c40g1.10xlarge	40	155	NVIDIA T4 × 1	16 GB × 1	10	1,600,000	None	16	10	10	1
ecs.gn6i-c24g1.12xlarge	48	186	NVIDIA T4 × 2	16 GB × 2	15	4,500,000	None	12	6	10	1
ecs.gn6i-c24g1.24xlarge	96	372	NVIDIA T4 × 4	16 GB × 4	30	4,500,000	250,000	24	8	10	1
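
As a rough illustration of how the table above can be used for sizing, the following Python sketch picks the smallest gn6i instance type that satisfies a vCPU, memory, and GPU-count requirement. The rows are copied from the table. The `InstanceType` tuple and the `pick_instance_type` helper are hypothetical names used for illustration and are not part of any Alibaba Cloud SDK.

```python
from typing import NamedTuple, Optional


class InstanceType(NamedTuple):
    name: str
    vcpus: int
    memory_gib: int
    gpus: int  # number of NVIDIA T4 GPUs


# Rows copied from the gn6i table above (name, vCPUs, memory in GiB, GPUs).
GN6I = [
    InstanceType("ecs.gn6i-c4g1.xlarge", 4, 15, 1),
    InstanceType("ecs.gn6i-c8g1.2xlarge", 8, 31, 1),
    InstanceType("ecs.gn6i-c16g1.4xlarge", 16, 62, 1),
    InstanceType("ecs.gn6i-c24g1.6xlarge", 24, 93, 1),
    InstanceType("ecs.gn6i-c40g1.10xlarge", 40, 155, 1),
    InstanceType("ecs.gn6i-c24g1.12xlarge", 48, 186, 2),
    InstanceType("ecs.gn6i-c24g1.24xlarge", 96, 372, 4),
]


def pick_instance_type(
    min_vcpus: int, min_memory_gib: int, min_gpus: int = 1
) -> Optional[InstanceType]:
    """Return the type with the fewest GPUs, then the fewest vCPUs, that meets all minimums."""
    candidates = [
        t
        for t in GN6I
        if t.vcpus >= min_vcpus
        and t.memory_gib >= min_memory_gib
        and t.gpus >= min_gpus
    ]
    if not candidates:
        return None  # no gn6i type is large enough
    return min(candidates, key=lambda t: (t.gpus, t.vcpus))
```

For example, `pick_instance_type(16, 64)` skips ecs.gn6i-c16g1.4xlarge because its 62 GiB of memory falls short, and returns ecs.gn6i-c24g1.6xlarge. Availability and pricing still vary by region, so treat the result only as a starting point.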

gn6e, GPU-accelerated compute-optimized instance family

Use cases:
- Deep learning applications, such as training and inference applications of AI algorithms used in image classification, autonomous vehicles, and speech recognition
- Scientific computing applications, such as computational fluid dynamics, computational finance, molecular dynamics, and environmental analytics
Compute:
- Uses NVIDIA V100 GPUs (SXM2-based, with NVLink support) that have the following features:
  - Innovative NVIDIA Volta architecture
  - 32 GB of HBM2 memory (900 GB/s bandwidth) per GPU
  - 5,120 CUDA cores per GPU
  - 640 Tensor cores per GPU
  - Up to six bidirectional NVLink connections per GPU, each of which provides a bandwidth of 25 GB/s in each direction, for a total bandwidth of 300 GB/s (6 × 25 × 2 = 300)
- Offers a CPU-to-memory ratio of 1:8.
- Uses 2.5 GHz Intel® Xeon® Platinum 8163 (Skylake) processors.
Storage:
- Is an instance family in which all instances are I/O optimized.
- Supported disk types: ESSDs, ESSD AutoPL disks, Regional ESSDs, standard SSDs, and ultra disks. For more information, see Elastic Block Storage Overview.
Network:
- Supports IPv4 and IPv6. For information about IPv6 communication, see IPv6 communication.
- Provides high network performance based on large computing capacity.

The following table lists the instance types and specifications of the gn6e instance family.

Instance type	vCPUs	Memory (GiB)	GPUs	GPU memory	Network baseline bandwidth (Gbit/s)	Packet forwarding rate (pps)	NIC queues	ENIs	Private IPv4 addresses per ENI	IPv6 addresses per ENI
ecs.gn6e-c12g1.3xlarge	12	92	1 × NVIDIA V100	1 × 32 GB	5	800,000	8	6	10	1
ecs.gn6e-c12g1.6xlarge	24	184	2 × NVIDIA V100	2 × 32 GB	8	1,200,000	8	8	20	1
ecs.gn6e-c12g1.12xlarge	48	368	4 × NVIDIA V100	4 × 32 GB	16	2,400,000	8	8	20	1
ecs.gn6e-c12g1.24xlarge	96	736	8 × NVIDIA V100	8 × 32 GB	32	4,500,000	16	8	20	1
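
The NVLink arithmetic quoted in the Compute list for this family, and the aggregate GPU memory implied by the table above, can be reproduced with a few lines of Python. This is only a sketch: the function names are illustrative, and the bandwidth result is unitless here, taking whatever unit the per-link figure is quoted in.

```python
def nvlink_total_bandwidth(links: int = 6, per_direction: int = 25) -> int:
    """Aggregate bandwidth: links x per-direction rate x 2 directions."""
    return links * per_direction * 2


def total_gpu_memory_gb(gpu_count: int, per_gpu_gb: int = 32) -> int:
    """Total GPU memory across all V100 GPUs on one gn6e instance."""
    return gpu_count * per_gpu_gb


# GPU counts copied from the gn6e table above.
GN6E_GPU_COUNTS = {
    "ecs.gn6e-c12g1.3xlarge": 1,
    "ecs.gn6e-c12g1.6xlarge": 2,
    "ecs.gn6e-c12g1.12xlarge": 4,
    "ecs.gn6e-c12g1.24xlarge": 8,
}

if __name__ == "__main__":
    print("NVLink total per GPU:", nvlink_total_bandwidth())
    for name, count in GN6E_GPU_COUNTS.items():
        print(name, total_gpu_memory_gb(count), "GB of GPU memory")
```

With the defaults, `nvlink_total_bandwidth()` returns 300, matching the 6 × 25 × 2 figure, and the largest type works out to 8 × 32 GB = 256 GB of GPU memory.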

gn6v, GPU-accelerated compute-optimized instance family

Use cases:
- Deep learning applications, such as training and inference applications of AI algorithms used in image classification, autonomous vehicles, and speech recognition
- Scientific computing applications, such as computational fluid dynamics, computational finance, molecular dynamics, and environmental analytics
Compute:
- Uses NVIDIA V100 GPUs (SXM2-based) that have the following features:
  - Innovative NVIDIA Volta architecture
  - 16 GB of HBM2 memory (900 GB/s bandwidth) per GPU
  - 5,120 CUDA cores per GPU
  - 640 Tensor cores per GPU
  - Up to six bidirectional NVLink connections per GPU, each of which provides a bandwidth of 25 GB/s in each direction, for a total bandwidth of 300 GB/s (6 × 25 × 2 = 300)
- Offers a CPU-to-memory ratio of 1:4.
- Uses 2.5 GHz Intel® Xeon® Platinum 8163 (Skylake) processors.
Storage:
- Is an instance family in which all instances are I/O optimized.
- Supports ESSDs, ESSD AutoPL disks, standard SSDs, and ultra disks. For information about disks, see Overview of Block Storage.
Network:
- Supports IPv4 and IPv6. For information about IPv6 communication, see IPv6 communication.
- Provides high network performance based on large computing capacity.

gn6v includes the instance types and metric data listed in the following table.

Instance type	vCPUs	Memory (GiB)	GPUs	GPU memory	Network baseline bandwidth (Gbit/s)	Packet forwarding rate (pps)	Disk baseline IOPS	NIC queues	ENIs	Private IPv4 addresses per ENI	IPv6 addresses per ENI
ecs.gn6v-c8g1.2xlarge	8	32	NVIDIA V100 × 1	16 GB × 1	2.5	800,000	N/A	4	4	10	1
ecs.gn6v-c8g1.4xlarge	16	64	NVIDIA V100 × 2	16 GB × 2	5	1,000,000	N/A	4	8	20	1
ecs.gn6v-c8g1.8xlarge	32	128	NVIDIA V100 × 4	16 GB × 4	10	2,000,000	N/A	8	8	20	1
ecs.gn6v-c8g1.16xlarge	64	256	NVIDIA V100 × 8	16 GB × 8	20	2,500,000	N/A	16	8	20	1
ecs.gn6v-c10g1.20xlarge	82	336	NVIDIA V100 × 8	16 GB × 8	35	4,500,000	250,000	16	8	20	1

gn5, GPU-accelerated compute-optimized instance family

Use cases:
- Deep learning
- Scientific computing applications, such as computational fluid dynamics, computational finance, genomics, and environmental analytics
- Server-side GPU computing workloads, such as high-performance computing, rendering, and multimedia encoding and decoding
Compute:
- Uses NVIDIA P100 GPUs.
- Offers multiple CPU-to-memory ratios.
- Uses 2.5 GHz Intel® Xeon® E5-2682 v4 (Broadwell) processors.
Storage:
- Supports high-performance local Non-Volatile Memory Express (NVMe) SSDs.
- Is an instance family in which all instances are I/O optimized.
- Supports standard SSDs and ultra disks.
Network:
- Supports only IPv4.
- Provides high network performance based on large computing capacity.

gn5 includes the instance types and metric data listed in the following table.

Instance type	vCPUs	Memory (GiB)	GPUs	GPU memory	Local storage (GiB)	Network baseline bandwidth (Gbit/s)	Packet forwarding rate (pps)	NIC queues	ENIs	Private IPv4 addresses per ENI
ecs.gn5-c4g1.xlarge	4	30	1 × NVIDIA P100	1 × 16 GB	440	3	300,000	1	3	10
ecs.gn5-c8g1.2xlarge	8	60	1 × NVIDIA P100	1 × 16 GB	440	3	400,000	1	4	10
ecs.gn5-c4g1.2xlarge	8	60	2 × NVIDIA P100	2 × 16 GB	880	5	1,000,000	4	4	10
ecs.gn5-c8g1.4xlarge	16	120	2 × NVIDIA P100	2 × 16 GB	880	5	1,000,000	4	8	20
ecs.gn5-c28g1.7xlarge	28	112	1 × NVIDIA P100	1 × 16 GB	440	5	2,250,000	7	8	10
ecs.gn5-c8g1.8xlarge	32	240	4 × NVIDIA P100	4 × 16 GB	1760	10	2,000,000	8	8	20
ecs.gn5-c28g1.14xlarge	56	224	2 × NVIDIA P100	2 × 16 GB	880	10	4,500,000	14	8	20
ecs.gn5-c8g1.14xlarge	54	480	8 × NVIDIA P100	8 × 16 GB	3520	25	4,000,000	14	8	10
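
Because gn5 offers multiple CPU-to-memory ratios, it can help to compute the memory per vCPU for each row of the table above. The following Python sketch does so with values copied from the table. The `memory_per_vcpu` and `group_by_ratio` helpers are illustrative names only.

```python
# (vCPUs, memory in GiB) copied from the gn5 table above.
GN5 = {
    "ecs.gn5-c4g1.xlarge": (4, 30),
    "ecs.gn5-c8g1.2xlarge": (8, 60),
    "ecs.gn5-c4g1.2xlarge": (8, 60),
    "ecs.gn5-c8g1.4xlarge": (16, 120),
    "ecs.gn5-c28g1.7xlarge": (28, 112),
    "ecs.gn5-c8g1.8xlarge": (32, 240),
    "ecs.gn5-c28g1.14xlarge": (56, 224),
    "ecs.gn5-c8g1.14xlarge": (54, 480),
}


def memory_per_vcpu(name: str) -> float:
    """Memory in GiB available per vCPU for one instance type."""
    vcpus, memory_gib = GN5[name]
    return memory_gib / vcpus


def group_by_ratio() -> dict:
    """Group instance type names by GiB per vCPU, rounded to two decimals."""
    groups = {}
    for name in GN5:
        ratio = round(memory_per_vcpu(name), 2)
        groups.setdefault(ratio, []).append(name)
    return groups


if __name__ == "__main__":
    for ratio, names in sorted(group_by_ratio().items()):
        print(f"{ratio} GiB per vCPU: {', '.join(names)}")
```

With the table values, the types fall into three groups: 4 GiB per vCPU (the two c28g1 types), 7.5 GiB per vCPU (five types), and about 8.89 GiB per vCPU (ecs.gn5-c8g1.14xlarge).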

gn5i, GPU-accelerated compute-optimized instance family

Use cases: Server-side GPU computing workloads, such as deep learning inference and multimedia encoding and decoding.
Compute:
- Uses NVIDIA P4 GPUs.
- Offers a CPU-to-memory ratio of 1:4.
- Uses 2.5 GHz Intel® Xeon® E5-2682 v4 (Broadwell) processors.
Storage:
- Is an instance family in which all instances are I/O optimized.
- Supports standard SSDs and ultra disks.
Network:
- Supports IPv4 and IPv6. For information about IPv6 communication, see IPv6 communication.
- Provides high network performance based on large computing capacity.

gn5i includes the instance types and metric data listed in the following table.

Instance type	vCPUs	Memory (GiB)	GPUs	GPU memory	Network baseline bandwidth (Gbit/s)	Packet forwarding rate (pps)	NIC queues	ENIs	Private IPv4 addresses per ENI	IPv6 addresses per ENI
ecs.gn5i-c2g1.large	2	8	NVIDIA P4 × 1	8 GB × 1	1	100,000	2	2	6	1
ecs.gn5i-c4g1.xlarge	4	16	NVIDIA P4 × 1	8 GB × 1	1.5	200,000	2	3	10	1
ecs.gn5i-c8g1.2xlarge	8	32	NVIDIA P4 × 1	8 GB × 1	2	400,000	4	4	10	1
ecs.gn5i-c16g1.4xlarge	16	64	NVIDIA P4 × 1	8 GB × 1	3	800,000	4	8	20	1
ecs.gn5i-c16g1.8xlarge	32	128	NVIDIA P4 × 2	8 GB × 2	6	1,200,000	8	8	20	1
ecs.gn5i-c28g1.14xlarge	56	224	NVIDIA P4 × 2	8 GB × 2	10	2,000,000	14	8	20	1