GPU-accelerated compute-optimized instance families (gn, ebm, and scc series) - Elastic GPU Service

GPU-accelerated compute-optimized instances provide high performance and highly parallel computing capabilities, and are suitable for large-scale parallel computing scenarios. You can use GPU-accelerated compute-optimized instances to achieve improved computing performance and efficiency for your business. This topic describes the features of GPU-accelerated compute-optimized instance families of Elastic Compute Service (ECS) and lists the instance types in each instance family.

Background information

Before you read further in this topic, you must be familiar with the following information:

Classification and naming of instance types. Familiarize yourself with the instance family categories, naming conventions of instance types, and differences between instance families. For more information, see Classification and naming of instance types.
Instance type metrics. For information about the metrics of instance types, see Instance type metrics.
Instructions for selecting instance types based on your business scenarios. For more information, see Instance type selection.

After you determine an instance type for your use case, you may need to learn about the following information:

Regions in which the instance type is available for purchase. Instance types that are available for purchase vary based on the region. You can go to the Instance Types Available for Each Region page to view the instance types available for purchase in each region.
Estimated instance costs. You can calculate the price of instances that use different billing methods in the Price Calculator.
Instructions for purchasing an instance. You can go to the ECS instance buy page to place a purchase order for instances.

You may also want to know about the following information:

Retired instance families. If you cannot find an instance type in this topic, the instance type may be in a retired instance family. For information about retired instance families, see Retired instance families.
Supported instance type changes. Before you change the instance type of an instance, check whether the instance type can be changed and identify compatible instance types. For more information, see Instance types and families that support instance type changes.

Category	References
GPU-accelerated compute-optimized instance families (gn series)	gn8v, gn8is, gn7e, gn7i, gn7s, gn7, gn6i, gn6e, gn6v
ECS Bare Metal Instance families	ebmgn8v, ebmgn8is, ebmgn7e, ebmgn7i, ebmgn7, ebmgn6ia, ebmgn6e, ebmgn6v, ebmgn6i
Not recommended instance families (If the following instance families are sold out, we recommend that you use the instance families in the preceding rows.)	gn5i, gn5

gn8v, GPU-accelerated compute-optimized instance family

This instance family is available only in specific regions, including regions outside China. To use the instance family, contact Alibaba Cloud sales personnel.

Introduction: This instance family is an 8th-generation GPU-accelerated compute-optimized instance family provided by Alibaba Cloud for AI model training and the inference tasks of ultra-large models. This instance family consists of multiple instance types that provide one, two, four, or eight GPUs per instance.
Supported scenarios:
- Multi-GPU parallel inference computing for large language models (LLMs) that have more than 70 billion parameters
- Traditional AI model training and autonomous driving training, for which each GPU delivers computing power of up to 39.5 TFLOPS in the single-precision floating-point format (FP32)
- Small and medium-sized model training scenarios that leverage the NVLink connections among the eight GPUs
Benefits and positioning:
- High-speed and large-capacity GPU memory: Each GPU is equipped with 96 GB of HBM3E memory and delivers up to 4 TB/s of memory bandwidth, which greatly accelerates model training and inference.
- High bandwidth between GPUs: Multiple GPUs are interconnected by using 900 GB/s NVLink connections. The efficiency of multi-GPU training and inference is much higher than that of previous generations of GPU-accelerated instances.
- Quantization of large models: This instance family supports computing power in the 8-bit floating point format (FP8) and optimizes computing power for large-scale parameter training and inference. This significantly improves the computing speed of training and inference and reduces memory usage.
- High security: This instance family supports confidential computing capabilities that cover the full link of model inference tasks. The capabilities include CPU-based Intel Trust Domain Extensions (TDX) confidential computing and GPU-based NVIDIA Confidential Computing (CC). The confidential computing capabilities ensure the security of user inference data and enterprise models in model inference and training.
Compute:
- Uses the latest Cloud Infrastructure Processing Unit (CIPU) 1.0 processors.
  - Decouples computing capabilities from storage capabilities, allowing you to flexibly select storage resources based on your business requirements.
  - Provides bare metal capabilities to support peer-to-peer (P2P) communication between GPU-accelerated instances.
- Uses the 4th-generation Intel Xeon Scalable processors that deliver a base frequency of up to 2.8 GHz and an all-core turbo frequency of up to 3.1 GHz.
Storage:
- Is an instance family in which all instances are I/O optimized.
- Supports ESSDs, ESSD AutoPL disks, and elastic ephemeral disks (EEDs).
Network:
- Supports IPv4 and IPv6. For information about IPv6 communication, see IPv6 communication.
- Provides ultra-high network performance with a packet forwarding rate of up to 30,000,000 pps (for instances equipped with eight GPUs).
- Supports elastic RDMA interfaces (ERIs).
  Note
  For information about how to use ERIs, see Configure eRDMA on an enterprise-level instance.

Instance types

Instance type	vCPUs	Memory (GiB)	GPU memory	Network baseline bandwidth (Gbit/s)	ENIs	NIC queues per primary ENI	Private IPv4 addresses per ENI	IPv6 addresses per ENI	Maximum disks	Disk baseline IOPS	Disk baseline bandwidth (Gbit/s)
ecs.gn8v.4xlarge	16	96	96GB * 1	12	8	16	30	30	17	100,000	0.75
ecs.gn8v.6xlarge	24	128	96GB * 1	15	8	24	30	30	17	120,000	0.937
ecs.gn8v-2x.8xlarge	32	192	96GB * 2	20	8	32	30	30	25	200,000	1.25
ecs.gn8v-4x.8xlarge	32	384	96GB * 4	20	8	32	30	30	25	200,000	1.25
ecs.gn8v-2x.12xlarge	48	256	96GB * 2	25	8	48	30	30	33	300,000	1.50
ecs.gn8v-8x.16xlarge	64	768	96GB * 8	32	8	64	30	30	33	360,000	2.5
ecs.gn8v-4x.24xlarge	96	512	96GB * 4	50	15	64	30	30	49	500,000	3
ecs.gn8v-8x.48xlarge	192	1024	96GB * 8	100	15	64	50	50	65	1,000,000	6
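
The scenarios above mention LLMs with more than 70 billion parameters and note that FP8 reduces memory usage. The following is a rough sizing sketch, not an official sizing method: it estimates the memory needed for model weights only (parameters × bytes per parameter, in decimal GB) and lists the gn8v instance types whose total GPU memory covers it. The function names and the `headroom` fraction are hypothetical, and the estimate ignores KV cache, activations, and framework overhead.

```python
# (GB of GPU memory per GPU, GPU count), copied from the gn8v instance type table.
GN8V_GPU_MEMORY = {
    "ecs.gn8v.4xlarge": (96, 1),
    "ecs.gn8v.6xlarge": (96, 1),
    "ecs.gn8v-2x.8xlarge": (96, 2),
    "ecs.gn8v-4x.8xlarge": (96, 4),
    "ecs.gn8v-2x.12xlarge": (96, 2),
    "ecs.gn8v-8x.16xlarge": (96, 8),
    "ecs.gn8v-4x.24xlarge": (96, 4),
    "ecs.gn8v-8x.48xlarge": (96, 8),
}

# Bytes used to store one parameter in each numeric format.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "fp8": 1}


def weight_memory_gb(params_billions, dtype):
    """Memory for weights only, in decimal GB (1e9 params * bytes / 1e9)."""
    return params_billions * BYTES_PER_PARAM[dtype]


def types_that_fit(params_billions, dtype, headroom=0.8):
    """Return instance types whose total GPU memory can hold the weights.

    headroom is an assumed fraction of GPU memory available for weights;
    the remainder is left for KV cache and runtime overhead.
    """
    need = weight_memory_gb(params_billions, dtype)
    return [
        name
        for name, (gb_per_gpu, gpus) in GN8V_GPU_MEMORY.items()
        if gb_per_gpu * gpus * headroom >= need
    ]
```

Under these assumptions, `types_that_fit(70, "fp16")` excludes the two single-GPU instance types, whereas `types_that_fit(70, "fp8")` returns all eight instance types.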

gn8is, GPU-accelerated compute-optimized instance family

This instance family is available only in specific regions, including regions outside China. To use the instance family, contact Alibaba Cloud sales personnel.

Introduction: This instance family is an 8th-generation GPU-accelerated compute-optimized instance family provided by Alibaba Cloud in response to recent developments in generative AI. This instance family consists of multiple instance types that provide one, two, four, or eight GPUs per instance and have different CPU-to-GPU ratios to fit various use cases.
Benefits and positioning:
- Graphics processing: This instance family uses high-frequency 5th-generation Intel Xeon Scalable processors to provide sufficient CPU capacity for smooth graphics rendering and design in 3D modeling scenarios.
- Inference tasks: This instance family uses innovative GPUs, each with 48 GB of memory, which accelerate inference tasks and support the FP8 floating-point format. You can use this instance family together with Container Service for Kubernetes (ACK) to support the inference of various AI-generated content (AIGC) models and accommodate inference tasks for LLMs that have fewer than 70 billion parameters.
Supported scenarios:
- Animation, special effects for film and television, and rendering
- Generation of AIGC images and inference of LLMs
- Other general-purpose AI recognition, image recognition, and speech recognition scenarios
Compute:
- Uses innovative GPUs that have the following features:
  - Support for acceleration features, such as TensorRT, and the FP8 floating-point format to improve LLM inference performance.
  - Up to 48 GB of memory per GPU and support for the inference of 70B or larger LLMs on a single instance with multiple GPUs.
  - Improved graphic processing capabilities. For example, after you install a GRID driver on a gn8is instance by using Cloud Assistant or an Alibaba Cloud Marketplace image, the instance can provide graphic processing performance twice that of a 7th-generation instance.
- Uses the latest high-frequency Intel® Xeon® processors that deliver an all-core turbo frequency of 3.9 GHz to meet complex 3D modeling requirements.
Storage:
- Is an instance family in which all instances are I/O optimized.
- Supports ESSDs, ESSD AutoPL disks, and EEDs.
Network:
- Supports IPv4 and IPv6. For information about IPv6 communication, see IPv6 communication.
- Supports ERIs.
  Note
  For information about how to use ERIs, see Configure eRDMA on an enterprise-level instance.

Instance types

Instance type	vCPUs	Memory (GiB)	GPU memory	Network baseline bandwidth (Gbit/s)	ENIs	NIC queues per primary ENI	Private IPv4 addresses per ENI	IPv6 addresses per ENI	Maximum disks	Disk baseline IOPS	Disk baseline bandwidth (Gbit/s)
ecs.gn8is.2xlarge	8	64	48GB * 1	8	4	8	15	15	17	60,000	0.75
ecs.gn8is.4xlarge	16	128	48GB * 1	16	8	16	30	30	17	120,000	1.25
ecs.gn8is-2x.8xlarge	32	256	48GB * 2	32	8	32	30	30	33	250,000	2
ecs.gn8is-4x.16xlarge	64	512	48GB * 4	64	8	64	30	30	33	450,000	4
ecs.gn8is-8x.32xlarge	128	1024	48GB * 8	100	15	64	50	50	65	900,000	8
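
The introduction notes that gn8is instance types have different CPU-to-GPU ratios. As a small illustration, the ratio can be derived from the vCPU and GPU counts in the table above. The helper name `vcpus_per_gpu` is hypothetical.

```python
# (vCPUs, GPU count), copied from the gn8is instance type table.
GN8IS = {
    "ecs.gn8is.2xlarge": (8, 1),
    "ecs.gn8is.4xlarge": (16, 1),
    "ecs.gn8is-2x.8xlarge": (32, 2),
    "ecs.gn8is-4x.16xlarge": (64, 4),
    "ecs.gn8is-8x.32xlarge": (128, 8),
}


def vcpus_per_gpu(table):
    """Return the number of vCPUs available per GPU for each instance type."""
    return {name: vcpus / gpus for name, (vcpus, gpus) in table.items()}


ratios = vcpus_per_gpu(GN8IS)
```

With the table values, ecs.gn8is.2xlarge provides 8 vCPUs per GPU and the other instance types provide 16 vCPUs per GPU.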

gn7e, GPU-accelerated compute-optimized instance family

Features:

Introduction:
- This instance family allows you to select instance types that provide different numbers of GPUs and CPUs to meet your business requirements in AI use cases.
- This instance family uses the third-generation SHENLONG architecture and doubles the average bandwidths of virtual private clouds (VPCs), networks, and disks compared with instance families of the previous generation.
Supported scenarios:
- Small- and medium-scale AI training
- High-performance computing (HPC) business accelerated by using Compute Unified Device Architecture (CUDA)
- AI inference tasks that require high GPU processing capabilities or large amounts of GPU memory
- Deep learning applications, such as training applications of AI algorithms used in image classification, autonomous vehicles, and speech recognition
- Scientific computing applications that require robust GPU computing capabilities, such as computational fluid dynamics, computational finance, molecular dynamics, and environmental analytics
Important
When you use AI training services that feature a high communication load, such as transformer models, you must enable NVLink for GPU-to-GPU communication. Otherwise, data may be damaged due to unpredictable failures that are caused by large-scale data transmission over Peripheral Component Interconnect Express (PCIe) links. If you do not understand the topology of the communication links that are used for AI training services, submit a ticket to obtain technical support.
Storage:
- Is an instance family in which all instances are I/O optimized.
- Supports ESSDs and ESSD AutoPL disks.
Network:
- Supports IPv4 and IPv6. For information about IPv6 communication, see IPv6 communication.
- Provides high network performance based on large computing capacity.

Instance types

Instance type	vCPUs	Memory (GiB)	GPU memory	Network baseline bandwidth (Gbit/s)	Packet forwarding rate (pps)	NIC queues	ENIs	Private IPv4 addresses per ENI	IPv6 addresses per ENI
ecs.gn7e-c16g1.4xlarge	16	125	80GB * 1	8	3,000,000	8	8	10	1
ecs.gn7e-c16g1.8xlarge	32	250	80GB * 2	16	6,000,000	16	8	10	1
ecs.gn7e-c16g1.16xlarge	64	500	80GB * 4	32	12,000,000	32	8	10	1
ecs.gn7e-c16g1.32xlarge	128	1000	80GB * 8	64	24,000,000	32	16	15	1

gn7i, GPU-accelerated compute-optimized instance family

Introduction: This instance family uses the third-generation SHENLONG architecture to provide predictable and consistent ultra-high performance. This instance family utilizes fast path acceleration on chips to improve storage performance, network performance, and computing stability by an order of magnitude.
Supported scenarios:
- Concurrent AI inference tasks that require high-performance CPUs, memory, and GPUs, such as image recognition, speech recognition, and behavior identification
- Compute-intensive graphics processing tasks that require high-performance 3D graphics virtualization capabilities, such as remote graphic design and cloud gaming
Compute:
- Uses NVIDIA A10 GPUs that have the following features:
  - Innovative NVIDIA Ampere architecture
  - Support for acceleration features, such as RTX and TensorRT
- Uses 2.9 GHz Intel® Xeon® Scalable (Ice Lake) processors that deliver an all-core turbo frequency of 3.5 GHz.
- Provides up to 752 GiB of memory, which is much larger than the memory sizes of the gn6i instance family.
Storage:
- Is an instance family in which all instances are I/O optimized.
- Supports ESSDs and ESSD AutoPL disks.
Network:
- Supports IPv4 and IPv6. For information about IPv6 communication, see IPv6 communication.
- Provides high network performance based on large computing capacity.

Instance types

Instance type	vCPUs	Memory (GiB)	GPU	GPU memory	Network baseline bandwidth (Gbit/s)	Packet forwarding rate (pps)	NIC queues	ENIs	Private IPv4 addresses per ENI	IPv6 addresses per ENI
ecs.gn7i-c8g1.2xlarge	8	30	NVIDIA A10 * 1	24GB * 1	16	1,600,000	8	4	15	15
ecs.gn7i-c16g1.4xlarge	16	60	NVIDIA A10 * 1	24GB * 1	16	3,000,000	8	8	30	30
ecs.gn7i-c32g1.8xlarge	32	188	NVIDIA A10 * 1	24GB * 1	16	6,000,000	12	8	30	30
ecs.gn7i-c32g1.16xlarge	64	376	NVIDIA A10 * 2	24GB * 2	32	12,000,000	16	15	30	30
ecs.gn7i-c32g1.32xlarge	128	752	NVIDIA A10 * 4	24GB * 4	64	24,000,000	32	15	30	30
ecs.gn7i-c48g1.12xlarge	48	310	NVIDIA A10 * 1	24GB * 1	16	9,000,000	16	8	30	30
ecs.gn7i-c56g1.14xlarge	56	346	NVIDIA A10 * 1	24GB * 1	16	12,000,000	16	12	30	30
ecs.gn7i-2x.8xlarge	32	128	NVIDIA A10 * 2	24GB * 2	16	6,000,000	16	8	30	30
ecs.gn7i-4x.8xlarge	32	128	NVIDIA A10 * 4	24GB * 4	16	6,000,000	16	8	30	30
ecs.gn7i-4x.16xlarge	64	256	NVIDIA A10 * 4	24GB * 4	32	12,000,000	32	8	30	30
ecs.gn7i-8x.32xlarge	128	512	NVIDIA A10 * 8	24GB * 8	64	24,000,000	32	16	30	30
ecs.gn7i-8x.16xlarge	64	256	NVIDIA A10 * 8	24GB * 8	32	12,000,000	32	8	30	30

Note

You can change the following instance types only to ecs.gn7i-c8g1.2xlarge or ecs.gn7i-c16g1.4xlarge: ecs.gn7i-2x.8xlarge, ecs.gn7i-4x.8xlarge, ecs.gn7i-4x.16xlarge, ecs.gn7i-8x.32xlarge, and ecs.gn7i-8x.16xlarge.
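
The rule in the preceding note can be expressed as a simple lookup. This is only a sketch of the note itself: the function name is hypothetical, and it does not replace the full compatibility rules described in Instance types and families that support instance type changes.

```python
# Source instance types named in the note.
RESTRICTED_SOURCES = {
    "ecs.gn7i-2x.8xlarge",
    "ecs.gn7i-4x.8xlarge",
    "ecs.gn7i-4x.16xlarge",
    "ecs.gn7i-8x.32xlarge",
    "ecs.gn7i-8x.16xlarge",
}

# The only targets the note allows for those sources.
ALLOWED_TARGETS = {"ecs.gn7i-c8g1.2xlarge", "ecs.gn7i-c16g1.4xlarge"}


def change_allowed_by_note(source, target):
    """Return True or False for sources covered by the note.

    Returns None when the note does not mention the source instance type,
    because the note alone cannot decide that case.
    """
    if source not in RESTRICTED_SOURCES:
        return None
    return target in ALLOWED_TARGETS
```

For example, `change_allowed_by_note("ecs.gn7i-4x.8xlarge", "ecs.gn7i-c8g1.2xlarge")` returns `True`, and the same source with the target `ecs.gn7i-c32g1.8xlarge` returns `False`.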

gn7s, GPU-accelerated compute-optimized instance family

To use the gn7s instance family, submit a ticket.

Introduction:
- This instance family uses the latest Intel Ice Lake processors and NVIDIA A30 GPUs that are based on NVIDIA Ampere architecture. You can select instance types that comprise appropriate mixes of GPUs and vCPUs to meet your business requirements in AI scenarios.
- This instance family uses the third-generation SHENLONG architecture and doubles the average bandwidths of VPCs, networks, and disks compared with instance families of the previous generation.
Supported scenarios: concurrent AI inference tasks that require high-performance CPUs, memory, and GPUs, such as image recognition, speech recognition, and behavior identification.
Compute:
- Uses NVIDIA A30 GPUs that have the following features:
  - Innovative NVIDIA Ampere architecture
  - Support for the multi-instance GPU (MIG) feature and acceleration features (based on second-generation Tensor cores) to provide diversified business support
- Uses 2.9 GHz Intel® Xeon® Scalable (Ice Lake) processors that deliver an all-core turbo frequency of 3.5 GHz.
- Provides significantly larger memory sizes than instance families of the previous generation.
Storage:
- Is an instance family in which all instances are I/O optimized.
- Supports ESSDs and ESSD AutoPL disks.
Network:
- Supports IPv4 and IPv6. For information about IPv6 communication, see IPv6 communication.
- Provides high network performance based on large computing capacity.

Instance types

Instance type	vCPUs	Memory (GiB)	GPU	GPU memory	Network baseline bandwidth (Gbit/s)	Packet forwarding rate (pps)	Private IPv4 addresses per ENI	IPv6 addresses per ENI	NIC queues	ENIs
ecs.gn7s-c8g1.2xlarge	8	60	NVIDIA A30 * 1	24GB * 1	16	6,000,000	5	1	12	8
ecs.gn7s-c16g1.4xlarge	16	120	NVIDIA A30 * 1	24GB * 1	16	6,000,000	5	1	12	8
ecs.gn7s-c32g1.8xlarge	32	250	NVIDIA A30 * 1	24GB * 1	16	6,000,000	5	1	12	8
ecs.gn7s-c32g1.16xlarge	64	500	NVIDIA A30 * 2	24GB * 2	32	12,000,000	5	1	16	15
ecs.gn7s-c32g1.32xlarge	128	1000	NVIDIA A30 * 4	24GB * 4	64	24,000,000	10	1	32	15
ecs.gn7s-c48g1.12xlarge	48	380	NVIDIA A30 * 1	24GB * 1	16	6,000,000	8	1	12	8
ecs.gn7s-c56g1.14xlarge	56	440	NVIDIA A30 * 1	24GB * 1	16	6,000,000	8	1	12	8

gn7, GPU-accelerated compute-optimized instance family

Supported scenarios:
- Deep learning applications, such as training applications of AI algorithms used in image classification, autonomous vehicles, and speech recognition
- Scientific computing applications that require robust GPU computing capabilities, such as computational fluid dynamics, computational finance, molecular dynamics, and environmental analytics

Storage:
- Is an instance family in which all instances are I/O optimized.
- Supports ESSDs and ESSD AutoPL disks.
Network:
- Supports IPv4 and IPv6. For information about IPv6 communication, see IPv6 communication.
- Provides high network performance based on large computing capacity.

Instance types

Instance type	vCPUs	Memory (GiB)	GPU memory	Network baseline bandwidth (Gbit/s)	Packet forwarding rate (pps)	NIC queues	ENIs	Private IPv4 addresses per ENI	IPv6 addresses per ENI
ecs.gn7-c12g1.3xlarge	12	94	40GB * 1	4	2,500,000	4	8	10	1
ecs.gn7-c13g1.13xlarge	52	378	40GB * 4	16	9,000,000	16	8	30	30
ecs.gn7-c13g1.26xlarge	104	756	40GB * 8	30	18,000,000	16	15	10	1

gn6i, GPU-accelerated compute-optimized instance family

Supported scenarios:
- AI (deep learning and machine learning) inference for computer vision, speech recognition, speech synthesis, natural language processing (NLP), machine translation, and recommendation systems
- Real-time rendering for cloud gaming
- Real-time rendering for AR and VR applications
- Graphics workstations or graphics-heavy computing
- GPU-accelerated databases
- High-performance computing
Compute:
- Uses NVIDIA T4 GPUs that have the following features:
  - Innovative NVIDIA Turing architecture
  - 16 GB of memory (320 GB/s bandwidth) per GPU
  - 2,560 CUDA cores per GPU
  - Up to 320 Turing Tensor cores per GPU
  - Mixed-precision Tensor cores that support 65 FP16 TFLOPS, 130 INT8 TOPS, and 260 INT4 TOPS
- Offers a CPU-to-memory ratio of 1:4.
- Uses 2.5 GHz Intel® Xeon® Platinum 8163 (Skylake) processors.
Storage:
- Is an instance family in which all instances are I/O optimized.
- Supports ESSDs, ESSD AutoPL disks, standard SSDs, and ultra disks.
Network:
- Supports IPv4 and IPv6. For information about IPv6 communication, see IPv6 communication.
- Provides high network performance based on large computing capacity.

Instance types

Instance type	vCPUs	Memory (GiB)	GPU	GPU memory	Network baseline bandwidth (Gbit/s)	Packet forwarding rate (pps)	Disk baseline IOPS	NIC queues	ENIs	Private IPv4 addresses per ENI	IPv6 addresses per ENI
ecs.gn6i-c4g1.xlarge	4	15	NVIDIA T4 * 1	16GB * 1	4	500,000	None	2	2	10	1
ecs.gn6i-c8g1.2xlarge	8	31	NVIDIA T4 * 1	16GB * 1	5	800,000	None	2	2	10	1
ecs.gn6i-c16g1.4xlarge	16	62	NVIDIA T4 * 1	16GB * 1	6	1,000,000	None	4	3	10	1
ecs.gn6i-c24g1.6xlarge	24	93	NVIDIA T4 * 1	16GB * 1	7.5	1,200,000	None	6	4	10	1
ecs.gn6i-c40g1.10xlarge	40	155	NVIDIA T4 * 1	16GB * 1	10	1,600,000	None	16	10	10	1
ecs.gn6i-c24g1.12xlarge	48	186	NVIDIA T4 * 2	16GB * 2	15	2,400,000	None	12	6	10	1
ecs.gn6i-c24g1.24xlarge	96	372	NVIDIA T4 * 4	16GB * 4	30	4,800,000	250,000	24	8	10	1

gn6e, GPU-accelerated compute-optimized instance family

Supported scenarios:
- Deep learning applications, such as training and inference applications of AI algorithms used in image classification, autonomous vehicles, and speech recognition
- Scientific computing applications, such as computational fluid dynamics, computational finance, molecular dynamics, and environmental analytics
Compute:
- Uses NVIDIA V100 GPUs (SXM2-based) that support NVLink and have the following features:
  - Innovative NVIDIA Volta architecture
  - 32 GB of HBM2 memory (900 GB/s bandwidth) per GPU
  - 5,120 CUDA cores per GPU
  - 640 Tensor cores per GPU
  - Up to six NVLink bidirectional connections per GPU, each of which provides a bandwidth of 25 GB/s in each direction for a total bandwidth of 300 GB/s (6 × 25 × 2 = 300)
- Offers a CPU-to-memory ratio of 1:8.
- Uses 2.5 GHz Intel® Xeon® Platinum 8163 (Skylake) processors.
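
The aggregate NVLink figure in the feature list above follows from the number of links, the per-direction bandwidth of each link, and the two directions. A minimal sketch of that arithmetic, with a hypothetical helper name and the result expressed in the same unit as the per-link figure:

```python
def nvlink_aggregate_bandwidth(links, per_direction):
    """Aggregate bandwidth = links * per-direction bandwidth * 2 directions."""
    return links * per_direction * 2


# Six links at 25 per direction, as in the feature list.
total = nvlink_aggregate_bandwidth(links=6, per_direction=25)
print(total)  # 300
```

The same helper shows how the aggregate shrinks when fewer links are available between a pair of GPUs: four links at the same per-link rate give 200.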
Storage:
- Is an instance family in which all instances are I/O optimized.
- Supports ESSDs, ESSD AutoPL disks, standard SSDs, and ultra disks.
Network:
- Supports IPv4 and IPv6. For information about IPv6 communication, see IPv6 communication.
- Provides high network performance based on large computing capacity.

Instance types

Instance type	vCPUs	Memory (GiB)	GPU	GPU memory	Network baseline bandwidth (Gbit/s)	Packet forwarding rate (pps)	NIC queues	ENIs	Private IPv4 addresses per ENI	IPv6 addresses per ENI
ecs.gn6e-c12g1.3xlarge	12	92	NVIDIA V100 * 1	32GB * 1	5	800,000	8	6	10	1
ecs.gn6e-c12g1.6xlarge	24	182	NVIDIA V100 * 2	32GB * 2	8	1,200,000	8	8	20	1
ecs.gn6e-c12g1.12xlarge	48	368	NVIDIA V100 * 4	32GB * 4	16	2,400,000	8	8	20	1
ecs.gn6e-c12g1.24xlarge	96	736	NVIDIA V100 * 8	32GB * 8	32	4,800,000	16	8	20	1

gn6v, GPU-accelerated compute-optimized instance family

Supported scenarios:
- Deep learning applications, such as training and inference applications of AI algorithms used in image classification, autonomous vehicles, and speech recognition
- Scientific computing applications, such as computational fluid dynamics, computational finance, molecular dynamics, and environmental analytics
Compute:
- Uses NVIDIA V100 GPUs (SXM2-based) that have the following features:
  - Innovative NVIDIA Volta architecture
  - 16 GB of HBM2 memory (900 GB/s bandwidth) per GPU
  - 5,120 CUDA cores per GPU
  - 640 Tensor cores per GPU
  - Up to six NVLink bidirectional connections per GPU, each of which provides a bandwidth of 25 GB/s in each direction for a total bandwidth of 300 GB/s (6 × 25 × 2 = 300)
- Offers a CPU-to-memory ratio of 1:4.
- Uses 2.5 GHz Intel® Xeon® Platinum 8163 (Skylake) processors.
Storage:
- Is an instance family in which all instances are I/O optimized.
- Supports ESSDs, ESSD AutoPL disks, standard SSDs, and ultra disks.
Network:
- Supports IPv4 and IPv6. For information about IPv6 communication, see IPv6 communication.
- Provides high network performance based on large computing capacity.

Instance types

Instance type	vCPUs	Memory (GiB)	GPU	GPU memory	Network baseline bandwidth (Gbit/s)	Packet forwarding rate (pps)	Disk baseline IOPS	NIC queues	ENIs	Private IPv4 addresses per ENI	IPv6 addresses per ENI
ecs.gn6v-c8g1.2xlarge	8	32	NVIDIA V100 * 1	16GB * 1	2.5	800,000	None	4	4	10	1
ecs.gn6v-c8g1.4xlarge	16	64	NVIDIA V100 * 2	16GB * 2	5	1,000,000	None	4	8	20	1
ecs.gn6v-c8g1.8xlarge	32	128	NVIDIA V100 * 4	16GB * 4	10	2,000,000	None	8	8	20	1
ecs.gn6v-c8g1.16xlarge	64	256	NVIDIA V100 * 8	16GB * 8	20	2,500,000	None	16	8	20	1
ecs.gn6v-c10g1.20xlarge	82	336	NVIDIA V100 * 8	16GB * 8	32	4,500,000	250,000	16	8	20	1

ebmgn8v, GPU-accelerated compute-optimized ECS Bare Metal Instance family

ebmgn8is, GPU-accelerated compute-optimized ECS Bare Metal Instance family

This instance family is available only in specific regions, including regions outside China. To use the instance family, contact Alibaba Cloud sales personnel.

Introduction:
This instance family is an 8th-generation GPU-accelerated compute-optimized ECS Bare Metal Instance family provided by Alibaba Cloud in response to recent developments in generative AI. Each instance of this instance family is equipped with eight GPUs.
Supported scenarios:
- Production and rendering of special effects for animation, film, and television. Workstation-level graphics processing is available when Alibaba Cloud Marketplace GRID images are used, the GRID driver is installed, and OpenGL and Direct3D graphics capabilities are enabled.
- AI-generated graphic content and inference tasks for LLMs with up to 130 billion parameters, run as containerized applications managed by Container Service for Kubernetes (ACK)
- Other general-purpose AI recognition, image recognition, and speech recognition scenarios
Benefits and positioning:
- Graphic processing: This instance family uses high-frequency 5th-generation Intel Xeon Scalable processors to deliver sufficient CPU computing power in 3D modeling scenarios and achieve smooth graphics rendering and design.
- Inference tasks: This instance family uses innovative GPUs, each with 48 GB of memory, which accelerate inference tasks and support the FP8 floating-point format. You can use this instance family together with ACK to support the inference of various AI-generated content (AIGC) models and accommodate inference tasks for LLMs that have fewer than 70 billion parameters.
- Training tasks: This instance family provides cost-effective computing capabilities and delivers double the FP32 computing performance of the 7th-generation inference instances. Instances of this instance family are suitable for training FP32-based CV models and other small and medium-sized models.
This instance family uses the latest CIPU 1.0 processors that provide the following benefits:
- Decouples computing capabilities from storage capabilities, allowing you to flexibly select storage resources based on your business requirements, and increases inter-instance bandwidth to 160 Gbit/s for faster data transmission and processing compared with previous-generation instance families.
- Uses the bare metal capabilities provided by CIPU processors to support Peripheral Component Interconnect Express (PCIe) P2P communication between GPU-accelerated instances.
Compute:
- Uses innovative GPUs that have the following features:
  - Support for acceleration features such as vGPU, RTX technology, and TensorRT inference engine
  - Support for PCIe Switch interconnect, which achieves a 36% increase in NVIDIA Collective Communications Library (NCCL) performance compared with the CPU direct connection scheme and helps improve inference performance by up to 9% when you run LLM inference tasks on multiple GPUs in parallel
  - Support for eight GPUs per instance with 48 GB of memory per GPU to support LLM inference tasks with 70 billion or more parameters on a single instance
- Uses 3.4 GHz Intel® Xeon® Scalable (SPR) processors that deliver an all-core turbo frequency of 3.9 GHz.
Storage:
- Is an instance family in which all instances are I/O optimized.
- Supports ESSDs, ESSD AutoPL disks, and EEDs.
Network:
- Supports IPv4 and IPv6.
- Provides ultra-high network performance with a packet forwarding rate of 30,000,000 pps.
- Supports ERIs to allow inter-instance RDMA-based communication in VPCs and provides up to 160 Gbit/s of bandwidth per instance, which is suitable for training tasks based on CV models and traditional models.
  Note
  For information about how to use ERIs, see Configure eRDMA on an enterprise-level instance.

Instance types

Instance type	vCPUs	Memory (GiB)	GPU memory	Network baseline bandwidth (Gbit/s)	Packet forwarding rate (pps)	Private IPv4 addresses per ENI	IPv6 addresses per ENI	NIC queues (Primary ENI/Secondary ENI)	ENIs	Maximum attached data disks	Maximum disk bandwidth (Gbit/s)
ecs.ebmgn8is.32xlarge	128	1024	48 GB × 8	160 (80 × 2)	30,000,000	30	30	64/16	32	31	6

Note

The boot mode of the images that are used by instances of this instance family must be UEFI. If you want to use custom images on the instances, make sure that the images support the UEFI boot mode and the boot mode of the images is set to UEFI. For information about how to set the boot mode of a custom image, see Set the boot mode of custom images to the UEFI mode by calling API operations.

ebmgn7e, GPU-accelerated compute-optimized ECS Bare Metal Instance family

Introduction:
This instance family uses the SHENLONG architecture to provide flexible and powerful software-defined compute.
Supported scenarios:
- Deep learning training and development
- High-performance computing (HPC) and simulations
Important
When you use AI training services that feature a high communication load, such as transformer models, you must enable NVLink for GPU-to-GPU communication. Otherwise, data may be damaged due to unpredictable failures that are caused by large-scale data transmission over Peripheral Component Interconnect Express (PCIe) links. If you do not understand the topology of the communication links that are used for AI training services, submit a ticket to obtain technical support.
Compute:
- Uses 2.9 GHz Intel® Xeon® Scalable processors that deliver an all-core turbo frequency of 3.5 GHz and support PCIe 4.0 interfaces.
Storage:
- Is an instance family in which all instances are I/O optimized.
- Supports only ESSDs and ESSD AutoPL disks.
Network:
- Supports IPv4 and IPv6.
- Provides ultra-high network performance with a packet forwarding rate of 24,000,000 pps.

Instance types

Instance type	vCPUs	Memory (GiB)	GPU memory	Network baseline bandwidth (Gbit/s)	Packet forwarding rate (pps)	NIC queues (Primary NIC/Secondary NIC)	ENIs	Private IPv4 addresses per ENI	IPv6 addresses per ENI
ecs.ebmgn7e.32xlarge	128	1024	80 GB × 8	64	24,000,000	32/12	32	10	1

You must check the status of the multi-instance GPU (MIG) feature and enable or disable the MIG feature after you start an ebmgn7e instance. For information about the MIG feature, see NVIDIA Multi-Instance GPU User Guide.

The following table describes whether the MIG feature is supported by the instance types in the ebmgn7e instance family.

Instance type	Support for MIG	Description
ecs.ebmgn7e.32xlarge	Yes	The MIG feature is supported by ebmgn7e instances.
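
One way to check the MIG status after an ebmgn7e instance starts is to query `nvidia-smi`. The sketch below assumes that the NVIDIA driver is installed and that `nvidia-smi` is on the PATH. The helper names are hypothetical, and the parser expects the `index, mig.mode.current` CSV rows that `nvidia-smi` prints with `--format=csv,noheader`.

```python
import subprocess


def parse_mig_modes(csv_text):
    """Map GPU index to current MIG mode from 'index, mode' CSV rows."""
    modes = {}
    for line in csv_text.strip().splitlines():
        if not line.strip():
            continue  # skip blank rows
        index, mode = (field.strip() for field in line.split(",", 1))
        modes[int(index)] = mode
    return modes


def query_mig_modes():
    """Run nvidia-smi on the instance and return {gpu_index: mode}."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=index,mig.mode.current", "--format=csv,noheader"],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_mig_modes(result.stdout)
```

Calling `query_mig_modes()` on the instance returns a value such as `Enabled` or `Disabled` for each GPU. To change the mode, the NVIDIA Multi-Instance GPU User Guide describes `nvidia-smi -i <GPU index> -mig 1` (enable) and `-mig 0` (disable), which typically require root permissions and may require a GPU reset.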

ebmgn7i, GPU-accelerated compute-optimized ECS Bare Metal Instance family

Introduction:
This instance family uses the SHENLONG architecture to provide flexible and powerful software-defined compute.
Supported scenarios:
- Concurrent AI inference tasks that require high-performance CPUs, memory, and GPUs, such as image recognition, speech recognition, and behavior identification
- Compute-intensive graphics processing tasks that require high-performance 3D graphics virtualization capabilities, such as remote graphic design and cloud gaming
- Scenarios that require high network bandwidth and disk bandwidth, such as the creation of high-performance render farms
- Small-scale deep learning and training applications that require high network bandwidth
Compute:
- Uses NVIDIA A10 GPUs that have the following features:
  - Innovative NVIDIA Ampere architecture
  - Support for acceleration features such as vGPU, RTX technology, and TensorRT inference engine
- Uses 2.9 GHz Intel® Xeon® Scalable (Ice Lake) processors that deliver an all-core turbo frequency of 3.5 GHz.
Storage:
- Is an instance family in which all instances are I/O optimized.
- Supports only ESSDs and ESSD AutoPL disks.
Network:
- Supports IPv4 and IPv6.
- Provides ultra-high network performance with a packet forwarding rate of 24,000,000 pps.

Instance types

Instance type	vCPUs	Memory (GiB)	GPU	GPU memory	Network baseline bandwidth (Gbit/s)	Packet forwarding rate (pps)	NIC queues	ENIs	Private IPv4 addresses per ENI	IPv6 addresses per ENI
ecs.ebmgn7i.32xlarge	128	768	NVIDIA A10 × 4	24 GB × 4	64	24,000,000	32	32	10	1

ebmgn7, GPU-accelerated compute-optimized ECS Bare Metal Instance family

Introduction:
This instance family uses the SHENLONG architecture to provide flexible and powerful software-defined compute.
Supported scenarios:
- Deep learning applications, such as training applications of AI algorithms used in image classification, autonomous vehicles, and speech recognition
- Scientific computing applications that require robust GPU computing capabilities, such as computational fluid dynamics, computational finance, molecular dynamics, and environmental analytics
Compute:
- Uses 2.5 GHz Intel® Xeon® Platinum 8269CY (Cascade Lake) processors.
Storage:
- Is an instance family in which all instances are I/O optimized.
- Supports only ESSDs and ESSD AutoPL disks.
Network:
- Supports IPv4 and IPv6.
- Provides high network performance based on large computing capacity.

Instance types

Instance type	vCPUs	Memory (GiB)	GPU memory	Network baseline bandwidth (Gbit/s)	Packet forwarding rate (pps)	NIC queues	ENIs	Private IPv4 addresses per ENI	IPv6 addresses per ENI
ecs.ebmgn7.26xlarge	104	768	40 GB × 8	30	18,000,000	16	15	10	1

After you start an ebmgn7 instance, you must manually check the status of the MIG feature and enable or disable it as needed. For more information about MIG, see the NVIDIA Multi-Instance GPU User Guide.

The following table describes whether the MIG feature is supported by the instance types in the ebmgn7 instance family.

Instance type	Support for MIG	Description
ecs.ebmgn7.26xlarge	Yes	The MIG feature is supported by ebmgn7 instances.
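
Enabling or disabling MIG on a single GPU can be sketched as follows. This is an illustrative sketch under stated assumptions: `nvidia-smi` is installed, the caller has root privileges, and no workload is using the GPU. The helper names `mig_command` and `set_mig` are hypothetical. The function defaults to a dry run that only returns the command.

```python
import subprocess


def mig_command(gpu_index, enable):
    """Build the nvidia-smi argv that requests MIG mode on or off for one GPU."""
    return ["nvidia-smi", "-i", str(gpu_index), "-mig", "1" if enable else "0"]


def set_mig(gpu_index, enable, dry_run=True):
    """Return the command; run it only when dry_run is False."""
    argv = mig_command(gpu_index, enable)
    if not dry_run:
        # Raises CalledProcessError if nvidia-smi reports a failure.
        subprocess.run(argv, check=True)
    return argv


if __name__ == "__main__":
    # Dry run: print the command that would enable MIG on GPU 0.
    print(" ".join(set_mig(0, enable=True)))
```

Depending on the driver and platform, a mode change may take effect only after the GPU is reset, so check the status again after you run the command.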

ebmgn6ia, GPU-accelerated compute-optimized ECS Bare Metal Instance family

Introduction:
- This instance family uses the third-generation SHENLONG architecture and fast path acceleration on chips to provide predictable and consistent ultra-high computing, storage, and network performance.
- This instance family uses NVIDIA T4 GPUs to accelerate graphics and AI applications. It uses container technology to start at least 60 virtual Android devices and provides hardware-accelerated video transcoding.
Supported scenarios:
- Remote application services based on Android, such as always-on cloud-based services, cloud-based mobile games, cloud-based mobile phones, and Android service crawlers
Compute:
- Offers a CPU-to-memory ratio of 1:3.
- Uses 2.8 GHz Ampere® Altra® Arm-based processors that deliver a turbo frequency of 3.0 GHz and provides high performance and high compatibility with applications for Android servers.
Storage:
- Is an instance family in which all instances are I/O optimized.
- Supports only ESSDs and ESSD AutoPL disks.
Network:
- Supports IPv4 and IPv6.

Instance types

Instance type	vCPUs	Memory (GiB)	GPU	GPU memory	Network baseline bandwidth (Gbit/s)	Packet forwarding rate (pps)	NIC queues	ENIs	Private IPv4 addresses per ENI	IPv6 addresses per ENI
ecs.ebmgn6ia.20xlarge	80	256	NVIDIA T4 × 2	16 GB × 2	32	24,000,000	32	15	10	1

Note

Ampere® Altra® processors have specific requirements for operating system kernels. Instances of the preceding instance type can use Alibaba Cloud Linux 3 images and CentOS 8.4 or later images. We recommend that you use Alibaba Cloud Linux 3 images. To use another operating system distribution, patch the kernel on an instance that runs that distribution, create a custom image from the instance, and then use the custom image to create instances of this instance type. For information about kernel patches, see the Ampere Altra (TM) Linux Kernel Porting Guide.

ebmgn6e, GPU-accelerated compute-optimized ECS Bare Metal Instance family

Introduction:
- This instance family uses the SHENLONG architecture to provide flexible and powerful software-defined compute.
- This instance family uses NVIDIA V100 GPUs (SXM2-based) that support NVLink and have the following features:
  - Innovative NVIDIA Volta architecture
  - 32 GB of HBM2 memory (900 GB/s bandwidth) per GPU
  - 5,120 CUDA cores per GPU
  - 640 Tensor cores per GPU
  - Up to six NVLink connections per GPU, each of which provides a bandwidth of 25 GB/s in each direction for a total bandwidth of 300 GB/s (6 × 25 × 2 = 300)
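
The NVLink bandwidth arithmetic in the last item can be written out as a short calculation. This is only a worked example of the figures stated above; the function name `nvlink_total_bandwidth` is hypothetical.

```python
def nvlink_total_bandwidth(links, per_direction_gb_s, directions=2):
    """Aggregate NVLink bandwidth in GB/s: links x per-direction rate x directions."""
    return links * per_direction_gb_s * directions


# Six links at 25 GB/s in each direction, counted in both directions.
total = nvlink_total_bandwidth(links=6, per_direction_gb_s=25)
print(total)  # 300
```

The 300 GB/s figure is an aggregate across all six links and both directions. The one-way bandwidth of a single link remains 25 GB/s.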
Supported scenarios:
- Deep learning applications, such as training and inference applications of AI algorithms used in image classification, autonomous vehicles, and speech recognition
- Scientific computing applications, such as computational fluid dynamics, computational finance, molecular dynamics, and environmental analytics
Compute:
- Offers a CPU-to-memory ratio of 1:8.
- Uses 2.5 GHz Intel® Xeon® Platinum 8163 (Skylake) processors.
Storage:
- Is an instance family in which all instances are I/O optimized.
- Supports ESSDs, ESSD AutoPL disks, standard SSDs, and ultra disks.
Network:
- Supports IPv4 and IPv6.
- Provides high network performance based on large computing capacity.

Instance types

Instance type	vCPUs	Memory (GiB)	GPU	GPU memory	Network baseline bandwidth (Gbit/s)	Packet forwarding rate (pps)	NIC queues	ENIs	Private IPv4 addresses per ENI	IPv6 addresses per ENI
ecs.ebmgn6e.24xlarge	96	768	NVIDIA V100 × 8	32 GB × 8	32	4,800,000	16	15	10	1

ebmgn6v, GPU-accelerated compute-optimized ECS Bare Metal Instance family

Introduction:
- This instance family uses the SHENLONG architecture to provide flexible and powerful software-defined compute.
- This instance family uses NVIDIA V100 GPUs (SXM2-based) that have the following features:
  - Innovative NVIDIA Volta architecture
  - 16 GB of HBM2 memory (900 GB/s bandwidth) per GPU
  - 5,120 CUDA cores per GPU
  - 640 Tensor cores per GPU
  - Up to six NVLink connections per GPU, each of which provides a bandwidth of 25 GB/s in each direction for a total bandwidth of 300 GB/s (6 × 25 × 2 = 300)
Supported scenarios:
- Deep learning applications, such as training and inference applications of AI algorithms used in image classification, autonomous vehicles, and speech recognition
- Scientific computing applications, such as computational fluid dynamics, computational finance, molecular dynamics, and environmental analytics
Compute:
- Offers a CPU-to-memory ratio of 1:4.
- Uses 2.5 GHz Intel® Xeon® Platinum 8163 (Skylake) processors.
Storage:
- Is an instance family in which all instances are I/O optimized.
- Supports ESSDs, ESSD AutoPL disks, standard SSDs, and ultra disks.
Network:
- Supports IPv4 and IPv6.
- Provides high network performance based on large computing capacity.

Instance types

Instance type	vCPUs	Memory (GiB)	GPU	GPU memory	Network baseline bandwidth (Gbit/s)	Packet forwarding rate (pps)	NIC queues	ENIs	Private IPv4 addresses per ENI	IPv6 addresses per ENI
ecs.ebmgn6v.24xlarge	96	384	NVIDIA V100 × 8	16 GB × 8	30	4,500,000	8	32	10	1

ebmgn6i, GPU-accelerated compute-optimized ECS Bare Metal Instance family

Introduction:
- This instance family uses the SHENLONG architecture to provide flexible and powerful software-defined compute.
- This instance family uses NVIDIA T4 GPUs that have the following features:
  - Innovative NVIDIA Turing architecture
  - 16 GB of memory (320 GB/s bandwidth) per GPU
  - 2,560 CUDA cores per GPU
  - Up to 320 Turing Tensor cores per GPU
  - Mixed-precision Tensor cores that support 65 FP16 TFLOPS, 130 INT8 TOPS, and 260 INT4 TOPS
Supported scenarios:
- AI (deep learning and machine learning) inference for computer vision, speech recognition, speech synthesis, natural language processing (NLP), machine translation, and recommendation systems
- Real-time rendering for cloud gaming
- Real-time rendering for Augmented Reality (AR) and Virtual Reality (VR) applications
- Graphics workstations or graphics-heavy computing
- GPU-accelerated databases
- High-performance computing
Compute:
- Offers a CPU-to-memory ratio of 1:4.
- Uses 2.5 GHz Intel® Xeon® Platinum 8163 (Skylake) processors.
Storage:
- Is an instance family in which all instances are I/O optimized.
- Supports ESSDs, ESSD AutoPL disks, standard SSDs, and ultra disks.
Network:
- Supports IPv4 and IPv6.
- Provides high network performance based on large computing capacity.

Instance types

Instance type	vCPUs	Memory (GiB)	GPU	GPU memory	Network baseline bandwidth (Gbit/s)	Packet forwarding rate (pps)	NIC queues	ENIs	Private IPv4 addresses per ENI	IPv6 addresses per ENI
ecs.ebmgn6i.24xlarge	96	384	NVIDIA T4 × 4	16 GB × 4	30	4,500,000	8	32	10	1

gn5i, GPU-accelerated compute-optimized instance family

Supported scenarios: server-side GPU compute workloads, such as deep learning inference and multimedia encoding and decoding.
Compute:
- Uses NVIDIA P4 GPUs.
- Offers a CPU-to-memory ratio of 1:4.
- Uses 2.5 GHz Intel® Xeon® E5-2682 v4 (Broadwell) processors.
Storage:
- Is an instance family in which all instances are I/O optimized.
- Supports standard SSDs and ultra disks.
Network:
- Supports IPv4 and IPv6. For information about IPv6 communication, see IPv6 communication.
- Provides high network performance based on large computing capacity.

Instance types

Instance type	vCPUs	Memory (GiB)	GPU	GPU memory	Network baseline bandwidth (Gbit/s)	Packet forwarding rate (pps)	NIC queues	ENIs	Private IPv4 addresses per ENI	IPv6 addresses per ENI
ecs.gn5i-c2g1.large	2	8	NVIDIA P4 × 1	8 GB × 1	1	100,000	2	2	6	1
ecs.gn5i-c4g1.xlarge	4	16	NVIDIA P4 × 1	8 GB × 1	1.5	200,000	2	3	10	1
ecs.gn5i-c8g1.2xlarge	8	32	NVIDIA P4 × 1	8 GB × 1	2	400,000	4	4	10	1
ecs.gn5i-c16g1.4xlarge	16	64	NVIDIA P4 × 1	8 GB × 1	3	800,000	4	8	20	1
ecs.gn5i-c16g1.8xlarge	32	128	NVIDIA P4 × 2	8 GB × 2	6	1,200,000	8	8	20	1
ecs.gn5i-c28g1.14xlarge	56	224	NVIDIA P4 × 2	8 GB × 2	10	2,000,000	14	8	20	1
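
Choosing an instance type from the preceding table can be sketched as a small lookup. This is a minimal sketch: the vCPU, memory, and GPU counts are copied from the table. The `smallest_fit` helper is hypothetical, and its rule of preferring the fewest vCPUs is an assumption used as a rough stand-in for cost.

```python
from typing import NamedTuple


class InstanceType(NamedTuple):
    name: str
    vcpus: int
    memory_gib: int
    gpus: int


# Values copied from the gn5i instance type table.
GN5I = [
    InstanceType("ecs.gn5i-c2g1.large", 2, 8, 1),
    InstanceType("ecs.gn5i-c4g1.xlarge", 4, 16, 1),
    InstanceType("ecs.gn5i-c8g1.2xlarge", 8, 32, 1),
    InstanceType("ecs.gn5i-c16g1.4xlarge", 16, 64, 1),
    InstanceType("ecs.gn5i-c16g1.8xlarge", 32, 128, 2),
    InstanceType("ecs.gn5i-c28g1.14xlarge", 56, 224, 2),
]


def smallest_fit(min_vcpus=0, min_memory_gib=0, min_gpus=0):
    """Return the instance type with the fewest vCPUs that meets all minimums, or None."""
    candidates = [
        t for t in GN5I
        if t.vcpus >= min_vcpus and t.memory_gib >= min_memory_gib and t.gpus >= min_gpus
    ]
    return min(candidates, key=lambda t: t.vcpus, default=None)


if __name__ == "__main__":
    print(smallest_fit(min_vcpus=6, min_memory_gib=24).name)
    print(smallest_fit(min_gpus=2).name)
```

The lookup does not account for regional availability or actual pricing. Confirm both before you purchase an instance.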

gn5, GPU-accelerated compute-optimized instance family

Supported scenarios:
- Deep learning
- Scientific computing applications, such as computational fluid dynamics, computational finance, genomics, and environmental analytics
- Server-side GPU compute workloads, such as high-performance computing, rendering, and multimedia encoding and decoding
Compute:
- Uses NVIDIA P100 GPUs.
- Offers multiple CPU-to-memory ratios.
- Uses 2.5 GHz Intel® Xeon® E5-2682 v4 (Broadwell) processors.
Storage:
- Supports high-performance local Non-Volatile Memory Express (NVMe) SSDs.
- Is an instance family in which all instances are I/O optimized.
- Supports standard SSDs and ultra disks.
Network:
- Supports only IPv4.
- Provides high network performance based on large computing capacity.

Instance types

Instance type	vCPUs	Memory (GiB)	Local storage (GiB)	GPU	GPU memory	Network baseline bandwidth (Gbit/s)	Packet forwarding rate (pps)	NIC queues	ENIs	Private IPv4 addresses per ENI
ecs.gn5-c4g1.xlarge	4	30	440	NVIDIA P100 × 1	16 GB × 1	3	300,000	1	3	10
ecs.gn5-c8g1.2xlarge	8	60	440	NVIDIA P100 × 1	16 GB × 1	3	400,000	1	4	10
ecs.gn5-c4g1.2xlarge	8	60	880	NVIDIA P100 × 2	16 GB × 2	5	1,000,000	2	4	10
ecs.gn5-c8g1.4xlarge	16	120	880	NVIDIA P100 × 2	16 GB × 2	5	1,000,000	4	8	20
ecs.gn5-c28g1.7xlarge	28	112	440	NVIDIA P100 × 1	16 GB × 1	5	1,000,000	8	8	20
ecs.gn5-c8g1.8xlarge	32	240	1760	NVIDIA P100 × 4	16 GB × 4	10	2,000,000	8	8	20
ecs.gn5-c28g1.14xlarge	56	224	880	NVIDIA P100 × 2	16 GB × 2	10	2,000,000	14	8	20
ecs.gn5-c8g1.14xlarge	54	480	3520	NVIDIA P100 × 8	16 GB × 8	25	4,000,000	14	8	20
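
The statement that gn5 offers multiple CPU-to-memory ratios can be checked against the preceding table. This is a minimal sketch: the vCPU and memory values are copied from the table, and the helper name `memory_per_vcpu` is hypothetical.

```python
# (vCPUs, memory in GiB) copied from the gn5 instance type table.
GN5 = {
    "ecs.gn5-c4g1.xlarge": (4, 30),
    "ecs.gn5-c8g1.2xlarge": (8, 60),
    "ecs.gn5-c4g1.2xlarge": (8, 60),
    "ecs.gn5-c8g1.4xlarge": (16, 120),
    "ecs.gn5-c28g1.7xlarge": (28, 112),
    "ecs.gn5-c8g1.8xlarge": (32, 240),
    "ecs.gn5-c28g1.14xlarge": (56, 224),
    "ecs.gn5-c8g1.14xlarge": (54, 480),
}


def memory_per_vcpu(vcpus, memory_gib):
    """Return GiB of memory per vCPU, rounded to two decimal places."""
    return round(memory_gib / vcpus, 2)


ratios = {name: memory_per_vcpu(*spec) for name, spec in GN5.items()}

for name, gib in ratios.items():
    print(f"{name}: 1:{gib}")

# Distinct ratios found across the family.
print(sorted(set(ratios.values())))  # [4.0, 7.5, 8.89]
```

The c28g1 instance types provide 4 GiB of memory per vCPU, and most of the other instance types provide 7.5 GiB. ecs.gn5-c8g1.14xlarge provides approximately 8.89 GiB per vCPU.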