GPU-accelerated compute-optimized instances deliver high performance and strong parallel computing capabilities, which makes them suitable for large-scale parallel computing workloads. This topic describes the features of the GPU-accelerated compute-optimized instance families of Elastic Compute Service (ECS) and lists the instance types in each instance family.
Recommended instance families
ebmgn8is, GPU-accelerated compute-optimized ECS Bare Metal Instance family
ebmgn7e, GPU-accelerated compute-optimized ECS Bare Metal Instance family
ebmgn7i, GPU-accelerated compute-optimized ECS Bare Metal Instance family
ebmgn7, GPU-accelerated compute-optimized ECS Bare Metal Instance family
ebmgn6ia, GPU-accelerated compute-optimized ECS Bare Metal Instance family
ebmgn6e, GPU-accelerated compute-optimized ECS Bare Metal Instance family
ebmgn6v, GPU-accelerated compute-optimized ECS Bare Metal Instance family
ebmgn6i, GPU-accelerated compute-optimized ECS Bare Metal Instance family
Other available instance families (If the recommended instance families are sold out, you can use these instance families instead.)
gn8is, GPU-accelerated compute-optimized instance family
This instance family is available only in specific regions, including regions outside China. To use the instance family, contact Alibaba Cloud sales personnel.
Features:
This instance family is an 8th-generation GPU-accelerated compute-optimized instance family that Alibaba Cloud provides in response to the recent growth of generative AI workloads. This instance family consists of multiple instance types that provide 1, 2, 4, or 8 GPUs per instance and have different CPU-to-GPU ratios to fit various use cases.
Benefits and positioning:
Graphics processing: This instance family uses high-frequency 5th-generation Intel Xeon Scalable processors to provide sufficient CPU capacity for smooth graphics rendering and design in 3D modeling scenarios.
Inference tasks: This instance family uses innovative GPUs, each with 48 GB of memory, which accelerate inference tasks and support the FP8 floating-point format. You can use this instance family together with Container Service for Kubernetes (ACK) to support the inference of various AI-generated content (AIGC) models and accommodate inference tasks for 70B or larger large language models (LLMs).
Compute:
Uses innovative GPUs that have the following features:
Support for acceleration features, such as TensorRT, and the FP8 floating-point format to improve LLM inference performance.
Up to 48 GB of memory per GPU and support for the inference of 70B or larger LLMs on a single instance with multiple GPUs.
Improved graphics processing capabilities. For example, after you install a GRID driver on a gn8is instance by using Cloud Assistant or an Alibaba Cloud Marketplace image, the instance can deliver twice the graphics processing performance of a 7th-generation instance.
Uses the latest high-frequency Intel® Xeon® processors that deliver an all-core turbo frequency of 3.9 GHz to meet complex 3D modeling requirements.
Storage:
Is an instance family in which all instances are I/O optimized.
Supports ESSDs, ESSD AutoPL disks, and elastic ephemeral disks (EEDs).
Network:
Supports IPv4 and IPv6.
Supports Elastic RDMA interfaces (ERIs).
Note: For information about how to use ERIs, see Configure eRDMA on an enterprise-level instance.
Supported scenarios:
Animation, special effects for film and television, and rendering
Generation of AIGC images and inference of LLMs
Other general-purpose AI recognition, image recognition, and speech recognition scenarios
Instance types
Instance type | vCPUs | Memory (GiB) | GPU memory | Network baseline bandwidth (Gbit/s) | ENIs | NIC queues per primary ENI | IP addresses (IPv4/IPv6) | Maximum disks | Disk baseline IOPS | Disk baseline bandwidth (Gbit/s) |
ecs.gn8is.2xlarge | 8 | 64 | 48GB * 1 | 8 | 4 | 8 | 15/15 | 17 | 60,000 | 0.75 |
ecs.gn8is.4xlarge | 16 | 128 | 48GB * 1 | 16 | 8 | 16 | 30/30 | 17 | 120,000 | 1.25 |
ecs.gn8is-2x.8xlarge | 32 | 256 | 48GB * 2 | 32 | 8 | 32 | 30/30 | 33 | 250,000 | 2 |
ecs.gn8is-4x.16xlarge | 64 | 512 | 48GB * 4 | 64 | 8 | 64 | 30/30 | 33 | 450,000 | 4 |
ecs.gn8is-8x.32xlarge | 128 | 1024 | 48GB * 8 | 100 | 15 | 64 | 50/50 | 65 | 900,000 | 8 |
You can go to the Instance Types Available for Each Region page to view the instance types available in each region.
For more information about these specifications, see the Instance type specifications section of the "Overview of instance families" topic.
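The statement that 70B or larger LLMs fit on a single multi-GPU instance can be sanity-checked with a rough weight-memory estimate. The following is a minimal sketch, not an official sizing tool. The function names and the `headroom` multiplier (extra room for KV cache and activations) are assumptions made for illustration, and 1 GB is treated as 10^9 bytes. The GPU counts and per-GPU memory sizes are copied from the preceding table.

```python
# GPU count and per-GPU memory (GB) for each gn8is instance type,
# copied from the "GPU memory" column of the instance type table.
GN8IS_TYPES = {
    "ecs.gn8is.2xlarge": (1, 48),
    "ecs.gn8is.4xlarge": (1, 48),
    "ecs.gn8is-2x.8xlarge": (2, 48),
    "ecs.gn8is-4x.16xlarge": (4, 48),
    "ecs.gn8is-8x.32xlarge": (8, 48),
}


def weight_memory_gb(params_billion, bytes_per_param):
    """Memory needed for the model weights alone, in GB (1 GB = 1e9 bytes)."""
    return params_billion * bytes_per_param


def candidate_types(params_billion, bytes_per_param, headroom=1.2):
    """Return gn8is types whose total GPU memory covers weights * headroom.

    headroom is an assumed multiplier for KV cache and activations.
    It is not a documented figure; tune it for your own workload.
    """
    needed = weight_memory_gb(params_billion, bytes_per_param) * headroom
    return [
        name
        for name, (gpus, mem_gb) in GN8IS_TYPES.items()
        if gpus * mem_gb >= needed
    ]


if __name__ == "__main__":
    # 70B parameters stored in FP16 (2 bytes) and in FP8 (1 byte).
    print(candidate_types(70, 2))
    print(candidate_types(70, 1))
```

Under these assumptions, a 70B model in FP16 needs about 140 GB for weights, so only the 4-GPU and 8-GPU instance types qualify, while FP8 halves the requirement and also admits the 2-GPU instance type. Real memory use depends on batch size, context length, and the inference framework, so treat the result as a starting point for stress tests.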
gn7e, GPU-accelerated compute-optimized instance family
Features:
You can select instance types that provide different numbers of GPUs and CPUs to meet your business requirements for AI use cases.
This instance family uses the third-generation SHENLONG architecture and doubles the average bandwidths of virtual private clouds (VPCs), networks, and disks compared with instance families of the previous generation.
Storage:
Is an instance family in which all instances are I/O optimized.
Supports only ESSDs and ESSD AutoPL disks.
Network:
Supports IPv6.
Provides high network performance based on large computing capacity.
Supported scenarios:
Small- and medium-scale AI training workloads
High-performance computing (HPC) business accelerated by using Compute Unified Device Architecture (CUDA)
AI inference tasks that require high GPU processing capabilities or large amounts of GPU memory
Deep learning applications such as training applications of AI algorithms used in image classification, autonomous vehicles, and speech recognition.
Scientific computing applications that require robust GPU computing capabilities such as computational fluid dynamics, computational finance, molecular dynamics, and environmental analytics.
Important: When you use AI training services that feature a high communication load, such as transformer models, you must enable NVLink for GPU-to-GPU communication. Otherwise, data may be damaged due to unpredictable failures that are caused by large-scale data transmission over Peripheral Component Interconnect Express (PCIe) links. If you do not understand the topology of the communication links that are used for AI training services, submit a ticket to obtain technical support.
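One way to inspect the GPU-to-GPU link topology on an instance is the `nvidia-smi topo -m` command, which prints a matrix in which `NV#` entries denote NVLink connections and entries such as `PIX`, `PXB`, `PHB`, `NODE`, and `SYS` denote paths that traverse PCIe. The following sketch parses that matrix and lists GPU pairs that are not connected over NVLink. The function name and the sample text are hypothetical and for illustration only. The sample is not output captured from a gn7e instance, and the exact column layout can vary between driver versions.

```python
import re

GPU_NAME = re.compile(r"^GPU\d+$")


def non_nvlink_pairs(topo_text):
    """Return (gpu_a, gpu_b, link) tuples for GPU pairs not linked by NVLink.

    topo_text is the text printed by `nvidia-smi topo -m`.
    """
    rows = [line.split() for line in topo_text.splitlines() if line.strip()]
    gpu_rows = [row for row in rows if GPU_NAME.match(row[0])]
    if not gpu_rows:
        return []
    # The first matching line is the header; the rest are matrix rows.
    header, body = gpu_rows[0], gpu_rows[1:]
    columns = [token for token in header if GPU_NAME.match(token)]
    pairs = []
    for row in body:
        src = row[0]
        links = row[1:1 + len(columns)]
        for dst, link in zip(columns, links):
            # Report each unordered pair once and skip the "X" self entry.
            if int(src[3:]) < int(dst[3:]) and not link.startswith("NV"):
                pairs.append((src, dst, link))
    return pairs


# Illustrative sample only (assumed layout, not real instance output).
SAMPLE = """\
        GPU0    GPU1    GPU2    CPU Affinity
GPU0     X      NV12    SYS     0-31
GPU1    NV12     X      PIX     0-31
GPU2    SYS     PIX      X      32-63
"""

if __name__ == "__main__":
    for pair in non_nvlink_pairs(SAMPLE):
        print(pair)
```

An empty result suggests that every GPU pair reported by the tool has an NVLink path. Any pair that is listed would exchange data over PCIe, which is the situation the preceding note warns about.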
Instance types
Instance type | vCPUs | Memory (GiB) | GPU memory | Network baseline bandwidth (Gbit/s) | Packet forwarding rate (pps) | NIC queues | ENIs | Private IPv4 addresses per ENI |
ecs.gn7e-c16g1.4xlarge | 16 | 125 | 80GB * 1 | 8 | 3,000,000 | 8 | 8 | 10 |
ecs.gn7e-c16g1.8xlarge | 32 | 250 | 80GB * 2 | 16 | 6,000,000 | 16 | 8 | 10 |
ecs.gn7e-c16g1.16xlarge | 64 | 500 | 80GB * 4 | 32 | 12,000,000 | 32 | 8 | 10 |
ecs.gn7e-c16g1.32xlarge | 128 | 1000 | 80GB * 8 | 64 | 24,000,000 | 32 | 16 | 15 |
You can go to the Instance Types Available for Each Region page to view the instance types available in each region.
For more information about these specifications, see the "Instance type specifications" section in Overview of instance families. Packet forwarding rates vary significantly based on business scenarios. We recommend that you perform business stress tests on instances to choose appropriate instance types.
After you create or restart a gn7e instance in the ECS console, the Multi-Instance GPU (MIG) feature of the instance is automatically disabled. For more information about MIG, see NVIDIA Multi-Instance GPU User Guide.
The following table describes whether the MIG feature is supported by the instance types in the gn7e instance family.
Instance type | MIG | Description |
ecs.gn7e-c16g1.4xlarge | Yes | Single-GPU instances support the MIG feature. |
ecs.gn7e-c16g1.16xlarge | No | For security purposes, multi-GPU instances do not support the MIG feature. |
ecs.gn7e-c16g1.32xlarge | No | For security purposes, multi-GPU instances do not support the MIG feature. |
gn7i, GPU-accelerated compute-optimized instance family
Features:
This instance family uses the third-generation SHENLONG architecture to provide predictable and consistent ultra-high performance. This instance family utilizes fast path acceleration on chips to improve storage performance, network performance, and computing stability by an order of magnitude.
Compute:
Uses NVIDIA A10 GPUs that have the following features:
Innovative Ampere architecture
Support for acceleration features such as RTX and TensorRT
Uses 2.9 GHz Intel® Xeon® Scalable (Ice Lake) processors that deliver an all-core turbo frequency of 3.5 GHz.
Provides up to 752 GiB of memory, which is much larger than the memory sizes of the gn6i instance family.
Storage:
Is an instance family in which all instances are I/O optimized.
Supports only ESSDs and ESSD AutoPL disks.
Network:
Supports IPv6.
Provides high network performance based on large computing capacity.
Supported scenarios:
Concurrent AI inference tasks that require high-performance CPUs, memory, and GPUs, such as image recognition, speech recognition, and behavior identification
Compute-intensive graphics processing tasks that require high-performance 3D graphics virtualization capabilities, such as remote graphic design and cloud gaming
Instance types
Instance type | vCPUs | Memory (GiB) | GPU | GPU memory | Network baseline bandwidth (Gbit/s) | Packet forwarding rate (pps) | NIC queues | ENIs | Private IPv4 addresses per ENI |
ecs.gn7i-c8g1.2xlarge | 8 | 30 | NVIDIA A10 * 1 | 24 GB * 1 | 16 | 1,600,000 | 8 | 4 | 15 |
ecs.gn7i-c16g1.4xlarge | 16 | 60 | NVIDIA A10 * 1 | 24 GB * 1 | 16 | 3,000,000 | 8 | 8 | 30 |
ecs.gn7i-c32g1.8xlarge | 32 | 188 | NVIDIA A10 * 1 | 24 GB * 1 | 16 | 6,000,000 | 12 | 8 | 30 |
ecs.gn7i-c32g1.16xlarge | 64 | 376 | NVIDIA A10 * 2 | 24 GB * 2 | 32 | 12,000,000 | 16 | 15 | 30 |
ecs.gn7i-c32g1.32xlarge | 128 | 752 | NVIDIA A10 * 4 | 24 GB * 4 | 64 | 24,000,000 | 32 | 15 | 30 |
ecs.gn7i-c48g1.12xlarge | 48 | 310 | NVIDIA A10 * 1 | 24 GB * 1 | 16 | 9,000,000 | 16 | 8 | 30 |
ecs.gn7i-c56g1.14xlarge | 56 | 346 | NVIDIA A10 * 1 | 24 GB * 1 | 16 | 12,000,000 | 16 | 12 | 30 |
ecs.gn7i-2x.8xlarge | 32 | 128 | NVIDIA A10 * 2 | 24 GB * 2 | 16 | 6,000,000 | 16 | 8 | 30 |
ecs.gn7i-4x.8xlarge | 32 | 128 | NVIDIA A10 * 4 | 24 GB * 4 | 16 | 6,000,000 | 16 | 8 | 30 |
ecs.gn7i-4x.16xlarge | 64 | 256 | NVIDIA A10 * 4 | 24 GB * 4 | 32 | 12,000,000 | 32 | 8 | 30 |
ecs.gn7i-8x.32xlarge | 128 | 512 | NVIDIA A10 * 8 | 24 GB * 8 | 64 | 24,000,000 | 32 | 16 | 30 |
ecs.gn7i-8x.16xlarge | 64 | 256 | NVIDIA A10 * 8 | 24 GB * 8 | 32 | 12,000,000 | 32 | 8 | 30 |
You can go to the Instance Types Available for Each Region page to view the instance types available in each region.
You can change the following instance types only to ecs.gn7i-c8g1.2xlarge or ecs.gn7i-c16g1.4xlarge: ecs.gn7i-2x.8xlarge, ecs.gn7i-4x.8xlarge, ecs.gn7i-4x.16xlarge, ecs.gn7i-8x.32xlarge, and ecs.gn7i-8x.16xlarge.
For more information about these specifications, see the "Instance type specifications" section in Overview of instance families. Packet forwarding rates vary significantly based on business scenarios. We recommend that you perform business stress tests on instances to choose appropriate instance types.
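The instance type change restriction stated above can be encoded as a small pre-check before you request a change. This is a sketch only: the function name is hypothetical, and it covers just the rule stated in this section. For source instance types that the rule does not mention, it returns `None` rather than guessing.

```python
# Source instance types that the restriction applies to.
RESTRICTED_SOURCES = {
    "ecs.gn7i-2x.8xlarge",
    "ecs.gn7i-4x.8xlarge",
    "ecs.gn7i-4x.16xlarge",
    "ecs.gn7i-8x.32xlarge",
    "ecs.gn7i-8x.16xlarge",
}

# The only targets that the restricted instance types can be changed to.
ALLOWED_TARGETS = {
    "ecs.gn7i-c8g1.2xlarge",
    "ecs.gn7i-c16g1.4xlarge",
}


def change_allowed(source, target):
    """Return True or False for restricted sources, None if the rule is silent."""
    if source not in RESTRICTED_SOURCES:
        return None
    return target in ALLOWED_TARGETS


if __name__ == "__main__":
    print(change_allowed("ecs.gn7i-4x.8xlarge", "ecs.gn7i-c8g1.2xlarge"))
    print(change_allowed("ecs.gn7i-4x.8xlarge", "ecs.gn7i-c32g1.8xlarge"))
```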
gn7s, GPU-accelerated compute-optimized instance family
Features:
This instance family uses the latest Intel Ice Lake processors and NVIDIA A30 GPUs that are based on NVIDIA Ampere architecture. You can select instance types that comprise appropriate mixes of GPUs and vCPUs to meet your business requirements in AI scenarios.
This instance family uses the third-generation SHENLONG architecture and doubles the average bandwidths of VPCs, networks, and disks compared with instance families of the previous generation.
Compute:
Uses NVIDIA A30 GPUs that have the following features:
Innovative NVIDIA Ampere architecture
Support for the multi-instance GPU (MIG) feature and acceleration features (based on second-generation Tensor cores) to provide diversified business support
Uses 2.9 GHz Intel® Xeon® Scalable (Ice Lake) processors that deliver an all-core turbo frequency of 3.5 GHz.
Provides significantly larger memory sizes than instance families of the previous generation.
Storage: Supports only enhanced SSDs (ESSDs) and ESSD AutoPL disks.
Network:
Supports IPv6.
Provides high network performance based on large computing capacity.
Supported scenarios: concurrent AI inference tasks that require high-performance CPUs, memory, and GPUs, such as image recognition, speech recognition, and behavior identification.
Instance types
Instance type | vCPUs | Memory (GiB) | GPU | GPU memory | Network baseline bandwidth (Gbit/s) | Packet forwarding rate (pps) | IPv6 addresses per ENI | NIC queues | ENIs |
ecs.gn7s-c8g1.2xlarge | 8 | 60 | NVIDIA A30 * 1 | 24 GB * 1 | 16 | 6,000,000 | 1 | 12 | 8 |
ecs.gn7s-c16g1.4xlarge | 16 | 120 | NVIDIA A30 * 1 | 24 GB * 1 | 16 | 6,000,000 | 1 | 12 | 8 |
ecs.gn7s-c32g1.8xlarge | 32 | 250 | NVIDIA A30 * 1 | 24 GB * 1 | 16 | 6,000,000 | 1 | 12 | 8 |
ecs.gn7s-c32g1.16xlarge | 64 | 500 | NVIDIA A30 * 2 | 24 GB * 2 | 32 | 12,000,000 | 1 | 16 | 15 |
ecs.gn7s-c32g1.32xlarge | 128 | 1,000 | NVIDIA A30 * 4 | 24 GB * 4 | 64 | 24,000,000 | 1 | 32 | 15 |
ecs.gn7s-c48g1.12xlarge | 48 | 380 | NVIDIA A30 * 1 | 24 GB * 1 | 16 | 6,000,000 | 1 | 12 | 8 |
ecs.gn7s-c56g1.14xlarge | 56 | 440 | NVIDIA A30 * 1 | 24 GB * 1 | 16 | 6,000,000 | 1 | 12 | 8 |
You can go to the Instance Types Available for Each Region page to view the instance types available in each region.
For more information about these specifications, see the "Instance type specifications" section in Overview of instance families. Packet forwarding rates vary significantly based on business scenarios. We recommend that you perform business stress tests on instances to choose appropriate instance types.
gn7, GPU-accelerated compute-optimized instance family
Features:
Storage:
Is an instance family in which all instances are I/O optimized.
Supports only ESSDs and ESSD AutoPL disks.
Network:
Supports IPv6.
Provides high network performance based on large computing capacity.
Supported scenarios:
Deep learning applications such as training applications of AI algorithms used in image classification, autonomous vehicles, and speech recognition
Scientific computing applications that require robust GPU computing capabilities such as computational fluid dynamics, computational finance, molecular dynamics, and environmental analytics
Instance types
Instance type | vCPUs | Memory (GiB) | GPU memory | Network baseline bandwidth (Gbit/s) | Packet forwarding rate (pps) | NIC queues | ENIs |
ecs.gn7-c12g1.3xlarge | 12 | 94 | 40GB * 1 | 4 | 2,500,000 | 4 | 8 |
ecs.gn7-c13g1.13xlarge | 52 | 378 | 40GB * 4 | 16 | 9,000,000 | 16 | 8 |
ecs.gn7-c13g1.26xlarge | 104 | 756 | 40GB * 8 | 30 | 18,000,000 | 16 | 15 |
You can go to the Instance Types Available for Each Region page to view the instance types available in each region.
For more information about these specifications, see the "Instance type specifications" section in Overview of instance families. Packet forwarding rates vary significantly based on business scenarios. We recommend that you perform business stress tests on instances to choose appropriate instance types.
After you create or restart a gn7 instance in the ECS console, the MIG feature of the instance is automatically disabled. For more information about MIG, see NVIDIA Multi-Instance GPU User Guide.
The following table describes whether the MIG feature is supported by the instance types in the gn7 instance family.
Instance type | MIG | Description |
ecs.gn7-c12g1.3xlarge | Yes | Single-GPU instances support the MIG feature. |
ecs.gn7-c13g1.13xlarge | No | For security purposes, multi-GPU instances do not support the MIG feature. |
ecs.gn7-c13g1.26xlarge | No | For security purposes, multi-GPU instances do not support the MIG feature. |
gn6i, GPU-accelerated compute-optimized instance family
Features:
Compute:
Uses NVIDIA T4 GPUs that have the following features:
Innovative NVIDIA Turing architecture
16 GB memory (320 GB/s bandwidth) per GPU
2,560 CUDA cores per GPU
Up to 320 Turing Tensor cores per GPU
Mixed-precision Tensor cores that support 65 FP16 TFLOPS, 130 INT8 TOPS, and 260 INT4 TOPS
Offers a CPU-to-memory ratio of 1:4.
Uses 2.5 GHz Intel® Xeon® Platinum 8163 (Skylake) processors.
Storage:
Is an instance family in which all instances are I/O optimized.
Supports ESSDs, ESSD AutoPL disks, standard SSDs, and ultra disks.
Network:
Supports IPv6.
Provides high network performance based on large computing capacity.
Supported scenarios:
AI (deep learning and machine learning) inference for computer vision, speech recognition, speech synthesis, natural language processing (NLP), machine translation, and recommendation systems
Real-time rendering for cloud gaming
Real-time rendering for AR and VR applications
Graphics workstations or graphics-heavy computing
GPU-accelerated databases
High-performance computing
Instance types
Instance type | vCPUs | Memory (GiB) | GPU | GPU memory | Network baseline bandwidth (Gbit/s) | Packet forwarding rate (pps) | Disk baseline IOPS | NIC queues | ENIs | Private IPv4 addresses per ENI |
ecs.gn6i-c4g1.xlarge | 4 | 15 | NVIDIA T4 * 1 | 16 GB * 1 | 4 | 500,000 | None | 2 | 2 | 10 |
ecs.gn6i-c8g1.2xlarge | 8 | 31 | NVIDIA T4 * 1 | 16 GB * 1 | 5 | 800,000 | None | 2 | 2 | 10 |
ecs.gn6i-c16g1.4xlarge | 16 | 62 | NVIDIA T4 * 1 | 16 GB * 1 | 6 | 1,000,000 | None | 4 | 3 | 10 |
ecs.gn6i-c24g1.6xlarge | 24 | 93 | NVIDIA T4 * 1 | 16 GB * 1 | 7.5 | 1,200,000 | None | 6 | 4 | 10 |
ecs.gn6i-c40g1.10xlarge | 40 | 155 | NVIDIA T4 * 1 | 16 GB * 1 | 10 | 1,600,000 | None | 16 | 10 | 10 |
ecs.gn6i-c24g1.12xlarge | 48 | 186 | NVIDIA T4 * 2 | 16 GB * 2 | 15 | 2,400,000 | None | 12 | 6 | 10 |
ecs.gn6i-c24g1.24xlarge | 96 | 372 | NVIDIA T4 * 4 | 16 GB * 4 | 30 | 4,800,000 | 250,000 | 24 | 8 | 10 |
You can go to the Instance Types Available for Each Region page to view the instance types available in each region.
For more information about these specifications, see the "Instance type specifications" section in Overview of instance families. Packet forwarding rates vary significantly based on business scenarios. We recommend that you perform business stress tests on instances to choose appropriate instance types.
gn6e, GPU-accelerated compute-optimized instance family
Features:
Compute:
Uses NVIDIA V100 GPUs (SXM2-based), each of which has 32 GB of GPU memory and supports NVLink. The GPUs have the following features:
Innovative NVIDIA Volta architecture
32 GB HBM2 memory (900 GB/s bandwidth) per GPU
5,120 CUDA cores per GPU
640 Tensor cores per GPU
Support for up to six NVLink bidirectional connections, each of which provides a bandwidth of 25 GB/s in each direction for a total bandwidth of 300 GB/s (6 × 25 × 2 = 300)
Offers a CPU-to-memory ratio of 1:8.
Uses 2.5 GHz Intel® Xeon® Platinum 8163 (Skylake) processors.
Storage:
Is an instance family in which all instances are I/O optimized.
Supports ESSDs, ESSD AutoPL disks, standard SSDs, and ultra disks.
Network:
Supports IPv6.
Provides high network performance based on large computing capacity.
Supported scenarios:
Deep learning applications such as the training and inference applications of AI algorithms used in image classification, autonomous driving, and speech recognition
Scientific computing applications, such as computational fluid dynamics, computational finance, molecular dynamics, and environmental analytics
Instance types
Instance type | vCPUs | Memory (GiB) | GPU | GPU memory | Network baseline bandwidth (Gbit/s) | Packet forwarding rate (pps) | NIC queues | ENIs | Private IPv4 addresses per ENI |
ecs.gn6e-c12g1.3xlarge | 12 | 92 | NVIDIA V100 * 1 | 32 GB * 1 | 5 | 800,000 | 8 | 6 | 10 |
ecs.gn6e-c12g1.12xlarge | 48 | 368 | NVIDIA V100 * 4 | 32 GB * 4 | 16 | 2,400,000 | 8 | 8 | 20 |
ecs.gn6e-c12g1.24xlarge | 96 | 736 | NVIDIA V100 * 8 | 32 GB * 8 | 32 | 4,800,000 | 16 | 8 | 20 |
You can go to the Instance Types Available for Each Region page to view the instance types available in each region.
For more information about these specifications, see the "Instance type specifications" section in Overview of instance families. Packet forwarding rates vary significantly based on business scenarios. We recommend that you perform business stress tests on instances to choose appropriate instance types.
gn6v, GPU-accelerated compute-optimized instance family
Features:
Compute:
Uses NVIDIA V100 GPUs (SXM2-based) that have the following features:
Innovative NVIDIA Volta architecture
16 GB HBM2 memory (900 GB/s bandwidth) per GPU
5,120 CUDA cores per GPU
640 Tensor cores per GPU
Support for up to six NVLink bidirectional connections, each of which provides a bandwidth of 25 GB/s in each direction for a total bandwidth of 300 GB/s (6 × 25 × 2 = 300)
Offers a CPU-to-memory ratio of 1:4.
Uses 2.5 GHz Intel® Xeon® Platinum 8163 (Skylake) processors.
Storage:
Is an instance family in which all instances are I/O optimized.
Supports ESSDs, ESSD AutoPL disks, standard SSDs, and ultra disks.
Network:
Supports IPv6.
Provides high network performance based on large computing capacity.
Supported scenarios:
Deep learning applications such as the training and inference applications of AI algorithms used in image classification, autonomous driving, and speech recognition
Scientific computing applications, such as computational fluid dynamics, computational finance, molecular dynamics, and environmental analytics
Instance types
Instance type | vCPUs | Memory (GiB) | GPU | GPU memory | Network baseline bandwidth (Gbit/s) | Packet forwarding rate (pps) | Disk baseline IOPS | NIC queues | ENIs | Private IPv4 addresses per ENI |
ecs.gn6v-c8g1.2xlarge | 8 | 32 | NVIDIA V100 * 1 | 16 GB * 1 | 2.5 | 800,000 | None | 4 | 4 | 10 |
ecs.gn6v-c8g1.8xlarge | 32 | 128 | NVIDIA V100 * 4 | 16 GB * 4 | 10 | 2,000,000 | None | 8 | 8 | 20 |
ecs.gn6v-c8g1.16xlarge | 64 | 256 | NVIDIA V100 * 8 | 16 GB * 8 | 20 | 2,500,000 | None | 16 | 8 | 20 |
ecs.gn6v-c10g1.20xlarge | 82 | 336 | NVIDIA V100 * 8 | 16 GB * 8 | 32 | 4,500,000 | 250,000 | 16 | 8 | 20 |
You can go to the Instance Types Available for Each Region page to view the instance types available in each region.
For more information about the specifications of the instance types, see the Instance type specifications section of the "Overview of instance families" topic.
ebmgn8is, GPU-accelerated compute-optimized ECS Bare Metal Instance family
This instance family is available only in specific regions, including regions outside China. To use the instance family, contact Alibaba Cloud sales personnel.
Features:
The ebmgn8is instance family is an 8th-generation GPU-accelerated compute-optimized ECS Bare Metal Instance family that Alibaba Cloud provides in response to the recent growth of generative AI workloads. Each instance of this instance family is equipped with eight GPUs.
Benefits and positioning:
Graphics processing: This instance family uses high-frequency 5th-generation Intel Xeon Scalable processors to deliver sufficient CPU computing power in 3D modeling scenarios and achieve smooth graphics rendering and design.
Inference tasks: This instance family uses innovative GPUs, each with 48 GB of memory, which accelerate inference tasks and support the FP8 floating-point format. You can use this instance family together with Container Service for Kubernetes (ACK) to support the inference of various AI-generated content (AIGC) models and accommodate inference tasks for 70B or larger large language models (LLMs).
Training tasks: This instance family provides cost-effective computing capabilities and delivers twice the single-precision floating-point (FP32) computing performance of the 7th-generation inference instances. Instances of this instance family are suitable for training FP32-based CV models and other small and medium-sized models.
This instance family uses the latest Cloud Infrastructure Processing Unit (CIPU) 1.0 processors.
Decouples computing capabilities from storage capabilities, allowing you to flexibly select storage resources based on your business requirements, and increases inter-instance bandwidth to 160 Gbit/s for faster data transmission and processing compared with previous-generation instance families.
Uses the bare metal capabilities provided by CIPU processors to support Peripheral Component Interconnect Express (PCIe) peer-to-peer (P2P) communication between GPU-accelerated instances.
Compute:
Uses innovative GPUs that have the following features:
Support for acceleration features such as vGPU, RTX technology, and TensorRT inference engine
Support for PCIe Switch interconnect, which achieves a 36% increase in NVIDIA Collective Communications Library (NCCL) performance compared with the CPU direct connection scheme and helps improve inference performance by up to 9% when you run LLM inference tasks on multiple GPUs in parallel
Support for eight GPUs per instance with 48 GB of memory per GPU to support LLM inference tasks with 70 billion or more parameters on a single instance
Uses 3.4 GHz Intel® Xeon® Scalable (SPR) processors that deliver an all-core turbo frequency of 3.9 GHz.
Storage:
Is an instance family in which all instances are I/O optimized.
Supports ESSDs, ESSD AutoPL disks, and elastic ephemeral disks.
Network:
Supports IPv4 and IPv6.
Provides ultra-high network performance with a packet forwarding rate of 30,000,000 pps.
Supports ERIs to allow inter-instance RDMA-based communication in VPCs and provides up to 160 Gbit/s of bandwidth per instance, which is suitable for training tasks based on CV models and traditional models.
Note: For information about how to use ERIs, see Configure eRDMA on an enterprise-level instance.
Supported scenarios:
Production and rendering of special effects for animation, film, and television. These workloads rely on workstation-level graphics processing capabilities, which are available when Alibaba Cloud Marketplace GRID images are used, the GRID driver is installed, and OpenGL and Direct3D graphics capabilities are enabled.
Scenarios in which the management services provided by ACK for containerized applications are used to support AI-generated graphic content and LLM inference tasks with up to 130 billion parameters
Other general-purpose AI recognition, image recognition, and speech recognition scenarios
Instance types
Instance type | vCPUs | Memory (GiB) | GPU memory | Network baseline bandwidth (Gbit/s) | Packet forwarding rate (pps) | Private IPv4 addresses per ENI | IPv6 addresses per ENI | NIC queues (Primary ENI/Secondary ENI) | ENIs | Maximum data disks | Maximum disk bandwidth (Gbit/s) |
ecs.ebmgn8is.32xlarge | 128 | 1,024 | 48GB*8 | 160 (80 × 2) | 30,000,000 | 30 | 30 | 64/16 | 32 | 31 | 6 |
You can go to the Instance Types Available for Each Region page to view the instance types available in each region.
The boot mode of the images that are used by instances of this instance family must be UEFI. If you want to use custom images on the instances, make sure that the images support the UEFI boot mode and the boot mode of the images is set to UEFI. For information about how to set the boot mode of a custom image, see Set the boot mode of custom images to the UEFI mode by calling API operations.
For more information about these specifications, see the "Instance type specifications" section in Overview of instance families. Packet forwarding rates vary significantly based on business scenarios. We recommend that you perform business stress tests on instances to choose appropriate instance types.
CPU monitoring information for ECS bare metal instances is not available unless the CloudMonitor agent is installed. To obtain the CPU monitoring information about an ECS bare metal instance, install the CloudMonitor agent on the instance. For more information, see Install and uninstall the CloudMonitor agent.
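Regarding the UEFI boot mode requirement described above, a quick way to confirm how a running Linux instance actually booted is to check for the `/sys/firmware/efi` directory, which the Linux kernel exposes only when the system was started through UEFI. The following sketch does only that. The function name is hypothetical, and the check does not read or change the boot mode attribute of an image.

```python
import os


def booted_in_uefi(efi_dir="/sys/firmware/efi"):
    """Return True if the running Linux system exposes the EFI firmware directory.

    The efi_dir parameter exists only so the check can be pointed at another
    path in tests; on a real instance, keep the default.
    """
    return os.path.isdir(efi_dir)


if __name__ == "__main__":
    mode = "UEFI" if booted_in_uefi() else "legacy BIOS (or a non-Linux system)"
    print(f"This system appears to have booted in {mode} mode.")
```

Running the check on an instance created from a candidate custom image is a simple way to catch a mismatch before you use that image on an ebmgn8is instance.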
ebmgn7e, GPU-accelerated compute-optimized ECS Bare Metal Instance family
Features:
This instance family uses the SHENLONG architecture to provide flexible and powerful software-defined compute.
Compute:
Uses 2.9 GHz Intel® Xeon® Scalable (Ice Lake) processors that deliver an all-core turbo frequency of 3.5 GHz and supports PCIe 4.0 interfaces.
Storage:
Is an instance family in which all instances are I/O optimized.
Supports only ESSDs and ESSD AutoPL disks.
Network:
Supports IPv6.
Provides ultra-high network performance with a packet forwarding rate of 24,000,000 pps.
Supported scenarios:
Deep learning training and development
High-performance computing (HPC) and simulations
Important: When you use AI training services that feature a high communication load, such as transformer models, you must enable NVLink for GPU-to-GPU communication. Otherwise, data may be damaged due to unpredictable failures that are caused by large-scale data transmission over Peripheral Component Interconnect Express (PCIe) links. If you do not understand the topology of the communication links that are used for AI training services, submit a ticket to obtain technical support.
Instance types
Instance type | vCPUs | Memory (GiB) | GPU memory | Network baseline bandwidth (Gbit/s) | Packet forwarding rate (pps) | NIC queues (Primary ENI/Secondary ENI) | ENIs |
ecs.ebmgn7e.32xlarge | 128 | 1,024 | 80GB * 8 | 64 | 24,000,000 | 32/12 | 32 |
You can go to the Instance Types Available for Each Region page to view the instance types available in each region.
For more information about these specifications, see the "Instance type specifications" section in Overview of instance families. Packet forwarding rates vary significantly based on business scenarios. We recommend that you perform business stress tests on instances to choose appropriate instance types.
CPU monitoring information for ECS bare metal instances is not available unless the CloudMonitor agent is installed. To obtain the CPU monitoring information about an ECS bare metal instance, install the CloudMonitor agent on the instance. For more information, see Install and uninstall the CloudMonitor agent.
After you start an ebmgn7e instance, check the status of the MIG feature and enable or disable the feature as needed. For more information about MIG, see NVIDIA Multi-Instance GPU User Guide.
The following table describes whether the MIG feature is supported by the instance types in the ebmgn7e instance family.
Instance type | MIG | Description |
ecs.ebmgn7e.32xlarge | Yes | The MIG feature is supported by ebmgn7e instances. |
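To check the MIG status after startup, one option is to query `nvidia-smi` for the `mig.mode.current` field. The following sketch wraps that query and parses its CSV output. The function names are hypothetical, and the example assumes that the NVIDIA driver and `nvidia-smi` are already installed on the instance.

```python
import subprocess

# nvidia-smi query that prints "<index>, <MIG mode>" for each GPU.
MIG_QUERY = [
    "nvidia-smi",
    "--query-gpu=index,mig.mode.current",
    "--format=csv,noheader",
]


def parse_mig_modes(csv_text):
    """Map each GPU index to its MIG mode string, for example "Enabled"."""
    modes = {}
    for line in csv_text.splitlines():
        if not line.strip():
            continue
        index, mode = (field.strip() for field in line.split(",", 1))
        modes[int(index)] = mode
    return modes


def current_mig_modes():
    """Run nvidia-smi on the instance and return the parsed MIG modes."""
    result = subprocess.run(MIG_QUERY, capture_output=True, text=True, check=True)
    return parse_mig_modes(result.stdout)


if __name__ == "__main__":
    # Parsing a hand-written sample, so this runs without a GPU.
    print(parse_mig_modes("0, Enabled\n1, Disabled\n"))
```

To change the mode of a GPU, the NVIDIA Multi-Instance GPU User Guide describes `nvidia-smi -i <GPU index> -mig 1` (enable) and `-mig 0` (disable). These commands require administrator privileges and may require a GPU reset before the new mode takes effect.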
ebmgn7i, GPU-accelerated compute-optimized ECS Bare Metal Instance family
Features:
This instance family uses the SHENLONG architecture to provide flexible and powerful software-defined compute.
Compute:
Uses NVIDIA A10 GPUs that have the following features:
Innovative NVIDIA Ampere architecture
Support for acceleration features such as vGPU, RTX technology, and TensorRT inference engine
Uses 2.9 GHz Intel® Xeon® Scalable (Ice Lake) processors that deliver an all-core turbo frequency of 3.5 GHz.
Storage:
Is an instance family in which all instances are I/O optimized.
Supports only ESSDs and ESSD AutoPL disks.
Network:
Supports IPv6.
Provides ultra-high network performance with a packet forwarding rate of 24,000,000 pps.
Supported scenarios:
Concurrent AI inference tasks that require high-performance CPUs, memory, and GPUs, such as image recognition, speech recognition, and behavior identification
Compute-intensive graphics processing tasks that require high-performance 3D graphics virtualization capabilities, such as remote graphic design and cloud gaming
Scenarios that require high network bandwidth and disk bandwidth, such as the creation of high-performance render farms
Small-scale deep learning and training applications that require high network bandwidth
Instance types
Instance type | vCPUs | Memory (GiB) | GPU | GPU memory | Network baseline bandwidth (Gbit/s) | Packet forwarding rate (pps) | NIC queues | ENIs |
ecs.ebmgn7i.32xlarge | 128 | 768 | NVIDIA A10 * 4 | 24GB * 4 | 64 | 24,000,000 | 32 | 32 |
You can go to the Instance Types Available for Each Region page to view the instance types available in each region.
For information about the specifications of the instance types, see the Instance type specifications section of the "Overview of instance families" topic.
CPU monitoring information about an ECS bare metal instance cannot be obtained unless the CloudMonitor agent is installed on the instance. For more information, see Install and uninstall the CloudMonitor agent.
ebmgn7, GPU-accelerated compute-optimized ECS Bare Metal Instance family
Features:
This instance family uses the SHENLONG architecture to provide flexible and powerful software-defined compute.
Compute:
Uses 2.5 GHz Intel® Xeon® Platinum 8269CY (Cascade Lake) processors.
Storage:
Is an instance family in which all instances are I/O optimized.
Supports only ESSDs and ESSD AutoPL disks.
Network:
Supports IPv6.
Provides high network performance based on large computing capacity.
Supported scenarios:
Deep learning applications, such as training applications of AI algorithms used in image classification, autonomous vehicles, and speech recognition
Scientific computing applications that require robust GPU computing capabilities such as computational fluid dynamics, computational finance, molecular dynamics, and environmental analytics
Instance types
Instance type | vCPUs | Memory (GiB) | Network baseline bandwidth (Gbit/s) | Packet forwarding rate (pps) | NIC queues | ENIs | Private IPv4 addresses per ENI |
ecs.ebmgn7.26xlarge | 104 | 768 | 30 | 18,000,000 | 16 | 15 | 10 |
You can go to the Instance Types Available for Each Region page to view the instance types available in each region.
For information about the specifications of the instance types, see the Instance type specifications section of the "Overview of instance families" topic.
CPU monitoring information about an ECS bare metal instance cannot be obtained unless the CloudMonitor agent is installed on the instance. For more information, see Install and uninstall the CloudMonitor agent.
After you start an ebmgn7 instance, you must manually check the status of the MIG feature and enable or disable the feature as needed. For more information about MIG, see the NVIDIA Multi-Instance GPU User Guide.
The following table describes whether the MIG feature is supported by the instance types in the ebmgn7 instance family.
Instance type | MIG | Description |
ecs.ebmgn7.26xlarge | Yes | The MIG feature is supported by ebmgn7 instances. |
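Enabling or disabling MIG is done per GPU with the `nvidia-smi -i <GPU index> -mig 1` and `-mig 0` commands. The sketch below wraps those commands; the function names and the `dry_run` parameter are hypothetical and chosen for illustration. It assumes that the NVIDIA driver is installed and that the caller has root privileges. Depending on the system, a GPU reset or reboot may be needed before the change takes effect.

```python
import subprocess


def mig_mode_command(gpu_index, enable):
    """Build the nvidia-smi argument list that turns MIG mode on or off for one GPU."""
    if gpu_index < 0:
        raise ValueError("gpu_index must be non-negative")
    return ["nvidia-smi", "-i", str(gpu_index), "-mig", "1" if enable else "0"]


def set_mig_mode(gpu_index, enable, dry_run=True):
    """Return the command line; run it only when dry_run is False."""
    cmd = mig_mode_command(gpu_index, enable)
    if not dry_run:
        # Raises CalledProcessError if nvidia-smi exits with a non-zero status.
        subprocess.run(cmd, check=True)
    return " ".join(cmd)
```

For example, `set_mig_mode(0, True)` only returns the string `nvidia-smi -i 0 -mig 1`, which lets you review the command before you run it with `dry_run=False`.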
ebmgn6ia, GPU-accelerated compute-optimized ECS Bare Metal Instance family
Features:
This instance family uses the third-generation SHENLONG architecture and fast path acceleration on chips to provide predictable and consistent ultra-high computing, storage, and network performance.
This instance family uses NVIDIA T4 GPUs to offer GPU acceleration capabilities for graphics and AI applications and adopts container technology to start up to 60 virtual Android devices and provide hardware-accelerated video transcoding.
Compute:
Offers a CPU-to-memory ratio of 1:3.
Uses 2.8 GHz Ampere® Altra® Arm-based processors that deliver a turbo frequency of 3.0 GHz and provides high performance and high compatibility with applications for Android servers.
Storage:
Is an instance family in which all instances are I/O optimized.
Supports only ESSDs and ESSD AutoPL disks.
Network:
Supports IPv6.
Supported scenarios:
Remote application services based on Android, such as always-on cloud-based services, cloud-based mobile games, cloud-based mobile phones, and Android service crawlers.
Instance types
Instance type | vCPUs | Memory (GiB) | GPU | GPU memory | Network baseline bandwidth (Gbit/s) | Packet forwarding rate (pps) | NIC queues | ENIs | Private IPv4 addresses per ENI |
ecs.ebmgn6ia.20xlarge | 80 | 256 | NVIDIA T4 * 2 | 16 GB * 2 | 32 | 24,000,000 | 32 | 15 | 10 |
You can go to the Instance Types Available for Each Region page to view the instance types available in each region.
For information about the specifications of the instance types, see the Instance type specifications section of the "Overview of instance families" topic.
Ampere® Altra® processors have specific requirements on operating system kernels. Instances of the preceding instance type can use Alibaba Cloud Linux 3 images and CentOS 8.4 or later images. We recommend that you use Alibaba Cloud Linux 3 images on the instances. To use another operating system distribution, patch the kernel on an instance that runs that distribution, create a custom image from the instance, and then use the custom image to create instances of this instance type. For information about kernel patches, see the Ampere Altra (TM) Linux Kernel Porting Guide.
CPU monitoring information about an ECS bare metal instance cannot be obtained unless the CloudMonitor agent is installed on the instance. For more information, see Install and uninstall the CloudMonitor agent.
ebmgn6e, GPU-accelerated compute-optimized ECS Bare Metal Instance family
Features:
This instance family uses the SHENLONG architecture to provide flexible and powerful software-defined compute.
This instance family uses NVIDIA V100 GPUs that each have 32 GB of GPU memory and support NVLink.
This instance family uses NVIDIA V100 GPUs (SXM2-based) that have the following features:
Innovative NVIDIA Volta architecture.
32 GB of HBM2 memory (900 GB/s bandwidth) per GPU.
5,120 CUDA cores per GPU.
640 Tensor cores per GPU.
Support for up to six NVLink connections per GPU. Each NVLink connection provides a bandwidth of 25 GB/s in each direction for a total bandwidth of 300 GB/s (6 × 25 × 2 = 300).
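The NVLink total in the last item is the product of the number of links, the per-direction bandwidth of each link, and the two directions. The following snippet restates that arithmetic; the constant and function names are illustrative only.

```python
NVLINK_LINKS_PER_GPU = 6        # up to six NVLink connections per V100 GPU
GB_S_PER_LINK_PER_DIRECTION = 25  # bandwidth of one link in one direction, in GB/s
DIRECTIONS = 2                  # each link is bidirectional


def nvlink_total_bandwidth_gb_s(links=NVLINK_LINKS_PER_GPU,
                                per_direction=GB_S_PER_LINK_PER_DIRECTION,
                                directions=DIRECTIONS):
    """Return the aggregate NVLink bandwidth of one GPU in GB/s."""
    return links * per_direction * directions


print(nvlink_total_bandwidth_gb_s())  # 300
```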
Compute:
Offers a CPU-to-memory ratio of 1:8.
Uses 2.5 GHz Intel® Xeon® Platinum 8163 (Skylake) processors.
Storage:
Is an instance family in which all instances are I/O optimized.
Supports ESSDs, ESSD AutoPL disks, standard SSDs, and ultra disks.
Network:
Supports IPv6.
Provides high network performance based on large computing capacity.
Supported scenarios:
Deep learning applications, such as training and inference applications of AI algorithms used in image classification, autonomous vehicles, and speech recognition
Scientific computing applications, such as computational fluid dynamics, computational finance, molecular dynamics, and environmental analytics
Instance types
Instance type | vCPUs | Memory (GiB) | GPU | GPU memory | Network baseline bandwidth (Gbit/s) | Packet forwarding rate (pps) | NIC queues | ENIs | Private IPv4 addresses per ENI |
ecs.ebmgn6e.24xlarge | 96 | 768 | NVIDIA V100 * 8 | 32 GB * 8 | 32 | 4,800,000 | 16 | 15 | 10 |
You can go to the Instance Types Available for Each Region page to view the instance types available in each region.
For information about the specifications of the instance types, see the Instance type specifications section of the "Overview of instance families" topic.
CPU monitoring information about an ECS bare metal instance cannot be obtained unless the CloudMonitor agent is installed on the instance. For more information, see Install and uninstall the CloudMonitor agent.
ebmgn6v, GPU-accelerated compute-optimized ECS Bare Metal Instance family
Features:
This instance family uses the SHENLONG architecture to provide flexible and powerful software-defined compute.
This instance family uses NVIDIA V100 GPUs.
This instance family uses NVIDIA V100 GPUs (SXM2-based) that have the following features:
Innovative NVIDIA Volta architecture.
16 GB of HBM2 memory (900 GB/s bandwidth) per GPU.
5,120 CUDA cores per GPU.
640 Tensor cores per GPU.
Support for up to six NVLink connections per GPU. Each NVLink connection provides a bandwidth of 25 GB/s in each direction for a total bandwidth of 300 GB/s (6 × 25 × 2 = 300).
Compute:
Offers a CPU-to-memory ratio of 1:4.
Uses 2.5 GHz Intel® Xeon® Platinum 8163 (Skylake) processors.
Storage:
Is an instance family in which all instances are I/O optimized.
Supports ESSDs, ESSD AutoPL disks, standard SSDs, and ultra disks.
Network:
Supports IPv6.
Provides high network performance based on large computing capacity.
Supported scenarios:
Deep learning applications, such as training and inference applications of AI algorithms used in image classification, autonomous vehicles, and speech recognition
Scientific computing applications, such as computational fluid dynamics, computational finance, molecular dynamics, and environmental analytics
Instance types
Instance type | vCPUs | Memory (GiB) | GPU | GPU memory | Network baseline bandwidth (Gbit/s) | Packet forwarding rate (pps) | NIC queues | ENIs | Private IPv4 addresses per ENI |
ecs.ebmgn6v.24xlarge | 96 | 384 | NVIDIA V100 * 8 | 16 GB * 8 | 30 | 4,500,000 | 8 | 32 | 10 |
You can go to the Instance Types Available for Each Region page to view the instance types available in each region.
For information about the specifications of the instance types, see the Instance type specifications section of the "Overview of instance families" topic.
CPU monitoring information about an ECS bare metal instance cannot be obtained unless the CloudMonitor agent is installed on the instance. For more information, see Install and uninstall the CloudMonitor agent.
ebmgn6i, GPU-accelerated compute-optimized ECS Bare Metal Instance family
Features:
This instance family uses the SHENLONG architecture to provide flexible and powerful software-defined compute.
This instance family uses NVIDIA T4 GPUs that have the following features:
Innovative NVIDIA Turing architecture
16 GB of memory (320 GB/s bandwidth) per GPU
2,560 CUDA cores per GPU
Up to 320 Turing Tensor cores per GPU
Mixed-precision Tensor cores that support 65 FP16 TFLOPS, 130 INT8 TOPS, and 260 INT4 TOPS
Compute:
Offers a CPU-to-memory ratio of 1:4.
Uses 2.5 GHz Intel® Xeon® Platinum 8163 (Skylake) processors.
Storage:
Is an instance family in which all instances are I/O optimized.
Supports ESSDs, ESSD AutoPL disks, standard SSDs, and ultra disks.
Network:
Supports IPv6.
Provides high network performance based on large computing capacity.
Supported scenarios:
AI (deep learning and machine learning) inference for computer vision, speech recognition, speech synthesis, natural language processing (NLP), machine translation, and recommendation systems
Real-time rendering for cloud games
Real-time rendering for AR and VR applications
Graphics workstations or graphics-heavy computing
GPU-accelerated databases
High-performance computing
Instance types
Instance type | vCPUs | Memory (GiB) | GPU | GPU memory | Network baseline bandwidth (Gbit/s) | Packet forwarding rate (pps) | NIC queues | ENIs | Private IPv4 addresses per ENI |
ecs.ebmgn6i.24xlarge | 96 | 384 | NVIDIA T4 * 4 | 16 GB * 4 | 30 | 4,500,000 | 8 | 32 | 10 |
You can go to the Instance Types Available for Each Region page to view the instance types available in each region.
For information about the specifications of the instance types, see the Instance type specifications section of the "Overview of instance families" topic.
CPU monitoring information about an ECS bare metal instance cannot be obtained unless the CloudMonitor agent is installed on the instance. For more information, see Install and uninstall the CloudMonitor agent.
gn5i, GPU-accelerated compute-optimized instance family
Features:
Compute:
Uses NVIDIA P4 GPUs.
Offers a CPU-to-memory ratio of 1:4.
Uses 2.5 GHz Intel® Xeon® E5-2682 v4 (Broadwell) processors.
Storage:
Is an instance family in which all instances are I/O optimized.
Supports only standard SSDs and ultra disks.
Network:
Supports IPv6.
Provides high network performance based on large computing capacity.
Supported scenarios:
Deep learning inference
Server-side GPU compute workloads such as multi-media encoding and decoding
Instance types
Instance type | vCPUs | Memory (GiB) | GPU | GPU memory | Network baseline bandwidth (Gbit/s) | Packet forwarding rate (pps) | NIC queues | ENIs | Private IPv4 addresses per ENI |
ecs.gn5i-c2g1.large | 2 | 8 | NVIDIA P4 * 1 | 8 GB * 1 | 1 | 100,000 | 2 | 2 | 6 |
ecs.gn5i-c4g1.xlarge | 4 | 16 | NVIDIA P4 * 1 | 8 GB * 1 | 1.5 | 200,000 | 2 | 3 | 10 |
ecs.gn5i-c8g1.2xlarge | 8 | 32 | NVIDIA P4 * 1 | 8 GB * 1 | 2 | 400,000 | 4 | 4 | 10 |
ecs.gn5i-c16g1.4xlarge | 16 | 64 | NVIDIA P4 * 1 | 8 GB * 1 | 3 | 800,000 | 4 | 8 | 20 |
ecs.gn5i-c16g1.8xlarge | 32 | 128 | NVIDIA P4 * 2 | 8 GB * 2 | 6 | 1,200,000 | 8 | 8 | 20 |
ecs.gn5i-c28g1.14xlarge | 56 | 224 | NVIDIA P4 * 2 | 8 GB * 2 | 10 | 2,000,000 | 14 | 8 | 20 |
You can go to the Instance Types Available for Each Region page to view the instance types available in each region.
For more information about these specifications, see the "Instance type specifications" section in Overview of instance families. Packet forwarding rates vary significantly based on business scenarios. We recommend that you perform business stress tests on instances to choose appropriate instance types.
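As a starting point before stress testing, the gn5i table above can be filtered by minimum requirements to produce a shortlist. The sketch below copies the vCPU, memory, GPU count, bandwidth, and packet forwarding rate columns from that table; the `InstanceType` record and the `smallest_fit` function are hypothetical helpers, not part of any Alibaba Cloud SDK.

```python
from typing import NamedTuple, Optional


class InstanceType(NamedTuple):
    name: str
    vcpus: int
    memory_gib: int
    gpus: int               # number of NVIDIA P4 GPUs
    bandwidth_gbit_s: float  # network baseline bandwidth
    pps: int                # packet forwarding rate


# Values copied from the gn5i instance type table.
GN5I = [
    InstanceType("ecs.gn5i-c2g1.large", 2, 8, 1, 1, 100_000),
    InstanceType("ecs.gn5i-c4g1.xlarge", 4, 16, 1, 1.5, 200_000),
    InstanceType("ecs.gn5i-c8g1.2xlarge", 8, 32, 1, 2, 400_000),
    InstanceType("ecs.gn5i-c16g1.4xlarge", 16, 64, 1, 3, 800_000),
    InstanceType("ecs.gn5i-c16g1.8xlarge", 32, 128, 2, 6, 1_200_000),
    InstanceType("ecs.gn5i-c28g1.14xlarge", 56, 224, 2, 10, 2_000_000),
]


def smallest_fit(min_vcpus=0, min_memory_gib=0, min_gpus=0,
                 min_pps=0) -> Optional[InstanceType]:
    """Return the gn5i type with the fewest vCPUs that meets every minimum."""
    candidates = [
        t for t in GN5I
        if t.vcpus >= min_vcpus
        and t.memory_gib >= min_memory_gib
        and t.gpus >= min_gpus
        and t.pps >= min_pps
    ]
    # default=None covers the case where no type satisfies the requirements.
    return min(candidates, key=lambda t: t.vcpus, default=None)
```

For example, `smallest_fit(min_vcpus=8, min_gpus=2)` selects `ecs.gn5i-c16g1.8xlarge`. The result is only a candidate: the listed packet forwarding rates are nominal values, so the choice still needs to be confirmed with a stress test of the actual workload.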