Pod-based capacity reservation provides a resource guarantee for elastic workloads. A GPU Pod capacity reservation does not need to be bound to a cluster. When you purchase one, you specify only the Pod specification, availability zone, and lock-in period. ACS guarantees that Pods of the corresponding specification start within minutes when resources are needed. GPU Pod capacity reservation ensures resource availability at a lower price than standard pay-as-you-go Pods.
Features
Resource guarantee: During the GPU Pod capacity reservation validity period, the system guarantees successful resource provisioning.
Cost reduction: After a Pod starts, you are charged at pay-as-you-go rates. After a Pod is terminated, you are charged at capacity reservation rates. You can flexibly configure Pod start and termination times based on traffic patterns.
Resource flexibility: Create multiple GPU Pod capacity reservations with different specifications to meet various workload requirements.
GPU Pod capacity reservation does not support BestEffort compute class Pods.
GPU Pod capacity reservation supports savings plans that match region and type attributes.
GPU Pod capacity reservation creation success depends on inventory availability.
Use cases
Periodic real-time workloads: Workloads that exhibit tidal patterns on a daily or weekly basis, where tasks must be executed and completed in real time. For example, real-time inference services.
Sporadic large-scale resource demands: Workloads with sudden real-time computing requirements that need rapid resource delivery and scaling to avoid business impact. For example, resource demands triggered by trending events in Internet services.
Usage and billing example
GPU Pod capacity reservation uses pay-as-you-go billing. During the capacity reservation validity period, fees include:
Pay-as-you-go fees for unused capacity reservations
Pay-as-you-go fees for running Pods
The following example demonstrates the usage flow and billing calculation for different phases when purchasing two GPU Pod capacity reservations and creating two pay-as-you-go Pods (Pod1 and Pod2).
Phase 1: Purchase and create capacity reservation
In the ACS console, choose Capacity Reservation > Create GPU Capacity Reservation, configure the capacity reservation parameters, and click Create Capacity Reservation.
Parameter | Description |
Capacity Reservation Name | User-defined capacity reservation name. |
Region | Region where resources are reserved. |
Zone | Availability zone where resources are reserved. |
Reservation Type | GPU card type. |
Resource Specification | Capacity reservation specification. Select only the number of GPU cards. The system automatically matches the highest CPU and memory specifications for that card count. |
Reservation Mode | Pod reservation (not modifiable). |
Billing Mode | Pay-as-you-go (not modifiable). |
Release Method | Default time to release the capacity reservation. |
Quantity | Number of GPU Pod capacity reservations for this specification. |
Fee calculation for Phase 1:
Phase | Fee | Description |
Phase 1 | None | Capacity reservation not yet created |
Phases 2-6: Capacity reservation validity period
During the validity period, you can create Pod instances at any time, provided they do not exceed the reserved configuration. The system guarantees successful creation and deducts the corresponding capacity reservation quota. A Pod deducts a reservation only if its GPU card type and card count match the reservation and its CPU and memory do not exceed the reserved configuration. If matching succeeds, the reservation quota is fully deducted, regardless of how much of the reserved CPU and memory the Pod uses.
For example, if you purchased 1 capacity reservation (specification: 1 card, 10 vCPU, 80 GB), creating a Pod with specification 1 card, 1 vCPU, 2 GB fully deducts this capacity reservation. When a Pod is terminated, the corresponding GPU Pod capacity reservation quota is restored.
Fee calculation for each phase:
Phase | Fee |
Phase 2 | 2 × Capacity reservation unit price × Phase 2 duration |
Phase 3 | 1 × Capacity reservation unit price × Phase 3 duration + Pod1 pay-as-you-go unit price × Phase 3 duration |
Phase 4 | Pod1 pay-as-you-go unit price × Phase 4 duration + Pod2 pay-as-you-go unit price × Phase 4 duration |
Phase 5 | 1 × Capacity reservation unit price × Phase 5 duration + Pod2 pay-as-you-go unit price × Phase 5 duration |
Phase 6 | 2 × Capacity reservation unit price × Phase 6 duration |
The capacity reservation unit price is the pay-as-you-go fee for unused capacity reservations. Pod1 and Pod2 pay-as-you-go unit prices are calculated based on the pay-as-you-go fees after Pod startup.
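The per-phase formulas above can be written as a small calculation. This is a minimal sketch, not an official billing tool: the function name `phase_fee`, the hourly unit prices, and the phase durations are hypothetical values chosen only to illustrate the arithmetic.

```python
def phase_fee(unused_reservations, running_pod_prices, reservation_unit_price, duration_hours):
    """Fee for one phase: unused reservations plus running Pods, both billed pay-as-you-go."""
    reservation_fee = unused_reservations * reservation_unit_price * duration_hours
    pod_fee = sum(price * duration_hours for price in running_pod_prices)
    return reservation_fee + pod_fee


# Hypothetical hourly prices (assumptions, not real ACS prices).
RESERVATION_PRICE = 2.0
POD1_PRICE = 10.0
POD2_PRICE = 10.0

# (phase, unused reservations, prices of running Pods, assumed duration in hours)
phases = [
    ("Phase 2", 2, [], 3),
    ("Phase 3", 1, [POD1_PRICE], 2),
    ("Phase 4", 0, [POD1_PRICE, POD2_PRICE], 4),
    ("Phase 5", 1, [POD2_PRICE], 2),
    ("Phase 6", 2, [], 3),
]

fees = {
    name: phase_fee(unused, pods, RESERVATION_PRICE, hours)
    for name, unused, pods, hours in phases
}
total = sum(fees.values())
```

With these assumed numbers, Phase 4 costs 80.0 (both reservations are deducted, so only the two Pods are billed) and the total across Phases 2 to 6 is 152.0.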
Phase 7: Capacity reservation expiration
After the capacity reservation expires, the system automatically releases the GPU Pod capacity reservation.
Specifications
After the capacity reservation specification upgrade, the following card types and specifications are supported:
Card Type | GPU | vCPU | Memory (GiB) |
L20 (GN8IS) | 1 (48 GB VRAM) | 16 | 128 |
L20 (GN8IS) | 2 (48 GB × 2 VRAM) | 32 | 230 |
L20 (GN8IS) | 4 (48 GB × 4 VRAM) | 64 | 460 |
L20 (GN8IS) | 8 (48 GB × 8 VRAM) | 128 | 920 |
T4 | 1 (16 GB VRAM) | 24 | 90 |
T4 | 2 (16 GB × 2 VRAM) | 48 | 180 |
A10 | 1 (24 GB VRAM) | 16 | 60 |
A10 | 2 (24 GB × 2 VRAM) | 32 | 120 |
A10 | 4 (24 GB × 4 VRAM) | 64 | 240 |
A10 | 8 (24 GB × 8 VRAM) | 128 | 480 |
P16EN | 1 (96 GB VRAM) | 10 | 80 |
P16EN | 2 (96 GB × 2 VRAM) | 22 | 225 |
P16EN | 4 (96 GB × 4 VRAM) | 46 | 450 |
P16EN | 8 (96 GB × 8 VRAM) | 92 | 900 |
P16EN | 16 (96 GB × 16 VRAM) | 184 | 1800 |
GU8TF | 1 (96 GB VRAM) | 16 | 128 |
GU8TF | 2 (96 GB × 2 VRAM) | 46 | 230 |
GU8TF | 4 (96 GB × 4 VRAM) | 92 | 460 |
GU8TF | 8 (96 GB × 8 VRAM) | 184 | 920 |
GU8TEF | 1 (141 GB VRAM) | 22 | 225 |
GU8TEF | 2 (141 GB × 2 VRAM) | 46 | 450 |
GU8TEF | 4 (141 GB × 4 VRAM) | 92 | 900 |
GU8TEF | 8 (141 GB × 8 VRAM) | 184 | 1800 |
L20X (GX8SF) | 1 (141 GB VRAM) | 22 | 225 |
L20X (GX8SF) | 2 (141 GB × 2 VRAM) | 46 | 450 |
L20X (GX8SF) | 4 (141 GB × 4 VRAM) | 92 | 900 |
L20X (GX8SF) | 8 (141 GB × 8 VRAM) | 184 | 1800 |
Deduction rules
Capacity reservation deduction requires all of the following conditions:
GPU card type exactly matches the reserved card type. For example, both the reserved and Pod card types are L20.
GPU card count exactly matches the reserved configuration. For example, both the reserved and Pod card counts are 1.
Pod vCPU ≤ Reserved vCPU.
Pod memory ≤ Reserved memory.
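The four conditions above amount to a single predicate. The following is a minimal sketch for illustration only: `Spec` and `can_deduct` are hypothetical names, not part of any ACS API, and the sample values are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Spec:
    card_type: str   # GPU card type, for example "L20"
    cards: int       # number of GPU cards
    vcpu: int
    memory_gb: int


def can_deduct(pod: Spec, reserved: Spec) -> bool:
    """Return True only if all four deduction conditions hold."""
    return (
        pod.card_type == reserved.card_type    # card type matches exactly
        and pod.cards == reserved.cards        # card count matches exactly
        and pod.vcpu <= reserved.vcpu          # Pod vCPU does not exceed reserved vCPU
        and pod.memory_gb <= reserved.memory_gb  # Pod memory does not exceed reserved memory
    )


reserved = Spec("L20", 1, 16, 128)
smaller_pod = Spec("L20", 1, 8, 16)      # fits within the reservation
two_card_pod = Spec("L20", 2, 8, 16)     # card count differs
other_type_pod = Spec("A10", 1, 8, 16)   # card type differs
oversized_pod = Spec("L20", 1, 32, 16)   # vCPU exceeds the reservation
```

Here only `can_deduct(smaller_pod, reserved)` is true. Each of the other three Pods violates exactly one condition.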
The following deduction scenarios assume the Pod card type matches the reserved card type:
Deduction Principle | Scenario | Result and Explanation |
Exact match or downward compatibility | Reserved: 1 × (1 card, 16 vCPU, 128 GB). Create Pod: 1 × (1 card, 8 vCPU, 16 GB). | Result: ✓ Successfully deducted. Explanation: The Pod's required resources (card count, CPU, memory) do not exceed the reserved specification, so matching succeeds. This Pod fully deducts this capacity reservation. |
Smallest specification first | Reserved: 1 × (1 card, 10 vCPU, 80 GB) and 1 × (1 card, 16 vCPU, 128 GB). Create Pod: 1 × (1 card, 5 vCPU, 30 GB). | Result: ✓ Preferentially deducts the 1 card, 10 vCPU, 80 GB reservation. Explanation: To maximize resource utilization, the system preferentially selects the smallest specification reservation that meets the Pod requirements. |
First-in-first-out (FIFO) | Reserved: 4 × (1 card, 10 vCPU, 80 GB), created at different times. Create Pod: 4 × (1 card, 5 vCPU, 30 GB). | Result: ✓ The 4 Pods deduct reservations in order from earliest to latest creation time. Explanation: For reservations with the same specification, the FIFO principle applies. |
Multi-card specification atomicity (cannot be split) | Reserved: 1 × (4 cards, 46 vCPU, 450 GB). Create Pod: 4 × (1 card, 10 vCPU, 60 GB). | Result: ✗ Reservation is not deducted. Explanation: Multi-card reservations are atomic units and cannot be split to satisfy multiple single-card Pods. These 4 Pods are created as pay-as-you-go. |
Mixed specification matching | Reserved: 1 × (2 cards, 22 vCPU, 225 GB) and 1 × (4 cards, 46 vCPU, 450 GB). Create Pod: 2 × (1 card, 12 vCPU, 60 GB) and 2 × (2 cards, 20 vCPU, 120 GB). | Result: ✗ Only 1 Pod with 2 cards, 20 vCPU, 120 GB successfully deducts the 2 cards, 22 vCPU, 225 GB reservation. Explanation: Other Pods cannot match the remaining 4-card reservation and are created as pay-as-you-go. |
Real-time dynamic matching | Existing pay-as-you-go Pod: 1 × (1 card, 5 vCPU, 30 GB). New reservation purchased: 1 × (1 card, 10 vCPU, 80 GB). | Result: ✓ After the new reservation is created, it automatically matches and deducts the existing pay-as-you-go Pod. Explanation: Capacity reservations can deduct existing pay-as-you-go Pods that meet the conditions. |
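Several of the scenarios above can be reproduced with a short simulation. This is a sketch under stated assumptions, not the actual ACS matching logic: the names `Reservation`, `Pod`, `deduct`, and `release` are hypothetical, "smallest specification" is assumed to mean the lowest (vCPU, memory) pair, and `created_at` is an assumed integer timestamp used for FIFO ordering. Real-time matching of already-running Pods is not modeled.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Reservation:
    card_type: str
    cards: int
    vcpu: int
    memory_gb: int
    created_at: int                 # assumed creation timestamp, smaller is earlier
    used_by: Optional[str] = None   # name of the Pod that deducted this reservation


@dataclass
class Pod:
    name: str
    card_type: str
    cards: int
    vcpu: int
    memory_gb: int


def fits(pod: Pod, r: Reservation) -> bool:
    """A free reservation matches if type and count are equal and CPU/memory suffice."""
    return (
        r.used_by is None
        and pod.card_type == r.card_type
        and pod.cards == r.cards          # atomic: a multi-card reservation is never split
        and pod.vcpu <= r.vcpu
        and pod.memory_gb <= r.memory_gb
    )


def deduct(pod: Pod, reservations: List[Reservation]) -> Optional[Reservation]:
    """Pick the smallest matching reservation, then the oldest; None means pay-as-you-go."""
    candidates = [r for r in reservations if fits(pod, r)]
    if not candidates:
        return None
    chosen = min(candidates, key=lambda r: (r.vcpu, r.memory_gb, r.created_at))
    chosen.used_by = pod.name   # the whole reservation is consumed by this Pod
    return chosen


def release(pod_name: str, reservations: List[Reservation]) -> None:
    """Restore the quota when the Pod is terminated."""
    for r in reservations:
        if r.used_by == pod_name:
            r.used_by = None


# Mixed specification scenario from the table, with the card type assumed to be P16EN.
reservations = [
    Reservation("P16EN", 2, 22, 225, created_at=1),
    Reservation("P16EN", 4, 46, 450, created_at=2),
]
pods = [
    Pod("pod-a", "P16EN", 1, 12, 60),
    Pod("pod-b", "P16EN", 1, 12, 60),
    Pod("pod-c", "P16EN", 2, 20, 120),
    Pod("pod-d", "P16EN", 2, 20, 120),
]
results = {pod.name: deduct(pod, reservations) for pod in pods}
deducted = [name for name, r in results.items() if r is not None]
```

In this run only `pod-c` deducts a reservation (the 2-card one). The 4-card reservation stays unused because no Pod requests exactly 4 cards, which matches the mixed specification row.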