Elastic Block Storage (EBS) provides cross-zone and cross-region disaster recovery for disks based on its async replication capabilities. If a production site fails, you can fail over to the disaster recovery site to prevent system failures caused by regional disasters and ensure business availability and continuity.
Scenarios
Disk disaster recovery is used in the following scenarios:
Cross-zone disaster recovery
If applications in a zone fail and cannot recover in a short period of time due to force majeure factors (such as fires or blackouts that affect data centers) or device faults (such as software issues and hardware damage), you can use the cross-zone disaster recovery capability of the async replication feature to recover from zone-level failures and ensure business continuity.
Cross-region disaster recovery
If a production site fails due to a disaster, such as an earthquake or a tsunami, you can fail over to the disaster recovery site to continue business operations. The production site and the disaster recovery site are deployed in different regions. You can use the cross-region disaster recovery capability of the async replication feature to prevent system failures caused by regional disasters.
Features
Async replication
The async replication feature is suitable for disaster recovery of a single disk set. The async replication feature protects data across regions or across zones based on the data replication capability of EBS. This feature can asynchronously replicate data from one disk (primary disk) to another disk (secondary disk) in a different region or zone. If the primary disk fails, you can fail over to the secondary disk and perform a reverse replication to achieve disaster recovery.
Cross-zone disaster recovery
Cross-region disaster recovery
Replication pair-consistent group
The replication pair-consistent group feature is suitable for disaster recovery of multiple disk sets. The replication pair-consistent group feature allows you to batch manage and operate disks in disaster recovery scenarios where a business system involves multiple disks. You can restore the data of all the disks in a replication pair-consistent group to the same point in time to implement disaster recovery for the disks.
The replication pair-consistent group feature allows data to be asynchronously replicated from primary disks in production sites to secondary disks in disaster recovery sites across regions or zones. When a production site fails, you can fail over to the corresponding disaster recovery site and perform a reverse replication to achieve disaster recovery.
Cross-zone disaster recovery
Cross-region disaster recovery
Limits
Limits on regions
You can use the async replication and replication pair-consistent group features in the following regions and zones:
China (Hangzhou): Hangzhou Zone G, Hangzhou Zone H, Hangzhou Zone I, and Hangzhou Zone K
China (Shanghai): Shanghai Zone B, Shanghai Zone E, Shanghai Zone F, Shanghai Zone G, Shanghai Zone L, and Shanghai Zone N
China (Beijing): Beijing Zone F, Beijing Zone G, Beijing Zone H, and Beijing Zone J
China (Zhangjiakou): Zhangjiakou Zone A
China (Shenzhen): Shenzhen Zone D and Shenzhen Zone E
China (Heyuan): Heyuan Zone A and Heyuan Zone B
China (Chengdu): Chengdu Zone A and Chengdu Zone B
China (Hong Kong): Hong Kong Zone B and Hong Kong Zone C
Singapore: Singapore Zone B and Singapore Zone C
Indonesia (Jakarta): Jakarta Zone A and Jakarta Zone B
US (Silicon Valley): Silicon Valley Zone A and Silicon Valley Zone B
US (Virginia): Virginia Zone A and Virginia Zone B
SAU (Riyadh - Partner Region): Riyadh - Partner Region Zone A and Riyadh - Partner Region Zone B
China East 2 Finance: China East 2 Finance Zone G, China East 2 Finance Zone K, and China East 2 Finance Zone Z
China North 2 Finance (Preview): China North 2 Finance (Preview) Zone L, China North 2 Finance (Preview) Zone K, and China North 2 Finance (Preview) Zone Z
Malaysia (Kuala Lumpur): Kuala Lumpur Zone A and Kuala Lumpur Zone B
Limits on specifications
The following table describes the limits on specifications that apply to the async replication and replication pair-consistent group features.
Item | Description |
Number of replication pairs that can be created per disk | 1 |
Number of replication pairs that can be added per replication pair-consistent group | 17 |
Replication cycle | 15 minutes (Data is asynchronously replicated from a primary disk to a secondary disk every 15 minutes.) |
Replication rate | The replication rate can be up to 100 MB/s and may vary based on the system load. |
Primary disk category | Primary disks must be ESSDs or ESSD AutoPL disks. |
Secondary disk category | Secondary disks must be of the same disk category and have the same performance level and capacity as the corresponding primary disks. |
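The disk-matching rules in the preceding table can be checked before a replication pair is created. The following is a minimal sketch, not an EBS API: the `Disk` class, the category strings `"essd"` and `"essd_autopl"`, and the function names are hypothetical and chosen only for illustration. The limit of 17 pairs per group is taken from the table.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical category labels for this sketch; real identifiers may differ.
SUPPORTED_PRIMARY_CATEGORIES = {"essd", "essd_autopl"}
# Maximum replication pairs per replication pair-consistent group (from the table).
MAX_PAIRS_PER_GROUP = 17


@dataclass(frozen=True)
class Disk:
    disk_id: str
    category: str
    performance_level: Optional[str]  # for example "PL1"; None if not applicable
    size_gib: int


def pair_problems(primary: Disk, secondary: Disk) -> List[str]:
    """Return the reasons why the two disks cannot form a replication pair."""
    problems = []
    if primary.category not in SUPPORTED_PRIMARY_CATEGORIES:
        problems.append("primary disk must be an ESSD or ESSD AutoPL disk")
    if secondary.category != primary.category:
        problems.append("secondary disk category differs from the primary disk")
    if secondary.performance_level != primary.performance_level:
        problems.append("secondary performance level differs from the primary disk")
    if secondary.size_gib != primary.size_gib:
        problems.append("secondary capacity differs from the primary disk")
    return problems


def group_has_room(current_pair_count: int) -> bool:
    """Check whether one more pair fits into a replication pair-consistent group."""
    return current_pair_count < MAX_PAIRS_PER_GROUP
```

An empty list from `pair_problems` means that the pair satisfies the category, performance level, and capacity limits listed above. The sketch does not check the one-pair-per-disk limit, because that requires knowledge of existing pairs.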
Limits on disks
The following table describes the limits on disks that apply to the async replication and replication pair-consistent group features.
①: After a replication pair is activated, the secondary disk enters the read-only state and no users have write permissions on the disk.
②: Due to the recovery point objective (RPO), the data of a snapshot created for a primary disk may be inconsistent with that of a snapshot created at the same time for the associated secondary disk.
③: An encrypted disk can be replicated only to another encrypted disk. Cross-replication between encrypted disks and unencrypted disks is not supported.
Item | Supported by the primary disk | Supported by the secondary disk |
Read and write operations | √ | ×① |
Disk deletion | × | × |
Disk initialization | × | × |
Disk resizing | × | × |
Disk attaching | √ | × |
Snapshot creation | √ | √② |
Rollback based on snapshots | √ | × |
Disk category change | × | × |
Performance level change | × | × |
Disk encryption | √ | √③ |
Multi-attach | × | × |
Disk migration together with instances | × | × |
Billing
The async replication feature supports the following billing methods for replication pairs:
Subscription: You are charged based on the bandwidth level you choose. You can select 10 Mbit/s, 20 Mbit/s, 50 Mbit/s, or 100 Mbit/s of bandwidth.
The fee for a subscription replication pair is calculated by using the following formula: Fee for a subscription replication pair = Unit price of bandwidth × Bandwidth size of the replication pair × Subscription duration of the replication pair. For example, assume that you purchase a 2-month subscription replication pair and select 20 Mbit/s of bandwidth for the replication pair. The unit price of bandwidth is CNY 10 per Mbit/s per month. You are charged CNY 400 (= CNY 10 per Mbit/s per month × 20 Mbit/s × 2 months) for the replication pair.
Pay-as-you-go: You are charged based on the amount of replicated data.
The fee for a pay-as-you-go replication pair is calculated by using the following formula: Fee for a pay-as-you-go replication pair = Unit price of replicated data × Amount of replicated data. For example, assume that you have a pay-as-you-go replication pair that consists of Disk A located in the China (Hangzhou) region and Disk B located in the China (Shanghai) region. 100 GB of data is replicated from Disk A to Disk B, and the unit price of the replicated data is CNY 0.5 per GB. You are charged CNY 50 (= CNY 0.5 per GB × 100 GB) for the replication pair.
Note: You are charged on a pay-as-you-go basis only for cross-region replication. You are not charged for cross-zone replication within the same region.
For disaster recovery drills, you are charged for the disks that you create in the disaster recovery sites on a pay-as-you-go basis. For more information, see Pay-as-you-go.
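The two billing formulas above can be sketched in a few lines of code. This is an illustrative calculation only: the function names and parameters are hypothetical, and the unit prices are the example values used in this topic, not actual prices.

```python
from decimal import Decimal

# Bandwidth levels available for subscription replication pairs (from this topic).
SUBSCRIPTION_BANDWIDTHS_MBITS = (10, 20, 50, 100)


def subscription_fee(unit_price: str, bandwidth_mbits: int, months: int) -> Decimal:
    """Fee = unit price of bandwidth x bandwidth size x subscription duration."""
    if bandwidth_mbits not in SUBSCRIPTION_BANDWIDTHS_MBITS:
        raise ValueError(f"unsupported bandwidth level: {bandwidth_mbits} Mbit/s")
    return Decimal(unit_price) * bandwidth_mbits * months


def pay_as_you_go_fee(unit_price: str, replicated_gb: str,
                      cross_region: bool = True) -> Decimal:
    """Fee = unit price of replicated data x amount of replicated data.

    Cross-zone replication in the same region is not charged.
    """
    if not cross_region:
        return Decimal("0")
    return Decimal(unit_price) * Decimal(replicated_gb)


# Example values from this topic: CNY 10 per Mbit/s per month, 20 Mbit/s, 2 months.
print(subscription_fee("10", 20, 2))
# Example values from this topic: CNY 0.5 per GB, 100 GB replicated across regions.
print(pay_as_you_go_fee("0.5", "100"))
```

Prices are passed as strings and handled as `Decimal` values to avoid the rounding errors of binary floating-point arithmetic in currency calculations.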
Terms
Before you implement disaster recovery for disks, familiarize yourself with the terms described in the following table.
Term | Description |
asynchronous replication | The async replication feature replicates data from one disk to another disk across regions or across zones on a periodic basis. In asynchronous replication, the data on the source disk is not always identical to the data on the destination disk because data is not synchronized in real time. |
production site | A data center that can independently support the normal operation of business. |
disaster recovery site | The data center that serves as a backup site for a production site. If the production site fails, the disaster recovery site takes over business operations to ensure business continuity. |
primary disk | The disk from which data is replicated to implement disaster recovery. The primary disk is also called the source disk. After a reverse replication is performed, the primary disk is converted to the secondary disk. |
secondary disk | The disk to which data is replicated. The secondary disk is also called the destination disk. After a reverse replication is performed, the secondary disk is converted to the primary disk. |
primary site | The data center where a primary disk is located. After a reverse replication is performed, the primary site is converted to the secondary site. |
secondary site | The data center where a secondary disk is located. After a reverse replication is performed, the secondary site is converted to the primary site. |
recovery point objective (RPO) | The maximum amount of data, measured in time, that may be lost due to a disk exception. The RPO is used as a data metric in async replication. The default value of the RPO is 15 minutes in async replication. This value indicates that data written to a primary disk within the previous 15 minutes may be lost if an exception occurs on the disk. |
recovery time objective (RTO) | The time required to recover after an exception occurs on a primary disk. The RTO is used as a data metric in async replication. For example, if the value of the RTO is 1 hour, the data of the primary disk can be restored and business can run as expected within 1 hour after the exception occurs. |
async replication relationship | The replication relationship that is established between a primary disk and a secondary disk, together with the configurations for asynchronous replication. |
replication pair | A pair of disks that have the async replication relationship. A replication pair-consistent group can contain multiple replication pairs. |
failover | A sub-feature of async replication that allows you to enable read and write permissions on the secondary disk and fail over to the disk. |
reverse replication | A sub-feature of async replication that allows you to reverse the async replication relationship of a replication pair to replicate data from the original secondary disk to the original primary disk. |
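The relationship between the failover and reverse replication terms can be illustrated with a small state model. This is a conceptual sketch, not the EBS API: the `ReplicationPair` class and both functions are hypothetical, and the requirement to fail over before reversing replication is an assumption made by this sketch.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ReplicationPair:
    primary_disk: str
    secondary_disk: str
    # The secondary disk is read-only while the replication pair is active.
    secondary_writable: bool = False


def failover(pair: ReplicationPair) -> ReplicationPair:
    """Enable read and write permissions on the secondary disk."""
    return replace(pair, secondary_writable=True)


def reverse_replicate(pair: ReplicationPair) -> ReplicationPair:
    """Swap roles so that data flows from the original secondary disk
    back to the original primary disk."""
    # Assumption of this sketch: a failover must come first.
    if not pair.secondary_writable:
        raise RuntimeError("fail over to the secondary disk before reversing")
    return ReplicationPair(
        primary_disk=pair.secondary_disk,
        secondary_disk=pair.primary_disk,
        secondary_writable=False,  # the new secondary disk is read-only again
    )


pair = ReplicationPair(primary_disk="disk-a", secondary_disk="disk-b")
reversed_pair = reverse_replicate(failover(pair))
print(reversed_pair.primary_disk, reversed_pair.secondary_disk)
```

After the reverse replication, the original secondary disk acts as the primary disk and the original primary disk acts as the secondary disk, which matches the role conversion described in the preceding table.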
Procedures
Implement disaster recovery for a set of disks
Implement disaster recovery for multiple sets of disks