This topic describes the basic capabilities and benefits of the async replication feature for disaster recovery.
Overview
Cloud Backup implements cross-region and cross-zone disaster recovery based on the async replication feature to meet different business requirements.
Async replication is implemented on disks without the need to install an agent on the protected instance.
If a fault occurs on the primary system, the business system is switched to the disaster recovery system. This effectively prevents system failures caused by regional disasters, ensures business availability, and meets the recovery point objective (RPO) and recovery time objective (RTO) goals of your business.
Async replication is a feature that protects data across regions or across zones within the same region based on the data replication capability of Elastic Block Storage (EBS). For more information, see Overview.
The following table describes the differences between continuous data replication (CDR) and async replication.
Item | CDR | Async replication |
Application scenarios | Disaster recovery for a single virtual machine (VM). The target customers are those who have strict RPO requirements and do not mind intrusions into the system. | Disaster recovery that ensures the consistency of VM groups. The target customers are those who can accept an RPO of a few minutes and do not expect intrusions into the system. |
System intrusion | Yes | No |
Replication implementation | An agent is installed on the operating system of the protected instance, so that Cloud Backup replicates data written into the disks and sends the data to a gateway in real time. The gateway then transmits the data to the Object Storage Service (OSS) bucket for storage on the disaster recovery site. | Data is replicated by using the async replication and snapshot features. |
Recovery implementation | Supports multiple recovery points. A shadow Elastic Compute Service (ECS) instance and a gateway server are created for the protected ECS instance at the disaster recovery site. Cloud Backup reads data from the OSS bucket to the shadow ECS instance, writes the data to the ECS instance at the disaster recovery site, and then creates a recovery point based on the snapshot mechanism. | Supports only a single recovery point. Cloud Backup creates a recovery point by replicating the snapshot to the disaster recovery site. |
Consistency group | Not supported | Supported |
Benefits of disaster recovery
Agentless replication
Async replication does not require agents, does not intrude into the system, is universally applicable to operating systems, and does not consume computing resources at the disaster recovery site.
Multi-VM consistency
Disaster recovery provides multi-VM consistency to meet the stringent requirements of enterprise applications.
Ease of use
After you create a protection group for an application, you can add all the ECS instances of the application to the protection group and enable replication. You do not need to manage the mappings between disks and ECS instances, because Cloud Backup maps them for you.
Terms
Term | Description |
site pair | Two sites that are paired for disaster recovery. Cross-region and cross-zone disaster recovery is based on async replication, which replicates data from one site to another across regions or across zones within a region. You must therefore pair two sites based on your business requirements. Protection groups must be created for the site pair, and disaster recovery for the protection groups in a site pair runs only in the forward direction. For example, if disaster recovery is performed from Protection Group A to Protection Group B with forward protection from Region 1 to Region 2, and from Protection Group C to Protection Group D with forward protection from Region 2 to Region 1, you must create two site pairs. A protection group can belong to only one site pair, and a site pair can use only one replication technology. |
protection group |
|
protected instance | An ECS instance or database that is protected by Cloud Backup. Database protection will be supported in the future. A protected instance has either a primary or a secondary role. An instance in the primary role runs services, and an instance in the secondary role is currently used for disaster recovery. |
production site | The zone or region where your production business operates initially. |
disaster recovery site | The zone or region for disaster recovery of your production business. |
failover | The process of switching services to the disaster recovery site when a fault occurs at the production site. Failover is classified into planned failover and unplanned failover. The difference lies in whether the ECS instance at the production site fails during the switchover. |
failback | The process of switching services from the disaster recovery site to the production site when the fault at the production site is rectified. |
forward protection | The replication direction of the protection group and ECS instances. In forward protection, data and services are replicated from the production site to the disaster recovery site. |
reverse protection | The replication direction of the protection group and ECS instances. After a failover, the disaster recovery site (Site B) becomes the primary site, and the production site (Site A) becomes the secondary site. In this case, after the protection is enabled, data is replicated from Site B to Site A. The process is called reverse protection. After the fault is rectified, Site A becomes the production site and Site B becomes the disaster recovery site again. In this case, after the protection is enabled, data is replicated from Site A to Site B. The process is called forward protection. |
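The site pair rules in the table above can be illustrated with a small model. This is only a sketch under stated assumptions: `SitePair`, `SitePairRegistry`, and the group names are hypothetical and are not part of any Cloud Backup API. It shows that two opposite replication directions count as two site pairs and that a protection group cannot be moved into a second site pair.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SitePair:
    """Hypothetical model: forward protection runs production -> recovery."""
    production: str
    recovery: str
    replication: str = "async"  # only one replication technology per site pair


@dataclass
class SitePairRegistry:
    """Hypothetical bookkeeping: tracks which site pair owns each protection group."""
    owners: dict = field(default_factory=dict)

    def add_group(self, group: str, pair: SitePair) -> None:
        current = self.owners.get(group)
        # A protection group can belong to only one site pair.
        if current is not None and current != pair:
            raise ValueError(f"{group} already belongs to {current}")
        self.owners[group] = pair

    def site_pairs(self) -> set:
        return set(self.owners.values())


registry = SitePairRegistry()
registry.add_group("group-1-to-2", SitePair("Region 1", "Region 2"))
registry.add_group("group-2-to-1", SitePair("Region 2", "Region 1"))
print(len(registry.site_pairs()))  # opposite directions need separate site pairs
```

Because a site pair is directional in this model, `SitePair("Region 1", "Region 2")` and `SitePair("Region 2", "Region 1")` compare as different values, which mirrors the rule that each forward direction needs its own site pair.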
Supported disaster recovery scenarios
Disaster recovery scenario | Type |
Failover |
|
Failback |
|
Disaster recovery process
To implement disaster recovery protection for critical applications in the Hybrid Backup Recovery (HBR) console, perform the following steps:
Step 1: Plan resources.
Before you perform disaster recovery, you must plan the required compute, network, and storage resources. You must determine the number of servers, storage capacity, and virtual private clouds (VPCs).
Step 2: Create a disaster recovery site pair.
Create VPCs and vSwitches for the disaster recovery site, and configure CIDR blocks. For testing, you can use the default configurations to create VPCs and vSwitches. You can also configure the same VPC CIDR block and vSwitch CIDR block for the production site and the disaster recovery site. For actual disaster recovery, configure CIDR blocks based on your requirements.
Step 3: Configure network and security settings.
Create resource mappings, including the zone mapping, vSwitch mapping, and security group mapping.
Step 4: Create a protection group.
Step 5: Add protected instances.
Add instances to be protected.
Step 6: Start replication.
Start disaster recovery protection, which replicates data from the production site to the disaster recovery site.
Step 7: Perform a failover.
Switch After Data Synchronization
During the failover, HBR stops the protected instances in the protection group, and performs the final data synchronization after all the protected instances are stopped. The failover starts after the data is synchronized. This ensures that the data at the disaster recovery site is the same as that at the production site. This type of failover applies to scenarios such as planned disaster recovery drills and business migration.
Switch Now
During the failover, HBR attempts to stop the protected instances in the protection group. HBR does not wait until all the protected instances are stopped or perform the final data synchronization. Some data may be lost within the recovery point objective (RPO) range. This type of failover applies to scenarios where a fault cannot be rectified within a short period of time at the production site and business must be immediately switched to the disaster recovery site.
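The difference between the two failover modes can be sketched as a small simulation. This is an illustrative model only, not the service's implementation: `Instance`, `failover`, the mode strings, and the idea of counting pending writes are all assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    """Hypothetical protected instance."""
    name: str
    running: bool = True
    stoppable: bool = True  # False models an instance that cannot be stopped


def failover(instances, pending_writes, mode):
    """Simulate a failover and return the number of writes that are lost.

    mode is "sync_then_switch" (Switch After Data Synchronization)
    or "switch_now" (Switch Now). Both names are hypothetical.
    """
    if mode not in ("sync_then_switch", "switch_now"):
        raise ValueError(f"unknown mode: {mode}")

    # Both modes attempt to stop every protected instance in the group.
    for inst in instances:
        if inst.stoppable:
            inst.running = False

    if mode == "sync_then_switch":
        # The final synchronization requires that all instances are stopped.
        if any(inst.running for inst in instances):
            raise RuntimeError("not all protected instances are stopped")
        return 0  # pending writes are replicated before the switch

    # Switch Now: no waiting and no final synchronization, so writes that
    # were not yet replicated are lost (bounded by the RPO).
    return pending_writes


print(failover([Instance("web"), Instance("db")], 5, "sync_then_switch"))
print(failover([Instance("web", stoppable=False)], 5, "switch_now"))
```

In this model, the planned mode trades a longer switchover for zero data loss, while the immediate mode accepts the loss of unreplicated writes so that the switch can proceed even when an instance cannot be stopped.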
Billing
If you use the async replication feature for disaster recovery, the following fees are incurred:
The usage fees of disaster recovery software are included in Cloud Backup bills.
Async replication is in public preview. You can use disaster recovery software for free during the public preview.
The usage fees of the pay-as-you-go ECS instances and disks created at the disaster recovery site are included in ECS bills. For more information, see Pay-as-you-go.
The fees incurred by the async replication feature are included in ECS bills. For more information, see Overview.