If your workloads run in Kubernetes clusters in on-premises data centers or on third-party public clouds, you can use Distributed Cloud Container Platform for Kubernetes (ACK One) to implement zone-disaster recovery for business high availability. ACK One provides a unified control plane that lets you centrally manage traffic, applications, and clusters, route traffic across clusters, and seamlessly perform traffic failovers. This topic describes how to use ACK One to build a zone-disaster recovery system in a hybrid cloud environment.
Overview
In a typical hybrid cloud scenario, your applications run in both a cloud-based Kubernetes cluster and an external Kubernetes cluster in a data center or on a third-party public cloud. When a zone-level failure occurs, traffic must be rerouted to the healthy zone with minimal disruption. ACK One addresses this by combining registered clusters, Fleet instances, and Microservices Engine (MSE) multi-cluster gateways into an integrated zone-disaster recovery architecture.
The solution provides the following capabilities:
| Capability | Description |
|---|---|
| Centralized cluster management | Associate cloud-based and external Kubernetes clusters with a single Fleet instance and manage them as one logical group. |
| Multi-cluster application distribution | Use ACK One GitOps to distribute applications to multiple clusters consistently. |
| Cross-cluster traffic routing | Use MSE multi-cluster gateways to manage north-south traffic and implement zone-disaster recovery with automatic failover. |
Architecture
The preceding figure shows the zone-disaster recovery architecture built with ACK One. The architecture consists of a registered cluster, a Fleet instance (with optional GitOps), and a multi-cluster gateway.
Component layout
| Component | Description |
|---|---|
| Virtual Private Cloud (VPC) 1 | Hosts all Alibaba Cloud resources. |
| Availability Zone (AZ) 1 | Contains a Container Service for Kubernetes (ACK) cluster. |
| AZ 2 | Contains a registered cluster. |
| Registered cluster | Connects to a Kubernetes cluster deployed in your on-premises data center or on a third-party public cloud. |
| Express Connect circuit | Connects the data center to the VPC for network communication between the on-premises environment and the cloud. |
| Fleet instance | Associates both the ACK cluster and the registered cluster in VPC 1. Serves as the central management plane for multi-cluster operations. |
| ACK One GitOps (optional) | Distributes applications from a Git repository to both the ACK cluster and the registered cluster. |
| MseIngressConfig | Configured on the Fleet instance to create an MSE gateway and add clusters to the gateway. An Ingress resource is then created on the Fleet instance to define traffic routing rules that manage north-south traffic and implement zone-disaster recovery. |
Architecture constraints
- The Fleet instance, ACK cluster, and registered cluster must be deployed in the same VPC.
- The ACK cluster and registered cluster must reside in different zones.
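The two constraints above are easy to check before you provision anything. The following is a minimal Python sketch of such a check, not an ACK One API: the `Placement` class, the `check_constraints` function, and all IDs are hypothetical names introduced for illustration only.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Placement:
    """Planned location of one component (hypothetical model, not an ACK One type)."""
    name: str
    vpc_id: str
    zone_id: Optional[str] = None  # left unset for the Fleet instance in this sketch


def check_constraints(fleet: Placement, ack: Placement, registered: Placement) -> List[str]:
    """Return a list of constraint violations; an empty list means the plan is valid."""
    problems = []
    # Constraint 1: Fleet instance, ACK cluster, and registered cluster share one VPC.
    if len({fleet.vpc_id, ack.vpc_id, registered.vpc_id}) != 1:
        problems.append("Fleet instance, ACK cluster, and registered cluster must share a VPC")
    # Constraint 2: the two clusters must sit in different zones.
    if ack.zone_id == registered.zone_id:
        problems.append("ACK cluster and registered cluster must be in different zones")
    return problems
```

For example, a plan that places both clusters in `vpc-1` but in zones `zone-a` and `zone-b` yields an empty list, whereas a plan that puts both clusters in the same zone yields one violation.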
Procedure
Step 1: Design the network and create a Fleet instance
Before you build the disaster recovery system, plan the network layout and create a Fleet instance. Keep the following requirements in mind:
- The Fleet instance, ACK cluster, and registered cluster must be deployed in the same VPC.
- The ACK cluster and registered cluster must reside in different zones to enable zone-disaster recovery.
For detailed guidance, see Network design for Fleet management.
Step 2: Use a registered cluster to manage external Kubernetes clusters
A registered cluster lets you bring an external Kubernetes cluster (running in your data center or on a third-party public cloud) under ACK One management. This is the foundation for incorporating off-cloud workloads into the disaster recovery architecture.
To set up a registered cluster:
- Create a registered cluster in the ACK console.
- Configure a YAML file to connect your external Kubernetes cluster to the registered cluster.
For complete instructions, see Use registered clusters to manage external Kubernetes clusters in a centralized manner.
Use elastic cloud resources through the registered cluster (optional)
To use elastic resources on Alibaba Cloud through the registered cluster, you can add Elastic Compute Service (ECS) instances or schedule pods to Elastic Container Instance (ECI) instances deployed as virtual nodes. For more information, see:
- Build a hybrid cloud cluster and add ECS instances to the cluster
- Schedule pods to elastic container instances that are deployed as virtual nodes
To improve resilience against unexpected traffic spikes, you can configure high availability settings for ECI instances. For more information, see Create ECIs across zones.
Step 3: Connect the on-premises network to the VPC
To enable communication between your on-premises data center and the Alibaba Cloud VPC, establish network connectivity. This is required for the registered cluster to communicate with cloud-based resources.
For an overview of available options, see Network connectivity. For a detailed guide on hybrid Kubernetes networking, see Overview of hybrid networks.
If you use an Express Connect circuit, the general procedure is as follows:
- Connect the on-premises network to Alibaba Cloud by provisioning an Express Connect circuit. For more information, see Physical Connection.
- Create a connection over the Express Connect circuit to connect edge devices in the data center to a Virtual Border Router (VBR), which functions as a gateway in the cloud.
- Attach the VBR and VPC to a Cloud Enterprise Network (CEN) instance to enable inter-network communication.
- Configure BGP on the VBR and in the data center to enable dynamic route exchange.
- Test network connectivity between the cloud network and the on-premises network to verify the connection.
- Configure routes for cloud service private CIDR blocks to enable communication between on-premises resources and cloud services:
  - Container Registry (ACR): Add routes that point to the private address of ACK component images. See Add routes that point to the private address of ACK component images.
  - Object Storage Service (OSS): Configure routes for internal endpoints and VIP ranges. See Internal endpoints of OSS buckets and VIP ranges.
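For the connectivity test in the procedure above, a simple TCP probe run from a host on either side of the circuit is often enough for a first check. The following is a minimal sketch that uses only the Python standard library. The function name `tcp_reachable` is an assumption made for illustration, and the probe does not replace the verification tools described in the Express Connect documentation.

```python
import socket


def tcp_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable networks, DNS errors, and timeouts.
        return False
```

For example, from a node in the data center you might call `tcp_reachable("10.0.0.10", 6443)`, where the address and port stand in for a private endpoint in the VPC. Both values are placeholders, not values taken from this topic.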
Step 4: Connect the registered cluster and ACK cluster to the Fleet instance
After the registered cluster and ACK cluster are ready and network connectivity is established, associate both clusters with the Fleet instance. This brings the clusters under unified management through the ACK One control plane.
For more information, see Manage associated clusters.
Step 5: Use GitOps to distribute applications to multiple clusters
With both clusters associated with the Fleet instance, use ACK One GitOps to distribute your application to the ACK cluster and the registered cluster simultaneously. GitOps ensures that application deployments in both clusters stay consistent with the desired state defined in your Git repository.
For more information, see Use GitOps to distribute an application to multiple clusters.
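To illustrate what "consistent with the desired state" means, the following Python sketch compares one desired configuration against the live configuration reported by each cluster and lists the fields that have drifted. It is a conceptual model only: the function `find_drift`, the flat dictionary layout, and the sample values are assumptions, and the sketch does not reflect how ACK One GitOps performs reconciliation internally.

```python
from typing import Dict, List


def find_drift(desired: Dict[str, object],
               actual_by_cluster: Dict[str, Dict[str, object]]) -> Dict[str, List[str]]:
    """Map each cluster name to the desired-state keys whose live value differs."""
    drift = {}
    for cluster, actual in actual_by_cluster.items():
        # A key drifts if it is missing from the cluster or holds a different value.
        changed = sorted(k for k in desired if actual.get(k) != desired[k])
        if changed:
            drift[cluster] = changed
    return drift
```

With a desired state of `{"image": "web:1.2", "replicas": 3}`, a registered cluster that still runs `web:1.1` is reported as drifting on `image`, while a cluster that matches the desired state does not appear in the result.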
Step 6: Use the multi-cluster gateway to implement zone-disaster recovery
The MSE multi-cluster gateway enables traffic routing across clusters and automatic failover between zones. In this step, you configure the gateway on the Fleet instance and set up Ingress rules for zone-disaster recovery.
- Enable the multi-cluster gateway feature on the Fleet instance.
- Configure the MseIngressConfig to create a gateway on the Fleet instance and add both the ACK cluster and the registered cluster to the gateway.
- Create an Ingress on the Fleet instance to define traffic routing rules that implement zone-disaster recovery.
For complete instructions, see Use MSE multi-cluster gateways to implement zone-disaster recovery in ACK One.
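The failover behavior that the gateway provides can be pictured with a small model: prefer a healthy backend in the primary zone, and fall back to a healthy backend in another zone when the primary zone fails. The following Python sketch shows that selection logic under stated assumptions. The `Backend` class, the `pick_backend` function, and the zone and cluster names are hypothetical, and the sketch is not the MSE gateway's actual routing algorithm.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Backend:
    """One cluster behind the gateway (hypothetical model)."""
    cluster: str
    zone: str
    healthy: bool = True


def pick_backend(backends: List[Backend], preferred_zone: str) -> Optional[Backend]:
    """Prefer a healthy backend in preferred_zone, else any healthy one, else None."""
    healthy = [b for b in backends if b.healthy]
    for backend in healthy:
        if backend.zone == preferred_zone:
            return backend
    # The preferred zone is down or absent: fail over to the first healthy backend.
    return healthy[0] if healthy else None
```

In this model, traffic goes to the ACK cluster while its zone is healthy, shifts to the registered cluster when the ACK cluster is marked unhealthy, and has no target when both are down.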
References
For more information about ACK One, see ACK One overview.