The Application Load Balancer (ALB) multi-cluster gateways of Distributed Cloud Container Platform for Kubernetes (ACK One) can be used together with ACK One GitOps or the multi-cluster application distribution feature to quickly implement zone-disaster recovery. This allows you to ensure the high availability of your business and automatically switch traffic in a seamless manner when a fault occurs. This topic describes how to build a zone-disaster recovery system by using multi-cluster gateways.
Disaster recovery overview
Disaster recovery solutions in the cloud can be classified into the following types:
Zone-disaster recovery: This solution includes active zone-redundancy and primary/secondary disaster recovery. Because the network latency between data centers in the same region is low, zone-disaster recovery is suitable for protecting data against zone-level events such as fires, network interruptions, and power outages. The backup and restore methods are simple, yet the solution covers most common scenarios.
Active geo-redundancy: The network latency between data centers is higher in this solution than in zone-disaster recovery. However, this solution can efficiently protect data against region-level disasters, such as floods and earthquakes.
Disaster recovery based on three data centers across two regions: This solution provides the benefits of zone-disaster recovery and active geo-redundancy. This solution is suitable for scenarios where you need to ensure the continuity and availability of applications.
In most cases, the business architecture of an enterprise can be divided into the following layers from the top down: access layer, application layer, and data layer.
Access layer: serves as an entry point for ingress traffic. This layer routes ingress traffic to the backend application layer based on forwarding rules.
Application layer: hosts applications. This layer processes ingress traffic and sends the results back to the upper layer.
Data layer: stores data. This layer provides data and storage services for the application layer.
When you build a disaster recovery system for your business, you need to enforce recovery measures on each layer.
Access layer: ACK One uses multi-cluster gateways to build the access layer. The multi-cluster gateways of ACK One support zone-disaster recovery. Therefore, the access layer built on ACK One is highly available.
Application layer: ACK One uses multi-cluster gateways to implement disaster recovery on the application layer. The multi-cluster gateways of ACK One support active zone-redundancy, primary/secondary disaster recovery, and geo-redundancy.
Data layer: Disaster recovery and data synchronization on the data layer have middleware dependencies.
Benefits
Disaster recovery by using the multi-cluster gateways of ACK One has the following advantages over disaster recovery by using DNS traffic distribution:
Disaster recovery by using DNS traffic distribution requires multiple load balancer IP addresses (one IP address for each cluster). Disaster recovery by using multi-cluster gateways uses only one load balancer IP address in one region and uses multi-zone deployment in the same region by default to ensure high availability.
Disaster recovery by using multi-cluster gateways supports request forwarding at Layer 7, while disaster recovery by using DNS traffic distribution does not support this feature.
In a disaster recovery system that uses DNS traffic distribution, clients usually cache DNS query results. When the IP address is switched, clients keep using the cached address until the cache expires, which causes temporary service interruptions. Disaster recovery by using multi-cluster gateways avoids this problem by seamlessly failing over to the backend pods in another cluster.
Multi-cluster gateways are region-level gateways. Therefore, you can complete all the operations on a Fleet instance without the need to install an Ingress controller or create Ingresses in each Container Service for Kubernetes (ACK) cluster. This helps you manage traffic in a region and reduce multi-cluster management costs.
Architecture
In this example, a web application is used to show how to use ALB multi-cluster gateways to implement zone-disaster recovery. The web application consists of a Deployment and a Service. The following figure shows the architecture of the zone-disaster recovery system.
Create Cluster 1 and Cluster 2 in AZ 1 and AZ 2 in the China (Hong Kong) region.
Use ACK One GitOps to distribute the application to Cluster 1 and Cluster 2.
Use an AlbConfig to create an ALB multi-cluster gateway on the ACK One Fleet instance.
After the ALB multi-cluster gateway is created, you can configure Ingress rules to route traffic to the clusters based on weights and request headers. When one of the clusters is down, traffic is automatically switched to the other cluster.
Data synchronization based on ApsaraDB RDS has middleware dependencies.
Prerequisites
ALB is activated.
The Fleet management feature is enabled. For more information, see Enable multi-cluster management.
The ACK One Fleet instance is associated with two ACK clusters that are deployed in the same virtual private cloud (VPC) as the ACK One Fleet instance. For more information, see Manage associated clusters.
The kubeconfig file of the Fleet instance is obtained in the ACK One console and a kubectl client is connected to the Fleet instance.
The latest version of Alibaba Cloud CLI is installed and Alibaba Cloud CLI is configured.
Step 1: Use GitOps or the application distribution feature to distribute an application to multiple clusters
ACK One allows you to use GitOps or the application distribution feature to distribute an application to multiple clusters. For more information, see Getting started with GitOps, Create a multi-cluster application, and Get started with application distribution. In this step, GitOps is used.
Log on to the ACK One console. In the left-side navigation pane, open the Multi-cluster Applications page.
In the upper-left corner of the Multi-cluster Applications page, click the icon to the right of the Fleet instance name and select your Fleet instance from the drop-down list.
Go to the Create Multi-cluster Application - GitOps page.
Note: If GitOps is not enabled for the ACK One Fleet instance, enable GitOps. For more information, see Enable GitOps for the Fleet instance. For more information about how to enable Internet access to GitOps, see Enable public access to Argo CD.
On the Create from YAML tab, copy the following YAML template to the code editor. Then, click OK to deploy the application.
Note: The following YAML template is used to deploy an application named web-demo to each associated cluster. You can also select the clusters where you want to deploy the application on the Quick Create tab. The configuration changes that you make on the Quick Create tab will be automatically synchronized to the YAML template on the Create from YAML tab.

apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: appset-web-demo
  namespace: argocd
spec:
  template:
    metadata:
      name: '{{.metadata.annotations.cluster_id}}-web-demo'
      namespace: argocd
    spec:
      destination:
        name: '{{.name}}'
        namespace: gateway-demo
      project: default
      source:
        repoURL: https://github.com/AliyunContainerService/gitops-demo.git
        path: manifests/helm/web-demo
        targetRevision: main
        helm:
          valueFiles:
            - values.yaml
          parameters:
            - name: envCluster
              value: '{{.metadata.annotations.cluster_name}}'
      syncPolicy:
        automated: {}
        syncOptions:
          - CreateNamespace=true
  generators:
    - clusters:
        selector:
          matchExpressions:
            - values:
                - cluster
              key: argocd.argoproj.io/secret-type
              operator: In
            - values:
                - in-cluster
              key: name
              operator: NotIn
  goTemplateOptions:
    - missingkey=error
  syncPolicy:
    preserveResourcesOnDeletion: false
  goTemplate: true
Step 2: Use kubectl to deploy an ALB multi-cluster gateway from the ACK One Fleet instance
You can use an AlbConfig to create an ALB multi-cluster gateway from the ACK One Fleet instance. You can associate clusters with the gateway.
Obtain the IDs of two vSwitches that belong to the VPC where the ACK One Fleet instance resides.
Create a file named gateway.yaml and copy the following content to the file.
Note: Replace ${vsw-id1} and ${vsw-id2} with the vSwitch IDs obtained in the preceding step, and replace ${cluster1} and ${cluster2} with the IDs of the associated clusters that you want to add. For the associated clusters ${cluster1} and ${cluster2}, you must configure the inbound rules of their security groups to allow access from all IP addresses and ports of the vSwitch CIDR block.

apiVersion: alibabacloud.com/v1
kind: AlbConfig
metadata:
  name: ackone-gateway-demo
  annotations:
    # Specify the IDs of the clusters that you want to associate with the ALB instance.
    alb.ingress.kubernetes.io/remote-clusters: ${cluster1},${cluster2}
spec:
  config:
    name: one-alb-demo
    addressType: Internet
    addressAllocatedMode: Fixed
    zoneMappings:
      - vSwitchId: ${vsw-id1}
      - vSwitchId: ${vsw-id2}
  listeners:
    - port: 8001
      protocol: HTTP
---
apiVersion: networking.k8s.io/v1
kind: IngressClass
metadata:
  name: alb
spec:
  controller: ingress.k8s.alibabacloud/alb
  parameters:
    apiGroup: alibabacloud.com
    kind: AlbConfig
    name: ackone-gateway-demo

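The placeholder replacement described in the note above can be scripted instead of done by hand. The following is a minimal sketch, not part of the documented procedure: render_gateway is a hypothetical helper name, and the sketch assumes that the IDs contain only letters, digits, and hyphens. It uses sed rather than envsubst because placeholder names such as vsw-id1 contain hyphens, which are not valid in shell variable names.

```shell
# Hypothetical helper: read a manifest on stdin and replace the placeholders
# ${vsw-id1}, ${vsw-id2}, ${cluster1}, and ${cluster2} with the four
# arguments, in that order. The result is written to stdout.
render_gateway() {
  sed \
    -e 's/\${vsw-id1}/'"$1"'/g' \
    -e 's/\${vsw-id2}/'"$2"'/g' \
    -e 's/\${cluster1}/'"$3"'/g' \
    -e 's/\${cluster2}/'"$4"'/g'
}
```

For example, render_gateway <vsw-1> <vsw-2> <cluster-1> <cluster-2> < gateway.yaml > gateway-rendered.yaml writes a filled-in copy that can then be applied with kubectl apply -f gateway-rendered.yaml. The output file name is illustrative.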
You need to configure the following parameters.
Parameter | Required | Description |
metadata.name | Yes | The name of the AlbConfig. |
metadata.annotations: alb.ingress.kubernetes.io/remote-clusters | Yes | The list of associated clusters to be added to the ALB multi-cluster gateway. The cluster IDs listed here must already be associated with the Fleet instance. |
spec.config.name | No | The name of the ALB instance. |
spec.config.addressType | No | The network type of the ALB instance. Valid values: Internet (default): Public network. The ALB instance provides services to the Internet and is accessible over the Internet. Note: To allow an ALB instance to provide Internet-facing services, the ALB instance needs to be associated with an elastic IP address (EIP). If you use an Internet-facing ALB instance, you are charged instance fees and bandwidth or data transfer fees for the associated EIPs. For more information, see Pay-as-you-go. Intranet: Private network. The ALB instance provides services within a VPC and cannot be accessed over the Internet. |
spec.config.zoneMappings | Yes | The IDs of the vSwitches that are associated with the ALB instance. For more information about how to create a vSwitch, see Create and manage a vSwitch. Note: The specified vSwitches must be deployed in the zones supported by the ALB instance and in the same VPC as the cluster. For more information about regions and zones supported by ALB, see Regions and zones in which ALB is available. ALB supports multi-zone deployment. If the current region supports two or more zones, select vSwitches in at least two zones to ensure high availability. |
spec.listeners | No | The listener port and protocol of the ALB instance. The example provided in this topic configures an HTTP listener on port 8001. A listener defines how ALB receives traffic. We recommend that you retain the listener configuration. Otherwise, you must create a listener before you can use ALB Ingresses. |
Run the following command to deploy the gateway.yaml file to create an ALB multi-cluster gateway and an IngressClass:
kubectl apply -f gateway.yaml
Wait 1 to 3 minutes and run the following command to check whether the ALB multi-cluster gateway is created:
kubectl get albconfig ackone-gateway-demo
Expected output:
NAME                  ALBID      DNSNAME                                PORT&PROTOCOL   CERTID   AGE
ackone-gateway-demo   alb-xxxx   alb-xxxx.<regionid>.alb.aliyuncs.com                            4d9h
Run the following command to check whether the associated clusters are connected to the gateway:
kubectl get albconfig ackone-gateway-demo -ojsonpath='{.status.loadBalancer.subClusters}'
The IDs of the associated clusters are returned in the output.
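The check above can be automated with a small helper. This is a sketch under stated assumptions: check_subclusters is a hypothetical name, and because the exact format of the subClusters status field is not shown in this topic, the helper only tests whether each expected cluster ID occurs somewhere in the returned text.

```shell
# Hypothetical helper: the first argument is the text returned by the
# kubectl jsonpath query above; the remaining arguments are the cluster IDs
# that are expected to be attached to the gateway.
check_subclusters() {
  status_text=$1
  shift
  for cluster_id in "$@"; do
    case $status_text in
      *"$cluster_id"*) ;;                                # ID found, keep going
      *) echo "missing: $cluster_id"; return 1 ;;        # ID absent, stop
    esac
  done
  echo "all clusters attached"
}
```

A possible invocation passes the command output directly: check_subclusters "$(kubectl get albconfig ackone-gateway-demo -ojsonpath='{.status.loadBalancer.subClusters}')" <cluster-1> <cluster-2>.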
Step 3: Use Ingresses to implement zone-disaster recovery
Multi-cluster gateways use Ingresses to manage traffic across clusters. You can create Ingress objects on the ACK One Fleet instance to implement active zone-redundancy.
Create a namespace named gateway-demo, which is the same as the namespace of the Service that you deployed in the preceding steps.
Create a file named ingress-demo.yaml and copy the following content to the file.
Note: The sum of all weights specified in the alb.ingress.kubernetes.io/cluster-weight annotations must be 100. The /svc1 forwarding rule under the domain name alb.ingress.alibaba.com exposes the backend Service named service1. Replace ${cluster1-id} and ${cluster2-id} with the actual cluster IDs.

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    alb.ingress.kubernetes.io/listen-ports: |
      [{"HTTP": 8001}]
    alb.ingress.kubernetes.io/cluster-weight.${cluster1-id}: "20"
    alb.ingress.kubernetes.io/cluster-weight.${cluster2-id}: "80"
  name: web-demo
  namespace: gateway-demo
spec:
  ingressClassName: alb
  rules:
    - host: alb.ingress.alibaba.com
      http:
        paths:
          - path: /svc1
            pathType: Prefix
            backend:
              service:
                name: service1
                port:
                  number: 80

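Because the cluster weights must add up to 100, it can help to check the manifest before you apply it. The following is a minimal sketch: sum_cluster_weights is a hypothetical helper, and it assumes that each weight annotation sits on its own line with the value in double quotes, as in the manifest above.

```shell
# Hypothetical helper: print the sum of all cluster-weight annotation values
# found in the manifest file passed as the first argument.
# Splitting each line on double quotes makes field 2 the quoted weight.
sum_cluster_weights() {
  awk -F'"' '
    /alb\.ingress\.kubernetes\.io\/cluster-weight\./ { total += $2 }
    END { print total + 0 }
  ' "$1"
}
```

For the manifest above, sum_cluster_weights ingress-demo.yaml prints 100. Any other value indicates that the weights need to be adjusted before deployment.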
Run the following command to deploy the Ingress on the ACK One Fleet instance:
kubectl apply -f ingress-demo.yaml -n gateway-demo
Step 4: Verify active zone-redundancy
Forward traffic to different clusters by ratio
Run the following command to access the web application:
curl -H "host: alb.ingress.alibaba.com" alb-xxxx.<regionid>.alb.aliyuncs.com:<listeners port>/svc1
You need to configure the following parameters.
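Parameter | Description |
alb-xxxx.<regionid>.alb.aliyuncs.com | Set the value to the domain name of the ALB instance, which is shown in the DNSNAME column of the output in Step 2. |
<listeners port> | Set the value to 8001, which is the value specified in the AlbConfig and in the alb.ingress.kubernetes.io/listen-ports annotation of the Ingress. |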
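Run the following command to send 500 requests and save the responses to res.txt. Replace the domain name in the command with the domain name of your ALB instance. The responses show that approximately 20% of traffic is forwarded to Cluster 1 (poc-ack-1) and approximately 80% of traffic is forwarded to Cluster 2 (poc-ack-2), which matches the weights configured in the Ingress.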
for i in {1..500}; do curl -H "host: alb.ingress.alibaba.com" alb-xxxx.cn-beijing.alb.aliyuncs.com:8001/svc1; done > res.txt
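To read the ratio from res.txt without counting by hand, the responses can be tallied per cluster. The following sketch assumes that each response body contains the name of the cluster that served it (poc-ack-1 or poc-ack-2 in this example); the response format is not shown in this topic, so adjust the names to what your application returns. tally_clusters is a hypothetical helper name.

```shell
# Hypothetical helper: the first argument is the results file; the remaining
# arguments are cluster names. For each name, print
# "<name> <occurrences> <share of all counted occurrences>%".
tally_clusters() {
  tally_file=$1
  shift
  tally_total=0
  # First pass: total number of occurrences across all given names.
  for tally_name in "$@"; do
    tally_hits=$(grep -o -F -- "$tally_name" "$tally_file" | wc -l | tr -d ' ')
    tally_total=$((tally_total + tally_hits))
  done
  # Second pass: per-name count and integer percentage.
  for tally_name in "$@"; do
    tally_hits=$(grep -o -F -- "$tally_name" "$tally_file" | wc -l | tr -d ' ')
    if [ "$tally_total" -gt 0 ]; then
      echo "$tally_name $tally_hits $((tally_hits * 100 / tally_total))%"
    else
      echo "$tally_name 0 0%"
    fi
  done
}
```

Run it as tally_clusters res.txt poc-ack-1 poc-ack-2. Occurrences are counted rather than lines, so the tally still works if the responses do not end with a newline. Percentages are rounded down to whole numbers.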
Automatically and seamlessly switch traffic when a fault occurs in one cluster
Run the following command, which sends one request per second. While the command is running, decrease the number of application pods in Cluster 2 to 0. After the change takes effect, traffic is automatically and seamlessly switched to Cluster 1.
for i in {1..500}; do curl -H "host: alb.ingress.alibaba.com" alb-xxxx.cn-beijing.alb.aliyuncs.com:8001/svc1; sleep 1; done
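One way to confirm that the switchover is seamless is to record the HTTP status code of every request and count the requests that did not return 200. The following is a sketch only: codes.txt and count_failures are hypothetical names, and the commented loop is a variant of the command above that uses the curl options -s, -o, and -w to print just the status code.

```shell
# Variant of the loop above that appends one HTTP status code per request
# to codes.txt (hypothetical file name). curl reports 000 when it cannot
# get a response at all.
#   for i in $(seq 1 500); do
#     curl -s -o /dev/null -w '%{http_code}\n' \
#       -H "host: alb.ingress.alibaba.com" \
#       alb-xxxx.cn-beijing.alb.aliyuncs.com:8001/svc1 >> codes.txt
#     sleep 1
#   done

# Hypothetical helper: print the number of recorded requests whose status
# code is anything other than 200.
count_failures() {
  awk '$0 != "200" { n++ } END { print n + 0 }' "$1"
}
```

If count_failures codes.txt prints 0 after the pods in Cluster 2 have been scaled down, every request during the failover was answered successfully.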