Container Service for Kubernetes: Build a zone-disaster recovery system

Last Updated: Nov 26, 2024

The Application Load Balancer (ALB) multi-cluster gateways of Distributed Cloud Container Platform for Kubernetes (ACK One) can be used together with ACK One GitOps or the multi-cluster application distribution feature to quickly implement zone-disaster recovery. This allows you to ensure the high availability of your business and automatically switch traffic in a seamless manner when a fault occurs. This topic describes how to build a zone-disaster recovery system by using multi-cluster gateways.

Disaster recovery overview

Disaster recovery solutions in the cloud can be classified into the following types:

  • Zone-disaster recovery: This solution includes active zone-redundancy and primary/secondary disaster recovery. Because the network latency between data centers in the same region is low, zone-disaster recovery is suitable for protecting data against zone-level events, such as fires, network interruptions, and power outages. Its backup and restore methods are simple, yet it covers most common scenarios.

  • Active geo-redundancy: The network latency between data centers is higher in this solution because the data centers are in different regions. However, this solution can efficiently protect data against region-level disasters, such as floods and earthquakes.

  • Disaster recovery based on three data centers across two regions: This solution combines the benefits of zone-disaster recovery and active geo-redundancy. It is suitable for scenarios where you need to ensure the continuity and availability of applications.

In most cases, the business architecture of an enterprise can be divided into the following layers from the top down: access layer, application layer, and data layer.

  • Access layer: serves as an entry point for ingress traffic. This layer routes ingress traffic to the backend application layer based on forwarding rules.

  • Application layer: hosts applications. This layer processes ingress traffic and sends the results back to the upper layer.

  • Data layer: stores data. This layer provides data and storage services for the application layer.

When you build a disaster recovery system for your business, you need to enforce recovery measures on each layer.

  • Access layer: ACK One uses multi-cluster gateways to build the access layer. The multi-cluster gateways of ACK One support zone-disaster recovery. Therefore, the access layer built on ACK One is highly available.

  • Application layer: ACK One uses multi-cluster gateways to implement disaster recovery on the application layer. The multi-cluster gateways of ACK One support active zone-redundancy, primary/secondary disaster recovery, and geo-redundancy.

  • Data layer: Disaster recovery and data synchronization on the data layer depend on the middleware that you use.

Benefits

Disaster recovery by using the multi-cluster gateways of ACK One has the following advantages over disaster recovery by using DNS traffic distribution:

  • Disaster recovery by using DNS traffic distribution requires multiple load balancer IP addresses (one IP address for each cluster). Disaster recovery by using multi-cluster gateways uses only one load balancer IP address in one region and uses multi-zone deployment in the same region by default to ensure high availability.

  • Disaster recovery by using multi-cluster gateways supports request forwarding at Layer 7, while disaster recovery by using DNS traffic distribution does not support this feature.

  • In a disaster recovery system that uses DNS traffic distribution, clients usually cache DNS query results. When the IP address is switched, clients continue to use the cached address until the cache expires, which causes temporary service interruptions. Disaster recovery by using multi-cluster gateways avoids this problem by seamlessly failing over to the backend pods in another cluster.

  • Multi-cluster gateways are region-level gateways. Therefore, you can complete all the operations on a Fleet instance without the need to install an Ingress controller or create Ingresses in each Container Service for Kubernetes (ACK) cluster. This helps you manage traffic in a region and reduce multi-cluster management costs.

Architecture

In this example, a web application is used to show how to use ALB multi-cluster gateways to implement zone-disaster recovery. The web application consists of a Deployment and a Service. The following figure shows the architecture of the zone-disaster recovery system.

[Figure: architecture of the zone-disaster recovery system]

  • Create Cluster 1 in AZ 1 and Cluster 2 in AZ 2 of the China (Hong Kong) region.

  • Use ACK One GitOps to distribute the application to Cluster 1 and Cluster 2.

  • Use an AlbConfig to create an ALB multi-cluster gateway on the ACK One Fleet instance.

  • After the ALB multi-cluster gateway is created, you can configure Ingress rules to route traffic to the clusters based on weights and request headers. When one of the clusters is down, traffic is automatically switched to the other cluster.

  • Data synchronization based on ApsaraDB RDS has middleware dependencies.

Prerequisites

  • ALB is activated.

  • The Fleet management feature is enabled. For more information, see Enable multi-cluster management.

  • The ACK One Fleet instance is associated with two ACK clusters that are deployed in the same virtual private cloud (VPC) as the ACK One Fleet instance. For more information, see Manage associated clusters.

Step 1: Use GitOps or the application distribution feature to distribute an application to multiple clusters

ACK One allows you to use GitOps or the application distribution feature to distribute an application to multiple clusters. For more information, see Getting started with GitOps, Create a multi-cluster application, and Get started with application distribution. In this step, GitOps is used.

  1. Log on to the ACK One console. In the left-side navigation pane, choose Fleet > Multi-cluster Applications.

  2. In the upper-left corner of the Multi-cluster Applications page, click the icon to the right of the Fleet instance name and select your Fleet instance from the drop-down list.

  3. Choose Create Multi-cluster Application > GitOps to go to the Create Multi-cluster Application - GitOps page.

  4. On the Create from YAML tab, copy the following YAML template to the code editor. Then, click OK to deploy the application.

    Note

    The following YAML template is used to deploy an application named web-demo to each associated cluster. You can also select the clusters where you want to deploy the application on the Quick Create tab. The configuration changes that you make on the Quick Create tab will be automatically synchronized to the YAML template on the Create from YAML tab.

    apiVersion: argoproj.io/v1alpha1
    kind: ApplicationSet
    metadata:
      name: appset-web-demo
      namespace: argocd
    spec:
      template:
        metadata:
          name: '{{.metadata.annotations.cluster_id}}-web-demo'
          namespace: argocd
        spec:
          destination:
            name: '{{.name}}'
            namespace: gateway-demo
          project: default
          source:
            repoURL: https://github.com/AliyunContainerService/gitops-demo.git
            path: manifests/helm/web-demo
            targetRevision: main
            helm:
              valueFiles:
                - values.yaml
              parameters:
                - name: envCluster
                  value: '{{.metadata.annotations.cluster_name}}'
          syncPolicy:
            automated: {}
            syncOptions:
              - CreateNamespace=true
      generators:
        - clusters:
            selector:
              matchExpressions:
                - values:
                    - cluster
                  key: argocd.argoproj.io/secret-type
                  operator: In
                - values:
                    - in-cluster
                  key: name
                  operator: NotIn
      goTemplateOptions:
        - missingkey=error
      syncPolicy:
        preserveResourcesOnDeletion: false
      goTemplate: true

Step 2: Use kubectl to deploy an ALB multi-cluster gateway from the ACK One Fleet instance

You can use an AlbConfig to create an ALB multi-cluster gateway from the ACK One Fleet instance and associate clusters with the gateway.

  1. Obtain the IDs of two vSwitches that belong to the VPC where the ACK One Fleet instance resides.

  2. Create a file named gateway.yaml and copy the following content to the file.

    Note
    • Replace ${vsw-id1} and ${vsw-id2} with the vSwitch IDs obtained from the preceding step, and replace ${cluster1} and ${cluster2} with the IDs of the associated clusters you want to add.

    • For associated clusters ${cluster1} and ${cluster2}, you must configure the inbound rules of their security group to allow access from all IP addresses and ports of the vSwitch CIDR block.

    apiVersion: alibabacloud.com/v1
    kind: AlbConfig
    metadata:
      name: ackone-gateway-demo
      annotations:
        # Specify the IDs of the clusters that you want to associate with the ALB instance. 
        alb.ingress.kubernetes.io/remote-clusters: ${cluster1},${cluster2}
    spec:
      config:
        name: one-alb-demo
        addressType: Internet
        addressAllocatedMode: Fixed
        zoneMappings:
        - vSwitchId: ${vsw-id1}
        - vSwitchId: ${vsw-id2}
      listeners:
      - port: 8001
        protocol: HTTP
    ---
    apiVersion: networking.k8s.io/v1
    kind: IngressClass
    metadata:
      name: alb
    spec:
      controller: ingress.k8s.alibabacloud/alb
      parameters:
        apiGroup: alibabacloud.com
        kind: AlbConfig
        name: ackone-gateway-demo

    You need to configure the following parameters.

    • metadata.name (Required: Yes): The name of the AlbConfig.

    • metadata.annotations: alb.ingress.kubernetes.io/remote-clusters (Required: Yes): The list of associated clusters to be added to the ALB multi-cluster gateway. The cluster IDs listed here have been associated with the Fleet instance.

    • spec.config.name (Required: No): The name of the ALB instance.

    • spec.config.addressType (Required: No): The network type of the ALB instance. Valid values:

      • Internet (default): Public network. The ALB instance provides services to the Internet and is accessible over the Internet.

        Note

        To allow an ALB instance to provide Internet-facing services, the ALB instance needs to be associated with an elastic IP address (EIP). If you use an Internet-facing ALB instance, you are charged instance fees and bandwidth or data transfer fees for the associated EIPs. For more information, see Pay-as-you-go.

      • Intranet: Private network. The ALB instance provides services within a VPC and cannot be accessed over the Internet.

    • spec.config.zoneMappings (Required: Yes): The IDs of the vSwitches that are associated with the ALB instance. For more information about how to create a vSwitch, see Create and manage a vSwitch.

      Note
      • The specified vSwitches must be deployed in the zones supported by the ALB instance and deployed in the same VPC as the cluster. For more information about regions and zones supported by ALB, refer to Regions and zones in which ALB is available.

      • ALB supports multi-zone deployment. If the current region supports two or more zones, select vSwitches in at least two zones to ensure high availability.

    • spec.listeners (Required: No): The listener port and protocol of the ALB instance. The example provided in this topic configures an HTTP listener on port 8001.

      A listener defines how ALB receives traffic. We recommend that you retain the listener configuration. Otherwise, you must create a listener before you can use ALB Ingresses.

  3. Run the following command to deploy the gateway.yaml file to create an ALB multi-cluster gateway and an IngressClass:

    kubectl apply -f gateway.yaml

  4. Wait 1 to 3 minutes, and then run the following command to check whether the ALB multi-cluster gateway is created:

    kubectl get albconfig ackone-gateway-demo

    Expected output:

    NAME                  ALBID      DNSNAME                                PORT&PROTOCOL   CERTID   AGE
    ackone-gateway-demo   alb-xxxx   alb-xxxx.<regionid>.alb.aliyuncs.com                            4d9h

  5. Run the following command to check whether the associated clusters are connected to the gateway:

    kubectl get albconfig ackone-gateway-demo -ojsonpath='{.status.loadBalancer.subClusters}'

    The IDs of the associated clusters are returned in the output.
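
The placeholder replacement described in the note of step 2 can also be scripted. The following is a minimal sketch, not part of the official procedure: render_gateway is a hypothetical helper name, and the IDs in the usage comment are made-up sample values. The placeholder names contain hyphens, so they are not valid shell variable names, which is why the sketch uses sed with literal patterns instead of shell variable expansion.

```shell
# Hypothetical helper: fill the ${...} placeholders of gateway.yaml with real IDs.
# $1 = template file
# $2 = first vSwitch ID, $3 = second vSwitch ID
# $4 = first cluster ID, $5 = second cluster ID
# The rendered YAML is written to standard output; the template is not modified.
render_gateway() {
  sed \
    -e 's|\${vsw-id1}|'"$2"'|g' \
    -e 's|\${vsw-id2}|'"$3"'|g' \
    -e 's|\${cluster1}|'"$4"'|g' \
    -e 's|\${cluster2}|'"$5"'|g' \
    "$1"
}

# Example usage (sample IDs, replace with your own):
#   render_gateway gateway.yaml vsw-aaa vsw-bbb c1111 c2222 > gateway.rendered.yaml
#   kubectl apply -f gateway.rendered.yaml
```

The sketch assumes that the IDs do not contain the characters | or &, which have a special meaning in the sed replacement.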

Step 3: Use Ingresses to implement zone-disaster recovery

Multi-cluster gateways use Ingresses to manage traffic across clusters. You can create Ingress objects on the ACK One Fleet instance to implement active zone-redundancy.

  1. Create a namespace named gateway-demo, which is the same as the namespace where the Service that you created in the preceding steps resides.

  2. Create a file named ingress-demo.yaml and copy the following content to the file.

    Note
    • The sum of all weights specified in the alb.ingress.kubernetes.io/cluster-weight annotation must be 100.

    • The /svc1 forwarding rule below the domain name alb.ingress.alibaba.com is used to expose the backend Service named service1. Replace ${cluster1-id} and ${cluster2-id} with the actual cluster IDs.

    apiVersion: networking.k8s.io/v1
    kind: Ingress
    metadata:
      annotations:
        alb.ingress.kubernetes.io/listen-ports: |
         [{"HTTP": 8001}]
        alb.ingress.kubernetes.io/cluster-weight.${cluster1-id}: "20"
        alb.ingress.kubernetes.io/cluster-weight.${cluster2-id}: "80"
      name: web-demo
      namespace: gateway-demo
    spec:
      ingressClassName: alb
      rules:
      - host: alb.ingress.alibaba.com
        http:
          paths:
          - path: /svc1
            pathType: Prefix
            backend:
              service:
                name: service1
                port:
                  number: 80

  3. Run the following command to deploy the Ingress on the ACK One Fleet instance:

    kubectl apply -f ingress-demo.yaml -n gateway-demo
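
Because the cluster weights must add up to 100, it can help to check the values before you apply the Ingress. The following is a minimal sketch of such a check. check_cluster_weights is a hypothetical helper name, and treating each weight as a non-negative integer is an assumption of this sketch, not a documented rule of the annotation.

```shell
# Hypothetical pre-flight check: succeed only if all arguments are
# non-negative integers (an assumption of this sketch) that sum to 100.
check_cluster_weights() {
  total=0
  for w in "$@"; do
    case "$w" in
      # Reject empty values and anything that contains a non-digit character.
      ''|*[!0-9]*) echo "invalid weight: $w" >&2; return 1 ;;
    esac
    total=$((total + w))
  done
  if [ "$total" -ne 100 ]; then
    echo "weights sum to $total, expected 100" >&2
    return 1
  fi
}

# Example usage with the weights from ingress-demo.yaml:
#   check_cluster_weights 20 80 && kubectl apply -f ingress-demo.yaml -n gateway-demo
```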

Step 4: Verify active zone-redundancy

Forward traffic to different clusters by ratio

Run the following command to access the web application:

curl -H "host: alb.ingress.alibaba.com" alb-xxxx.<regionid>.alb.aliyuncs.com:<listeners port>/svc1

You need to configure the following parameters.

  • alb-xxxx.<regionid>.alb.aliyuncs.com: Set the value to the domain name in the DNSNAME column in the AlbConfig details you obtained in Step 2.

  • <listeners port>: Set the value to 8001, which is the value specified in the AlbConfig configurations and the annotations of the Ingress configurations.

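Run the following command to send 500 requests and save the responses to res.txt. Replace the domain name with the DNSNAME value of your ALB instance. In this example, the output shows that 20% of the traffic is forwarded to Cluster 1 (poc-ack-1) and 80% of the traffic is forwarded to Cluster 2 (poc-ack-2).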

for i in {1..500}; do curl -H "host: alb.ingress.alibaba.com" alb-xxxx.cn-beijing.alb.aliyuncs.com:8001/svc1; done > res.txt
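
To turn res.txt into per-cluster ratios, you can count how often each cluster name appears in it. The following is a minimal sketch. tally_clusters is a hypothetical helper name, and the sketch assumes that each response body contains the name of the serving cluster (poc-ack-1 or poc-ack-2 in this example) exactly once. The actual response format depends on the demo application and may differ.

```shell
# Hypothetical helper: count occurrences of each cluster name in a results file
# and print "<name> <hits> <percent>%" per cluster.
# $1 = results file, remaining arguments = cluster names to look for.
tally_clusters() {
  file=$1
  shift
  # First pass: total number of occurrences across all given names.
  total=0
  for name in "$@"; do
    hits=$(grep -o -F -- "$name" "$file" | wc -l | tr -d ' ')
    total=$((total + hits))
  done
  # Second pass: print the count and the integer percentage for each name.
  for name in "$@"; do
    hits=$(grep -o -F -- "$name" "$file" | wc -l | tr -d ' ')
    if [ "$total" -gt 0 ]; then
      pct=$((hits * 100 / total))
    else
      pct=0
    fi
    echo "$name $hits $pct%"
  done
}

# Example usage:
#   tally_clusters res.txt poc-ack-1 poc-ack-2
```

Because the gateway distributes requests by weight, the observed ratios of a 500-request sample can deviate slightly from the configured 20 and 80.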

Automatically and seamlessly switch traffic when a fault occurs in one cluster

Run the following command to send one request per second. While the loop is running, decrease the number of application pods in Cluster 2 to 0. After the change takes effect, traffic is automatically switched to Cluster 1 in a seamless manner.

for i in {1..500}; do curl -H "host: alb.ingress.alibaba.com" alb-xxxx.cn-beijing.alb.aliyuncs.com:8001/svc1; sleep 1; done
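
To check that the switchover is seamless, you can record the HTTP status code of each request instead of the response body and count the requests that did not return 200. The following is a minimal sketch. probe and count_failures are hypothetical helper names, and treating every status code other than 200 as a failure is an assumption of this sketch. The curl options used (-s, -o, -m, -H, -w with %{http_code}) are standard curl options.

```shell
# Hypothetical helper: send one request and print only its HTTP status code.
# $1 = DNS name and listener port of the ALB instance, for example
#      alb-xxxx.<regionid>.alb.aliyuncs.com:8001
# curl prints 000 if no HTTP response was received within 5 seconds.
probe() {
  curl -s -o /dev/null -m 5 \
    -H "host: alb.ingress.alibaba.com" \
    -w '%{http_code}\n' \
    "http://$1/svc1"
}

# Hypothetical helper: count the lines of a status-code log that are not exactly 200.
# $1 = log file with one status code per line.
count_failures() {
  grep -c -v -x '200' "$1" || true
}

# Example usage (scale down the pods in Cluster 2 while the loop is running):
#   for i in $(seq 1 500); do probe "<ALB DNS name>:8001"; sleep 1; done > codes.txt
#   count_failures codes.txt
```

A result of 0 indicates that no request failed during the switchover within the sampled interval.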
