Multi-cluster gateways are cloud-native gateways provided by Distributed Cloud Container Platform for Kubernetes (ACK One) to manage north-south traffic in multi-cloud and multi-cluster scenarios. Multi-cluster gateways use fully-managed Microservices Engine (MSE) Ingresses and Ingress APIs to manage north-south traffic at Layer 7. You can use multi-cluster gateways to implement zone-disaster recovery and verify canary versions based on request headers. This simplifies multi-cluster application O&M and reduces costs. By using the Continuous Delivery (CD) capability of ACK One GitOps, you can deploy applications across clusters and build an active zone-redundancy or primary/secondary disaster recovery system. This solution does not include disaster recovery for data services.
Disaster recovery overview
Disaster recovery solutions in the cloud include:
Zone-disaster recovery
Zone-disaster recovery includes active zone-redundancy and primary/secondary disaster recovery. The network latency between data centers located in the same region is low. Therefore, zone-disaster recovery is suitable for protecting data against zone-level hazardous events, such as fires, network interruptions, and power outages.
Active geo-redundancy
The network latency between data centers will be higher if the active geo-redundancy solution is used. However, this solution can efficiently protect data against region-level disasters, such as floods and earthquakes.
Three data centers across two regions
Disaster recovery based on three data centers across two regions provides the benefits of both zone-disaster recovery and active geo-redundancy. This solution is suitable for scenarios where you need to ensure the continuity and availability of applications.
Compared with active geo-redundancy, the implementation of zone-disaster recovery is easier and therefore still plays an important role in disaster recovery.
Benefits
Disaster recovery by using multi-cluster gateways has the following advantages over disaster recovery by using DNS traffic distribution:
Disaster recovery by using DNS traffic distribution requires multiple load balancer IP addresses (one IP address for each cluster). Disaster recovery by using multi-cluster gateways uses only one load balancer IP address in one region and uses multi-zone deployment in the same region by default to ensure high availability.
Disaster recovery by using multi-cluster gateways supports request forwarding at Layer 7, while disaster recovery by using DNS traffic distribution does not support this feature.
In most cases, clients cache DNS query results. In a disaster recovery system that uses DNS traffic distribution, clients can therefore keep sending requests to the old IP address after the IP address is switched, which causes temporary service interruptions. Disaster recovery by using multi-cluster gateways avoids this problem by seamlessly failing over to the backend pods in another cluster.
Multi-cluster gateways are region-level gateways. Therefore, you can complete all the operations on a Fleet instance, including creating gateways and Ingresses. You do not need to install an Ingress controller or create Ingresses in each Container Service for Kubernetes (ACK) cluster. This helps you manage traffic in a region with reduced multi-cluster management costs.
Architecture
Multi-cluster gateways provided by ACK One route traffic based on fully-managed MSE Ingresses. You can use multi-cluster gateways together with the multi-cluster application distribution feature of ACK One GitOps to quickly build a zone-disaster recovery system. In this example, GitOps is used to deploy an application in ACK Cluster 1 and ACK Cluster 2 in the China (Hong Kong) region to implement active zone-redundancy and primary/secondary disaster recovery. Cluster 1 and Cluster 2 are deployed in two different zones of the region.
The application is a web application that uses Deployment and Service resources. A multi-cluster gateway is used to implement active zone-redundancy and primary/secondary disaster recovery, as shown in the following figure.
Use the MseIngressConfig object in an ACK One Fleet instance to create an MSE gateway.
Create Cluster 1 and Cluster 2 in AZ 1 and AZ 2 in the China (Hong Kong) region.
Use ACK One GitOps to distribute the application to Cluster 1 and Cluster 2.
After a multi-cluster gateway is created, you can configure Ingress rules to route traffic to the clusters based on weights and request headers. When one of the clusters is down, traffic is automatically switched to the other cluster.
Prerequisites
The Fleet management feature is enabled. For more information, see Enable multi-cluster management.
The ACK One Fleet instance is associated with two ACK clusters that are deployed in the same virtual private cloud (VPC) as the ACK One Fleet instance. For more information, see Associate clusters with a Fleet instance.
The kubeconfig file of the Fleet instance is obtained in the Distributed Cloud Container Platform for Kubernetes (ACK One) console and a kubectl client is connected to the Fleet instance.
The multi-cluster gateway feature is enabled.
Note: For more information about the billing of multi-cluster gateways, see Billing overview.
A namespace is created on the ACK One Fleet instance. The namespace is the same as the namespace of the applications deployed in associated clusters. In this example, the namespace is gateway-demo.
Step 1: Use GitOps to distribute an application to multiple clusters
Use the Argo CD UI to deploy an application
Log on to the ACK One console. In the left-side navigation pane, choose .
On the GitOps page of the Fleet instance, click GitOps Console to go to the GitOps console.
Note: If GitOps is not enabled for the ACK One Fleet instance, click Enable GitOps to log on to the GitOps console.
For more information about how to enable Internet access to GitOps, see Enable public access to Argo CD.
Add an application repository.
In the left-side navigation pane of the Argo CD UI, click Settings and then choose .
In the panel that appears, configure the following parameters and click CONNECT.
Section | Parameter | Value
Choose your connection method | | VIA HTTPS
CONNECT REPO USING HTTPS | Type | git
CONNECT REPO USING HTTPS | Project | default
CONNECT REPO USING HTTPS | Repository URL | https://github.com/AliyunContainerService/gitops-demo.git
CONNECT REPO USING HTTPS | Skip server verification | Select the check box.
After the Git repository is connected, CONNECTION STATUS displays Successful.
Create an application.
On the Applications page of Argo CD, click + NEW APP and configure the following parameters.
Section | Parameter | Description
GENERAL | Application Name | The application name. You can specify a custom application name.
GENERAL | Project Name | default
GENERAL | SYNC POLICY | Select a synchronization policy based on your requirements. Valid values: Manual: When changes are made to the image in the Git repository, you need to manually synchronize the changes to the destination cluster. Automatic: Argo CD checks the Git repository for image changes every three minutes and automatically synchronizes the changes to the destination cluster.
GENERAL | SYNC OPTIONS | Select AUTO-CREATE NAMESPACE.
SOURCE | Repository URL | Select a Git repository from the drop-down list and select the URL https://github.com/AliyunContainerService/gitops-demo.git that you added in the preceding step.
SOURCE | Revision | Branches: Select gateway-demo in the Branches drop-down list.
SOURCE | Path | manifests/helm/web-demo
DESTINATION | Cluster URL | Select the URL of Cluster 1 or Cluster 2. You can also click the URL drop-down list on the right side and select NAME to select a URL based on the cluster name.
DESTINATION | Namespace | gateway-demo is used in this example. Application resources (Services and Deployments) will be created in this namespace.
Helm | Parameters | Set envCluster to cluster-demo-1 or cluster-demo-2 to specify the backend pods in a cluster to process requests.
Use the Argo CD CLI to deploy an application
Enable GitOps on the ACK One Fleet instance. For more information, see Enable GitOps for the Fleet instance.
Access Argo CD. For more information, see Use the Argo CD CLI to log on to Argo CD.
Create and deploy an application.
Run the following command to add a Git repository:
argocd repo add https://github.com/AliyunContainerService/gitops-demo.git --name ackone-gitops-demos
Expected output:
Repository 'https://github.com/AliyunContainerService/gitops-demo.git' added
Run the following command to query Git repositories:
argocd repo list
Expected output:
TYPE  NAME  REPO                                                       INSECURE  OCI    LFS    CREDS  STATUS      MESSAGE  PROJECT
git         https://github.com/AliyunContainerService/gitops-demo.git  false     false  false  false  Successful           default
Run the following command to query clusters:
argocd cluster list
Expected output:
SERVER                          NAME                    VERSION  STATUS      MESSAGE                                                  PROJECT
https://1.1.XX.XX:6443          c83f3cbc90a****-temp01  1.22+    Successful
https://2.2.XX.XX:6443          c83f3cbc90a****-temp02  1.22+    Successful
https://kubernetes.default.svc  in-cluster                       Unknown     Cluster has no applications and is not being monitored.
Use the Application mode to create and deploy the application to the destination cluster.
Create a file named apps-web-demo.yaml and add the following content to the file:
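As a minimal sketch, the file could define one Argo CD Application per cluster. The application names, repository, path, and target revision below are taken from the argocd app list output in this topic. The automated sync policy, the CreateNamespace option, the envCluster Helm parameter (borrowed from the Argo CD UI procedure), and the helper name write_app_manifests are assumptions, not the original manifest. Pass the server URLs reported by argocd cluster list.

```shell
# Sketch: write one Argo CD Application per cluster to a manifest file.
# Usage: write_app_manifests FILE CLUSTER1_SERVER_URL CLUSTER2_SERVER_URL
# write_app_manifests is a hypothetical helper name.
write_app_manifests() {
  out=$1
  : > "$out"   # start from an empty file
  idx=1
  for server in "$2" "$3"; do
    cat >> "$out" <<EOF
---
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: web-demo-cluster${idx}
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/AliyunContainerService/gitops-demo.git
    targetRevision: main
    path: manifests/helm/web-demo
    helm:
      parameters:
        - name: envCluster
          value: cluster-demo-${idx}
  destination:
    server: ${server}
    namespace: gateway-demo
  syncPolicy:
    automated: {}   # assumption: corresponds to the Auto sync policy
    syncOptions:
      - CreateNamespace=true
EOF
    idx=$((idx + 1))
  done
}

# Example:
#   write_app_manifests apps-web-demo.yaml https://1.1.XX.XX:6443 https://2.2.XX.XX:6443
```

Generating both Applications from one template keeps the two definitions identical except for the cluster index and the destination server.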
Replace repoURL with the actual repository URL.
Run the following command to deploy the application:
kubectl apply -f apps-web-demo.yaml
Run the following command to query applications:
argocd app list
Expected output:
# app list
NAME                      CLUSTER                  NAMESPACE  PROJECT  STATUS  HEALTH   SYNCPOLICY  CONDITIONS  REPO                                                       PATH                     TARGET
argocd/web-demo-cluster1  https://10.1.XX.XX:6443             default  Synced  Healthy  Auto        <none>      https://github.com/AliyunContainerService/gitops-demo.git  manifests/helm/web-demo  main
argocd/web-demo-cluster2  https://10.1.XX.XX:6443             default  Synced  Healthy  Auto        <none>      https://github.com/AliyunContainerService/gitops-demo.git  manifests/helm/web-demo  main
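The check on this output can be scripted. The following is a minimal sketch, assuming the table layout shown above, in which an application counts as ready only when its row contains both Synced and Healthy. The function name check_apps_ready is hypothetical.

```shell
# Sketch: read `argocd app list` output on stdin and flag applications that
# are not both Synced and Healthy. Returns non-zero if any application fails.
check_apps_ready() {
  awk '
    /^[[:space:]]*$/ || /^#/ || /^NAME/ { next }   # skip blanks, comments, header
    {
      synced = 0; healthy = 0
      for (i = 2; i <= NF; i++) {
        if ($i == "Synced") { synced = 1 }
        if ($i == "Healthy") { healthy = 1 }
      }
      if (synced && healthy) {
        print "ready: " $1
      } else {
        print "not ready: " $1
        bad = 1
      }
    }
    END { exit bad + 0 }
  '
}

# Example:
#   argocd app list | check_apps_ready
```

Matching whole fields instead of fixed column positions keeps the check working even when a column such as NAMESPACE is empty.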
Step 2: Use kubectl to deploy a multi-cluster gateway from the ACK One Fleet instance
Create an MseIngressConfig object on the ACK One Fleet instance to deploy a multi-cluster gateway and then connect the associated clusters to the gateway.
Obtain and record the vSwitch ID of the ACK One Fleet instance. For more information, see Obtain a vSwitch ID.
Create a file named gateway.yaml and add the following content to the file.
Note: Replace ${vsw-id} with the vSwitch ID obtained in the preceding step, and replace ${cluster1} and ${cluster2} with the IDs of the associated clusters that you want to add. For the associated clusters ${cluster1} and ${cluster2}, you must configure the inbound rules of their security groups to allow access from all IP addresses and ports of the vSwitch CIDR block.
apiVersion: mse.alibabacloud.com/v1alpha1
kind: MseIngressConfig
metadata:
  annotations:
    mse.alibabacloud.com/remote-clusters: ${cluster1},${cluster2}
  name: ackone-gateway-hongkong
spec:
  common:
    instance:
      replicas: 3
      spec: 2c4g
    network:
      vSwitches:
      - ${vsw-id}
  ingress:
    local:
      ingressClass: mse
  name: mse-ingress
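The placeholders do not have to be edited by hand. The following is a minimal sketch that fills them in with sed, assuming the placeholder names used in the YAML above and IDs that contain no | or & characters. The function name render_gateway and the template file name gateway.yaml.tpl are hypothetical.

```shell
# Sketch: fill the placeholders of a gateway template and print the result.
# Usage: render_gateway TEMPLATE CLUSTER1_ID CLUSTER2_ID VSWITCH_ID > gateway.yaml
render_gateway() {
  # \$ matches a literal dollar sign, so ${cluster1} is treated as plain text.
  sed \
    -e 's|\${cluster1}|'"$2"'|g' \
    -e 's|\${cluster2}|'"$3"'|g' \
    -e 's|\${vsw-id}|'"$4"'|g' \
    "$1"
}

# Example (write to a different file than the template):
#   render_gateway gateway.yaml.tpl c7fb82**** cd3007**** vsw-**** > gateway.yaml
```

Keeping the template separate from the rendered file avoids truncating the input when the output is redirected.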
Parameter | Description
mse.alibabacloud.com/remote-clusters | The clusters to be connected to the multi-cluster gateway. Enter the IDs of clusters that are associated with the ACK One Fleet instance. Separate multiple IDs with commas.
spec.name | The name of the gateway instance.
spec.common.instance.spec | Optional. The specification of the gateway instance. The default value is 4c8g.
spec.common.instance.replicas | Optional. The number of replicated gateway instances. The default value is 3.
spec.ingress.local.ingressClass | Optional. The names of IngressClasses that the multi-cluster gateway listens on. In this example, the multi-cluster gateway listens on all Ingresses whose IngressClasses are mse.
Run the following command to deploy the multi-cluster gateway:
kubectl apply -f gateway.yaml
Run the following command to check whether the multi-cluster gateway is created and listens on Ingresses.
kubectl get mseingressconfig ackone-gateway-hongkong
Expected output:
NAME                      STATUS      AGE
ackone-gateway-hongkong   Listening   3m15s
The output indicates that the gateway is in the Listening state. This means that the multi-cluster gateway is created and running. The gateway listens on Ingresses whose IngressClasses are mse.
The status of a gateway created from an MseIngressConfig changes in the following order: Pending, Running, and Listening. State description:
Pending: The cloud-native gateway is being created. This process may take about 3 minutes.
Running: The cloud-native gateway is created and running.
Listening: The cloud-native gateway is running and listens on Ingresses.
Failed: The cloud-native gateway is invalid. You can check the message in the Status field to troubleshoot the issue.
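The state transitions above can be polled from a script instead of being checked by hand. The following is a minimal sketch, assuming that STATUS is the second column of the default kubectl get output, as in the expected output above. The function name wait_for_listening and the default timeout and interval are hypothetical.

```shell
# Sketch: poll the gateway status until it is Listening.
# Returns 0 on Listening, 1 on Failed or when the timeout is exceeded.
# Usage: wait_for_listening NAME [TIMEOUT_SECONDS] [INTERVAL_SECONDS]
wait_for_listening() {
  name=$1
  timeout=${2:-600}
  interval=${3:-10}
  elapsed=0
  while [ "$elapsed" -le "$timeout" ]; do
    # Without headers, each row is "NAME STATUS AGE"; take the STATUS column.
    status=$(kubectl get mseingressconfig "$name" --no-headers 2>/dev/null | awk '{print $2}')
    echo "status: ${status:-unknown}"
    case "$status" in
      Listening) return 0 ;;
      Failed) return 1 ;;
    esac
    sleep "$interval"
    elapsed=$((elapsed + interval))
  done
  return 1
}

# Example:
#   wait_for_listening ackone-gateway-hongkong 600 10
```

Pending and Running are treated as transient, so the loop only stops early on Listening or Failed.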
Run the following command to check whether the associated cluster is connected to the gateway:
kubectl get mseingressconfig ackone-gateway-hongkong -ojsonpath="{.status.remoteClusters}"
Expected output:
[{"clusterId":"c7fb82****"},{"clusterId":"cd3007****"}]
The output indicates the IDs of the associated clusters and no Failed error is returned. This means that the associated clusters are connected to the multi-cluster gateway.
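This check can also be scripted. The following is a minimal sketch, assuming that the jsonpath query returns compact JSON in the form shown in the expected output above. The function name clusters_connected is hypothetical.

```shell
# Sketch: succeed only if every expected cluster ID appears in
# .status.remoteClusters of the gateway.
# Usage: clusters_connected GATEWAY_NAME CLUSTER_ID...
clusters_connected() {
  name=$1
  shift
  connected=$(kubectl get mseingressconfig "$name" -o jsonpath='{.status.remoteClusters}')
  for id in "$@"; do
    case "$connected" in
      *"\"clusterId\":\"$id\""*) echo "connected: $id" ;;
      *) echo "missing: $id"; return 1 ;;
    esac
  done
}

# Example:
#   clusters_connected ackone-gateway-hongkong c7fb82**** cd3007****
```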
Step 3: Use Ingresses to implement zone-disaster recovery
Multi-cluster gateways use Ingresses to manage traffic across clusters. You can create Ingress objects on the ACK One Fleet instance to implement active zone-redundancy and primary/secondary disaster recovery.
Make sure that the gateway-demo namespace is created on the ACK One Fleet instance. The Ingress objects and the Service objects of the application must belong to the same namespace: gateway-demo.
Active zone-redundancy
Create an Ingress object to implement active zone-redundancy
Create an Ingress object on the ACK One Fleet instance to implement active zone-redundancy.
By default, traffic is distributed to all replicated pods of the application in Cluster 1 and Cluster 2 that are connected to the multi-cluster gateway, so each cluster receives traffic in proportion to its number of replicated pods. In this example, the ratio of replicated pods in Cluster 1 to replicated pods in Cluster 2 is 9:1. Therefore, 90% of traffic is routed to Cluster 1 and 10% of traffic is routed to Cluster 2. When all backend pods in Cluster 1 are down, the gateway automatically routes all traffic to Cluster 2. The following figure shows the topology.
Create a file named ingress-demo.yaml and copy the following content to the file.
In the following code, the /svc1 forwarding rule under the domain name example.com is used to expose the backend Service named service1.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web-demo
spec:
  ingressClassName: mse
  rules:
  - host: example.com
    http:
      paths:
      - path: /svc1
        pathType: Exact
        backend:
          service:
            name: service1
            port:
              number: 80
Run the following command to deploy the Ingress on the ACK One Fleet instance:
kubectl apply -f ingress-demo.yaml -n gateway-demo
Verify the canary version
In active zone-redundancy scenarios, you can use the following method to verify the canary version of an application without affecting your business.
Create an application in an existing cluster. Specify a name and selector for the Service and Deployment that are different from those configured for the original application. Then, verify the canary version of the application based on request headers.
Create a file named new-app.yaml in Cluster 1 and copy the following content to the file. Retain the original configurations except for the configurations of the Service and Deployment.
apiVersion: v1
kind: Service
metadata:
  name: service1-canary-1
  namespace: gateway-demo
spec:
  ports:
  - port: 80
    protocol: TCP
    targetPort: 8080
  selector:
    app: web-demo-canary-1
  sessionAffinity: None
  type: ClusterIP
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-demo-canary-1
  namespace: gateway-demo
spec:
  replicas: 1
  selector:
    matchLabels:
      app: web-demo-canary-1
  template:
    metadata:
      labels:
        app: web-demo-canary-1
    spec:
      containers:
      - env:
        - name: ENV_NAME
          value: cluster-demo-1-canary
        image: 'registry-cn-hangzhou.ack.aliyuncs.com/acs/web-demo:0.6.0'
        imagePullPolicy: Always
        name: web-demo
Run the following command to deploy the canary version in Cluster 1:
kubectl apply -f new-app.yaml
Create a file named new-ingress.yaml and copy the following content to the file.
Create an Ingress on the ACK One Fleet instance to route requests based on request headers. You can add annotations to enable the canary feature and specify the canary-dest: cluster1 header to route requests that carry this header to the canary version.
nginx.ingress.kubernetes.io/canary: Set the value to "true" to enable the canary feature.
nginx.ingress.kubernetes.io/canary-by-header: the header key of requests routed to the canary version.
nginx.ingress.kubernetes.io/canary-by-header-value: the header value of requests routed to the canary version.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web-demo-canary-1
  namespace: gateway-demo
  annotations:
    nginx.ingress.kubernetes.io/canary: "true"
    nginx.ingress.kubernetes.io/canary-by-header: "canary-dest"
    nginx.ingress.kubernetes.io/canary-by-header-value: "cluster1"
spec:
  ingressClassName: mse
  rules:
  - host: example.com
    http:
      paths:
      - path: /svc1
        pathType: Exact
        backend:
          service:
            name: service1-canary-1
            port:
              number: 80
Run the following command to deploy an Ingress on the ACK One Fleet instance to route requests based on request headers.
kubectl apply -f new-ingress.yaml
Verify active zone-redundancy
Change the number of replicated pods deployed in Cluster 1 to 9 and the number of replicated pods deployed in Cluster 2 to 1. This way, 90% of traffic is routed to Cluster 1 and 10% of traffic is routed to Cluster 2 by default. When all backend pods in Cluster 1 are down, all traffic is automatically routed to Cluster 2.
Run the following command to query the public IP address of the multi-cluster gateway:
kubectl get ingress web-demo -n gateway-demo -ojsonpath="{.status.loadBalancer}"
Route traffic to the two clusters based on the specified ratio
Run the following command to query the traffic routing information.
Replace XX.XX.XX.XX with the public IP address of the multi-cluster gateway that you obtained in the preceding step.
for i in {1..100}; do curl -H "host: example.com" XX.XX.XX.XX/svc1; sleep 1; done
Expected output: The output indicates that 90% of traffic is routed to Cluster 1 and 10% of traffic is routed to Cluster 2.
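To read the split off as numbers instead of scanning 100 responses, the responses can be tallied. The following is a minimal sketch, assuming that each response body is a single line that identifies the serving cluster (the exact body returned by the web-demo image is not shown in this topic). The function name tally_responses is hypothetical.

```shell
# Sketch: read one response body per line on stdin and print, for each
# distinct body, its count and its share of all responses, largest first.
tally_responses() {
  awk '
    NF { count[$0]++; total++ }   # ignore empty lines
    END {
      for (body in count) {
        printf "%d\t%.0f%%\t%s\n", count[body], 100 * count[body] / total, body
      }
    }
  ' | sort -rn
}

# Example (echo adds a newline after each response body):
#   for i in $(seq 1 100); do
#     curl -s -H "host: example.com" XX.XX.XX.XX/svc1; echo
#   done | tally_responses
```

With 9 replicated pods in Cluster 1 and 1 in Cluster 2, the shares should be close to 90% and 10%, although individual runs can deviate by a few requests.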
Route all traffic to Cluster 2 when all backend pods in Cluster 1 are down
If the replicas value in the Deployment in Cluster 1 is set to 0, the following result is returned. The output indicates that traffic automatically fails over to Cluster 2.
Route requests that carry the specified header to the canary version
Run the following command to query the traffic routing information.
Replace XX.XX.XX.XX in the following command with the IP address that is obtained after the application and Ingress are deployed in the Verify the canary version section.
for i in {1..100}; do curl -H "host: example.com" -H "canary-dest: cluster1" XX.XX.XX.XX/svc1; sleep 1; done
Expected output: The output indicates that requests with the canary-dest: cluster1 header are routed to the canary version in Cluster 1.
Primary/secondary disaster recovery
Create an Ingress object on the ACK One Fleet instance to implement primary/secondary disaster recovery.
If the backend pods in both clusters are normal, traffic is routed only to the backend pods in Cluster 1. If the backend pods of Cluster 1 are down, the gateway automatically routes traffic to Cluster 2. The following figure shows the topology.
Create an Ingress to implement primary/secondary disaster recovery
Create a file named ingress-demo-cluster-one.yaml and add the following content to the file.
Add the mse.ingress.kubernetes.io/service-subset and mse.ingress.kubernetes.io/subset-labels annotations to the YAML file of the Ingress object to use /service1 under the domain name example.com to expose the backend Service service1. For more information about the annotations supported by MSE Ingresses, see Annotations supported by MSE Ingress gateways.
mse.ingress.kubernetes.io/service-subset: the name of the subset of the Service. We recommend that you use a name related to the destination cluster.
mse.ingress.kubernetes.io/subset-labels: the ID of the destination cluster.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    mse.ingress.kubernetes.io/service-subset: cluster-demo-1
    mse.ingress.kubernetes.io/subset-labels: |
      topology.istio.io/cluster ${cluster1-id}
  name: web-demo-cluster-one
spec:
  ingressClassName: mse
  rules:
  - host: example.com
    http:
      paths:
      - path: /service1
        pathType: Exact
        backend:
          service:
            name: service1
            port:
              number: 80
Run the following command to deploy the Ingress on the ACK One Fleet instance:
kubectl apply -f ingress-demo-cluster-one.yaml -n gateway-demo
Cluster-level canary releases
Multi-cluster gateways allow you to deploy an Ingress to route requests to specific clusters based on request headers. You can use this feature and the Ingress object created in the Create an Ingress to implement primary/secondary disaster recovery section to evenly distribute traffic to all replicated pods for load balancing and route requests that carry the specified header to the canary version.
An Ingress is created by following the steps in the Create an Ingress to implement primary/secondary disaster recovery section.
Create an Ingress that contains header-related annotations on the ACK One Fleet instance to implement cluster-level canary releases. When the headers of requests match the Ingress rule, the requests are routed to the backend pods of the canary version.
Create a file named ingress-demo-cluster-gray.yaml and add the following content to the file.
In the YAML file of the following Ingress object, replace ${cluster2-id} with the ID of the destination cluster. In addition to the mse.ingress.kubernetes.io/service-subset annotation and the mse.ingress.kubernetes.io/subset-labels annotation, you must add the following annotations to expose the backend Service named service1 by using the /service1 Ingress rule under the domain name example.com.
nginx.ingress.kubernetes.io/canary: Set the value to "true" to enable canary releases.
nginx.ingress.kubernetes.io/canary-by-header: the header key of requests routed to the cluster.
nginx.ingress.kubernetes.io/canary-by-header-value: the header value of requests routed to the cluster.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    mse.ingress.kubernetes.io/service-subset: cluster-demo-2
    mse.ingress.kubernetes.io/subset-labels: |
      topology.istio.io/cluster ${cluster2-id}
    nginx.ingress.kubernetes.io/canary: "true"
    nginx.ingress.kubernetes.io/canary-by-header: "app-web-demo-version"
    nginx.ingress.kubernetes.io/canary-by-header-value: "gray"
  name: web-demo-cluster-gray
spec:
  ingressClassName: mse
  rules:
  - host: example.com
    http:
      paths:
      - path: /service1
        pathType: Exact
        backend:
          service:
            name: service1
            port:
              number: 80
Run the following command to deploy the Ingress on the ACK One Fleet instance:
kubectl apply -f ingress-demo-cluster-gray.yaml -n gateway-demo
Verify primary/secondary disaster recovery
To verify primary/secondary disaster recovery, set the number of replicated pods to 1 for Cluster 1 and Cluster 2. This way, requests that carry the specified header are routed to Cluster 2 and other requests are routed to Cluster 1 by default. When the backend pods in Cluster 1 are down, traffic is routed to Cluster 2.
Run the following command to query the public IP address of the multi-cluster gateway:
kubectl get ingress web-demo-cluster-one -n gateway-demo -ojsonpath="{.status.loadBalancer}"
Route traffic to Cluster 1 by default
Run the following command to check whether the traffic is routed to Cluster 1 by default:
Replace XX.XX.XX.XX with the public IP address of the multi-cluster gateway that you obtained in the preceding step.
for i in {1..100}; do curl -H "host: example.com" XX.XX.XX.XX/service1; sleep 1; done
Expected output: The output indicates that all traffic is routed to Cluster 1 by default.
Route requests that carry the specified header to the canary version
Run the following command to check whether requests that carry the specified header are routed to the canary version.
Replace XX.XX.XX.XX with the public IP address of the multi-cluster gateway that you obtained in the preceding step.
for i in {1..50}; do curl -H "host: example.com" -H "app-web-demo-version: gray" XX.XX.XX.XX/service1; sleep 1; done
Expected output: The output indicates that requests with the app-web-demo-version: gray header are routed to the canary version in Cluster 2.
Route traffic to Cluster 2 when Cluster 1 is down
If the replicas value in the Deployment in Cluster 1 is set to 0, the following result is returned. The output indicates that traffic automatically fails over to Cluster 2.
References
For more information about how to use multi-cluster gateways provided by ACK One to manage north-south traffic, see Manage north-south traffic.
For more information about how to use ACK One GitOps to distribute applications, see Getting started with GitOps.