Fleet Management Feature of ACK One: Enterprise-Class Multi-Cluster Management Solution

This article introduces the Fleet management feature of ACK One, a multi-cluster management solution provided by Alibaba Cloud.

By Jing Cai and Yu Zhuang

Overview of Kubernetes Multi-Cluster Development

As enterprise businesses grow, the need to run multiple Kubernetes clusters becomes increasingly clear:

• Disaster Recovery, Active Multi-zone Deployment, High Availability, and Low Latency: Deploy your business across multiple regions and zones so that traffic can be migrated after a fault, which improves service availability; deploy across multiple clusters to distribute traffic; deploy in multiple regions to provide nearby access and reduce latency.

• Multi-cloud and Hybrid Cloud Deployment: Manage IDC clusters centrally from the cloud; use elastic cloud resources when traffic bursts occur; manage clusters across multiple cloud vendors centrally to prevent vendor lock-in.

• Business and Fault Isolation: Use multiple clusters to isolate businesses with different attributes (for example, separate clusters for the dev, staging, and production environments). This provides better isolation and performance than namespace-based multi-tenancy and reduces the impact of faults.

• Security Compliance and Upper Limits on Nodes and Pods in a Single Kubernetes Cluster.

In view of the above multi-cluster use cases, many multi-cluster or fleet management solutions have emerged to manage multiple clusters efficiently and centrally. Multi-cluster management began with the Federation v1 project, proposed by the Kubernetes community in Kubernetes 1.5 and 1.6. However, due to issues such as low API extensibility, complex management, and insufficient maturity, Federation v1 was never widely used and has been archived by the community. Federation v2 (KubeFed) made many improvements, including using the CRD mechanism to extend the API, and was more widely recognized and used. However, it ultimately failed to gain broader adoption because of some design defects:

• It is not compatible with the native Kubernetes API and uses a new set of Federated APIs, which significantly increases the learning costs for users.

• It lacks extensibility: its rigid design cannot be extended to meet the requirements of different scenarios.

Currently, the multi-cluster management solutions widely adopted by the open-source community include Open Cluster Management (OCM) and Karmada. Meanwhile, various cloud vendors have launched their own multi-cluster or fleet management solutions, such as the Fleet management feature of ACK One.

ACK One Fleet Manages Multiple Kubernetes Clusters in a Centralized Manner

ACK One is an enterprise-class distributed cloud container platform developed by Alibaba Cloud to meet container management requirements in hybrid cloud, multi-cluster, distributed computing, and disaster recovery scenarios. You can use ACK One registered clusters to connect Kubernetes clusters that run on other public clouds or in IDCs to the ACK console. ACK One Fleet manages these registered clusters together with ACK and ACK Edge clusters on the cloud to provide centralized application distribution, traffic management, O&M management, and security management.

The Fleet management feature of ACK One is a solution for the centralized management of multiple clusters based on Open Cluster Management (OCM) from the open-source community. Each Fleet instance is managed by ACK. You can focus on application development without much O&M work.

ACK One Fleet includes the following key capabilities:

  1. ACK One GitOps: hosts a managed Argo CD to continuously deploy multi-cluster applications.
  2. Multi-cluster Gateways: a multi-cluster Ingress that manages the north-south traffic of multiple clusters in a centralized manner to implement cross-zone high availability and tag-based routing.
  3. Multi-cluster Services: implements the Multi-Cluster Services API defined by the Kubernetes community to provide cross-cluster service discovery.
  4. Global Monitoring: uses Prometheus global aggregation instances to provide a centralized monitoring view for multiple clusters.
  5. Service Mesh: integrates ASM to manage east-west traffic across multi-cluster services.
  6. Centralized Permission Management: manages the RBAC permissions for RAM users or roles in multiple clusters in a centralized manner.
  7. Multi-cluster Workload Scheduling and Distribution: supports priority queues and multi-tenancy scheduling for Spark, TensorFlow, and Kubernetes Job/CronJob.

ACK One GitOps

According to the CNCF microsurvey on GitOps usage trends published in late 2023, GitOps has become the top choice of most developers for fast, consistent, and secure delivery.

Based on the CNCF graduated project Argo CD, ACK One GitOps provides GitOps continuous delivery for applications in multi-cloud, multi-cluster, and hybrid cloud scenarios. It integrates fully managed Argo CD with the multi-cluster management feature of ACK One and with Alibaba Cloud Resource Access Management (RAM) and single sign-on (SSO). As a result, ACK One GitOps offers out-of-the-box Argo CD features and lets you continuously deploy hybrid cloud applications across clusters in a fast, consistent, and secure manner.

  1. The developer uses the Argo CD UI, CLI, or Go SDK to create an Application or ApplicationSet and deploy the application.
  2. The developer pushes a new image to the image repository. After Argo CD Image Updater detects the new image, the new tag is written to the YAML in the Git repository.
  3. Argo CD regularly synchronizes the application state in the Git repository to cloud clusters and on-premises clusters (Secret management in GitOps is implemented based on KMS).
  4. State changes during application synchronization are sent in real time as DingTalk notifications.
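
As a rough illustration of step 1, the sketch below builds an Argo CD Application manifest as a plain Python dict and prints it as JSON, which kubectl accepts alongside YAML. The repository URL, path, destination server, and the `argocd` namespace are placeholder assumptions, not values from ACK One; only the Application field names come from open-source Argo CD.

```python
import json


def build_application(name, repo_url, path, dest_server, namespace, revision="HEAD"):
    """Return an Argo CD Application manifest as a plain dict."""
    return {
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "Application",
        # Assumption: Argo CD watches the "argocd" namespace for Applications.
        "metadata": {"name": name, "namespace": "argocd"},
        "spec": {
            "project": "default",
            # The Git repository is the source of truth for the desired state.
            "source": {"repoURL": repo_url, "targetRevision": revision, "path": path},
            # The target cluster and namespace that Argo CD syncs to.
            "destination": {"server": dest_server, "namespace": namespace},
            # Automated sync: apply Git changes and revert manual drift.
            "syncPolicy": {"automated": {"prune": True, "selfHeal": True}},
        },
    }


# Hypothetical values, for illustration only.
app = build_application(
    name="demo-app",
    repo_url="https://example.com/demo/manifests.git",
    path="apps/demo",
    dest_server="https://cluster-a.example.com:6443",
    namespace="demo",
)
print(json.dumps(app, indent=2))
```

With `syncPolicy.automated` set, Argo CD applies the tag change from step 2 on its next sync without a manual trigger.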

ACK One GitOps has the following advantages:

• Integrates open-source Argo CD out of the box with no O&M required, and provides a CLI and a UI that offer the same user experience as those of open-source Argo CD.

• Provides a separate Argo CD console that is integrated with Alibaba Cloud Resource Access Management (RAM) and single sign-on (SSO), and supports Argo CD multi-tenancy permission management.

• Supports application distribution across clusters in hybrid cloud scenarios. Argo CD is automatically enabled for clusters that are associated with ACK One. The associated clusters use GitOps for application distribution.

• Supports Argo CD ApplicationSet to improve the user experience in application distribution across clusters.

• Publishes multi-cluster applications more securely, supports Secret management in GitOps, and accesses sub-clusters at the ServiceAccount level.
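
To make the ApplicationSet point more concrete, here is a minimal sketch that generates one ApplicationSet with a list generator, so that a single definition fans out to several clusters. The cluster names, server URLs, repository, and namespace are hypothetical; the structure follows the open-source Argo CD ApplicationSet resource rather than anything specific to ACK One.

```python
import json


def build_applicationset(name, repo_url, path, clusters, namespace):
    """Return an ApplicationSet that creates one Application per cluster.

    clusters maps a cluster name to its API server URL.
    """
    elements = [{"cluster": c, "url": url} for c, url in clusters.items()]
    return {
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "ApplicationSet",
        "metadata": {"name": name, "namespace": "argocd"},
        "spec": {
            # The list generator yields one parameter set per element.
            "generators": [{"list": {"elements": elements}}],
            "template": {
                # {{cluster}} and {{url}} are filled in by Argo CD, not by Python.
                "metadata": {"name": "{{cluster}}-" + name},
                "spec": {
                    "project": "default",
                    "source": {"repoURL": repo_url, "targetRevision": "HEAD", "path": path},
                    "destination": {"server": "{{url}}", "namespace": namespace},
                },
            },
        },
    }


# Hypothetical clusters: one on the cloud, one on-premises.
appset = build_applicationset(
    name="demo",
    repo_url="https://example.com/demo/manifests.git",
    path="apps/demo",
    clusters={
        "cloud-a": "https://cloud-a.example.com:6443",
        "idc-b": "https://idc-b.example.com:6443",
    },
    namespace="demo",
)
print(json.dumps(appset, indent=2))
```

Adding a cluster then means adding one element to the list instead of writing another Application by hand.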

Customer Case 1: Continuous Deployment of Applications in Hybrid Cloud and Multi-cluster Scenarios across Multiple Teams

Currently, multiple ACK One customers use GitOps to build continuous deployment of applications in hybrid cloud and multi-cluster scenarios across multiple teams. ACK One Fleet manages dozens of cloud clusters and on-premises clusters and uses GitOps to rapidly deploy thousands of applications. Argo CD is automatically enabled for clusters that are associated with ACK One. The associated clusters use GitOps for application distribution. This simplifies the application distribution process across clusters.

Multi-tenancy permission management in GitOps involves two parts: granting Argo CD RBAC permissions to RAM users or RAM roles, and using Argo CD Projects to manage the permissions of those RAM users or RAM roles on clusters, repositories, and applications.

Building continuous deployment for applications in hybrid cloud and multi-cluster scenarios across multiple teams takes the following four steps:

  1. Register the IDC cluster with Alibaba Cloud as an ACK One registered cluster.
  2. Associate the cloud ACK clusters in multiple regions and the on-premises IDC clusters with ACK One Fleet so that the Fleet manages them.
  3. Use ACK One GitOps to quickly deploy applications to cloud and on-premises clusters.
  4. Configure application-level permissions for different RAM users or roles on the Argo CD UI, and then log on to the Argo CD UI through SSO as those RAM users or roles to verify their permissions. For more information, please refer to Manage users based on GitOps.
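
The Project-based permission model in step 4 can be sketched with an Argo CD AppProject that restricts a team to certain repositories and destinations and grants a role limited rights on the team's applications. This is only an illustration of the open-source AppProject resource: the team name, repository, cluster URL, and the group value are hypothetical, and how RAM users or roles are mapped to Argo CD subjects is not described in this article.

```python
import json


def build_app_project(team, repos, destinations, groups):
    """Return an AppProject that scopes a team to given repos and clusters.

    destinations is a list of (server_url, namespace) tuples.
    groups lists the SSO subjects bound to the role (placeholder values).
    """
    subject = f"proj:{team}:developer"
    return {
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "AppProject",
        "metadata": {"name": team, "namespace": "argocd"},
        "spec": {
            # Applications in this project may only use these repositories...
            "sourceRepos": list(repos),
            # ...and may only deploy to these cluster/namespace pairs.
            "destinations": [{"server": s, "namespace": ns} for s, ns in destinations],
            "roles": [
                {
                    "name": "developer",
                    # Policy format: p, <subject>, <resource>, <action>, <object>, <effect>
                    "policies": [
                        f"p, {subject}, applications, get, {team}/*, allow",
                        f"p, {subject}, applications, sync, {team}/*, allow",
                    ],
                    "groups": list(groups),
                }
            ],
        },
    }


# Hypothetical team and identities.
project = build_app_project(
    team="team-a",
    repos=["https://example.com/team-a/manifests.git"],
    destinations=[("https://cloud-a.example.com:6443", "team-a")],
    groups=["example-ram-user-id"],
)
print(json.dumps(project, indent=2))
```

Members bound to this role can view and sync applications in `team-a` but cannot create applications that target other repositories or clusters.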

Customer Case 2: Build Git Event-driven Automated CI/CD Pipeline

ACK One Serverless Argo Workflows (Argo Workflows) is fully managed by ACK One, with improved performance, stability, observability, and O&M capabilities. EventBridge is a serverless event bus service provided by Alibaba Cloud, which has significant advantages in availability, usability, and security.

Combining EventBridge, Argo Workflows, and Argo CD, ACK One GitOps allows you to easily, quickly, efficiently, and cost-effectively deliver your applications and implement an automated CI/CD system that delivers code when you submit it. For more information about how to build a CI Pipeline, see Event-driven CI Pipeline based on EventBridge.

  1. Users commit and push code to the Git repository.
  2. When a Git event occurs, EventBridge submits a CI workflow to the ACK One workflow cluster based on the configured rules.
  3. The CI pipeline in the ACK One workflow cluster builds a Docker image and pushes it to the image repository (ACR EE).
  4. GitOps automatically synchronizes the image change to the ACK cluster.
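
Steps 2 and 3 can be pictured as a function that turns a Git push event into an Argo Workflow manifest. The sketch below is an assumption-heavy illustration: the event shape (`repository`, `commit`) is invented for the example, the container only echoes what a real build or push step would do, and the EventBridge rule that would call such a function is not shown. Only the Workflow field names follow open-source Argo Workflows.

```python
import json


def workflow_for_push(event):
    """Map a (hypothetical) push event to an Argo Workflow with build -> push tasks."""

    def task(name, depends_on=None):
        t = {
            "name": name,
            "template": "run-step",
            "arguments": {"parameters": [{"name": "step", "value": name}]},
        }
        if depends_on:
            t["dependencies"] = depends_on
        return t

    return {
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "Workflow",
        "metadata": {"generateName": "ci-"},
        "spec": {
            "entrypoint": "ci",
            # Values taken from the event and shared by all tasks.
            "arguments": {
                "parameters": [
                    {"name": "repo", "value": event["repository"]},
                    {"name": "revision", "value": event["commit"]},
                ]
            },
            "templates": [
                # DAG: the image is pushed only after the build succeeds.
                {"name": "ci", "dag": {"tasks": [task("build"), task("push", ["build"])]}},
                {
                    "name": "run-step",
                    "inputs": {"parameters": [{"name": "step"}]},
                    # Placeholder container: a real pipeline would build and push here.
                    "container": {
                        "image": "alpine:3.19",
                        "command": ["sh", "-c"],
                        "args": [
                            "echo {{inputs.parameters.step}} "
                            "{{workflow.parameters.repo}}@{{workflow.parameters.revision}}"
                        ],
                    },
                },
            ],
        },
    }


# Hypothetical push event.
wf = workflow_for_push({"repository": "https://example.com/demo/app.git", "commit": "abc123"})
print(json.dumps(wf, indent=2))
```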

Multi-Cluster Gateways

Multi-cluster gateways are cloud-native gateways provided by ACK One for multi-cloud and multi-cluster scenarios. They manage north-south traffic at Layer 7 across multiple clusters in a single region. ACK One hosts MSE Ingress and uses the Ingress API to define traffic routing rules. The gateways support the following features across multiple clusters: HTTP routing, traffic splitting, health-based automatic disaster recovery, traffic mirroring, and traffic load balancing based on the number of replicas. You can use multi-cluster gateways to build capabilities such as disaster recovery, tag-based routing, and weight-based routing.
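
The combination of replica-based load balancing and health-based failover can be illustrated with a small calculation. This is a conceptual sketch only, not the gateway's actual algorithm: it assumes each cluster reports a replica count and a health flag, and splits traffic among healthy clusters in proportion to their replicas.

```python
def traffic_weights(clusters):
    """Return the share of traffic per cluster.

    clusters maps a cluster name to {"replicas": int, "healthy": bool}.
    Unhealthy clusters receive no traffic; the rest share it by replica count.
    """
    eligible = {
        name: info["replicas"]
        for name, info in clusters.items()
        if info["healthy"] and info["replicas"] > 0
    }
    total = sum(eligible.values())
    if total == 0:
        # No healthy backend anywhere: nothing can receive traffic.
        return {name: 0.0 for name in clusters}
    return {name: eligible.get(name, 0) / total for name in clusters}


# Hypothetical state: both clusters healthy, then cluster-a fails.
normal = traffic_weights({
    "cluster-a": {"replicas": 3, "healthy": True},
    "cluster-b": {"replicas": 1, "healthy": True},
})
failover = traffic_weights({
    "cluster-a": {"replicas": 3, "healthy": False},
    "cluster-b": {"replicas": 1, "healthy": True},
})
print(normal)
print(failover)
```

In the normal case cluster-a serves three quarters of the traffic; once it is marked unhealthy, all traffic moves to cluster-b.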

Multi-cluster gateways have the following advantages:

• A region-level multi-cluster Global Ingress centrally manages north-south traffic at Layer 7 across multiple clusters.

Simplified multi-cluster traffic management: You can configure Ingress rules for multiple clusters in a Fleet instance without separately managing each sub-cluster. The gateways are also compatible with NGINX Ingress.

• Multi-cluster gateways ensure high availability across zones.

Fallback in milliseconds: When a backend of a cluster fails, the multi-cluster gateway smoothly migrates traffic to other backends.

• Gateways are fully managed and O&M-free.

Customer Case: Multi-cluster and Hybrid Cloud Zone-disaster Recovery

The following five steps are required to build a hybrid cloud disaster recovery system. For more information, please refer to Use ACK One to implement zone-disaster recovery in hybrid cloud environments.

  1. Use ACK One registered clusters to manage Kubernetes clusters deployed on IDC or third-party clouds.
  2. Interconnect the on-premises network with the VPC.
  3. Create an ACK One Fleet and associate clusters with the Fleet.
  4. Use the ACK One GitOps to distribute the application to multiple clusters (optional).
  5. Use the ACK One multi-cluster gateways to manage multi-cluster traffic.

Multi-Cluster Services

ACK One Fleet implements the Multi-Cluster Services API defined by the Kubernetes community to provide cross-cluster service discovery, which can help you in the following scenarios:

  1. High Availability: Multiple clusters run the same core service.
  2. Stateful and Stateless Services: Stateful and stateless services are deployed separately, and read/write splitting is enabled for the stateful service.
  3. Shared Service: A service is shared by multiple clusters, such as a monitoring service or a key service.
  4. Migration: One service is deployed to two clusters and is migrated gradually. For example, it is migrated from the on-premises cluster to the cloud cluster.

The following is the architecture of ACK One multi-cluster services:

1.  Connections marked with Circled Number 1 in the figure are used by the Fleet instance to manage the ServiceExport and ServiceImport in the associated Container Service for Kubernetes (ACK) clusters.

• A ServiceExport is created in ACK Cluster 1 to export Service 1. ACK Cluster 1 serves as a Service provider. Service 1 provides external services.

• A ServiceImport is created in ACK Cluster 2 to allow ACK Cluster 2 to access Service 1 exported by the service provider. ACK Cluster 2 serves as a Service consumer.

2.  The connection marked by Circled Number 2 in the figure is used for data exchange. After Service 1 is exported in ACK Cluster 1 and imported in ACK Cluster 2, you can access Service 1 in ACK Cluster 1 from ACK Cluster 2. This way, you can access Services across Kubernetes clusters.
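
The export and import described above could look roughly like the following pair of manifests, built here as Python dicts. The API group `multicluster.x-k8s.io/v1alpha1` and the field names follow the upstream Kubernetes Multi-Cluster Services API; the Service name, namespace, and port are assumptions, and the exact fields that ACK One expects may differ.

```python
import json

MCS_API = "multicluster.x-k8s.io/v1alpha1"  # upstream MCS API group/version


def service_export(name, namespace):
    """ServiceExport for the provider cluster; name must match the Service."""
    return {
        "apiVersion": MCS_API,
        "kind": "ServiceExport",
        "metadata": {"name": name, "namespace": namespace},
    }


def service_import(name, namespace, port, headless=False):
    """ServiceImport for the consumer cluster."""
    return {
        "apiVersion": MCS_API,
        "kind": "ServiceImport",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            # ClusterSetIP gives the import one virtual IP; Headless exposes Pods directly.
            "type": "Headless" if headless else "ClusterSetIP",
            "ports": [{"name": "http", "protocol": "TCP", "port": port}],
        },
    }


# Hypothetical names: Service "service1" in namespace "provider-ns" on port 80.
export_manifest = service_export("service1", "provider-ns")   # apply in ACK Cluster 1
import_manifest = service_import("service1", "provider-ns", 80)  # apply in ACK Cluster 2
print(json.dumps([export_manifest, import_manifest], indent=2))
```

The export and the import share the same name and namespace, which is how the two sides are matched.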

[Figure: architecture of ACK One multi-cluster services]

The following figure shows how the Client Pod in ACK Cluster 2 can access Service 1 in ACK Cluster 1. Multi-cluster services work as follows:

  1. The ACK One Fleet instance runs a component named Multi-cluster Service Controller, which watches the ServiceExport and ServiceImport objects created by users.
  2. When Service 1 is exported, the controller obtains its backends. When Service 1 is imported, the controller creates these backends in an EndpointSlice and associates the EndpointSlice with the ServiceImport.
  3. The controller creates a Service named amcs-service1 (the original Service name prefixed with amcs-) and associates it with the EndpointSlice.
  4. Finally, the Client Pod in ACK Cluster 2 can access Service 1 in ACK Cluster 1 through two domain names:

a) service1.provider-ns.svc.clusterset.local:

To use this domain name, the multi-cluster plug-in must be enabled in CoreDNS. When the Client Pod resolves the domain name, the IP address of the ServiceImport is returned. The Client Pod then reaches the Pods in ACK Cluster 1 through the IPs in the associated EndpointSlice.

b) amcs-service1.provider-ns.svc.cluster.local:

This domain name requires only the standard Service domain name resolution of the Kubernetes cluster. The Client Pod then reaches the Pods in ACK Cluster 1 through the IPs in the associated EndpointSlice.
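
The two naming patterns above can be captured in a small helper. It is only a string-formatting sketch based on the examples in this section; the function name is made up, and it performs no DNS lookup.

```python
def multi_cluster_dns_names(service, namespace):
    """Return the two domain names for an imported Service.

    The first name needs the multi-cluster plug-in in CoreDNS;
    the second relies on the amcs- prefixed Service and standard DNS.
    """
    return {
        "clusterset": f"{service}.{namespace}.svc.clusterset.local",
        "amcs": f"amcs-{service}.{namespace}.svc.cluster.local",
    }


names = multi_cluster_dns_names("service1", "provider-ns")
print(names["clusterset"])
print(names["amcs"])
```

A client that cannot rely on the CoreDNS plug-in being enabled could default to the `amcs` name, since it needs no extra DNS configuration.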

[Figure: how the Client Pod in ACK Cluster 2 accesses Service 1 in ACK Cluster 1]

Customer Case: Cross-cluster Access to a Specified Stateful Service Instance through a Headless Multi-cluster Service

For example, you can use headless multi-cluster services to implement read/write splitting for MySQL primary and secondary clusters. This helps improve the performance, throughput, reliability, and fault tolerance of the MySQL database.

The main steps to access a specified stateful service instance across clusters through a headless multi-cluster service are as follows:

  1. An ACK One Fleet instance is associated with two ACK clusters: ACK Cluster 1 and ACK Cluster 2. A MySQL Service is exported from ACK Cluster 1 and imported into ACK Cluster 2.
  2. ACK Cluster 1 serves as a service provider. A MySQL Service and a ServiceExport are created in ACK Cluster 1 to export the service.
  3. ACK Cluster 2 serves as a service consumer. A ServiceImport is created in ACK Cluster 2 to import the service.
  4. In ACK Cluster 2, the Client Pod can access a specific Pod instance of the MySQL Service in ACK Cluster 1 through a domain name that contains the Pod instance name. For example, mysql-0 can be accessed through the following two domain names:

a) mysql-0.mysql.provider-ns.svc.clusterset.local

b) mysql-0.amcs-mysql.provider-ns.svc.cluster.local
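
To show how per-Pod domain names enable read/write splitting, the sketch below routes writes to one instance and rotates reads across the others. It is a simplified illustration under several assumptions that the article does not state: the Pods are named `mysql-0`, `mysql-1`, and so on, `mysql-0` is the primary, and any statement that starts with SELECT counts as a read.

```python
import itertools


class ReadWriteRouter:
    """Pick a per-Pod host name for each SQL statement."""

    def __init__(self, service, namespace, replicas, primary_index=0):
        hosts = [
            f"{service}-{i}.{service}.{namespace}.svc.clusterset.local"
            for i in range(replicas)
        ]
        # Assumption: one fixed primary; all other instances are read-only secondaries.
        self.primary = hosts[primary_index]
        self.secondaries = [h for i, h in enumerate(hosts) if i != primary_index]
        # Fall back to the primary for reads when there are no secondaries.
        self._reads = itertools.cycle(self.secondaries or [self.primary])

    def host_for(self, statement):
        is_read = statement.lstrip().upper().startswith("SELECT")
        return next(self._reads) if is_read else self.primary


router = ReadWriteRouter("mysql", "provider-ns", replicas=3)
print(router.host_for("INSERT INTO t VALUES (1)"))
print(router.host_for("SELECT * FROM t"))
print(router.host_for("SELECT * FROM t"))
```

The same routing works with the `amcs-` form of the names if the CoreDNS multi-cluster plug-in is not enabled.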

Global Monitoring

The global observability of the Fleet management feature of ACK One includes global monitoring and global FinOps. Other capabilities, such as a global event center, are under development.

ACK One Fleet manages all Kubernetes clusters in a centralized manner to keep management and control consistent. It uses an aggregation instance of Alibaba Cloud Managed Service for Prometheus to aggregate the metrics of each cluster, providing you with a global and centralized monitoring view to ensure your business stability.

Summary

The Fleet management feature of ACK One is a multi-cluster management solution provided by Alibaba Cloud. It features GitOps application distribution, multi-cluster gateways, multi-cluster services, global observability, service mesh, centralized permission management, and multi-cluster workload scheduling and distribution. This solution simplifies multi-cluster management in scenarios such as hybrid cloud, multi-cluster, and disaster recovery. Managed Fleet instances (Kubernetes clusters) and Argo CD also minimize your O&M efforts, allowing you to focus more on business development.
