Traditional batch task and workflow orchestration systems struggle to manage increasingly complex orchestration scenarios and do not support extending automation. These limitations are evident in scenarios such as batch data processing, machine learning pipelines, infrastructure automation, and continuous integration and continuous delivery (CI/CD). Alibaba Cloud provides a component that is compatible with the cloud-native workflow engine Argo Workflows to help you simplify batch task orchestration. This topic describes how to use the Argo Workflows component in Container Service for Kubernetes (ACK) clusters.
Open-source Argo Workflows
Argo Workflows is a powerful cloud-native workflow engine designed to define, manage, and schedule complex workflows in Kubernetes. A workflow can include multiple tasks with dependencies between them. This flexibility simplifies task configuration.
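As a rough illustration of tasks with dependencies, the following Python sketch builds a four-task, diamond-shaped workflow manifest and prints it as JSON. The `apiVersion`, `kind`, `dag`, and `dependencies` fields follow the open-source Argo Workflows schema. The task names, the `echo` template, and the `alpine:3.19` image are placeholders chosen for this example and do not come from this topic.

```python
import json


def dag_task(name, dependencies=()):
    """Return one DAG task that runs the shared `echo` template."""
    task = {
        "name": name,
        "template": "echo",
        "arguments": {"parameters": [{"name": "message", "value": name}]},
    }
    if dependencies:
        # A task starts only after every task listed here has succeeded.
        task["dependencies"] = list(dependencies)
    return task


def build_diamond_workflow():
    """Build a Workflow manifest where B and C depend on A, and D on both."""
    return {
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "Workflow",
        "metadata": {"generateName": "dag-diamond-"},  # placeholder name prefix
        "spec": {
            "entrypoint": "diamond",
            "templates": [
                {
                    "name": "echo",
                    "inputs": {"parameters": [{"name": "message"}]},
                    "container": {
                        "image": "alpine:3.19",  # placeholder image
                        "command": ["echo", "{{inputs.parameters.message}}"],
                    },
                },
                {
                    "name": "diamond",
                    "dag": {
                        "tasks": [
                            dag_task("A"),
                            dag_task("B", ["A"]),
                            dag_task("C", ["A"]),
                            dag_task("D", ["B", "C"]),
                        ]
                    },
                },
            ],
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_diamond_workflow(), indent=2))
```

In this sketch, B and C can run in parallel once A finishes, and D waits for both. Each task would run as its own pod.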
Scenarios
Argo Workflows supports various scenarios, and is widely used in industries such as autonomous driving, scientific computing, financial quantitative analysis, and digital media.
Batch data processing: large-scale high-precision map processing, financial quantitative backtesting simulations, parallel audio and video processing, animation rendering.
Scientific computing: complex scientific computing simulations, pharmaceutical research and training, gene sequencing, mutation alignment detection, energy exploration.
Simulation and modeling: autonomous driving algorithm simulations, molecular dynamics simulations, astronomical data simulations, financial modeling.
Machine learning pipelines: machine learning data preprocessing, distributed training, large model parameter tuning, model evaluation and deployment.
Infrastructure automation: automated management of cloud resources, resource backup and recovery, node pool migration, cluster migration and upgrades.
CI/CD: parallel CI pipelines, multi-stage build and testing, cross-cloud application deployment, integration of approval workflows.
Advantages
Cloud native: Argo Workflows is designed specifically for Kubernetes. Each task runs as a pod, which takes full advantage of the lightweight and flexible nature of containers.
Lightweight and scalable: Compared with traditional VM-based approaches, Argo Workflows is lightweight and imposes no additional overhead or limitations. With the robust scheduling capabilities of Kubernetes, thousands of tasks can be launched in parallel, which improves processing efficiency.
Flexible orchestration capabilities: Directed acyclic graphs (DAGs) and steps can be combined flexibly to build workflows of widely varying complexity. Powerful retry and caching mechanisms improve the success rate of workflow executions.
Rich ecosystem: Orchestration of various types of tasks, such as Spark, Ray, and TensorFlow jobs, is supported. Combined with event-driven capabilities, Argo Workflows can be used to build fully automated task processing platforms.
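The parallelism and retry mechanisms mentioned above can be sketched with a steps-based workflow. The following Python example is a minimal sketch, not a reference configuration: it builds a manifest that fans one template out over a list of items with `withItems` and retries failed pods through `retryStrategy`, both of which are fields of the open-source Argo Workflows schema. The template names, item values, retry limit, and the `alpine:3.19` image are assumptions made for illustration.

```python
import json


def build_fanout_workflow(items, retry_limit=3):
    """Build a steps-based Workflow: parallel fan-out, then a final step."""
    process = {
        "name": "process-item",
        "inputs": {"parameters": [{"name": "item"}]},
        # Retry a failed pod up to `retry_limit` times (illustrative value).
        "retryStrategy": {"limit": retry_limit, "retryPolicy": "OnFailure"},
        "container": {
            "image": "alpine:3.19",  # placeholder image
            "command": ["echo", "processing {{inputs.parameters.item}}"],
        },
    }
    main = {
        "name": "main",
        # Outer list: groups that run one after another.
        # Inner list: steps within a group run in parallel.
        "steps": [
            [
                {
                    "name": "fan-out",
                    "template": "process-item",
                    "arguments": {
                        "parameters": [{"name": "item", "value": "{{item}}"}]
                    },
                    # One parallel pod is created per entry in this list.
                    "withItems": list(items),
                }
            ],
            [
                {
                    "name": "summarize",
                    "template": "process-item",
                    "arguments": {
                        "parameters": [{"name": "item", "value": "summary"}]
                    },
                }
            ],
        ],
    }
    return {
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "Workflow",
        "metadata": {"generateName": "fan-out-"},  # placeholder name prefix
        "spec": {"entrypoint": "main", "templates": [main, process]},
    }


if __name__ == "__main__":
    print(json.dumps(build_fanout_workflow(["a", "b", "c"]), indent=2))
```

Passing a longer list to `build_fanout_workflow` increases the number of parallel pods without changing the workflow structure.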
ACK Argo Workflows
Advantages
ACK Argo Workflows is compatible with open-source Argo Workflows and includes enhancements. You can migrate existing open-source Argo workflows to ACK Argo Workflows without modification. Compared to the open-source version, ACK Argo Workflows offers the following advantages:
High elasticity: automatic scaling and optimized computing costs.
High reliability: multi-zone load balancing and reliable scheduling.
Enhanced control plane with significant improvements in scalability, performance, efficiency, stability, and observability.
Enhanced Object Storage Service (OSS) storage management, supporting large file uploads, garbage collection (GC) of artifacts, and data streaming.
Support from container service technical experts to help your team optimize workflows, improve performance, and reduce costs.
ACK Argo Workflows provides two usage options to meet different user needs:
Serverless Argo Workflows: If you want to focus on business process orchestration without operational overhead and need to run large-scale, high-performance workloads, you can build a separate workflow cluster. For more information, see Serverless Argo Workflows.
Argo Workflows component on ACK: If you already have an ACK cluster and want to use your existing cluster resources, you can use the Argo Workflows component to orchestrate your workflows.
Procedure
After you install the Argo Workflows component, you can use batch task orchestration by submitting and managing workflows through the Alibaba Cloud Argo CLI or the Argo console.
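As a minimal sketch of the CLI path, the following Python helper assembles `argo submit` and `argo list` commands and can optionally run the submit command. It assumes that the Alibaba Cloud Argo CLI is installed as `argo` on the PATH and accepts the same `submit` and `list` subcommands and the `--namespace` and `--watch` flags as the open-source Argo CLI. The `argo` namespace and the `workflow.yaml` file name are hypothetical.

```python
import shutil
import subprocess


def build_submit_command(manifest_path, namespace="argo", watch=True):
    """Return the argv list for submitting a workflow manifest."""
    cmd = ["argo", "submit", manifest_path, "--namespace", namespace]
    if watch:
        # Keep the CLI attached and print status until the workflow finishes.
        cmd.append("--watch")
    return cmd


def build_list_command(namespace="argo"):
    """Return the argv list for listing workflows in a namespace."""
    return ["argo", "list", "--namespace", namespace]


def submit(manifest_path, namespace="argo"):
    """Run `argo submit`; raises if the CLI is missing or the command fails."""
    if shutil.which("argo") is None:
        raise RuntimeError("argo CLI not found on PATH")
    return subprocess.run(
        build_submit_command(manifest_path, namespace), check=True
    )


if __name__ == "__main__":
    # Only print the commands here; call submit() to actually run one.
    print(" ".join(build_submit_command("workflow.yaml")))
    print(" ".join(build_list_command()))
```

Building the argument list separately from running it keeps the command easy to inspect or log before anything is sent to the cluster.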
The following table describes the processes for different roles:
Process | Description |
Preparation | |
Environment setup | For more information, see Enable batch task orchestration. |
Workflow management | (Data engineers) After you orchestrate parallel tasks, submit and manage tasks by using the Argo CLI or Argo console. |
| (Cluster administrators) |
Billing
The batch task orchestration feature itself is free of charge. However, in addition to regular ACK fees, the Argo Server automatically creates a pay-as-you-go Classic Load Balancer (CLB) instance when you use batch task orchestration. The CLB instance is billed by CLB. For more information, see CLB billing.
Contact us
If you have suggestions or questions about this product, join the DingTalk group 35688562 to contact us.