By Shuangkun Tian
Argo Workflows is a CNCF (Cloud Native Computing Foundation) graduated project designed for orchestrating parallel jobs on Kubernetes. This article focuses on the key new features of the latest release, Argo Workflows 3.6.
Argo Workflows implements each task in a workflow as a separate container instance, which makes it lightweight, scalable, and highly parallelizable.
Argo Workflows is mainly used in the following scenarios:
• Batch processing: Argo Workflows provides a declarative way to define and execute large-scale data processing tasks that run periodically or on demand, such as ETL jobs and data analysis report generation.
• Machine learning pipelines: Argo Workflows can coordinate the data preprocessing, model training, validation, parameter tuning, and deployment steps of a machine learning project. It also uses the resource scheduling capability of Kubernetes to efficiently allocate GPUs and other resources for large-scale parallel computing.
• Infrastructure automation: When managing cloud-native infrastructure, Argo Workflows can be used to perform a series of automated tasks, such as creating and configuring Kubernetes resources, performing backup and recovery operations, and monitoring system health status.
• CI/CD: Continuous integration and continuous deployment processes typically include multiple phases such as code building, testing, and deployment. Argo Workflows integrates these steps to automate the pipeline and improve the speed and quality of software delivery.
The Alibaba Cloud Container Service team is one of the early adopters of Argo Workflows in China. In production practice, the team resolved numerous performance bottlenecks and contributed many features back to the community. For version 3.6, these contributions particularly enhance the usability and stability of the core controller and include automatic offloading of large parameters, template scheduling constraints, parallel parsing of large flat workflows, streaming of OSS files, garbage collection of OSS artifacts, parallel pod cleanup on retry, and dynamic template references, among others. Together, they improve the stability, usability, and performance of the workflow engine.
This article will provide an in-depth analysis of the key new features of Argo Workflows 3.6.
Cron Workflows are one of the most commonly used features of Argo Workflows: they let you trigger workflows on a custom schedule. Version 3.6 brings several enhancements:
• Multiple cron schedules: You can combine multiple cron expressions within a single CronWorkflow.
• Stop strategy: You can stop scheduling when a condition is met, which prevents failed workflows from accumulating in the cluster when a scheduled workflow keeps failing.
• When expression: Before each scheduled run, the system evaluates the expression and skips the run if it is false, providing a more flexible mechanism to combine with the cron schedules.
Together, these options let you compose flexible scheduling policies.
Sample code:
apiVersion: argoproj.io/v1alpha1
kind: CronWorkflow
metadata:
  name: cron-workflow-example
spec:
  schedules: # Multiple schedules: run every 3 minutes and every 5 minutes; when they coincide (every 15 minutes), only one workflow is started.
    - "*/3 * * * *"
    - "*/5 * * * *"
  concurrencyPolicy: "Allow"
  stopStrategy: # Stop scheduling once 10 child workflows have failed.
    expression: "cronworkflow.failed >= 10"
  # Use an expression to ensure that at least 3600 seconds pass between two executions.
  when: "{{= cronworkflow.lastScheduledTime == nil || (now() - cronworkflow.lastScheduledTime).Seconds() > 3600 }}"
  startingDeadlineSeconds: 0
  workflowSpec:
    entrypoint: whalesay
    templates:
      - name: whalesay
        container:
          image: alpine:3.6
          command: [sh, -c]
          args: ["date; sleep 1"]
The Argo UI is an important part of the workflow experience: after you submit a workflow, you can view its status in the UI. Version 3.6 improves the display of workflow details and times, shows the directory used for output artifacts, and adds Markdown support for workflow titles and descriptions. In addition, you can now view the execution history and real-time logs of CronWorkflows and WorkflowTemplates.
The UI shows the directory used for input artifacts
These enhancements improve the usability and observability of the user interface, helping you better track the status of your workflows.
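For example, the UI renders Markdown in workflow titles and descriptions. A minimal sketch, assuming the standard workflows.argoproj.io/title and workflows.argoproj.io/description annotation keys (the names and links below are illustrative):
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: markdown-demo-
  annotations:
    workflows.argoproj.io/title: '**Nightly ETL** run' # Rendered as Markdown in the workflow list.
    workflows.argoproj.io/description: 'Loads daily data; see the [runbook](https://example.com/runbook).'
spec:
  entrypoint: main
  templates:
    - name: main
      container:
        image: alpine:3.6
        command: [sh, -c]
        args: ["echo done"]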
The controller is the most critical component of Argo Workflows, so its stability and high performance are essential. Version 3.6 brings major enhancements in three areas: scale, stability, and security:
• Use queues to archive workflows. This improves memory management when archiving a large number of workflows simultaneously.
• Clean up pods in parallel. This is useful when retrying large workflows, allowing the retry to complete within the expected time window.
• Add a Kubernetes finalizer to pods. This prevents pods from being deleted before the controller has recorded their status, avoiding "pod deleted" errors and making reconciliation more reliable.
• Parse large flat workflows in parallel. This speeds up the parsing of large workflows.
• Automatically offload large parameters. This supports longer input parameters, which is beneficial for large-scale scientific simulation scenarios.
• Automatically set the seccomp profile to RuntimeDefault. This enhances container security and reduces the risk of attacks.
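To illustrate the last item: pods created by the controller now carry a security context equivalent to the following snippet (a sketch of the effective default, which individual workflows can still override):
securityContext:
  seccompProfile:
    type: RuntimeDefault # Restricts the pod to the container runtime's default seccomp profile.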
These capabilities are enabled by default when the controller starts. In addition, there are several functional enhancements:
You can configure the artifact garbage collection (GC) policy to reclaim the intermediate files of the workflow on Alibaba Cloud OSS when the workflow is completed or deleted. This saves storage costs. The following sample code provides an example:
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: artifact-gc-
spec:
  entrypoint: main
  artifactGC:
    strategy: OnWorkflowDeletion # the overall strategy, which can be overridden per artifact
    podMetadata:
      annotations:
        kubernetes.io/resource-type: eci
  templates:
    - name: main
      container:
        image: argoproj/argosay:v2
        command:
          - sh
          - -c
        args:
          - |
            echo "hello world" > /tmp/on-completion.txt
            echo "hello world" > /tmp/on-deletion.txt
      outputs:
        artifacts:
          - name: on-completion # This artifact is reclaimed when the workflow is completed.
            path: /tmp/on-completion.txt
            oss:
              endpoint: http://oss-cn-zhangjiakou-internal.aliyuncs.com
              bucket: my-argo-workflow
              key: on-completion.txt
              accessKeySecret:
                name: my-argo-workflow-credentials
                key: accessKey
              secretKeySecret:
                name: my-argo-workflow-credentials
                key: secretKey
            artifactGC:
              strategy: OnWorkflowCompletion # overriding the default strategy for this artifact
          - name: on-deletion # This artifact is reclaimed when the workflow is deleted.
            path: /tmp/on-deletion.txt
            oss:
              endpoint: http://oss-cn-zhangjiakou-internal.aliyuncs.com
              bucket: my-argo-workflow
              key: on-deletion.txt
              accessKeySecret:
                name: my-argo-workflow-credentials
                key: accessKey
              secretKeySecret:
                name: my-argo-workflow-credentials
                key: secretKey
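The sample above assumes that a Secret named my-argo-workflow-credentials already exists in the workflow's namespace. A minimal sketch of such a Secret, with placeholder values for your own AccessKey pair:
apiVersion: v1
kind: Secret
metadata:
  name: my-argo-workflow-credentials
type: Opaque
stringData: # stringData accepts plain-text values; Kubernetes base64-encodes them on write.
  accessKey: <your-access-key-id>
  secretKey: <your-access-key-secret>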
You can now set a node selector and tolerations directly on a template definition; they are passed through to the pods created for that template. The following sample code provides an example:
apiVersion: argoproj.io/v1alpha1
kind: WorkflowTemplate
metadata:
  name: benchmarks
spec:
  entrypoint: main
  serviceAccountName: workflow
  templates:
    - name: main
      dag:
        tasks:
          - name: benchmark
            template: benchmark
            arguments:
              parameters:
                - name: msg
                  value: 'hello'
      nodeSelector: # The node selector is defined on the template and is passed to its pods.
        pool: workflows
      tolerations: # Tolerations are defined on the template and are passed to its pods.
        - key: pool
          operator: Equal
          value: workflows
    - name: benchmark
      inputs:
        parameters:
          - name: msg
      script:
        image: python:latest
        command:
          - python
        source: |
          print("{{inputs.parameters.msg}}")
Template references can now be resolved dynamically from parameters, which greatly simplifies the structure and reduces the size of YAML orchestration files. The following sample code provides an example:
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: hello-world-wf-global-arg-
  namespace: default
spec:
  entrypoint: whalesay
  arguments:
    parameters:
      - name: global-parameter
        value: hello
  templates:
    - name: whalesay
      steps:
        - - name: hello-world
            templateRef: # Reference a dynamic template in a step.
              name: '{{item.workflow-template}}' # Read the referenced WorkflowTemplate from the loop item.
              template: '{{item.template-name}}' # Read the template name from the loop item.
            withItems: # Define the loop items.
              - { workflow-template: 'hello-world-template-global-arg', template-name: 'hello-world' }
          - name: hello-world-dag
            template: diamond
    - name: diamond
      dag:
        tasks:
          - name: A
            templateRef: # Reference a dynamic template in a DAG.
              name: '{{item.workflow-template}}'
              template: '{{item.template-name}}'
            withItems:
              - { workflow-template: 'hello-world-template-global-arg', template-name: 'hello-world' }
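Note that this workflow assumes a WorkflowTemplate named hello-world-template-global-arg with a hello-world template already exists in the namespace. A minimal sketch of what it might look like:
apiVersion: argoproj.io/v1alpha1
kind: WorkflowTemplate
metadata:
  name: hello-world-template-global-arg
  namespace: default
spec:
  templates:
    - name: hello-world
      container:
        image: alpine:3.6
        command: [sh, -c]
        args: ["echo {{workflow.parameters.global-parameter}}"] # Reads the caller's global parameter.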
More expression functions are available, such as list concatenation (concat) and string joining (join). The following sample code provides an example:
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: test-expression-
  namespace: argo
spec:
  entrypoint: main
  arguments:
    parameters:
      - name: expr
        value: "{{= concat(['a', 'b'], ['c', 'd']) | join('\\n') }}" # Concatenate two lists, then join the elements with newlines.
  templates:
    - name: main
      inputs:
        parameters:
          - name: expr
      script:
        image: alpine:3.6
        command: ["sh"]
        source: |
          echo result: '{{ inputs.parameters.expr }}'
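Here concat merges the two lists into ['a', 'b', 'c', 'd'] and join inserts a newline between the elements, so the script should print "result: a" followed by b, c, and d on separate lines.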
The Argo CLI is the most common way to submit workflows, and templates are used to define standard, reusable processes. Version 3.6 adds the following enhancements to improve template usability.
Templates can now be updated directly through the Argo CLI, so you no longer need to fall back to kubectl. The following sample code provides an example:
argo cron update FILE1 # Update a CronWorkflow
argo template update FILE1 # Update a WorkflowTemplate
argo cluster-template update FILE1 # Update a ClusterWorkflowTemplate
You can use labels to filter templates. This helps you classify templates for management.
argo template list -l app=test # Filter templates according to labels
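The label is ordinary Kubernetes metadata on the template. For example, a (hypothetical) template that the filter above would match:
apiVersion: argoproj.io/v1alpha1
kind: WorkflowTemplate
metadata:
  name: test-template
  labels:
    app: test # Matched by `argo template list -l app=test`.
spec:
  entrypoint: main
  templates:
    - name: main
      container:
        image: alpine:3.6
        command: [sh, -c]
        args: ["echo hello"]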
As a cloud-native batch task orchestration engine, Argo Workflows makes it possible to run many kinds of tasks on Kubernetes and improve business automation. Whether you are an enterprise architect, data scientist, or DevOps engineer, you can use Argo Workflows to improve your work efficiency.
Alibaba Cloud Container Service also provides fully managed Serverless Argo Workflows: https://www.alibabacloud.com/help/en/ack/distributed-cloud-container-platform-for-kubernetes/user-guide/overview-12
It has the following characteristics:
• Easy to use: The core components are fully managed and require no operations and maintenance. A REST API and a Python SDK are provided for easy integration.
• Stable and high performance: The control plane is optimized to support large-scale workflow orchestration, with an overall capacity of up to 40,000 workflows.
• Product-based support: Best practices in many fields are provided to build efficient workflows. You only need to focus on business innovation.
To help you quickly get started with submitting workflows, you are welcome to join our DingTalk group (Group ID: 35688562) for discussion.