By Shuangkun Tian
Argo Workflows is a CNCF (Cloud Native Computing Foundation) graduated project designed for orchestrating parallel jobs on Kubernetes. This article focuses on the key new features of the latest release, Argo Workflows 3.6.
Argo Workflows implements each task in a workflow as a separate container instance, which makes it lightweight, scalable, and highly parallelizable.
Argo Workflows is mainly used in the following scenarios:
• Batch processing: Argo Workflows provides a declarative way to define and execute large-scale data processing tasks that run periodically or on demand, such as ETL jobs and data analysis report generation.
• Machine learning pipelines: Argo Workflows can coordinate the data preprocessing, model training, validation, parameter tuning, and deployment steps of a machine learning project. It also uses the resource scheduling capability of Kubernetes to efficiently allocate GPUs and other resources for large-scale parallel computing.
• Infrastructure automation: When managing cloud-native infrastructure, Argo Workflows can be used to perform a series of automated tasks, such as creating and configuring Kubernetes resources, performing backup and recovery operations, and monitoring system health status.
• CI/CD: Continuous integration and continuous deployment processes typically include multiple phases such as code building, testing, and deployment. Argo Workflows integrates these steps to automate the pipeline and improve the speed and quality of software delivery.
The Alibaba Cloud Container Service team is one of the early adopters of Argo Workflows in China. In production practice, the team resolved numerous performance bottlenecks and contributed many features back to the community. For version 3.6, these contributions particularly enhance the usability and stability of the core controller and include automatic offloading of large parameters, template scheduling constraints, parallel parsing of large flat workflows, streaming of OSS files, garbage collection of OSS artifacts, parallel pod cleanup on retry, and dynamic template references, among others. Together, they improve the stability, usability, and performance of the workflow engine.
This article will provide an in-depth analysis of the key new features of Argo Workflows 3.6.
Cron Workflows are one of the most commonly used features of Argo Workflows: they let you trigger workflows on a custom schedule. Version 3.6 brings several enhancements:
• Multiple cron schedules: You can combine multiple cron expressions within a single CronWorkflow.
• Stop strategy: You can stop scheduling when a condition is met, which prevents failed workflows from accumulating in the cluster when a scheduled workflow keeps failing.
• When expression: Before each scheduled run, the system evaluates the expression and skips the run if it is false, providing a more flexible mechanism to combine with the cron schedules.
Together, these options let you compose flexible scheduling policies.
Sample code:
apiVersion: argoproj.io/v1alpha1
kind: CronWorkflow
metadata:
  name: cron-workflow-example
spec:
  schedules: # Multiple schedules: run every 3 minutes and every 5 minutes; when they coincide (every 15 minutes), only one workflow is started.
    - "*/3 * * * *"
    - "*/5 * * * *"
  concurrencyPolicy: "Allow"
  stopStrategy: # Stop scheduling once 10 child workflows have failed.
    expression: "cronworkflow.failed >= 10"
  # Use an expression to ensure that at least 3600 seconds pass between two executions.
  when: "{{= cronworkflow.lastScheduledTime == nil || (now() - cronworkflow.lastScheduledTime).Seconds() > 3600 }}"
  startingDeadlineSeconds: 0
  workflowSpec:
    entrypoint: whalesay
    templates:
      - name: whalesay
        container:
          image: alpine:3.6
          command: [sh, -c]
          args: ["date; sleep 1"]
The Argo UI is an important part of the workflow experience: after you submit a workflow, you can view its status in the UI. Version 3.6 improves the display of workflow details and times, shows the directory used for output artifacts, and adds Markdown support for workflow titles and descriptions. In addition, you can now view the execution history and real-time logs of CronWorkflows and WorkflowTemplates.
The UI shows the directory used for input artifacts
These enhancements improve the usability and observability of the user interface, helping you better track the status of your workflows.
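For example, the UI renders Markdown in workflow titles and descriptions. A minimal sketch, assuming the standard workflows.argoproj.io/title and workflows.argoproj.io/description annotation keys (the names and links below are illustrative):
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: markdown-demo-
  annotations:
    workflows.argoproj.io/title: '**Nightly ETL** run' # Rendered as Markdown in the workflow list.
    workflows.argoproj.io/description: 'Loads daily data; see the [runbook](https://example.com/runbook).'
spec:
  entrypoint: main
  templates:
    - name: main
      container:
        image: alpine:3.6
        command: [sh, -c]
        args: ["echo done"]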
The controller is the most critical component of Argo Workflows, so its stability and high performance are essential. Version 3.6 brings major enhancements in three areas: scale, stability, and security:
• Use queues to archive workflows. This improves memory management when archiving a large number of workflows simultaneously.
• Clean up pods in parallel. This is useful when retrying large workflows, allowing the retry to complete within the expected time window.
• Add a Kubernetes finalizer to pods. This prevents pods from being deleted before the controller has recorded their status, avoiding "pod deleted" errors and making reconciliation more reliable.
• Parse large flat workflows in parallel. This speeds up the parsing of large workflows.
• Automatically offload large parameters. This supports longer input parameters, which is beneficial for large-scale scientific simulation scenarios.
• Automatically set the seccomp profile to RuntimeDefault. This enhances container security and reduces the risk of attacks.
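To illustrate the last item: pods created by the controller now carry a security context equivalent to the following snippet (a sketch of the effective default, which individual workflows can still override):
securityContext:
  seccompProfile:
    type: RuntimeDefault # Restricts the pod to the container runtime's default seccomp profile.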
These capabilities are enabled by default when the controller starts. In addition, there are several functional enhancements:
You can configure the artifact garbage collection (GC) policy to reclaim the intermediate files of the workflow on Alibaba Cloud OSS when the workflow is completed or deleted. This saves storage costs. The following sample code provides an example:
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: artifact-gc-
spec:
  entrypoint: main
  artifactGC:
    strategy: OnWorkflowDeletion # the overall strategy, which can be overridden per artifact
    podMetadata:
      annotations:
        kubernetes.io/resource-type: eci
  templates:
    - name: main
      container:
        image: argoproj/argosay:v2
        command:
          - sh
          - -c
        args:
          - |
            echo "hello world" > /tmp/on-completion.txt
            echo "hello world" > /tmp/on-deletion.txt
      outputs:
        artifacts:
          - name: on-completion # This artifact is reclaimed when the workflow is completed.
            path: /tmp/on-completion.txt
            oss:
              endpoint: http://oss-cn-zhangjiakou-internal.aliyuncs.com
              bucket: my-argo-workflow
              key: on-completion.txt
              accessKeySecret:
                name: my-argo-workflow-credentials
                key: accessKey
              secretKeySecret:
                name: my-argo-workflow-credentials
                key: secretKey
            artifactGC:
              strategy: OnWorkflowCompletion # overriding the default strategy for this artifact
          - name: on-deletion # This artifact is reclaimed when the workflow is deleted.
            path: /tmp/on-deletion.txt
            oss:
              endpoint: http://oss-cn-zhangjiakou-internal.aliyuncs.com
              bucket: my-argo-workflow
              key: on-deletion.txt
              accessKeySecret:
                name: my-argo-workflow-credentials
                key: accessKey
              secretKeySecret:
                name: my-argo-workflow-credentials
                key: secretKey
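The sample above assumes that a Secret named my-argo-workflow-credentials already exists in the workflow's namespace. A minimal sketch of such a Secret, with placeholder values for your own AccessKey pair:
apiVersion: v1
kind: Secret
metadata:
  name: my-argo-workflow-credentials
type: Opaque
stringData: # stringData accepts plain-text values; Kubernetes base64-encodes them on write.
  accessKey: <your-access-key-id>
  secretKey: <your-access-key-secret>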
You can now set a node selector and tolerations directly on a template definition; they are passed through to the pods created for that template. The following sample code provides an example:
apiVersion: argoproj.io/v1alpha1
kind: WorkflowTemplate
metadata:
  name: benchmarks
spec:
  entrypoint: main
  serviceAccountName: workflow
  templates:
    - name: main
      dag:
        tasks:
          - name: benchmark
            template: benchmark
            arguments:
              parameters:
                - name: msg
                  value: 'hello'
      nodeSelector: # The node selector is defined on the template and is passed to its pods.
        pool: workflows
      tolerations: # Tolerations are defined on the template and are passed to its pods.
        - key: pool
          operator: Equal
          value: workflows
    - name: benchmark
      inputs:
        parameters:
          - name: msg
      script:
        image: python:latest
        command:
          - python
        source: |
          print("{{inputs.parameters.msg}}")
Template references can now be resolved dynamically from parameters, which greatly simplifies the structure and reduces the size of YAML orchestration files. The following sample code provides an example:
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: hello-world-wf-global-arg-
  namespace: default
spec:
  entrypoint: whalesay
  arguments:
    parameters:
      - name: global-parameter
        value: hello
  templates:
    - name: whalesay
      steps:
        - - name: hello-world
            templateRef: # Reference a dynamic template in a step.
              name: '{{item.workflow-template}}' # Read the referenced WorkflowTemplate from the loop item.
              template: '{{item.template-name}}' # Read the template name from the loop item.
            withItems: # Define the loop items.
              - { workflow-template: 'hello-world-template-global-arg', template-name: 'hello-world' }
          - name: hello-world-dag
            template: diamond
    - name: diamond
      dag:
        tasks:
          - name: A
            templateRef: # Reference a dynamic template in a DAG.
              name: '{{item.workflow-template}}'
              template: '{{item.template-name}}'
            withItems:
              - { workflow-template: 'hello-world-template-global-arg', template-name: 'hello-world' }
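Note that this workflow assumes a WorkflowTemplate named hello-world-template-global-arg with a hello-world template already exists in the namespace. A minimal sketch of what it might look like:
apiVersion: argoproj.io/v1alpha1
kind: WorkflowTemplate
metadata:
  name: hello-world-template-global-arg
  namespace: default
spec:
  templates:
    - name: hello-world
      container:
        image: alpine:3.6
        command: [sh, -c]
        args: ["echo {{workflow.parameters.global-parameter}}"] # Reads the caller's global parameter.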
More expression functions are available, such as list concatenation (concat) and string joining (join). The following sample code provides an example:
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: test-expression-
  namespace: argo
spec:
  entrypoint: main
  arguments:
    parameters:
      - name: expr
        value: "{{= concat(['a', 'b'], ['c', 'd']) | join('\\n') }}" # Concatenate two lists, then join the elements with newlines.
  templates:
    - name: main
      inputs:
        parameters:
          - name: expr
      script:
        image: alpine:3.6
        command: ["sh"]
        source: |
          echo result: '{{ inputs.parameters.expr }}'
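Here concat merges the two lists into ['a', 'b', 'c', 'd'] and join inserts a newline between the elements, so the script should print "result: a" followed by b, c, and d on separate lines.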
The Argo CLI is the most common way to submit workflows, and templates are used to define standard, reusable processes. Version 3.6 adds the following enhancements to improve template usability.
Templates can now be updated directly through the Argo CLI, so you no longer need to fall back to kubectl. The following sample code provides an example:
argo cron update FILE1 # Update a CronWorkflow
argo template update FILE1 # Update a WorkflowTemplate
argo cluster-template update FILE1 # Update a ClusterWorkflowTemplate
You can use labels to filter templates. This helps you classify templates for management.
argo template list -l app=test # Filter templates according to labels
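The label is ordinary Kubernetes metadata on the template. For example, a (hypothetical) template that the filter above would match:
apiVersion: argoproj.io/v1alpha1
kind: WorkflowTemplate
metadata:
  name: test-template
  labels:
    app: test # Matched by `argo template list -l app=test`.
spec:
  entrypoint: main
  templates:
    - name: main
      container:
        image: alpine:3.6
        command: [sh, -c]
        args: ["echo hello"]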
As a cloud-native batch task orchestration engine, Argo Workflows makes it possible to run many kinds of tasks on Kubernetes and improve business automation. Whether you are an enterprise architect, data scientist, or DevOps engineer, you can use Argo Workflows to improve your work efficiency.
Alibaba Cloud Container Service also provides fully managed Serverless Argo Workflows: https://www.alibabacloud.com/help/en/ack/distributed-cloud-container-platform-for-kubernetes/user-guide/overview-12
It has the following characteristics:
• Easy to use: The core components are fully managed and require no operations and maintenance. A REST API and a Python SDK are provided for easy integration.
• Stable and high performance: The control plane is optimized to support large-scale workflow orchestration, with an overall capacity of up to 40,000 workflows.
• Product-based support: Best practices in many fields are provided to build efficient workflows. You only need to focus on business innovation.
To help you quickly get started with submitting workflows, you are welcome to join our DingTalk group (Group ID: 35688562) for discussion.