DataWorks: Configure input and output parameters

Last Updated: Nov 13, 2024

Input and output parameters pass parameter values between ancestor and descendant nodes. This topic describes how to define and use input and output parameters.

Background information

After you define an output parameter and its value for a node, you can define an input parameter for a descendant node and configure that input parameter to reference the value of the output parameter.

Precautions

An output parameter defined for an ancestor node can be used as an input parameter of a descendant node. However, the query results of the ancestor node cannot be passed directly to the descendant node. To make a node use the query results of its ancestor node, pass the data through an assignment node. For more information, see Configure an assignment node. Some node types also support the assignment parameter, which works in the same way as an assignment node. For more information about how to configure the assignment parameter, see the "Special scenario: assignment parameter" section in this topic and Configure an assignment node.

Configure input and output parameters

  1. Log on to the DataWorks console. In the top navigation bar, select the desired region. In the left-side navigation pane, choose Data Development and Governance > Data Development. On the page that appears, select the desired workspace from the drop-down list and click Go to Data Development.

  2. In the Scheduled Workflow pane of the DataStudio page, double-click a desired node to go to the configuration tab of the node.

  3. Click the Properties tab in the right-side navigation pane. On the Properties tab, configure the input parameters and output parameters in the Input and Output Parameters section.

    For more information about how to configure scheduling parameters for different types of nodes, see Configure scheduling parameters for different types of nodes.

Configure output parameters

Common scenario

You can define output parameters in the Input and Output Parameters section. The following value types are supported for output parameters: Constant and Variable.

After you define an output parameter for the current node and commit the current node, the output parameter can be referenced as an input parameter for a descendant node of the current node.

Note

You cannot write code for the current node to assign a value to the defined output parameter.

| Field | Description | Remarks |
| --- | --- | --- |
| No. | The serial number, which is automatically increased by 1 each time you define an output parameter. | N/A. |
| Parameter Name | The name of the output parameter. | N/A. |
| Type | The value type of the output parameter. | Valid values: Constant and Variable. |
| Value | The value of the output parameter. | The value type can be Constant or Variable. If the Type field is set to Constant, the value is a fixed string. If the Type field is set to Variable, the value can be a global variable, a built-in scheduling parameter, or a custom scheduling parameter in the ${...} or $[...] format. |
| Description | The description of the output parameter. | N/A. |
| Add Method | The way in which the output parameter is defined. | Valid values: Added Automatically, Auto Parse, and Added Manually. |
| Actions | You can click Change or Delete to perform the related operation. | These two operations are not supported if a node depends on the current node. Before you configure a node to reference the output parameter of the current node, make sure that the output parameter is correctly defined on the current node. |
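
As a rough illustration of the difference between the two value types, the following Python sketch models a Constant value being passed through unchanged and a Variable value being looked up at run time. It is not DataWorks code: the `OutputParam` class, the `resolve` function, and the `scheduling_values` dictionary are hypothetical names, the sample values are made up, and only the ${...} form is modeled (the $[...] form is left out).

```python
import re
from dataclasses import dataclass

# Matches a ${name} placeholder. The $[...] form is deliberately not modeled.
PLACEHOLDER = re.compile(r"\$\{(\w+)\}")


@dataclass
class OutputParam:
    """Hypothetical stand-in for one row of the Output Parameters table."""
    name: str
    value_type: str  # "Constant" or "Variable"
    value: str


def resolve(param: OutputParam, scheduling_values: dict) -> str:
    """Return the value a descendant node would see (simplified model)."""
    if param.value_type == "Constant":
        # A constant is a fixed string and is passed through unchanged.
        return param.value
    if param.value_type == "Variable":
        # Replace each ${name} with the value known at scheduling time.
        return PLACEHOLDER.sub(
            lambda match: scheduling_values[match.group(1)], param.value
        )
    raise ValueError(f"unsupported value type: {param.value_type}")


region = OutputParam("region", "Constant", "cn-hangzhou")
data_day = OutputParam("data_day", "Variable", "${bizdate}")
scheduling_values = {"bizdate": "20241112"}  # assumed example value

print(resolve(region, scheduling_values))
print(resolve(data_day, scheduling_values))
```

In this model a Variable that names an unknown placeholder raises a `KeyError`, which is one way to surface a misconfigured parameter early.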

Special scenario: assignment parameter

If the current node supports the assignment parameter and you want to pass the query results of the current node to its descendant node as a parameter, perform the following operations:

  1. Go to the configuration tab of the current node.

  2. Click the Properties tab in the right-side navigation pane.

  3. In the Output Parameters table in the Input and Output Parameters section, click Add assignment parameter.

    The outputs assignment parameter is added.

Note
  • The following types of nodes support the assignment parameter: EMR Hive, EMR Spark SQL, ODPS Script, Hologres SQL, AnalyticDB for PostgreSQL, and MySQL.

  • The assignment parameter is supported only in DataWorks Standard Edition or a more advanced edition.

After you click Add assignment parameter to add the outputs assignment parameter, a descendant node of the current node can reference the outputs parameter to obtain the query results of the current node. If the query results of the current node are empty, the current node is not blocked. However, the descendant node that references the outputs assignment parameter of the current node may fail to run.

You must add the assignment parameter of the current node as an input parameter of a descendant node of the current node. The assignment parameter can be referenced by using a two-dimensional array in the code. The assignment parameter functions the same as an assignment node that uses ODPS SQL. For more information, see Configure an assignment node.
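
To make the two-dimensional shape concrete, the following Python sketch treats the query results as a list of rows, where each row is a list of column values. The sample data and the `read_cell` helper are hypothetical, and the sketch does not show how DataWorks itself exposes the array to node code. The guard for an empty result reflects the earlier warning that a descendant node may fail when the ancestor returns no rows.

```python
def read_cell(outputs, row, col):
    """Return one cell of a two-dimensional result, failing loudly if it is empty."""
    if not outputs:
        raise ValueError("the ancestor query returned no rows")
    return outputs[row][col]


# Assumed example: two rows, two columns, all values as strings.
outputs = [
    ["2024-11-12", "42"],
    ["2024-11-13", "57"],
]

first_row = outputs[0]                    # the whole first row
second_count = read_cell(outputs, 1, 1)   # second row, second column
print(first_row, second_count)
```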

Configure input parameters

You can define an input parameter for a node to reference an output parameter defined for its ancestor node. You use the input parameter in the node code in the same way as a scheduling parameter that is assigned to a variable. After you configure scheduling dependencies so that the current node depends on an ancestor node, you can reference an output parameter of that ancestor node as an input parameter of the current node in the Input Parameters section.

  1. Configure dependencies.

    In the Dependencies section of the Properties tab on the configuration tab of a desired node, add an ancestor node for the current node.

  2. Define an output parameter for the ancestor node of the current node.

    In the Input and Output Parameters section, define an output parameter for the ancestor node of the current node. For more information, see the Configure output parameters section in this topic.

  3. Define an input parameter for the current node.

    In the Input Parameters table in the Input and Output Parameters section, define an input parameter and configure this parameter to reference the output parameter of the ancestor node of the current node.

    | Field | Description | Remarks |
    | --- | --- | --- |
    | No. | The serial number, which is automatically increased by 1 each time you define an input parameter. | N/A. |
    | Parameter Name | The name of the input parameter. | N/A. |
    | Value Source | The value source of the input parameter. | The value of the input parameter is the value of the output parameter of the ancestor node. |
    | Description | The description of the input parameter. | N/A. |
    | Parent Node ID | The ID of the ancestor node. | The system parses the ID of the ancestor node. |
    | Add Method | The way in which the input parameter is defined. | Valid values: Added Automatically, Auto Parse, and Added Manually. |
    | Actions | You can click Change or Delete to perform the related operation. | N/A. |
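
The three steps above (add the dependency, define the output parameter on the ancestor node, then define the input parameter on the current node) can be sketched as a small in-memory model. This is only an illustration of the ordering constraint, not the DataWorks API: the `Node` class and the `add_input` and `resolve_inputs` functions are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)  # compare nodes by identity, not by field values
class Node:
    """Hypothetical model of a node with dependencies and parameters."""
    name: str
    ancestors: list = field(default_factory=list)
    outputs: dict = field(default_factory=dict)  # output name -> value
    inputs: dict = field(default_factory=dict)   # input name -> (ancestor, output name)


def add_input(node, input_name, ancestor, output_name):
    """Bind an input parameter to an output parameter of an ancestor node."""
    if ancestor not in node.ancestors:
        raise ValueError(f"{node.name} does not depend on {ancestor.name}")
    if output_name not in ancestor.outputs:
        raise KeyError(f"{ancestor.name} has no output parameter {output_name!r}")
    node.inputs[input_name] = (ancestor, output_name)


def resolve_inputs(node):
    """Look up the current value of every input parameter."""
    return {
        name: ancestor.outputs[output_name]
        for name, (ancestor, output_name) in node.inputs.items()
    }


# Step 2: the ancestor node defines an output parameter.
parent = Node("parent", outputs={"region": "cn-hangzhou"})
# Step 1: the current node depends on the ancestor node.
child = Node("child", ancestors=[parent])
# Step 3: the current node references the ancestor's output parameter.
add_input(child, "in_region", parent, "region")

print(resolve_inputs(child))
```

The model rejects a reference to a node that is not a configured ancestor, which mirrors the requirement to configure dependencies before defining the input parameter.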

Use the input parameter

You reference a defined input parameter in the code of a node in the same way as a built-in scheduling parameter, by using the ${Input parameter name} format. The following figure shows how to reference an input parameter in the code of a Shell node.

[Figure: an input parameter referenced in the code of a Shell node]

Note

You must run the workflow that contains both the ancestor node and the current node so that the two nodes run in sequence. Otherwise, the input parameter that is defined for the current node is invalid.
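
The ${Input parameter name} format can be illustrated with Python's standard `string.Template`, which uses the same `${name}` placeholder syntax. This is a simulation of the substitution only, not how DataWorks performs it: the `render_node_code` function, the `input_from_parent` parameter name, and the sample value are assumptions made for the example.

```python
from string import Template


def render_node_code(code, input_params):
    """Replace each ${name} placeholder with the matching input parameter value.

    Raises KeyError if a placeholder has no value, for example because the
    ancestor node has not produced its output yet.
    """
    return Template(code).substitute(input_params)


# Assumed Shell node code that references one input parameter.
shell_code = 'echo "value from ancestor: ${input_from_parent}"'

rendered = render_node_code(shell_code, {"input_from_parent": "cn-hangzhou"})
print(rendered)
```

Calling `render_node_code(shell_code, {})` raises a `KeyError`, which loosely corresponds to the invalid input parameter described in the note above.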

Supported built-in scheduling parameters

  • Built-in scheduling parameters

    | Built-in scheduling parameter | Description |
    | --- | --- |
    | ${projectId} | The ID of the MaxCompute project. |
    | ${projectName} | The name of the MaxCompute project. |
    | ${nodeId} | The ID of the node. |
    | ${gmtdate} | The date on which the node instance is scheduled to run, in the yyyy-MM-dd 00:00:00 format. |
    | ${taskId} | The ID of the node instance. |
    | ${seq} | The sequence number of the node instance, which indicates the ranking of this node instance among all node instances on the same day. |
    | ${cyctime} | The time at which the node instance is scheduled to run. |
    | ${status} | The status of the node instance. Valid values: SUCCESS and FAILURE. |
    | ${bizdate} | The data timestamp. |
    | ${finishTime} | The time at which the node instance finishes running. |
    | ${taskType} | The mode in which the node instance is run. Valid values: NORMAL, MANUAL, PAUSE, SKIP, UNCHOOSE, and SKIP_CYCLE. |
    | ${nodeName} | The name of the node. |

  • For more information about other parameters, see Supported formats of scheduling parameters.
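
As a small example of consuming one of these values, the following Python sketch parses a string in the yyyy-MM-dd 00:00:00 format documented for ${gmtdate}. The sample value and the `parse_gmtdate` helper are hypothetical, and the sketch assumes the value has already been substituted into the code as a plain string.

```python
from datetime import datetime

# Python equivalent of the yyyy-MM-dd HH:mm:ss pattern.
GMTDATE_FORMAT = "%Y-%m-%d %H:%M:%S"


def parse_gmtdate(value):
    """Parse a gmtdate string and return the scheduled date."""
    parsed = datetime.strptime(value, GMTDATE_FORMAT)
    # The documented format always carries a 00:00:00 time part.
    if (parsed.hour, parsed.minute, parsed.second) != (0, 0, 0):
        raise ValueError("gmtdate is expected to end with 00:00:00")
    return parsed.date()


run_date = parse_gmtdate("2024-11-13 00:00:00")  # assumed example value
print(run_date.isoformat())
```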