DataWorks: Process management

Last Updated: Nov 13, 2024

DataWorks provides a unified, end-to-end process for data development and governance. You can manage the key stages of this process based on your business requirements. This topic describes the process management capabilities that DataWorks supports during data development.

Background information

DataWorks workspaces are classified into workspaces in standard mode and workspaces in basic mode. The node development process varies based on the workspace mode. The following figures show the node development processes in workspaces in standard and basic modes.

  • Node development process in a workspace in standard mode

  • Node development process in a workspace in basic mode

You can manage key stages in a general node development process. For example, you can perform a check before you run a node for debugging, commit a node, or deploy a node, as shown in the preceding figures.

  • Sample check before node running (figure: Code running)

  • Sample check before node committing (figure: Node committing)

  • Sample check before node deployment (figure: Node deployment)

You can use DataWorks services, such as Open Platform and Data Governance Center, to manage key stages in the data development process.

DataWorks service: Data Governance Center

  • Check before node running: Supported

  • Check before node committing: Supported

  • Check before node deployment: Supported

  • Description: Data Governance Center provides multiple built-in check items. You can enable a check item based on your business requirements. This way, when you perform the related operation, the built-in check logic of DataWorks is triggered to check whether the conditions for the operation are met. You can proceed to the subsequent operations in the process only after the check is passed.

DataWorks service: Open Platform

  • Check before node running: Supported

  • Check before node committing: Supported

  • Check before node deployment: Supported

  • Description: If the built-in check items cannot meet your process management requirements, you can use Open Platform to register and develop programs as DataWorks extensions to check related events and add the check events to the overall data development process.

The following sections use the data development process in a workspace in standard mode as an example to describe the process management capabilities.

Enable built-in check items provided by Data Governance Center

Data Governance Center provides multiple built-in check items that you can enable based on your business requirements. When you perform an operation that an enabled check item covers, the built-in check logic of DataWorks verifies that the conditions for the operation are met. You can proceed to the next stage of the process only after the check passes.

DataWorks service: Data Governance Center

Data Governance Center provides multiple built-in check items. You can enable the check items to check whether the conditions for specific operations are met. Check items are available for the following stages:

  • Before node running: You can enable check items for which the Effective Point parameter is set to Pre-event for Code Running.

  • Before node committing: You can enable check items for which the Effective Point parameter is set to Pre-event for Node Commit.

  • Before node deployment: You can enable check items for which the Effective Point parameter is set to Pre-event for Node Deployment.
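The gating behavior described above can be sketched as a small model: each check item has an Effective Point, and an operation proceeds only if every enabled item for that point passes. This is an illustrative sketch, not the DataWorks implementation. The `CheckItem` class, the rule names, and the node dictionary are hypothetical; only the three Effective Point labels come from the text.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Effective Point values as named in the text above.
PRE_RUN = "Pre-event for Code Running"
PRE_COMMIT = "Pre-event for Node Commit"
PRE_DEPLOY = "Pre-event for Node Deployment"


@dataclass
class CheckItem:
    """Hypothetical stand-in for a built-in check item."""
    name: str
    effective_point: str
    enabled: bool
    check: Callable[[Dict], bool]  # returns True when the node passes


def failed_checks(items: List[CheckItem], effective_point: str, node: Dict) -> List[str]:
    """Return the names of enabled items at this point that the node fails."""
    return [
        item.name
        for item in items
        if item.enabled
        and item.effective_point == effective_point
        and not item.check(node)
    ]


def can_proceed(items: List[CheckItem], effective_point: str, node: Dict) -> bool:
    """The operation continues only when no enabled check fails."""
    return not failed_checks(items, effective_point, node)


# Hypothetical rules, for illustration only.
items = [
    CheckItem("has_where_clause", PRE_RUN, True, lambda n: "where" in n["code"].lower()),
    CheckItem("owner_is_set", PRE_COMMIT, True, lambda n: bool(n.get("owner"))),
    CheckItem("description_is_set", PRE_DEPLOY, False, lambda n: bool(n.get("description"))),
]

node = {"code": "SELECT * FROM orders", "owner": "alice"}
```

In this model, a disabled item never blocks an operation, which mirrors the idea that only the check items you enable take effect.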

Entry point for configuring check items and guidance

You need to enable check items in Data Governance Center and specify the workspace in which the check items that you enabled take effect.

For more information, see Configure governance items.

Develop custom check logic in Open Platform

If the built-in check items cannot meet your process management requirements, you can use Open Platform to develop programs and register them as DataWorks extensions. The extensions check the related events, and the checks become part of the overall data development process. The following table describes the capability of developing custom check logic, using a check before a node is run for debugging as the example.

DataWorks service: Open Platform

DataWorks Open Platform provides the following modules: OpenAPI, OpenEvent, and Extensions. You can use the OpenEvent module to subscribe to event messages generated for the operations that you perform on the DataWorks DataStudio page, use the Extensions module to create an extension that processes the event messages, and then use the OpenAPI module to send the processing results to DataWorks. For information about the OpenEvent and Extensions modules, see the Overview topic of each module.

Check process

If you use the features provided by Open Platform to subscribe to event messages generated for specific operations that you perform on the DataStudio page and create an extension to process the event messages, a check is triggered when a specific operation is performed. The following figure shows a check process before node running.

Figure: Check process before node running
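The same flow can be sketched in code: an extension receives an event message, applies custom check logic, and reports the result back. This is a minimal sketch under stated assumptions. The message fields (`message_id`, `code`), the result labels, and the `send_result` callback are hypothetical placeholders. In a real extension, the callback would wrap the OpenAPI call that returns the processing result to DataWorks.

```python
import json
from typing import Callable, Dict, List, Tuple

# Hypothetical result labels; the real values are defined by Open Platform.
PASS, FAIL = "OK", "FAIL"


def check_code(code: str) -> Tuple[str, str]:
    """Example custom rule: reject code that drops a table."""
    if "drop table" in code.lower():
        return FAIL, "DROP TABLE is not allowed in debugging runs."
    return PASS, ""


def handle_event(raw_message: str, send_result: Callable[[str, str, str], None]) -> str:
    """Process one event message and report the check result.

    send_result stands in for the OpenAPI call that returns the
    processing result to DataWorks (assumed signature).
    """
    event: Dict = json.loads(raw_message)
    result, reason = check_code(event.get("code", ""))
    send_result(event["message_id"], result, reason)
    return result


# Usage: record results locally instead of calling a remote API.
sent: List[Tuple[str, str, str]] = []


def record_result(message_id: str, result: str, reason: str) -> None:
    sent.append((message_id, result, reason))


handle_event(json.dumps({"message_id": "m-1", "code": "DROP TABLE t"}), record_result)
handle_event(json.dumps({"message_id": "m-2", "code": "SELECT 1"}), record_result)
```

Passing the reporting step in as a callback keeps the check logic testable without network access.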

Entry point for configuration and guidance

In Open Platform, you need to subscribe to event messages generated for the operations that you perform on the DataStudio page, develop an extension that is used to process the event messages, publish the extension to DataWorks, and then specify the workspace in which the extension is enabled.

  • Before node running: You need to subscribe to events related to node running, such as file execution pre-events.

  • Before node committing: You need to subscribe to events related to node committing, such as file commit pre-events and table commit pre-events.

  • Before node deployment: You need to subscribe to events related to node deployment, such as file deployment pre-events and table deployment pre-events.

For more information about the event types supported by Open Platform, see Overview.
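The stage-to-event mapping listed above can be kept as a simple lookup table when you plan subscriptions. The sketch below is illustrative only. The stage keys and event labels are hypothetical identifiers derived from the wording in this topic, not the event type codes that Open Platform uses.

```python
from typing import Dict, List, Optional

# Hypothetical labels derived from the list above, not real event type codes.
SUBSCRIPTIONS: Dict[str, List[str]] = {
    "before_node_running": ["file_execution_pre_event"],
    "before_node_committing": ["file_commit_pre_event", "table_commit_pre_event"],
    "before_node_deployment": ["file_deployment_pre_event", "table_deployment_pre_event"],
}


def events_to_subscribe(stages: List[str]) -> List[str]:
    """Return the de-duplicated event labels for the given stages, in order."""
    events: List[str] = []
    for stage in stages:
        for event in SUBSCRIPTIONS[stage]:
            if event not in events:
                events.append(event)
    return events


def stage_for_event(event: str) -> Optional[str]:
    """Return the stage that an event label belongs to, or None if unknown."""
    for stage, events in SUBSCRIPTIONS.items():
        if event in events:
            return stage
    return None
```

An extension that handles several stages could use `stage_for_event` to route each incoming message to the matching check logic.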