DataWorks provides a unified process for end-to-end data development and governance, and lets you manage the key stages of that process based on your business requirements. This topic describes the process management capabilities that DataWorks supports during data development.
Background information
A DataWorks workspace runs in either standard mode or basic mode, and the node development process differs between the two modes. The following figures show the node development process in each mode.
Node development process in a workspace in standard mode
Node development process in a workspace in basic mode
You can manage key stages in a general node development process. For example, you can perform a check before you run a node for debugging, commit a node, or deploy a node, as shown in the preceding figures.
Sample check before node running | Sample check before node committing | Sample check before node deployment |
You can use DataWorks services, such as Open Platform and Data Governance Center, to manage key stages in the data development process.
DataWorks service | Check before node running | Check before node committing | Check before node deployment | Description |
Data Governance Center | Data Governance Center provides multiple built-in check items. You can enable a check item based on your business requirements. This way, when you perform the related operation, the built-in check logic of DataWorks is triggered to check whether the conditions for the operation are met. You can proceed to the subsequent operations in the process only after the check is passed. |
Open Platform | If the built-in check items cannot meet your process management requirements, you can use Open Platform to register and develop programs as DataWorks extensions to check related events and add the check events to the overall data development process. |
The following sections use the data development process in a workspace in standard mode as an example to describe the process management capabilities.
Enable built-in check items provided by Data Governance Center
Data Governance Center provides multiple built-in check items. You can enable a check item based on your business requirements. This way, when you perform the related operation, the built-in check logic of DataWorks is triggered to check whether the conditions for the operation are met. You can proceed to the subsequent operations in the process only after the check is passed.
Item | Description |
DataWorks service | Data Governance Center. It provides multiple built-in check items. You can enable the check items to check whether the conditions for specific operations are met. |
Entry point for configuring check items and guidance | Enable check items in Data Governance Center and specify the workspace in which the enabled check items take effect. For more information, see Configure governance items. |
Develop custom check logic in Open Platform
If the built-in check items cannot meet your process management requirements, you can use Open Platform to register and develop programs as DataWorks extensions to check related events and add the check events to the overall data development process. The following table describes the capability of developing custom check logic that is used to perform a check before a node is run for debugging.
Item | Description |
DataWorks service | Open Platform. DataWorks Open Platform provides the following modules: OpenAPI, OpenEvent, and Extensions. You can use the OpenEvent module to subscribe to the event messages that are generated by operations on the DataWorks DataStudio page, use the Extensions module to create an extension that processes the event messages, and then use the OpenAPI module to send the processing results back to DataWorks. For information about the OpenEvent and Extensions modules, see the Overview topic of each module. |
Check process | After you use Open Platform to subscribe to the event messages generated by specific operations on the DataStudio page and create an extension to process those messages, a check is triggered each time one of the operations is performed. The following figure shows the check process before a node is run. |
Entry point for configuration and guidance | In Open Platform, you need to subscribe to event messages generated for the operations that you perform on the DataStudio page, develop an extension that is used to process the event messages, publish the extension to DataWorks, and then specify the workspace in which the extension is enabled. For more information about the event types supported by Open Platform, see Overview. |
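The check flow described in the preceding table can be sketched as follows. This is a minimal illustration, not the DataWorks API: the event field names (`messageId`, `nodeContent`), the keyword rule, and the `report` callback are all hypothetical. In a real extension, the event would arrive through an OpenEvent subscription and the result would be returned through the OpenAPI module; refer to the Open Platform documentation for the actual payload format and API operations.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass
class CheckResult:
    """Outcome of a custom check (hypothetical structure)."""
    message_id: str
    passed: bool
    tip: str


# Hypothetical rule: block debugging runs that contain destructive statements.
FORBIDDEN_KEYWORDS: Tuple[str, ...] = ("drop table", "truncate table")


def check_before_run(event: Dict[str, str]) -> CheckResult:
    """Inspect a pre-run event and decide whether the run may proceed.

    The keys "messageId" and "nodeContent" are assumptions made for this
    sketch; they are not taken from the DataWorks event specification.
    """
    message_id = event["messageId"]
    # Lowercase and collapse whitespace so "DROP   TABLE" still matches.
    code = " ".join(event.get("nodeContent", "").lower().split())
    hits = [kw for kw in FORBIDDEN_KEYWORDS if kw in code]
    if hits:
        return CheckResult(message_id, False, "Blocked statements: " + ", ".join(hits))
    return CheckResult(message_id, True, "Check passed")


def handle_event(event: Dict[str, str],
                 report: Callable[[CheckResult], None]) -> CheckResult:
    """Run the check, then hand the result to `report`.

    `report` stands in for the call that sends the processing result
    back to DataWorks; it is injected so the logic can be tested offline.
    """
    result = check_before_run(event)
    report(result)
    return result
```

Passing the reporting step in as a callback keeps the check logic independent of any SDK, so the rule can be unit tested before the extension is published and enabled in a workspace.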