After a batch synchronization node that is created in DataStudio is committed and deployed to the production environment, you can go to Operation Center to manage the batch synchronization node, monitor the status of the node, change the resource group that is used to run the node, and view the run logs of the node. This ensures that the synchronization node can be run as expected. This topic describes the common O&M operations that you can perform on a batch synchronization node.
Prerequisites
A batch synchronization node is created, deployed, and run as expected. For more information, see Configure a batch synchronization node by using the codeless UI and Configure a batch synchronization node by using the code editor.
Usage notes
- The O&M operations that can be performed on batch synchronization nodes are the same as the O&M operations that can be performed on other types of auto triggered nodes. This topic describes how to perform common O&M operations on batch synchronization nodes. For more information about O&M for auto triggered nodes, see O&M overview of auto triggered nodes.
- To ensure that a batch synchronization node can be run as expected after you deploy the node, you can go to the Cycle Task page in Operation Center in the production environment to check whether the configurations of the node in the production environment meet your requirements. The configurations include the code of the node and the resource groups for scheduling and for Data Integration used to run the node.
- Batch synchronization nodes are issued to a resource group for Data Integration by using a resource group for scheduling. Therefore, execution of batch synchronization nodes requires both a resource group for Data Integration and a resource group for scheduling. If you use an exclusive resource group for scheduling, you are charged for scheduling instances. For more information, see Mechanism for issuing nodes.
- Workspaces in standard mode support isolation of data sources:
  - Before a node is deployed to the production environment, the system by default accesses the databases or data warehouses in the development environment that correspond to the data sources you added to the node.
  - After a node is deployed to the production environment, the system by default accesses the databases or data warehouses in the production environment that correspond to the data sources you added to the node.
Schedule and manage a batch synchronization node
DataWorks provides powerful scheduling capabilities for you to run batch synchronization nodes. You can configure scheduling parameters for a batch synchronization node to write incremental and full data to a specific partition of a destination table. The O&M operations that can be performed on batch synchronization nodes are the same as the O&M operations that can be performed on other types of auto triggered nodes. You can also manually run a batch synchronization node.

Operation | Description |
---|---|
Run a batch synchronization node | After you deploy a batch synchronization node to the production environment, you can go to the Cycle Task page in Operation Center in the production environment to view the node. The scheduling system runs the node based on the configurations of the scheduling parameters. You can also manually run the node. |
Suspend scheduling of a batch synchronization node | On the Cycle Task page in Operation Center, you can freeze an auto triggered node for a period of time. After you freeze the auto triggered node, the auto triggered node and its descendant nodes cannot be run. Note: Instances are generated for an auto triggered node after the node is run. If an auto triggered node instance and its descendant instances do not need to be run, you can freeze the current auto triggered node instance. |
Resume scheduling of a batch synchronization node | On the Cycle Task page in Operation Center, you can unfreeze an auto triggered node. After you unfreeze the auto triggered node, the node can be run as expected. Note: Instances generated for a frozen auto triggered node are also frozen. If you want to run a frozen auto triggered node instance and its descendant instances, you can unfreeze the current auto triggered node instance. |
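The effect of freezing a node on its descendants can be pictured as a graph traversal. The following is a minimal sketch, not DataWorks code: the dependency map, the node names, and the `blocked_by_freeze` helper are all hypothetical and exist only to illustrate that a frozen node and everything downstream of it stop running, while unrelated branches are unaffected.

```python
from collections import deque


def blocked_by_freeze(dependencies, frozen):
    """Return the frozen node plus every descendant reachable from it."""
    blocked = {frozen}
    queue = deque([frozen])
    while queue:
        node = queue.popleft()
        for child in dependencies.get(node, []):
            if child not in blocked:
                blocked.add(child)
                queue.append(child)
    return blocked


# Hypothetical dependency graph: each key maps to its direct descendants.
dag = {
    "sync_orders": ["clean_orders"],
    "clean_orders": ["daily_report"],
    "sync_users": ["user_profile"],
}

blocked = blocked_by_freeze(dag, "sync_orders")
print(sorted(blocked))
```

In this sketch, freezing `sync_orders` also blocks `clean_orders` and `daily_report`, but the `sync_users` branch keeps running because it does not descend from the frozen node.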
Synchronize historical data
DataWorks allows you to synchronize historical data to a specified table or partition in the destination database or data warehouse based on the scheduling parameter configurations and data backfill configurations of a batch synchronization node. If you want to configure a batch synchronization node to synchronize incremental data and historical data to a specified partition in the destination table, you must configure the data backfill settings for the node. When you backfill data for the node, the system assigns the value that you specify for the Data Timestamp parameter to the variable of the related scheduling parameter. For more information about how to backfill data for a node, see View and manage data backfill instances.
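As a rough illustration of how a data backfill maps data timestamps to partitions, the sketch below resolves a partition expression once per day in a backfill range. It is not DataWorks code: the `${bizdate}` placeholder name, the `yyyymmdd` format, the `ds=` partition key, and the `backfill_partitions` helper are assumptions chosen for the example; check the scheduling parameter documentation for the variables your node actually uses.

```python
from datetime import date, timedelta


def backfill_partitions(start, end, template="ds=${bizdate}"):
    """Yield (data_timestamp, resolved_partition) for each day in [start, end]."""
    day = start
    while day <= end:
        # Assumed format: the data timestamp rendered as yyyymmdd.
        bizdate = day.strftime("%Y%m%d")
        yield day, template.replace("${bizdate}", bizdate)
        day += timedelta(days=1)


# Backfill three days of history, one partition per data timestamp.
partitions = [p for _, p in backfill_partitions(date(2024, 1, 30), date(2024, 2, 1))]
print(partitions)
```

Each data timestamp in the range produces its own instance of the node, so each run writes to a different partition instead of overwriting the same one.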
Monitor the status of a batch synchronization node
You can create an alert rule to monitor the status of an auto triggered node on the Rule Management page. To go to the Rule Management page, perform the following operations: In the left-side navigation pane of the Operation Center page, choose Overview. An alert notification is sent if the node is in a specified state, such as Completed, Uncompleted, Error, or Overtime.
Perform O&M operations on resource groups
- Monitor resource groups: On the Resource page of Operation Center, you can monitor the usage of resource groups that are used to run nodes. For more information, see Use the resource O&M feature.
- Change resource groups: You can change the resource group that is used to run nodes to another resource group by using one of the methods described in the following table.
Note Before you change a resource group, make sure that network connections are established between the resource group that you want to select and the required data sources. If you do not establish the required network connections, nodes fail to run.
Operating environment | Supported operation | Entry point |
---|---|---|
Production environment | Change the resource groups for multiple nodes at the same time | Go to the Cycle Task page in Operation Center. Select the nodes for which you want to change the resource groups and click Modify Data Integration Resource Group at the bottom of the Cycle Task page. |
Development environment | Change the resource group for a single node | Go to the DataStudio page. Go to the configuration tab of the node for which you want to change the resource group and click Resource Group configuration in the right-side navigation pane. On the Resource Group configuration tab, you can change the resource group for the node. |
Development environment | Change the resource groups for multiple nodes at the same time | Go to the DataStudio page. Click the icon that opens the batch operation tab. On the batch operation tab, select the nodes for which you want to change the resource groups and click Modify Data Integration Resource Group at the bottom of the batch operation tab. |

Note: After you change the resource group for a node in the development environment, you must commit and deploy the node to the production environment again.
Monitor the quality of table data
On the Data Quality page, you can configure monitoring rules for tables of some destinations to monitor the quality of data in the tables. If you configure monitoring rules for a table, the rules are triggered after the scheduling node with which you associate the table is successfully run. If exceptions are detected, Data Quality determines whether to fail the node and block the descendant nodes based on the check result and rule settings, such as the rule type. This prevents dirty data from being passed to downstream nodes. For more information about the destinations that support monitoring rules and how to use Data Quality, see Overview.
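The decision to block descendant nodes can be sketched as a function of the check results and the rule type. The following is an illustrative model only, not the Data Quality implementation: the `CheckResult` class, the rule names, and the assumption that only a failed "strong" rule blocks downstream nodes are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CheckResult:
    rule_name: str
    rule_type: str  # hypothetical values: "strong" (blocking) or "weak" (alert only)
    passed: bool


def should_block_descendants(results):
    """Block only when a failed check belongs to a blocking ("strong") rule."""
    return any(r.rule_type == "strong" and not r.passed for r in results)


# A failed weak rule alone does not block downstream nodes in this model.
weak_failure_only = [
    CheckResult("row_count_not_zero", "strong", passed=True),
    CheckResult("null_rate_below_threshold", "weak", passed=False),
]

# A failed strong rule blocks downstream nodes.
strong_failure = weak_failure_only + [
    CheckResult("primary_key_unique", "strong", passed=False),
]

print(should_block_descendants(weak_failure_only))
print(should_block_descendants(strong_failure))
```

Separating blocking rules from alert-only rules lets you stop dirty data on critical checks without halting the whole pipeline for minor anomalies.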
View the run logs of a batch synchronization node
View the statistics on batch synchronization nodes
On the Batch Sync tab of the Data integration tab under Overview in Operation Center, you can view the statistics on node execution, such as execution status distribution, data synchronization progress, synchronized data volume, and details of synchronization nodes. You can search for the desired synchronization node based on the filter conditions such as Source Name, Destination Name, and Whether there is public network traffic. For more information, see View the statistics on the Overview page.