DataWorks: O&M for real-time synchronization nodes

Last Updated: Nov 19, 2024

After you create a real-time synchronization node in DataStudio and commit and deploy it to the production environment, you can run the node in Operation Center. In Operation Center, you can also configure monitoring and alerting for the node and view metrics about its runs. This topic describes the common O&M operations that you can perform on a real-time synchronization node.

Prerequisites

A real-time synchronization node is created and deployed. For more information, see Create a real-time synchronization node to synchronize incremental data from a single table and Configure a real-time synchronization node to synchronize all incremental data from a database.

Run and manage the real-time synchronization node

After the real-time synchronization node is created and deployed, you can start, stop, or undeploy the node on the Real Time DI page in Operation Center. To go to the Real Time DI page, log on to the DataWorks console and go to the Operation Center page. In the left-side navigation pane, choose Real-Time Node O&M > Real-time Synchronization Nodes. For more information about how to run and manage a real-time synchronization node, see Manage real-time synchronization tasks.

Note

When you click Start in the Operation column of a real-time synchronization node, you can specify an offset for data synchronization and the maximum number of failovers that are allowed within a specified period of time.

  • Offset: You can resume synchronization from the point at which it stopped, or start synchronization from a specified offset.

  • Failovers: To prevent a node that restarts frequently due to failovers from occupying system resources, you can configure the node to stop automatically when the number of failovers exceeds a specified threshold within a specified period of time. If you do not specify the maximum number of failovers, the system automatically stops the node when the number of failovers exceeds 100 within 5 minutes.
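The failover limit described above can be pictured as a counter over a time window. The following Python sketch is illustrative only and does not reflect how DataWorks implements the check: the class name `FailoverLimiter`, its parameters, and the choice of a sliding window (rather than a fixed one) are assumptions. The default values mirror the documented fallback of more than 100 failovers within 5 minutes.

```python
from collections import deque


class FailoverLimiter:
    """Hypothetical sliding-window failover counter (not a DataWorks API)."""

    def __init__(self, max_failovers=100, window_seconds=300):
        # Defaults mirror the documented fallback: 100 failovers, 5 minutes.
        self.max_failovers = max_failovers
        self.window_seconds = window_seconds
        self._timestamps = deque()

    def record_failover(self, now):
        """Record a failover at time `now` (in seconds).

        Returns True if the node should be stopped, that is, if the number
        of failovers inside the window exceeds the threshold.
        """
        self._timestamps.append(now)
        # Discard failovers that are no longer inside the window.
        while self._timestamps and now - self._timestamps[0] >= self.window_seconds:
            self._timestamps.popleft()
        return len(self._timestamps) > self.max_failovers


# Example: allow at most 3 failovers per 60 seconds.
limiter = FailoverLimiter(max_failovers=3, window_seconds=60)
print([limiter.record_failover(t) for t in (0, 10, 20, 30)])
# prints [False, False, False, True]
```

In this sketch, the fourth failover within the window is the first one that exceeds the threshold of 3, so only that call signals that the node should stop.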

Monitor the real-time synchronization node

To prevent data output delays caused by errors on the real-time synchronization node, you can configure monitoring and alerting settings for the node in Operation Center. To configure the settings, log on to the DataWorks console and go to the Operation Center page. In the left-side navigation pane, choose Real-Time Node O&M > Real-time Synchronization Nodes. On the Real Time DI page, find the node and click Alarm settings in the Operation column. For more information, see Manage real-time synchronization tasks.

  • Alerting conditions: You can configure monitoring and alerting based on the following conditions: node status, business delay, failover, and support for synchronization of DDL operations.

    Note

    Whether data changes generated by DDL operations on the source can be synchronized depends on the destination type. Some destination types do not support the synchronization of such data changes. If your real-time synchronization node uses such a destination and you configure an alert rule for the node based on this alerting condition, a related error may be reported while the node is running. For more information about the supported DDL operations, see Supported DML and DDL operations.

  • Notification method: Methods such as email, text message, and DingTalk are supported.

  • Alerting frequency control: To prevent a large number of alerts from being reported within a short period of time, DataWorks allows you to control the alerting frequency for real-time synchronization nodes. In an alert rule, you can specify a period of time within which DataWorks sends only one alert notification.
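The alerting frequency control described above amounts to suppressing repeat notifications for a rule until a configured interval has passed. The following Python sketch shows one way to express that idea. It is not the DataWorks implementation: the class name `AlertThrottle`, the `rule_id` values, and the per-rule bookkeeping are assumptions made for illustration.

```python
class AlertThrottle:
    """Hypothetical per-rule alert throttle (not a DataWorks API)."""

    def __init__(self, min_interval_seconds):
        # Minimum time that must pass between two notifications of one rule.
        self.min_interval_seconds = min_interval_seconds
        self._last_sent = {}  # rule_id -> time of the last notification sent

    def should_send(self, rule_id, now):
        """Return True if a notification for `rule_id` may be sent at `now`."""
        last = self._last_sent.get(rule_id)
        if last is not None and now - last < self.min_interval_seconds:
            # A notification was already sent inside the interval: suppress.
            return False
        self._last_sent[rule_id] = now
        return True


throttle = AlertThrottle(min_interval_seconds=300)
print(throttle.should_send("failover", now=0))
print(throttle.should_send("failover", now=120))
print(throttle.should_send("failover", now=300))
```

In this sketch, the first call sends a notification, the second is suppressed because it falls within the 300-second interval, and the third is sent because the interval has elapsed. Each rule is tracked separately, so a suppressed failover alert does not block an alert for a different condition.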

Use LogView to view the running information about nodes

Note

This feature is in invitational preview. If you want to use the feature, contact technical personnel.

The LogView feature in Data Integration collects event-based data about data synchronization nodes, analyzes and processes the data, and displays the results in a visualized manner. LogView lets you view and analyze information such as the data transmission rate and logs of a data synchronization node at a finer granularity.

In the left-side navigation pane of the Operation Center page, choose RealTime Task > Real Time DI. On the Real Time DI page, find the desired real-time synchronization node, and click the name of the node. The page on which you can view the running information about the node is displayed. The following table describes the information on the page.

Tab

Description

Log

You can view log details of the real-time synchronization node and search for logs by time on this tab.

Progress

You can view the progress details of the real-time synchronization node on this tab. The progress is indicated by metrics such as business latency, number of data records that are synchronized, number of bytes that are synchronized, data transmission rate for synchronized data records, data transmission rate for synchronized bytes, and window waiting time.

You can also perform the following operations on this tab:

  • You can search for the synchronization information of the real-time synchronization node in a specified period of time by using a time picker.

    Note

    You can view the synchronization details of the node only for the previous 15 days.

  • On the right side of the node list, you can click the Custom columns icon to select the columns that you want to view.

  • In the node list section, you can switch between the list of workers and the list of tasks.

    • The list of workers shows the threads that are started to run the real-time synchronization node. If the node is not run in distributed execution mode, only one worker is started. If the node is run in distributed execution mode, multiple workers are started. The number of workers depends on the parallelism of the node.

    • The list of tasks shows the tasks in a worker. Tasks are classified into two types: reader tasks and writer tasks. Depending on the parallelism that you configure for the node, each worker may run one or more reader tasks and one or more writer tasks.

    You can click the value of a metric for a node in the node list to view the value changes of the metric in a curve chart.

Database DDL event

You can view the DDL operation records that are identified on the source on this tab.

Database DML statistics

You can view statistical analysis results about the source and destination tables on this tab.

Failover

You can view the failover status and the curve chart for the number of failovers of the real-time synchronization node on this tab. You can also click View in the Log Details column of a failover to view logs of the failover.

Modify the real-time synchronization node

You can go to the DataStudio page to modify the configurations of the real-time synchronization node. For the modifications to take effect, stop the node before you modify it, commit and redeploy the node after you modify it, and then start the node on the Real Time DI page in Operation Center.

Modify processing rules for DDL messages for the real-time synchronization node

Before you start the real-time synchronization node, you can go to the DataStudio page to modify the rules that are configured for the node to process the DDL messages of the source. For more information, see Configure a real-time synchronization node to synchronize all incremental data from a database.

Note

To ensure that the modifications to the real-time synchronization node on the DataStudio page take effect, you must stop the node before the modification, commit and redeploy the node after the modification, and then start the node on the Real Time DI page in Operation Center.

FAQ

Why does my real-time synchronization task have high latency?