After a synchronization task is configured, you can manage the task and view the running details of the task. This topic describes common O&M operations that can be performed on a full and incremental synchronization task.
Background information
This topic covers only the O&M operations for the full and incremental synchronization task itself. For information about how to perform O&M operations on the real-time synchronization subtask and the batch synchronization subtasks that are generated by a full and incremental synchronization task, see O&M for real-time synchronization nodes and O&M for batch synchronization nodes.
Manage a full and incremental synchronization task
After a full and incremental synchronization task is configured, you can go to the Nodes page in Data Integration in the DataWorks console to view the synchronization task. This page displays all created synchronization tasks. You can specify filter conditions to search for the desired synchronization task. Then, you can perform the operations that are described in the following table on the synchronization task.
Operation | Description |
Start | You can click Commit and Run in the Actions column of the synchronization task to start the synchronization task. |
Edit | As your business in the production environment evolves, the number of business tables may grow or shrink, and you may need to adjust the source tables that are specified in your synchronization task. To do so, click More in the Actions column of the synchronization task and select Modify Configuration to go to the configuration page of the synchronization task. On the configuration page, add or remove source tables based on your business requirements. Then, go back to the Nodes page, find the synchronization task, and click Commit and Run in the Actions column to run the synchronization task. When the synchronization task is rerun, the system compares the source tables specified for this run with the source tables specified for the previous run. If new tables are detected, the system synchronizes data from the new tables. For more information, see Add or remove source tables to or from a synchronization solution that is running. If you run a one-click real-time synchronization task, the task first synchronizes full data from the newly added tables. After the full data is synchronized, the real-time synchronization subtask generated by the task synchronizes incremental data from the newly added tables in real time. |
Forcefully rerun | In special cases, such as when data in the source is contaminated or errors occur on data links, you can click More in the Actions column of the synchronization task and select Force Rerun to rerun the synchronization task. After you forcefully rerun the synchronization task, the system synchronizes full data and incremental data from the source to the destination again. Note: In some scenarios, a one-click real-time synchronization task used to synchronize data to MaxCompute needs to be rerun to restore data. |
Backfill full data | You can perform this operation if you need to synchronize full data from the source again to resolve data accuracy issues, such as data loss, in the data that the synchronization task wrote to MaxCompute tables. Note: To backfill full data for a one-click real-time synchronization task used to synchronize data to MaxCompute, find the synchronization task on the Nodes page in Data Integration, click More in the Actions column, and then select Backfill Full Data. |
Stop | To stop a running synchronization task, click Stop in the Actions column of the synchronization task. |
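The table comparison described for the Edit operation can be pictured as a set difference between two runs. The following Python snippet is only an illustrative sketch, not DataWorks code: the helper name `diff_source_tables` and the table names are hypothetical, and the actual comparison logic inside Data Integration may differ.

```python
def diff_source_tables(previous_run, current_run):
    """Compare the source tables of two runs (hypothetical helper).

    Returns (added, removed): tables present only in the current run and
    tables present only in the previous run, each sorted by name.
    """
    previous, current = set(previous_run), set(current_run)
    added = sorted(current - previous)
    removed = sorted(previous - current)
    return added, removed


# Hypothetical table lists for two consecutive runs of the same task.
added, removed = diff_source_tables(
    ["orders", "customers", "legacy_log"],
    ["orders", "customers", "payments"],
)
print("newly added tables:", added)
print("removed tables:", removed)
```

In this sketch, only the tables in `added` would need a fresh full synchronization followed by incremental synchronization, which mirrors the behavior described for newly added tables.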
View the status overview of synchronization tasks
You can go to the Running Status Overview page in Data Integration and specify a period of time to view the status overview of synchronization tasks. The Running Status Overview page contains the following sections:
Solution Status Distribution: displays the total number of synchronization tasks and a pie chart of their status distribution. The distribution shows how many synchronization tasks ran successfully and how many failed within the specified period of time. You can click a sector in the pie chart to go to the synchronization task list page, where you can view the tasks in that state and the running details of each task. For more information about the running details of a synchronization task, see View the running details of a synchronization task.
Usage of Resources in Resource Groups: displays the specifications and resource usage of the resource groups that are used within the current Alibaba Cloud account. You can click the name of a resource group to go to the details page of the resource group. On the details page, you can view the basic information and resource usage of the resource group. For information about resource groups, see View the resource usage of an exclusive resource group.
Batch Synchronization Nodes: displays the number of batch synchronization subtasks generated by specific synchronization tasks, the data synchronization speed, the status distribution of the batch synchronization subtasks, and the details of the synchronized data. The statistical data is collected in the specified period of time.
The statistical data about the status distribution shows the number of the batch synchronization subtasks that are successfully run and the number of the batch synchronization subtasks that fail to be run.
The Synchronization Data subsection displays the following items:
Number of synchronization subtasks: the number of batch synchronization subtasks that are successfully run
Amount of data synchronized: the amount of data synchronized by batch synchronization subtasks that are successfully run or running
Number of data records synchronized: the number of data records that are synchronized by batch synchronization subtasks
Note: The statistical data in the Batch Synchronization Nodes section is updated hourly.
Real-time Synchronization Nodes: displays the number of real-time synchronization subtasks generated by specific synchronization tasks, the data synchronization speed, the status distribution of the real-time synchronization subtasks, and the top 10 subtasks with the highest latency. You can click the name of a subtask to go to the Real Time DI page and view the details of the subtask.
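The figures on the Running Status Overview page are aggregates over task and subtask records. As a rough sketch of how a status distribution for a time period and a top 10 latency ranking can be derived, the following Python snippet uses hypothetical in-memory records. The field names (`status`, `finished_at`, `latency_seconds`) and the sample data are assumptions made for illustration and are not a DataWorks API or export format.

```python
from collections import Counter
from datetime import datetime
import heapq


def status_distribution(runs, start, end):
    """Count runs per status whose finish time falls in [start, end)."""
    return Counter(r["status"] for r in runs if start <= r["finished_at"] < end)


def top_latency(subtasks, n=10):
    """Return the n subtasks with the highest latency, highest first."""
    return heapq.nlargest(n, subtasks, key=lambda s: s["latency_seconds"])


# Hypothetical task runs; only the first two fall inside the chosen period.
runs = [
    {"name": "task_a", "status": "succeeded", "finished_at": datetime(2024, 1, 1, 9)},
    {"name": "task_b", "status": "failed", "finished_at": datetime(2024, 1, 1, 10)},
    {"name": "task_c", "status": "succeeded", "finished_at": datetime(2024, 1, 2, 9)},
]
distribution = status_distribution(runs, datetime(2024, 1, 1), datetime(2024, 1, 2))

# Hypothetical real-time subtasks with latencies of 0, 3, 6, ... seconds.
subtasks = [{"name": f"rt_{i}", "latency_seconds": i * 3} for i in range(15)]
slowest = top_latency(subtasks)

print(dict(distribution))
print([s["name"] for s in slowest])
```

`heapq.nlargest` avoids sorting the full list when only the ten highest-latency subtasks are of interest.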
View the running details of a synchronization task
You can click Data Synchronization Node in the left-side navigation pane of the Data Integration page to go to the Nodes page.
On the Nodes page, you can view information, such as the type and name, of a synchronization task and the operations that you can perform on the synchronization task. You can also click Running Details in the Actions column of a synchronization task to view the running details of the synchronization task. The Running Details page contains the following sections:
Process: displays the status of environment preparation, the batch synchronization subtasks, and the real-time synchronization subtask. You can use these states to check whether the subtasks are running as expected and to troubleshoot issues with the synchronization task early. The following icons are used to indicate different states:
Success icon: the subtask ran successfully.
Failure icon: the subtask failed to run.
Waiting icon: the subtask is waiting to run.
Full Batch Synchronization and Real-time Synchronization: display the information about the batch synchronization subtasks and the real-time synchronization subtask generated by the synchronization task. The information includes the source name, data synchronization speed, synchronized data, resource group that is used, and data synchronization latency.
Steps: displays all steps that are required to complete the synchronization task from subtask creation to running of batch synchronization subtasks and the real-time synchronization subtask. You can view the start time, end time, and status of each step in this section.
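When you read the Steps section, it can help to think of each step as a record with a start time, an end time, and a status. The following Python snippet is a minimal sketch under that assumption. The step names, the status strings, and the rule that any failed step marks the whole task as failed are hypothetical choices for illustration and do not describe how DataWorks computes the displayed state.

```python
from datetime import datetime


def overall_status(steps):
    """Assumed rule: failed if any step failed, waiting if any step waits, else succeeded."""
    statuses = [s["status"] for s in steps]
    if "failed" in statuses:
        return "failed"
    if "waiting" in statuses:
        return "waiting"
    return "succeeded"


def step_durations(steps):
    """Map step name to elapsed seconds for steps that have both a start and an end time."""
    return {
        s["name"]: (s["end"] - s["start"]).total_seconds()
        for s in steps
        if s.get("start") and s.get("end")
    }


# Hypothetical steps of one synchronization task.
steps = [
    {"name": "create_subtasks", "status": "succeeded",
     "start": datetime(2024, 1, 1, 9, 0), "end": datetime(2024, 1, 1, 9, 1)},
    {"name": "run_batch_sync", "status": "failed",
     "start": datetime(2024, 1, 1, 9, 1), "end": datetime(2024, 1, 1, 9, 11)},
    {"name": "run_realtime_sync", "status": "waiting", "start": None, "end": None},
]

print(overall_status(steps))
print(step_durations(steps))
```

A step that has not started yet has no duration in this sketch, which matches the idea that a waiting step shows no start or end time.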