The real-time synchronization feature of DataWorks synchronizes data changes from a single table or from all tables in a source to a destination in real time. This keeps data in the destination consistent with data in the source.
Limits
You cannot run a real-time synchronization node on the DataStudio page. After you save and commit the node, you must run it in Operation Center in the production environment.
Real-time synchronization nodes can be run only on exclusive resource groups for Data Integration. For more information, see Exclusive resource groups for Data Integration.
Real-time synchronization nodes cannot be used to synchronize views.
Overview
The following table describes the capabilities of the real-time synchronization feature.
Capability | Description |
Data synchronization between various data sources | The real-time synchronization feature allows you to combine multiple types of data sources to form a star-shaped data synchronization link. You can synchronize data between different types of data sources. For more information, see Data source types that support real-time synchronization. |
Data synchronization from or to data sources that are deployed in complex network environments | The real-time synchronization feature supports data synchronization from or to Alibaba Cloud data sources, data sources in data centers, data sources that are hosted on Elastic Compute Service (ECS) instances, and data sources that do not belong to Alibaba Cloud. Select a network connectivity solution based on the network environment in which each data source is deployed. Before you configure a data synchronization node, make sure that network connections are established between your resource group for Data Integration and the data sources. For more information, see Establish a network connection between a resource group and a data source. |
Data synchronization scenarios | The real-time synchronization feature allows you to synchronize incremental data from a single table to another single table in real time, synchronize incremental data from tables in sharded databases to a single table in real time, and synchronize incremental data from multiple tables in a database to multiple tables in real time.
Note The real-time synchronization feature synchronizes only incremental data. To perform a one-time full synchronization from a source and then synchronize incremental data from that source in real time, use the solution-based synchronization feature. It continuously synchronizes data from a source to a destination, which helps keep data in the destination consistent with data in the source. For more information about how to select a data synchronization feature, see Overview. |
Configurations for real-time synchronization nodes | You can configure a real-time synchronization node without writing code. Simple configurations are enough to perform real-time ETL of incremental data from a single table or real-time synchronization of incremental data from multiple tables in a database. For more information, see Configure a real-time synchronization node to synchronize incremental data from a single table and Create a real-time synchronization node to synchronize all incremental data from a database.
|
O&M for real-time synchronization nodes |
|
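The scenario in the table in which incremental data from tables in sharded databases is synchronized to a single table can be pictured with a small conceptual sketch. This is not DataWorks code or a DataWorks API: the `ChangeEvent` shape, the `apply_changes` function, and the table names `orders_00` and `orders_01` are hypothetical and are used only to illustrate how an ordered stream of insert, update, and delete events keeps a destination consistent with its sources. The sketch assumes that primary keys are unique across the sharded tables.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass(frozen=True)
class ChangeEvent:
    """One incremental change captured from a source table (hypothetical shape)."""
    source_table: str            # sharded table that produced the change
    op: str                      # "insert", "update", or "delete"
    key: int                     # primary key, assumed unique across shards
    row: Optional[dict] = None   # new row image; None for deletes


def apply_changes(destination: Dict[int, dict], events: Iterable[ChangeEvent]) -> None:
    """Apply events in order so that the destination converges to the source state."""
    for event in events:
        if event.op in ("insert", "update"):
            # Upsert the latest row image and record which shard it came from.
            destination[event.key] = {**(event.row or {}), "_source_table": event.source_table}
        elif event.op == "delete":
            # Deleting a key that is already absent is treated as a no-op.
            destination.pop(event.key, None)
        else:
            raise ValueError(f"unknown operation: {event.op}")


# Incremental changes from two sharded tables, merged into one destination table.
events = [
    ChangeEvent("orders_00", "insert", 1, {"amount": 10}),
    ChangeEvent("orders_01", "insert", 2, {"amount": 20}),
    ChangeEvent("orders_00", "update", 1, {"amount": 15}),
    ChangeEvent("orders_01", "delete", 2),
]
merged: Dict[int, dict] = {}
apply_changes(merged, events)
print(merged)
```

Because the sketch applies only change events, rows that existed in the source before the first event never reach the destination. This mirrors the note in the table: a one-time full synchronization has to be combined with incremental synchronization if the historical data is also needed.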