DataWorks:Data Integration: Integration of data from various data sources

Last Updated: Sep 05, 2024

Data Integration is a stable, efficient, and scalable data synchronization service. It migrates and synchronizes data quickly and reliably among a wide range of heterogeneous data sources that reside in complex network environments.

Overview

DataWorks Data Integration supports batch synchronization, real-time synchronization, and full and incremental synchronization that combines batch and real-time synchronization.

  • You can configure a scheduling cycle for a batch synchronization task.

  • Data can be synchronized among more than 50 types of heterogeneous data sources, such as relational databases, data warehouses, non-relational databases, file storage systems, and message queues.

  • Network connectivity solutions are provided for connecting to data sources in complex network environments. You can use Data Integration to connect to data sources that reside on the Internet, in data centers, or in virtual private clouds (VPCs).

  • Security control and O&M monitoring are supported to ensure that the data synchronization process is secure and controllable.
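The combination of batch and real-time synchronization mentioned above can be pictured with a small, generic sketch. It is not the Data Integration implementation or API: the function names `full_sync` and `apply_changes`, the in-memory dictionaries, and the change-event format are all hypothetical and chosen only for illustration. The idea shown is a one-time full copy of existing rows followed by ordered incremental changes applied to the same target.

```python
from typing import Dict, List, Tuple

Row = Dict[str, object]
# Hypothetical change event: (operation, primary key, row data).
Change = Tuple[str, int, Row]


def full_sync(source: Dict[int, Row]) -> Dict[int, Row]:
    """Batch phase: copy every existing row from the source to a new target."""
    return {key: dict(row) for key, row in source.items()}


def apply_changes(target: Dict[int, Row], changes: List[Change]) -> Dict[int, Row]:
    """Incremental phase: apply change events to the target in arrival order."""
    for operation, key, row in changes:
        if operation in ("insert", "update"):
            target[key] = dict(row)
        elif operation == "delete":
            # Deleting a key that is already absent is treated as a no-op.
            target.pop(key, None)
        else:
            raise ValueError(f"unknown operation: {operation}")
    return target


# Illustrative data only.
source_table = {1: {"name": "a"}, 2: {"name": "b"}}
target_table = full_sync(source_table)
apply_changes(
    target_table,
    [
        ("update", 1, {"name": "a2"}),
        ("delete", 2, {}),
        ("insert", 3, {"name": "c"}),
    ],
)
```

In this sketch the full copy runs once and the change events keep the target current afterward, which is one common way to read "full and incremental synchronization".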

Core technology and architecture

  • Engine architecture

    A star-shaped engine architecture is provided. After a data source is added to Data Integration, it can be connected to any other data source in Data Integration to form a data synchronization link. For more information about the supported data sources, see Supported data source types and synchronization operations.

  • Resource groups for Data Integration and network connectivity

    image

    Before you synchronize data, you must connect the data sources to your resource group for Data Integration, as shown in the preceding figure. DataWorks allows you to use serverless resource groups or exclusive resource groups for Data Integration to run data synchronization tasks in Data Integration. We recommend that you use serverless resource groups. For information about network connectivity solutions that can be used, see Network connectivity solutions.
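The star-shaped engine architecture described in this section can be illustrated with a minimal sketch. It is a toy model, not DataWorks code: the `Hub` class, `make_memory_source`, and the data source names are hypothetical. It shows why adding a data source once is enough to pair it with every other registered data source: each source plugs into the hub with one reader and one writer, and a synchronization link is any reader combined with any writer.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Record = Dict[str, object]
Reader = Callable[[], Iterable[Record]]
Writer = Callable[[Iterable[Record]], int]


class Hub:
    """Hypothetical center of a star topology: each data source registers once."""

    def __init__(self) -> None:
        self.readers: Dict[str, Reader] = {}
        self.writers: Dict[str, Writer] = {}

    def add_source(self, name: str, reader: Reader, writer: Writer) -> None:
        self.readers[name] = reader
        self.writers[name] = writer

    def sync(self, src: str, dst: str) -> int:
        """Form a link from src to dst and return the number of records written."""
        return self.writers[dst](self.readers[src]())

    def link_count(self) -> int:
        """Number of directed source-to-destination pairs available via the hub."""
        n = len(self.readers)
        return n * (n - 1)


def make_memory_source(rows: List[Record]) -> Tuple[Reader, Writer]:
    """Build a reader and a writer over an in-memory list (a stand-in for a real store)."""

    def read() -> Iterable[Record]:
        return list(rows)

    def write(records: Iterable[Record]) -> int:
        batch = list(records)
        rows.extend(batch)
        return len(batch)

    return read, write


# Illustrative data sources; the names are placeholders.
db_rows: List[Record] = [{"id": 1}, {"id": 2}]
warehouse_rows: List[Record] = []
queue_rows: List[Record] = []

hub = Hub()
hub.add_source("db_demo", *make_memory_source(db_rows))
hub.add_source("warehouse_demo", *make_memory_source(warehouse_rows))
hub.add_source("queue_demo", *make_memory_source(queue_rows))

written = hub.sync("db_demo", "warehouse_demo")
```

With three registered sources, the hub needs three reader/writer pairs but can serve six directed links. A design with a dedicated connector per pair would need one connector for each of those links.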

Use scenarios

DataWorks Data Integration is suitable for data transmission scenarios such as data import into data warehouses or data lakes, sharding, real-time data archiving, and data forwarding between clouds.

Billing

You may be charged the following fees for running data synchronization tasks in Data Integration:

  • Fees charged on the DataWorks side (included in bills for DataWorks)

    • Fee for data synchronization. For information about the fee for data synchronization, see Billing of data synchronization.

    • Fee for task scheduling. If a data synchronization task is deployed to the production environment and run on a schedule, a scheduling fee is incurred. For more information about the fee for task scheduling, see Billing of task scheduling.

    • (Optional) Fee for a DataWorks edition. You are charged the fee only if you use an advanced DataWorks edition. For more information, see Billing of DataWorks advanced editions.

  • Fees that are not charged on the DataWorks side (not included in bills for DataWorks)

    When you run a data synchronization task in Data Integration, the task configuration may incur fees that are not charged by DataWorks. For example, fees for database interactions during synchronization, computing and storage fees for a compute engine, and fees for network services such as Express Connect, Internet Shared Bandwidth, and Elastic IP Address (EIP) are charged by the respective services.

    Note

    After you finish configuring a task, check as soon as possible for fees charged by services other than DataWorks.

Use Data Integration

After you activate a DataWorks edition, you can purchase a resource group and develop data synchronization tasks in Data Integration based on your business requirements. For more information, see Overview of Data Integration.