DataWorks: Data source configuration

Last Updated: Feb 08, 2026

A data source connects to various databases and storage services, such as MaxCompute, MySQL, and OSS. It is a prerequisite for a synchronization task in Data Integration. It defines the database from which the task reads data (the source) and the database to which it writes data (the destination).

The role of a data source

In a Data Integration task, a data source acts as an endpoint at both ends of the data flow:

  • Source (Reader): The task reads data from the data source configured as the source.

  • Destination (Writer): The task writes the processed data to the data source configured as the destination.

You must configure both a source and a destination data source before you can synchronize a single table or a full database, in either batch or real-time mode. A correctly configured data source with proper network connectivity is required for tasks to run successfully.
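
The requirement above can be pictured with a minimal sketch. This is an illustration only, not the DataWorks API: the `DataSource` and `SyncTask` classes, their fields, and the example names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DataSource:
    """Hypothetical stand-in for a configured data source."""
    name: str
    type: str  # for example "mysql", "maxcompute", or "oss"


@dataclass
class SyncTask:
    """Hypothetical synchronization task with one endpoint at each end."""
    source: Optional[DataSource] = None       # read side (Reader)
    destination: Optional[DataSource] = None  # write side (Writer)

    def validate(self) -> list[str]:
        """Return the problems that would prevent the task from being configured."""
        problems = []
        if self.source is None:
            problems.append("missing source (Reader) data source")
        if self.destination is None:
            problems.append("missing destination (Writer) data source")
        return problems


# A task with only a source configured is not ready to run.
task = SyncTask(source=DataSource(name="orders_mysql", type="mysql"))
print(task.validate())
```

The sketch checks only that both endpoints exist. Network connectivity, which the text also requires, is outside what it models.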

Supported data source types

For a list of data sources supported by DataWorks Data Integration, see Supported data sources and synchronization solutions. The configuration process might vary slightly depending on the data source type. Refer to the UI for specific details.

Create a data source

Important

DataWorks recommends creating and managing all data sources centrally in Management Center. Data sources created here are reusable, manageable, and support features like environment isolation. This approach is a best practice for enterprise-level data development and production workloads.

For configuration instructions, see: Data source management.

You can create a data source in either Management Center or Data Integration. The following table compares the two methods.

| Capability | Management Center (recommended) | Data Integration |
| --- | --- | --- |
| Management location | Management Center > Data Sources. | Data Integration > Data Sources. |
| Environment isolation | Supports separate configurations for the development environment and production environment to protect production workloads. | Not supported. Only a production environment is available. |
| Multi-module reusability | Can be used across all modules, including Data Integration, Data Studio, and Data Analysis. | Has limited functionality when used in other modules. |
| Permission control | Supports cross-workspace authorization. | Does not support authorization. |
| Applicable mode | Recommended for workspaces in standard mode. Aligns with enterprise standards. | Suitable for basic mode, or for standard mode scenarios that do not require isolation. |
| Cloning | Supports cloning to quickly create a new data source. | Not supported. |

Both methods support third-party authentication and the RAM role-based authorization mode for adding a data source.

  • The creation process is the same in both locations.

  • When you create a data source in Management Center, a corresponding data source with the same name is automatically created in Data Integration. Both share the same production environment configuration.

  • When you create a data source in Data Integration, a corresponding data source with the same name is also automatically created in Management Center. However, this data source contains information only for the production environment. The development environment will be marked as incomplete and must be configured manually.

  • Configuration parameters vary by data source type. For more information, see: Data source list.

Use a data source

Basic mode:

In a workspace that uses basic mode, there is only one environment. Data sources created in Management Center and in Data Integration are identical.

Standard mode:

A workspace in standard mode supports environment isolation for data sources. A single data source name can have two separate configurations: one for the development environment and one for the production environment. You can set them to different databases or instances to isolate test data from production data, which helps protect your production data.

  • In Data Integration, only single-table batch synchronization tasks support environment isolation. All other types of synchronization tasks use the production environment data source.

  • A data source created in Data Integration contains only the production environment configuration. Because its development environment information is missing, it cannot be used directly in data development tasks. You must go to Management Center to complete the development environment configuration before you can use it in Data Studio and for batch synchronization of a single table.
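
The standard-mode rules above can be summarized in a short sketch. It is a simplified model for illustration, not DataWorks code: the class names, function names, and configuration keys are hypothetical, and only the behavior described in this section is modeled.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Env(Enum):
    DEV = "development"
    PROD = "production"


class TaskType(Enum):
    SINGLE_TABLE_BATCH = "single-table batch synchronization"
    OTHER = "any other synchronization task"


@dataclass
class DataSource:
    """Hypothetical model: one name, up to two environment configurations."""
    name: str
    prod_config: dict
    dev_config: Optional[dict] = None  # None models an incomplete development environment


def create_in_management_center(name: str, dev: dict, prod: dict) -> DataSource:
    # Management Center lets you configure both environments.
    return DataSource(name=name, prod_config=prod, dev_config=dev)


def create_in_data_integration(name: str, prod: dict) -> DataSource:
    # Data Integration records only the production configuration.
    return DataSource(name=name, prod_config=prod)


def resolve_config(ds: DataSource, task_type: TaskType, run_env: Env) -> dict:
    """Pick the configuration a task would use under the rules in this section."""
    # Only single-table batch synchronization honors environment isolation.
    if task_type is not TaskType.SINGLE_TABLE_BATCH or run_env is Env.PROD:
        return ds.prod_config
    if ds.dev_config is None:
        raise ValueError(
            f"{ds.name}: development environment is incomplete; "
            "complete it in Management Center first"
        )
    return ds.dev_config


ds = create_in_data_integration("orders", {"host": "prod-db.example.internal"})
# Other task types fall back to production, so this succeeds.
print(resolve_config(ds, TaskType.OTHER, Env.DEV))
```

In this model, running a single-table batch task in the development environment against `ds` raises an error until a development configuration is supplied, which mirrors the manual step in Management Center described above.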

Next steps

After you configure the data source and it passes the connectivity test, you can configure a synchronization task in Data Integration.

FAQ

For more frequently asked questions about data sources, see: FAQ.