Data Transmission Service (DTS) replicates table schemas and streams incremental changes from a PolarDB for MySQL cluster to a DataHub project for real-time extract, transform, and load (ETL) pipelines and downstream analytics.
Prerequisites
Before you begin, make sure that you have:
Activated DataHub and created a project to receive the migrated data. For more information, see Get started with DataHub and Manage projects.
Verified that the available storage space of the destination DataHub project exceeds the total data size in the source PolarDB for MySQL cluster.
Migration types and SQL operations
Migration types
Schema migration: DTS migrates the schemas of selected objects from the source database to the destination database.
Incremental data migration: After full data migration completes, DTS continuously migrates incremental data from the source to the destination. This keeps data in sync without interrupting application services.
For this migration path, select Schema Migration or Incremental Data Migration based on your business requirements.
SQL operations for incremental migration
| Operation type | SQL statements |
|---|---|
| Data Manipulation Language (DML) | INSERT, UPDATE, and DELETE |
| Data Definition Language (DDL) | ALTER TABLE and TRUNCATE TABLE |
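The table above can be expressed as a simple membership check. A minimal sketch in Python; the set contents mirror the table, and the helper name `is_migrated` is illustrative, not part of DTS:

```python
# Operation types that DTS captures during incremental migration on
# this path, mirroring the "SQL operations" table above.
MIGRATED_DML = {"INSERT", "UPDATE", "DELETE"}
MIGRATED_DDL = {"ALTER TABLE", "TRUNCATE TABLE"}

def is_migrated(statement_type: str) -> bool:
    """Return True if this statement type is replicated to DataHub."""
    return statement_type.upper() in MIGRATED_DML | MIGRATED_DDL

print(is_migrated("insert"))      # True
print(is_migrated("DROP TABLE"))  # False: not in the supported DDL list
```

Note that operations outside these two sets, such as DROP TABLE, are not replicated, so schema changes beyond ALTER TABLE and TRUNCATE TABLE must be coordinated manually.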
Billing
| Migration type | Instance configuration fee | Internet traffic fee |
|---|---|---|
| Schema migration and full data migration | Free of charge | Charged only when data is migrated from Alibaba Cloud over the Internet. For more information, see Billing overview. |
| Incremental data migration | Charged. For more information, see Billing overview. | Charged only when data is migrated from Alibaba Cloud over the Internet. For more information, see Billing overview. |
Required permissions
| Database | Required permission | References |
|---|---|---|
| PolarDB for MySQL cluster | Read permissions on the objects to be migrated | Create and manage a database account |
Limitations
DTS does not migrate foreign keys from the source database. As a result, cascade update and cascade delete operations in the source database are not migrated to the destination database.
| Category | Limitation |
|---|---|
| Source database | The server that hosts the source database must have sufficient outbound bandwidth. Insufficient bandwidth reduces the migration speed. |
| Source database | Tables to be migrated must have PRIMARY KEY or UNIQUE constraints, and all fields must be unique. Otherwise, the destination database may contain duplicate records. |
| Source database | When you select tables as migration objects and edit them (for example, rename tables or columns), a single task supports up to 1,000 tables. To migrate more than 1,000 tables, create multiple tasks or migrate the entire database. |
| Source database | For incremental data migration, enable binary logging and set the loose_polar_log_bin parameter to on. Otherwise, the precheck returns errors and the task cannot start. For more information, see Enable binary logging and Modify parameters. |
| Source database | Enabling binary logging for a PolarDB for MySQL cluster incurs storage charges for the space that binary logs occupy. |
| Source database | Binary logs of the source database must be retained for more than 24 hours. If DTS cannot obtain binary logs, the task may fail. In exceptional cases, data inconsistency or loss may occur. If you do not meet the retention requirement, the Service Level Agreement (SLA) of DTS does not guarantee service reliability or performance. |
| Source database | Do not perform Data Definition Language (DDL) operations that change database or table schemas during schema migration. DDL operations cause the task to fail. |
| Source database | Data cannot be migrated from read-only nodes of the source PolarDB for MySQL cluster. |
| Other | Only tables are supported as migration objects. If a column is added to a source table during migration, the new column is not migrated to the destination DataHub project. |
| Other | During full data migration, DTS consumes read and write resources of both the source and destination databases. This may increase database server loads. Migrate data during off-peak hours. |
| Other | DTS uses the ROUND(COLUMN, PRECISION) function to retrieve values from FLOAT and DOUBLE columns. If you do not specify a precision, DTS uses 38 digits for FLOAT and 308 digits for DOUBLE. Verify that these precision settings meet your business requirements. |
| Other | DTS attempts to resume failed tasks within the last seven days. Before you switch workloads to the destination database, stop or release the migration task. You can also run the REVOKE statement to remove write permissions from the DTS accounts on the destination database. Otherwise, resumed tasks may overwrite data in the destination database. |
| Other | Do not use tools such as pt-online-schema-change to perform DDL operations on objects during migration. DDL operations from external tools may cause the task to fail. |
| Usage note | DTS periodically executes `CREATE DATABASE IF NOT EXISTS test` in the source database to advance the binary log file position. |
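Several of the source-database requirements above (the loose_polar_log_bin setting, binary log retention, and primary key or unique constraints) can be verified before you create the task. The following is a minimal offline sketch in Python, assuming you have already collected the inputs yourself, for example from `SHOW VARIABLES` and an information_schema query; the function and argument names are illustrative, not part of DTS:

```python
def check_source_readiness(variables, tables_without_pk, binlog_retention_hours):
    """Flag source-side issues named in the limitations table above.

    variables: dict of MySQL system variables (e.g. from SHOW VARIABLES).
    tables_without_pk: tables lacking PRIMARY KEY or UNIQUE constraints
        (e.g. gathered from information_schema.TABLE_CONSTRAINTS).
    binlog_retention_hours: configured binary log retention period.
    """
    issues = []
    if variables.get("loose_polar_log_bin", "OFF").upper() != "ON":
        issues.append("loose_polar_log_bin must be set to on for incremental migration")
    if binlog_retention_hours <= 24:
        issues.append("binary logs must be retained for more than 24 hours")
    for table in tables_without_pk:
        issues.append(f"{table}: no PRIMARY KEY or UNIQUE constraint; duplicate records possible")
    return issues

# A source that meets every requirement reports no issues.
print(check_source_readiness({"loose_polar_log_bin": "ON"}, [], 72))  # []
```

Running such a check before the DTS precheck can save a failed task attempt, but it does not replace the precheck itself.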
Procedure
Step 1: Open the data migration page
Log on to the Data Management (DMS) console.
In the top navigation bar, move the pointer over DTS.
Choose DTS (DTS) > Data Migration.
The actual operations may vary based on the mode and layout of the DMS console. For more information, see Simple mode and Customize the layout and style of the DMS console. You can also go to the Data Migration page of the new DTS console directly.
From the drop-down list on the right side of Data Migration Tasks, select the region where your data migration instance resides.
In the new DTS console, select the region in the upper-left corner instead.
Step 2: Configure source and destination databases
Click Create Task.
On the Create Data Migration Task page, configure the source and destination databases.
Source Database

| Parameter | Description |
|---|---|
| Task Name | DTS generates a name automatically. Specify a descriptive name to identify the task. The name does not need to be unique. |
| Select DMS Database Instance | Optional. Select an existing instance to auto-populate database parameters. If you do not select one, configure the parameters manually. |
| Database Type | Select PolarDB for MySQL. |
| Connection Type | Select Alibaba Cloud Instance. |
| Instance Region | Select the region where the source PolarDB for MySQL cluster resides. |
| Cross-account | Select No if the source and destination belong to the same Alibaba Cloud account. |
| PolarDB Cluster ID | Select the ID of the source PolarDB for MySQL cluster. |
| Database Account | Enter the database account of the source PolarDB for MySQL cluster. The account must have read permissions on the objects to be migrated. |
| Database Password | Enter the password for the database account. |

Destination Database

| Parameter | Description |
|---|---|
| Select DMS Database Instance | Optional. Select an existing instance to auto-populate database parameters. If you do not select one, configure the parameters manually. |
| Database Type | Select DataHub. |
| Connection Type | Select Alibaba Cloud Instance. |
| Instance Region | Select the region where the destination DataHub project resides. |
| Project | Select the ID of the destination DataHub project. |

In the lower part of the page, click Test Connectivity and Proceed. DTS automatically adds the CIDR blocks of DTS servers to the IP address whitelist of Alibaba Cloud database instances, or to the security group rules of Elastic Compute Service (ECS) instances that host self-managed databases. For self-managed databases in a data center or from a third-party cloud provider, manually add the CIDR blocks of DTS servers to the database IP address whitelist. For more information, see CIDR blocks of DTS servers.
Warning: Adding the public CIDR blocks of DTS servers to a database whitelist or an ECS security group introduces security risks. Take preventive measures, such as strengthening username and password security, restricting exposed ports, authenticating API calls, regularly auditing whitelist or security group rules, forbidding unauthorized CIDR blocks, and connecting to DTS through Express Connect, VPN Gateway, or Smart Access Gateway.
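When you maintain a whitelist manually for a self-managed database, you can verify that a given DTS server address falls inside a whitelisted CIDR block. A short sketch using Python's standard ipaddress module; the CIDR blocks and addresses below are placeholders for illustration, not actual DTS server ranges:

```python
import ipaddress

def in_whitelist(address: str, cidr_blocks: list) -> bool:
    """Return True if the address falls inside any whitelisted CIDR block."""
    ip = ipaddress.ip_address(address)
    return any(ip in ipaddress.ip_network(block) for block in cidr_blocks)

# Placeholder values only; use the real CIDR blocks of DTS servers for
# your region (see "CIDR blocks of DTS servers").
whitelist = ["100.104.0.0/16", "10.151.12.0/24"]
print(in_whitelist("100.104.5.9", whitelist))   # True
print(in_whitelist("192.168.1.1", whitelist))   # False
```

A check like this can help during a regular whitelist audit, since stale or overly broad rules are easy to overlook.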
Step 3: Configure migration objects and settings
Configure the objects to be migrated and the migration settings.
| Parameter | Description |
|---|---|
| Migration Types | Select Schema Migration or Incremental Data Migration based on your requirements. Note: If you do not select Incremental Data Migration, do not write data to the source database during migration to maintain data consistency. |
| Processing Mode of Conflicting Tables | Select Precheck and Report Errors or Ignore Errors and Proceed based on your requirements. |
| Naming Rules of Additional Columns | After DTS migrates data to DataHub, it adds additional columns to the destination topic. If column names conflict with existing columns, the migration fails. Select New Rule or Previous Rule based on your requirements. Warning: Before you configure this parameter, check for name conflicts between additional columns and existing columns. For more information, see Naming rules for additional columns. |
| Capitalization of Object Names in Destination Instance | Set the capitalization rule for database, table, and column names. The default is DTS default policy. For more information, see Specify the capitalization of object names in the destination instance. |
| Source Objects | Select objects from the Source Objects section and click the right arrow icon to add them to Selected Objects. Note: Select tables or databases as migration objects. Selecting tables excludes views, triggers, and stored procedures. |
| Selected Objects | To rename an object, right-click it in the Selected Objects section. For more information, see Map the name of a single object. To rename multiple objects at a time, click Batch Edit. For more information, see Map multiple object names at a time. Note: Renaming objects may cause dependent objects to fail to migrate. To filter data, right-click an object in Selected Objects and specify WHERE conditions. For more information, see Specify filter conditions. To select specific SQL operations for a database or table, right-click the object in Selected Objects and choose the operations to migrate. |
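A WHERE-style filter condition limits which rows DTS migrates. The effect can be sketched offline in Python; the rows and the condition `id > 100` are made-up examples, not values from this migration path:

```python
# Rows as dictionaries standing in for source-table records.
rows = [
    {"id": 42, "city": "Hangzhou"},
    {"id": 101, "city": "Beijing"},
    {"id": 250, "city": "Shanghai"},
]

# A filter condition such as `id > 100` keeps only matching rows,
# the same way a WHERE clause would.
migrated = [r for r in rows if r["id"] > 100]
print([r["id"] for r in migrated])  # [101, 250]
```

Rows that do not satisfy the condition are simply never written to the destination, so choose conditions that can be evaluated from the columns of the filtered table alone.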
Click Next: Advanced Settings.
| Parameter | Description |
|---|---|
| Monitoring and Alerting | Configure alerting for the task. When enabled, alert contacts receive notifications if the task fails or migration latency exceeds the specified threshold. For more information, see Configure monitoring and alerting when you create a DTS task. |
| Configure ETL | Enable or disable the ETL feature. When enabled, enter data processing statements in the code editor. For more information, see What is ETL? and Configure ETL in a data migration or data synchronization task. |
Step 4: Run the precheck and purchase the instance
Click Next: Save Task Settings and Precheck. You can move the pointer over Next: Save Task Settings and Precheck and click Preview OpenAPI parameters to view the API parameters for this task.
DTS runs a precheck before the task starts. The task starts only after the precheck succeeds.
If a precheck item fails, click View Details next to the item. Analyze the cause, fix the issue, and run the precheck again.
If a precheck item triggers an alert that can be ignored, click Confirm Alert Details, then click Ignore in the View Details dialog box, and click OK. Then click Precheck Again. Ignoring alert items may cause data inconsistency and put your business at risk.
Wait until Success Rate reaches 100%, then click Next: Purchase Instance.
On the Purchase Instance page, configure the instance settings.
| Parameter | Description |
|---|---|
| Resource Group | The resource group for the data migration instance. The default value is default resource group. For more information, see What is Resource Management? |
| Instance Class | Select an instance class based on your migration speed requirements. For more information, see Instance classes of data migration instances. |

Read and select the Data Transmission Service (Pay-as-you-go) Service Terms check box.
Click Buy and Start. In the confirmation message, click OK. View the migration progress on the Data Migration page.