Data migration
The data migration feature helps you migrate data between homogeneous or heterogeneous data sources. This feature is suitable for scenarios such as data migration to the cloud, data migration across instances within Alibaba Cloud, and database splitting and scaling.
| Category | Feature | Description | References |
| --- | --- | --- | --- |
| Homogeneous migration | Logical migration | This feature allows you to migrate data between homogeneous databases. | Migrate data from a self-managed MySQL database to an ApsaraDB RDS for MySQL instance |
| | Physical migration | This feature allows you to migrate data from a self-managed database to the cloud by using a physical gateway. | |
| Heterogeneous migration | Logical migration | This feature allows you to migrate data between heterogeneous databases. | |
| Traffic cutover | Database cutover to the cloud | This feature helps you smoothly migrate your business to cloud databases after you complete database and application evaluation and transformation. | |
| Periodic tasks | Scheduled full migration | This feature allows you to migrate schema data and historical data from the source database to the destination database on a regular basis by using the scheduling policy configurations of the data integration feature. | Configure a data integration task between ApsaraDB RDS for MySQL instances |
| Account migration | Full account migration | When you configure a synchronization or migration task, you can enable the account migration feature to migrate accounts, including their passwords and permissions, from the source database to the destination database. | |
Data synchronization
The data synchronization feature helps you synchronize data between data sources in real time. This feature is suitable for various business scenarios such as active geo-redundancy, geo-disaster recovery, zone-disaster recovery, cross-border data synchronization, cloud-based business intelligence (BI) systems, and real-time data warehousing.
| Category | Feature | Description | References |
| --- | --- | --- | --- |
| Synchronization instance management | Reverse disaster recovery switching in a few steps | This feature allows you to create a reverse instance in a few steps for a synchronization instance that is running as expected. You can use the reverse instance to synchronize incremental data back from the destination database to the source database. | |
| Disaster recovery and multi-active redundancy | Two-way synchronization | This feature allows you to configure real-time two-way data synchronization between two databases, such as an ApsaraDB RDS for MySQL database and a self-managed MySQL database. This feature is suitable for scenarios such as active geo-redundancy based on a cellular architecture and geo-disaster recovery. | |
| | Direction switch for two-way synchronization instances in a few steps | This feature allows you to switch the direction of a two-way synchronization instance. If you need to switch between primary and secondary databases or between two cloud platforms, you can switch the direction of an existing two-way synchronization instance without configuring a new instance. | |
| | Global active database (GAD) cluster | You can create a GAD cluster based on ApsaraDB RDS and Data Transmission Service (DTS). This way, you can implement disaster recovery for databases and allow users to access the nearest resources. | N/A |
| | Synchronization topology management | This feature allows you to upgrade the synchronization topology of a synchronization task from one-way synchronization to two-way synchronization to meet evolving business requirements. | |
| Management of conflict detection and resolution policies | Conflict detection | This feature allows you to detect conflicts such as uniqueness conflicts caused by INSERT operations, inconsistent records caused by UPDATE operations, and non-existent records to be deleted. | N/A |
| | Conflict resolution | The following conflict resolution policies are supported: TaskFailed, Ignore, and Overwrite. With the TaskFailed policy, the system reports an error and terminates the task when a conflict occurs. With the Ignore policy, the system keeps the conflicting records in the destination instance. With the Overwrite policy, the system overwrites the conflicting records in the destination instance. | N/A |
| Heterogeneous synchronization | Synchronization to real-time data warehouses | This feature allows you to synchronize data to real-time data warehouses to perform high-throughput offline processing and high-performance online analysis. | Synchronize data from an ApsaraDB RDS for MySQL instance to an AnalyticDB for MySQL V3.0 cluster |
| | Non-database synchronization | This feature allows you to synchronize data to a specified function in Function Compute. You can write function code to process the data. | |
| | Heterogeneous database synchronization | This feature allows you to synchronize data between heterogeneous databases. | |
| Data shipping | Data shipping channel | You can create a data shipping instance to establish a data shipping channel. This way, you can use the data shipping SDK to ship data from data sources to DTS. | |
| | Data shipping SDK | You can use the data shipping SDK to ship data from various data sources to DTS, and then synchronize the data from DTS to the destination database. This extends the types of data sources that are supported. | |
| Homogeneous synchronization | Real-time synchronization between logically homogeneous databases | This feature allows you to synchronize data between homogeneous databases. | Synchronize data from a self-managed MySQL database to an ApsaraDB RDS for MySQL instance |
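The semantics of the three conflict resolution policies can be modeled in a few lines. The following Python sketch is only an illustration of the documented behavior: the `resolve_conflict` function, the record layout, and the exception name are hypothetical and are not part of the DTS API.

```python
# Illustrative model of the three DTS conflict resolution policies.
# All names here are invented for illustration; DTS applies these
# policies internally when it detects a uniqueness conflict.

class TaskFailedError(Exception):
    """Raised when the TaskFailed policy encounters a conflict."""

def resolve_conflict(policy, source_record, destination_record):
    """Return the record to keep in the destination when the same
    primary key exists in both the source and the destination."""
    if policy == "TaskFailed":
        # Report an error and terminate the task.
        raise TaskFailedError(f"conflict on key {source_record['id']}")
    if policy == "Ignore":
        # Keep the conflicting record that is already in the destination.
        return destination_record
    if policy == "Overwrite":
        # Overwrite the destination record with the source record.
        return source_record
    raise ValueError(f"unknown policy: {policy}")

src = {"id": 1, "name": "alice"}
dst = {"id": 1, "name": "bob"}
print(resolve_conflict("Ignore", src, dst)["name"])     # bob
print(resolve_conflict("Overwrite", src, dst)["name"])  # alice
```

For two-way synchronization, choosing Overwrite on both directions can cause the two sides to repeatedly overwrite each other, so the policy is usually chosen per direction based on which side is authoritative.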
Change tracking
The change tracking feature helps you obtain real-time incremental data from databases. You can consume the incremental data based on your business requirements and write it to a destination database. This supports scenarios such as cache updates, asynchronous business decoupling, real-time data synchronization between heterogeneous data sources, and complex extract, transform, and load (ETL) operations.
| Category | Feature | Description | References |
| --- | --- | --- | --- |
| Change tracking | Change tracking channels | You can create a change tracking instance to obtain real-time incremental data changes of databases. | |
| | Change tracking SDK | You can consume data in change tracking channels by using an SDK client, a Flink client, or a Kafka client. The SDK client demo, flink-dts-connector, and Kafka client demo show how to display the tracked data. | |
| | Traffic management for change tracking instances | If you specify a MySQL database as the source database for a change tracking instance, the system determines whether to charge you data transfer fees based on the fee type that you select during configuration. | |
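As an example of the cache update scenario mentioned above, a consumer can replay tracked changes against a local cache. The event format below is invented for illustration; real change records are consumed through the DTS SDK, Flink, or Kafka clients and have a richer schema.

```python
# Hypothetical sketch of keeping a cache up to date by consuming
# change tracking events. The {"op": ..., "row": ...} layout is an
# assumption made for this illustration, not the DTS record format.

def apply_change(cache, event):
    """Apply one tracked change event to an in-memory cache keyed by id."""
    op, row = event["op"], event["row"]
    if op in ("INSERT", "UPDATE"):
        cache[row["id"]] = row          # upsert the latest image of the row
    elif op == "DELETE":
        cache.pop(row["id"], None)      # drop the deleted row from the cache
    return cache

cache = {}
events = [
    {"op": "INSERT", "row": {"id": 1, "stock": 10}},
    {"op": "UPDATE", "row": {"id": 1, "stock": 9}},
    {"op": "DELETE", "row": {"id": 1}},
]
for e in events:
    apply_change(cache, e)
print(cache)  # {}
```

Because events for the same key must be applied in order, consumers typically process each key's events sequentially even when consuming partitions in parallel.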
Task management
DTS task management
| Category | Feature | Description | References |
| --- | --- | --- | --- |
| Task management | Task creation and configuration | You can create DTS tasks between various data sources. You can configure a DTS task before or after you purchase a DTS instance. | |
| | Task scheduling | By default, DTS schedules a task to a shared cluster. You can select a dedicated cluster in the advanced settings to schedule the task. | N/A |
| | Task monitoring | In the DTS console, you can obtain the status of the connections between DTS and the source and destination databases, as well as various performance metrics of a DTS instance. You can use this information to manage DTS tasks. | |
| | Task diagnostics | You can check the performance of the source database, destination database, network, and DTS during incremental migration. DTS provides diagnostic results and suggestions. | |
| | Task modification | You can manage the environment tags of instances, add or remove objects to be synchronized while a data synchronization instance is running, and modify the ETL configurations of a data synchronization or migration instance. | |
| | Task deletion | To prevent additional fees, you can manually release a pay-as-you-go DTS instance or unsubscribe from a subscription DTS instance if all tasks of the instance are complete and its configurations are no longer needed. | |
| | Cross-account access | This feature allows you to configure a DTS task across Alibaba Cloud accounts for scenarios such as resource migration or merging across accounts and business architecture adjustment. | |
| | Cross-cloud access in a hybrid cloud | You can access databases over public IP addresses to migrate data between accounts that have different attributes, such as an account in the public cloud and an account in the financial cloud. | N/A |
| | Operation logs | You can query the operation logs of an instance to obtain information about the operations performed on the instance, the operation results, and the operator. | |
| | Database connection management | You can register databases in advance. When you configure a task, you can directly select a registered database, and DTS automatically enters the database information for you. | |
| APIs | Alibaba Cloud pctowap open platform (POP) API operations | You can call API operations in OpenAPI Explorer for debugging. | |
| | SDKs | You can call API operations by using SDKs. | |
| | Terraform | The open source tool Terraform is supported. | N/A |
| Network management | Cross-region connectivity | This feature allows you to transmit data between source and destination databases that reside in different regions. | |
| | Data compression and transmission | DTS supports concurrent data compression and transmission to help minimize bandwidth utilization. | |
| | Access over an internal endpoint | DTS allows you to connect to the source or destination database in a virtual private cloud (VPC) over an internal endpoint. | N/A |
| | Database Gateway | You can connect a database to DTS by using Database Gateway. | |
| | Access over the Internet | You can connect a database to DTS by using a public IP address. | |
| | Cross-border data transmission | By default, DTS supports only data synchronization within the same country or region. | |
| Event center | Event notifications | You can use the event subscription feature of CloudMonitor to configure custom alert notifications for important events. This allows you to stay informed of the occurrence and progress of events in real time, and to efficiently analyze and locate issues in the event of business interruptions. | |
| | Proactive O&M platform | The O&M event alert feature is supported. When the system detects risks that may cause DTS instances to fail to run as expected, corresponding O&M events are triggered and notifications are sent through the console, emails, or internal messages. | |
| Serverless instances | Serverless instance management | You can suspend a serverless instance, view metric data, and modify the upper and lower limits of DTS units (DUs). | N/A |
| Dedicated clusters | DU management | You can view the DUs that are created for a dedicated cluster and the DU usage. You can also modify the number of DUs for a task in a dedicated cluster to adjust the specifications of the task. | |
| | Disk configuration changes | If the disk usage of a dedicated cluster is too high, you can expand its disk capacity to meet your business requirements. | Increase the storage space of nodes in a DTS dedicated cluster |
| | DTS instance migration between a dedicated cluster and a public cluster | You can migrate DTS instances between a dedicated cluster and a public cluster. | Migrate a DTS instance between a dedicated cluster and a public cluster |
| | Dedicated cluster management | You can manually renew a dedicated cluster, modify its node configurations, and specify its overcommit ratio. | |
| Security | Data encryption | DTS supports SSL-secured connections to databases. | |
| | Operation isolation | You can use Resource Access Management (RAM) identities, including RAM users and RAM roles, that are granted the minimum required permissions to access DTS. This improves data security and reduces the risks caused by permission abuse. | N/A |
| | Account permission management | You can use system policies to authorize access to resources such as ApsaraDB RDS and Elastic Compute Service (ECS) instances within the specified Alibaba Cloud account. This allows you to perform DTS tasks by using a database account that has sufficient permissions. | |
| Reliability | High availability (HA) clusters | DTS uses servers with high specifications to ensure the performance of each data synchronization or migration instance. | |
| | Resumable upload | DTS supports automatic resumable upload to ensure the reliability of data transmission. | |
| | Disaster recovery protection for data sources | If the source or destination database cannot be connected or other problems occur, DTS immediately and continuously retries the operation. | N/A |
Data integration
You can perform drag-and-drop operations or execute Flink SQL statements to configure ETL tasks. The ETL feature is integrated with the data replication capabilities of DTS to implement streaming data extraction, data transformation and processing, and data loading. This helps reduce development barriers and adverse impacts on business systems, improve efficiency, enrich real-time data processing and computing scenarios, and empower enterprises with digital transformation.
| Category | Feature | Description | References |
| --- | --- | --- | --- |
| Read/write splitting and traffic distribution | Real-time caching for transaction processing | You can migrate data from a MySQL database, such as a self-managed MySQL database or an ApsaraDB RDS for MySQL database, to a Redis instance. This reduces the load on backend relational databases and improves user experience. | |
| Metadata filtering and mapping | Database, table, and column filtering | When you configure objects for a DTS task, you can select databases, tables, and columns as task objects. | N/A |
| | Filtering by DDL and DML statements | When you configure objects for a data migration or data synchronization task, you can specify the DDL and DML operations whose data is incrementally synchronized or migrated. | N/A |
| | Database, table, and column name mapping | When you configure objects for a data migration or data synchronization task, you can specify the names that task objects, including databases, tables, and columns, use in the destination instance. This way, you can synchronize or migrate data to specific objects in the destination instance, or create an object in the destination instance that has the same schema as a source object but a different name. | |
| | Topology mapping | The data synchronization feature supports multiple types of synchronization topologies. You can plan your data synchronization instances based on your business requirements. | |
| Data filtering and mapping | WHERE condition-based filtering | When you configure objects for a data synchronization or migration task, you can specify SQL WHERE conditions so that only data that meets the conditions is synchronized or migrated to the destination database. | |
| | Data type mapping | When you synchronize or migrate data between heterogeneous databases, data type mapping is performed during schema synchronization or migration. This way, data types in the source database are converted to data types that are supported by the destination database. | |
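To make the filtering and mapping features above concrete, the following Python sketch applies a WHERE-style predicate and a column name mapping to each row. The function and the sample mapping are hypothetical; in DTS you express the predicate as an SQL condition and the mapping in the task configuration, not in code.

```python
# Illustrative sketch of row-level filtering and column name mapping.
# transform(), the predicate, and the column_map are assumptions made
# for this example; DTS performs the equivalent work internally.

def transform(rows, predicate, column_map):
    """Keep only rows that match the predicate, then rename columns
    according to column_map (unmapped columns keep their names)."""
    out = []
    for row in rows:
        if predicate(row):
            out.append({column_map.get(k, k): v for k, v in row.items()})
    return out

rows = [
    {"uid": 1, "region": "eu"},
    {"uid": 2, "region": "us"},
]
# Equivalent of: WHERE region = 'eu', with column uid mapped to user_id.
result = transform(rows, lambda r: r["region"] == "eu", {"uid": "user_id"})
print(result)  # [{'user_id': 1, 'region': 'eu'}]
```

Note that WHERE-condition filtering decides which rows reach the destination, while name mapping only changes where they land; the two are configured independently.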
Data verification
The data verification feature is provided by DTS to monitor data differences between the source and destination databases. You can perform data verification on the source and destination databases without downtime. This helps you detect inconsistencies in data and schemas at the earliest opportunity.
| Category | Feature | Description | References |
| --- | --- | --- | --- |
| Homogeneous verification | Metadata verification | This feature allows you to verify the schema of homogeneous data. | |
| | Full data verification | This feature allows you to verify historical data in homogeneous databases. | |
| | Incremental data verification | This feature allows you to verify incremental data synchronized or migrated between homogeneous databases. | |
| Heterogeneous verification | Metadata equivalence verification | This feature allows you to verify the schema of heterogeneous data. | |
| | Full data verification | This feature allows you to verify historical data in heterogeneous databases. | |
| | Incremental data verification | This feature allows you to verify incremental data synchronized or migrated between heterogeneous databases. | |
| Data correction | Metadata correction | If a schema inconsistency is detected during data verification, you can correct the data based on the verification results. | |
| | Full data correction | If a full data inconsistency is detected during data verification, you can download revised SQL statements to correct the data based on the verification results. | |
| | Incremental data correction | If an incremental data inconsistency is detected during data verification, you can correct the data based on the verification results. | |
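The core idea behind full data verification is comparing the source and destination tables row by row, typically via digests rather than raw values. The sketch below is a simplified model under assumed inputs (lists of dictionaries keyed by a primary key); real DTS verification batches rows, samples, and re-checks to tolerate in-flight changes.

```python
import hashlib

# Minimal model of digest-based full data verification. table_digests()
# and diff_keys() are hypothetical helpers for this illustration only.

def table_digests(rows, key="id"):
    """Map each primary key to a digest of the row's sorted column values."""
    digests = {}
    for row in rows:
        payload = repr(sorted(row.items())).encode()
        digests[row[key]] = hashlib.sha256(payload).hexdigest()
    return digests

def diff_keys(source_rows, destination_rows):
    """Return primary keys whose rows differ or exist on only one side."""
    src = table_digests(source_rows)
    dst = table_digests(destination_rows)
    return sorted(k for k in set(src) | set(dst) if src.get(k) != dst.get(k))

src = [{"id": 1, "v": "a"}, {"id": 2, "v": "b"}]
dst = [{"id": 1, "v": "a"}, {"id": 2, "v": "x"}]
print(diff_keys(src, dst))  # [2]
```

The keys reported by such a comparison are what a correction step would target, for example by generating SQL statements that rewrite the mismatched destination rows.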
Scenario-based solution
Scenario-based DTS solution
| Category | Feature | Description | References |
| --- | --- | --- | --- |
| ZeroETL | Synchronization from a PolarDB for MySQL cluster to an AnalyticDB for MySQL V3.0 cluster | You can use federated analytics together with the AnalyticDB Pipeline Service (APS) feature of AnalyticDB for MySQL to synchronize data from PolarDB for MySQL to AnalyticDB for MySQL Data Lakehouse Edition (V3.0) in real time. This facilitates data synchronization and management. | Use federated analytics to synchronize data to Data Lakehouse Edition |