This topic describes how to configure a data verification task in Data Transmission Service (DTS). The data verification feature helps you check whether data is consistent between the source and destination databases. In the Advanced Settings step of configuring a data synchronization or migration instance, you can configure Data Verification Mode to detect data inconsistency at the earliest opportunity.
Usage notes
Incremental data verification is not supported for database instances that are deployed on the classic network.
Data verification is not supported for DTS instances for which the multi-table merging feature is enabled.
Data verification is not supported for DTS instances whose Status is Completed.
If a table to be verified contains neither a primary key nor a unique index and the table contains more than 10,000 rows of data, DTS does not perform data verification on the table.
If the data to be verified has neither PRIMARY KEY nor UNIQUE constraints, DTS does not perform incremental data verification.
If you pause and restart a data synchronization or migration instance, full data verification configured for the instance is also restarted to verify data again.
Before you perform data verification, evaluate the impact of data verification on the performance of the source and destination databases. We recommend that you perform data verification during off-peak hours, such as when the CPU load of the source and destination databases is less than 30%. During data verification, DTS uses the read resources of the source and destination databases. This may increase the loads of the database servers. If you perform data verification when the database performance is poor, the database specifications are low, or the business volume is large, high pressure may be caused on the databases. The database services may become unavailable.
If you configure a data verification task in a data synchronization or migration instance, the valid values for Data Verification Mode are determined by the values that you select for Synchronization Types or Migration Types.
For example, if you select only Incremental Data Migration for Migration Types, Incremental Data Verification is the only valid value for Data Verification Mode.
DTS performs incremental data verification only on the data that is incrementally synchronized or migrated to the destination database. Incremental data verification is not performed on the data that is manually changed in the destination database.
You cannot select Incremental Data Verification if you configure a data verification task from the Data Verification page.
If you select Verify All Fields Based on Sampled Rows when you configure data verification for a MongoDB instance, the sampling percentage is fixed to 100%.
If you select Verify All Fields Based on Sampled Rows for Full Data Verification, we recommend that you do not use the extract, transform, and load (ETL) feature. Otherwise, the data modified by using the ETL feature is detected as inconsistent data.
Data verification is not supported for materialized views.
Supported regions
For more information, see Supported regions.
Supported scenarios
Data verification is supported in two-way synchronization scenarios between the following source and destination databases.
The following source and destination databases support Alibaba Cloud database instances and self-managed database instances.
Data verification mode | Source database type | Destination database type |
Full data verification | MySQL | MySQL, AnalyticDB for MySQL, PolarDB for MySQL, PostgreSQL, AnalyticDB for PostgreSQL, and Oracle |
PolarDB for MySQL | MySQL, AnalyticDB for MySQL, PolarDB for MySQL, AnalyticDB for PostgreSQL, and Oracle | |
PostgreSQL | MySQL, PostgreSQL, AnalyticDB for PostgreSQL, PolarDB for PostgreSQL, Oracle, and PolarDB for Oracle | |
PolarDB for PostgreSQL | MySQL, PostgreSQL, AnalyticDB for PostgreSQL, PolarDB for PostgreSQL, and Oracle | |
Standalone MongoDB databases | Standalone MongoDB databases | |
MongoDB databases that use the replica set architecture | MongoDB databases that use the replica set architecture | |
MongoDB databases that use the sharded cluster architecture | MongoDB databases that use the sharded cluster architecture | |
Tair and Redis databases that use the master-replica architecture | Tair and Redis databases that use the master-replica architecture | |
Tair and Redis databases that use the cluster architecture | Tair and Redis databases that use the cluster architecture | |
SQL Server | MySQL, AnalyticDB for MySQL, PolarDB for MySQL, SQL Server, and AnalyticDB for PostgreSQL | |
Oracle | AnalyticDB for MySQL, AnalyticDB for PostgreSQL, MySQL, PolarDB for MySQL, PolarDB for PostgreSQL, PolarDB for Oracle, PostgreSQL, and Oracle | |
PolarDB for Oracle | PolarDB for Oracle, AnalyticDB for MySQL, and Oracle | |
Db2 for LUW | MySQL, PolarDB for MySQL, AnalyticDB for MySQL, and AnalyticDB for PostgreSQL | |
Db2 for i (AS/400) | MySQL | |
TiDB | MySQL, PolarDB for MySQL, and AnalyticDB for MySQL | |
AnalyticDB for MySQL | MySQL, PolarDB for MySQL, and AnalyticDB for MySQL | |
Incremental data verification | MySQL | MySQL, AnalyticDB for MySQL, PolarDB for MySQL, PostgreSQL, AnalyticDB for PostgreSQL, and PolarDB for PostgreSQL |
PolarDB for MySQL cluster | MySQL, AnalyticDB for MySQL, PolarDB for MySQL, and AnalyticDB for PostgreSQL | |
PostgreSQL | MySQL, PostgreSQL, AnalyticDB for PostgreSQL, and PolarDB for PostgreSQL | |
PolarDB for PostgreSQL | PostgreSQL, AnalyticDB for PostgreSQL, and PolarDB for PostgreSQL | |
SQL Server | MySQL, AnalyticDB for MySQL, PolarDB for MySQL, SQL Server, and AnalyticDB for PostgreSQL | |
Oracle | AnalyticDB for MySQL, MySQL, PolarDB for MySQL, and PolarDB for Oracle | |
Schema verification | MySQL | MySQL and PolarDB for MySQL |
PolarDB for MySQL | MySQL and PolarDB for MySQL | |
PostgreSQL | PostgreSQL and PolarDB for PostgreSQL | |
PolarDB for PostgreSQL | PostgreSQL and PolarDB for PostgreSQL |
Configure a data verification task from the Data Verification page
Go to the Data Verification page.
Log on to the Data Management (DMS) console.
In the top navigation bar, move the pointer over DTS.
Choose .
Go to the configuration page of a data verification task in one of the following ways:
If you have purchased a data verification instance, click Configure Task in the Actions column of the instance.
For more information about how to purchase a data verification instance, see Purchase a data verification instance.
If you have not purchased a data verification instance, click Create Task.
Configure the source and destination databases. For more information, see Overview of data synchronization scenarios or Overview of data migration scenarios.
NoteYou cannot change the value of the Instance Region parameter in this step if you purchase the instance before you configure the task.
In the lower part of the page, click Test Connectivity and Proceed.
If the source or destination database is an Alibaba Cloud database instance, such as an ApsaraDB RDS for MySQL or ApsaraDB for MongoDB instance, DTS automatically adds the CIDR blocks of DTS servers to the whitelist of the instance. If the source or destination database is a self-managed database hosted on an Elastic Compute Service (ECS) instance, DTS automatically adds the CIDR blocks of DTS servers to the security group rules of the ECS instance, and you must make sure that the ECS instance can access the database. If the database is deployed on multiple ECS instances, you must manually add the CIDR blocks of DTS servers to the security group rules of each ECS instance. If the source or destination database is a self-managed database that is deployed in a data center or provided by a third-party cloud service provider, you must manually add the CIDR blocks of DTS servers to the whitelist of the database to allow DTS to access the database. For more information, see the CIDR blocks of DTS servers section of the Add the CIDR blocks of DTS servers topic.
WarningIf the CIDR blocks of DTS servers are automatically or manually added to the whitelist of the database or instance, or to the ECS security group rules, security risks may arise. Therefore, before you use DTS to synchronize data, you must understand and acknowledge the potential risks and take preventive measures, including but not limited to the following measures: enhancing the security of your username and password, limiting the ports that are exposed, authenticating API calls, regularly checking the whitelist or ECS security group rules and forbidding unauthorized CIDR blocks, or connecting the database to DTS by using Express Connect, VPN Gateway, or Smart Access Gateway.
Configure Data Verification Mode.
Select one or more data verification modes based on your business requirements. If you select Full Data Verification, you must also configure the parameters that are described in the following table.
NoteFull Data Verification: verifies data in the full data synchronization or migration stage of the DTS task.
Schema Verification: verifies the schemas of objects in the data synchronization or migration task.
Full data verification and incremental data verification start after schema verification is complete.
You cannot select Incremental Data Verification if you configure a data verification task from the Data Verification page.
Parameter
Description
Full Data Verification
Verify All Fields Based on Sampled Rows: verifies all fields of the data sampled based on the specified sampling percentage. You can specify an integer that ranges from 10 to 100 in the field as the sampling percentage.
Verify based on the number of table rows: verifies the number of table rows in the full data synchronization or migration stage of the DTS task. DTS does not verify the data content.
NoteIf the full data verification mode is Verify based on the number of table rows, you are not charged. If the full data verification mode is Verify All Fields Based on Sampled Rows, you are charged based on the actual amount of data to be verified.
Full Data Verification Time Rule
You can set this parameter only to Start Now.
Timeout Settings for Full Data Verification
No: does not specify a timeout period for full data verification. The system does not forcibly stop full data verification if the verification times out.
Yes: specifies a timeout period for full data verification. The system starts to count the duration when full data verification starts. If full data verification is not complete within the specified timeout period, the system forcibly stops the verification. The value must be an integer that ranges from 1 to 72.
Select Verification Benchmark
Default: checks the data consistency between the source and destination databases based on the source and destination databases.
Source Database: checks the data consistency between the source and destination databases based on the source database. Objects that exist in the destination database but do not exist in the source database are not checked.
Destination Database: checks the data consistency between the source and destination databases based on the destination database. Objects that exist in the source database but do not exist in the destination database are not checked.
Maximum Number of Rows Can be Read per Second
During full data verification, DTS uses the read resources of the source and destination databases. You can enable throttling for full data verification based on your business requirements. To configure throttling, you must specify the maximum number of data rows and bytes that are read per second. This reduces the loads of the database servers.
NoteA value of 0 specifies that the full data verification speed is not limited. If you set Maximum Number of Rows Can be Read per Second and Maximum Data Volume Can be Read per Second (Byte/s) to 0, throttling is disabled.
Maximum Data Volume Can be Read per Second (Byte/s)
Specify Verification Objects.
Select the objects to be verified in the source database.
In the Source Objects section, select the objects that you want to verify and click to move the objects to the Selected Objects section.
NoteYou can specify Table Name and Column Name for a destination table only if you select tables as the objects to be verified.
Select the objects to be verified in the destination database.
In the Selected Objects section, right-click a schema moved from the Source Objects section.
In the Edit Schema dialog box, specify Schema Name for the destination database and click OK.
In the Selected Objects section, right-click a table moved from the Source Objects section.
In the Edit Table dialog box, specify Table Name of the destination table.
Optional. In the Column Name column of the Column section, specify the column name of the destination table.
NoteYou can also clear Synchronize All Tables and manually select the columns for data verification.
Click OK.
Optional. Configure verification alerts.
If you select Full Data Verification for Data Verification Mode, you can configure Full Data Verification Alert based on your business requirements.
Parameter
Description
Full Data Verification Alert
No: disables alerting.
Yes: enables alerting. You must also select and configure alert rules. The following information describes the alert rules:
An alert is triggered if full data verification fails.
An alert is triggered if the volume of inconsistent data detected by full data verification is larger than or equal to the specified threshold.
ImportantIf you enable alerting for data verification and want to receive a notification when an alert is triggered, you must subscribe to alert messages in CloudMonitor. For more information, see Configure alert rules for DTS tasks in the CloudMonitor console.
Click Next: Advanced Settings to configure advanced settings.
Parameter
Description
Dedicated Cluster for Task Scheduling
By default, DTS schedules the task to a shared cluster. You do not need to configure this parameter. You can also purchase a dedicated cluster of the specified specifications to run DTS tasks. For more information, see What is a DTS dedicated cluster.
Retry Time for Failed Connections
The retry time range for failed connections. If the source or destination database fails to be connected after the data synchronization task is started, DTS immediately retries a connection within the time range. Valid values: 10 to 1440. Unit: minutes. Default value: 720. We recommend that you set this parameter to a value greater than 30. If DTS reconnects to the source and destination databases within the specified time range, DTS resumes the data synchronization task. Otherwise, the data synchronization task fails.
NoteIf you specify different retry time ranges for multiple data synchronization tasks that have the same source or destination database, the shortest retry time range takes precedence.
When DTS retries a connection, you are charged for the DTS instance. We recommend that you specify the retry time range based on your business requirements. You can also release the DTS instance at your earliest opportunity after the source and destination instances are released.
Retry Time for Other Issues
The retry time range for other issues. For example, if the DDL or DML operations fail to be performed after the data synchronization task is started, DTS immediately retries the operations within the time range. Valid values: 1 to 1440. Unit: minutes. Default value: 10. We recommend that you set this parameter to a value greater than 10. If the failed operations are successfully performed within the specified time range, DTS resumes the data synchronization task. Otherwise, the data synchronization task fails.
ImportantThe value of the Retry Time for Other Issues parameter must be smaller than the value of the Retry Time for Failed Connections parameter.
Environment Tag
Optional. You can select an environment tag that is used to identify the data verification instance.
Save the task settings and run a precheck.
To view the parameters to be specified when you call the relevant API operation to configure the DTS task, move the pointer over Next: Save Task Settings and Precheck and click Preview OpenAPI parameters.
If you do not need to view or have viewed the parameters, click Next: Save Task Settings and Precheck in the lower part of the page.
NoteDTS performs a precheck before a data verification task starts. You can start the data verification task only after the task passes the precheck.
If the task fails to pass the precheck, click View Details next to each failed item. After you analyze the causes based on the check results, troubleshoot the issues. Then, run a precheck again.
If an alert is triggered for an item during the precheck:
If an alert item cannot be ignored, click View Details next to the failed item and troubleshoot the issues. Then, run a precheck again.
If the alert item can be ignored, click Confirm Alert Details. In the View Details dialog box, click Ignore. In the message that appears, click OK. Then, click Precheck Again to run a precheck again. If you ignore the alert item, data inconsistency may occur, and your business may be exposed to potential risks.
View the progress of the data verification task.
If you have purchased a data verification instance, click Back when Success Rate reaches 100%.
If you have not purchased a data verification instance, perform the following steps when Success Rate reaches 100%: Click Next: Purchase Instance, read and select Data Transmission Service (Pay-as-you-go) Service Terms, click Buy and Start, and then click OK.
Configure the data verification feature for a data synchronization or migration task in DTS
Go to the Data Verification page.
NoteIf you use the previous versions of the configuration page, go to the Advanced Settings step.
For information about how to go to the Advanced Settings step when you create a DTS task, see the topics listed in Overview of data synchronization scenarios or Overview of data migration scenarios.
For an existing DTS task, perform the following steps:
Go to the Data Synchronization Tasks or Data Migration Tasks page in the DTS console.
Find the DTS task that you want to manage and click the task ID.
On the Basic Information or Task Management page, click Create Verification Task.
Configure Data Verification Mode.
Select one or more data verification modes based on your business requirements. If you select Full Data Verification, you must also configure the parameters that are described in the following table.
NoteFull Data Verification: verifies data in the full data synchronization or migration stage of the DTS task. If the task does not contain the incremental data synchronization or migration stage, data verification starts after the task is created and full data is written to the destination database. If the task contains the incremental data synchronization or migration stage, data verification starts after the task is created and incremental data is written to the destination database without latency for the first time.
Incremental Data Verification: verifies data that is incrementally synchronized or migrated. Data verification starts after the task is created and incremental data is written to the destination database without latency for the first time.
Schema Verification: verifies the schemas of objects in the data synchronization or migration task. If the task does not contain the incremental data synchronization or migration stage, data verification starts after the task is created and schemas and full data are written to the destination database. If the task contains the incremental data synchronization or migration stage, data verification starts after the task is created and incremental data is written to the destination database without latency for the first time.
Full data verification and incremental data verification start after schema verification is complete.
Parameter
Description
Full Data Verification
Verify All Fields Based on Sampled Rows: verifies all fields of the data sampled based on the specified sampling percentage. You can specify an integer that ranges from 10 to 100 in the field as the sampling percentage.
Verify based on the number of table rows: verifies the number of table rows in the full data synchronization or migration stage of the DTS task. DTS does not verify the data content.
NoteIf the full data verification mode is Verify based on the number of table rows, you are not charged. If the full data verification mode is Verify All Fields Based on Sampled Rows, you are charged based on the actual amount of data to be verified.
Full Data Verification Time Rule
You can set this parameter only to Start Now.
Timeout Settings for Full Data Verification
No: does not specify a timeout period for full data verification. The system does not forcibly stop full data verification if the verification times out.
Yes: specifies a timeout period for full data verification. The system starts to count the duration when full data verification starts. If full data verification is not complete within the specified timeout period, the system forcibly stops the verification. The value must be an integer that ranges from 1 to 72.
Select Verification Benchmark
Default: checks the data consistency between the source and destination databases based on the source and destination databases.
Source Database: checks the data consistency between the source and destination databases based on the source database. Objects that exist in the destination database but do not exist in the source database are not checked.
Destination Database: checks the data consistency between the source and destination databases based on the destination database. Objects that exist in the source database but do not exist in the destination database are not checked.
Maximum Number of Rows Can be Read per Second
During full data verification, DTS uses the read resources of the source and destination databases. You can enable throttling for full data verification based on your business requirements. To configure throttling, you must specify the maximum number of data rows and bytes that are read per second. This reduces the loads of the database servers.
NoteA value of 0 specifies that the full data verification speed is not limited. If you set Maximum Number of Rows Can be Read per Second and Maximum Data Volume Can be Read per Second (Byte/s) to 0, throttling is disabled.
Maximum Data Volume Can be Read per Second (Byte/s)
Specify Verification Objects.
In the Selected Objects section, you can select the objects that do not require data verification and click to remove them.
NoteBy default, the objects to be synchronized or migrated are added to the Selected Objects section.
Configure alerting for data verification.
Configure alerting for data verification based on your business requirements. The following table describes the parameters.
Parameter
Description
Full Data Verification Alert
No: disables alerting.
Yes: enables alerting. You must also select and configure alert rules. The following information describes the alert rules:
An alert is triggered if full data verification fails.
An alert is triggered if the volume of inconsistent data detected by full data verification is larger than or equal to the specified threshold.
Incremental Data Verification Alert
No: disables alerting.
Yes: enables alerting. You must also select and configure alert rules. The following information describes the alert rules:
An alert is triggered if incremental data verification fails.
An alert is triggered if the volume of inconsistent data detected by incremental data verification within the specified consecutive periods is larger than or equal to the specified threshold. You can specify the number of consecutive periods, the statistical period, and the threshold of the inconsistent data volume.
An alert is triggered if the latency of data migration or synchronization detected by incremental data verification within the specified consecutive periods is greater than or equal to the specified threshold. You can specify the number of consecutive periods, the statistical period, and the latency of data migration or synchronization.
ImportantIf you enable alerting for data verification and want to receive a notification when an alert is triggered, you must subscribe to alert messages in CloudMonitor. For more information, see Configure alert rules for DTS tasks in the CloudMonitor console.
Complete the subsequent steps as prompted.
Related API operations
API operation | Description |
Configures a data verification task. |
FAQ
Does DTS detect the database names, table names, and column names that are modified by using the name mapping feature as inconsistent data?
The data verification feature is compatible with the name mapping feature for databases and tables. If you use the name mapping feature to modify database names, table names, and column names, only the modified column names are detected as inconsistent data.
Why am I unable to select a data verification mode for a DTS instance?
The DTS instance may not support the data verification mode, or the Synchronization Types or Migration Types that corresponds to the data verification mode is not selected in the Configure Objects and Advanced Settings step.
Does DTS determine the data in a destination table is inconsistent from the source table if the data is filtered by using filter conditions before data synchronization or migration?
During full data verification, DTS determines the data is consistent if the data is filtered by using Filter Conditions before data synchronization or migration.
During incremental data verification, DTS determines the data is consistent if the data is filtered by using Filter Conditions before data synchronization or migration. Filter Conditions are not supported for incremental data synchronization or migration from source databases whose Database Type is MongoDB.