All Products
Search
Document Center

PolarDB:Verify data in the primary and secondary instances

Last Updated:Aug 02, 2024

This topic describes how to verify data in the primary and secondary instances of a Global Database Network (GDN).

Prerequisites

A GDN is created and at least one secondary instance is added to the GDN. For more information, see Create and manage a GDN.

Usage notes

  • The operations described in this topic can be performed only on a secondary instance.

  • Data verification tasks consumes storage resources. We recommend that you perform data verification tasks during off-peak hours.

Submit a data verification task

Syntax

CHECK REPLICA TABLE {`test_db`.`test_tb`} | {`test_db`} [CHANNEL='channel_name'] [MODE=direct|snapshot];

Limits

  • Tables without a primary key cannot be verified.

  • If you submit a new data verification task for a table that is being verified, the data verification task is ignored.

  • If you submit a new data verification task for a table that has been verified, the verification results of the table generated in the last task are replaced by the results generated by the new task.

Note
  • Data verification tasks are performed in an asynchronous manner.

  • You can submit a task to verify a single table or all tables in the specified database.

  • In the syntax, the CHANNEL parameter indicates the current channel name of the secondary instance. You can obtain the parameter value by executing the SHOW SLAVE STATUS statement on the secondary instance and viewing the value of Channel_Name. For more information, see SHOW SLAVE STATUS.

  • In the syntax, MODE indicates the verification mode. Valid values:

    • snapshot: Verify data by building consistency snapshots for upstream and downstream data. To specify this value for MODE, you must enable the Sync Point feature. For more information, see Enable Sync Point.

    • direct: Verify data by directly reading upstream and downstream data. The verification results in this mode may be inaccurate.

Example

mysql> check replica table `testdb`.`testtb` channel='test' mode=direct;
Query OK, 0 rows affected (0.05 sec)

Enable Sync Point

You can execute the following SQL statements on the primary instance to enable the Sync Point feature:

set global enable_polarx_sync_point = true;
set global enable_sync_point = true;
set global enable_xa_tso = true;
set global enable_auto_commit_tso = true;
# Specify the interval at which sync points are generated. Unit: millisecond.
set global SYNC_POINT_TASK_INTERVAL = 5000;

After the Sync Point feature is enabled, the Change Data Capture (CDC) node builds TSO mappings between the primary and secondary instances in real time during data synchronization. You can view all TSO mappings in the information_schema.rpl_sync_point view of the secondary instance.

mysql> select * from information_schema.rpl_sync_point limit 10;
+-------+---------------------+---------------------+---------------------+
| ID    | PRIMARY_TSO         | SECONDARY_TSO       | CREATE_TIME         |
+-------+---------------------+---------------------+---------------------+
| 31482 | 7211850887478640704 | 7211850889504489536 | 2024-06-27 06:00:41 |
| 31483 | 7211850908445966400 | 7211850910203379776 | 2024-06-27 06:00:46 |
| 31484 | 7211850929417486400 | 7211850931690799168 | 2024-06-27 06:00:51 |
| 31485 | 7211850950393200704 | 7211850952322580544 | 2024-06-27 06:00:56 |
| 31486 | 7211850971368915008 | 7211850973738696768 | 2024-06-27 06:01:01 |
| 31487 | 7211850992327852096 | 7211850994282397760 | 2024-06-27 06:01:06 |
| 31488 | 7211851013303566400 | 7211851015677542464 | 2024-06-27 06:01:11 |
| 31489 | 7211851034275086400 | 7211851036204466240 | 2024-06-27 06:01:16 |
| 31490 | 7211851055250800704 | 7211851057595416640 | 2024-06-27 06:01:21 |
| 31491 | 7211851076218126400 | 7211851078210420800 | 2024-06-27 06:01:26 |
+-------+---------------------+---------------------+---------------------+
10 rows in set (0.04 sec)
Note

A TSO is a timestamp that is used to uniquely identify a transaction in binary logs. For more information, see Simple Log Service (Binary logging).

A TSO mapping includes the TSOs of the primary and secondary instances at the same point in time.

View the progress of a data verification task

Syntax

CHECK REPLICA TABLE {`test_db`.`test_tb`} | {`test_db`} SHOW PROGRESS;
Note

You can view data verification progress of a single table or all tables in the specified database.

Example

mysql> CHECK REPLICA TABLE `test_db`.`test_tb` SHOW PROGRESS;
+----------+---------+-------+----------+---------+
| DATABASE | TABLE   | STAGE | STATUS   | SUMMARY |
+----------+---------+-------+----------+---------+
| test_db  | test_tb | CHECK | FINISHED | SUCCESS |
+----------+---------+-------+----------+---------+
1 row in set (0.04 sec)

Pause a data verification task

Syntax

CHECK REPLICA TABLE {`test_db`.`test_tb`} | {`test_db`} PAUSE;
Note

The data verification task of each table is automatically split into one or more subtasks based on the size of data involved in the task. After you execute a SQL statement to pause a data verification task, all ongoing subtasks of the task continue to execute, and subtasks that have not been started are paused.

Example

mysql> CHECK REPLICA TABLE `testdb`.`testtb` PAUSE;
Query OK, 0 rows affected (0.05 sec)

Continue a paused data verification task

Syntax

CHECK REPLICA TABLE {`test_db`.`test_tb`} | {`test_db`} CONTINUE;
Note

After you continue a paused data verification task, all paused subtasks of the task start to be executed.

Example

mysql> CHECK REPLICA TABLE `testdb`.`testtb` CONTINUE;
Query OK, 0 rows affected (0.05 sec)

View the result of a data verification task

Syntax

CHECK REPLICA TABLE {`test_db`.`test_tb`} | {`test_db`} SHOW DIFF;

Example

mysql> CHECK REPLICA TABLE `test_db`.`test_tb` SHOW DIFF;
+----------+---------+------------+--------+--------------+-------------+--------------+-------------+
| DATABASE | TABLE   | ERROR_TYPE | STATUS | SRC_KEY_NAME | SRC_KEY_VAL | DST_KEY_NAME | DST_KEY_VAL |
+----------+---------+------------+--------+--------------+-------------+--------------+-------------+
| test_db  | test_tb | Miss       | FOUND  | [id]         | [2]         | [id]         | NULL        |
+----------+---------+------------+--------+--------------+-------------+--------------+-------------+
1 row in set (0.00 sec)
Note

The ERROR_TYPE field in the result indicates the reason why the upstream and downstream data is inconsistent. Valid values:

  • Miss: Some data is missing in the downstream.

  • Orphan: Additional data is included in the downstream.

  • Diff: The upstream data and downstream data are different.

Delete a data verification task

Syntax

CHECK REPLICA TABLE {`test_db`.`test_tb`} | {`test_db`} CANCEL;
Note

After you delete a data verification task, all results generated by the task are also cleared.

Example

mysql> CHECK REPLICA TABLE `testdb`.`testtb` CANCEL;
Query OK, 0 rows affected (0.05 sec)