Configure matching rules for migration objects - ApsaraDB for OceanBase

This topic describes the background information, limitations, procedure, scenarios, and troubleshooting tips for configuring matching rules for migration or synchronization objects.

Background

When you create a data migration or synchronization task, you must specify the objects to migrate or synchronize. The data transmission service allows you to directly specify objects, import objects, or configure matching rules for objects. You can configure wildcard-based matching rules to specify migration or synchronization objects. You can also configure object mapping logic between the source and the destination. This allows you to specify a large number of objects to be migrated or synchronized in a simple and efficient manner. New tables meeting the matching rules can be automatically synchronized to the destination through DDL operations for incremental synchronization. For more information, see Supported DDL operations for synchronization and limitations.

Wildcard patterns for data migration/synchronization between databases

The following table describes the wildcard patterns supported by the data transmission service for data migration/synchronization between databases.

Note

In the following table, an asterisk (*) indicates a wildcard.

Category	Supported rule	Example	Description
Direct migration	.	kd_test.person	All tables whose name starts with person in all databases whose name starts with kd_test are migrated from the source to the destination. The source database names and table names remain unchanged.
	*.<source table>	kd_test*.person	All tables named person in all databases whose name starts with kd_test are migrated from the source to the destination. The source database names and table names remain unchanged.
	<source database>.*	kd_test.person*	All tables whose name starts with person in the database named kd_test are migrated from the source to the destination. The source database name and table names remain unchanged.
	<source database>.<source table>	kd_test.person	The table named person in the database named kd_test is migrated from the source to the destination. The source database name and table name remain unchanged.
Migration object renaming	<source database>.<source table>=<destination database>.<destination table>	kd_test.person=kd_test_new.person_new	The table named person in the database named kd_test is migrated from the source to the destination. The kd_test database is renamed as kd_test_new, and the person table is renamed as person_new.
	<source database>.=<destination database>.	kd_test.person=kd_test_new.person	All tables whose name starts with person in the database named kd_test are migrated from the source to the destination. The kd_test database is renamed as kd_test_new, and the source table names remain unchanged.
	.<source table>=.<destination table>	kd_test.person=kd_test.person_new	All tables named person in all databases whose name starts with kd_test are migrated from the source to the destination. The tables are renamed as person_new, and the source database names remain unchanged.
Data merge migration	<source database>.*=<destination database>.<destination table>	kd_test.person*=kd_test.person_all	All tables whose name starts with person in the database named kd_test at the source are aggregated to the person_all table in the kd_test database at the destination.
	*.<source table>=<destination database>.<destination table>	kd_test*.person=kd_test_all.person	All tables named person in all databases whose name starts with kd_test at the source are aggregated to the person table in the kd_test_all database at the destination.
	.=<destination database>.<destination table>	kd_test.person=kd_test_all.person_all	All tables whose name starts with person in all databases whose name starts with kd_test at the source are aggregated to the person_all table in the kd_test_all database at the destination.
	.=<destination database>.*	kd_test.person=kd_test_all.person*	All tables whose name starts with person in all databases whose name starts with kd_test at the source are aggregated to the kd_test_all database at the destination. The source table names remain unchanged.
	.=*.<destination table>	kd_test.person=kd_test*.person_all	All tables whose name starts with person in all databases whose name starts with kd_test at the source are aggregated to the person_all table in the databases whose name starts with kd_test at the destination. The source database names remain unchanged.

The requirements for matching rules are as follows:

You cannot use wildcards for both database and table names at the destination, for example, kd_test*.person*=kd_test*.person*.
If you use wildcards for both the source and destination databases, the database expression must be the same for the source and destination, indicating smooth database migration.
If you use wildcards for both the source and destination tables, the table expression must be the same for the source and destination, indicating smooth table migration.
If you use a wildcard for databases at the destination, you must also use a wildcard for databases at the source.
If you use a wildcard for tables at the destination, you must also use a wildcard for tables at the source.

Wildcard patterns for data migration/synchronization between a database and a Message Queue instance

The following table describes the wildcard patterns supported by the data transmission service for data migration/synchronization between a database and a Message Queue instance.

Note

In the following table, an asterisk (*) indicates a wildcard.

Supported rule	Example	Description
.=<topic name>	.=topic	Multiple tables in multiple databases are mapped to one topic.
*.<source table>=<topic name>	*.b=topic	The tables named b in multiple databases are mapped to one topic.
<source database>.*=<topic name>	a.*=topic	Multiple tables in database a are mapped to one topic.
<source database>.<source table>=<topic name>	a.b=topic	Table b in database a is mapped to a topic.

Limitations

The data transmission service supports multiple rules. Make sure that each rule is placed in a single row and has no leading or trailing spaces.
Matching rules for migration or synchronization objects must not be empty. Object exclusion rules can be empty.
DDL operations for change are not supported during schema migration and full data migration.
When you configure matching rules to select migration or synchronization objects, table names cannot contain line breaks, spaces, or special characters. The special characters are . | " ' ` ( ) = ; / & \ * ? [ ] [ ! ]
You are not allowed to configure multiple matching rules to map different tables in the same database at the source to different databases at the destination, for example, a.a* = b.a* & a.b* = c.b*.
In a scenario of database or table aggregation, reverse incremental migration is not supported.
Note
The data transmission service checks whether database or table aggregation exists only when a data migration or synchronization task is saved or started. The data transmission service does not block database or table aggregation that occurs during the running of a task. However, the mappings between databases or tables may fail to be correctly identified during reverse incremental migration, which compromises the data quality.
At present, the data transmission service does not support the DDL statement CREATE DATABASE. If a new database created at the source meets the object matching rules, you must manually create a corresponding database at the destination to continue with the data synchronization of the new database.

Considerations

After you configure object matching rules and exclusion rules, objects that are within the difference set of the object matching and exclusion rules can be selected.
Note
A difference set between two sets contains all elements that exist in one set but do not exist in the other set.
After DDL synchronization is enabled, when you use a DDL statement to create a new table or modify the schema of a table at the source, if the table is within the difference set of the object matching and exclusion rules, this DDL statement can be synchronized to the destination by the data transmission service.
Take note of the following considerations if you want to aggregate multiple tables:
- We recommend that you configure the mappings between the source and destination databases by specifying matching rules.
- We recommend that you manually create schemas at the destination. If you create a schema by using the data transmission service, skip the failed objects in the schema migration step.
- If you select DDL Synchronization, databases or tables may be deleted by mistake. For example, when multiple databases or tables at the source are aggregated to a single database or table at the destination, if a database or table is dropped at the source, the aggregated database or table may be dropped at the destination.
- When you create a data migration task, click Full Data Migration and select Ignore for Processing Strategy When Destination Table Has Records.
  Note
  If you select Ignore, data is pulled in IN mode for full verification. In this case, verification is inapplicable if the destination contains data that does not exist in the source, and the verification performance is downgraded.
If a renaming mapping rule is configured for tables, the renaming mapping rule takes precedence. For example, if the rules a.b[0-3] and a.b[3-5]=a.c are configured, the a.b3 table is renamed as a.c.
When you execute the DDL statement RENAME TABLE, if the renamed table does not meet the original matching or exclusion rule, unexpected synchronization issues may occur. Proceed with caution.

Configure matching rules between databases

Complete the steps before Select Migration Objects or Select Synchronization Objects for a data migration or synchronization task.
For more information, see the topics about data migration or data synchronization tasks of the corresponding data source types.
Note
- The procedure for selecting migration objects is the same as that for selecting synchronization objects. This section describes how to select migration objects.
- In data synchronization tasks for synchronizing data from a MySQL tenant of OceanBase Database to an AnalyticDB for MySQL database, from an ApsaraDB RDS for PostgreSQL instance to an Oracle tenant of OceanBase Database, or between MySQL tenants of OceanBase Database, matching rules are configured for selecting synchronization objects between databases.
In the Select Migration Objects section, select Match Rules.
In the Specify Migration Scope section, specify object migration rules in the Object Migration Rule field and object exclusion rules in the Object Exclusion Rule field. The Object Exclusion Rule field is optional. For more information, see Wildcard rules.
Note
If the configured rule contains spaces, the object migration or synchronization project may fail.

Click Verify.

To view the matching results, click Preview Objects after the verification succeeds. The wildcard-based matching rules and exclusion rules apply to tables and views. The matching results are displayed on the Final Objects, New Objects, and Removed Objects tabs.

Tab	Description
Final Objects	Displays the migration objects that are hit by the specified matching rules.
New Objects	Displays the migration objects that are not in the result of the previous matching.
Removed Objects	Displays the migration objects that are only in the result of the previous matching.

After you configure matching rules for selecting migration or synchronization objects, you can set filtering conditions.

Choose Matching Result > Final Objects and move the pointer over the target table object.
Click Settings.
In the Settings dialog box, specify a standard SQL WHERE clause to filter data by row. Then, click Validate Syntax. For more information, see Use SQL conditions to filter data.
After the syntax validation is passed, click OK.
You can also view column information of the migration objects in the View Columns section.

Complete subsequent task settings as prompted.

Sample scenarios

Smooth migration of objects
Migrate all tables whose name starts with test in all databases whose name starts with jenkins_api at the source to the destination, and retain the original database and table names. To do so, configure the matching rule as shown in the following figure.
Renaming of objects after migration
Migrate all tables whose name starts with test in the database named jenkins_my2dh_one at the source to the destination, rename the jenkins_my2dh_one database as jenkins_my2dh_one_new, and retain the original table names. To do so, configure the matching rule as shown in the following figure.
Aggregation of objects
Aggregate all tables whose name starts with order in all databases whose name starts with jenkins_api at the source to the order table in the jenkins_api_all database at the destination. To do so, configure the matching rule as shown in the following figure.
Configuration of object exclusion rules
Exclude historical tables whose name starts with history_ and log tables whose name ends with log in the jenkins_api_mysql56 database at the source from migration. To do so, configure the matching rule as shown in the following figure.

Configure matching rules for data migration/synchronization from a database to a Message Queue instance

When you synchronize data from OceanBase Database to a DataHub, Kafka, or RocketMQ instance, you can configure matching rules to select the objects to synchronize.

Create a data synchronization task and proceed to the Select Synchronization Objects step as prompted.
For more information, see the topics for data synchronization tasks of the corresponding type.
In the Select Synchronization Objects section, select Match Rules.
Specify Object Synchronization Rules and Object Exclusion Rule (optional). For more information, see Wildcard patterns supported for matching rules.
When you configure matching rules, the business logic varies with the type of the data synchronization task.
- If the destination is a DataHub instance, the topic type can be Tuple or BLOB.
  - For the Tuple type, you can enter only existing topic names without wildcards or spaces. After you select tables, they are mapped to topics in a one-to-one manner.
  - For the BLOB type, many-to-one and one-to-one mapping methods are supported, and no spaces are allowed.
    If you have selected Schema Synchronization when you specified the synchronization type, you can choose to enter the name of an existing topic or create a new topic. Only one mapping method is supported. You can select only one mapping method for creating or selecting topics. If you did not select Schema Synchronization when you specified the synchronization types, you can enter only the name of an existing topic.
- If the destination is a Kafka or RocketMQ instance, many-to-one and one-to-one mapping methods are supported, and no spaces are allowed.
  If you have selected Schema Synchronization when you specified the synchronization types, you can choose to enter the name of an existing topic or create a new topic. If you did not select Schema Synchronization when you specified the synchronization types, you can enter only the name of an existing topic.
Click Verify.
To view the matching results, click Preview Objects after the verification succeeds. The matching results are displayed on the Final Objects, New Objects, and Removed Objects tabs.
After you configure matching rules to select synchronization objects, you can set filtering conditions.
1. Choose Matching Result > Final Objects and move the pointer over the target table object.
2. Click Settings.
3. In the Settings dialog box, you can perform the following operations:
  - In the Row Filters section, specify a standard SQL WHERE clause to filter data by row. Then, click Validate Syntax. For more information, see Use SQL conditions to filter data.
  - Select the sharding columns that you want to use from the Sharding Columns drop-down list. You can select multiple fields as sharding columns. This parameter is optional.
    Unless otherwise specified, select the primary keys as sharding columns. If the primary keys are not load-balanced, select load-balanced fields with unique identifiers as sharding columns to avoid potential performance issues. Sharding columns can be used for the following purposes:
    - Load balancing: Threads used for sending messages can be recognized based on the sharding columns if the destination table supports concurrent writes.
    - Orderliness: The data transmission service ensures that messages are received in order if the values of the sharding columns are the same. The orderliness specifies the sequence of executing DML statements for a column.
  - In the Select Columns section, select the columns to be synchronized. For more information, see Column filtering.
4. Click OK.
Complete subsequent task settings as prompted.

FAQ

Insufficient privilege
Pay attention to the privilege settings of the source database user. If you do not grant all required privileges to the migration user, some objects are not displayed in the frontend by the data transmission service, and you cannot correctly configure matching rules. In this case, you must add these objects to Object Exclusion Rule to prevent the data migration or synchronization task from being interrupted because the data transmission service cannot find the target objects.
DML filtering unsupported
If DDL synchronization is disabled, the data transmission service allows you to select objects based on matching rules. If a new table created during incremental synchronization meets a matching rule, the related DDL statements will be ignored but the data transmission service will continue to synchronize DML statements. As a result, the data migration or synchronization task will be interrupted because objects cannot be migrated or synchronized to the destination. Therefore, you must create the table at the destination or add the table to the blocklist.