Objects of data migration, data synchronization, and change tracking tasks - Data Transmission Service

You can call the API operations of Data Transmission Service (DTS) to configure or query the objects of a data migration, data synchronization, or change tracking task. This topic describes the related API operations and provides object definitions and configuration examples.

Related API operations and parameters

API	Description
ConfigureDtsJob ConfigureSubscription	You can use the request parameter `DbList` to specify the objects of a data migration, data synchronization, or change tracking task.
DescribeDtsJobDetail DescribeDtsJobs	You can use the response parameter `DbObject` to query the objects of a data migration, data synchronization, or change tracking task.

Object definitions

The value of an object-related parameter is a JSON string. This section describes object-related parameters.

If the objects include multiple databases, refer to the following parameter description to define the objects:

{
    "The name of source database 1": {
        "name": "The database name that you want to use for source database 1 in the destination instance",
        "all": true (indicates that the entire database is used as the source object)
    },
    "The name of source database 2": {
        "name": "The database name that you want to use for source database 2 in the destination instance",
        "all": false (indicates that the source object is not the entire database),
        "Table": {
            "The name of source table A": {
                "name": "The table name that you want to use for source table A in the destination instance",
                "all": true (indicates that the entire table is used as the source object),
                "dml_op": "The DML operations that you want to migrate or synchronize during incremental data migration or synchronization",
                "ddl_op": "The DDL operations that you want to migrate or synchronize during incremental data migration or synchronization"
            }
        }
    },
    "The name of source database 3": {
        "name": "The database name that you want to use for source database 3 in the destination instance",
        "all": true (indicates that the entire database is used as the source object),
        "dml_op": "The DML operations that you want to migrate or synchronize during incremental data migration or synchronization",
        "ddl_op": "The DDL operations that you want to migrate or synchronize during incremental data migration or synchronization"
    }
}

If the objects are specific columns or include filter conditions, refer to the following parameter description to define the objects:

{
    "The name of the source database": {
        "name": "The database name that you want to use for the source database in the destination instance",
        "all": false (indicates that the source object is not the entire database),
        "Table": {
            "The name of source table A": {
                "name": "The table name that you want to use for source table A in the destination instance",
                "all": false (indicates that the source object is not the entire table),
                "filter": "id>10"
                "column": {
                    "id": {
                        "key": "PRI",
                        "name": "id",
                        "type": "int(11)",
                        "sharedKey": false,
                        "state": "checked"
                    }
                },
                "shard": 12
            }
        }
    }
}

If you migrate or synchronize objects to an AnalyticDB for MySQL cluster or an AnalyticDB for PostgreSQL instance, refer to the following parameter description to define the objects:

{
    "The name of the source database": {
        "name": "The database name that you want to use for the source database in the destination instance",
        "all": false (fixed value) You must set the value to false if you migrate objects to an AnalyticDB for MySQL cluster or an AnalyticDB for PostgreSQL instance, regardless of whether the objects are specific tables or an entire database. In this case, you must also specify the partition key that corresponds to each table.
        "Table": {
            "The name of source table A": {
                "all": true (indicates that the entire table is used as the source object),
                "name": "The table name that you want to use for source table A in the destination instance",
                "primary_key": "The primary key. Example: id",
                "type": "The table type. Example: dimension",
            }
            "The name of source table B": {
                "all": true (indicates that the entire table is used as the source object),
                "name": "The table name that you want to use for source table B in the destination instance",
                "part_key": "The partition key. Example: id",
                "primary_key": "The primary key. Example: id",
                "type": "The table type. Example: partition",
                "tagColumnValue": "The tag column value"
            }
        }
    }
}

If an independent conflict resolution policy is specified for the objects to be synchronized, refer to the following parameter description to define the objects:

Note

The independent conflict resolution policy is supported only for two-way data synchronization between MySQL databases or between PolarDB for MySQL clusters.
You can configure independent conflict resolution policies at the database level or table level.
The global conflict resolution policy does not apply to the columns for which an independent conflict resolution policy is specified.

Table-level policy

{
    "The name of source database 1": {
      "name": "The database name that you want to use for source database 1 in the destination instance",
      "all": true (indicates that the entire database is used as the source object),
      "conflict": "Task-level conflict resolution policy"
    },
    "Name of source database 2": {
      "name": "The database name that you want to use for source database 2 in the destination instance",
      "all": false (indicates that the source object is not the entire database),
      "conflict": "overwrite",
      "Table": {
        "The name of source table A": {
          "name": "The table name that you want to use for source table A in the destination instance",
          "all": true (indicates that the entire table is used as the source object),
          "cdr_cmp_col": "Columns for conflict detection",
          "cdr_rslv_col": "Columns for conflict detection",
          "resolve_method": "Table-level conflict resolution policy"
        }
      }
    }
}

Database-level policy

If the object to be synchronized is the entire database, use the following code to define independent conflict resolution policies:

"Name of source database 1": {
  "name": "The database name that you want to use for source database 1 in the destination instance",
  "all": true (indicates that the entire database is used as the source object),
  "conflict": "Task-level conflict resolution policy",
  "cdr_cmp_col": "Columns for conflict detection",
  "cdr_rslv_col": "Columns for conflict detection",
  "resolve_method": "Database-level conflict resolution policy"
  }
}

If the object to be synchronized is not the entire database, use the following code to define independent conflict resolution policies:

"Name of source database 2": {
  "name": "The database name that you want to use for source database 2 in the destination instance",
  "all": false (indicates that the source object is not the entire database),
  "conflict": "Task-level conflict resolution policy",
  "cdr_cmp_col": "Columns for conflict detection",
  "cdr_rslv_col": "Columns for conflict detection",
  "resolve_method": "Database-level conflict resolution policy",
  "Table": {
    "The name of source table A": {
      "name": "The table name that you want to use for source table A in the destination instance",
      "all": true (indicates that the entire table is used as the source object)
    }
  }
}

Parameter	Description
`name`	The database name, table name, or column name that you want to use in the destination instance. For example, if you want to migrate data from a database named dtssource to a database named dtstarget, you must set the name parameter to dtstarget.
`all`	Specifies whether to select all tables or columns. Valid values: `true` Note If you set this parameter to true, you do not need to configure the information about specific tables or columns. `false`
`Table`	The information about the source table.
`filter`	The condition used to filter the source data. The filter condition takes effect only at the table level. For example, you can enter `id>10` to migrate or synchronize only data records whose id column value is greater than 10. For more information about how to specify filter conditions, see Specify filter conditions. Note You cannot specify filter conditions for change tracking tasks.
`column`	The information about the source column.
`key`	Specifies whether to use the column as the primary key. Valid values: `PRI`: yes. Empty string: no.
`sharedKey`	Specifies whether to use the column as the shard key. Valid values: `true` `false` Note This parameter is required only if the source database is a Kafka instance.
`type`	The data type of the column.
`state`	The state of the column. If this parameter is set to `checked`, the column is selected.
`shard`	The number of shards in the source table. Note This parameter is required only if the source database is a Kafka instance.
`dml_op`	The DML operations that you want to migrate or synchronize during incremental data migration or synchronization. Valid values: `i`: INSERT operations. `u`: UPDATE operations. `d`: DELETE operations. Empty string: All DML operations that are supported by the task are migrated or synchronized during incremental migration or synchronization. `none`: No DML operation is migrated or synchronized during incremental migration or synchronization. Note For more information about the DML operations supported by different data migration or synchronization tasks, see the topics listed in Overview of data migration scenarios or Overview of data synchronization scenarios.
`ddl_op`	The DDL operations that you want to migrate or synchronize during incremental data migration or synchronization. Valid values: `ct`: CREATE TABLE operations. `at`: ALTER TABLE operations. `dt`: DROP TABLE operations. `rt`: RENAME TABLE operations. `tt`: TRUNCATE TABLE operations. Empty string: All DDL operations that are supported by the task are migrated or synchronized during incremental migration or synchronization. `none`: No DDL operation is migrated or synchronized during incremental migration or synchronization. Note For more information about the DDL operations supported by different data migration or synchronization tasks, see the topics listed in Overview of data migration scenarios or Overview of data synchronization scenarios.
`primary_key`	The primary key. This parameter is valid and required only if the destination instance is an AnalyticDB for MySQL cluster or an AnalyticDB for PostgreSQL instance.
`part_key`	The partition key. This parameter is valid and required only if the destination instance is an AnalyticDB for MySQL cluster or an AnalyticDB for PostgreSQL instance.
`type` Important The `type` parameter here is not the same as the `type` parameter that represents the data type of a field.	The type of the source table. This parameter is required if the destination instance is an AnalyticDB for MySQL cluster or an AnalyticDB for PostgreSQL instance. Valid values: dimension: dimension table. partition: partitioned table.
`tagColumnValue`	The custom value of the __dts_data_source tag column. This parameter is valid and required only if the destination instance is an AnalyticDB for MySQL cluster.
`conflict`	The global conflict resolution policy at the task level. You must specify a value for this parameter for each source database and the values that you specify must be the same. Valid values: `overwrite`: If a conflict occurs during data synchronization, the conflicting records in the destination database are overwritten. `interrupt`: If a conflict occurs during data synchronization, the data synchronization task reports an error and exits the process. The task enters a failed state, and you must manually resolve the conflict. Note In the DTS console, the state of the task is displayed as TaskFailed. `ignore`: If a conflict occurs during data synchronization, the data synchronization task ignores the current statement and continues the process. The conflicting records in the destination database are used.
`resolve_method`	The independent conflict resolution policy at the table level. The independent conflict resolution policy is supported only for incremental data synchronization. Valid values: `overwrite`: If a conflict occurs during data synchronization, the conflicting records in the destination database are overwritten. `interrupt`: If a conflict occurs during data synchronization, the data synchronization task reports an error and exits the process. The task enters a failed state, and you must manually resolve the conflict. Note In the DTS console, the state of the task is displayed as TaskFailed. `ignore`: If a conflict occurs during data synchronization, the data synchronization task ignores the current statement and continues the process. The conflicting records in the destination database are used. `use_max`: If a conflict occurs during data synchronization, the two records in the conflicting columns are compared and the record that represents a larger value is written to the destination database. If this record does not exist or the field type of this record does not meet the requirements, the effect of use_max is equivalent to that of `overwrite`. `use_min`: If a conflict occurs during data synchronization, the two records in the conflicting columns are compared and the record that represents a smaller value is written to the destination database. If this record does not exist or the field type of this record does not meet the requirements, the effect of use_min is equivalent to that of `ignore`.
`cdr_cmp_col`	In a table-level conflict resolution policy, this parameter specifies the columns for conflict detection. The primary key column and unique key column are excluded. The values of cdr_cmp_col and cdr_rslv_col must be the same. Important The `cdr_cmp_col` and `cdr_rslv_col` parameters are required if the resolve_method parameter is set to `use_max` or `use_min`. By default, primary key and unique key columns are selected. You do not need to manually select these columns. In a database-level conflict resolution policy, this parameter specifies the column for conflict detection. The values of cdr_cmp_col and cdr_rslv_col must be the same. Important Parameters `cdr_cmp_col` and `cdr_rslv_col` must be specified.
`cdr_rslv_col`

Configuration examples

Example 1: Replicate data from all tables in the dtstestdata database.
```
{"dtstestdata": {   "name": "dtstestdata",   "all": true }}
```
Example 2: Replicate data from all tables in the dtstestdata database and change the database name to dtstestdata_new.
```
{"dtstestdata": {   "name": "dtstestdata_new",   "all": true }}
```

Example 3: Replicate data from a table named customer in the dtstestdata database.

{"dtstestdata": {
   "name": "dtstestdata",
   "all": false,
   "Table": {
     "customer": {
       "name": "customer",
       "all": true, 
       "column": { 
         "id": {
           "key": "PRI",
           "name": "id",
           "type": "int(11)",
           "sharedKey": false,
           "state": "checked"  
         },
         "gmt_create": {
           "key": "",
           "name": "gmt_create",
           "type": "datetime",
           "sharedKey": false,
           "state": "checked"
         },
         "gmt_modify": {
           "key": "",
           "name": "gmt_modify",
           "type": "datetime",
           "sharedKey": false,
           "state": "checked"
         },
         "valid_time": {
           "key": "",
           "name": "valid_time",
           "type": "datetime",
           "sharedKey": false,
           "state": "checked"
         },
         "creator": {
           "key": "",
           "name": "creator",
           "type": "varchar(200)",
           "sharedKey": false,
           "state": "checked"
         }
       },
       "shard": 12
     }
   }
 }
}

Example 4: Replicate data from specific columns of the customer and order tables in the dtstestdata database.

{"dtstestdata": {
   "name": "dtstestdata",
   "all": false,
   "Table": {
     "customer": {
       "name": "customer",
       "all": false, 
       "column": { 
         "id": {
           "key": "PRI",
           "name": "id",
           "type": "int(11)",
           "sharedKey": false,
           "state": "checked"  
         },
         "level": {
           "key": "",
           "name": "level",
           "type": "varchar(5000)",
           "sharedKey": false,
           "state": "checked"
         },
         "name": {
           "key": "",
           "name": "name",
           "type": "varchar(500)",
           "sharedKey": false,
           "state": "checked"
         },
       },
       "shard": 12
     },
     "order": {
       "name": "order",
       "all": false,
      "column": {
         "id": {
           "key": "PRI",
           "name": "id",
           "type": "int(11)",
           "sharedKey": false,
           "state": "checked"
         }
       },
       "shard": 12
     }
   }
 }
}

Example 5: Replicate data from the customer, order, and commodity tables in the dtstestdata database to the destination AnalyticDB for MySQL cluster or AnalyticDB for PostgreSQL instance.

{
    "dtstestdata": {
        "name": "dtstestdatanew",
        "all": false,
        "Table": {
            "order": {
                "name": "ordernew",
                "all": true,
                "part_key": "id",
                "primary_key": "id",
                "type": "partition"
            },
            "customer": {
                "name": "customernew",
                "all": true,
                "primary_key": "id",
                "type": "dimension"
            },
            "commodity": {
                "name": "commoditynew",
                "all": false,
                "filter": "id>10",
                "column": {
                    "id": {
                        "key": "PRI",
                        "name": "id",
                        "type": "int(11)"
                    }
                },
                "part_key": "id",
                "primary_key": "id",
                "type": "partition"
            }
        }
    }
}

Example 6:

Table-level policy

The following sample code provides an example on how to configure a global conflict resolution policy named interrupt for the objects to be synchronized and configure an independent conflict resolution policy named overwrite for the primary key column, unique key column, and name column of the customer table in the dtstestdata2 database to be synchronized.

{
    "dtstestdata1": {
      "name": "dtstestdata1",
      "all": true,
      "conflict": "interrupt"
    },
    "dtstestdata2": {
      "name": "dtstestdata2",
      "all": false,
      "conflict": "interrupt",
      "Table": {
        "customer": {
          "name": "customer",
          "all": true,
          "cdr_cmp_col": "name",
          "cdr_rslv_col": "name",
          "resolve_method": "overwrite"
        }
      }
    }
  }

Database-level policy

Synchronize an entire database named dtstestdata1: The following sample code provides an example on how to configure an independent conflict resolution policy named use_max for the name and addr columns of all tables in the dtstestdata1 database.
```
"dtstestdata1": {
  "name": "dtstestdata1",
  "all": true,
  "conflict": "overwrite",
  "cdr_cmp_col": "name,addr",
  "cdr_rslv_col": "name,addr",
  "resolve_method": "use_max"
  }
}
```

Synchronize a part of a database named dtstestdata2: The following sample code provides an example on how to configure an independent conflict resolution policy named use_max for the name and addr columns of all tables to be synchronized in the dtstestdata2 database.

"dtstestdata2": {
  "name": "dtstestdata2",
  "all": false,
  "conflict": "overwrite",
  "cdr_cmp_col": "name,addr",
  "cdr_rslv_col": "name,addr",
  "resolve_method": "use_max",
  "Table": {
    "person": {
      "name": "person",
      "all": true
    },
    "class": {
      "name": "class",
      "all": true
    }
  }
}

Example 7: Assume that Source Database Type is set to Tair/Redis for a data synchronization instance. The following code indicates that a data record in the database named 0 or 1 is synchronized only if the key prefix is HProp (HProp as Prefixes of Keys to Be Synchronized). In the database named 2, a data record is synchronized only if the key prefix is dts and does not contain dtstest (dts as Prefixes of Keys to Be Synchronized and dtstest as Prefixes of Keys to Be Filtered Out).

{
    "0": {
        "name": "0", 
        "all": true,
         "filter": "[{"condition":"HProp","filterType":"white","filterPattern":"prefix"}]"
    }, 
    "1": {
        "name": "1", 
        "all": true,
         "filter": "[{"condition":"HProp","filterType":"white","filterPattern":"prefix"}]"
    }, 
    "2": {
        "name": "2", 
        "all": true,
         "filter": "[{"condition":"dts","filterType":"white","filterPattern":"prefix"},{"condition":"dtstest","filterType":"black","filterPattern":"prefix"}]"
    }
}