Data Transmission Service: Track data changes from a PolarDB for MySQL cluster

Last Updated: Nov 21, 2023

Data Transmission Service (DTS) allows you to track data changes from databases in real time. You can use the change tracking feature in the following scenarios: lightweight cache updates, business decoupling, asynchronous data processing, and synchronization of extract, transform, and load (ETL) operations. This topic describes how to create a change tracking task to track data changes from a PolarDB for MySQL cluster.

Prerequisites

Usage notes

Category

Description

Limits on the source database

  • The source tables must have PRIMARY KEY or UNIQUE constraints and all fields must be unique. Otherwise, some of the tracked data changes may be duplicated.
  • If you select tables as the objects to be tracked, up to 500 tables can be tracked in a single change tracking task. If you run a change tracking task to track more than 500 tables, a request error occurs. In this case, we recommend that you configure multiple tasks to track the tables in batches or configure a change tracking task for the entire database.
  • The following requirements for binary logs must be met:

    • The binary logging feature must be enabled. The loose_polar_log_bin parameter must be set to on. Otherwise, error messages are returned during the precheck, and the change tracking task cannot be started.

    • The binary logs of the source database must be stored for more than 24 hours. Otherwise, DTS may fail to obtain the binary logs and the task may fail. In exceptional circumstances, data inconsistency or loss may occur. Make sure that you set the retention period of binary logs based on the preceding requirements. Otherwise, the service level agreement (SLA) of DTS does not guarantee service reliability or performance.

  • A read-only instance or temporary instance cannot be used as the source instance for change tracking.
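
The recommendation to track more than 500 tables in batches can be sketched in a few lines of Python. This is a minimal illustration, not part of DTS: the function name batch_tables and the sample table names are hypothetical, and the only value taken from this topic is the limit of 500 tables per task. Each resulting batch would be configured as the object list of a separate change tracking task.

```python
from typing import Iterable, List

# Limit stated in this topic: up to 500 tables per change tracking task.
MAX_TABLES_PER_TASK = 500


def batch_tables(tables: Iterable[str],
                 batch_size: int = MAX_TABLES_PER_TASK) -> List[List[str]]:
    """Split table names into ordered groups of at most batch_size entries."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    names = list(tables)
    return [names[i:i + batch_size] for i in range(0, len(names), batch_size)]


if __name__ == "__main__":
    # Hypothetical table names, used only to demonstrate the split.
    sample = [f"orders_{n:04d}" for n in range(1200)]
    for index, batch in enumerate(batch_tables(sample), start=1):
        print(f"task {index}: {len(batch)} tables")
```

If the number of tables keeps growing, tracking the entire database in one task, as suggested above, avoids the need to maintain such batches.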

Other limits

  • You must make sure that the precision settings for columns of the FLOAT or DOUBLE data type meet your business requirements. DTS uses the ROUND(COLUMN,PRECISION) function to retrieve values from columns of the FLOAT or DOUBLE data type. If you do not specify a precision, DTS sets the precision for the FLOAT data type to 38 digits and the precision for the DOUBLE data type to 308 digits.
  • DTS does not track the DDL operations that are performed by using gh-ost or pt-online-schema-change. Therefore, the change tracking client may fail to write the consumed data to the destination tables due to schema conflicts.

Procedure

  1. Go to the Change Tracking Tasks page.
    1. Log on to the Data Management (DMS) console.
    2. In the top navigation bar, click DTS.
    3. In the left-side navigation pane, choose DTS (DTS) > Change Tracking.
    Note
    • If you log on to the DMS console and click the Enter Simple Mode icon in the upper-right corner, you can move the pointer over the icon in the upper-left corner, and then choose All functions > DTS > Change Tracking. For more information, see Customize the layout and style of the DMS console.
    • You can also configure the settings by using the new DTS console.
  2. To the right of Change Tracking Tasks, select the region in which you want to create the change tracking task.
    Note If you use the new DTS console, you must select the region from the drop-down list to the right of Workbench on the Change Tracking Tasks page of the DTS console.
  3. Click Create Task. On the page that appears, configure the source database and the consumer network type.

    Warning After you configure the source instance, we recommend that you read the limits that are displayed in the upper part of the page. Otherwise, the task may fail or the tracked data cannot be consumed.

    Section

    Parameter

    Description

    N/A

    Task Name

    The name of the change tracking task. DTS automatically assigns a name to the task. We recommend that you specify a descriptive name that makes it easy to identify the task. You do not need to use a unique task name.

    Source Database

    Select an existing DMS database instance. (Optional. If you have not registered a DMS database instance, ignore this option and configure database settings in the section below.)

    The instance to which the source database belongs. You can choose whether to use an existing instance based on your business requirements.
    • If you use an existing instance, DTS automatically applies the parameter settings of the source database.
    • If you do not use an existing instance, you must set parameters for the source database.

    Database Type

    The type of the source database. Select PolarDB for MySQL.

    Access Method

    The access method of the source database. Select Alibaba Cloud Instance.

    Instance Region

    The region in which the PolarDB for MySQL cluster resides.

    Replicate Data Across Alibaba Cloud Accounts

    In this example, No is selected.

    PolarDB Cluster ID

    The ID of the PolarDB for MySQL cluster.

    Database Account

    The account of the source database. Enter a database account that has read-only permissions on the PolarDB for MySQL cluster, or a custom account that has the REPLICATION CLIENT, REPLICATION SLAVE, SHOW VIEW, and SELECT permissions.

    Database Password

    The password of the database account.

    Encryption

    Specifies whether to encrypt the connection to the source database. You can set this parameter based on your business requirements. For more information about the SSL encryption feature, see Configure the SSL encryption feature.

    Consumer Network Type

    Network type

    The Network Type parameter is set to VPC. You must select a VPC and a vSwitch. For more information, see VPCs.
    Note
    • After a change tracking task is configured, you cannot change the settings in the Consumer Network Type section.
    • If your change tracking client is deployed in a VPC, we recommend that you select the same VPC and vSwitch as the client.
    • If you track data changes over internal networks, the network latency is minimal.
  4. In the lower part of the page, click Test Connectivity and Proceed.
    If the source database is an Alibaba Cloud database, such as an ApsaraDB RDS for MySQL or ApsaraDB for MongoDB instance, DTS automatically adds the CIDR blocks of DTS servers in the corresponding region to the whitelist of the database instance. If the source database is a self-managed database hosted on an ECS instance, DTS automatically adds the CIDR blocks of DTS servers in the corresponding region to the security group rules of the ECS instance. To allow DTS to access the database, you must also manually add the CIDR blocks of DTS servers in the corresponding region to the security settings of the database. If the source database is a self-managed database that is deployed in a data center or provided by a third-party cloud service provider, you must manually add the CIDR blocks of DTS servers in the corresponding region to the security settings of the database to allow DTS to access the database. For more information, see Add the CIDR blocks of DTS servers to the security settings of on-premises databases.
    Warning If the public CIDR blocks of DTS servers are automatically or manually added to the whitelist of a database instance or to the security group rules of an ECS instance, security risks may arise. Therefore, before you use DTS to track data changes, you must understand and acknowledge the potential risks and take preventive measures, including but not limited to the following measures: enhancing the security of your username and password, limiting the ports that are exposed, authenticating API calls, regularly checking the whitelist or security group rules and forbidding unauthorized CIDR blocks, or connecting the database to DTS by using Express Connect, VPN Gateway, or Smart Access Gateway.
  5. Configure objects for change tracking and advanced settings.

    Parameter

    Description

    Data Change Types

    By default, the Data Change Types parameter is specified and cannot be modified.

    • Data Updates

      DTS tracks data updates of the selected objects, including the INSERT, DELETE, and UPDATE operations.

    • Schema Updates

      DTS tracks the create, delete, and modify operations that are performed on all object schemas of the source instance. You must use the change tracking client to filter the data to be tracked.

    Source Objects

    Select one or more objects from the Source Objects section and click the Rightwards arrow icon to add the objects to the Selected Objects section.
    Note You can select tables or databases as the objects for change tracking.
    • If you select a database as the object, DTS tracks data changes of all objects, including new objects in the database.
    • If you select a table as the object, DTS tracks only data changes of this table. In this case, if you want to track data changes of another table, you must add the table to the selected objects. For more information, see Modify the objects for change tracking.
  6. Click Next: Advanced Settings.

    Parameter

    Description

    Select the dedicated cluster used to schedule the task

    By default, DTS schedules tasks to shared clusters. You do not need to configure this parameter. You can purchase dedicated clusters of specified specifications to run DTS change tracking tasks. For more information, see What is a DTS dedicated cluster.

    Set Alerts

    Specifies whether to configure alerting for the change tracking task. If alerting is configured and the task fails or the latency exceeds the threshold, the alert contacts receive notifications. Valid values:

    Retry Time for Failed Connections

    The retry time range for failed connections. If the change tracking task fails, DTS immediately retries a connection within the time range. Valid values: 10 to 1440. Unit: minutes. Default value: 120. We recommend that you set the time range to more than 30 minutes. If DTS reconnects to the source instance within the specified time range, DTS resumes the change tracking task. Otherwise, the change tracking task fails.

    Note
    • If an instance serves as the source database of multiple change tracking tasks, the smallest value of this parameter that is set among those tasks takes precedence.

    • When DTS retries a connection, you are charged for the DTS instance. We recommend that you specify the retry time range based on your business requirements. You can also release the DTS instance at your earliest opportunity after the source instance is released.

    The wait time before a retry when other issues occur in the source and destination databases.

    The retry time range for other issues. For example, if DDL or DML operations fail to be performed after the change tracking task is started, DTS immediately retries the operations within the retry time range. Valid values: 1 to 1440. Unit: minutes. Default value: 10. We recommend that you set the parameter to a value greater than 10. If the failed operations are successfully performed within the specified retry time range, DTS resumes the change tracking task. Otherwise, the change tracking task fails.

    Important

    The value of the "The wait time before a retry when other issues occur in the source and destination databases" parameter must be smaller than the value of the Retry Time for Failed Connections parameter.

    Environment Tag

    The environment tag that is used to identify the DTS instance. You can select an environment tag based on your business requirements. In this example, no environment tag is selected.

    Whether to delete SQL operations on heartbeat tables of forward and reverse tasks

    Specifies whether to write SQL operations on heartbeat tables to the source database while the DTS instance is running.

    • Yes: does not write SQL operations on heartbeat tables. In this case, a latency may be displayed for the DTS instance.

    • No: writes SQL operations on heartbeat tables. In this case, specific features such as physical backup and cloning of the source database may be affected.

  7. In the lower part of the page, click Next: Save Task Settings and Precheck.

    You can move the pointer over Next: Save Task Settings and Precheck and click Preview OpenAPI parameters to view the parameter settings of the API operation that is called to configure the instance.

    Note
    • Before you can start the change tracking task, DTS performs a precheck. You can start the change tracking task only after the task passes the precheck.
    • If the task fails to pass the precheck, click View Details next to each failed item. After you troubleshoot the issues based on the causes, run a precheck again.
    • If an alert is triggered for an item during the precheck:
      • If an alert item cannot be ignored, click View Details next to the failed item and troubleshoot the issues. Then, run a precheck again.
      • If an alert item can be ignored, click Confirm Alert Details. In the View Details dialog box, click Ignore. In the message that appears, click OK. Then, click Precheck Again to run a precheck again. If you ignore the alert item, data inconsistency may occur, and your business may be exposed to potential risks.
  8. Wait until the success rate becomes 100%. Then, click Next: Purchase Instance.
  9. On the Purchase Instance page, specify the billing method of the change tracking instance. The following table describes the parameters.
    Parameter
    Description
    Billing Method
    • Subscription: You pay for the instance when you create an instance. The subscription billing method is more cost-effective than the pay-as-you-go billing method for long-term use.
    • Pay-as-you-go: A pay-as-you-go instance is charged on an hourly basis. The pay-as-you-go billing method is suitable for short-term use. If you no longer require a pay-as-you-go instance, you can release the pay-as-you-go instance to reduce costs.
    Subscription Length
    If you select the subscription billing method, set the subscription duration and the number of instances that you want to create. The subscription duration can be one to nine months, one year, two years, three years, or five years.
    Note This parameter is displayed only if you select the subscription billing method.
  10. Read the Data Transmission Service (Pay-as-you-go) Service Terms and select the check box to agree to them.
  11. Click Buy and Start to start the change tracking task. You can view the progress of the task in the task list.
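
The constraints on the two retry parameters in the advanced settings step can be checked before you enter the values in the console. The following Python sketch is illustrative only: the class RetrySettings, its field names, and the function validate_retry_settings are hypothetical and do not correspond to any DTS API. The ranges, defaults, and ordering rule are taken from the parameter descriptions in this topic.

```python
from dataclasses import dataclass
from typing import List

# Valid ranges in minutes, as listed in the parameter descriptions above.
CONNECTION_RETRY_RANGE = (10, 1440)
OTHER_ISSUE_RETRY_RANGE = (1, 1440)


@dataclass(frozen=True)
class RetrySettings:
    """Hypothetical container for the two retry parameters (minutes)."""
    connection_retry_minutes: int = 120   # documented default
    other_issue_retry_minutes: int = 10   # documented default


def validate_retry_settings(settings: RetrySettings) -> List[str]:
    """Return a list of problems; an empty list means the values are valid."""
    problems = []
    low, high = CONNECTION_RETRY_RANGE
    if not low <= settings.connection_retry_minutes <= high:
        problems.append(
            f"Retry Time for Failed Connections must be {low} to {high} minutes")
    low, high = OTHER_ISSUE_RETRY_RANGE
    if not low <= settings.other_issue_retry_minutes <= high:
        problems.append(
            f"The wait time for other issues must be {low} to {high} minutes")
    # The wait time for other issues must be smaller than the connection retry time.
    if settings.other_issue_retry_minutes >= settings.connection_retry_minutes:
        problems.append(
            "The wait time for other issues must be smaller than "
            "Retry Time for Failed Connections")
    return problems


if __name__ == "__main__":
    for candidate in (RetrySettings(), RetrySettings(30, 30), RetrySettings(5, 1)):
        print(candidate, validate_retry_settings(candidate) or "OK")
```

The sketch only enforces the hard limits. The recommendations in this topic (more than 30 minutes for failed connections, more than 10 minutes for other issues) are left to your judgment.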

What to do next

When the change tracking task is running, you can create consumer groups based on the downstream client to consume the tracked data.
  1. For more information about how to create and manage consumer groups, see Create consumer groups.
  2. Use one of the following methods to consume the tracked data: