DataWorks: Configure a serverless synchronization task

Last Updated: Nov 07, 2024

This topic describes the characteristics of a serverless synchronization task and how to configure a serverless synchronization task.

Limits

  • Serverless synchronization tasks are supported in the following regions: China (Hangzhou), China (Shanghai), China (Beijing), China (Zhangjiakou), China (Shenzhen), China (Hong Kong), UK (London), US (Silicon Valley), US (Virginia), Japan (Tokyo), Germany (Frankfurt), and Malaysia (Kuala Lumpur).

  • Serverless synchronization tasks support the following synchronization types:

    • Real-time synchronization of all data in a MySQL database to Hologres (You can directly select tables or select tables based on regular expressions.)

    • Batch synchronization of all data in a database in a Hologres data source to another Hologres data source

  • The data sources used in a serverless synchronization task must reside in the current region and belong to the current Alibaba Cloud account.
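
The regular-expression table selection mentioned above can be illustrated with a minimal sketch. The table names and the pattern are hypothetical, and the sketch assumes that a table is selected only when its whole name matches the expression. The exact regular expression dialect and matching behavior used by the DataWorks console are not described in this topic and may differ.

```python
import re

def select_tables(table_names, pattern):
    """Return the table names that fully match the regular expression."""
    compiled = re.compile(pattern)
    return [name for name in table_names if compiled.fullmatch(name)]

# Hypothetical table names in a source MySQL database.
tables = ["orders_2023", "orders_2024", "order_items", "users"]

# Select only the yearly order tables.
selected = select_tables(tables, r"orders_\d{4}")
print(selected)
```

Testing a pattern against a list of known table names in this way can help confirm that it does not pick up unintended tables before you configure the task.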

Usage notes

  • You do not need to configure a resource group for a serverless synchronization task. This way, you can focus only on your business.

  • You do not need to concern yourself with network connectivity when you configure a serverless synchronization task. However, you must make sure that the CIDR block of the virtual private cloud (VPC) in which the source is deployed does not conflict with the CIDR block of the VPC in which the destination is deployed.

  • Serverless synchronization tasks are charged based on the pay-as-you-go billing method. After you start a serverless synchronization task, an order ID is generated. You can use the order ID to query the details and the fee deduction information of the order in the Expenses and Costs console. For more information, see Billing of serverless resource groups.

  • You are charged for your serverless synchronization task based on the pay-as-you-go billing method only when the task is running. If the serverless synchronization task stops or fails, billing is stopped. If you no longer require a serverless synchronization task, you can delete it. The deletion operation is irreversible. After you delete the serverless synchronization task, the order generated for the task is released.
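
The CIDR block requirement in the usage notes can be checked before you create a task. The following sketch assumes that "conflict" means that the two address ranges overlap, which this topic does not state explicitly. It uses only the Python standard library, and the CIDR blocks shown are hypothetical examples.

```python
import ipaddress

def cidr_blocks_conflict(source_cidr, destination_cidr):
    """Return True if the two CIDR blocks share any addresses."""
    source = ipaddress.ip_network(source_cidr, strict=True)
    destination = ipaddress.ip_network(destination_cidr, strict=True)
    return source.overlaps(destination)

# Hypothetical VPC CIDR blocks for the source and the destination.
print(cidr_blocks_conflict("192.168.0.0/16", "192.168.10.0/24"))
print(cidr_blocks_conflict("10.0.0.0/16", "172.16.0.0/16"))
```

The first pair overlaps because the /24 block lies inside the /16 block. The second pair does not overlap.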

Configure a serverless synchronization task

Step 1: Create a serverless synchronization task

  1. Go to the Data Integration page.

    Log on to the DataWorks console. In the top navigation bar, select the desired region. In the left-side navigation pane, choose Data Development and Governance > Data Integration. On the page that appears, select the desired workspace from the drop-down list and click Go to Data Integration.

  2. In the left-side navigation pane of the Data Integration page, click Serverless Synchronization Task.

  3. In the upper part of the Serverless Synchronization Task page, select a source type and a destination type, and click Create Serverless Synchronization Task.

Step 2: Configure basic information for the serverless synchronization task

  • If you want to use existing data sources, you can turn on Use Existing Data Source in the Data Source Configuration section. Then, you can select desired data sources from the Data Source drop-down lists.

  • If you do not use existing data sources, you can directly configure information about the desired data sources in the Data Source Configuration section. In this case, you do not need to add data sources to Data Integration or Management Center in advance.

  • After you configure information about the source and destination, you can click Test. For an ApsaraDB RDS for MySQL instance and a Hologres instance, the network connectivity test is automatically passed if no IP address whitelist is configured for the instances. If IP address whitelists are configured for the instances, you must add the required IP addresses to the IP address whitelists to allow DataWorks to access the instances. For information about the IP addresses, see Configure network connectivity.
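
If IP address whitelists are configured, it can be useful to verify which required IP addresses are not yet covered. The following sketch is illustrative only: the IP addresses come from documentation-reserved ranges and are not the addresses that DataWorks uses, and the function name is hypothetical. It assumes that the required entries are single IP addresses and that whitelist entries are single addresses or CIDR blocks. For the actual addresses, see Configure network connectivity.

```python
import ipaddress

def missing_from_whitelist(required_ips, whitelist_entries):
    """Return the required IP addresses that no whitelist entry covers."""
    # A bare address such as "203.0.113.5" is treated as a /32 network.
    allowed = [ipaddress.ip_network(entry, strict=False) for entry in whitelist_entries]
    missing = []
    for ip in required_ips:
        address = ipaddress.ip_address(ip)
        if not any(address in network for network in allowed):
            missing.append(ip)
    return missing

# Placeholder values, not real DataWorks addresses.
required = ["192.0.2.10", "198.51.100.7"]
whitelist = ["192.0.2.0/24", "203.0.113.5"]
print(missing_from_whitelist(required, whitelist))
```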

Step 3: Configure source tables and mapping rules for the serverless synchronization task

Follow the instructions displayed in the DataWorks console to configure source tables and mapping rules. The settings that are available depend on the synchronization type of the serverless synchronization task.

Step 4: Complete the configuration of the serverless synchronization task

After you complete the preceding configuration, click Complete.

The first time you click Complete, the system automatically checks the configuration of the serverless synchronization task. The configuration check is only a pre-check and does not block the completion of task configuration.

Start the serverless synchronization task

  • When you start the serverless synchronization task, the system automatically performs another configuration check on the task. The serverless synchronization task can be successfully started only if the task passes the configuration check.

  • The check items of the configuration check vary based on the synchronization type of the serverless synchronization task.

  • The first time you start the serverless synchronization task, the system checks whether the AliyunBSSOrderAccess and AliyunDataWorksFullAccess policies are attached to your account. These are the same permissions that are required to purchase a pay-as-you-go serverless resource group.
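
The policy check performed at first startup amounts to confirming that both required policies appear among the policies attached to the account. The following sketch shows that logic only. It does not call any Alibaba Cloud API, the function name is hypothetical, and the list of attached policies is an example that you would obtain yourself, for example from the RAM console.

```python
# Policy names taken from this topic.
REQUIRED_POLICIES = {"AliyunBSSOrderAccess", "AliyunDataWorksFullAccess"}

def missing_policies(attached_policies):
    """Return the required policies that are not in the attached list."""
    return sorted(REQUIRED_POLICIES - set(attached_policies))

# Example list of policies attached to an account (assumed input).
attached = ["AliyunDataWorksFullAccess", "AliyunOSSReadOnlyAccess"]
print(missing_policies(attached))
```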

View the running details of the serverless synchronization task

In the Tasks section of the Serverless Synchronization Task page, find the serverless synchronization task, and click the task name in the Name/ID column or the stage name in the Execution Overview column to go to the details page of the task. On the details page, you can view the following information:

  • Basic information: includes the data source information, order ID, and the synchronization type of the serverless synchronization task.

  • Execution status: includes the execution status of each stage. You can also view the operation logs, failover records, and resource utilization of the serverless synchronization task.

  • Details: includes the details of schema migration, full data initialization, and real-time synchronization.

Modify the serverless synchronization task

  1. In the Tasks section of the Serverless Synchronization Task page, find the serverless synchronization task, click More in the Actions column, and then select Edit to go to the configuration page of the serverless synchronization task.

  2. Add source tables to or remove tables from the serverless synchronization task, or modify other configurations for the serverless synchronization task. Then, click Complete.

  3. In the Actions column, click Apply Updates.

    • After you click Apply Updates, the system automatically checks the configuration of the serverless synchronization task. If the serverless synchronization task fails the check, the modifications cannot take effect.

    • Fewer items are checked after you click Apply Updates than the first time you start the serverless synchronization task. The first startup requires resource preparation, whereas resource initialization is already complete when you apply updates to the serverless synchronization task.

Appendixes

View the details of the order generated for the serverless synchronization task

Serverless synchronization tasks differ from other types of synchronization tasks. No resource group is configured for a serverless synchronization task. Instead, each task is billed on a pay-as-you-go basis through an order that is generated for the task.

Note

You are charged for your serverless synchronization task based on the pay-as-you-go billing method only when the task is running. If the serverless synchronization task stops or fails, billing is stopped. If you no longer require a serverless synchronization task, you can delete it. The deletion operation is irreversible. After you delete the serverless synchronization task, the order generated for the task is released.

You can perform the following steps to query the order generated for a serverless synchronization task.

  1. In the Tasks section of the Serverless Synchronization Task page, find the serverless synchronization task, and click the task name in the Name/ID column or the stage name in the Execution Overview column to go to the task details page.

  2. In the Basic Information section of the task details page, obtain the ID of the order generated for the serverless synchronization task.

  3. Go to the Orders page of the Expenses and Costs console to query the details of the order based on the order ID.

Configure advanced settings

Select source databases and tables and configure mapping rules

After you select a source database and a table, data is automatically written to the destination schema or table whose name is the same as the source database or table. If no such destination schema or table exists, the system automatically creates the schema or table in the destination. You can configure the Customize Mapping Rules for Destination Schema Names or Customize Mapping Rules for Destination Table Names parameter to define the name of the schema or table to which you want to write data. This way, you can specify destination database and table names that have different prefixes from the source database and table names.
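
The effect of a prefix-based mapping rule can be sketched as follows. This is an illustration of the idea only: the function, the prefixes, and the table names are hypothetical, and the actual rule syntax is defined in the DataWorks console rather than in code. Names that do not match the rule keep the default behavior described above, in which the destination name is the same as the source name.

```python
def map_destination_name(source_name, source_prefix, destination_prefix):
    """Replace source_prefix with destination_prefix, or keep the name unchanged."""
    if source_name.startswith(source_prefix):
        return destination_prefix + source_name[len(source_prefix):]
    return source_name

# Hypothetical source table names and prefixes.
for table in ["ods_orders", "ods_users", "audit_log"]:
    print(table, "->", map_destination_name(table, "ods_", "holo_"))
```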

Configure destination tables

You can define the properties of destination tables. For example, you can specify whether to write data to an existing table or a new table, whether to add fields to a destination table, and whether to write data to a partitioned or non-partitioned destination table. You can also specify a partition field and a storage mode for a destination table, and assign values to the fields that are added to a destination table.

Note

After you configure the properties of destinations and click Apply and Refresh Mapping, source tables are automatically mapped to destination tables based on the table rules that you configure.

Configure rules to process DDL or DML messages

DDL or DML operations may be performed on the source. To ensure that data synchronized to the destination meets your business requirements, you can configure rules to process DDL or DML messages from the source based on the destination type. For information about how to configure rules to process DDL or DML messages, see Configure rules to process DDL messages or Configure rules to process DML messages.