AnalyticDB: Use federated analytics to synchronize data to Data Lakehouse Edition

Last Updated: Aug 15, 2024

You can use the federated analytics feature of PolarDB for MySQL together with the AnalyticDB Pipeline Service (APS) feature of AnalyticDB for MySQL to synchronize data in real time from PolarDB for MySQL to AnalyticDB for MySQL Data Lakehouse Edition. This simplifies both data synchronization and the management of synchronization jobs.

You can join the DingTalk group 33600023146 to learn more about the federated analytics feature.

Important
  • The upgrade of the federated analytics feature of PolarDB for MySQL was completed on July 23, 2024. The feature no longer provides an entry point for creating synchronization tasks. To create a synchronization task, go to the Data Integration page in the PolarDB console. For more information, see Use zero-ETL to synchronize data.

  • If you have already configured a federated analytics link, the entry point of the federated analytics feature remains visible for link management in the following regions: China (Hangzhou), China (Shanghai), China (Shenzhen), China (Beijing), and US (Virginia).

Prerequisites

Limits

  • PolarDB for MySQL supports the federated analytics feature only for AnalyticDB for MySQL Data Lakehouse Edition clusters.

  • The federated analytics feature is supported only in the following regions: China (Beijing), China (Hangzhou), China (Shanghai), China (Shenzhen), China (Hong Kong), Japan (Tokyo), Singapore, Malaysia (Kuala Lumpur), Indonesia (Jakarta), US (Silicon Valley), US (Virginia), Germany (Frankfurt), and UK (London).

  • You can create up to three synchronization jobs for each PolarDB for MySQL cluster and up to 30 synchronization jobs in each region.
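The job quotas above can be checked before a new job is requested. The following Python snippet is a minimal sketch, not part of any PolarDB SDK: the `can_create_job` function, the shape of the job list, and the cluster IDs are hypothetical. Only the two limits (three jobs per cluster, 30 jobs per region) come from this topic.

```python
from collections import Counter

# Quotas stated in the Limits section of this topic.
MAX_JOBS_PER_CLUSTER = 3
MAX_JOBS_PER_REGION = 30


def can_create_job(existing_jobs, region, cluster_id):
    """Return (allowed, reason) for a new job on cluster_id in region.

    existing_jobs is a hypothetical list of (region, cluster_id) tuples,
    one tuple per synchronization job that already exists.
    """
    jobs_per_region = Counter(r for r, _ in existing_jobs)
    jobs_per_cluster = Counter(existing_jobs)

    # Check the narrower per-cluster quota first, then the per-region quota.
    if jobs_per_cluster[(region, cluster_id)] >= MAX_JOBS_PER_CLUSTER:
        return False, "cluster quota reached"
    if jobs_per_region[region] >= MAX_JOBS_PER_REGION:
        return False, "region quota reached"
    return True, "ok"
```

In this sketch, a cluster that already has three jobs is rejected even if the region as a whole is far below its quota of 30.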

Create a synchronization job

  1. Log on to the PolarDB console.

  2. In the upper-left corner of the console, select a region.

  3. In the left-side navigation pane, click Federated Analytics.

  4. Click Create Job. In the Create Job panel, configure the parameters that are described in the following table.

    Parameter

    Description

    Job Name

    The name of the job. Default value: data-sync-<Time>.

    PolarDB for MySQL Cluster

    The ID of the source PolarDB for MySQL cluster.

    Source Database Account

    The database account that the federated analytics feature automatically creates on the PolarDB for MySQL cluster to synchronize data. The name of the database account starts with sync. Do not delete this account or change its name.

    AnalyticDB for MySQL Cluster

    The ID of the destination AnalyticDB for MySQL cluster.

    You can select an existing AnalyticDB for MySQL cluster, or create one by clicking Click to create an AnalyticDB for MySQL cluster.

    Advanced Settings

    By default, advanced settings are disabled. In this case, the entire source cluster is synchronized.

    After you enable advanced settings, you can configure the Select the database or table to synchronize and Large Table Partition Key Settings parameters.

    Select the database or table to synchronize

    You can select the databases and tables that you want to synchronize. By default, all databases and tables are synchronized.

    Important
    • You cannot synchronize tables that do not have primary keys. These tables are automatically filtered out.

    • Each AnalyticDB for MySQL cluster can contain up to 2,048 databases. For more information, see Limits.

    Large Table Partition Key Settings

    To improve data write and query performance, we recommend that you specify partition keys for large tables. For more information, see Table schema design.

    The following partition formats are supported:

    • value: partitioned by value.

    • yyyyMMdd: partitioned by year, month, and day.

    • yyyyMM: partitioned by year and month.

    • yyyy: partitioned by year.

  5. Click OK. The job automatically starts.

    The created job is displayed on the Federated Analytics page. To manage the job, click View, Edit, Delete, Suspend, or Start in the Actions column.

    Important

    Deleted jobs cannot be recovered.

  6. To analyze data, click the destination cluster ID to go to the AnalyticDB for MySQL console. For more information, see SQL editor.
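The partition formats listed under Large Table Partition Key Settings in step 4 can be illustrated with a short Python sketch. It is an illustration only, under two assumptions that this topic does not confirm: the date-based formats truncate a date or time column to day, month, or year, and the value format uses the column value unchanged. The function name `partition_value` is hypothetical and is not an AnalyticDB for MySQL API.

```python
from datetime import date, datetime

# Map each date-based partition format named in this topic to a strftime
# pattern. "value" has no pattern: the column value is used as-is (assumed).
_DATE_FORMATS = {
    "yyyyMMdd": "%Y%m%d",
    "yyyyMM": "%Y%m",
    "yyyy": "%Y",
}


def partition_value(column_value, partition_format):
    """Return the partition a row would fall into (illustrative only)."""
    if partition_format == "value":
        return str(column_value)
    if partition_format not in _DATE_FORMATS:
        raise ValueError("unsupported partition format: " + partition_format)
    # datetime is a subclass of date, so both types pass this check.
    if not isinstance(column_value, date):
        raise TypeError("date-based formats need a date or datetime value")
    return column_value.strftime(_DATE_FORMATS[partition_format])
```

Under these assumptions, a row dated August 15, 2024 falls into partition 20240815 with yyyyMMdd and into 202408 with yyyyMM, so a coarser format produces fewer, larger partitions.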