Create a node to synchronize schemas of MaxCompute tables

Last Updated: Dec 17, 2024

DataWorks allows you to create multiple Hologres external tables at a time, each with the same schema as its source MaxCompute table, by creating and configuring a node on the DataStudio page. You can then use the created Hologres external tables to accelerate queries on the data of the source MaxCompute tables. This topic describes how to create a node to synchronize schemas of MaxCompute tables.

Background information

Hologres is an end-to-end real-time data warehousing service developed by Alibaba Cloud and is seamlessly connected to MaxCompute at the underlying layer. You can create external tables in Hologres to query data of MaxCompute tables in an accelerated manner.

DataWorks provides a visualized interface for synchronizing the schemas of MaxCompute tables. The synchronization is based on the IMPORT FOREIGN SCHEMA statement.
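
To make the underlying statement concrete, the following minimal Python sketch builds the text of an IMPORT FOREIGN SCHEMA statement. It only produces a string and does not connect to Hologres. It assumes the standard PostgreSQL form of the statement. The function name and the project and table names (my_project, orders, customers) are hypothetical, while odps_server and public are the server and default schema named in this topic. The statement that DataWorks actually generates may differ.

```python
import re
from typing import Optional, Sequence

# Conservative identifier check so that names can be used unquoted.
_IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def _check_ident(name: str) -> str:
    """Return the name unchanged if it is a simple identifier, else raise."""
    if not _IDENT.fullmatch(name):
        raise ValueError(f"unsupported identifier: {name!r}")
    return name


def import_foreign_schema_sql(
    source_project: str,
    target_schema: str = "public",
    server: str = "odps_server",
    tables: Optional[Sequence[str]] = None,
) -> str:
    """Build an IMPORT FOREIGN SCHEMA statement as text (nothing is executed)."""
    parts = [f"IMPORT FOREIGN SCHEMA {_check_ident(source_project)}"]
    if tables:
        # LIMIT TO restricts the import to the listed tables.
        names = ", ".join(_check_ident(t) for t in tables)
        parts.append(f"LIMIT TO ({names})")
    parts.append(f"FROM SERVER {_check_ident(server)}")
    parts.append(f"INTO {_check_ident(target_schema)}")
    return " ".join(parts) + ";"


# Hypothetical project and table names, for illustration only.
print(import_foreign_schema_sql("my_project", tables=["orders", "customers"]))
# IMPORT FOREIGN SCHEMA my_project LIMIT TO (orders, customers) FROM SERVER odps_server INTO public;
```

Omitting the tables argument drops the LIMIT TO clause, which corresponds to importing every table in the project.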

Limits

You can use the created node to accelerate only data queries of MaxCompute internal tables. You cannot use the created node to accelerate data queries of MaxCompute external tables or views.

Note

The operations described in this topic are performed in the China (Shanghai) region. You can perform operations in other regions based on the instructions displayed in the DataWorks console.

Go to the configuration tab of the node that you create to synchronize schemas of MaxCompute tables

  1. Go to the DataStudio page.

    Log on to the DataWorks console. In the top navigation bar, select the desired region. In the left-side navigation pane, choose Data Development and Governance > Data Development. On the page that appears, select the desired workspace from the drop-down list and click Go to Data Development.

  2. Create a workflow.

    If you have an existing workflow, skip this step.

    1. Move the pointer over the Create icon and select Create Workflow.

    2. In the Create Workflow dialog box, configure the Workflow Name parameter.

    3. Click Create.

  3. Create a Schema Synchronization from MaxCompute node.

    1. Move the pointer over the Create icon and choose Create Node > Hologres > Schema Synchronization from MaxCompute.

      You can also find the desired workflow, right-click the workflow, and then choose Create Node > Hologres > Schema Synchronization from MaxCompute.

    2. In the Create Node dialog box, configure the Name, Engine Instance, Node Type, and Path parameters.

    3. Click Confirm to go to the configuration tab of the node.

Configure the Schema Synchronization from MaxCompute node

  1. Configure the node information.

    On the configuration tab of the node, configure the Hologres connection information for the Hologres external tables, the source information of MaxCompute tables, and the policy that is used to handle conflicts that may occur when you create the Hologres external tables.

    1. Configure the parameters in the Destination Information section.

      The parameters that you configure in this section determine the Hologres compute engine in which you want to store the Hologres external tables.

      Destination Name: The name of the Hologres compute engine.

      Destination Database: The name of the database in the Hologres compute engine.

      Schema: The name of the schema in the database. Default value: public.

    2. Configure the parameters in the Source (Batch create tables based on the following data) section.

      The parameters that you configure in this section determine the source MaxCompute tables based on which you create Hologres external tables. DataWorks allows you to create external tables whose schemas are the same as those of the source tables based on the parameters that you configure in this section. You can then use the created external tables to query data of the source tables in an accelerated manner.

      Type: The type of the source table based on which you create a Hologres external table. Only MaxCompute is supported.

      Servers: The server where the source tables reside. You can use the odps_server server that is created at the underlying layer of Hologres. For more information, see postgres_fdw.

      Source Project: The name of the project to which the source tables belong.

      Select Tables for Query Acceleration: The source tables based on which you create Hologres external tables.

      • All Tables in Database: All tables in the selected project.

      • Selected Tables: Specific tables in the selected project. If you select this option, you can search for tables by name.

        Note

        Fuzzy match is supported. After you enter a keyword, all tables whose names contain the keyword are displayed.

    3. Configure the parameters in the Advanced Settings section.

      The parameters that you configure in this section determine the policy that is used to handle conflicts that may occur when you create the Hologres external tables.

      Action for Table Name Conflicts: The policy that is used to handle the following conflict: The name of a Hologres external table that you want to create is the same as the name of an existing table in Hologres. Valid values:

      • Ignore Conflicts and Continue Creating Tables

      • Update and Change Names of Tables with Same Names

      • Report Error and Create No Table

      Data Type Not Supported: The policy that is used to handle the following conflict: The type of data in a Hologres external table that you want to create is not supported in Hologres. Valid values:

      • Report Error and Import Failed: If you select this option, the Hologres external table fails to be created.

      • Ignore and Skip Unsupported Fields: If you select this option, the system skips fields whose data types are not supported and continues to create the Hologres external table.

  2. Save the node configurations and run the node.

    1. In the top navigation bar of the configuration tab of the node, click the Save icon to save the node configurations.

    2. In the top navigation bar of the configuration tab of the node, click the Run icon to create the Hologres external tables.

Note

You must select a serverless resource group of DataWorks that is connected to the Hologres compute engine to run the node. For more information, see Network connectivity solutions.
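
The table selection and conflict policies configured above can be pictured in terms of the IMPORT FOREIGN SCHEMA statement. The following Python sketch is illustrative only. The option names if_table_exist and if_unsupported_type and their values follow the Hologres reference for IMPORT FOREIGN SCHEMA, but the mapping from the console labels to those values is an assumption that this topic does not confirm. The helper names and sample table names are hypothetical, and the keyword match is assumed to be a case-sensitive substring match.

```python
from typing import Iterable, List

# ASSUMPTION: mapping from the console labels in this topic to OPTIONS values.
# The topic itself does not state how the labels translate to SQL options.
TABLE_CONFLICT_OPTIONS = {
    "Ignore Conflicts and Continue Creating Tables": "ignore",
    "Update and Change Names of Tables with Same Names": "update",
    "Report Error and Create No Table": "error",
}
UNSUPPORTED_TYPE_OPTIONS = {
    "Report Error and Import Failed": "error",
    "Ignore and Skip Unsupported Fields": "skip",
}


def fuzzy_match(table_names: Iterable[str], keyword: str) -> List[str]:
    """Return the tables whose names contain the keyword, in input order."""
    return [name for name in table_names if keyword in name]


def options_clause(table_conflict: str, unsupported_type: str) -> str:
    """Render an OPTIONS clause from the two Advanced Settings choices."""
    conflict = TABLE_CONFLICT_OPTIONS[table_conflict]
    unsupported = UNSUPPORTED_TYPE_OPTIONS[unsupported_type]
    return (
        f"OPTIONS (if_table_exist '{conflict}', "
        f"if_unsupported_type '{unsupported}')"
    )


# Hypothetical table names, for illustration only.
print(fuzzy_match(["orders", "order_items", "customers"], "order"))
print(options_clause(
    "Ignore Conflicts and Continue Creating Tables",
    "Ignore and Skip Unsupported Fields",
))
```

A clause built this way would be appended to an IMPORT FOREIGN SCHEMA statement after the INTO clause. An unknown label raises KeyError, which keeps a mistyped policy from silently falling back to a default.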

What to do next

After the Hologres external tables are created, you can go to the Workspace Tables pane in DataStudio to view the created external tables. For more information, see Manage tables. You can also run Hologres commands to query data of the source MaxCompute tables in an accelerated manner. For more information, see Create a foreign table in Hologres to accelerate queries on MaxCompute data.
