DataWorks lets you create multiple Hologres external tables at a time, each with the same schema as its source MaxCompute table, by creating and configuring a node on the DataStudio page. You can then use the Hologres external tables to accelerate queries on the source MaxCompute tables. This topic describes how to create a node that synchronizes the schemas of MaxCompute tables with a few clicks.
Background information
Hologres is an all-in-one real-time data warehousing service developed by Alibaba Cloud and is seamlessly connected to MaxCompute at the underlying layer. You can create external tables in Hologres to query data of MaxCompute tables in an accelerated manner.
DataWorks provides a visual interface, based on the IMPORT FOREIGN SCHEMA statement, that synchronizes the schemas of MaxCompute tables with a few clicks.
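For orientation, the following sketch shows roughly what an IMPORT FOREIGN SCHEMA statement looks like when it is assembled from the values that the node collects. It is an illustration only, not DataWorks code: the helper names `quote_ident` and `build_import_statement` and the project and table names are hypothetical, while `odps_server` and `public` are the server and default schema named in this topic.

```python
def quote_ident(name: str) -> str:
    """Double-quote an SQL identifier, escaping embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def build_import_statement(source_project, server="odps_server",
                           target_schema="public", tables=None):
    """Return an IMPORT FOREIGN SCHEMA statement as a string.

    source_project: MaxCompute project that holds the source tables.
    tables: optional list of table names; when given, a LIMIT TO clause
            restricts the import to those tables.
    """
    parts = [f"IMPORT FOREIGN SCHEMA {quote_ident(source_project)}"]
    if tables:
        limit = ", ".join(quote_ident(t) for t in tables)
        parts.append(f"LIMIT TO ({limit})")
    parts.append(f"FROM SERVER {quote_ident(server)}")
    parts.append(f"INTO {quote_ident(target_schema)}")
    return " ".join(parts) + ";"


# Hypothetical project and table names, for illustration only.
print(build_import_statement("my_mc_project", tables=["orders", "customers"]))
```

Omitting `tables` drops the LIMIT TO clause, so the statement covers every table in the project.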
Limits
The node can accelerate queries only on MaxCompute internal tables. It cannot accelerate queries on MaxCompute external tables or views.
Go to the configuration tab of the Schema Synchronization from MaxCompute node
Go to the DataStudio page.
Log on to the DataWorks console. In the left-side navigation pane, choose Data Modeling and Development > DataStudio. On the page that appears, select the desired workspace from the drop-down list and click Go to DataStudio.
Create a workflow.
If you have an existing workflow, skip this step.
Move the pointer over the icon and select Create Workflow.
In the Create Workflow dialog box, configure the Workflow Name parameter.
Click Create.
Create a Schema Synchronization from MaxCompute node.
Move the pointer over the icon and choose .
You can also find the desired workflow, right-click the workflow, and then choose .
In the Create Node dialog box, configure the Name, Engine Instance, Node Type, and Path parameters.
Click Confirm to go to the configuration tab of the node.
Configure the Schema Synchronization from MaxCompute node
Configure the node information.
On the configuration tab of the node, configure the Hologres connection information for the Hologres external tables, the source information of MaxCompute tables, and the policy that is used to handle conflicts that may occur when you create the Hologres external tables.
Configure the parameters in the Destination Information section.
The parameters that you configure in this section determine the Hologres compute engine in which you want to store the Hologres external tables.
Parameter: Description
Destination Name: The name of the Hologres compute engine.
Destination Database: The name of the database in the Hologres compute engine.
Schema: The name of the schema in the database. Default value: public.
Configure the parameters in the Source (Batch create tables based on the following data) section.
The parameters that you configure in this section determine the source MaxCompute tables based on which you create Hologres external tables. DataWorks allows you to create external tables whose schemas are the same as those of the source tables based on the parameters that you configure in this section. You can then use the created external tables to query data of the source tables in an accelerated manner.
Parameter: Description
Type: The type of the source table based on which you create a Hologres external table. Only MaxCompute is supported.
Servers: The server where the source tables reside. You can use the odps_server server that is created at the underlying layer of Hologres. For more information, see postgres_fdw.
Source Project: The name of the project to which the source tables belong.
Select Tables for Query Acceleration: The source tables based on which you create Hologres external tables. Valid values:
  All Tables in Database: All tables in the selected project.
  Selected Tables: Specific tables in the selected project. If you select this option, you can search for tables by name.
Note: Fuzzy match is supported. After you enter a keyword, all tables whose names contain the keyword are displayed.
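The fuzzy match described in the note behaves like a substring search over table names. The following is a minimal sketch of that behavior, not the console's implementation: the helper `match_tables` and the table names are hypothetical, and the match is assumed to be case-insensitive because this topic does not say either way.

```python
def match_tables(table_names, keyword):
    """Return the table names that contain the keyword, in their original order.

    Assumption: matching ignores case; the console may behave differently.
    """
    needle = keyword.lower()
    return [name for name in table_names if needle in name.lower()]


# Hypothetical table names, for illustration only.
tables = ["dwd_orders", "dws_order_daily", "dim_customer"]
print(match_tables(tables, "order"))
```

With these sample names, the keyword `order` selects the first two tables and leaves `dim_customer` out.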
Configure the parameters in the Advanced Settings section.
The parameters that you configure in this section determine the policy that is used to handle conflicts that may occur when you create the Hologres external tables.
Parameter: Description
Action for Table Name Conflicts: The policy that is used to handle the following conflict: The name of a Hologres external table that you want to create is the same as the name of an existing table in Hologres. Valid values:
  Ignore Conflicts and Continue Creating Tables
  Update and Change Names of Tables with Same Names
  Report Error and Create No Table
Data Type Not Supported: The policy that is used to handle the following conflict: The type of data in a Hologres external table that you want to create is not supported in Hologres. Valid values:
  Report Error and Import Failed: If you select this option, the Hologres external table fails to be created.
  Ignore and Skip Unsupported Fields: If you select this option, the system skips fields whose data types are not supported and continues to create the Hologres external table.
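In SQL terms, these two policies plausibly correspond to the OPTIONS clause of the IMPORT FOREIGN SCHEMA statement. The sketch below rests on two assumptions that this topic does not confirm: that the Hologres option names are `if_table_exist` and `if_unsupported_type`, and that each console label maps to the option value shown. The helper `build_options_clause` is hypothetical. Check the Hologres reference before you rely on the mapping.

```python
# Assumed mapping from console labels to option values (not confirmed by this topic).
TABLE_CONFLICT = {
    "Ignore Conflicts and Continue Creating Tables": "ignore",
    "Update and Change Names of Tables with Same Names": "update",
    "Report Error and Create No Table": "error",
}
UNSUPPORTED_TYPE = {
    "Report Error and Import Failed": "error",
    "Ignore and Skip Unsupported Fields": "skip",
}


def build_options_clause(conflict_action, unsupported_action):
    """Return an OPTIONS clause for the two selected policies.

    Raises KeyError if a label is not one of the values listed above.
    """
    table_value = TABLE_CONFLICT[conflict_action]
    type_value = UNSUPPORTED_TYPE[unsupported_action]
    return (f"OPTIONS (if_table_exist '{table_value}', "
            f"if_unsupported_type '{type_value}')")


print(build_options_clause(
    "Report Error and Create No Table",
    "Ignore and Skip Unsupported Fields",
))
```

Keeping the labels in lookup tables makes an unknown label fail with a KeyError instead of silently producing an invalid clause.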
Save the node configurations and run the node.
In the top navigation bar of the configuration tab of the node, click the icon to save the node configurations.
In the top navigation bar of the configuration tab of the node, click the icon to create the Hologres external tables.
You must select a serverless resource group of DataWorks that is connected to the Hologres compute engine to run the node. For more information, see Network connectivity solutions.
What to do next
After the Hologres external tables are created, you can go to the Workspace Tables page in DataStudio to view the created external tables. For more information, see Manage tables. You can also run Hologres commands to query data of the source MaxCompute tables in an accelerated manner. For more information, see Create a foreign table in Hologres to accelerate queries on MaxCompute data.