After you create a dimension table, fact table, aggregate table, or application table, you can materialize the table to the desired compute engine. This topic describes how to materialize a table and view the materialization records of a table.
Prerequisites
The required table is created. DataWorks Data Modeling allows you to use one of the following methods to perform data modeling:
Use the visualized user interface (UI). For information about how to create a table, see Create a logical model: dimension table, Create a logical model: fact table, Create a logical model: aggregate table, or Create a logical model: application table.
Import information about multiple tables to Data Modeling at a time.
Materialize a table
You can materialize a dimension table, fact table, aggregate table, or application table to the compute engine to which a data source added to the current workspace corresponds for subsequent data development and analysis. You can materialize a table to a MaxCompute, Hologres, E-MapReduce (EMR) Hive, CDH, or MySQL compute engine.
You can also export a table and then import the table to a compute engine for subsequent use. When you export a table, you can select the MaxCompute, Hologres, or Hive compute engine. For more information, see Import tables and export a table.
Materialize a table.
After a table is created, you can click the icon in the upper part of the configuration tab of the table to materialize the table. In the Publish dialog box, you must configure the parameters that are described in the following table.
Parameter
Description
Publish Type
You can select Physical Table, View, or Materialized View.
NoteIf you select View for the Publish Type parameter, the table can be materialized only to a MaxCompute compute engine.
A materialized view is a database object that stores the pre-calculation results of time-consuming queries, such as JOIN and AGGREGATE. You can directly reuse the pre-calculation results when you want to perform the same queries. This accelerates queries. For more information, see Materialized view operations.
If you select Materialized View for the Publish Type parameter, only Delete and Recreate is supported for the Publish Mode parameter.
Workspace
The workspace to which you want to materialize the table. This parameter is optional.
If a workspace is configured as a data development workspace on the Workspace for Data Modeling page in Data Warehouse Planning, you can select the workspace.
If no workspace is configured as a data development workspace, the Workspace parameter is not displayed, and the table is automatically materialized to the current workspace.
NoteFor information about how to configure a workspace as a data development workspace, see Modeling workspace.
Data Source Type
The type of the compute engine that is used to store data. Valid values:
MaxCompute, EMR Hive, Hologres, CDH, and MySQL.
NoteIf you select View for the Publish Type parameter, the table can be materialized only to a MaxCompute compute engine.
A materialized view is a database object that stores the pre-calculation results of time-consuming queries, such as JOIN and AGGREGATE. You can directly reuse the pre-calculation results when you want to perform the same queries. This accelerates queries. For more information, see Materialized view operations.
If you select Materialized View for the Publish Type parameter, only Delete and Recreate is supported for the Publish Mode parameter.
Data Source Name
The compute engine to which you want to materialize the table.
Effective Environment
Valid values: Development and Production.
NoteIf you use a workspace that is in basic mode, you can publish the table only to the production environment.
If you use a workspace that is in standard mode, you can publish the table to the development or production environment.
Publish Mode
Incremental Publish: If you select this mode, only modifications that are made to a table are materialized to the compute engine.
Delete and Recreate: If you select this mode, the system deletes the table that is materialized to the compute engine and recreates the same table that you want to materialize.
NoteThe first time you materialize a table, the system creates the table in the related compute engine regardless of the mode in which you materialize the table.
If you select Materialized View for the Publish Type parameter, only Delete and Recreate is supported for the Publish Mode parameter.
Resource Group
The exclusive resource group for scheduling that is required to materialize the table.
NoteYou can use only an exclusive resource group for scheduling to materialize a table as a physical table of an EMR or CDH compute engine.
If no exclusive resource groups for scheduling are available, you can create one. For more information, see Create and use an exclusive resource group for scheduling.
Description
The description of the materialize operation.
DDL
The SQL statement that is used to generate a materialized view. The SQL statement is automatically generated by DataWorks based on the configurations of the table. The materialize operation can succeed only if the AS SELECT clause specified in the SQL statement is successfully executed. You must manually modify the
AS SELECT
clause as an executable clause.NoteThis parameter is required only when the Publish Type parameter is set to Materialized View.
For information about operations on materialized views, see Materialized view operations.
After you configure the parameters, click Publish. The system publishes the table to the development or production environment and materializes the table to the compute engine that you select.
NoteYou can view the progress of the materialize operation in the Publish dialog box.
You can repeat the publish operation to publish the table to the other environment.
Optional: After the table is materialized, click the icon in the upper part of the configuration tab of the table to view the result and details of the latest materialize operation.
The value of the Publishing Status parameter can be Succeeded, Failed, or Publishing. The following figure shows the logs that are generated if the materialize operation succeeds.
NoteIf
deploy plan execute success
is displayed in the logs, the materialize operation succeeds.If the value of the Publishing Status parameter is Failed, you must locate and troubleshoot the error based on the error message that is displayed.
If No publishing logs found is displayed in the Publishing Logs dialog box, the table is never materialized.
View materialization records
After a table is materialized, you can go to the configuration tab of the table and click Publishing Record in the right-side navigation pane to view the materialization details and the information about the table.
What to do next
After you materialize a table to the desired compute engine, you can go to the DataStudio page to perform data development operations based on the table. For more information about DataStudio, see Features on the DataStudio page.