All Products
Search
Document Center

DataWorks:Materialize a table to a compute engine

Last Updated:Jul 17, 2024

After you create a dimension table, fact table, aggregate table, or application table, you can materialize the table to the desired compute engine. This topic describes how to materialize a table and view the materialization records of a table.

Prerequisites

The required table is created. DataWorks Data Modeling allows you to use one of the following methods to perform data modeling:

Materialize a table

You can materialize a dimension table, fact table, aggregate table, or application table to the compute engine to which a data source added to the current workspace corresponds for subsequent data development and analysis. You can materialize a table to a MaxCompute, Hologres, E-MapReduce (EMR) Hive, CDH, or MySQL compute engine.

You can also export a table and then import the table to a compute engine for subsequent use. When you export a table, you can select the MaxCompute, Hologres, or Hive compute engine. For more information, see Import tables and export a table.

  1. Materialize a table.

    After a table is created, you can click the 发布 icon in the upper part of the configuration tab of the table to materialize the table. In the Publish dialog box, you must configure the parameters that are described in the following table.发布设置

    Parameter

    Description

    Publish Type

    You can select Physical Table, View, or Materialized View.

    Note
    • If you select View for the Publish Type parameter, the table can be materialized only to a MaxCompute compute engine.

    • A materialized view is a database object that stores the pre-calculation results of time-consuming queries, such as JOIN and AGGREGATE. You can directly reuse the pre-calculation results when you want to perform the same queries. This accelerates queries. For more information, see Materialized view operations.

    • If you select Materialized View for the Publish Type parameter, only Delete and Recreate is supported for the Publish Mode parameter.

    Workspace

    The workspace to which you want to materialize the table. This parameter is optional.

    • If a workspace is configured as a data development workspace on the Workspace for Data Modeling page in Data Warehouse Planning, you can select the workspace.

    • If no workspace is configured as a data development workspace, the Workspace parameter is not displayed, and the table is automatically materialized to the current workspace.

    Note

    For information about how to configure a workspace as a data development workspace, see Modeling workspace.

    Data Source Type

    The type of the compute engine that is used to store data. Valid values:

    MaxCompute, EMR Hive, Hologres, CDH, and MySQL.

    Note
    • If you select View for the Publish Type parameter, the table can be materialized only to a MaxCompute compute engine.

    • A materialized view is a database object that stores the pre-calculation results of time-consuming queries, such as JOIN and AGGREGATE. You can directly reuse the pre-calculation results when you want to perform the same queries. This accelerates queries. For more information, see Materialized view operations.

    • If you select Materialized View for the Publish Type parameter, only Delete and Recreate is supported for the Publish Mode parameter.

    Data Source Name

    The compute engine to which you want to materialize the table.

    Effective Environment

    Valid values: Development and Production.

    Note
    • If you use a workspace that is in basic mode, you can publish the table only to the production environment.

    • If you use a workspace that is in standard mode, you can publish the table to the development or production environment.

    Publish Mode

    • Incremental Publish: If you select this mode, only modifications that are made to a table are materialized to the compute engine.

    • Delete and Recreate: If you select this mode, the system deletes the table that is materialized to the compute engine and recreates the same table that you want to materialize.

    Note
    • The first time you materialize a table, the system creates the table in the related compute engine regardless of the mode in which you materialize the table.

    • If you select Materialized View for the Publish Type parameter, only Delete and Recreate is supported for the Publish Mode parameter.

    Resource Group

    The exclusive resource group for scheduling that is required to materialize the table.

    Note
    • You can use only an exclusive resource group for scheduling to materialize a table as a physical table of an EMR or CDH compute engine.

    • If no exclusive resource groups for scheduling are available, you can create one. For more information, see Create and use an exclusive resource group for scheduling.

    Description

    The description of the materialize operation.

    DDL

    The SQL statement that is used to generate a materialized view. The SQL statement is automatically generated by DataWorks based on the configurations of the table. The materialize operation can succeed only if the AS SELECT clause specified in the SQL statement is successfully executed. You must manually modify the AS SELECT clause as an executable clause.

    Note
    • This parameter is required only when the Publish Type parameter is set to Materialized View.

    • For information about operations on materialized views, see Materialized view operations.

  2. After you configure the parameters, click Publish. The system publishes the table to the development or production environment and materializes the table to the compute engine that you select.

    Note
    • You can view the progress of the materialize operation in the Publish dialog box.

    • You can repeat the publish operation to publish the table to the other environment.

  3. Optional: After the table is materialized, click the 日志 icon in the upper part of the configuration tab of the table to view the result and details of the latest materialize operation.

    The value of the Publishing Status parameter can be Succeeded, Failed, or Publishing. The following figure shows the logs that are generated if the materialize operation succeeds.发布日志

    Note
    • If deploy plan execute success is displayed in the logs, the materialize operation succeeds.

    • If the value of the Publishing Status parameter is Failed, you must locate and troubleshoot the error based on the error message that is displayed.

    • If No publishing logs found is displayed in the Publishing Logs dialog box, the table is never materialized.

View materialization records

After a table is materialized, you can go to the configuration tab of the table and click Publishing Record in the right-side navigation pane to view the materialization details and the information about the table.image

What to do next

After you materialize a table to the desired compute engine, you can go to the DataStudio page to perform data development operations based on the table. For more information about DataStudio, see Features on the DataStudio page.