Use the MaxCompute schema feature in DataWorks

Last Updated: Jun 07, 2024

After the schema feature is enabled for your MaxCompute project, each DataWorks module changes how it handles MaxCompute tables, resources, and functions to account for schemas. This topic describes the use scenarios and working principles of the schema feature and how each DataWorks module supports it.

Background information

After the schema feature is enabled for your MaxCompute projects, the original two-layer model structure project_name.table_name for MaxCompute tables is changed to a three-layer model structure project_default.schema_default.table_name that includes schemas. You can use schemas to classify items such as tables, resources, and functions in your MaxCompute projects. For more information, see Schema-related operations.
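The difference between the two-layer and three-layer structures can be sketched in a few lines of Python. This is only an illustration of the naming model described above. The TableRef class and the sample names sales_prj and orders are hypothetical and are not part of MaxCompute or DataWorks.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TableRef:
    """Hypothetical model of a MaxCompute table path (illustration only)."""

    project: str
    table: str
    # None models a project for which the schema feature is not enabled.
    schema: Optional[str] = None

    def qualified_name(self) -> str:
        # Two-layer structure: project_name.table_name
        if self.schema is None:
            return f"{self.project}.{self.table}"
        # Three-layer structure: project.schema.table
        return f"{self.project}.{self.schema}.{self.table}"


two_layer = TableRef(project="sales_prj", table="orders")
three_layer = TableRef(project="sales_prj", schema="default", table="orders")
print(two_layer.qualified_name())
print(three_layer.qualified_name())
```

The schema segment sits between the project and the table, which is why existing two-part paths need attention when the feature is enabled.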

Precautions

Take note of the following information if you want to enable the schema feature for your existing projects as a tenant user:

  • DataWorks displays schema-related interactions only when the schema feature is enabled at the tenant level. To enable the schema feature at the tenant level, the odps.namespace.schema parameter must be set to true for all MaxCompute projects within the related tenant.

    Note

    If only specific projects use custom schemas for storage and the odps.namespace.schema parameter is set to false at the tenant level, DataWorks cannot display schema-related interactions because DataWorks does not support use of custom schemas.

  • By default, a path written in the project.table format in existing code may be parsed as project_default.project.table based on the parsing rules of MaxCompute. In that case, the table path cannot be found and an error is reported.

  • The dependencies that are obtained by using the automatic parsing feature in DataStudio remain unchanged.
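The path-parsing precaution above can be illustrated with a minimal Python sketch. It assumes, based only on the description in this topic, that a two-part path is read as schema.table in the current project once the schema feature is enabled. The function interpret_two_part and the sample names are hypothetical, and the real MaxCompute parser applies more rules than this.

```python
from typing import Tuple


def interpret_two_part(name: str, current_project: str,
                       schema_enabled: bool) -> Tuple[str, ...]:
    """Return the path segments a two-part name resolves to (simplified)."""
    parts = name.split(".")
    if len(parts) != 2:
        raise ValueError(f"expected a two-part name, got {name!r}")
    first, second = parts
    if schema_enabled:
        # Assumption: the first segment is treated as a schema in the
        # current project, so a legacy project.table path becomes
        # current_project.project.table and the lookup fails.
        return (current_project, first, second)
    # Without the schema feature, the first segment is the project.
    return (first, second)


print(interpret_two_part("sales_prj.orders", "project_default", False))
print(interpret_two_part("sales_prj.orders", "project_default", True))
```

Under this assumption, legacy code that references another project with a two-part path is the code most likely to break after the feature is enabled.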

Enable the MaxCompute schema feature

You can determine whether to enable the MaxCompute schema feature based on your business requirements. The following table describes information about how to enable the feature in different scenarios.

Scenario: You have a small number of nodes and MaxCompute resources

How to enable:

  • MaxCompute: Submit a ticket to request that the MaxCompute schema feature be enabled. The feature is enabled by setting the odps.namespace.schema parameter to true.

  • DataWorks: After the MaxCompute schema feature is enabled, the GUIs for MaxCompute tables in the original two-layer model structure are changed to the GUIs for the same MaxCompute tables in a three-layer model structure, and you cannot switch back to the original GUIs.

Description: The MaxCompute schema feature takes effect at the tenant level. After you enable the feature as a tenant user, the three-layer model structure that includes schemas is applied to your MaxCompute tables in all regions. You cannot disable the feature after it is enabled.

Scenario: You have a large number of nodes and MaxCompute resources

How to enable: We recommend that you do not enable the MaxCompute schema feature. Keep the odps.namespace.schema parameter set to false.

Description: N/A.

Support of the DataWorks modules for the MaxCompute schema feature

If the schema feature is enabled for your MaxCompute project, you must specify a schema for most operations that are performed in the DataWorks console. The following table describes the support of the DataWorks modules for the MaxCompute schema feature.

Module

Support for the MaxCompute schema feature

DataStudio

Changes to the operations performed in DataStudio:

  • Basic operations

    • You must specify a schema when you perform operations related to tables, resources, and functions.

    • You can view tables by MaxCompute compute engine instance or schema on the Workspace Tables and Tenant Tables pages.

    • When you import a table, the schema to which the table belongs is displayed.

  • Dependency configuration

    The automatic parsing rules of MaxCompute tables in a three-layer model structure that includes schemas are different from those of MaxCompute tables in a two-layer model structure. For more information, see Automatic parsing rules of the MaxCompute schema feature.

Data Modeling

You need to specify a schema when you perform the following operations in Dimensional Modeling:

  • Query existing tables or redundant tables.

  • Publish models to the MaxCompute compute engine.

  • Perform reverse modeling.

[Figure: Dimensional Modeling supports schema selection]

Data Integration

You need to specify a schema when you select a source table and a destination table for batch synchronization nodes, real-time synchronization nodes, and data synchronization solutions. You can also create a schema when you configure a data synchronization node.

[Figure: Data Integration synchronization tasks support schema selection]

Data Map

Table names are displayed in the schema.table format in Data Map.

Note

You cannot search for tables by schema.

[Figure: Schemas displayed in Data Map]

Data Quality

  • Table names are displayed in the schema.table format in Data Quality.

  • If you use a custom SQL statement to create a monitoring rule, you must specify a table in the schema.table format.

Note

You cannot search for tables by schema.

Security Center

You can view the schemas of tables and filter tables by schema. The following figure shows the table information on the Permission Application tab.

[Figure: Schemas displayed in Security Center]

Approval Center

You can view the schemas of tables and filter tables by schema. The following figure shows information about compute engines on the Processing Details page.

[Figure: Schemas displayed in Approval Center]

DataService Studio

You can view the schemas of tables and filter tables by schema on pages on which MaxCompute tables are displayed. For example, table names are specified in the schema.table format in the Write Query SQL section.

DataAnalysis

You can view the schemas of tables and filter tables by schema on pages on which MaxCompute tables are displayed. For example, table names are displayed in the schema.table format on the SQL Query page and in the SQL editor.

Data Security Guard

You can view the schemas of tables and filter tables by schema on pages on which MaxCompute tables are displayed. For example, the schemas of tables are displayed on the related pages for sensitive data identification and dynamic and static masking of sensitive data.

[Figure: Schemas displayed in Data Security Guard]

Automatic parsing rules of the MaxCompute schema feature

When you use the automatic parsing feature for MaxCompute tables in a three-layer model structure that includes schemas, the system completes the table names in code to the odps_project.schema.table format. If the schema is default, the schema segment is hidden in the parsing result. The following table provides the details.

| Schema type | odps_project.schema.table syntax | Automatic parsing result |
| --- | --- | --- |
| default | project.default.table or default.table | odps_project.table |
| Non-default schema (such as schemaA) | project.schemaA.table or schemaA.table | odps_project.schemaA.table |
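The rules in the table above can be sketched as a small Python function. This is a simplified illustration, not the DataStudio implementation. The function name auto_parse is hypothetical, and the sketch assumes that a two-part name is completed with the current project and that a three-part name keeps the project written in the code.

```python
DEFAULT_SCHEMA = "default"


def auto_parse(name: str, current_project: str) -> str:
    """Complete a table name and hide the default schema (simplified)."""
    parts = name.split(".")
    if len(parts) == 3:
        # project.schema.table: keep the project written in the code.
        project, schema, table = parts
    elif len(parts) == 2:
        # schema.table: assumption, complete it with the current project.
        project = current_project
        schema, table = parts
    else:
        raise ValueError(f"unsupported table name: {name!r}")
    if schema == DEFAULT_SCHEMA:
        # The default schema is hidden in the parsing result.
        return f"{project}.{table}"
    return f"{project}.{schema}.{table}"


for ref in ("default.table", "schemaA.table",
            "odps_project.default.table", "odps_project.schemaA.table"):
    print(ref, "->", auto_parse(ref, current_project="odps_project"))
```

Both spellings of a table in the default schema resolve to the same two-part result, which matches the first row of the table.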

Note

For information about the automatic parsing feature, see Scheduling dependency configuration guide.