All Products
Search
Document Center

DataWorks:Create a logical model: dimension table

Last Updated:Dec 06, 2024

A dimension is the basic unit of dimensional modeling. In dimensional modeling, measures are referred to as facts and environments are referred to as dimensions. A dimension table contains the details about a specific attribute in a fact table. Common dimensions include the date dimension and the city dimension. This topic describes how to create a dimension table.

Prerequisites

A common layer is used to process and integrate common data to define unified dimension tables. An application layer is used to reconstruct the data that is processed and integrated at a common layer based on your business requirements. You can create a dimension table at a common layer or an application layer based on your business requirements. The following table describes the prerequisites that must be met before you create a dimension table at a specific data layer.

Data layer category

Prerequisite

References

Common layer

A common layer is created. A dimension table must belong to a common layer or an application layer.

Create a data layer

A data domain is created. The data domain is used to determine the perspective of business data stored in a dimension table.

Data domain

Application layer

An application layer is created. A dimension table must belong to a common layer or an application layer.

Create a data layer

A data mart is created. A dimension table at an application layer must belong to a data mart, which is used to organize data for a specific product or scenario.

Data mart

Background information

Extract all the dimensions that possibly exist in each data domain, and store the dimensions and attributes of the dimensions in dimension tables. For example, when you analyze e-commerce business data, possible dimensions (attributes of each dimension) include order (order ID, order creation time, buyer ID, and seller ID), user (gender and birthdate), and commodity (commodity ID, commodity name, and commodity put-on-shelf time). You can create the following dimension tables: order dimension table, user dimension table, and commodity dimension table. The attributes of each dimension are used as the fields in the dimension table. You can deploy the dimension tables in a data warehouse and perform extract, transform, and load (ETL) operations to store dimension data in the format defined in the dimension table. This allows business personnel to access the data for subsequent data analysis.

维度表As illustrated in the preceding figure:

  • When you create a dimension table, you can perform the following operations:

    • You can specify a business category or data domain in which you want to create the dimension table. You can view all dimension tables in a specific data domain or data mart.

    • You can specify the data layer at which you want to create the dimension table. In most cases, a dimension table is stored at the Dimension (DIM) layer, which belongs to the common layer category.

  • After you create a dimension table, you can add dimension attributes as fields of the dimension table. You can also associate the dimension table with fields of other dimension tables, partition the dimension table, and use the same field standard to define the value range of the fields. This ensures consistent attributes for the dimension data across the entire data domain.

  • After you create and configure a dimension table, you can perform the following operations:

    • You can publish and materialize the dimension table to a compute engine instance. The dimension table can be used in the compute engine instance for data analysis.

    • When you design and create derived metrics and aggregate tables, you can directly associate and use the fields of the dimension table.

Create a dimension table

  1. Go to the Data Modeling page.

    Log on to the DataWorks console. In the top navigation bar, select the desired region. In the left-side navigation pane, choose Data Development and Governance > Data Modeling. On the page that appears, select the desired workspace from the drop-down list and click Go to Data Modeling.

  2. In the top navigation bar of the Data Modeling page, click Dimensional Modeling to go to the Dimensional Modeling page.

  3. Create a dimension table.

    1. On the Dimensional Modeling page, move the pointer over the 加号 icon and choose Logical Model > Create Dimension Table.

    2. In the Basic Information section of the configuration tab that appears, configure basic information for the dimension table.

      You can configure parameters such as Data Layer, Data Domain, Business Category, and Mart for the dimension table. After the dimension table is created, you can view it in the specified data layer, data domain, business category, or data mart. In this example, a common dimension table is created. 创建维度表The following table describes key parameters.

      Parameter

      Description

      Data domain

      The data domain to which the dimension table belongs.

      Business category

      The business category to which the dimension table belongs.

      Storage Policy

      The policy based on which you want to store data in the dimension table, such as the storage period of data and the volume range of data.

      Dimension Table

      The dimension with which you want to associate the dimension table. You can analyze data in the dimension table based on this dimension.

      Note

      For information about how to create a dimension, see Create a conceptual model: dimension.

      Naming Rule

      The checker that is used to check whether the name of the dimension table conforms to the configured naming conventions. After you select a checker, you must configure the Name parameter based on the naming conventions configured in the checker.

      Note

      For information about how to configure a checker, see Configure and use a checker at a data layer.

      Name

      The name of the dimension table. If you configure the Naming Rule parameter, the name that you specify must conform to the naming conventions defined in the checker specified by the Naming Rule parameter.

      Display Name

      The display name of the dimension table.

      Life Cycle

      The lifecycle of the dimension table. Unit: day. The maximum value of this parameter is 36000.

      Owner

      The owner of the dimension table. The default owner is the creator of the dimension table.

      Description

      The description of the dimension table.

  4. After you complete the configuration, click Save in the upper part of the configuration tab.

    You can view and manage the dimension table in the specified data domain or business category in the left-side navigation tree on the Dimensional Modeling page.

Add fields for the dimension table

After you configure basic information for the dimension table, you must add fields for the dimension table.

Note

You can use fast modeling language (FML) statements to add fields for a table, configure attributes for the fields, and configure associations and partitions for the table. For more information, see Use FML statements to configure and manage data tables.

You can add fields for the dimension table in shortcut mode or script mode. The shortcut mode supports the following method to add fields for a dimension table: Import from Table/View. To use this method to add fields for a dimension table, you must import all or specific fields from a physical table or view that is selected from the Search for Existing Table/View drop-down lists to the dimension table.

Note

You can import fields only from physical tables or views that are created within MaxCompute, Hologres, and E-MapReduce (EMR) Hive compute engine instances.

Shortcut Mode: Import from Table/View

从表导入

  1. Click Expand next to Import from Table/View.

  2. In the Search For Existing Table/View drop-down lists, select a compute engine type from the left one, and enter a keyword and select the desired physical table or view from the displayed search results from the right one. Click the following icons next to the name of the selected physical table or view to import all or specific fields from the physical table or view to the dimension table.

    Note
    • When you search for the desired physical table or view, keyword-based fuzzy match is supported. After you enter a keyword, all physical tables or views whose names contain the keyword are displayed.

    • You can search for only tables in the production environment. You cannot search for tables in the development environment.

    • 导入全部字段: imports all fields from the selected physical table or view to the dimension table.

    • 部分字段: imports specific fields from the selected physical table or view to the dimension table.

  3. If you choose to import specific fields, select the fields from the physical table or view in the dialog box that appears. Then, click Import.

    Note

    If one of the imported fields has no display name, you can fill the description of the field as the display name of the field as prompted.

Script Mode

The script mode allows you to edit code to manage the fields in the table. After you click Script Mode, the statement that is used to create the table is displayed in the code editor in the Script Mode dialog box. You can modify the statement based on your business requirements. Then, click OK.代码模式

Configure attributes and associations for fields in the dimension table

After fields are added for the table, you can configure Field Standard to Associate, Redundant Field, and Associated Granularity/Metric for the fields.

  1. Configure attributes for fields in the table.

    By default, the following basic attributes are displayed for fields: Field Name, Data Type, Field Display Name, Description, Primary Key, Not Null, and Actions. You can click Field Display Settings in the upper-right corner of the list of the fields that are added and select the attributes that you want to display. You can also modify the values of the attributes based on your business requirements.

  2. Configure Field Standard to Associate and Lookup Table to Associate for fields in the table.

    You can associate field standards and lookup tables with the added fields by configuring Field Standard to Associate and Lookup Table to Associate for fields in the table.

    • Field Standard to Associate: allows you to manage the values of the fields that have the same meaning but different names in a centralized manner. A field standard can also be used to define the value range and measurement unit for the fields.

    • Lookup Table to Associate: allows you to define the value range of the fields that use a field standard.

  3. Configure Redundant Field for fields in the table.

    In a traditional star model, dimensions are stored in dimension tables. Foreign keys of fact tables are used to obtain dimensions to reduce the storage space that is required. In the dimension model design in DataWorks Data Modeling, some frequently used fields, such as the user ID and the commonly used analysis dimension, can be configured as redundant fields to improve downstream query efficiency, simplify data acquisition operations, and reduce the number of tables that need to be associated.

    Example 1: You can configure the "consignee address" and "consignee mobile phone number" attributes of the "delivery address" dimension in the "delivery address" dimension table as the redundant fields of the "order creation" fact table.

    Example 2: You can configure the "procurement information" and "brand information" attributes in the "commodity information" table as the redundant fields of the "commodity" dimension table.

    You can find the field for which you want to configure redundant fields and click Redundant Field in the Actions column. In the dialog box that appears, select one or more fields that you want to associate with the field and click Save.

    冗余字段

  4. Then, click Save in the upper part of the configuration tab of the table.

What to do next

After the preceding configuration is complete, you must configure partitions for the table and publish the table to the required environment. For more information, see Materialize a table to a compute engine.