All Products
Search
Document Center

MaxCompute:Recommendations and management of materialized views

Last Updated:Dec 16, 2024

With continuous use of MaxCompute, you may perform repeated queries on MaxCompute tables by using the same computational logic. To reduce repeated computing, accelerate computing, and improve computing efficiency, MaxCompute provides the intelligent analysis feature on materialized views to help you analyze data and provide recommendations on how to create materialized views. This topic describes how to create a materialized view by using the intelligent analysis feature.

Limits

  • You can enable the intelligent analysis feature only for a single project. You cannot use the feature to perform cross-project or multi-project analysis and recommendation.

  • To accurately generate recommendations for creating a materialized view and view the recommendations on the Materialized View Recommendations tab, you must make sure that the following conditions are met after you enable the intelligent analysis feature:

    • Cycle tasks run for more than three consecutive days.

    • The number of rows of input data in the public subquery that is selected for creating a materialized view is greater than 1,000,000.

    • The public subquery includes operators that are related to data reorganization, such as join or aggregation operators.

  • The materialized view recommendations feature can be used only in the following regions: China (Hangzhou), China (Shanghai), China (Beijing), China (Zhangjiakou), China (Shenzhen), China (Chengdu), and Germany (Frankfurt).

Procedure

View recommendations of materialized views

On the Materialized View Recommendations tab, you can view the impact and recommendation details of beneficial jobs after a materialized view is created in a specific region. This helps you determine whether to adopt the recommendation.

  1. Log on to the MaxCompute console. In the top navigation bar, select a region.

  2. In the left-side navigation pane, choose Workspace > Materialized Views.

  3. On the Materialized Views page, click the Materialized View Recommendations tab.

  4. On the Materialized View Recommendations tab, configure the following parameters to check whether recommendations for creating materialized views are generated.

    Parameter

    Description

    Project Name

    Select a MaxCompute project from the drop-down list.

    Source Table Name

    Enter the name of a source table. Fuzzy match is supported. You can enter multiple table names. Separate the names with commas (,).

    Analyze Time Interval

    Select the start date and end date to obtain the public subqueries that are performed within the specified date range.

You can view the following information on the Materialized View Recommendations tab:

  • Impacts

    Item

    Description

    Estimated Number of Beneficial Jobs Per Day

    The estimated number of beneficial jobs each day after a recommended materialized view is created.

    Current CU-hour Consumption and Percentage Per Day of Beneficial Jobs

    The ratio of the estimated compute unit (CU)-hours consumed by beneficial jobs per day after a recommended materialized view is created to the CU-hours consumed by all jobs per day. A greater percentage value indicates a larger impact.

    Current Computing Resource Consumption and Percentage Per Day of Beneficial Jobs

    The ratio of the estimated computing resources used (Amount of scanned data × Complexity) by beneficial jobs per day after a recommended materialized view is created to the computing resources used (Amount of scanned data × Complexity) by all jobs per day. A greater percentage value indicates a larger impact.

    Estimated Computing Resources Required for Creating All MVs

    Creating a recommended materialized view requires one-time resource consumption. When you create a materialized view and use pay-as-you-go computing resources to run the materialized view, computing costs are generated. You can estimate the computing costs based on the computing resources required (Amount of scanned data × Complexity).

  • Public subquery details

    Column

    Description

    MV Summary

    The summary of the public subquery.

    • Aggregate specifies whether the recommended subquery includes an aggregate operation.

    • Inputs specifies the names of all source tables used by a subquery.

    You can click the information in the MV Summary column to go to the Public Subquery Details panel and view the jobs in which the public subquery is frequently used and the SQL script of the public subquery. You can view the following information:

    • Public Subquery No.: the unique ID that is automatically generated in the backend. The ID is associated with a materialized view.

    • Public Subquery: the SQL script of the public subquery.

      • Click the image icon to expand the SQL script or view the SQL script in full screen.

      • Click the image icon to copy the SQL script.

      • Click the image icon to download the SQL script.

    • Original SQL List (Data in the Last 5 Days): displays only the original job instances that use the subquery in the previous five days. The following information about each job instance is displayed:

      • Instance ID: the ID of the original job instance that uses the subquery.

      • Submit Project Name: the name of the project in which the job instance runs.

      • Submitted By: the submitter of the job.

      • Execution Time (s): the duration for which the subquery runs. Unit: seconds.

      • Original Query SQL: the operations that can be performed.

        • View Details: You can click View Details to view SQL details.

        • Download: You can click Download to download the related SQL script.

    Project

    The project in which the job that uses the subquery runs.

    Recommend Rating

    The recommendation rating for creating a materialized view. The recommendation rating is represented by one star to five stars. A greater number of stars indicates that creation of a materialized view is more recommended.

    Number of Beneficial Jobs

    The number of jobs that use the public subquery.

    Amount of Scanned Data

    The data scanned by the public subquery.

    Complexity

    The complexity of the public subquery.

    Computing Resources Required for MV Creation

    The computing resources used for creating a materialized view. The value is calculated based on the following formula: Amount of scanned data × Complexity.

    Analysis Time

    The recommended analysis time of the public subquery.

    Operations Included

    • Include Join Operation: An SQL query contains multi-table joins.

    • Aggregate: An SQL query contains aggregate operations.

    You can filter operations.

    Creation Status

    Specifies whether a materialized view is created for the public subquery.

    Actions

    If the creation status of the subquery is Not Created, you can click Create Materialized View in the Actions column to create a materialized view.

    If the creation status of the subquery is Created, you can click View Materialized Views in the Actions column to view the materialized view details.

Create a materialized view

On the Materialized View Recommendation tab, determine whether to create a materialized view based on your business requirements and recommendation details. To create a materialized view, perform the following steps:

  1. On the Materialized View Recommendation tab, find a recommendation based on which you want to create a materialized view and click Create Materialized View in the Actions column.

  2. On the Create Materialized View page, configure the following parameters for a materialized view.

    Parameter

    Description

    Materialized View Name

    The name of the materialized view. By default, rec_mv_{$project_name}_{$timestamp} is entered. You can modify the name based on your business requirements. The name must start with a letter or digit and contain letters, digits, and underscores (_).

    Refresh Type

    The method that is used to update data in the materialized view. Valid values:

    • Refresh at a Fixed Interval: updates data in the materialized view at a specific interval. The interval can be minutes, hours, or days.

    • Cron: updates data in the materialized view in Cron mode.

    Script

    The SQL script that is used to create the recommended materialized view.

    • Click the image icon to expand the SQL script or view the SQL script in full screen.

    • Click the image icon to copy the SQL script.

    • Click the image icon to download the SQL script.

  3. Click OK to create a materialized view.

Important

The creation of a materialized view generates storage costs. The storage costs and calculation method are the same as those of standard tables. In addition, if you use pay-as-you-go computing resources to run the created materialized view, computing costs are generated. You can estimate the costs of creating a materialized view based on the used computing resources (Amount of scanned data × Complexity).

Manage materialized views and view actual benefits

  1. Log on to the MaxCompute console. In the top navigation bar, select a region.

  2. In the left-side navigation pane, choose Workspace > Materialized Views.

  3. On the Materialized Views page, click the MV Management and Benefits tab.

  4. Select a MaxCompute project and a profit statistics range to view the list of created materialized views and benefits. You can also manage the materialized views.

    • The following table describes the benefit items.

      Item

      Description

      Hits

      The number of times the materialized views are called within the specified profit statistics range.

      Saved Computing Duration

      The difference between the computing duration of jobs that hit the materialized views in the selected project and the estimated computing duration of these jobs if the materialized views are not created within the specified profit statistics range.

      Saved CU-hours

      The difference between the CU-hours consumed by jobs that hit the materialized views in the selected project and the estimated CU-hours consumed by these jobs if the materialized views are not created within the specified profit statistics range.

      Saved Computing Resources

      The difference between the computing resources used (Amount of scanned data × Complexity) by jobs that hit the materialized views in the selected project and the estimated computing resources used (Amount of scanned data × Complexity) by these jobs if the materialized views are not created within the specified profit statistics range. Only pay-as-you-go jobs are counted.

      Savings in Computing Costs (Catalog Price)

      The costs of the saved computing resources. The value is calculated based on the following formula: Computing cost savings (catalog price) = Saved computing resources × CNY 0.3 per GB (unit price for SQL computing jobs in the pay-as-you-go Standard Edition).

    • The following table describes the columns in the list of materialized views.

      Column

      Description

      Materialized View Name

      The name of the materialized view.

      Hits

      The number of times the materialized view is called within the specified profit statistics range. The profit statistics range can be Since MV Creation or Previous 1 Day.

      Saved Computing Duration

      The difference between the computing duration of jobs that hit the materialized view and the estimated computing duration of these jobs if the materialized view is not created within the specified profit statistics range.

      Saved CU-hours

      The difference between the CU-hours consumed by jobs that hit the materialized view and the estimated CU-hours consumed by these jobs if the materialized view is not created within the specified profit statistics range.

      Saved Computing Resources

      The difference between the computing resources used (Amount of scanned data × Complexity) by jobs that hit the materialized view and the estimated computing resources used (Amount of scanned data × Complexity) by these jobs if the materialized view is not created within the specified profit statistics range. Only pay-as-you-go jobs are counted.

      Created At

      The time when the materialized view was created.

      Status

      The status of the materialized view. Valid values:

      • Created

      • Creating

      Actions

      The operations that you can perform on the materialized view.

      • View Details: You can click View Details to view the basic information and call details of the materialized view. The Materialized View Details panel displays the following information:

        • Materialized View Name: the name of the materialized view.

        • Basic Information section:

          • Created At: the time when the materialized view was created.

          • Table Schema Updated At: the time when the table schema of the materialized view was updated.

          • Storage Size: the size of the storage space occupied by the materialized view.

          • Refresh Interval (Minutes): the frequency at which the data in the materialized view is updated.

          • Refreshed At: the time when the data in the materialized view was last updated.

        • Code Details: the SQL script that defines the materialized view.

          • Click the image icon to expand the SQL script or view the SQL script in full screen.

          • Click the image icon to copy the SQL script.

          • Click the image icon to download the SQL script.

        • Materialized View Call Details section: displays the details of all jobs that call the materialized view. You can view the following information about each job:

          • Instance ID: the instance ID of the job. You can click the instance ID to view the detailed information about the job.

          • Call MV Execution Time (s): the duration for which the job runs after the job calls the materialized view. Unit: seconds.

          • Save Time (s): the saved time for the job running after the job calls the materialized view.

          • Consumed CU-hours: the CU-hours consumed by the job when the job is running.

          • Saved CU-hours: the saved CU-hours of the job after the job calls the materialized view.

          • Used Computing Resources: the computing resources used when the job is running. The value is calculated based on the following formula: Used computing resources = Amount of scanned data × Complexity.

          • Saved Computing Resources: the saved computing resources of the job after the job calls the materialized view. The value of used computing resources is calculated based on the following formula: Amount of scanned data × Complexity.

      • Delete: You can click Delete to delete the materialized view.