
Best Practices for the Artificial Intelligence Recommendation of Materialized Views in MaxCompute

This article introduces materialized views in MaxCompute and the Artificial Intelligence Recommendation feature built on them, including its features and use cases.

By Junwei Xia, Senior Product Expert of Alibaba Cloud, and Junzheng Zheng, Senior Technical Expert of Alibaba Cloud

What Is a Materialized View?

A materialized view in MaxCompute is a data object that pre-computes and stores result data. Within a MaxCompute project it can be queried like a table, but unlike an ordinary view its results are physically stored. It typically contains the aggregated, filtered, or joined results of one or more tables. Materialized views can significantly reduce query processing time and save computing resources. When a job can reuse the stored results, the automatic query rewriting capability of the MaxCompute optimizer replaces the complex operations with reads from the materialized view.
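The idea of pre-computing and storing a result can be sketched in a few lines of Python. This is only an illustration, not MaxCompute code: SQLite has no materialized views, so an ordinary table filled by CREATE TABLE ... AS SELECT stands in for one, and the table names (orders, customers, mv_region_total) are hypothetical.

```python
import sqlite3

# Assumption: SQLite is used purely as a stand-in engine. It has no
# materialized views, so a table built from a query plays that role here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id INTEGER, customer_id INTEGER, amount REAL);
    CREATE TABLE customers (customer_id INTEGER, region TEXT);
    INSERT INTO orders VALUES (1, 10, 20.0), (2, 10, 30.0), (3, 11, 50.0);
    INSERT INTO customers VALUES (10, 'east'), (11, 'west');
""")

# The "complex operation": a join, a filter, and an aggregation.
BASE_QUERY = """
    SELECT c.region AS region, SUM(o.amount) AS total
    FROM orders o JOIN customers c ON o.customer_id = c.customer_id
    WHERE o.amount > 0
    GROUP BY c.region
"""

# Pre-compute and store the result once.
conn.execute(f"CREATE TABLE mv_region_total AS {BASE_QUERY}")

# Running the original query and reading the stored result give the same rows.
direct = conn.execute(f"{BASE_QUERY} ORDER BY region").fetchall()
from_mv = conn.execute(
    "SELECT region, total FROM mv_region_total ORDER BY region"
).fetchall()
print(direct == from_mv)
```

In this sketch the second read skips the join and aggregation entirely. In MaxCompute, the optimizer makes that substitution automatically instead of requiring the query to name the stored table.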

What Is the Artificial Intelligence Recommendation of Materialized Views?

Using materialized views effectively requires understanding how they work, how the business data behaves, and which scenarios suit them. This poses a challenge for regular users.

MaxCompute offers the Artificial Intelligence Recommendation of materialized views, which allows users to leverage materialized views seamlessly. With this feature enabled, MaxCompute automatically analyzes how business data is used, recommends suitable materialized views, and visualizes their impact. This greatly lowers the barrier to using materialized views and expands the range of scenarios where they can be applied.

Features of the Artificial Intelligence Recommendation of Materialized Views

· Easy to use. Instead of having to understand the intricate details of materialized views, users can simply select their projects and enable the automatic intelligent analysis feature.

· Intelligent. MaxCompute automatically analyzes users' historical jobs and identifies recurring ones. It extracts the computational logic that these jobs share as candidate materialized view definitions and presents them to the user as readable SQL text, sorted by recommendation degree.

· Easy to manage. The MaxCompute console offers a comprehensive solution for activating, managing, and displaying the effectiveness of materialized views.
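The "Intelligent" step above can be sketched as a simple frequency count. This is a minimal illustration under strong assumptions, not the algorithm MaxCompute uses: each job is assumed to have already been split into normalized subquery strings, and the number of jobs sharing a subquery is used as a stand-in for the recommendation degree. The job names and SQL strings are hypothetical.

```python
from collections import Counter

def recommend(jobs, min_jobs=2):
    """Rank subqueries that appear in at least `min_jobs` different jobs.

    `jobs` maps a job name to the list of subquery strings it contains.
    Returns (subquery, job_count) pairs, most widely shared first.
    """
    counts = Counter()
    for subqueries in jobs.values():
        counts.update(set(subqueries))  # count each job at most once per subquery
    shared = [(sq, n) for sq, n in counts.items() if n >= min_jobs]
    return sorted(shared, key=lambda item: (-item[1], item[0]))

# Hypothetical subqueries extracted from three recurring jobs.
AGG = "SELECT shop_id, SUM(amount) FROM orders GROUP BY shop_id"
RECENT = "SELECT * FROM orders WHERE dt = '20240101'"
OPEN_SHOPS = "SELECT * FROM shops WHERE status = 'open'"

jobs = {
    "daily_sales_report": [RECENT, AGG],
    "shop_ranking": [AGG, OPEN_SHOPS],
    "finance_rollup": [AGG, RECENT],
}

recommendations = recommend(jobs)
for subquery, job_count in recommendations:
    print(job_count, subquery)
```

A real analysis would compare query plans rather than text and would also weigh the cost of each subquery, but the ranking idea is the same: logic shared by more jobs is a stronger candidate.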

Scenarios of the Artificial Intelligence Recommendation of Materialized Views

Data Governance

As an enterprise grows, its business data grows with it, and each department develops its own data analysis needs. In daily operations, departments often use each other's data, which results in a significant amount of repeated calculation with the same logic.

Finding these repeated calculations is challenging for regular users or big data platform administrators because the repetitive part may only be a fraction of the overall computational logic. Modifying these repeated calculations is also difficult. If a table with repeated calculations is re-abstracted, it requires modifying all downstream dependent jobs and going through testing before relaunching. This additional workload makes it difficult to promote efficient data governance.

By using the Artificial Intelligence Recommendation of materialized views, MaxCompute automatically analyzes common computational logic within projects and provides recommendations for creating materialized views. With materialized views, you can leverage the optimizer's powerful rewriting capabilities to automatically apply the calculation results to jobs without modifying the original logic.

For example, as shown in the following figure, without a materialized view the logic represented by the rhombuses and circles is calculated separately for Tab4 and Tab5, that is, twice in this figure.

[Figure 1]

After the materialized view MV1 is created, the logic of rhombuses and circles is calculated only once. This saves computing resources and improves the computation speed.

[Figure 2]
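The saving described above can be made concrete with a small counting sketch. The names tab4, tab5, and mv1 follow the figure, while the shared logic and the final per-table steps are hypothetical stand-ins chosen only so that the code runs.

```python
calls = {"shared": 0}

def shared_logic(rows):
    """Stand-in for the logic both jobs share (the rhombus and circle steps)."""
    calls["shared"] += 1
    return [r * 2 for r in rows if r % 2 == 0]

def build_tab4(intermediate):
    """Hypothetical final step of the job that produces Tab4."""
    return sum(intermediate)

def build_tab5(intermediate):
    """Hypothetical final step of the job that produces Tab5."""
    return max(intermediate)

source = [1, 2, 3, 4, 5, 6]

# Without a materialized view: each job recomputes the shared logic.
tab4 = build_tab4(shared_logic(source))
tab5 = build_tab5(shared_logic(source))
calls_without_mv = calls["shared"]

# With MV1: the shared logic runs once and both jobs read its stored result.
calls["shared"] = 0
mv1 = shared_logic(source)
tab4_from_mv = build_tab4(mv1)
tab5_from_mv = build_tab5(mv1)
calls_with_mv = calls["shared"]

print(calls_without_mv, calls_with_mv)
```

Both variants produce the same Tab4 and Tab5 values. Only the number of times the shared logic runs changes, and the gap widens as more downstream jobs reuse the same result.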

Intelligent Data Modeling

In traditional big data processing, the first step is for data analysis experts with both technical and business knowledge to build a data warehouse and divide it into layers. A typical model consists of an operational data store (ODS) layer, a data warehouse detail (DWD) layer, a data warehouse service (DWS) layer, and an application data service (ADS) layer. However, traditional modeling methods have the following drawbacks:

  1. The quality of the model directly affects the accuracy of calculations and heavily relies on the expertise of the modeling professionals.
  2. As businesses evolve and data volumes increase, the existing model inevitably becomes unsuitable. Modifying the model then affects all existing tasks.
  3. Some models are built but are rarely or never used by users, resulting in wasted computing and storage resources.

[Figure 3]

With the Artificial Intelligence Recommendation of materialized views, users no longer need to rely on experts to model the data in advance. Modeling becomes intelligent and automatic: once users start working with the data, the backend analyzes the repeated computational logic, and MaxCompute recommends materialized views that can then be created, which makes modeling flexible and fast. Users do not need to worry about data storage or the efficiency of computing resources and can focus on business development. This feature is particularly beneficial for small and medium-sized companies, which can rely on the Artificial Intelligence Recommendation of materialized views provided by MaxCompute instead of hiring data modelers.

[Figure 4]

Data Report/Dashboards

This feature also accelerates users' BI reports and dashboards. MaxCompute automatically analyzes data that is frequently refreshed and recommends creating materialized views for it. With materialized views, the data required for reports or dashboards is pre-computed. When a report or dashboard is opened, MaxCompute automatically rewrites its queries to read from the materialized views, greatly reducing the response time.
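The rewriting step can be pictured with a toy, text-level sketch. It is an assumption-laden simplification: the MaxCompute optimizer works on query plans, not strings, and the view name mv_daily_sales and the sales table are hypothetical. The sketch only shows the principle that the report query stays unchanged while the engine swaps a matching subquery for the stored result.

```python
def normalize(sql):
    """Collapse whitespace and case so equivalent text compares equal.

    Toy only: lowercasing would also alter string literals in a real query.
    """
    return " ".join(sql.lower().split())

def rewrite(query, views):
    """Replace any parenthesized subquery that matches a view definition
    with the name of that view."""
    rewritten = normalize(query)
    for name, definition in views.items():
        rewritten = rewritten.replace(f"({normalize(definition)})", name)
    return rewritten

# Hypothetical materialized view and the report query that could reuse it.
views = {
    "mv_daily_sales": "SELECT dt, SUM(amount) AS total FROM sales GROUP BY dt",
}
report_query = """
    SELECT dt, total
    FROM (SELECT dt, SUM(amount) AS total
          FROM sales GROUP BY dt) t
    WHERE total > 100
"""

print(rewrite(report_query, views))
```

The dashboard keeps issuing the original query. Only the executed form changes, so the aggregation over the sales table is no longer repeated on every refresh.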

How to Use the Artificial Intelligence Recommendation of Materialized Views

To use the Artificial Intelligence Recommendation of materialized views, follow these steps:

  1. Log in to the MaxCompute console. In the left-side navigation pane, click Materialized Views.
  2. On the Materialized Views page, go to the Settings tab, enable Intelligent Analysis, and add the name of the project to be analyzed.
  3. On the following day (T+1), check the Materialized View Recommendations tab to view the common subqueries that the system recommends based on user behavior.
  4. Select the corresponding subquery to create a materialized view.
  5. On the day after the materialized view is created (T+1), check the Materialized View Management tab to see which queries use the materialized view and to compare the results before and after it was introduced.

Example of Artificial Intelligence Recommendation of Materialized Views

The data middle platform team at Alibaba Group is responsible for building the common layer of the entire Alibaba data warehouse. Their goal is to consolidate the logic of repeated calculations, allowing multiple downstream businesses to access the same result table and save computing and storage resources. However, with the rapid growth of data volume and business complexity, it has become challenging for the traditional common layer to maintain its original state. The main reasons for this are as follows:

• Difficulty in identifying numbers.
• Similar logic, but the result table is not fully accessible.
• Difficulty in manually identifying common logic.

The Artificial Intelligence Recommendation feature of materialized views provided by MaxCompute can address these challenges. The data middle platform team converts the recommendations into materialized views, significantly reducing repeated calculations between downstream jobs and saving computing resources.

Instructions for the Artificial Intelligence Recommendation of Materialized Views

Materialized views benefit users in most cases, but they do not solve every problem. Keep the following points in mind:

  1. Check whether data expansion occurs when a common subquery is materialized, that is, whether the materialized result is larger than the data it is computed from. If the result is several times larger or more, using a materialized view is not recommended.
  2. Post-paid users should note that materialized views currently save computing resources and reduce query complexity but do not necessarily reduce the amount of data scanned. If data expansion occurs during materialization, the amount of data scanned may even increase.
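The two checks above can be expressed as a small helper. This is a minimal sketch, assuming that the sizes of the source data and of the materialized result are already known in bytes. The default threshold of 2.0 is a hypothetical reading of "several times" and should be tuned to the workload.

```python
def expansion_ratio(source_bytes, materialized_bytes):
    """Return how many times larger the materialized result is than its source."""
    if source_bytes <= 0:
        raise ValueError("source_bytes must be positive")
    return materialized_bytes / source_bytes

def worth_materializing(source_bytes, materialized_bytes, max_ratio=2.0):
    """Accept a recommendation only if the result stays below `max_ratio`
    times the source size. `max_ratio` is an assumed cutoff, not a value
    documented by MaxCompute."""
    return expansion_ratio(source_bytes, materialized_bytes) < max_ratio

# A join that multiplies rows: 100 GB of input becomes 300 GB of output.
GB = 1024 ** 3
print(expansion_ratio(100 * GB, 300 * GB))      # 3.0
print(worth_materializing(100 * GB, 300 * GB))  # False

# An aggregation that shrinks the data: 100 GB becomes 5 GB.
print(worth_materializing(100 * GB, 5 * GB))    # True
```

For post-paid users the same ratio hints at scan cost: a query rewritten to read a result three times larger than the source may scan more data than the original query did.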