This topic describes the real-time materialized views of PolarDB for PostgreSQL.
Prerequisites
Your PolarDB for PostgreSQL cluster runs one of the following engine versions:
PostgreSQL 14 (revision version 14.8.11.0 or later)
PostgreSQL 11 (revision version 1.1.27 or later)
You can execute one of the following statements to view the minor version of your PolarDB for PostgreSQL cluster:
PostgreSQL 14
select version();
PostgreSQL 11
show polar_version;
Background information
Unlike common views, materialized views can store query results. In complex query scenarios, using materialized views to save query results can significantly improve query efficiency. However, the data of materialized views does not change with the data in the base tables. This means that you may not always obtain the most up-to-date results when you use materialized views for querying.
To address this issue, PolarDB introduces the concept of real-time materialized views. Compared with materialized views, real-time materialized views have the following benefits:
Real-time materialized views support statement-level updates. After DML statements (INSERT, DELETE, and UPDATE) are executed on base tables, data in the materialized views is automatically updated to remain consistent with data in the base tables.
Real-time materialized views make maximum use of the incremental data in the base tables. When you refresh a common materialized view, the entire view query is executed again. A real-time materialized view applies only the changes, so it performs better than a common materialized view that is refreshed frequently.
Real-time materialized views can greatly improve query performance and ensure data consistency with base tables.
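The difference between the two kinds of views can be illustrated with a short sketch. The table and view names (sales, sales_mv, sales_rt_mv) are hypothetical, and the sketch assumes that the polar_ivm plug-in described later in this topic is already created in the cluster.

```sql
-- Hypothetical base table used only for illustration.
CREATE TABLE sales (id INT PRIMARY KEY, region VARCHAR, amount INT);
INSERT INTO sales VALUES (1, 'east', 10), (2, 'west', 20);

-- Common materialized view: stores the result as of creation or the last refresh.
CREATE MATERIALIZED VIEW sales_mv AS
SELECT region, SUM(amount) AS total FROM sales GROUP BY region;

-- Real-time materialized view: same query, maintained after each DML statement.
CREATE MATERIALIZED VIEW sales_rt_mv REFRESH FAST ON COMMIT AS
SELECT region, SUM(amount) AS total FROM sales GROUP BY region;

INSERT INTO sales VALUES (3, 'east', 5);

-- The common view still holds the old total for 'east' until it is refreshed.
SELECT * FROM sales_mv ORDER BY region;

-- The real-time view is expected to already reflect the new row.
SELECT * FROM sales_rt_mv ORDER BY region;

-- Bringing the common view up to date re-runs the whole query.
REFRESH MATERIALIZED VIEW sales_mv;
```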
Terms
Base table: a common table used in the definition of a materialized view.
Delta: a collection of data that is added or removed when the data in the base table changes, compared with the data in the materialized view.
Refresh: maintains a materialized view so that the data in the materialized view is consistent with the data obtained by querying the current base table based on the view definition.
Apply Delta: inserts the calculated incremental data into a real-time materialized view, or deletes it from the view, to maintain data consistency between the real-time materialized view and the base table.
Limits
Real-time materialized views are subject to the following limits:
The base table must be a common table instead of a partitioned table or an inherited table.
Only INNER JOIN is supported. Other JOIN types are not supported.
Only IMMUTABLE functions are supported.
Only view definitions that contain simple queries, projections, DISTINCT, and specific aggregate functions are supported. View definitions that contain complex queries are not supported. Complex queries include subqueries, [NOT] EXISTS, [NOT] IN, LIMIT, HAVING, DISTINCT ON, WITH (CTE), ORDER BY, window functions, GROUPING SETS, CUBE, ROLLUP, UNION, INTERSECT, and EXCEPT.
When the GROUP BY clause is used, the grouping columns specified in the GROUP BY clause must be in the projection.
Only the following built-in aggregate functions are supported: MIN, MAX, SUM, AVG, and COUNT.
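As a rough illustration of these limits, the following sketch shows one view definition that appears to stay within them and, in comments, two that do not. The table and view names are hypothetical. The rejected definitions are left commented out because the exact error text is not documented here.

```sql
-- Hypothetical common (non-partitioned, non-inherited) base tables.
CREATE TABLE customers (id INT PRIMARY KEY, name VARCHAR);
CREATE TABLE orders (id INT PRIMARY KEY, customer_id INT, amount INT);

-- Within the limits: INNER JOIN, supported aggregates (COUNT, SUM),
-- and every GROUP BY column also appears in the projection.
CREATE MATERIALIZED VIEW customer_totals REFRESH FAST ON COMMIT AS
SELECT c.id, c.name, COUNT(*) AS order_count, SUM(o.amount) AS total
FROM customers c
INNER JOIN orders o ON o.customer_id = c.id
GROUP BY c.id, c.name;

-- Outside the limits (expected to be rejected): LEFT JOIN is not INNER JOIN.
-- CREATE MATERIALIZED VIEW customer_totals_left REFRESH FAST ON COMMIT AS
-- SELECT c.id, SUM(o.amount) AS total
-- FROM customers c LEFT JOIN orders o ON o.customer_id = c.id
-- GROUP BY c.id;

-- Outside the limits (expected to be rejected): HAVING is a complex query.
-- CREATE MATERIALIZED VIEW big_customers REFRESH FAST ON COMMIT AS
-- SELECT customer_id, SUM(amount) AS total
-- FROM orders GROUP BY customer_id HAVING SUM(amount) > 100;
```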
Performance degradation
Real-time materialized views greatly improve query performance but have a significant impact on the write performance of the base tables. If the number of read operations is greater than the number of write operations, we recommend that you use real-time materialized views.
The impact of real-time materialized views on the write performance of the base tables depends on factors such as the view definitions and the write loads, structures, and indexes of the base tables. Before you create real-time materialized views in the production environment, we recommend that you first test the impact of real-time materialized views on the write performance of the base tables in the test environment. You can use real-time materialized views in the production environment if the write performance meets the requirements.
The following methods can be used to reduce the costs of real-time materialized views:
Create as few real-time materialized views as possible on the same base table.
Batch write data to the base table. For example, you can execute the COPY or INSERT INTO SELECT statement to batch import data.
Create primary keys for all base tables, and include the primary keys of all base tables in the projected columns of the real-time materialized view definitions.
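The last two recommendations can be sketched as follows. All object names are hypothetical. Because real-time materialized views are updated at the statement level, one batch statement should result in a single maintenance pass instead of one pass per row, although the actual gain depends on the workload.

```sql
-- Hypothetical base tables, each with a primary key.
CREATE TABLE products (id INT PRIMARY KEY, name VARCHAR);
CREATE TABLE stock (id INT PRIMARY KEY, product_id INT, qty INT);

-- The projection includes the primary key of every base table
-- (products.id and stock.id).
CREATE MATERIALIZED VIEW product_stock REFRESH FAST ON COMMIT AS
SELECT p.id AS product_id, s.id AS stock_id, p.name, s.qty
FROM products p
INNER JOIN stock s ON s.product_id = p.id;

-- Batch write: one INSERT INTO ... SELECT statement per table
-- instead of 1000 single-row INSERT statements.
INSERT INTO products
SELECT g, 'product ' || g FROM generate_series(1, 1000) AS g;

INSERT INTO stock
SELECT g, g, 10 FROM generate_series(1, 1000) AS g;
```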
How it works
Create a real-time materialized view
Rewrite the query of the materialized view and calculate the hidden columns required to maintain the real-time materialized view.
Create a trigger for the base table to refresh the real-time materialized view.
Create unique indexes for the real-time materialized view when specified conditions are met to accelerate delta refresh.
Refresh a real-time materialized view
Data changes in the base table activate the trigger.
Obtain incremental data from the base table by using the trigger.
Calculate the delta of the real-time materialized view based on the definitions and the incremental data of the current base table.
Apply the calculated incremental data to the real-time materialized view to implement delta refresh.
Delete a real-time materialized view
Delete the delta refresh trigger from the base table of the real-time materialized view.
Delete the real-time materialized view.
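The create and refresh steps above can be approximated by hand in plain PostgreSQL, which may help when reasoning about the write overhead. The following sketch is not the polar_ivm implementation. It is a simplified stand-in that handles only INSERT, uses hypothetical names (payments, payments_summary, payments_apply_insert_delta), and keeps the row count as an ordinary column, whereas a real-time materialized view keeps such bookkeeping in hidden columns.

```sql
CREATE TABLE payments (id INT PRIMARY KEY, customer INT NOT NULL, amount INT NOT NULL);

-- Hand-maintained stand-in for a view defined as:
--   SELECT customer, COUNT(*), SUM(amount) FROM payments GROUP BY customer
-- The primary key plays the role of the unique index that accelerates delta refresh.
CREATE TABLE payments_summary (
    customer INT PRIMARY KEY,
    cnt      BIGINT NOT NULL,
    total    BIGINT NOT NULL
);

CREATE FUNCTION payments_apply_insert_delta() RETURNS trigger
LANGUAGE plpgsql AS $$
BEGIN
    -- new_rows contains only the rows added by the triggering statement.
    -- Aggregating it yields the delta, which is then applied to the summary.
    INSERT INTO payments_summary AS s (customer, cnt, total)
    SELECT customer, COUNT(*), SUM(amount)
    FROM new_rows
    GROUP BY customer
    ON CONFLICT (customer) DO UPDATE
        SET cnt   = s.cnt + EXCLUDED.cnt,
            total = s.total + EXCLUDED.total;
    RETURN NULL;
END;
$$;

-- Statement-level trigger: fires once per INSERT statement, not once per row.
CREATE TRIGGER payments_insert_delta
AFTER INSERT ON payments
REFERENCING NEW TABLE AS new_rows
FOR EACH STATEMENT
EXECUTE FUNCTION payments_apply_insert_delta();

INSERT INTO payments VALUES (1, 10, 5), (2, 10, 7), (3, 20, 1);

-- Returns (10, 2, 12) and (20, 1, 1).
SELECT * FROM payments_summary ORDER BY customer;
```

A complete version would also need to handle DELETE and UPDATE through an OLD TABLE transition relation. Aggregates such as MIN and MAX may require recomputing a group when its current extreme value is removed, which is one reason the write cost depends on the view definition.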
Usage notes
Prerequisites
Before you create real-time materialized views for a cluster, you must update the minor engine version of the cluster to a supported version (14.8.11.0 or later for PostgreSQL 14, or 1.1.27 or later for PostgreSQL 11) and create the polar_ivm plug-in in the cluster.
CREATE EXTENSION polar_ivm WITH SCHEMA pg_catalog;
Note: You can execute the following statement to query the minor version of your PolarDB for PostgreSQL cluster:
show polar_version;
Create a real-time materialized view
CREATE MATERIALIZED VIEW table_name[ (column_name [, ...] ) ] REFRESH FAST ON COMMIT AS query [ WITH [ NO ] DATA ]
The following table describes the parameters.
| Parameter | Description |
| --- | --- |
| table_name | The name of the real-time materialized view, which can be schema-qualified. |
| column_name | The name of a column in the real-time materialized view. If no column name is provided, it is obtained from the queried column names. |
| WITH DATA | The default option. A complete real-time materialized view is immediately created. |
| WITH NO DATA | Only the structure of the real-time materialized view is created, but no data is populated. No real-time refresh is performed on the real-time materialized view. When you query the real-time materialized view, an error is reported before the REFRESH MATERIALIZED VIEW statement is executed on the real-time materialized view. |
| query | The definition of the real-time materialized view, which can be a SELECT statement, a TABLE expression, or a VALUES expression. The query runs within a secure and restricted operation. |
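For instance, a statement that uses a schema-qualified name, explicit column names, and the default WITH DATA option might look like the following sketch. The table readings and the view sensor_stats are hypothetical, and the polar_ivm plug-in is assumed to be created already.

```sql
-- Hypothetical base table.
CREATE TABLE readings (sensor INT, value INT);

-- Schema-qualified view name with explicit column names.
-- Without the column list, the names would come from the query output columns.
CREATE MATERIALIZED VIEW public.sensor_stats (sensor_id, lowest, highest)
REFRESH FAST ON COMMIT AS
SELECT sensor, MIN(value), MAX(value)
FROM readings
GROUP BY sensor
WITH DATA;
```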
Refresh a real-time materialized view
REFRESH MATERIALIZED VIEW table_name
Note: table_name is the name of the real-time materialized view.
For a real-time materialized view created by using the WITH DATA parameter, no manual refresh is required to ensure data consistency between the view and the base table.
For a real-time materialized view created by using the WITH NO DATA parameter, a manual refresh generates data based on the definition and enables real-time refresh for subsequent modifications of the base table.
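The WITH NO DATA case can be sketched as follows, using the hypothetical names visits and page_hits. The comments restate the behavior described in this topic and are not captured output.

```sql
-- Hypothetical base table with some existing rows.
CREATE TABLE visits (page VARCHAR, hits INT);
INSERT INTO visits VALUES ('home', 3), ('docs', 5);

-- Only the structure is created. The view is not populated or maintained yet.
CREATE MATERIALIZED VIEW page_hits REFRESH FAST ON COMMIT AS
SELECT page, SUM(hits) AS total FROM visits GROUP BY page
WITH NO DATA;

-- Querying page_hits at this point is expected to report an error.

-- The manual refresh populates the view and enables real-time refresh.
REFRESH MATERIALIZED VIEW page_hits;

-- Later changes to the base table should now be reflected without a manual refresh.
INSERT INTO visits VALUES ('home', 1);
SELECT * FROM page_hits ORDER BY page;
```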
Delete a real-time materialized view
DROP MATERIALIZED VIEW [ IF EXISTS ] table_name [, ...] [ CASCADE | RESTRICT ]
The following table describes the parameters.
| Parameter | Description |
| --- | --- |
| IF EXISTS | If the real-time materialized view does not exist, a notice is returned instead of an error. |
| table_name | The name of the real-time materialized view, which can be schema-qualified. |
| CASCADE | The objects that depend on the real-time materialized view, such as other materialized views or regular views, are automatically deleted, and then all objects that depend on those deleted objects are also deleted. |
| RESTRICT | If any object depends on the real-time materialized view, the real-time materialized view is not deleted. This is the default behavior. |
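The effect of these parameters can be seen with a view that has a dependent object. The names items, item_prices, and cheap_items in this sketch are hypothetical.

```sql
-- Hypothetical base table, real-time materialized view, and dependent regular view.
CREATE TABLE items (id INT PRIMARY KEY, price INT);

CREATE MATERIALIZED VIEW item_prices REFRESH FAST ON COMMIT AS
SELECT id, price FROM items;

CREATE VIEW cheap_items AS
SELECT * FROM item_prices WHERE price < 10;

-- RESTRICT is the default, so this statement fails while cheap_items exists.
-- It is commented out so that the script can run to completion.
-- DROP MATERIALIZED VIEW item_prices;

-- CASCADE drops item_prices together with the dependent view cheap_items.
DROP MATERIALIZED VIEW IF EXISTS item_prices CASCADE;

-- The view no longer exists. IF EXISTS turns the error into a notice.
DROP MATERIALIZED VIEW IF EXISTS item_prices;
```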
Performance test
Create the dependency plug-in for the real-time materialized view.
CREATE EXTENSION IF NOT EXISTS polar_ivm WITH SCHEMA pg_catalog;
Create the base table and populate the table with data.
CREATE TABLE t (a INT, b VARCHAR);
INSERT INTO t VALUES (1,'a'), (2,'b'), (3,'c'), (4,'d'), (5,'e');
Create the real-time materialized view.
CREATE MATERIALIZED VIEW mv REFRESH FAST ON COMMIT AS SELECT max(a),min(a),b FROM t GROUP BY b;
Execute a DML statement on the base table.
Query the data of the real-time materialized view.
SELECT * FROM mv ORDER BY b;
Sample result:
 max | min | b
-----+-----+---
   1 |   1 | a
   2 |   2 | b
   3 |   3 | c
   4 |   4 | d
   5 |   5 | e
(5 rows)
The result shows that the data of the real-time materialized view is consistent with the data of the base table.
Insert new data into the base table and query the data of the real-time materialized view.
INSERT INTO t VALUES (6,'f');
SELECT * FROM mv ORDER BY b;
Sample result:
 max | min | b
-----+-----+---
   1 |   1 | a
   2 |   2 | b
   3 |   3 | c
   4 |   4 | d
   5 |   5 | e
   6 |   6 | f
(6 rows)
The result shows that the data of the real-time materialized view is consistent with the data of the base table.
Delete the data of the base table and query the data of the real-time materialized view.
DELETE FROM t WHERE a = 2;
SELECT * FROM mv ORDER BY b;
Sample result:
 max | min | b
-----+-----+---
   1 |   1 | a
   3 |   3 | c
   4 |   4 | d
   5 |   5 | e
   6 |   6 | f
(5 rows)
The result shows that the data of the real-time materialized view is consistent with the data of the base table.
Update the data of the base table and query the data of the real-time materialized view.
UPDATE t SET a = a + 1;
SELECT * FROM mv ORDER BY b;
Sample result:
 max | min | b
-----+-----+---
   2 |   2 | a
   4 |   4 | c
   5 |   5 | d
   6 |   6 | e
   7 |   7 | f
(5 rows)
The result shows that the data of the real-time materialized view is consistent with the data of the base table.
Delete the real-time materialized view.
DROP MATERIALIZED VIEW mv;