PolarDB for PostgreSQL allows you to determine whether to use ePQ based on table size or the costs of execution plans.
Prerequisites
The feature is supported on the PolarDB for PostgreSQL clusters that run the following engine:
PostgreSQL 14 (revision version 14.9.15.0 or later)
You can execute the following statement to view the minor engine version of your PolarDB for PostgreSQL cluster:
select version();
Background information
ePQ is good at processing complex OLAP queries with long execution time, but is not suitable for simple OLTP queries with short execution time. For short-lived queries, the overheads for establishing connections, exchanging data, and releasing connections between compute nodes compromise performance. PolarDB for PostgreSQL allows you to determine whether to use ePQ based on table size or the costs of execution plans, so that you can select an appropriate execution mode for better performance.
Precautions
You can specify the cost threshold and table size threshold only for the following objects or operations:
Common tables
Partitioned tables
Materialized views
Queries without ePQ hints (
/*+ PX() */
)
Parameters
Parameter | Description |
polar_px_min_pg_plan_cost | The minimum cost of the execution plan for enabling ePQ. Valid values: 0 to 999999999999. Default value: 50000. ePQ is disabled if the cost for the execution plan cost on a single node is less than this threshold. |
polar_px_min_table_scan_size | The minimum table size for enabling ePQ. Valid values: 0 to 2147483647. Default value: 100 MB. ePQ is disabled if the size of all tables involved in the query is less than this threshold. |
polar_px_force_use | Specifies whether to force ePQ in queries. Valid values:
|
Usage
Prepare data.
CREATE TABLE test (id INT); INSERT INTO test SELECT generate_series(1, 1000); ANALYZE test;
Query the cost of the execution plan.
Explicitly disable ePQ and query the execution plan on a single node.
SET polar_enable_px TO OFF; EXPLAIN SELECT COUNT(*) FROM test; QUERY PLAN -------------------------------------------------------------- Aggregate (cost=17.50..17.51 rows=1 width=8) -> Seq Scan on test (cost=0.00..15.00 rows=1000 width=0) (2 rows)
In the preceding execution plan,
cost=xxx..xxx
in each line indicates the estimated execution cost of the current operator. The startup cost is displayed before..
and the total cost is displayed after it. The total cost of the topmost operator can be considered as the total execution cost of this query. For example,17.51
in the preceding execution plan is the total execution cost. Because the total cost of the execution plan on a single node is lower than the cost threshold defined by thepolar_px_min_pg_plan_cost
parameter, ePQ is not used even if ePQ is enabled.SET polar_enable_px TO ON; EXPLAIN SELECT COUNT(*) FROM test; QUERY PLAN -------------------------------------------------------------- Aggregate (cost=17.50..17.51 rows=1 width=8) -> Seq Scan on test (cost=0.00..15.00 rows=1000 width=0) (2 rows)
Query the table size.
You can query the table size by using the
pg_relation_size
function.SELECT pg_size_pretty(pg_relation_size('test')); pg_size_pretty ---------------- 40 kB (1 row)
The preceding result shows that the size of the
test
table is 40 kB, which is lower than the table size threshold defined by thepolar_px_min_table_scan_size
parameter. Therefore, ePQ is not used in any query that only references thetest
table even if ePQ is enabled.SET polar_enable_px TO ON; EXPLAIN SELECT COUNT(*) FROM test AS a, test AS b, test AS c WHERE a.id > b.id AND b.id > c.id; QUERY PLAN ---------------------------------------------------------------------------------- Aggregate (cost=5292822.50..5292822.51 rows=1 width=8) -> Nested Loop (cost=0.00..5015045.00 rows=111111000 width=0) Join Filter: (b.id > c.id) -> Nested Loop (cost=0.00..15032.50 rows=333333 width=4) Join Filter: (a.id > b.id) -> Seq Scan on test a (cost=0.00..15.00 rows=1000 width=4) -> Materialize (cost=0.00..20.00 rows=1000 width=4) -> Seq Scan on test b (cost=0.00..15.00 rows=1000 width=4) -> Materialize (cost=0.00..20.00 rows=1000 width=4) -> Seq Scan on test c (cost=0.00..15.00 rows=1000 width=4) (10 rows)
Ignore the specified thresholds.
To ignore the cost threshold and table size threshold, you can force ePQ in queries by using the
polar_px_force_use
parameter.SET polar_enable_px TO ON; SET polar_px_force_use TO ON; EXPLAIN SELECT COUNT(*) FROM test; QUERY PLAN ------------------------------------------------------------------------------------- Finalize Aggregate (cost=0.00..431.00 rows=1 width=8) -> PX Coordinator 6:1 (slice1; segments: 6) (cost=0.00..431.00 rows=1 width=8) -> Partial Aggregate (cost=0.00..431.00 rows=1 width=8) -> Partial Seq Scan on test (cost=0.00..431.00 rows=167 width=1) Optimizer: PolarDB PX Optimizer (5 rows)
Insert data and query the cost of the execution plan.
Insert data into the table to meet the conditions for enabling ePQ.
INSERT INTO test SELECT generate_series(1, 10000000); ANALYZE test;
The size of the
test
table exceeds the minimum table size defined by thepolar_px_min_table_scan_size
parameter.SELECT pg_size_pretty(pg_relation_size('test')); pg_size_pretty ---------------- 348 MB (1 row)
The cost of the execution plan also exceeds the minimum cost defined by the
polar_px_min_pg_plan_cost
parameter.SET polar_enable_px TO OFF; EXPLAIN SELECT COUNT(*) FROM test; QUERY PLAN ----------------------------------------------------------------------------------------- Finalize Aggregate (cost=97621.42..97621.43 rows=1 width=8) -> Gather (cost=97621.21..97621.42 rows=2 width=8) Workers Planned: 2 -> Partial Aggregate (cost=96621.21..96621.22 rows=1 width=8) -> Parallel Seq Scan on test (cost=0.00..86205.77 rows=4166177 width=0) (5 rows)
If ePQ is enabled, ePQ is used for the preceding query.
SET polar_enable_px TO ON; SET polar_px_force_use TO OFF; EXPLAIN SELECT COUNT(*) FROM test; QUERY PLAN -------------------------------------------------------------------------------------- Finalize Aggregate (cost=0.00..470.76 rows=1 width=8) -> PX Coordinator 6:1 (slice1; segments: 6) (cost=0.00..470.76 rows=1 width=8) -> Partial Aggregate (cost=0.00..470.76 rows=1 width=8) -> Partial Seq Scan on test (cost=0.00..467.66 rows=1666471 width=1) Optimizer: PolarDB PX Optimizer (5 rows)