All Products
Search
Document Center

PolarDB:Determine whether to use ePQ based on table size or costs

Last Updated:Apr 12, 2024

PolarDB for PostgreSQL allows you to determine whether to use ePQ based on table size or the costs of execution plans.

Prerequisites

The feature is supported on the PolarDB for PostgreSQL clusters that run the following engine:

PostgreSQL 14 (revision version 14.9.15.0 or later)

Note

You can execute the following statement to view the minor engine version of your PolarDB for PostgreSQL cluster:

select version();

Background information

ePQ is good at processing complex OLAP queries with long execution time, but is not suitable for simple OLTP queries with short execution time. For short-lived queries, the overheads for establishing connections, exchanging data, and releasing connections between compute nodes compromise performance. PolarDB for PostgreSQL allows you to determine whether to use ePQ based on table size or the costs of execution plans, so that you can select an appropriate execution mode for better performance.

Precautions

You can specify the cost threshold and table size threshold only for the following objects or operations:

  • Common tables

  • Partitioned tables

  • Materialized views

  • Queries without ePQ hints (/*+ PX() */)

Parameters

Parameter

Description

polar_px_min_pg_plan_cost

The minimum cost of the execution plan for enabling ePQ. Valid values: 0 to 999999999999. Default value: 50000. ePQ is disabled if the cost for the execution plan cost on a single node is less than this threshold.

polar_px_min_table_scan_size

The minimum table size for enabling ePQ. Valid values: 0 to 2147483647. Default value: 100 MB. ePQ is disabled if the size of all tables involved in the query is less than this threshold.

polar_px_force_use

Specifies whether to force ePQ in queries. Valid values:

  • on

  • off (default)

Usage

  1. Prepare data.

    CREATE TABLE test (id INT);
    INSERT INTO test SELECT generate_series(1, 1000);
    ANALYZE test;
  2. Query the cost of the execution plan.

    Explicitly disable ePQ and query the execution plan on a single node.

    SET polar_enable_px TO OFF;
    EXPLAIN SELECT COUNT(*) FROM test;
                              QUERY PLAN
    --------------------------------------------------------------
     Aggregate  (cost=17.50..17.51 rows=1 width=8)
       ->  Seq Scan on test  (cost=0.00..15.00 rows=1000 width=0)
    (2 rows)

    In the preceding execution plan, cost=xxx..xxx in each line indicates the estimated execution cost of the current operator. The startup cost is displayed before .. and the total cost is displayed after it. The total cost of the topmost operator can be considered as the total execution cost of this query. For example, 17.51 in the preceding execution plan is the total execution cost. Because the total cost of the execution plan on a single node is lower than the cost threshold defined by the polar_px_min_pg_plan_cost parameter, ePQ is not used even if ePQ is enabled.

    SET polar_enable_px TO ON;
    EXPLAIN SELECT COUNT(*) FROM test;
                              QUERY PLAN
    --------------------------------------------------------------
     Aggregate  (cost=17.50..17.51 rows=1 width=8)
       ->  Seq Scan on test  (cost=0.00..15.00 rows=1000 width=0)
    (2 rows)
  3. Query the table size.

    You can query the table size by using the pg_relation_size function.

    SELECT pg_size_pretty(pg_relation_size('test'));
     pg_size_pretty
    ----------------
     40 kB
    (1 row)

    The preceding result shows that the size of the test table is 40 kB, which is lower than the table size threshold defined by the polar_px_min_table_scan_size parameter. Therefore, ePQ is not used in any query that only references the test table even if ePQ is enabled.

    SET polar_enable_px TO ON;
    EXPLAIN SELECT COUNT(*) FROM test AS a, test AS b, test AS c WHERE a.id > b.id AND b.id > c.id;
                                        QUERY PLAN
    ----------------------------------------------------------------------------------
     Aggregate  (cost=5292822.50..5292822.51 rows=1 width=8)
       ->  Nested Loop  (cost=0.00..5015045.00 rows=111111000 width=0)
             Join Filter: (b.id > c.id)
             ->  Nested Loop  (cost=0.00..15032.50 rows=333333 width=4)
                   Join Filter: (a.id > b.id)
                   ->  Seq Scan on test a  (cost=0.00..15.00 rows=1000 width=4)
                   ->  Materialize  (cost=0.00..20.00 rows=1000 width=4)
                         ->  Seq Scan on test b  (cost=0.00..15.00 rows=1000 width=4)
             ->  Materialize  (cost=0.00..20.00 rows=1000 width=4)
                   ->  Seq Scan on test c  (cost=0.00..15.00 rows=1000 width=4)
    (10 rows)
  4. Ignore the specified thresholds.

    To ignore the cost threshold and table size threshold, you can force ePQ in queries by using the polar_px_force_use parameter.

    SET polar_enable_px TO ON;
    SET polar_px_force_use TO ON;
    EXPLAIN SELECT COUNT(*) FROM test;
                                         QUERY PLAN
    -------------------------------------------------------------------------------------
     Finalize Aggregate  (cost=0.00..431.00 rows=1 width=8)
       ->  PX Coordinator 6:1  (slice1; segments: 6)  (cost=0.00..431.00 rows=1 width=8)
             ->  Partial Aggregate  (cost=0.00..431.00 rows=1 width=8)
                   ->  Partial Seq Scan on test  (cost=0.00..431.00 rows=167 width=1)
     Optimizer: PolarDB PX Optimizer
    (5 rows)
  5. Insert data and query the cost of the execution plan.

    Insert data into the table to meet the conditions for enabling ePQ.

    INSERT INTO test SELECT generate_series(1, 10000000);
    ANALYZE test;

    The size of the test table exceeds the minimum table size defined by the polar_px_min_table_scan_size parameter.

    SELECT pg_size_pretty(pg_relation_size('test'));
     pg_size_pretty
    ----------------
     348 MB
    (1 row)

    The cost of the execution plan also exceeds the minimum cost defined by the polar_px_min_pg_plan_cost parameter.

    SET polar_enable_px TO OFF;
    EXPLAIN SELECT COUNT(*) FROM test;
                                           QUERY PLAN
    -----------------------------------------------------------------------------------------
     Finalize Aggregate  (cost=97621.42..97621.43 rows=1 width=8)
       ->  Gather  (cost=97621.21..97621.42 rows=2 width=8)
             Workers Planned: 2
             ->  Partial Aggregate  (cost=96621.21..96621.22 rows=1 width=8)
                   ->  Parallel Seq Scan on test  (cost=0.00..86205.77 rows=4166177 width=0)
    (5 rows)

    If ePQ is enabled, ePQ is used for the preceding query.

    SET polar_enable_px TO ON;
    SET polar_px_force_use TO OFF;
    EXPLAIN SELECT COUNT(*) FROM test;
                                          QUERY PLAN
    --------------------------------------------------------------------------------------
     Finalize Aggregate  (cost=0.00..470.76 rows=1 width=8)
       ->  PX Coordinator 6:1  (slice1; segments: 6)  (cost=0.00..470.76 rows=1 width=8)
             ->  Partial Aggregate  (cost=0.00..470.76 rows=1 width=8)
                   ->  Partial Seq Scan on test  (cost=0.00..467.66 rows=1666471 width=1)
     Optimizer: PolarDB PX Optimizer
    (5 rows)