Best Practices

0.0.201

PolarDB for PostgreSQL supports the Multi-node Elastic Parallel Query (ePQ) feature. This feature can efficiently handle lightweight analytical queries and meet the increasing demand for hybrid transaction/analytical processing (HTAP).

Overview

When ePQ is used for queries, PolarDB for PostgreSQL uses the ePQ optimizer to generate an execution plan that can be executed by multiple compute nodes in parallel. The execution engine of ePQ coordinates the execution of the plan on multiple compute nodes, and uses the CPU, memory, and I/O bandwidth of multiple nodes to scan and compute data.

To achieve serverless elastic scaling, you can use Grand Unified Configuration (GUC) parameters to dynamically adjust the number of compute nodes that participate in ePQ (scale-out) and the degree of parallelism for each node (scale-up).

ePQ is suitable for complex online analytical processing (OLAP) queries that require a long execution duration but not for simple online transaction processing (OLTP) queries that require a short execution duration. In short queries, operations that are performed between compute nodes, such as connection establishment, data exchange, and connection shutdown can degrade query performance. PolarDB for PostgreSQL determines whether to use ePQ or single-node execution for queries based on the table size or the cost of an execution plan. This ensures that the query execution method with higher query performance is selected.

For more information about how ePQ works and performs, see PolarDB for PostgreSQL: ePQ architecture.

Enable ePQ

The following statements enable the ePQ feature for a table that is named t1. If PolarDB PX Optimizer is returned in the plan, ePQ is enabled.

SET polar_enable_px = 1;
EXPLAIN SELECT * FROM t1;
                                  QUERY PLAN
-------------------------------------------------------------------------------
 PX Coordinator 6:1  (slice1; segments: 6)  (cost=0.00..431.00 rows=1 width=8)
   ->  Partial Seq Scan on t1  (cost=0.00..431.00 rows=1 width=8)
 Optimizer: PolarDB PX Optimizer
(3 rows)

SELECT * FROM t1;

GUC parameters

Parameter	Description

Parameter	Description
polar_enable_px	Specifies whether to enable the ePQ feature. Valid values: on off (default) Note If this parameter is set to on, queries are preferentially sent to the ePQ optimizer to generate execution plans that can be executed in parallel on multiple compute nodes. All ePQ-related GUC parameters take effect.
polar_px_nodes	The names of the compute nodes that participate in ePQ. This parameter is empty by default, which indicates that all read-only nodes participate in ePQ. If you want only some read-only nodes to participate in ePQ, you can execute the following statements to obtain the names of the read-only nodes: `=> CREATE EXTENSION polar_monitor; CREATE EXTENSION => SELECT name,slot_name,type FROM polar_cluster_info WHERE type = 'Replica'; name \| slot_name \| type -------+-----------+--------- node2 \| replica1 \| Replica node3 \| replica2 \| Replica (2 rows)` Set the polar_px_nodes parameter to the obtained node names and separate the names with commas (,). `=> SET polar_px_nodes = 'node2,node3'; => SHOW polar_px_nodes; polar_px_nodes ---------------- node2,node3 (1 row)`
polar_px_dop_per_node	The number of worker processes that participate in ePQ on each compute node in the current session. Default value: 3. Note As a general best practice, we recommend that you set this parameter to half the number of CPU cores of the current node. If the CPU load of the current node is high, you can decrement the value of this parameter until the CPU utilization drops to or below 80%. If the query performance is low, you can increment the value of this parameter. However, the CPU utilization must not exceed 80%. Otherwise, other background processes in the system may be slowed down.
polar_px_max_workers_number	The maximum number of ePQ worker processes that can exist simultaneously on each compute node. Default value: 30. Note If the number of ePQ worker processes on a compute node exceeds this limit, a query error occurs. `ERROR: over px max workers, already xxx workers, max 30 workers` In this case, you can increase the value of this parameter to prevent similar errors. If you set this parameter to an excessively large value, an excessive number of processes may exist on the node, increasing the risk of out of memory (OOM) errors.
polar_px_wait_lock_timeout	The maximum amount of time that an ePQ process can block other processes using the same resources. Default value: 1800000. Unit: milliseconds (Half an hour). In most cases, ePQ processes are read-only queries that obtain shared locks on the tables being queried. However, some DDL operations require exclusive locks on the tables, causing the operations to be blocked by ePQ processes due to lock conflicts. After the DDL operations are blocked for the time specified by this parameter, ePQ queries are cancelled to free up resources for the processes performing the DDL operations. In most cases, ePQ is used to execute time-consuming analytical queries. Therefore, you must set the `statement_timeout` parameter to a value that allows for a reasonable response time. For example, the value `10,800,000` milliseconds, which is equivalent to 3 hours, may be appropriate for large and complex queries. If you use the default value `0` (unlimited time), the database connection used to execute the query may not be released for a long time due to the lengthy execution time of the query.
synchronous_commit	Specifies whether to ensure data consistency in elastic parallel queries. Valid values: on: ensures data consistency in elastic parallel queries. The database waits for write-ahead logging (WAL) records to be written to disks before returning a successful transaction commit to the client. off (default): does not ensure data consistency in elastic parallel queries.
polar_px_min_pg_plan_cost	The minimum cost of the execution plan for which ePQ is enabled. Valid values: 0 to 999999999999. Default value: 50000. The query whose single-node execution plan cost is lower than this threshold does not use ePQ.
polar_px_min_table_scan_size	The minimum size of the table for which ePQ is enabled. Valid values: 0 to 2147483647. Default value: 100 MB. If the size of all tables to be queried is less than this threshold, ePQ is not used.
polar_px_force_use	Specifies whether to forcibly use ePQ for queries. Valid values: on: forcibly uses ePQ for queries. off (default): does not forcibly use ePQ for queries.

Allow specific tables to use ePQ

If you only want to perform ePQ on a specific table, you can set polar_px_enable_check_workers to on. You also need to explicitly set the px_workers option for the table on which you need to perform ePQ.

ALTER TABLE t1 SET (px_workers = 1);

Note

Valid values for px_workers:

-1: disables ePQ on the table.
0 (default): ignores ePQ on the table.
1: enables ePQ on the table.

Enable the ePQ feature

Global level

In the PolarDB console, set polar_enable_px to on to enable ePQ globally. By default, ePQ is used for both transactional and analytical queries.

Use the following sample SQL statement to view the execution plan. If PolarDB PX Optimizer is returned in the plan, ePQ is enabled.

=> EXPLAIN SELECT * FROM t1;
                                  QUERY PLAN
-------------------------------------------------------------------------------
 PX Coordinator 6:1  (slice1; segments: 6)  (cost=0.00..431.00 rows=1 width=8)
   ->  Partial Seq Scan on t1  (cost=0.00..431.00 rows=1 width=8)
 Optimizer: PolarDB PX Optimizer
(3 rows)

Session level

Set polar_enable_px to ON within a session to enable ePQ for all queries executed within the session.

SET polar_enable_px = ON;

Database level and user level

After ePQ is enabled at the global or session level, the system automatically uses ePQ for all queries that qualify for parallel execution before using other execution methods. From a best practice perspective, ePQ is more suitable for long queries with heavy analytical workloads than for short transactional queries. For short queries, the overhead associated with establishing connections, exchanging data, and destroying connections between compute nodes during the use of ePQ may result in decreased query performance.

If ePQ is required for specific business needs, analytical SQL queries that require ePQ can be extracted from the main database and executed on a separate database configured for ePQ queries.

ALTER DATABASE ap_database SET polar_enable_px = ON;

Alternatively, you can use a specific account to execute analytical SQL queries that require ePQ.

ALTER ROLE ap_role SET polar_enable_px = ON;

Query level

If you want to use ePQ only for specific queries within a session, such as for nightly report tasks, you can use the pg_hint_plan plug-in and SQL hints to enable ePQ for those queries. To make SQL hints work, make sure that the pg_hint_plan plug-in has been added to the GUC parameter shared_preload_libraries.

Prepend /*+ PX() */ to a query to enable ePQ for the query.

=> /*+ PX() */ EXPLAIN SELECT * FROM t1;
                                    QUERY PLAN
----------------------------------------------------------------------------------
 PX Coordinator 6:1  (slice1; segments: 6)  (cost=0.00..431.03 rows=1000 width=8)
   ->  Partial Seq Scan on t1  (cost=0.00..431.00 rows=167 width=8)
 Optimizer: PolarDB PX Optimizer
(3 rows)

Prepend /*+ NoPX() */ to a query to disable ePQ for the query.

=> /*+ NoPX() */ EXPLAIN SELECT * FROM t1;
                      QUERY PLAN
------------------------------------------------------
 Seq Scan on t1  (cost=0.00..15.00 rows=1000 width=8)
(1 row)

Prepend /*+ PX(N) */ to a query to enable ePQ with a single-node parallelism of N. In the following example, the value of N is 6.

=> /*+ PX(6) */ EXPLAIN SELECT * FROM t1;
                                     QUERY PLAN
------------------------------------------------------------------------------------
 PX Coordinator 12:1  (slice1; segments: 12)  (cost=0.00..431.02 rows=1000 width=8)
   ->  Partial Seq Scan on t1  (cost=0.00..431.00 rows=84 width=8)
 Optimizer: PolarDB PX Optimizer
(3 rows)

Feedback

Previous: Determine whether to use ePQ based on table size or costsNext: Real-time materialized views

On this page （1, T）

Overview

Enable ePQ

GUC parameters

Best practices

Allow specific tables to use ePQ

Enable the ePQ feature

Chat now with Alibaba Cloud Customer Service to assist you in finding the right products and services to meet your needs.

Best Practices

Overview

Enable ePQ

GUC parameters

Best practices

Allow specific tables to use ePQ

Enable the ePQ feature

Global level

Session level

Database level and user level

Query level

Sales Support

Technical Support

Connect & Report Abuse

About Alibaba Cloud

Our Global Network

Quick Start

Global Offices

Olympic Games Paris 2024 New

Stade Roland Garros – Glitz from the Past New

Place de la Concorde – “Breaking” the Barriers New

Vaires-sur-Marne Nautical Stadium – Sports with Sustainability New

International Broadcast Center – Images, Sounds, and Data that Captivate Billions New

Customer Success Stories New

Trust Center

Security & Compliance Center

Cloud Compliance Resources

Security Compliance FAQs

Product & Feature Update New

Cloud Forward

Press Room

Alibaba Cloud e-Magazine New

Alibaba Cloud in Analyst Research

Notice

Go Global Service New

Go Global Alliance with Alibaba Cloud

Asia Accelerator Hot

Information Compliance

China Gateway - MLPS 2.0 Compliance New

China Gateway - Networking

China Gateway - Global Application Acceleration New

China Gateway - Security

China Gateway - Data Security New

ICP Support Hot

China Gateway - Omnichannel Data Mid-End New

China Gateway - Organizational Data Mid-End New

China Gateway - Business Mid-End New

China Gateway - AI Service for Conversational Chatbots New

China Gateway - Online Education

China Gateway - Domain Registration

Work at Alibaba Cloud

Experienced Professionals

Students and Graduates

Free Trial

Pricing

Promo Center

Price Reduction

Pay Less and Deploy More

FinOps

Elastic Compute Service (ECS)

Simple Application Server (SAS)

Elastic GPU Service

Elastic Desktop Service (EDS)

Object Storage Service (OSS)

Cloud Enterprise Network (CEN)

Web Application Firewall (WAF)

Domain Names

Container Compute Service (ACS)

Secure Access Service Edge (SASE)

Intelligent Media Services(IMS)

Edge Security Acceleration (ESA)(Original DCDN)

Intelligent Media Management

DingTalk Enterprise

YiDA

Alibaba Cloud Model Studio

Apsara Prime - For Easy Cloud Product Selection

Alibaba Cloud ECS - Cater All Your Cloud Hosting Needs

1TB CDN—Get Free 1 TB Outbound Traffic Plan Now

Security—Under Attack? Get Free Security Support

Short Message Service - Free Testing is Available

Elastic Compute Service (ECS) Hot

CloudBox

Compute Nest

Dedicated Host Hot

ECS Bare Metal Instance

Elastic GPU Service Featured

Simple Application Server (SAS) Hot

Auto Scaling

Cloud Phone Beta

Elastic Desktop Service (EDS) Featured

Batch Compute

Elastic High Performance Computing (E-HPC)

Super Computing Cluster (SCC)

Function Compute (FC)