Parallel hints allow you to specify whether to enable the parallel query feature. You can also specify the degree of parallelism (DOP), the tables that need to be scanned in parallel, and the parallel execution mode of each operator. PolarDB for MySQL 8.0.1 supports the PARALLEL
and NO_PARALLEL
hints in the parallel query feature. PolarDB for MySQL 8.0.2 supports the following hints: PARALLEL
, NO_PARALLEL
, PQ_DISTRIBUTE
, PQ_GROUPBY
, PQ_DISTINCT
, PQ_WINDOW
, and PQ_ORDERBY
. The PQ_DISTRIBUTE hint is added to specify the parallel execution policies of joint operations. The PQ_GROUPBY, PQ_DISTINCT, PQ_WINDOW, and PQ_ORDERBY hints are used to specify the parallel execution mode of each analytic operator.
Enable or disable the parallel query feature
Enable the parallel query feature
You can execute one of the following statements to enable the parallel query feature:
SELECT /*+PARALLEL(x)*/ ... FROM ...; -- x >0
SELECT /*+ SET_VAR(max_parallel_degree=n) */ * FROM ... -- n > 0
Disable the parallel query feature
You can execute one of the following statements to disable the parallel query feature:
SELECT /*+NO_PARALLEL()*/ ... FROM ...;
SELECT /*+ SET_VAR(max_parallel_degree=0) */ * FROM ...
Use hints to specify the tables that need to be scanned in parallel
The parallel query feature supports the PARALLEL
and NO_PARALLEL
hints. These hints specify tables that need to be scanned in parallel and ones that do not need to be scanned in parallel.
You can use the
PARALLEL
hint to enforce parallel queries and specify the DOP and the tables that need to be scanned in parallel. Sample syntax:/*+ PARALLEL [( [query_block] [table_name] [degree] )] */
You can use the
NO_PARALLEL
hint to enforce sequential queries or specify the tables for which parallel scans are prohibited./*+ NO_PARALLEL [( [query_block] [table_name][, table_name] )] */
The following table describes the parameters that you can configure.
Parameter | Description |
query_block | The name of the query block to which the hint is applied. |
table_name | The name of the table to which the hint is applied. |
degree | The DOP. |
Examples:
SELECT /*+PARALLEL()*/ * FROM t1, t2;
-- When the number of rows in the table is less than the specified value of records_threshold_for_parallelism, parallel queries are forcibly executed. The default value of records_threshold_for_parallelism is 10000.
-- The default parameter max_parallel_degree is used to specify the DOP. If max_parallel_degree is set to a value greater than 0,
-- The parallel query feature is enabled. If max_parallel_degree is set to 0, the parallel query feature is disabled.
SELECT /*+PARALLEL(8)*/ * FROM t1, t2;
-- This statement enforces parallel queries in a DOP of 8.
-- When the number of rows in the table is less than the specified value of records_threshold_for_parallelism, parallel queries are forcibly executed. The default value of records_threshold_for_parallelism is 10000.
-- The max_parallel_degree parameter is set to 8.
SELECT /*+ SET_VAR(max_parallel_degree=8) */ * FROM ...
-- The max_parallel_degree parameter is set to 8.
-- When the number of rows in the table is less than the value of records_threshold_for_parallelism, such as 20000, the parallel query feature is automatically disabled.
SELECT /*+PARALLEL(t1)*/ * FROM t1, t2;
-- This statement enforces parallel queries on the t1 table by using the /*+PARALLEL()*/ syntax for the table.
SELECT /*+PARALLEL(t1 8)*/ * FROM t1, t2;
-- This statement enforces parallel queries on the t1 table in a DOP of 8 by using the /*+PARALLEL(8)*/ syntax for the table.
SELECT /*+PARALLEL(@subq1)*/ SUM(t.a) FROM t WHERE t.a =
(SELECT /*+QB_NAME(subq1)*/ SUM(t1.a) FROM t1);
-- This statement enforces parallel subqueries. The default parameter max_parallel_degree is used to specify the DOP.
-- If max_parallel_degree is set to a value greater than 0, the parallel query feature is enabled. If max_parallel_degree is set to 0, the parallel query feature is disabled.
SELECT /*+PARALLEL(@subq1 8)*/ SUM(t.a) FROM t WHERE t.a =
(SELECT /*+QB_NAME(subq1)*/ SUM(t1.a) FROM t1);
-- This statement enforces parallel subqueries. The max_parallel_degree parameter is set to 8.
SELECT SUM(t.a) FROM t WHERE t.a =
(SELECT /*+PARALLEL()*/ SUM(t1.a) FROM t1);
-- This statement enforces parallel subqueries.
-- The default parameter max_parallel_degree is used to specify the DOP.
-- If max_parallel_degree is set to a value greater than 0, the parallel query feature is enabled. If max_parallel_degree is set to 0, the parallel query feature is disabled.
SELECT SUM(t.a) FROM t WHERE t.a =
(SELECT /*+PARALLEL(8)*/ SUM(t1.a) FROM t1);
-- This statement enforces parallel subqueries. The max_parallel_degree parameter is set to 8.
SELECT /*+NO_PARALLEL()*/ * FROM t1, t2;
-- This statement disables the parallel query feature.
SELECT /*+NO_PARALLEL(t1)*/ * FROM t1, t2;
-- This statement prohibits parallel queries for only the t1 table. If the system enables the parallel query feature, the system may perform parallel scans and run parallel queries on the t2 table.
SELECT /*+NO_PARALLEL(t1, t2)*/ * FROM t1, t2;
-- This statement prohibits parallel queries for the t1 and t2 tables.
SELECT /*+NO_PARALLEL(@subq1)*/ SUM(t.a) FROM t WHERE t.a =
(SELECT /*+QB_NAME(subq1)*/ SUM(t1.a) FROM t1);
-- This statement prohibits parallel subqueries.
SELECT SUM(t.a) FROM t WHERE t.a =
(SELECT /*+NO_PARALLEL()*/ SUM(t1.a) FROM t1);
-- This statement prohibits parallel subqueries.
The PARALLEL
hint is ineffective for queries that do not support parallel processing or tables that do not support parallel scans.
Parallel join operations
You can use the PQ_DISTRIBUTE
hint to specify the parallel query execution mode of a join operation.
Use the
PQ_DISTRIBUTE
hint to specify the data distribution strategy of a table:/*+ PQ_DISTRIBUTE([query_block] table_name strategy ) */
The following table describes the parameters that you can configure.
Parameter
Description
query_block
The name of the query block to which the hint is applied.
table_name
The name of the table to which the hint is applied.
strategy
The data distribution strategy. Valid values:
PQ_GATHER: Data is aggregated to a worker of the previous phase.
PQ_HASH: Data is shuffled and distributed to multiple workers of the previous phase.
PQ_BROADCAST: Data is broadcast to multiple workers of the previous phase.
PQ_NONE: Data is not distributed.
Examples:
SELECT /*+ PARALLEL(t1) PQ_DISTRIBUTE(t1 PQ_GATHER) */ * FROM t as t1; -- When the number of rows in the table is less than the specified value of records_threshold_for_parallelism, parallel queries are forcibly executed. The default value of records_threshold_for_parallelism is 10000. -- The default parameter max_parallel_degree is used to specify the DOP. If max_parallel_degree is set to a value greater than 0, -- The parallel query feature is enabled. If max_parallel_degree is set to 0, the parallel query feature is disabled. -- After the table is scanned in parallel, the data is not distributed, but the results are gathered by the leader. SELECT /*+ PARALLEL(t1) PQ_DISTRIBUTE(t1 PQ_HASH) */t1.a, SUM(t1.b) * FROM t as t1 GROUP BY t1.a; -- After the table is scanned in parallel, the data is distributed to the workers of the next phase based on the group key.
Use the
PQ_DISTRIBUTE
hint to specify the joining mode of two tables:/*+ PQ_DISTRIBUTE([query_block] table_name strategy1 [strategy2]) */
In the preceding statement, if only
strategy1
is specified, it indicates that the data distribution method of the table_name table is specified. If bothstrategy1
andstrategy2
are specified, it indicates that the parallel joining mode between thetable_name
table and a previous table is specified.NoteThe previous table can be a physical table or an intermediate table in the previous join operation.
Examples:
SELECT /*+ PARALLEL(t1) PARALLEL(t2) PQ_DISTRIBUTE(t2 PQ_HASH PQ_HASH) */ * FROM t as t1 STRAIGHT_JOIN t as t2 on t1.b = t2.c; -- The t1 table is scanned in parallel, and the data is shuffled and distributed to the workers of the next phase. -- The t2 table is scanned in parallel, and the data is shuffled and distributed to the workers of the next phase. -- The workers of the next phase execute a co-location join. Then, the results are gathered by the leader. SELECT /*+ PARALLEL(t1) PARALLEL(t2) PQ_DISTRIBUTE(t2 PQ_GATHER PQ_GATHER) */ * FROM t as t1 STRAIGHT_JOIN t as t2 on t1.b = t2.c; -- The t1 table is scanned in parallel, and the data is distributed to the leader. -- The t2 table is scanned in parallel, and the data is distributed to the leader. -- The leader gathers the data and completes the join operation.
If queries do not support parallel processing or queries use two conflicting hints, the queries may fail. Sample syntax:
SELECT /*+ PARALLEL(t1) PARALLEL(t2) PQ_DISTRIBUTE(t2 PQ_HASH PQ_GATHER) */ * FROM t as t1 STRAIGHT_JOIN t as t2 on t1.b = t2.c; -- The t1 table is scanned in parallel, and the data is distributed to multiple workers of the next phase. -- The t2 table is scanned in parallel, and the data is distributed to the leader. -- The data distribution modes of the two hints conflict with each other, so a parallel query plan cannot be generated.
Parallel by-group aggregation: PQ_GROUPBY
You can use the PQ_GROUPBY
hint to specify the execution mode of by-group aggregation.
/*+ PQ_GROUPBY(strategy) */
The following table describes the parameters that you can configure.
Parameter | Description |
strategy | The data distribution strategy. Valid values:
|
Examples:
SELECT /*+ PARALLEL(t1) PQ_GROUPBY(PQ_ONEPHASE) */ t1.a, sum(t1.b) FROM t as t1 GROUP BY t1.a;
-- The t1 table is scanned in parallel and the data is distributed to multiple workers of the next phase based on the t1.a column.
-- Multiple workers of the next phase complete aggregation calculations. Then, the results are gathered by the leader.
SELECT /*+ PARALLEL(t1) PQ_GROUPBY(PQ_TWOPHASE_HASH) */ t1.a, sum(t1.b) FROM t as t1 GROUP BY t1.a;
-- Workers scan the t1 table in parallel and perform data aggregation separately.
-- Intermediate aggregation results are distributed to multiple workers of the next phase based on the t1.a column.
-- Multiple workers of the next phase complete aggregation calculations. Then, the results are gathered by the leader.
Parallel by-group aggregation: PQ_DISTINCT
You can use the PQ_DISTINCT
hint to specify the execution mode of by-group aggregation.
/*+ PQ_DISTINCT(strategy) */
The following table describes the parameters that you can configure.
Parameter | Description |
strategy | The data distribution strategy. Valid values:
|
Parallel window functions
You can use the PQ_WINDOW
hint to specify the execution mode of a window function.
/*+ PQ_WINDOW([window_name] strategy) */
A hint with a specified window name has a higher priority than a hint without a window name.
The following table describes the parameters that you can configure.
Parameter | Description |
window_name | The name of the window function to which the data distribution strategy applies. If you do not specify this parameter, the data distribution strategy applies to all window functions. |
strategy | The data distribution strategy. Valid values:
|
Examples:
SELECT /*+ PQ_WINDOW(PQ_ONEPHASE) PQ_WINDOW(win PQ_SERIAL) */
ROW_NUMBER() OVER(win) AS 'row_number',
RANK() over(partition by name order by salary desc)
FROM employee_salaries WINDOW win as (partition by dept order by salary desc);
-- The window function named win is called to run computations serially.
-- For other window functions, data is distributed to multiple workers based on the key specified in the PARTITION BY clause for parallel computations.
Parallel data sorting
You can use the PQ_ORDERBY
hint to specify the data sorting mode.
/*+ PQ_ORDERBY(strategy) */
The following table describes the parameters that you can configure.
Parameter | Description |
strategy | The data distribution strategy. Valid values:
|
Parallel subqueries
You can use hints to specify the policy to run parallel subqueries. For more information, see the support for subqueries in View parallel query execution plans. Sample syntax:
/*+ PQ_PUSHDOWN [( [query_block])] */ # The subqueries run in parallel based on the pushdown policy.
/*+ NO_PQ_PUSHDOWN [( [query_block])] */ # The subqueries run in parallel based on the shared access policy.
Examples:
# The subqueries run in parallel based on the pushdown policy.
EXPLAIN SELECT /*+ PQ_PUSHDOWN(@qb1) */ * FROM t2 WHERE t2.a =
(SELECT /*+ qb_name(qb1) */ a FROM t1);
# The subqueries run in parallel based on the shared access policy.
EXPLAIN SELECT /*+ NO_PQ_PUSHDOWN(@qb1) */ * FROM t2 WHERE t2.a =
(SELECT /*+ qb_name(qb1) */ a FROM t1);
# No hint is added to the query block.
EXPLAIN SELECT * FROM t2 WHERE t2.a =
(SELECT /*+ NO_PQ_PUSHDOWN() */ a FROM t1);