AnalyticDB for MySQL provides the split flow control feature that allows you to manage concurrent split scans that run in a task or on a node. This feature prevents node instability caused by high scan concurrency and excess resource usage. This topic describes how to enable the split flow control feature and configure the quota for concurrent split scans.
Prerequisites
An AnalyticDB for MySQL cluster of V3.1.10.0 or later is created.
For information about how to query the minor version of a cluster, see How do I query the version of an AnalyticDB for MySQL cluster? To update the minor version of a cluster, contact technical support.
Background information
When you perform a query in AnalyticDB for MySQL, the system scans data from the data source and the split scans run on storage nodes or compute nodes. If a large number of concurrent split scans run on a node, the following issues may occur:
For internal tables, the concurrent split scans compete for the I/O resources of storage nodes. This increases CPU utilization and memory usage and can make the storage nodes unstable.
For external tables, the scan efficiency varies based on the data source. If the number of concurrent split scans exceeds the quota, new split scans consume the resources of compute nodes and affect other queries instead of speeding up the overall scan.
To resolve the preceding issues, AnalyticDB for MySQL provides the split flow control feature. By default, the split flow control feature is enabled.
Terms
Overview
Each task has a quota for concurrent split scans. If the number of concurrent split scans that run in a task is less than the quota, the task can start a new split scan. When the quota is reached, a new split scan can be started only after the running split scans are complete. AnalyticDB for MySQL provides the split flow control feature for nodes and tasks. The quota for concurrent split scans in a task can be dynamically adjusted based on the overall quota for concurrent split scans of the node on which the task runs.
Disable or re-enable the split flow control feature
By default, the split flow control feature is enabled. You can execute one of the following statements to disable or re-enable the split flow control feature.
Disable or re-enable the split flow control feature for a cluster.
SET ADB_CONFIG SPLIT_FLOW_CONTROL_ENABLED=true|false;
Disable or re-enable the split flow control feature for a query.
/*+ SPLIT_FLOW_CONTROL_ENABLED=true|false*/ SELECT * FROM table;
Allow the quota for concurrent split scans in a task to be dynamically adjusted
Enable and disable the feature that allows the quota for concurrent split scans in a task to be dynamically adjusted
If you disable the feature that allows the quota for concurrent split scans in a task to be dynamically adjusted, the default quota for concurrent split scans in a task is 32. To change the default value, execute the following statement:
SET ADB_CONFIG TARGET_RUNNING_SPLITS_LIMIT_PER_TASK=<value>;
You can enable the feature to allow the quota for concurrent split scans in a task to be dynamically adjusted based on the overall quota for concurrent split scans of the node on which the task runs.
SET ADB_CONFIG NODE_LEVEL_SPLIT_FLOW_CONTROL_ENABLED=true|false;
Before you enable the feature that allows the quota for concurrent split scans in a task to be dynamically adjusted, make sure that the split flow control feature is enabled. To check whether the split flow control feature is enabled, execute the following statement:
SHOW ADB_CONFIG KEY=SPLIT_FLOW_CONTROL_ENABLED;
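The order of operations described above can be sketched as follows. This is a minimal sketch that reuses only statements from this topic. The value 16 is an illustrative assumption, not a recommendation.

```sql
-- Option A: use dynamic adjustment.
-- 1. Confirm that split flow control is enabled for the cluster.
SHOW ADB_CONFIG KEY=SPLIT_FLOW_CONTROL_ENABLED;
-- 2. If the result is false, re-enable split flow control first.
SET ADB_CONFIG SPLIT_FLOW_CONTROL_ENABLED=true;
-- 3. Allow the per-task quota to follow the overall quota of the node.
SET ADB_CONFIG NODE_LEVEL_SPLIT_FLOW_CONTROL_ENABLED=true;

-- Option B: use a fixed per-task quota instead.
-- 1. Turn off dynamic adjustment. The per-task quota then defaults to 32.
SET ADB_CONFIG NODE_LEVEL_SPLIT_FLOW_CONTROL_ENABLED=false;
-- 2. Optionally replace the default with another value (16 is illustrative).
SET ADB_CONFIG TARGET_RUNNING_SPLITS_LIMIT_PER_TASK=16;
```

Options A and B are alternatives. Run only one of them.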
Adjust the quota for concurrent split scans in a task
Method
You can execute the SET statement or use a hint to adjust the quota for concurrent split scans in a task for a cluster or a query.
Adjust the quota for concurrent split scans in a task for a cluster.
SET ADB_CONFIG <Quota-related parameter>=<value>;
Adjust the quota for concurrent split scans in a task for a query.
/*+ <Quota-related parameter>=<value>*/SELECT * FROM orders;
Quota-related parameters
The following table describes the parameters that can be used to dynamically adjust the quota for concurrent split scans in a task.
Parameter | Description |
MIN_RUNNING_SPLITS_LIMIT_PER_TASK | The minimum quota for concurrent split scans in a task. The default value is 1. The valid values range from … If a large number of concurrent split scans run on a node after you enable the feature that allows the quota for concurrent split scans in a task to be dynamically adjusted, the quota of the task is dynamically decreased, but never below the value of this parameter. |
TARGET_RUNNING_SPLITS_LIMIT_PER_TASK | The intermediate quota for concurrent split scans in a task. The actual quota is dynamically increased or decreased based on the value of this parameter. The default value is 32. The valid values range from the value of the … If the sum of the intermediate quotas of all tasks on a node does not exceed the quota of the node, the quota of each task is dynamically increased. Otherwise, the quota of each task is dynamically decreased. |
MAX_RUNNING_SPLITS_LIMIT_PER_TASK | The maximum quota for concurrent split scans in a task. The default value is 64. The value of this parameter must be greater than the value of the … If a small number of concurrent split scans run on a node after you enable the feature that allows the quota for concurrent split scans in a task to be dynamically adjusted, the quota of the task is dynamically increased, but never above the value of this parameter. |
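Because the three parameters bound each other, it can help to read their current values before you change any of them. The following sketch assumes that the SHOW ADB_CONFIG KEY=<parameter> form shown earlier for SPLIT_FLOW_CONTROL_ENABLED also accepts the quota-related parameter names. This topic does not confirm that for each parameter.

```sql
-- Read the current per-task quota settings of the cluster.
-- Assumption: SHOW ADB_CONFIG KEY=... accepts these parameter names.
SHOW ADB_CONFIG KEY=MIN_RUNNING_SPLITS_LIMIT_PER_TASK;
SHOW ADB_CONFIG KEY=TARGET_RUNNING_SPLITS_LIMIT_PER_TASK;
SHOW ADB_CONFIG KEY=MAX_RUNNING_SPLITS_LIMIT_PER_TASK;
```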
Suggestions on how to adjust the quota for concurrent split scans in a task
AnalyticDB for MySQL allows you to configure different quotas for concurrent split scans in tasks for different queries. This way, resources can be allocated to the tasks based on your business requirements. Examples:
If a query requires a short response time (RT) and a small number of split scans, such as a point query, you can set the MIN_RUNNING_SPLITS_LIMIT_PER_TASK, TARGET_RUNNING_SPLITS_LIMIT_PER_TASK, and MAX_RUNNING_SPLITS_LIMIT_PER_TASK parameters to larger values, or execute the SET ADB_CONFIG SPLIT_FLOW_CONTROL_ENABLED=false; statement to disable the split flow control feature. This way, all split scans of the query can be quickly started.
If a query requires a large number of split scans but does not require a higher priority, you can set the TARGET_RUNNING_SPLITS_LIMIT_PER_TASK parameter to a smaller value to reduce the impact on important queries when resources are insufficient. When resources are idle, the quota for concurrent split scans in a task can be dynamically increased to improve the scan efficiency.
If you query external tables, you can configure the quota based on factors such as the transmission limits of the data source.
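The first two suggestions can be sketched at the query level with hints, so that other queries on the cluster are not affected. The orders table comes from the examples in this topic. The order_id column, the literal 1001, and the quota value 8 are hypothetical and serve only as illustration.

```sql
-- Latency-sensitive point query: disable split flow control for this
-- query only, so that all of its split scans can start immediately.
/*+ SPLIT_FLOW_CONTROL_ENABLED=false*/ SELECT * FROM orders WHERE order_id = 1001;

-- Scan-heavy, low-priority query: lower the intermediate quota so that
-- the query yields to other queries when resources are insufficient.
-- With dynamic adjustment enabled, the quota can still be increased
-- when resources are idle.
/*+ TARGET_RUNNING_SPLITS_LIMIT_PER_TASK=8*/ SELECT COUNT(*) FROM orders;
```

A hint changes the setting only for the query that carries it, whereas SET ADB_CONFIG changes it for the whole cluster.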
Examples
Set the minimum quota for concurrent split scans in all tasks for a cluster to 24.
SET ADB_CONFIG MIN_RUNNING_SPLITS_LIMIT_PER_TASK=24;
Set the minimum quota for concurrent split scans in specific tasks for a query to 10.
/*+ MIN_RUNNING_SPLITS_LIMIT_PER_TASK=10*/SELECT * FROM orders;
Set the maximum quota for concurrent split scans in all tasks for a cluster to 128.
SET ADB_CONFIG MAX_RUNNING_SPLITS_LIMIT_PER_TASK=128;
Set the maximum quota for concurrent split scans in specific tasks for a query to 100.
/*+ MAX_RUNNING_SPLITS_LIMIT_PER_TASK=100*/SELECT * FROM adb_test;
Configure the quota for concurrent split scans on a node
Storage nodes
By default, the quota for concurrent split scans on a storage node is 256. We recommend that you do not change the default value because an excessively large or small quota can affect the cluster performance. You can execute the following statement to configure the quota for concurrent split scans on a storage node:
SET ADB_CONFIG WORKER_MAX_RUNNING_SOURCE_SPLITS_PER_NODE=256;
Compute nodes
By default, the quota for concurrent split scans on a compute node is 256. We recommend that you do not change the default value because an excessively large or small quota can affect the cluster performance. You can execute the following statement to configure the quota for concurrent split scans on a compute node:
SET ADB_CONFIG EXECUTOR_MAX_RUNNING_SOURCE_SPLITS_PER_NODE=256;
You can execute the SHOW ADB_CONFIG statement to query the quota for concurrent split scans that is configured for a node. For more information, see SHOW ADB_CONFIG.
After you configure the quota for concurrent split scans on a node, the quota may not immediately take effect due to the following reasons:
The quota that you configured applies only to split scans that have not yet started, not to split scans that are already running on the node. If you decrease the quota for concurrent split scans, the new quota takes effect only after the running split scans are complete.
The split flow control feature ensures that each task can run at least as many concurrent split scans as the minimum quota that you configured for a task. If the sum of the minimum quotas of all tasks on a node is greater than the quota of the node, the number of concurrent split scans running on the node may exceed the quota that you configured for the node. For example, if 30 tasks that each have a minimum quota of 10 run on a node whose quota is 256, the node may run 300 concurrent split scans.
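To check the node-level quotas or return to the defaults after an experiment, a sequence like the following may be used. It is a sketch that assumes the SHOW ADB_CONFIG KEY=<parameter> form shown earlier in this topic also accepts the node-level parameter names.

```sql
-- Query the quotas currently configured for storage and compute nodes.
-- Assumption: SHOW ADB_CONFIG KEY=... accepts these parameter names.
SHOW ADB_CONFIG KEY=WORKER_MAX_RUNNING_SOURCE_SPLITS_PER_NODE;
SHOW ADB_CONFIG KEY=EXECUTOR_MAX_RUNNING_SOURCE_SPLITS_PER_NODE;

-- Restore the documented default of 256 on both node types.
SET ADB_CONFIG WORKER_MAX_RUNNING_SOURCE_SPLITS_PER_NODE=256;
SET ADB_CONFIG EXECUTOR_MAX_RUNNING_SOURCE_SPLITS_PER_NODE=256;
```

As noted above, a changed quota applies only to split scans that start after the change.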