This topic describes the factors that affect query performance of AnalyticDB for MySQL.
Background information
Cluster specifications
AnalyticDB for MySQL clusters support a variety of specifications. For more information about cluster specifications, see Specifications. Different cluster specifications provide different capabilities to process subtasks because they each have specific CPU cores, memory sizes, and storage media. You must select cluster specifications based on your business query characteristics. For example, business queries oriented to joins or grouping and aggregations consume large amounts of CPU and memory resources, while queries oriented to data scanning or simple grouping and aggregations consume large amounts of disk I/O resources. Differences in resource consumption result in different performance limits on clusters of different specifications, which affects the overall query effect.
Number of nodes
AnalyticDB for MySQL uses a distributed data processing architecture to split a query into multiple stages and execute these stages in parallel on different nodes. An AnalyticDB for MySQL cluster that has more nodes can provide higher query processing capabilities. You can determine the number of cluster nodes to purchase based on your actual business requirements. For more information, see Create a cluster.
Data distribution characteristics
AnalyticDB for MySQL uses a distributed data processing architecture to execute a query in parallel on multiple nodes. However, whether AnalyticDB for MySQL can make full use of multiple nodes to execute the query in parallel depends on the distribution characteristics of data on storage nodes. If the data can be evenly distributed across the storage nodes, data processing subtasks in AnalyticDB for MySQL can be completed near simultaneously. If the data is unevenly distributed, long latency occurs on subtasks during data processing and affects the final query effect.
Data size
Typically, when a query is being executed, AnalyticDB for MySQL does not write temporary results to disks, but attempts to process all data in the memory. If a large amount of data needs to be processed, the query may occupy large amounts of resources for a long period of time. This decreases the overall query efficiency and affects the final query effect.
If a large amount of data is stored in an AnalyticDB for MySQL table, operations such as index filtering and detailed data reads lead to competition for disk I/O resources and slow down queries.
Query concurrency
Due to the limits of cluster specifications and sizes, the number of queries that AnalyticDB for MySQL can process at the same time is also limited. If large numbers of queries are concurrently executed and if cluster node resources have reached the upper limit, queries in the backend are queued for a long period of time. This affects the overall query effect.
Query complexity
AnalyticDB for MySQL is subject to different issues caused by complex queries. For example, if a query uses complex filter conditions, a large amount of data may be read from storage nodes during data filtering. If a query contains large numbers of JOIN operators, data may need to be transferred multiple times between different nodes over networks. This causes network congestion. If a large number of columns are used in a GROUP BY clause, large amounts of memory resources are occupied.