This topic describes how to troubleshoot the issues that cause high I/O on an ApsaraDB RDS for MySQL instance. The I/O performance of your RDS instance varies based on the storage type, kernel architecture, and SQL statement that is executed to scan or modify data.
Troubleshoot high I/O caused by high throughput
Symptom
If your application frequently initiates requests to update, delete, and insert data on tables, the I/O of the RDS instance significantly increases due to the data reads and the flushes of dirty pages. This applies if the tables contain a large number of indexes or large fields.
You can log on to the ApsaraDB RDS console and choose
in the left-side navigation pane. Then, on the Performance Trends tab, you can view the read and write loads on your RDS instance.Solution
We recommend that you reduce the read and write frequency, upgrade the instance type of your RDS instance, or optimize the settings of the parameters that are used to flush dirty pages. The following list describes the parameters that are used to flush dirty pages:
innodb_max_dirty_pages_pct: the percentage of dirty pages that are allowed in the buffer pool. Default value: 75.
innodb_max_dirty_pages_pct_lwm: the low water mark for the percentage of dirty pages that are allowed in the buffer pool. If the percentage of dirty pages in the buffer pool exceeds the low water mark, ApsaraDB RDS flushes dirty pages to the disk. This ensures a proper percentage of dirty pages in the buffer pool. The default value 0 specifies to disable the low water mark.
NoteThe value of the innodb_max_dirty_pages_pct_lwm parameter must be less than or equal to the value of the innodb_max_dirty_pages_pct parameter. Otherwise, ApsaraDB RDS sets the innodb_max_dirty_pages_pct_lwm parameter to the value of the innodb_max_dirty_pages_pct parameter.
innodb_io_capacity: the maximum number of I/O operations that are allowed by InnoDB per second for each background task. The value of this parameter affects the speed at which ApsaraDB RDS flushes dirty pages to the disk. The value of this parameter also affects the speed at which ApsaraDB RDS writes data to the buffer pool. Default value: 20000.
innodb_io_capacity_max: the maximum number of I/O operations that are allowed by InnoDB per second for each background task. This parameter takes effect only when the flushes of dirty pages are outdated. The value of this parameter is greater than the value of the innodb_io_capacity parameter. Default value: 40000.
Troubleshoot high I/O caused by temporary tables
Symptom
If the temporary directory is large, ApsaraDB RDS may have created large temporary tables due to operations such as the sorting and deduplication of slow SQL statements. This increases the I/O of your RDS instance. In addition, data writes to temporary tables also increase the I/O of your RDS instance.
You can log on to the ApsaraDB RDS console and choose
in the left-side navigation pane. Then, on the Performance Trends tab, you can view the size of the tmp or other directory for your RDS instance.Solution
We recommend that you optimize the SQL statements that you want to execute. This allows you to prevent slow SQL statements. Database Autonomy Service (DAS) supports SQL optimization. For more information, see SQL optimization.
Troubleshoot high I/O caused by cold data reads
Symptom
If the data that is queried or modified by using SQL statements cannot be hit in the buffer pool, ApsaraDB RDS needs to read the data from the disk. This may significantly increase the I/O of your RDS instance.
You can log on to the ApsaraDB RDS console and choose
in the left-side navigation pane. Then, on the Performance Trends tab, you can view the buffer pool hit ratio of your RDS instance.Solution
Redesign the cache policy based on your business scenario. Otherwise, upgrade your RDS instance.
Troubleshoot high I/O caused by DDL statements
Symptom
If your application initiates DDL statements, ApsaraDB RDS may rebuild the tablespaces of your RDS instance. During the rebuild process, ApsaraDB RDS scans each row of each table in the tablespaces, creates indexes that are used to sort data, and flushes the dirty pages generated from new tables. All these operations significantly increase the I/O of your RDS instance. If your application initiates requests to delete large tables, the I/O of your RDS instance may also increase.
You can log on to the ApsaraDB RDS console and click Monitoring and Alerts in the left-side navigation pane. Then, you can click Standard View on the Standard Monitoring tab to view the disk usage and IOPS of your RDS instance.
Solution
Use the Purge Large File Asynchronously feature to delete large files. This feature is provided by AliSQL. AliSQL is a MySQL branch that is developed by Alibaba Cloud. For more information, see Purge Large File Asynchronously.
Troubleshoot high I/O caused by binary log writes from large transactions
Symptom
A transaction writes log records into binary log files only when it is committed. If your application runs a large transaction, the transaction may write a few dozen GB of data into binary log files. For example, the transaction contains a DELETE statement that is used to delete a large number of rows. When these binary log files are flushed to the disk, the I/O of your RDS instance significantly increases.
Solution
We recommend that you split each large transaction that you want to run. This allows you to reduce the flushes of dirty pages to the disk.
Appendix: Introduction to the InnoDB I/O system
InnoDB uses an independent I/O system to read and write data pages. If the data page that is requested by an SQL statement cannot be hit in the buffer pool, physical I/O operations are performed to read and write data to the disk.
Operations to read data pages
The underlying read operation is called based on synchronous I/O to read data pages.
Operations to write data pages
Use the flushes of dirty pages as an example. Background I/O threads are called based on asynchronous I/O to asynchronously flush dirty pages to the disk.
In addition to I/O operations on common data files, a number of other operations may also significantly increase the I/O of your RDS instance. These operations include the operations to write redo logs, undo logs, and binary logs, the operations to sort temporary tables, and the operations to rebuild tablespaces due to DDL statements.