Alibaba Cloud Linux 2 (kernel version 4.19.81-17.al7 and later) and Alibaba Cloud Linux 3 provide interfaces for monitoring block I/O throttling. This topic describes the interfaces and provides examples of how to use them.
Background information
Linux block I/O throttling, which limits bytes per second (BPS) or I/O operations per second (IOPS), is required in many scenarios, especially when cgroup writeback is enabled. Alibaba Cloud Linux provides interfaces that enhance the monitoring of block I/O throttling and make throttling-related operations easier to perform.
Interfaces
Interface | Description |
blkio.throttle.io_service_time | The total amount of time between request dispatch and request completion for I/O operations. Unit: nanoseconds. |
blkio.throttle.io_wait_time | The total amount of time that the I/O operations wait in scheduler queues. Unit: nanoseconds. |
blkio.throttle.io_completed | The number of completed I/O operations. It is used to calculate the average latency of the block I/O throttling layer. |
blkio.throttle.total_io_queued | The number of I/O operations that were throttled. The number of I/O operations that were throttled in the current cycle can be calculated based on periodic monitoring data and be used to analyze whether I/O latency is related to throttling. |
blkio.throttle.total_bytes_queued | The number of I/O bytes that were throttled. Unit: bytes. |
The preceding interfaces are located in /sys/fs/cgroup/blkio/<cgroup>/, where <cgroup> specifies the control group.
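The description of blkio.throttle.total_io_queued mentions calculating the number of I/O operations throttled in the current cycle from periodic monitoring data. A minimal sketch of that calculation follows. The helper names read_counter and throttled_delta are hypothetical, and the sketch assumes that each line of the file has the form "<major>:<minor> <Operation> <value>", as the commands in the Examples section suggest.

```shell
# Hypothetical helper: print the value recorded for one device and operation.
# Assumption: each line looks like "<major>:<minor> <Operation> <value>".
#   $1 = path to a blkio.throttle.* file
#   $2 = device number, for example 254:48
#   $3 = operation, for example Write
read_counter() {
    awk -v dev="$2" -v op="$3" '$1 == dev && $2 == op { print $3 }' "$1"
}

# Hypothetical helper: operations throttled between two samples of the
# same cumulative counter.
#   $1 = earlier sample, $2 = later sample
throttled_delta() {
    echo $(( $2 - $1 ))
}

# Example usage (blkcg1, 254:48, and the 5-second cycle are placeholders):
#   f=/sys/fs/cgroup/blkio/blkcg1/blkio.throttle.total_io_queued
#   before=$(read_counter "$f" 254:48 Write)
#   sleep 5
#   after=$(read_counter "$f" 254:48 Write)
#   throttled_delta "$before" "$after"
```

A delta of 0 over a cycle in which I/O latency rose suggests that the latency is probably not caused by throttling. A large delta points toward the throttling layer.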
Examples
You can use the preceding interfaces to obtain the average I/O latency of a disk. In this example, the write counters of the vdd disk (matched by the device number 254:48 in the commands) in the blkcg1 control group are sampled twice, 5 seconds apart, to calculate the average write latency over that interval. The following table describes the relevant parameters, where <N> is 1 for the sample taken at T1 and 2 for the sample taken at T2.
Parameter | Description |
write_wait_time<N> | The duration of throttling at the block I/O throttling layer. |
write_service_time<N> | The total amount of time between request dispatch and request completion for I/O operations. |
write_completed<N> | The number of completed I/O operations. |
Obtain the monitoring data at time T1.
write_wait_time1=$(cat /sys/fs/cgroup/blkio/blkcg1/blkio.throttle.io_wait_time | grep -w "254:48 Write" | awk '{print $3}')
write_service_time1=$(cat /sys/fs/cgroup/blkio/blkcg1/blkio.throttle.io_service_time | grep -w "254:48 Write" | awk '{print $3}')
write_completed1=$(cat /sys/fs/cgroup/blkio/blkcg1/blkio.throttle.io_completed | grep -w "254:48 Write" | awk '{print $3}')
Wait 5 seconds and obtain the monitoring data at time T2 (T1 + 5s).
write_wait_time2=$(cat /sys/fs/cgroup/blkio/blkcg1/blkio.throttle.io_wait_time | grep -w "254:48 Write" | awk '{print $3}')
write_service_time2=$(cat /sys/fs/cgroup/blkio/blkcg1/blkio.throttle.io_service_time | grep -w "254:48 Write" | awk '{print $3}')
write_completed2=$(cat /sys/fs/cgroup/blkio/blkcg1/blkio.throttle.io_completed | grep -w "254:48 Write" | awk '{print $3}')
Calculate the average I/O latency during the 5 seconds based on the following formula:
Average I/O latency = (Total I/O duration at the T2 time - Total I/O duration at the T1 time)/(Number of completed I/O operations at the T2 time - Number of completed I/O operations at the T1 time)
avg_delay=$(echo "(($write_wait_time2 + $write_service_time2) - ($write_wait_time1 + $write_service_time1)) / ($write_completed2 - $write_completed1)" | bc)
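If no write operation completes during the 5 seconds, the divisor in the preceding formula is 0 and the calculation fails. The following sketch wraps the formula in a hypothetical avg_latency function that guards against that case. It uses shell integer arithmetic instead of bc, so the result is truncated to whole nanoseconds. Printing 0 when nothing completed is a design choice for this illustration, not behavior defined by the interfaces.

```shell
# Hypothetical helper: average latency in nanoseconds between two samples.
# Arguments, in order:
#   $1 = wait time at T1      $2 = service time at T1    $3 = completed at T1
#   $4 = wait time at T2      $5 = service time at T2    $6 = completed at T2
# The function body runs in a subshell so that its variables do not leak.
avg_latency() (
    delta_time=$(( ($4 + $5) - ($1 + $2) ))
    delta_ios=$(( $6 - $3 ))
    if [ "$delta_ios" -le 0 ]; then
        # No I/O completed in the interval, so the average is undefined.
        # Print 0 rather than dividing by zero.
        echo 0
    else
        echo $(( delta_time / delta_ios ))
    fi
)

# Example usage with the variables collected in the preceding steps:
#   avg_delay=$(avg_latency "$write_wait_time1" "$write_service_time1" "$write_completed1" \
#                           "$write_wait_time2" "$write_service_time2" "$write_completed2")
```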