An I/O hang occurs when the system becomes unstable or even goes down due to time-consuming I/O requests. To accurately detect I/O hangs, Alibaba Cloud Linux 2 and Alibaba Cloud Linux 3 extend the core data structures to provide a feature that can identify and detect I/O hangs at low system overheads. This topic describes the interfaces for this feature and the usage examples of the interfaces.
Interface description
Interface | Description |
/sys/block/<device>/queue/hang_threshold | This interface allows you to query and modify the threshold for I/O hangs. Unit of the threshold: ms. Default value: 5000. |
/sys/block/<device>/hang | This interface allows you to query the number of I/O operations that exceed the threshold for I/O hangs on the device. |
/sys/kernel/debug/block/<device>/rq_hang | This interface allows you to query details about I/O hangs. |
/proc/<pid>/wait_res | This interface allows you to query information about the resources for which the process is waiting. |
/proc/<pid>/task/<tid>/wait_res | This interface allows you to query information about the resources for which the thread is waiting. |
The following table describes variables in the preceding interfaces.
Variable | Description |
<device> | The name of the block storage device. |
<pid> | The ID of the process. |
<tid> | The thread ID. |
Example 1
You can call the /sys/block/<device>/queue/hang_threshold
interface to modify the threshold for I/O hangs. In this example, the threshold is changed from 5,000 ms to 10,000 ms.
Change the threshold for I/O hangs of the vdb disk to 10,000 ms.
echo 10000 > /sys/block/vdb/queue/hang_threshold
View the change result.
cat /sys/block/vdb/queue/hang_threshold
Sample output:
10000
Example 2
You can call the /sys/block/<device>/hang
interface to query the number of I/O operations that cause I/O hangs on a disk. In this example, the vdb disk is used.
Sample command:
cat /sys/block/vdb/hang
Sample output:
0 1 # The value on the left indicates the number of read operations that cause I/O hangs. The value on the right indicates the number of write operations that cause I/O hangs.
Example 3
You can call the /sys/kernel/debug/block/<device>/rq_hang
interface to query the details of I/O hangs. In this example, the vdb disk is used.
Sample command:
cat /sys/kernel/debug/block/vdb/rq_hang
Sample output:
ffff9e50162fc600 {.op=WRITE, .cmd_flags=SYNC, .rq_flags=STARTED|ELVPRIV|IO_STAT|STATS, .state=in_flight, .tag=118, .internal_tag=67, .start_time_ns=1260981417094, .io_start_time_ns=1260981436160, .current_time=1268458297417, .bio = ffff9e4907c31c00, .bio_pages = { ffffc85960686740 }, .bio = ffff9e4907c31500, .bio_pages = { ffffc85960639000 }, .bio = ffff9e4907c30300, .bio_pages = { ffffc85960651700 }, .bio = ffff9e4907c31900, .bio_pages = { ffffc85960608b00 }}
The preceding output shows the details of an I/O operation. io_start_time_ns
indicates the start time of the I/O request, and this parameter has an assigned value. This indicates that the I/O request was not processed in a timely manner, which leads to prolonged I/O time.
Example 4
You can call the /proc/<pid>/wait_res
interface to query information about the resources for which a process is waiting. In this example, the 577
process is used.
Sample command:
cat /proc/577/wait_res
Sample output:
1 0000000000000000 4310058496 4310061448 # 1 is the value of Field 1, 0000000000000000 is the value of Field 2, 4310058496 is the value of Field 3, and 4310061448 is the value of Field 4.
The following table describes the parameters in the sample output.
Parameter | Description |
Field 1 | The type of the resources for which the process is waiting. A value of 1 indicates the cache page in the file system. A value of 2 indicates the block I/O layer. |
Field 2 | The addresses of the resources (cache pages or block I/O layers) for which the process is waiting. |
Field 3 | The time at which the process began waiting for resources. |
Field 4 | The current time when the file is being read. The difference between Field 4 and Field 3 is the amount of time taken by the process to wait for the resources. |