A softlockup error occurs when specific earlier versions of the Linux kernel write data back to file caches. This topic describes the cause of and solution to the issue.
Problem description
On a Linux Elastic Compute Service (ECS) instance whose kernel version is earlier than 4.15, a softlockup error occurs when the kernel writes data back to file caches.
You can run the uname -r command to check the Linux kernel version. For example, if 3.10.0-514.26.2.el7.x86_64 is displayed in the command output, the kernel version is 3.10.0. In this topic, the 4.15 version refers to the 4.15.0 version.
When the error occurs, call stack information similar to the following content is generated:
[3507707.671883] [<ffffffff8127cf7a>] redirty_tail+0x3a/0x40
[3507707.671884] [<ffffffff81280ea4>] __writeback_inodes_wb+0x64/0xc0
[3507707.671885] [<ffffffff81281238>] wb_writeback+0x268/0x300
[3507707.671887] [<ffffffff812819f4>] wb_workfn+0xb4/0x380
[3507707.671889] [<ffffffff810a5dc9>] process_one_work+0x189/0x420
[3507707.671890] [<ffffffff810a625b>] worker_thread+0x1fb/0x4b0
[3507707.671891] [<ffffffff810a6060>] ? process_one_work+0x420/0x420
[3507707.671893] [<ffffffff810ac696>] kthread+0xe6/0x100
[3507707.671894] [<ffffffff810ac5b0>] ? kthread_park+0x60/0x60
[3507707.671897] [<ffffffff81741dd9>] ret_from_fork+0x39/0x50
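To check whether a kernel log contains frames like the ones in this call stack, a minimal sketch such as the following may help. The function name has_writeback_stack is hypothetical and not part of any tool. The sketch only searches for the wb_writeback and __writeback_inodes_wb frames shown above, so a match suggests this issue but does not confirm it.

```shell
# Hypothetical helper (not part of any existing tool): succeeds if the kernel
# log text passed on stdin contains writeback frames like those shown above.
has_writeback_stack() {
    # \+ matches the literal plus sign before the frame offset, e.g. wb_writeback+0x268
    grep -Eq 'wb_writeback\+0x|__writeback_inodes_wb\+0x'
}

# Usage on a live system (reading the kernel log may require root privileges):
#   dmesg | has_writeback_stack && echo "writeback frames found in kernel log"
```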
Cause
When the memory of an ECS instance is insufficient, the operating system kernel frequently calls the wakeup_flusher_threads function. As a result, a large number of writeback tasks (wb_writeback_work) are created. The writeback threads then process these tasks continuously, and a softlockup error occurs on the operating system.
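As a rough way to observe writeback activity while memory is low, the Dirty and Writeback counters in /proc/meminfo can be inspected. The following is only a diagnostic sketch that this topic does not prescribe. The function name show_writeback_counters is hypothetical, and large values alone do not prove that the softlockup issue is occurring.

```shell
# Hypothetical helper: print the Dirty and Writeback counters from
# meminfo-formatted text on stdin (field name, value, unit).
show_writeback_counters() {
    # Exact match on the first field, so WritebackTmp: is not included
    awk '$1 == "Dirty:" || $1 == "Writeback:" { print $1, $2, $3 }'
}

# Usage on a live system:
#   show_writeback_counters < /proc/meminfo
```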
Solution
Update the kernel to a version later than 4.15. The issue does not occur on Alibaba Cloud Linux operating systems because these operating systems use kernel version 4.19 or later. The following solution applies to Linux distributions other than Alibaba Cloud Linux.
Before you perform the operations in the solution on an instance on which the issue occurred, we recommend that you create snapshots for the instance to back up data. This prevents data loss caused by accidental operations. For information about how to create a snapshot, see Create a snapshot for a disk.
Connect to the Linux instance.
For more information, see Connect to a Linux instance by using a password or key.
Run the following command to view the kernel version of the operating system:
uname -r
If the kernel version is later than 4.15, the softlockup error does not occur on the operating system, and you do not need to perform the subsequent operations.
If the kernel version is 4.15 or earlier, run the following command to update the kernel:
yum update kernel
Note: If you cannot update specific earlier kernel versions by running the yum update kernel command, download the kernel RPM package of a version later than 4.15, and then upgrade the kernel by using the RPM package.
Restart the instance after you update the kernel version.
reboot
After you restart the instance, rerun the following command to check whether the kernel version is later than 4.15:
uname -r
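The comparison against 4.15 can also be scripted. The following is a minimal sketch, assuming that GNU sort with the -V (version sort) option is available and that the release string starts with a numeric version followed by a hyphen, as in the example in this topic. The function name kernel_later_than_4_15 is hypothetical. Following the convention of this topic, 4.15 is treated as 4.15.0.

```shell
# Hypothetical helper: succeeds if the given release string (uname -r format)
# has a base version later than 4.15.0. Assumes GNU sort with -V support.
kernel_later_than_4_15() {
    # Keep only the part before the first hyphen,
    # e.g. 3.10.0 from 3.10.0-514.26.2.el7.x86_64
    base=${1%%-*}
    # Later than 4.15.0 means: not equal to it, and sorted after it
    [ "$base" != "4.15.0" ] &&
        [ "$(printf '%s\n' "4.15.0" "$base" | sort -V | tail -n 1)" = "$base" ]
}

# Usage on a live system:
#   kernel_later_than_4_15 "$(uname -r)" && echo "kernel is later than 4.15"
```

If the function fails for the running kernel, the kernel is 4.15 or earlier and the update steps above still apply.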