This topic describes how to use the crash_kexec_post_notifiers parameter to resolve an issue in which no vmcore file is generated after a system panic on an Elastic Compute Service (ECS) instance that runs Alibaba Cloud Linux 3.
Limits on operating systems
Alibaba Cloud Linux 3
Anolis OS 8.x
Problem description
When a system panic occurs on an instance on which the crash_kexec_post_notifiers parameter is set to Y, no vmcore file is generated.
Cause
When a system panic occurs on an instance on which the crash_kexec_post_notifiers parameter is set to Y, the crashed kernel runs the Panic Notifiers before it jumps to the kdump kernel. If an error occurs in a Notifiers function, for example, a deadlock, the kernel never reaches the kdump kernel. As a result, no vmcore file is generated.
Panic
An emergency measure taken by an operating system when the operating system detects a fatal error from which the operating system cannot safely recover.
Panic Notifiers
A mechanism in the Linux kernel that allows registered modules or system components to execute callback functions when a panic occurs, before the kernel reboots or jumps to the kdump kernel. The callback functions perform operations such as resource cleanup, fault logging, and collection of system status data.
kdump
The kernel crash dumping mechanism in Linux. When the kernel panics, kdump boots a second kernel (the kdump kernel) that saves the memory of the crashed kernel as a vmcore file for troubleshooting.
vmcore
A memory image generated by the kernel crash dumping mechanism when the Linux kernel crashes.
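To confirm that an instance matches the condition described in the Cause section, you can read the current value of the parameter. The following is a minimal sketch, not an official tool. It assumes the sysfs path used elsewhere in this topic; the function name notifiers_state and its optional file argument are illustrative and exist only to make the check easy to try against a test file.

```shell
#!/bin/sh
# Sketch: report whether crash_kexec_post_notifiers is currently enabled.
# The sysfs path is the one used in this topic. The function name and the
# optional file argument are hypothetical helpers for illustration.

PARAM_FILE=/sys/module/kernel/parameters/crash_kexec_post_notifiers

notifiers_state() {
    # $1: optional path to read instead of the sysfs file
    file="${1:-$PARAM_FILE}"
    if [ ! -r "$file" ]; then
        # The file is missing or unreadable, so the state cannot be determined.
        echo "unknown"
        return 0
    fi
    value=$(cat "$file")
    case "$value" in
        Y|y|1) echo "enabled" ;;
        N|n|0) echo "disabled" ;;
        *)     echo "unknown" ;;
    esac
}

# Print the state for the running kernel.
notifiers_state
```

If the script prints enabled, the Panic Notifiers run before the jump to the kdump kernel, which is the situation described in this topic.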
Solution
Check the status of the kdump service.
sudo kdumpctl status
View the command output.
If the "kdump: Kdump is operational" message appears in the command output, kdump runs as expected.
If the "kdump: Kdump is not operational" message appears in the command output, kdump is not running. Start kdump.
sudo kdumpctl start
Check whether a vmcore file is generated after the system panic.
Replace <ip-time> with the name of the folder in which vmcore files are stored. Example: 127.0.0.1-2024-10-11-15:46:52.
ls /var/crash/<ip-time>
If no vmcore file is generated, change the value of the crash_kexec_post_notifiers parameter to N. Then, check whether a vmcore file is generated.
Temporarily change the value of the crash_kexec_post_notifiers parameter to N.
sudo sh -c 'echo N > /sys/module/kernel/parameters/crash_kexec_post_notifiers'
Permanently change the value of the crash_kexec_post_notifiers parameter to N.
Change the value of the crash_kexec_post_notifiers parameter to N.
sudo grubby --update-kernel="/boot/vmlinuz-$(uname -r)" --args="crash_kexec_post_notifiers=N"
Restart the instance for the change to take effect.
Warning: The restart operation stops the instance for a short period of time and may interrupt the services that are running on the instance. This may result in data loss. Therefore, we recommend that you back up critical instance data before you restart the instance. We also recommend that you restart the instance during off-peak hours.
sudo reboot
If no vmcore file is generated after you change the value of the crash_kexec_post_notifiers parameter to N, the issue may be caused by other reasons. In this case, use the kdumpctl tool to troubleshoot the issue. For more information, see Use the kdumpctl tool to view the boot logs for kernel crash dumps.
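The check for a vmcore file in the preceding steps can also be scripted so that you do not need to know the <ip-time> folder name in advance. The following is a minimal sketch under stated assumptions: dumps are stored in /var/crash/<ip-time>/vmcore as shown in this topic, and the function names list_vmcores and has_vmcore are hypothetical helpers. If kdump on your instance is configured to write dumps to a different location, adjust CRASH_DIR.

```shell
#!/bin/sh
# Sketch: look for vmcore files one level below the crash directory.
# /var/crash is the location used in this topic. The function names are
# hypothetical helpers for illustration.

CRASH_DIR=/var/crash

list_vmcores() {
    # $1: optional directory to search instead of CRASH_DIR
    dir="${1:-$CRASH_DIR}"
    [ -d "$dir" ] || return 0
    # Match <dir>/<ip-time>/vmcore only. Errors such as missing read
    # permissions are suppressed.
    find "$dir" -mindepth 2 -maxdepth 2 -type f -name vmcore 2>/dev/null | sort
}

has_vmcore() {
    # Succeeds if at least one vmcore file is found.
    [ -n "$(list_vmcores "$1")" ]
}

if has_vmcore; then
    list_vmcores
else
    echo "No vmcore file found under $CRASH_DIR"
fi
```

Reading /var/crash may require root permissions, so you may need to run the script with sudo.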