Alibaba Cloud Linux: How do I use the crash_kexec_post_notifiers parameter to resolve the issue that vmcore files are not generated?

Last Updated: Nov 06, 2024

This topic describes how to use the crash_kexec_post_notifiers parameter to resolve an issue in which no vmcore file is generated after a system panic on an Elastic Compute Service (ECS) instance that runs Alibaba Cloud Linux 3 or Anolis OS 8.x.

Limits on operating systems

  • Alibaba Cloud Linux 3

  • Anolis OS 8.x

Problem description

When a system panic occurs on an instance on which the crash_kexec_post_notifiers parameter is set to Y, no vmcore file is generated.

Cause

When crash_kexec_post_notifiers is set to Y, the kernel that panicked runs the panic notifiers before it jumps to the kdump kernel. If an error occurs in a notifier function, for example, the function causes a deadlock, the kernel never reaches the kdump kernel. As a result, no vmcore file is generated.

Note
  • Panic

    An emergency measure that an operating system takes when it detects a fatal error from which it cannot safely recover.

  • Panic Notifiers

    A mechanism in the Linux kernel that allows registered modules or system components to run callback functions when a panic occurs, before the kernel reboots or jumps to the kdump kernel. The callback functions perform operations such as resource cleanup, fault logging, and collection of system status data.

  • kdump

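    A kexec-based kernel crash dumping mechanism in Linux. When the kernel panics, kdump boots a second kernel (the kdump kernel) from reserved memory, and the kdump kernel saves the memory of the crashed kernel as a vmcore file for troubleshooting.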

  • vmcore

    A memory image generated by the kernel crash dumping mechanism when the Linux kernel crashes.
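Before you change anything, it can help to confirm the value that the running kernel currently uses. The following is a minimal sketch that reads the same sysfs file that is written in the Solution section. The function name show_post_notifiers and its optional path argument are illustrative and not part of any tool; the argument exists only so that the function can be tried against an ordinary file.

```shell
#!/bin/sh
# Print the current value of crash_kexec_post_notifiers.
# show_post_notifiers is a hypothetical helper name; the optional first
# argument overrides the sysfs path so the function can be tried on a test file.
show_post_notifiers() {
    param_file="${1:-/sys/module/kernel/parameters/crash_kexec_post_notifiers}"
    if [ -r "$param_file" ]; then
        value=$(cat "$param_file")
        echo "crash_kexec_post_notifiers=$value"
    else
        echo "crash_kexec_post_notifiers: $param_file is not readable"
    fi
}

show_post_notifiers
```

If the output shows Y, the condition described in this topic applies to the instance.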

Solution

  1. Check the status of the kdump service.

    sudo kdumpctl status

    View the command output.

    • If the kdump: Kdump is operational message appears in the command output, kdump runs as expected.

    • If the kdump: Kdump is not operational message appears in the command output, kdump is not running. Start kdump.

      sudo kdumpctl start
  2. Check whether a vmcore file is generated after the system panic.

    Replace <ip-time> with the name of the folder in which vmcore files are stored. Example: 127.0.0.1-2024-10-11-15:46:52.

    ls /var/crash/<ip-time>

    If no vmcore file is generated, change the value of the crash_kexec_post_notifiers parameter to N. Then, check whether a vmcore file is generated the next time a system panic occurs.

    • Temporarily change the value of the crash_kexec_post_notifiers parameter to N. The change takes effect immediately and is lost after the instance restarts.

      sudo sh -c 'echo N > /sys/module/kernel/parameters/crash_kexec_post_notifiers'
    • Permanently change the value of the crash_kexec_post_notifiers parameter to N.

      1. Change the value of the crash_kexec_post_notifiers parameter to N.

        sudo grubby --update-kernel="/boot/vmlinuz-$(uname -r)" --args="crash_kexec_post_notifiers=N"
      2. Restart the instance for the change to take effect.

        Warning

        The restart operation stops the instance for a short period of time and may interrupt the services that are running on the instance. This may result in data loss. Therefore, we recommend that you back up critical instance data before you restart the instance. We also recommend that you restart the instance during off-peak hours.

        sudo reboot
  3. If no vmcore file is generated after you change the value of the crash_kexec_post_notifiers parameter to N, the issue may be caused by other reasons. In this case, use the kdumpctl tool to troubleshoot the issue. For more information, see Use the kdumpctl tool to view the boot logs for kernel crash dumps.
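If you do not know the name of the <ip-time> folder in step 2, a short loop over the crash directory can list any vmcore files that exist. The following is a sketch under the assumption that kdump saves dumps to /var/crash, as in step 2. The function name list_vmcores is illustrative, and the function matches only files that are named exactly vmcore, one level below the crash directory.

```shell
#!/bin/sh
# List every file named vmcore one level below the crash directory.
# list_vmcores is a hypothetical helper name. The default /var/crash matches
# the path used in step 2; pass another directory as the first argument if
# kdump is configured to save dumps elsewhere.
list_vmcores() {
    crash_dir="${1:-/var/crash}"
    found=0
    for f in "$crash_dir"/*/vmcore; do
        # When nothing matches, the pattern stays literal and the test fails.
        [ -e "$f" ] || continue
        echo "$f"
        found=1
    done
    if [ "$found" -eq 0 ]; then
        echo "no vmcore found under $crash_dir"
    fi
}

list_vmcores
```

Run the script with sudo if the current user cannot read the crash directory. Each printed path corresponds to one saved dump.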