Alibaba Cloud Linux: Use KFENCE to detect kernel memory pollution

Last Updated: Oct 25, 2024

Kernel Electric-Fence (KFENCE) is a built-in Linux kernel tool that can be enabled in an online environment. KFENCE detects memory pollution issues in the kernel and kernel modules. When KFENCE detects a memory pollution issue, it generates an error message that contains the details of the issue. Alibaba Cloud enhanced KFENCE in Alibaba Cloud Linux 3 so that you can enable or disable it dynamically and use it to detect memory pollution issues in both online detection and offline debugging scenarios.

Limits on operating systems

  • x86 architecture

    Alibaba Cloud Linux 3 whose kernel version is 5.10.84-10 or later

  • Arm architecture

    Alibaba Cloud Linux 3 whose kernel version is 5.10.134-16 or later

Note
  • If you are a developer of the kernel or kernel modules, you can use KFENCE to check whether memory pollution occurs in the kernel or kernel modules.

  • If you are a regular user and a kernel crash occurs, you can use KFENCE to help Alibaba Cloud or third-party driver developers collect detailed information about the kernel crash.
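The minimum kernel versions listed under "Limits on operating systems" can be checked before you enable KFENCE. The following is a minimal sketch, not an official tool: kernel_at_least is a hypothetical helper, and it assumes that the output of uname -r starts with a version in the a.b.c-d format, such as 5.10.134-16.al8.x86_64.

```shell
#!/bin/bash
# Hypothetical helper: return 0 if kernel release $1 is at least version $2.
# Assumes both values start with four numeric fields in the form a.b.c-d.
kernel_at_least() {
    local cur min i
    # Split on "." and "-" so that 5.10.134-16.al8.x86_64 becomes 5 10 134 16 ...
    IFS='.-' read -r -a cur <<< "$1"
    IFS='.-' read -r -a min <<< "$2"
    for i in 0 1 2 3; do
        (( ${cur[i]:-0} > ${min[i]:-0} )) && return 0
        (( ${cur[i]:-0} < ${min[i]:-0} )) && return 1
    done
    return 0
}

# Minimum versions from the limits above, selected by architecture.
case "$(uname -m)" in
    x86_64)  min_version=5.10.84-10 ;;
    aarch64) min_version=5.10.134-16 ;;
    *)       min_version= ;;
esac

if [ -z "$min_version" ]; then
    echo "unsupported architecture: $(uname -m)"
elif kernel_at_least "$(uname -r)" "$min_version"; then
    echo "kernel $(uname -r) meets the minimum version $min_version"
else
    echo "kernel $(uname -r) is earlier than $min_version"
fi
```

The comparison is done field by field on integers rather than on strings, so that 5.10.134 is correctly treated as later than 5.10.84.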

Terms

Term

Description

memory pollution

Memory pollution occurs when memory areas are incorrectly modified or corrupted while a program is running, which causes the program to behave abnormally or crash. Memory pollution can be caused by programming errors, software vulnerabilities, malware, or hardware failures.

slab

Slab allocation is an efficient memory allocation mechanism in the Linux kernel. The kernel uses slabs to pre-allocate a specific number of memory objects in a memory cache pool for quick memory allocation and release. Slabs can be used to prevent frequent memory allocation and release operations and improve the efficiency of memory allocation.

order-0 page

Order-0 page allocation is a memory allocation mechanism in the Linux kernel. Memory is divided into fixed-size blocks called page frames. In most cases, the size of an order-0 page frame is 4 KiB. An order-0 page is a 4-KiB page frame, which is the basic unit for memory allocation. When an application or the kernel requires small blocks of memory, memory is allocated by order-0 pages.

Enable KFENCE

KFENCE is used in the following business scenarios:

Online detection scenario

Scenario 1: Use KFENCE to detect whether a memory pollution issue occurs

Note

In the following example, KFENCE consumes up to 2 MiB of memory and does not affect performance.

  • Add the kfence.sample_interval parameter to enable KFENCE.

    Replace <kfence.sample_interval> with the value that you want to specify. For example, a value of 100 specifies that KFENCE is automatically enabled the next time the system starts and that the sampling interval is set to 100 milliseconds.

    sudo grubby --update-kernel=/boot/vmlinuz-$(uname -r) --args="kfence.sample_interval=<kfence.sample_interval>"
  • Add the kfence.booting_max parameter to limit the maximum amount of memory that KFENCE can consume based on the memory specifications.

    Note
    • In kernel version 5.10.134-17 or later, the default configuration kfence.booting_max=0-2G:0,2G-32G:2M,32G-:32M is added to the boot command-line parameter list. This default is used together with the default value (255) of the num_objects parameter to ensure that the memory overhead of KFENCE does not exceed 1‰ of the total memory for all memory specifications. With these defaults, KFENCE can consume up to 2 MiB of memory if standard 4-KiB memory pages are used and up to 32 MiB of memory if 64-KiB huge pages are used.

    • The kfence.booting_max parameter only limits the maximum amount of memory that KFENCE can consume. The parameter is a constraint for the num_objects parameter and does not represent the actual memory overhead. The actual memory overhead is less than or equal to the value of the kfence.booting_max parameter.

    Replace <kfence.booting_max> with the value that you want to specify, such as 0-128M:0,128M-256M:1M,256M-:2M. Description of the segments in the sample value:

    • 0-128M:0: If the total memory on the machine that you use is less than 128 MiB in size, KFENCE is disabled.

    • 128M-256M:1M: If the total memory on the machine that you use is larger than or equal to 128 MiB but less than or equal to 256 MiB in size, KFENCE can consume up to 1 MiB of memory. The value of the num_objects parameter cannot exceed 127.

    • 256M-:2M: If the total memory on the machine that you use is larger than 256 MiB in size, KFENCE can consume up to 2 MiB of memory. The value of the num_objects parameter cannot exceed 255.

    sudo grubby --update-kernel=/boot/vmlinuz-$(uname -r) --args="kfence.booting_max=<kfence.booting_max>"

    The preceding configuration applies only when KFENCE is started at system startup by adding parameters to the boot command-line parameter list. The configuration does not take effect in Scenario 2, in which KFENCE is configured after the system starts.
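To see how a kfence.booting_max value maps total memory to a KFENCE memory limit, the segment rules described above can be sketched as follows. This is an illustration rather than the kernel's parser: to_bytes and booting_max_for are hypothetical functions, and the sketch assumes that each range includes its lower bound and excludes its upper bound, so the behavior exactly at a boundary may differ from the kernel.

```shell
#!/bin/bash
# Convert a size such as 128M or 2G to bytes (plain numbers are bytes).
to_bytes() {
    local v=$1
    case "$v" in
        *K) echo $(( ${v%K} * 1024 )) ;;
        *M) echo $(( ${v%M} * 1024 * 1024 )) ;;
        *G) echo $(( ${v%G} * 1024 * 1024 * 1024 )) ;;
        *)  echo $(( v )) ;;
    esac
}

# Print the limit that a booting_max string assigns to a machine with the
# given total memory. Assumption: ranges are [lower, upper).
booting_max_for() {
    local spec=$1 total seg range limit lo hi
    total=$(to_bytes "$2")
    local IFS=','
    for seg in $spec; do
        range=${seg%%:*}; limit=${seg##*:}
        lo=${range%%-*}; hi=${range#*-}
        [ "$total" -lt "$(to_bytes "$lo")" ] && continue
        # An empty upper bound (for example 256M-) means "no upper limit".
        if [ -z "$hi" ] || [ "$total" -lt "$(to_bytes "$hi")" ]; then
            echo "$limit"
            return 0
        fi
    done
    return 1
}

booting_max_for "0-128M:0,128M-256M:1M,256M-:2M" 192M   # prints 1M
booting_max_for "0-2G:0,2G-32G:2M,32G-:32M" 8G          # prints 2M
```

A result of 0 corresponds to the case in which KFENCE is disabled for that memory size.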

The configuration automatically takes effect the next time the system starts.
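After the restart, you may want to confirm that the parameters were applied. The following is a minimal sketch, assuming that the running kernel exposes its boot parameters in /proc/cmdline. kfence_args is a hypothetical helper name.

```shell
#!/bin/bash
# Hypothetical helper: print each kfence.* argument found in a kernel
# command line string, one per line. Returns non-zero if none is found.
kfence_args() {
    tr ' ' '\n' <<< "$1" | grep '^kfence\.'
}

if [ -r /proc/cmdline ] && kfence_args "$(cat /proc/cmdline)"; then
    echo "KFENCE boot parameters are present"
else
    echo "no kfence.* parameters on the kernel command line"
fi
```

The value that is currently in effect can also be read from /sys/module/kfence/parameters/sample_interval, the same path that is used to disable KFENCE later in this topic.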

Scenario 2: Use KFENCE to monitor a specific slab type for memory pollution issues

Important

In this scenario, KFENCE consumes GiB-level memory. Exercise caution when you use a machine with a small amount of memory.

  1. Create a memory allocation script and add the following content. In the following example, the script is named kfence.sh and the slab type to be monitored is kmalloc-64.

    #!/bin/bash
    # usage: ./kfence.sh kmalloc-64
    
    SLAB_PREFIX=/sys/kernel/slab
    MODULE_PREFIX=/sys/module/kfence/parameters
    
    if [ $# -eq 0 ]; then
    	echo "err: please input slabs"
    	exit 1
    fi
    
    #check whether slab exists
    for i in $@; do
    	slab_path=$SLAB_PREFIX/$i
    	if [ ! -d $slab_path ]; then
    		echo "err: slab $i not exist!"
    		exit 1
    	fi
    done
    
    #calculate num_objects
    sumobj=0
    for i in $@; do
    	objects=($(cat $SLAB_PREFIX/$i/objects))
    	maxobj=1
    	for ((j=1; j<${#objects[@]}; j++)); do
    		nodeobj=$(echo ${objects[$j]} | awk -F= '{print $2}')
    		[ $maxobj -lt $nodeobj ] && maxobj=$nodeobj
    	done
    	((sumobj += maxobj))
    done
    echo "recommend num_objects per node: $sumobj"
    
    #check kfence stats
    if [ $(cat $MODULE_PREFIX/sample_interval) -ne 0 ]; then
    	echo "kfence is running, disable it and wait..."
    	echo 0 > $MODULE_PREFIX/sample_interval
    	sleep 1
    fi
    
    #disable all slabs catching
    for file in $SLAB_PREFIX/*
    do
    	(echo 0 > $file/kfence_enable) 2>/dev/null || echo 1 > $file/skip_kfence
    done
    
    #disable order0 page catching
    echo 0 > $MODULE_PREFIX/order0_page
    
    #enable setting slabs catching
    for i in $@; do
    	(echo 1 > $SLAB_PREFIX/$i/kfence_enable) 2>/dev/null || echo 0 > $SLAB_PREFIX/$i/skip_kfence
    done
    
    #setting num_objects and node mode
    echo $sumobj > $MODULE_PREFIX/num_objects
    echo node > $MODULE_PREFIX/pool_mode
    
    #start kfence
    echo -1 > $MODULE_PREFIX/sample_interval
    if [ $? -ne 0 ]; then
    	echo "err: kfence enable fail!"
    	exit 1
    fi
    echo "kfence enabled!"

    The script reads the number of active objects in the specified slabs, estimates an appropriate KFENCE pool size based on that number, and then enables KFENCE in full mode to monitor memory allocations from only those slabs.

    Note

    Slabs are commonly used in memory management to optimize memory allocation and release operations. This improves system performance and efficiency. KFENCE can monitor slabs and order-0 pages.

  2. Run the following command to execute the script to start memory pollution detection:

    sudo bash ./kfence.sh kmalloc-64
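The num_objects estimate in the script comes from the per-node counts in the objects file of each slab. The following sketch isolates that step. It assumes that the file contains a total followed by per-node entries, such as 1234 N0=600 N1=634 (sample values, not real output). max_node_objects is a hypothetical function name.

```shell
#!/bin/bash
# Hypothetical helper: given the content of /sys/kernel/slab/<slab>/objects,
# print the largest per-node object count (at least 1, as in kfence.sh).
max_node_objects() {
    local fields=($1) max=1 n j
    # Field 0 is the total; per-node entries (N0=..., N1=...) start at field 1.
    for ((j = 1; j < ${#fields[@]}; j++)); do
        n=${fields[j]#*=}
        (( n > max )) && max=$n
    done
    echo "$max"
}

max_node_objects "1234 N0=600 N1=634"   # prints 634
```

The script adds up this value for every slab that you pass in and writes the sum to num_objects. Because the pool mode is set to node, the pool is sized for the busiest node rather than for the total across nodes.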

Offline debugging scenario

Enable KFENCE by specifying parameters for the x86 architecture

  1. Run the following commands to enable KFENCE:

    sudo grubby --update-kernel=/boot/vmlinuz-$(uname -r) --args="kfence.num_objects=1000000"
    sudo grubby --update-kernel=/boot/vmlinuz-$(uname -r) --args="kfence.sample_interval=-1"
    sudo grubby --update-kernel=/boot/vmlinuz-$(uname -r) --args="kfence.fault=panic"
    • num_objects: the size of the KFENCE pool, which is the maximum number of slab objects that KFENCE can monitor.

      • When the value of the num_objects parameter is smaller than or equal to 131071, the maximum amount of memory that KFENCE can consume is calculated by using the following formula: (num_objects + 1) × 8 KiB.

      • When the value of the num_objects parameter is greater than 131071, the maximum amount of memory that KFENCE can consume is calculated by using the following formula: ⌈num_objects/131071⌉ GiB. The ⌈⌉ symbols specify that the calculation result is rounded up to the nearest integer.

        Note

        We recommend that you set the num_objects parameter so that KFENCE consumes about 10% of the maximum available memory. For example, if you set the num_objects parameter to 1,000,000, KFENCE can consume up to 8 GiB of memory, which is calculated by using the following formula: ⌈ 1,000,000/131071 ⌉ GiB = 8 GiB.

    • sample_interval: the interval at which memory is monitored. Valid values:

      • 0: KFENCE is disabled and does not monitor memory.

      • Positive number: the sampling interval in milliseconds. For example, a value of 100 specifies that KFENCE samples a memory allocation every 100 milliseconds.

      • Negative number: the full mode. KFENCE monitors all memory that meets a specific condition, such as a specific slab type.

    • fault: This parameter is introduced in kernel version 5.10.134-16. Default value: report. If you set the fault parameter to panic, the instance on which an issue is detected goes down, which preserves the core dump file generated when the issue occurred.

  2. Restart the operating system to allow the configurations to take effect.

    For more information, see Restart an instance.
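The two num_objects formulas above can be checked with a short calculation. The following sketch uses the hypothetical function name kfence_pool_kib and returns the maximum memory consumption in KiB.

```shell
#!/bin/bash
# Hypothetical helper: maximum KFENCE memory consumption in KiB for a given
# num_objects value, following the two formulas described above.
kfence_pool_kib() {
    local n=$1
    if (( n <= 131071 )); then
        # (num_objects + 1) x 8 KiB
        echo $(( (n + 1) * 8 ))
    else
        # Round num_objects/131071 up to whole GiB, then convert to KiB.
        echo $(( (n + 131070) / 131071 * 1024 * 1024 ))
    fi
}

kfence_pool_kib 255       # prints 2048 (2 MiB)
kfence_pool_kib 1000000   # prints 8388608 (8 GiB)
```

The first result matches the 2 MiB limit that applies to the default num_objects value of 255, and the second matches the 8 GiB example in the note.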

Use a script to enable KFENCE for the x86 or Arm architecture

Note
  • After you run a script to enable KFENCE, KFENCE cannot detect the memory pollution issues that may occur during kernel startup.

  • If you want to change the value of the num_objects or sample_interval parameter after you enable KFENCE, you must first disable KFENCE.

Run the following commands to enable KFENCE:

sudo sh -c 'echo 1000000 > /sys/module/kfence/parameters/num_objects'
sudo sh -c 'echo -1 > /sys/module/kfence/parameters/sample_interval'
sudo sh -c 'echo panic > /sys/module/kfence/parameters/fault'
  • num_objects: the size of the KFENCE pool, which is the maximum number of slab objects that KFENCE can monitor. The maximum amount of memory that KFENCE can consume is calculated by using the following formula: ⌈num_objects/131071⌉ GiB. The ⌈⌉ symbols specify that the calculation result is rounded up to the nearest integer.

    Note

    We recommend that you set the num_objects parameter so that KFENCE consumes about 10% of the maximum available memory. For example, if you set the num_objects parameter to 1,000,000, KFENCE can consume up to 8 GiB of memory, which is calculated by using the following formula: ⌈ 1,000,000/131071 ⌉ GiB = 8 GiB.

  • sample_interval: the interval at which memory is monitored. Valid values:

    • 0: KFENCE is disabled and does not monitor memory.

    • Positive number: the sampling interval in milliseconds. For example, a value of 100 specifies that KFENCE samples a memory allocation every 100 milliseconds.

    • Negative number: the full mode. KFENCE monitors all memory that meets a specific condition, such as a specific slab type.

  • fault: This parameter is introduced in kernel version 5.10.134-16. Default value: report. If you set the fault parameter to panic, the instance on which an issue is detected goes down, which preserves the core dump file generated when the issue occurred.

    Note

    If your kernel version is earlier than 5.10.134-16, the command that sets the fault parameter reports an error because the parameter does not exist in that version. The error does not affect KFENCE, and you can ignore it.
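As noted above, KFENCE must be disabled before num_objects or sample_interval is changed. The sequence can be sketched as follows. kfence_reconfigure is a hypothetical helper that takes the parameter directory as its first argument, and the one-second wait mirrors the kfence.sh script earlier in this topic rather than a documented requirement.

```shell
#!/bin/bash
# Hypothetical helper: disable KFENCE, change num_objects, then re-enable
# KFENCE with the given sample_interval.
#   $1: parameter directory, $2: new num_objects, $3: new sample_interval
kfence_reconfigure() {
    local dir=$1 num_objects=$2 interval=$3
    echo 0 > "$dir/sample_interval" || return 1   # disable first
    sleep 1                                       # assumption: brief wait, as in kfence.sh
    echo "$num_objects" > "$dir/num_objects" || return 1
    echo "$interval" > "$dir/sample_interval"     # re-enable
}

# Usage (run as root):
#   kfence_reconfigure /sys/module/kfence/parameters 500000 -1
```

The value 500000 is only an example. Choose it based on the memory formula described above.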

View results

After KFENCE detects memory pollution issues, you can view the number of issues and detailed error messages.

  • View the number of detected memory pollution issues.

    sudo cat /sys/kernel/debug/kfence/stats

    The following figure shows the command output, in which the total bugs count has increased.

    [Figure: sample output of /sys/kernel/debug/kfence/stats]

  • View the details of error messages.

    dmesg | grep -i kfence

    The following figure shows the command output, which contains one error message.

    [Figure: sample dmesg output that contains one KFENCE error message]
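To track the bug count over time without reading the whole stats file, the relevant line can be extracted. The following sketch assumes that the file contains a line of the form total bugs: N, as in the upstream KFENCE implementation. The output on Alibaba Cloud Linux may contain additional lines or use a different layout. total_bugs is a hypothetical helper.

```shell
#!/bin/bash
# Hypothetical helper: read KFENCE stats on standard input and print the
# value of the "total bugs" line. Assumes lines of the form "name: value".
total_bugs() {
    awk -F': *' '$1 == "total bugs" { print $2 }'
}

# Usage:
#   sudo cat /sys/kernel/debug/kfence/stats | total_bugs
```

If the printed value grows between two runs, KFENCE has detected new issues in the meantime, and the details are available in the dmesg output.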

Disable KFENCE

  • Run the following command to disable KFENCE:

    sudo bash -c 'echo 0 > /sys/module/kfence/parameters/sample_interval'

    After you disable KFENCE, KFENCE no longer detects memory pollution issues. When all monitored memory in the pool is released, KFENCE returns the memory to the kernel buddy system at a granularity of 1 GiB.

  • If KFENCE is started by adding parameters to the boot command-line parameter list, you can run the following command to remove the parameter. KFENCE is then no longer automatically enabled the next time the system starts.

    sudo grubby --update-kernel=/boot/vmlinuz-$(uname -r) --remove-args="kfence.sample_interval"

FAQ

  • What are the impacts of KFENCE on memory and performance?

    • Impacts on memory

      KFENCE trades higher memory overhead for lower performance interference, so it can consume a large amount of memory. If the restart-triggered sampling mode (supported by the Linux community) is used, you can set the num_objects parameter to a smaller value to conserve memory. If the full mode is used or KFENCE is dynamically enabled, GiB-level memory is consumed. Exercise caution when you use small-memory machines.

    • Impacts on performance

      • In sampling mode, the impact on performance is small.

      • In full mode, the impact on performance is acceptable if only memory that meets a specific condition, such as memory of a specific slab type, is monitored.

      Note
      • We recommend that you perform a phased test based on the actual business scenario to observe the impacts of enabling KFENCE on the actual business performance and then determine the subsequent deployment.

      • In the offline debugging scenario, if the full mode is used to monitor the memory of all types of slabs, performance and memory usage are greatly affected. However, in this scenario, KFENCE is used to pinpoint issues regardless of the impact on performance.

  • What is the difference between KFENCE and Kernel Address Sanitizer (KASAN)?

    KFENCE and KASAN are built-in Linux kernel tools that detect memory pollution. Alibaba Cloud enhanced KFENCE in kernel version 5.10. KFENCE can be enabled and disabled in a more flexible manner, supports sampling, and can run in an online business environment. The following section describes the functional differences between KFENCE and KASAN:

    • KFENCE can monitor slab objects up to 4 KiB in size (such as kmalloc-4k) and order-0 pages. KASAN can monitor more types of memory, including memory of all types of slabs, pages of memory, stack memory, and global memory.

    • KFENCE has a higher success rate than KASAN in detecting abnormal memory behaviors within the monitoring range.

    • KFENCE has a higher memory overhead than KASAN but a smaller impact on service performance.

    In most cases, we recommend that you do not use KFENCE and KASAN at the same time. KFENCE takes over the monitoring objects of KASAN.

  • How stable is KFENCE?

    A known issue exists in kernel version 5.10.134-15 and earlier. When KFENCE monitors both order-0 pages and slabs, downtime may occur in specific scenarios. To prevent this issue, run the following command to stop KFENCE from monitoring order-0 pages:

    sudo grubby --update-kernel=/boot/vmlinuz-$(uname -r) --args="kfence.order0_page=0"