When the Linux operating system is up for an extended period of time, memory fragmentation may occur. This topic provides solutions to this issue.
Problem description
When a service deployed on an instance occasionally takes a long time to respond, or when a system call takes a long time to complete, the sys metric value of the system increases accordingly. The buddy system lacks high-order memory (memory whose order is higher than 3). The following example shows the output of the cat /proc/buddyinfo command. Each number after the zone name indicates the number of free memory blocks at a different order of the buddy system, starting from order 0.
cat /proc/buddyinfo
Node 0, zone DMA 1 0 0 1 2 1 1 0 1 1 3
Node 0, zone DMA32 3173 856 529 0 0 0 0 0 0 0 0
Node 0, zone Normal 19030 8688 7823 0 0 0 0 0 0 0 0
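To make the check above repeatable, the following is a minimal sketch that sums the free blocks at or above a given order for each zone. The function name high_order_free is hypothetical, and the sketch assumes the "Node N, zone NAME counts..." layout shown in the example output, with the first count being order 0.

```shell
# Sum the free blocks at or above a given order for each zone.
# Input: text in /proc/buddyinfo format on stdin.
# $1 = lowest order to include (default 4, the first order higher than 3).
# Assumption: lines look like "Node N, zone NAME c0 c1 ...", c0 = order 0.
high_order_free() {
    awk -v minorder="${1:-4}" '
        $1 == "Node" && $3 == "zone" {
            node = $2
            sub(/,/, "", node)          # "0," -> "0"
            total = 0
            # Field 5 holds order 0, so order k sits in field 5 + k.
            for (i = 5 + minorder; i <= NF; i++) total += $i
            printf "node=%s zone=%s order>=%d free_blocks=%d\n", node, $4, minorder, total
        }'
}

# Usage on a live system:
#   high_order_free 4 < /proc/buddyinfo
```

For the example output above, this reports 0 free blocks at order 4 or higher for both the DMA32 and Normal zones, which matches the symptom described.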
Possible causes
When the Linux operating system is up for an extended period of time, large contiguous chunks of physical memory are split into small blocks. If a service then requires a large contiguous chunk, the system starts a memory compaction procedure that takes time and may cause performance jitter. In this case, memory compaction functions typically appear in the kernel stack. The following example shows such a kernel stack:
0xffffffff8118f9cb compaction_alloc ([kernel.kallsyms])
0xffffffff811c88a9 migrate_pages ([kernel.kallsyms])
0xffffffff811901ee compact_zone ([kernel.kallsyms])
0xffffffff8119041b compact_zone_order ([kernel.kallsyms])
0xffffffff81190735 try_to_compact_pages ([kernel.kallsyms])
0xffffffff81631cb4 __alloc_pages_direct_compact ([kernel.kallsyms])
0xffffffff811741d5 __alloc_pages_nodemask ([kernel.kallsyms])
0xffffffff811b5a79 alloc_pages_current ([kernel.kallsyms])
0xffffffff811c0005 new_slab ([kernel.kallsyms])
0xffffffff81633848 __slab_alloc ([kernel.kallsyms])
0xffffffff811c5291 __kmalloc_node_track_caller ([kernel.kallsyms])
0xffffffff8151a8c1 __kmalloc_reserve.isra.30 ([kernel.kallsyms])
0xffffffff8151b7cd __alloc_skb ([kernel.kallsyms])
0xffffffff815779e9 sk_stream_alloc_skb ([kernel.kallsyms])
0xffffffff8157872d tcp_sendmsg ([kernel.kallsyms])
0xffffffff815a26b4 inet_sendmsg ([kernel.kallsyms])
0xffffffff81511017 sock_aio_write ([kernel.kallsyms])
0xffffffff811df729 do_sync_readv_writev ([kernel.kallsyms])
0xffffffff811e0cfe do_readv_writev ([kernel.kallsyms])
Solutions
You can take one of the following measures to troubleshoot memory fragmentation in
the Linux operating system:
- Adjust the min watermark
In most cases, set the min watermark to 1% to 3% of the total memory. We recommend 2%. When memory resources are insufficient, asynchronous reclaim is triggered. You can run the following command to adjust the min watermark:
sysctl -w vm.min_free_kbytes=<memtotal_kbytes * 2%>
In the preceding command, memtotal_kbytes * 2% indicates the memory size, in KB, that is 2% of the total memory of the instance. Replace it with the calculated integer value, and do not put spaces around the equal sign.
- Adjust the difference between the min and low watermarks
You can specify the watermark_scale_factor kernel parameter to adjust the difference between the min and low watermarks to cope with sudden memory demands. The default value of watermark_scale_factor is 10, which corresponds to 0.1% of the total memory. The minimum difference between the min and low watermarks is calculated based on the following formula: 0.5 × <min watermark>. You can run the following command to adjust the value of watermark_scale_factor:
sysctl -w vm.watermark_scale_factor=<value>
In the preceding command, value indicates your specified difference between the min and low watermarks, in units of 0.01% of the total memory. Do not put spaces around the equal sign.
- Perform regular memory compaction
You can run the following command to manually trigger memory compaction across all zones during off-peak hours:
echo 1 > /proc/sys/vm/compact_memory
- Manually drop the cache at regular intervals
If the preceding measures cannot effectively handle memory fragmentation, you can also drop the cache during off-peak hours so that the memory can be re-allocated. Dropping the cache effectively relieves memory fragmentation, but it may cause short-lived performance jitter. You can run the following command to drop the cache:
echo 3 > /proc/sys/vm/drop_caches
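For the two watermark measures above, the sizing arithmetic can be sketched as follows. The function names are hypothetical. The sketch assumes that watermark_scale_factor is expressed in units of 1/10000 of memory, as described in the kernel sysctl documentation, and it approximates the kernel's per-zone calculation by using MemTotal, so treat the results as estimates.

```shell
# Print a min_free_kbytes value for a given percentage of MemTotal.
# Input: /proc/meminfo-style text on stdin; $1 = percentage (default 2).
min_free_kbytes_for() {
    awk -v pct="${1:-2}" '$1 == "MemTotal:" { printf "%d\n", $2 * pct / 100 }'
}

# Estimate the watermark gap, in kB, that a scale factor produces.
# $1 = total memory in kB, $2 = watermark_scale_factor (units of 1/10000).
# Approximation: the kernel computes this per zone, not from MemTotal.
watermark_gap_kb() {
    awk -v total="$1" -v factor="$2" 'BEGIN { printf "%d\n", total * factor / 10000 }'
}

# Convert a desired gap, as a percentage of total memory, to a scale factor.
# Example: 0.1 (percent) corresponds to a factor of 10.
scale_factor_for_pct() {
    awk -v pct="$1" 'BEGIN { printf "%d\n", pct * 100 + 0.5 }'
}

# Usage on a live system (run as root):
#   sysctl -w vm.min_free_kbytes="$(min_free_kbytes_for 2 < /proc/meminfo)"
#   sysctl -w vm.watermark_scale_factor="$(scale_factor_for_pct 0.5)"
```

Values set with sysctl -w do not survive a reboot, so check the calculated numbers on a test instance before you persist them.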