This topic describes the cause of the following issue and how to resolve the issue: The Buffer I/O write performance of an Ext4 file system on an Elastic Compute Service (ECS) instance that runs Alibaba Cloud Linux 2 is not as expected.
Problem description
Buffer I/O write operations may not be performed as expected on an Ext4 file system on an ECS instance that meets the following conditions:
Image version: aliyun-2.1903-x64-20G-alibase-20190327.vhd or later.
Kernel version: kernel-4.19.24-9.al7 or later.
The dioread_nolock and nodelalloc options are used to mount Ext4 file systems.
Note: For information about how to view the types and mount options of file systems, see the "View the types and mount options of file systems" section of this topic.
For information about the performance of block storage devices, see EBS performance.
Common scenarios in which performance is not as expected:
Scenario 1: When you run the cp command to copy large files to an Ext4 file system on a disk, the copy takes a long time and the maximum copy rate is approximately 30 MB/s.
Scenario 2: When you run a dd command similar to the following one, without the sync flag, to write files to an Ext4 file system on a disk, the operation takes a long time.
dd if=/dev/zero of=/mnt/badfile bs=10M count=1000
When you run the iostat -xm 1 command on the instance to check the write speed, the value in the wMB/s column for the disk is approximately 30 MB/s. Sample command output:
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.00    0.00   12.77    0.00    0.00   87.23

Device:  rrqm/s   wrqm/s    r/s     w/s   rMB/s   wMB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
vda        0.00     0.00   0.00    0.00    0.00    0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
vdb        0.00  7194.00   0.00   57.00    0.00   28.05  1008.00     0.02   17.81    0.00   17.81   0.39   2.20
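To read the write speed without scanning the table by eye, the wMB/s value can be extracted from the iostat output. The following is a minimal sketch, not part of the original procedure: the function name write_mbps is hypothetical, and vdb is the data disk from the sample output above. The column is looked up by its header name because its position differs between sysstat versions.

```shell
# Print the wMB/s value for one device from `iostat -xm` output read on stdin.
# write_mbps is a hypothetical helper name used only for this illustration.
write_mbps() {
  awk -v dev="$1" '
    # Header row: remember which column is labeled wMB/s.
    /^Device/ { for (i = 1; i <= NF; i++) if ($i == "wMB/s") col = i; next }
    # Data row for the requested device: print that column.
    col && $1 == dev { print $col }
  '
}

# Usage: take three one-second samples and print the value for vdb each time.
#   iostat -xm 1 3 | write_mbps vdb
```

A value that stays near 30 MB/s while cp or dd is running matches the symptom described in this topic.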
Cause
This issue occurs because, when the file system is mounted with both the dioread_nolock and nodelalloc options, the kernel generates a large number of 4-KB dirty pages that correspond to unwritten extents. Due to defects in the processing logic of Ext4 file systems, these 4-KB dirty pages are written back directly without being merged into huge pages. If you use the perf tool to check the write-back process of the kernel, you find that the time is spent in the ext4_writepages() function of the Ext4 file system. A long period of time is spent on searching for and mapping 4-KB dirty pages, which causes a significant decrease in file write performance.
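One way to make this observation is to sample the whole system with perf while the slow write is running and then look for Ext4 symbols in the report. The following is a rough sketch under several assumptions: the perf tool is installed, the function name profile_writeback and the output path /tmp/writeback.perf.data are hypothetical, and the 10-second default window is arbitrary.

```shell
# Sample all CPUs with call graphs for a fixed window, then list the
# report lines that mention Ext4 symbols such as ext4_writepages.
# profile_writeback and the output path are hypothetical names.
profile_writeback() {
  seconds=${1:-10}
  sudo perf record -a -g -o /tmp/writeback.perf.data -- sleep "$seconds"
  sudo perf report -i /tmp/writeback.perf.data --stdio | grep -i ext4 | head -n 20
}

# Usage: start the slow cp or dd in another terminal, then run:
#   profile_writeback 10
```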
Solution
This is a known issue and does not have a permanent solution. You can perform the following steps to work around the issue:
Run the following command to remount the Ext4 file system so that the dioread_nolock and nodelalloc mount options are not used at the same time:
sudo mount -o remount,delalloc <$Device> <$Mount_point>
Note: Set <$Device> to the device name of the disk on which the Ext4 file system resides. Set <$Mount_point> to the mount point of the Ext4 file system.
To ensure that the change persists when the file system is automatically mounted on system startup, modify the /etc/fstab file and delete the nodelalloc option from the entry of the Ext4 file system. By default, the delalloc option is used for Ext4 file systems.
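The /etc/fstab change can be prepared with a small filter instead of a manual edit. The following is a minimal sketch, not an official tool: the function name strip_nodelalloc and the /tmp/fstab.new path are hypothetical. It reads an fstab on stdin and prints a copy in which nodelalloc is removed from the options field of every ext4 entry. Lines that are changed are rewritten with single spaces between fields; all other lines are printed as-is.

```shell
# Remove the nodelalloc option from ext4 entries of an fstab read on stdin.
# strip_nodelalloc is a hypothetical helper name used only for this illustration.
strip_nodelalloc() {
  awk '
    # Leave blank lines and comments untouched.
    NF == 0 || $1 ~ /^#/ { print; next }
    $3 == "ext4" {
      n = split($4, opts, ",")
      out = ""
      for (i = 1; i <= n; i++)
        if (opts[i] != "nodelalloc")
          out = out (out == "" ? "" : ",") opts[i]
      # An empty options field is invalid, so fall back to "defaults".
      if (out == "") out = "defaults"
      # Reassign the field only when it actually changed.
      if (out != $4) $4 = out
    }
    { print }
  '
}

# Usage: write the result to a scratch file and review it before replacing
# /etc/fstab (keep a backup of the original file).
#   strip_nodelalloc < /etc/fstab > /tmp/fstab.new
#   diff /etc/fstab /tmp/fstab.new
```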
View the types and mount options of file systems
Perform the following steps to view the file system type and mount options of the disk where the relevant directory resides:
Log on to the instance and run the following command to check the disk partition where the directory resides:
df <$DIR> | grep -v Filesystem | awk '{ print $1 }'
Note: Set <$DIR> to the directory on which write operations are performed.
Run the following command to view the file system type and mount options of the disk partition:
mount | grep -w <$Partition> | grep ext4 | grep -w dioread_nolock | grep -w nodelalloc
Note: Set <$Partition> to the name of the disk partition that you obtained in the previous step.
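The two steps above can also be combined into a single check. The following is a minimal sketch, assuming that the findmnt command from util-linux is available. The function name is_affected_mount and the /mnt path are hypothetical. The function succeeds only if the file system type is ext4 and the mount options include both dioread_nolock and nodelalloc.

```shell
# Succeed only if the file system is Ext4 and its comma-separated mount
# options include both dioread_nolock and nodelalloc.
# is_affected_mount is a hypothetical helper name used only for this illustration.
is_affected_mount() {
  fstype=$1
  opts=",$2,"   # Wrap in commas so that each option can be matched as a whole word.
  [ "$fstype" = "ext4" ] || return 1
  case "$opts" in *,dioread_nolock,*) ;; *) return 1 ;; esac
  case "$opts" in *,nodelalloc,*) ;; *) return 1 ;; esac
  return 0
}

# Usage, with /mnt standing in for the directory on which writes are performed:
#   set -- $(findmnt -n -o FSTYPE,OPTIONS --target /mnt)
#   is_affected_mount "$1" "$2" && echo "affected: dioread_nolock and nodelalloc are both set"
```

Matching each option between commas prevents delalloc from being mistaken for nodelalloc.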