When you make system calls such as read and copy_file_range to read files from a Network File System (NFS) file system on an Alibaba Cloud Linux 3 instance, the read performance may significantly degrade compared with when you do the same on an Alibaba Cloud Linux 2 instance. This topic describes the cause of and solution to the issue.
Problem description
Symptom
NFS file systems on an Alibaba Cloud Linux 3 instance experience poor file read performance and the following situations occur:
The instance takes a long time to read large files from the NFS file systems by making system calls such as read and copy_file_range.
The instance takes longer than an Alibaba Cloud Linux 2 instance to read data from files at NFS mount points by running the dd command. Sample command:
dd if=<nfs_mntpoint>/<testfile> of=/dev/null bs=1M
Note: The sample command reads data from the testfile file at the NFS mount point and writes the data to the /dev/null device. After the dd command is run, its output includes information such as the total number of bytes read and the amount of time consumed. You can use this information to calculate the read rate and evaluate the performance of the NFS file system.
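The rate calculation mentioned in the note can be sketched as follows. The helper name throughput_mibs and the sample figures are assumptions for illustration only. They are not part of dd or of the original procedure.

```shell
#!/bin/sh
# Hypothetical helper (not part of dd): convert a byte count and an elapsed
# time in seconds, as reported by dd, into a read rate in MiB/s.
throughput_mibs() {
    awk -v bytes="$1" -v secs="$2" 'BEGIN {
        if (secs <= 0) { exit 1 }            # reject a zero or negative duration
        printf "%.1f\n", bytes / secs / (1024 * 1024)
    }'
}

# Example figures (made up for illustration): 1 GiB read in 8 seconds.
throughput_mibs 1073741824 8
```

Running the same calculation on an Alibaba Cloud Linux 2 instance and an Alibaba Cloud Linux 3 instance against the same file gives two directly comparable figures.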
Impact
This issue mainly occurs on the ECS instances with the following configurations:
Image: aliyun_3_x64_20G_alibase_20210415.vhd and later.
Kernel: 5.10.23-4.al8.x86_64 and later.
File system: NFS file systems are mounted, and files located in the directories of mount points are read.
Cause
In the upstream Linux kernel, the read_ahead_kb parameter indicates the read-ahead window size of the block device. Read-ahead is a performance optimization technology that allows the system to predict data that is likely to be accessed in the near future and proactively load the data into memory. If read-ahead data is requested, the system can read the data directly from memory without waiting for disk I/O operations to complete. This reduces latency and improves data read efficiency.
Prior to Linux kernel version 5.4, the amount of read-ahead data supported by an NFS file system varies based on the rsize value set at mount time. The rsize value indicates the maximum size of data that an NFS client can receive for each network read request. By default, the read_ahead_kb value is 15 times as large as the rsize value. In Alibaba Cloud Linux 2 with kernel version 4.19, the default value of rsize is 1,024 KB, and the value of read_ahead_kb is 15,360 KB.
However, after a commit in the upstream kernel repository (kernel/git/torvalds/linux.git) was introduced in Linux kernel version 5.4, the read_ahead_kb value is derived from the kernel constant VM_READAHEAD_PAGES instead of the rsize value. In Alibaba Cloud Linux 3 with kernel version 5.10, the default value of read_ahead_kb is 128 KB.
Consequently, Alibaba Cloud Linux 3 provides lower file read performance than Alibaba Cloud Linux 2. For an Alibaba Cloud Linux 3 instance, you must re-evaluate and adjust the read-ahead window size to optimize the file read efficiency.
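To make the size of the gap concrete, the arithmetic behind the two defaults can be worked through in a few lines of shell. The function name legacy_read_ahead_kb is a hypothetical label for the pre-5.4 rule described above (15 times rsize). The figures are the ones quoted in this topic.

```shell
#!/bin/sh
# Read-ahead window (KB) that kernels earlier than 5.4 derive from rsize (KB).
legacy_read_ahead_kb() {
    echo $(( 15 * $1 ))
}

al2_kb=$(legacy_read_ahead_kb 1024)   # rsize of 1,024 KB on Alibaba Cloud Linux 2
al3_kb=128                            # default on Alibaba Cloud Linux 3
echo "Alibaba Cloud Linux 2: ${al2_kb} KB"
echo "Alibaba Cloud Linux 3: ${al3_kb} KB"
echo "ratio: $(( al2_kb / al3_kb ))x"
```

With these defaults, the read-ahead window on Alibaba Cloud Linux 3 is 120 times smaller than on Alibaba Cloud Linux 2.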
On an NFS file system, a large read-ahead window can improve the performance of sequential reads of large files. However, if the window is too large, unnecessary data may be loaded into memory, especially in scenarios that involve random reads. Therefore, we recommend that you evaluate the actual workload in your business environment and then adjust the read_ahead_kb value to obtain the optimal read-ahead window size.
Solution
You can use one of the following methods to modify the read_ahead_kb value.
Run the echo command to modify the read_ahead_kb value for a single file system.
View the read-ahead settings of the NFS file system.
cat /sys/class/bdi/$(mountpoint -d <nfs_mountpoint>)/read_ahead_kb
Replace <nfs_mountpoint> with the actual path of the NFS mount point. To obtain the path, run the cat /proc/self/mountinfo command.
Appropriately increase the read-ahead window size of the NFS file system.
sudo sh -c 'echo <num> > /sys/class/bdi/<major>:<minor>/read_ahead_kb'
Set the following parameters based on the actual environment:
<num>: the size of the read-ahead window. Unit: KB.
<major>:<minor>: the major and minor device numbers of the NFS file system. To obtain the numbers, run the sudo mountpoint -d <nfs_mountpoint> command.
Sample command:
sudo sh -c 'echo 15360 > /sys/class/bdi/0:422/read_ahead_kb'
Note: If you mount multiple NFS file systems to an instance, run the command once for each file system to modify its read-ahead settings.
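When several NFS file systems are mounted, the per-mount command can be wrapped in a loop. The following is a minimal sketch and not part of the original procedure. The function names nfs_bdi_ids and set_nfs_read_ahead are hypothetical, and the sketch assumes the standard /proc/self/mountinfo layout, in which field 3 is <major>:<minor> and the file system type follows the "-" separator.

```shell
#!/bin/sh
# List the major:minor IDs of all NFS mounts found in a mountinfo file.
# In mountinfo, field 3 is major:minor and the file system type is the
# field that follows the "-" separator.
nfs_bdi_ids() {
    awk '{
        for (i = 7; i <= NF; i++) {
            if ($i == "-") {
                if ($(i + 1) ~ /^nfs4?$/) print $3
                break
            }
        }
    }' "${1:-/proc/self/mountinfo}" | sort -u
}

# Write the same read-ahead size (KB) to every NFS mount. Run as root.
set_nfs_read_ahead() {
    kb="$1"
    for id in $(nfs_bdi_ids); do
        f="/sys/class/bdi/$id/read_ahead_kb"
        if [ -w "$f" ]; then
            echo "$kb" > "$f"
        fi
    done
}

# Usage (as root): set_nfs_read_ahead 15360
```

Like the echo command itself, this affects only file systems that are mounted when the function runs. File systems mounted later keep the default value.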
Modify the read_ahead_kb value for multiple file systems by using the udev mechanism.
You can add a udev rule and then manually trigger a udev rule check event for all NFS file systems that are mounted to an instance. The rule allows the instance to automatically modify the read-ahead settings of all NFS file systems that are already mounted and of those that are mounted later.
Note: udev is the device manager for the Linux kernel and is responsible for managing device files and automating related operations. The udev daemon is the core component of the mechanism. It runs in user space and communicates with the kernel by using the uevent mechanism.
Perform the following steps:
Open and edit the configuration file that contains udev rules for the NFS file systems. The configuration file is located in the /etc/udev/rules.d/ directory. If the configuration file does not exist, create one. Sample command:
sudo vim /etc/udev/rules.d/99-nfs.rules
Add a udev rule to the configuration file to allow the NFS file systems to automatically modify the read-ahead settings.
In this example, the read_ahead_kb parameter is set to 15,360 KB. You can modify the value of this parameter based on your business requirements.
SUBSYSTEM=="bdi", ACTION=="add", PROGRAM="/bin/awk -v bdi=$kernel 'BEGIN{ret=1} {if ($4 == bdi) {ret=0}} END{exit ret}' /proc/fs/nfsfs/volumes", ATTR{read_ahead_kb}="15360"
Save and close the configuration file.
Reload the configuration file for the new rule to take effect.
sudo udevadm control --reload
Manually trigger a udev rule check event to modify the read_ahead_kb value for all mounted NFS file systems.
sudo udevadm trigger -c add -s bdi
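After the trigger, you may want to confirm that the rule was applied. The following sketch reads the same source that the rule's awk program inspects (column 4 of /proc/fs/nfsfs/volumes) and prints the current read_ahead_kb value of each NFS volume. The function name report_nfs_read_ahead and its two optional path arguments are assumptions added for illustration.

```shell
#!/bin/sh
# Print "<major>:<minor> <read_ahead_kb>" for every NFS volume.
# $1 and $2 are optional path overrides, mainly useful for testing.
report_nfs_read_ahead() {
    volumes="${1:-/proc/fs/nfsfs/volumes}"
    bdi_root="${2:-/sys/class/bdi}"
    [ -r "$volumes" ] || return 0    # no NFS client state: nothing to report
    # Column 4 holds the device ID; the pattern also skips the header line.
    awk '$4 ~ /^[0-9]+:[0-9]+$/ { print $4 }' "$volumes" | sort -u |
    while read -r dev; do
        f="$bdi_root/$dev/read_ahead_kb"
        if [ -r "$f" ]; then
            printf '%s %s\n' "$dev" "$(cat "$f")"
        fi
    done
}

report_nfs_read_ahead
```

If the rule took effect, each listed device shows the value set in the rule, which is 15360 in this example.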
Modify the NFS configuration file to modify the read_ahead_kb value for multiple file systems (nfs-utils 2.3.3-57.0.1.al8.1 or later).
If the nfs-utils version on an Alibaba Cloud Linux 3 instance is nfs-utils-2.3.3-57.0.1.al8.1 or later, you can modify the read_ahead_kb value by modifying the NFS configuration file /etc/nfs.conf. You can run the rpm -qa | grep nfs-utils command to query the installed nfs-utils version.
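The version prerequisite can also be checked in a script. The following is a rough sketch: the helper name version_ge is hypothetical, and it relies on GNU sort -V, which approximates RPM version ordering but is not guaranteed to match it in every case.

```shell
#!/bin/sh
# Return success if version $1 is greater than or equal to version $2.
# Uses GNU sort -V, which approximates but is not identical to RPM ordering.
version_ge() {
    [ "$(printf '%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

required="2.3.3-57.0.1.al8.1"

# Version and release of the installed package; empty if rpm or the package is missing.
if installed=$(rpm -q --qf '%{VERSION}-%{RELEASE}' nfs-utils 2>/dev/null); then
    :
else
    installed=""
fi

if [ -n "$installed" ] && version_ge "$installed" "$required"; then
    echo "nfs-utils $installed: the /etc/nfs.conf method is available"
else
    echo "nfs-utils ${installed:-not found}: use the echo or udev method instead"
fi
```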
Open and edit the NFS configuration file.
sudo vim /etc/nfs.conf
Modify the default read-ahead settings, and then save and close the file.
[nfsrahead]
nfs=15000
nfs4=16000
Modify the read-ahead settings based on the NFS protocol version by using the nfs and nfs4 keys.
nfs indicates that the version of the NFS file system is 3.
nfs4 indicates that the version of the NFS file system is 4. To query the version, run the mount -v | grep nfs command.
For a mounted NFS file system, unmount and then remount the file system for the configuration to take effect.
sudo umount <nfs_mountpoint>
sudo mount -t nfs -o vers=<NFS protocol version> <NFS server address> <nfs_mountpoint>