Elastic Compute Service:Test the IOPS performance of an ESSD

Last Updated: Oct 21, 2024

Alibaba Cloud Enhanced SSDs (ESSDs) use 25 Gigabit Ethernet and remote direct memory access (RDMA) technologies to deliver up to 1,000,000 random read/write IOPS per disk and reduce one-way latency. This topic describes how to test the IOPS performance of an ESSD. You can configure the test conditions as described in the following example.

Test conditions

  • Test tool: Use the flexible I/O tester (fio).

    Note

    fio is a powerful open source I/O benchmarking tool that can measure the performance metrics of block storage devices, such as random read/write and sequential read/write performance.

  • Instance type: We recommend that you use the ecs.g7se.32xlarge instance type. For more information, see the g7se, storage-enhanced general-purpose instance family section of the "General-purpose instance families" topic.

  • Image: Use a recent version of a Linux public image provided by Alibaba Cloud. In this example, Alibaba Cloud Linux 3 is used.

    Note

    Test results show that ESSDs may not achieve the expected IOPS performance in specific Linux distribution images. We recommend that you use Alibaba Cloud Linux 3 images that are maintained by Alibaba Cloud.

  • ESSD:

    • Testing a raw disk yields the true performance of the disk. We recommend that you use the fio tool to test the IOPS performance of a raw disk.

    • We recommend that you use an ESSD at performance level 3 (PL3 ESSD). For more information about ESSDs, see ESSDs.

    Important
    • Testing a raw disk provides accurate results but can destroy the file system structure on the disk. To prevent data loss, we recommend that you back up the disk data by creating snapshots before the test. For more information, see Create a snapshot.

    • To prevent data loss, we strongly recommend that you do not test the system disk on which the operating system resides or a disk that contains important data. We recommend that you test block storage performance on new data disks or temporary disks that do not contain important data (see the sanity-check sketch after this list).

    • If you want to perform a raw disk stress test on a system disk, we recommend that you complete the stress test and reset the operating system before you deploy services. This prevents potential issues caused by the stress test and helps ensure the long-term stable operation of the system.

    • Performance test results are obtained in a test environment and are for reference only. In production environments, the performance of cloud disks may vary based on factors such as the network environment and concurrent access.
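
Before you run the test, you can confirm which block device you are about to test. The following is a minimal sanity-check sketch; it assumes that the disk to be tested is /dev/vdb, which is a hypothetical device name that you must replace with your own:

    lsblk -o NAME,SIZE,TYPE,MOUNTPOINT   # List block devices and their mount points.
    # The test target (/dev/vdb in this example) should be an unmounted data disk
    # that is not the system disk and does not hold important data.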

Procedure

  1. Connect to an ECS instance.

    For more information, see Connect to a Linux instance by using a password or key.

  2. Run the following command to install the libaio library and the fio tool:

    sudo yum install libaio libaio-devel fio -y
  3. Run the following command to change the working directory:

    cd /tmp
  4. Run the following command to create the test100w.sh script:

    sudo vim test100w.sh
  5. Paste the following content into the test100w.sh script.

    For information about the script content, see the Details about the test100w.sh script section of this topic.

    function RunFio
    {
     numjobs=$1   # The number of test threads. In this example, the value is 10.
     iodepth=$2   # The maximum number of concurrent I/O requests. In this example, the value is 128.
     bs=$3        # The data block size per I/O. In this example, the value is 4k.
     rw=$4        # The read and write policy. In this example, the value is randwrite.
     size=$5      # The amount of data to test. In this example, the value is 1024g.
     filename=$6  # The name of the test file. In this example, the value is /dev/your_device.
     nr_cpus=`cat /proc/cpuinfo |grep "processor" |wc -l`
     if [ $nr_cpus -lt $numjobs ];then
         echo "Numjobs is more than cpu cores, exit!"
         exit 1
     fi
     let nu=$numjobs+1
     cpulist=""
     for ((i=1;i<10;i++))   # Iterate over the columns of the cpu_list files (at most 9).
     do
         list=`cat /sys/block/your_device/mq/*/cpu_list | awk -v i="$i" '{if(i<=NF) print $i;}' | tr -d ',' | tr '\n' ','`
         if [ -z $list ];then
             break
         fi
         cpulist=${cpulist}${list}
     done
     spincpu=`echo $cpulist | cut -d ',' -f 2-${nu}`   # Select the CPU cores that serve the queues.
     echo $spincpu
     fio --ioengine=libaio --runtime=30s --numjobs=${numjobs} --iodepth=${iodepth} --bs=${bs} --size=${size} --rw=${rw} --filename=${filename} --time_based=1 --direct=1 --name=test --group_reporting --cpus_allowed=$spincpu --cpus_allowed_policy=split
    }
    echo 2 > /sys/block/your_device/queue/rq_affinity
    sleep 5
    RunFio 10 128 4k randwrite 1024g /dev/your_device
  6. Modify the parameters in the test100w.sh script based on the actual business scenario.

    • Replace the your_device parameter with the actual device name of the ESSD. Example: nvme1n1.

    • Replace 10, 128, 4k, randwrite, 1024g, and /dev/your_device in the RunFio 10 128 4k randwrite 1024g /dev/your_device line with values that suit your business scenario.

    • If data loss on the ESSD does not affect your business, you can set the filename parameter to a device name, such as filename=/dev/vdb. Otherwise, set the filename parameter to a file path in a mounted file system, such as filename=/mnt/test.image.

  7. Run the following command to test the performance of the ESSD:

    sudo sh test100w.sh

    In the command output, the IOPS=*** value indicates the measured IOPS of the ESSD.
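
    If you only want to extract the IOPS figures from the fio output, you can filter the output. The following one-liner is a minimal sketch that reuses the script name from this step:

    sudo sh test100w.sh 2>&1 | grep "IOPS="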

Details about the test100w.sh script

  • In the test100w.sh script, the following command sets the rq_affinity parameter to 2:

    echo 2 > /sys/block/your_device/queue/rq_affinity

    The following list describes the valid values of rq_affinity:

    • 1: The block device delivers received I/O completion events to the group of vCPUs that submitted the corresponding I/O requests. In scenarios where multiple threads run concurrently, I/O completion events may be delivered to only one vCPU, which causes a performance bottleneck.

    • 2: The block device delivers received I/O completion events to the exact vCPUs that submitted the corresponding I/O requests. In scenarios where multiple threads run concurrently, each vCPU can deliver its maximum performance.

  • The following command runs the test jobs and binds them to the CPU cores that serve different queues:

    fio --ioengine=libaio --runtime=30s --numjobs=${numjobs} --iodepth=${iodepth} --bs=${bs} --size=${size} --rw=${rw} --filename=${filename} --time_based=1 --direct=1 --name=test --group_reporting --cpus_allowed=$spincpu --cpus_allowed_policy=split
    Note

    In normal mode, a device has a single request queue. The request queue becomes a performance bottleneck when multiple threads concurrently process I/O requests. In multi-queue mode, a device can have multiple request queues to process I/O requests and deliver the maximum backend storage performance. For example, assume that you have four I/O threads. To make full use of multi-queue mode and improve storage performance, you must bind the I/O threads to the CPU cores that correspond to different request queues.

    The following list describes the key parameters:

    • numjobs: the number of I/O threads. Example value: 10.

    • /dev/your_device: the device name of the ESSD. Example value: /dev/nvme1n1.

    • cpus_allowed_policy: the fio parameter that controls how vCPUs are bound. The fio tool provides the cpus_allowed_policy and cpus_allowed parameters to bind vCPUs. Example value: split.

    The preceding command binds jobs that serve different queue IDs to different CPU cores. To view the queues of a disk and the CPU cores to which each queue is bound, run the following commands. In the commands, replace your_device with the actual device name, such as vdb for an ESSD whose device name is /dev/vdb:

    • Run the ls /sys/block/your_device/mq/ command to view the queue IDs of the ESSD.

    • Run the cat /sys/block/your_device/mq/*/cpu_list command to view the IDs of the CPU cores to which each queue of the ESSD is bound.
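
    To print how each queue maps to CPU cores at a glance, you can combine the preceding commands into a short loop. The following is a minimal sketch that assumes the device name is vdb; replace it with your actual device name:

    dev=vdb
    echo "rq_affinity: $(cat /sys/block/$dev/queue/rq_affinity)"
    for q in /sys/block/$dev/mq/*/; do
        echo "queue $(basename $q) -> CPU cores: $(cat ${q}cpu_list)"
    done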