By Chao Qian (Xixie)
Many users rely on UnixBench to test and compare the performance of VMs provided by different vendors. This article describes, at the code level, what each UnixBench test actually does.
Before going into the details of the UnixBench implementation, see the article UnixBench Score: An Introduction for sample results. Those results show that a UnixBench run consists of two parts: single-process tests and multi-process tests. By default, the number of processes in the multi-process tests depends on the number of CPUs. The number of processes is the only difference between the two parts, so the following description focuses on the single-process tests.
Dhrystone is a synthetic computing benchmark program that primarily tests the integer performance of a CPU. The corresponding floating-point number test is Double-Precision Whetstone.
According to articles online, the compiler can optimize much of this hard-to-read workload, so the result may reflect compilation optimization rather than true CPU performance. This article goes into detail about the issue: Benchmarking in context: Dhrystone.
Let's skip over the operations and look at the output. The test counts the number of operations completed within 10 seconds, and that score is converted to an index as described in the first article.
In addition to the integer test, whets.c measures floating-point performance. Its code is of much higher quality and is far easier to read.
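The conversion from a raw count to an index can be sketched as follows. This is a minimal illustration, assuming the index is the per-second score divided by a per-test baseline and multiplied by 10, so that the baseline system scores exactly 10. The function names are hypothetical, and the baseline used in the usage note is an illustrative value, not one taken from this article.

```python
def ops_per_second(operations: int, seconds: float) -> float:
    """Turn a raw operation count into a per-second score."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return operations / seconds


def unixbench_index(score: float, baseline: float) -> float:
    """Scale a score so that a system matching the baseline gets 10."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return score / baseline * 10.0
```

For example, 350,100,000 operations in 10 seconds is a score of 35,010,000 per second. Against an assumed baseline of 116,700, `unixbench_index` returns 3000.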
625*10/1.238352=5047
Apart from the two complex computations described above, the other UnixBench tests are relatively simple. The Execl test is essentially a chain of execl calls in which the program keeps re-executing itself: execl.c compiles to a binary named execl, and that binary calls the execl function on itself. Each call carries these parameters along: the start time, the number of executions so far, and the test duration (generally 10 seconds). The concept is rather clever: once the elapsed time exceeds the duration, the program outputs the total number of executions, and the score is calculated from that total according to the scoring rule.
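The self-re-executing idea can be sketched in Python. This is not the execl.c source, only an approximation under stated assumptions: the helper script, its argument order, and the function name `execl_chain` are all hypothetical. The helper replaces itself with a fresh copy through `os.execl`, passing the start time, the running count, and the duration as arguments, and prints the count once the time is up.

```python
import os
import subprocess
import sys
import tempfile
import time

# Hypothetical helper script: it re-executes itself until the time is up.
HELPER_SOURCE = '''
import os, sys, time

start = float(sys.argv[1])      # wall-clock time at which the chain began
count = int(sys.argv[2])        # number of execl calls made so far
duration = float(sys.argv[3])   # how long the chain may run, in seconds

if time.time() - start >= duration:
    print(count)                # time is up: report the total and stop
    sys.exit(0)

# Replace this process with a fresh copy of itself, carrying the state along.
os.execl(sys.executable, sys.executable, __file__,
         repr(start), str(count + 1), repr(duration))
'''


def execl_chain(duration: float) -> int:
    """Run the self-re-executing helper and return how many execl calls it made."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "execl_helper.py")
        with open(path, "w") as handle:
            handle.write(HELPER_SOURCE)
        # The process ID never changes across execl, so waiting on the
        # first process waits for the whole chain.
        result = subprocess.run(
            [sys.executable, path, repr(time.time()), "0", repr(duration)],
            capture_output=True, text=True, check=True,
        )
    return int(result.stdout.strip())
```

Calling `execl_chain(10.0)` would mirror the 10-second duration mentioned above. Because each step here also pays for interpreter startup, the count is far lower than what a compiled binary achieves and is not comparable to a UnixBench score.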
The File Copy test mainly exercises the write and read functions and takes 30 seconds. Its implementation is simple. First, the code writes to a file in a loop for two seconds, then reads from that file in a loop for two seconds. Next, the data read from the file is written to another file, again in a loop. In this way, the code obtains the number of read and write operations in 30 seconds. The parameters are used to test the performance with different block sizes. To test disks, FIO is recommended.
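The write, read, and copy phases described above can be approximated as follows. This is a rough sketch, not the UnixBench source: the function and parameter names are hypothetical, and the wrap-around after `max_blocks` blocks is an assumption made to keep the files bounded in size.

```python
import os
import tempfile
import time


def _timed_loop(seconds, step):
    """Call step(i) repeatedly until the time is up; return the call count."""
    count = 0
    deadline = time.monotonic() + seconds
    while time.monotonic() < deadline:
        step(count)
        count += 1
    return count


def _read_wrapping(fd, size):
    """Read one block, rewinding to the start of the file on end-of-file."""
    data = os.read(fd, size)
    if not data:
        os.lseek(fd, 0, os.SEEK_SET)
        data = os.read(fd, size)
    return data


def file_copy_benchmark(buf_size=1024, max_blocks=2000,
                        warmup=2.0, copy_time=30.0):
    """Count block writes, reads, and copies done in fixed time windows."""
    block = b"x" * buf_size
    with tempfile.TemporaryDirectory() as tmp:
        flags = os.O_RDWR | os.O_CREAT
        src = os.open(os.path.join(tmp, "src.dat"), flags, 0o600)
        dst = os.open(os.path.join(tmp, "dst.dat"), flags, 0o600)
        try:
            def write_step(i):
                if i % max_blocks == 0:      # keep the file bounded
                    os.lseek(src, 0, os.SEEK_SET)
                os.write(src, block)

            def read_step(_i):
                _read_wrapping(src, buf_size)

            def copy_step(i):
                if i % max_blocks == 0:
                    os.lseek(dst, 0, os.SEEK_SET)
                os.write(dst, _read_wrapping(src, buf_size))

            writes = _timed_loop(warmup, write_step)
            os.lseek(src, 0, os.SEEK_SET)
            reads = _timed_loop(warmup, read_step)
            os.lseek(src, 0, os.SEEK_SET)
            copies = _timed_loop(copy_time, copy_step)
        finally:
            os.close(src)
            os.close(dst)
    return {"writes": writes, "reads": reads, "copies": copies}
```

Varying `buf_size` corresponds to testing with different block sizes. Because the files are small and reused, most of the traffic is likely served from the page cache, which is consistent with the advice above to use FIO for disk testing.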
The Pipe Throughput test opens a pipe, writes 512 bytes to it, and then reads the data back from the pipe. The test counts the number of write-and-read cycles completed in 10 seconds.
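A minimal sketch of this loop, assuming a single process that owns both ends of the pipe. The function name and its parameters are hypothetical.

```python
import os
import time


def pipe_throughput(duration=10.0, size=512):
    """Count write-then-read cycles of `size` bytes through one pipe."""
    read_fd, write_fd = os.pipe()
    payload = b"\0" * size
    count = 0
    deadline = time.monotonic() + duration
    try:
        while time.monotonic() < deadline:
            if os.write(write_fd, payload) != size:
                raise RuntimeError("short write to pipe")
            if len(os.read(read_fd, size)) != size:
                raise RuntimeError("short read from pipe")
            count += 1
    finally:
        os.close(read_fd)
        os.close(write_fd)
    return count
```

Each cycle costs two system calls and two copies through the kernel's pipe buffer, with no second process involved, so the count mostly reflects system call and memory copy cost.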
The Pipe-based Context Switching test opens two pipes and starts two processes. One process writes data to pipe 1 and reads data from pipe 2. The other process reads data from pipe 1 and writes data to pipe 2. Each time a process completes one read and write cycle, the result increases by 1. Interestingly, the test result is much better if the two processes run on the same CPU rather than on different CPUs. The following article in this series will provide a detailed analysis of this issue.
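The two-pipe ping-pong can be sketched as below. This is a simplified approximation: the parent sends an 8-byte counter through pipe 1, the child echoes it back through pipe 2, and only the parent counts and checks the cycles. The function name and the message format are assumptions.

```python
import os
import struct
import time


def context_switch_benchmark(duration=10.0):
    """Count round trips between a parent and a child over two pipes."""
    p1_read, p1_write = os.pipe()   # pipe 1: parent -> child
    p2_read, p2_write = os.pipe()   # pipe 2: child -> parent
    pid = os.fork()
    if pid == 0:
        # Child: read from pipe 1, echo to pipe 2, stop on end-of-file.
        os.close(p1_write)
        os.close(p2_read)
        try:
            while True:
                data = os.read(p1_read, 8)
                if not data:
                    break
                os.write(p2_write, data)
        finally:
            os._exit(0)

    # Parent: write to pipe 1, then block until the child answers on pipe 2.
    os.close(p1_read)
    os.close(p2_write)
    count = 0
    deadline = time.monotonic() + duration
    try:
        while time.monotonic() < deadline:
            os.write(p1_write, struct.pack("Q", count))
            echoed = os.read(p2_read, 8)
            if struct.unpack("Q", echoed)[0] != count:
                raise RuntimeError("sequence mismatch")
            count += 1
    finally:
        os.close(p1_write)          # end-of-file lets the child exit
        os.waitpid(pid, 0)
        os.close(p2_read)
    return count
```

Because each side blocks until the other has written, every round trip forces the scheduler to switch between the two processes, which is what the test is meant to measure.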
The Process Creation test repeatedly calls the fork function to create a child process, which exits immediately. Each time this cycle completes, the result increases by 1.
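A minimal sketch of the fork-and-exit cycle, assuming the parent waits for each child before creating the next one. The function name is hypothetical.

```python
import os
import time


def spawn_benchmark(duration=10.0):
    """Count how many child processes can be created and reaped in time."""
    count = 0
    deadline = time.monotonic() + duration
    while time.monotonic() < deadline:
        pid = os.fork()
        if pid == 0:
            os._exit(0)             # child: exit at once, skipping cleanup
        _, status = os.waitpid(pid, 0)
        if status != 0:
            raise RuntimeError("child exited abnormally")
        count += 1
    return count
```

The child uses `os._exit` rather than `sys.exit` so that it does not flush buffers or run cleanup handlers inherited from the parent.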
The Shell Scripts test repeatedly uses the fork function to create a process that executes a script. Each time the script is executed successfully, the result increases by 1. Shell Scripts (1 concurrent) means that the parameter passed to pgms/multi.sh is 1. Shell Scripts (8 concurrent) means that the parameter passed to pgms/multi.sh is 8, so eight subtasks are executed concurrently.
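The effect of the concurrency parameter can be sketched as follows. This is an approximation only: it does not run pgms/multi.sh, the `command` argument is a placeholder standing in for the real test script, and the function name is hypothetical.

```python
import subprocess
import time


def shell_benchmark(concurrency, duration, command="exit 0"):
    """Count completed rounds of `concurrency` shell commands run together."""
    rounds = 0
    deadline = time.monotonic() + duration
    while time.monotonic() < deadline:
        # Start all subtasks first so that they run concurrently.
        procs = [subprocess.Popen(["sh", "-c", command])
                 for _ in range(concurrency)]
        for proc in procs:
            if proc.wait() != 0:
                raise RuntimeError("script failed")
        rounds += 1                 # only successful rounds are counted
    return rounds
```

With `concurrency=1` each round runs a single script; with `concurrency=8` each round runs eight scripts at once, so the count per round reflects how well the system handles the parallel load.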
The System Call Overhead test measures the cost of entering and exiting the operating system kernel. Each entry and exit increases the result by 1, and the test counts the number of executions within 10 seconds. The execution is based on a forked child process: each time the waitpid function returns, the result increases by 1.
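The idea of counting kernel entries and exits can be illustrated with a loop of cheap system calls. This is a sketch under assumptions: the particular calls chosen here (duplicating and closing a descriptor, and querying the process and user IDs) are picked for illustration and are not taken from the description above, and the function name is hypothetical.

```python
import os
import time


def syscall_overhead(duration=10.0):
    """Count loop iterations, each issuing a few inexpensive system calls."""
    fd = os.open(os.devnull, os.O_RDONLY)
    count = 0
    deadline = time.monotonic() + duration
    try:
        while time.monotonic() < deadline:
            os.close(os.dup(fd))    # two calls: dup, then close
            os.getpid()
            os.getuid()
            count += 1
    finally:
        os.close(fd)
    return count
```

In Python, interpreter overhead makes up a large share of each iteration, so the count only shows the shape of the test and cannot be compared with a UnixBench result.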
These are the default implementations of UnixBench, which are very simple but interesting!