You can configure Elastic Remote Direct Memory Access (eRDMA) on specific enterprise-level Elastic Compute Service (ECS) instances to use low-latency, high-throughput, and highly scalable RDMA network services and improve network performance without modifying the network architecture.
Limits
Item | Description |
Region | eRDMA is available in the following regions: China (Beijing), China (Shanghai), China (Hangzhou), China (Shenzhen), China (Guangzhou), China (Ulanqab), and China (Heyuan). |
Instance family | The following instance families support eRDMA: |
Image |
Note: The available images vary based on the instance type. They are displayed on the instance buy page when you select an instance type that supports eRDMA. |
Number of eRDMA devices | To query the maximum number of ERIs that you can bind to an ECS instance of a specific instance type, call the DescribeInstanceTypes operation and check the value of the EriQuantity parameter in the response. A value of 0 indicates that you cannot bind an ERI to an ECS instance of the instance type. |
Network |
|
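The EriQuantity lookup described in the Limits table can be sketched as follows. This is a minimal sketch, not a definitive implementation: the helper name eri_quantity is hypothetical, the text-based parse assumes the response contains a single EriQuantity field, and the commented usage assumes the Alibaba Cloud CLI (aliyun) is installed and configured.

```shell
#!/usr/bin/env bash
# Print the first EriQuantity value found in a DescribeInstanceTypes
# JSON response that is read from stdin.
# Assumption: a simple text match is enough for a single-instance-type response.
eri_quantity() {
  grep -o '"EriQuantity"[[:space:]]*:[[:space:]]*[0-9][0-9]*' \
    | head -n 1 \
    | grep -o '[0-9][0-9]*$'
}

# Hypothetical usage (CLI invocation and placeholder are assumptions):
#   aliyun ecs DescribeInstanceTypes --InstanceTypes.1 <instance_type> | eri_quantity
```

A printed value of 0 means that an ERI cannot be bound to instances of that instance type.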
Configure eRDMA on an enterprise-level ECS instance
Configure eRDMA when you create an ECS instance
When you create an eRDMA-capable instance that runs Alibaba Cloud Linux, Ubuntu, or Anolis OS, you can enable eRDMA by selecting the Auto-install eRDMA Driver option, which automatically installs the eRDMA driver, and by enabling the ERI feature for the primary ENI.
If you cannot select the Auto-install eRDMA Driver option for the operating system version that you select, or if the automatic installation of the eRDMA driver fails, you can install the driver manually or by using a script after the instance is created. For more information, see the Configure eRDMA on an existing ECS instance section of this topic.
After you start the ECS instance, wait for a period of time for the system to install the eRDMA driver.
Create an enterprise-level ECS instance that supports ERIs. When you create the ECS instance, take note of the following parameters or options. For information about other parameters on the ECS instance buy page, see Create an instance on the Custom Launch tab.
Instance and image: Select an instance type that supports eRDMA and install the eRDMA driver.
Instance: For more information, see the Limits section of this topic.
ENI: Select the eRDMA Interface option on the right side of Primary ENI to bind an ERI to the ECS instance.
When you create an enterprise-level instance, you can enable the ERI feature only for the primary elastic network interface (ENI). You can enable the ERI feature for a secondary ENI in the ECS console or by calling an API operation. For more information, see ERI.
Configure eRDMA on an existing ECS instance
Check whether eRDMA is configured as expected for the instance.
For more information, see the Verify the correctness of the eRDMA configurations section of the "Use eRDMA" topic.
You can perform the following operations based on the verification result: Install the eRDMA driver or Bind an ERI to the ECS instance.
Install the eRDMA driver.
If you did not select Auto-install eRDMA Driver when you created the instance, the eRDMA driver is not installed on the instance. Install the eRDMA driver manually or by using a script, based on your scenario.
If you use a script to install the eRDMA driver, the installation package for the latest stable eRDMA driver version is automatically downloaded.
If you want to manually install the eRDMA driver, you can download the package for a specific eRDMA driver version.
Execute a script to install the eRDMA driver
Run the following command to download the installation script, which retrieves the latest stable eRDMA driver package:
curl -O http://mirrors.cloud.aliyuncs.com/erdma/env_setup.sh
Run the following command to install the eRDMA driver package:
sudo sh -c '/bin/bash env_setup.sh > /var/log/erdma_install.log 2>&1'
The script automatically installs the dependencies that are required by the eRDMA driver and then the eRDMA driver. Wait for the script execution to complete.
Note: If the eRDMA driver fails to be installed by using the script, check the logs in the /var/log/erdma_install.log file.
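After the script finishes, a quick sanity check can confirm that the driver module is loaded. The following is a minimal sketch, assuming that the kernel module is named erdma (the name used by the upstream Linux driver) and that lsmod is available. The helper name has_erdma_module is hypothetical. The check does not replace the full verification procedure in the "Use eRDMA" topic.

```shell
#!/usr/bin/env bash
# Succeed if the module list read from stdin (lsmod format)
# contains a module whose name is exactly "erdma".
# Assumption: the eRDMA kernel module is named "erdma".
has_erdma_module() {
  awk '$1 == "erdma" { found = 1 } END { exit(found ? 0 : 1) }'
}

# Usage on an instance:
#   if lsmod | has_erdma_module; then
#     echo "erdma module is loaded"
#   else
#     echo "erdma module is not loaded; check /var/log/erdma_install.log"
#   fi
```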
Manually install the eRDMA driver
Update the prerequisite package.
For Alibaba Cloud Linux 3, CentOS, and Anolis OS, run the following command:
sudo yum update -y
For Ubuntu, skip this step.
Run the following commands in sequence to query the most recent kernel package version and the operating system kernel version:
rpm -qa | grep kernel   # Query the latest kernel package version.
uname -r                # Query the operating system kernel version.
The command outputs shown in the following figure indicate that the kernel package version is the same as the operating system kernel version. In this case, you do not need to perform additional operations. If the versions are different, restart the ECS instance so that it boots into the latest installed kernel.
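The version comparison can also be scripted. The following is a minimal sketch, assuming an RPM-based system on which the kernel package is named kernel and its version string matches the format printed by uname -r. The helper names newest_kernel_version and kernel_is_current are hypothetical.

```shell
#!/usr/bin/env bash
# Read package names from stdin (one per line, as printed by "rpm -qa")
# and print the version of the newest installed kernel package.
newest_kernel_version() {
  grep -E '^kernel-[0-9]' | sed 's/^kernel-//' | sort -V | tail -n 1
}

# Succeed if the running kernel version (argument 1) equals the newest
# installed kernel package version (package list on stdin).
kernel_is_current() {
  [ "$(newest_kernel_version)" = "$1" ]
}

# Usage on an instance:
#   if rpm -qa | kernel_is_current "$(uname -r)"; then
#     echo "kernel versions match; no restart needed"
#   else
#     echo "kernel versions differ; restart the ECS instance"
#   fi
```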
Install dependency packages.
If the ECS instance is an x86 instance, run one of the following commands based on the instance operating system.
For Alibaba Cloud Linux 3, CentOS, and Anolis OS, run the following command:
sudo yum install gcc-c++ dkms cmake kernel-devel kernel-headers libnl3 libnl3-devel
For Ubuntu, run the following command:
sudo apt-get install dkms cmake libnl-3-dev libnl-route-3-dev kernel-headers
If the ECS instance is an Arm instance, the building task is executed based on the source code. In this case, a large number of dependencies are required and subject to change. You can skip this step and execute the installation script. If the installation script fails to install dependency packages, you are prompted to install the required dependency packages. Install the dependency packages as prompted and then re-install the eRDMA driver.
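The choice of dependency command can be sketched as a small helper that only prints what to run. This is a sketch under stated assumptions: the distribution IDs (alinux, anolis, centos, ubuntu) are assumed to match the ID field in /etc/os-release, the architecture strings follow uname -m, and the helper name deps_command is hypothetical.

```shell
#!/usr/bin/env bash
# Print the dependency installation command for a distribution ID
# (argument 1) and a CPU architecture (argument 2).
# Assumption: IDs follow the ID field in /etc/os-release.
deps_command() {
  os_id="$1"
  arch="$2"
  case "$arch" in
    aarch64|arm64)
      # Arm instances: dependencies are resolved by the installation script.
      echo "skip: run the installation script and install the packages that it prompts for"
      return 0
      ;;
  esac
  case "$os_id" in
    alinux|anolis|centos)
      echo "sudo yum install gcc-c++ dkms cmake kernel-devel kernel-headers libnl3 libnl3-devel"
      ;;
    ubuntu)
      echo "sudo apt-get install dkms cmake libnl-3-dev libnl-route-3-dev kernel-headers"
      ;;
    *)
      echo "unsupported distribution: $os_id" >&2
      return 1
      ;;
  esac
}

# Usage on an instance:
#   . /etc/os-release && deps_command "$ID" "$(uname -m)"
```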
Download the driver installation package.
Run the following command to download the eRDMA driver installation package from an internal URL:
wget http://mirrors.cloud.aliyuncs.com/erdma/erdma_installer-latest.tar.gz
Run the following command to download the eRDMA driver installation package from a public URL:
wget https://mirrors.aliyun.com/erdma/erdma_installer-latest.tar.gz
In this example, the installation package for the latest eRDMA driver version is downloaded. You can download the installation package for a specific eRDMA driver version based on your business scenarios. For information about the release of different versions of the eRDMA installation package, see the Install the eRDMA driver for an ECS instance section of the "Use eRDMA" topic.
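If it is unclear whether the instance can reach the internal mirror, the two downloads can be combined into a fallback. The following is a minimal sketch that tries the internal URL first and then the public URL. The helper name download_installer and the FETCH override variable are hypothetical, and the timeout and retry values are arbitrary choices.

```shell
#!/usr/bin/env bash
INTERNAL_URL="http://mirrors.cloud.aliyuncs.com/erdma/erdma_installer-latest.tar.gz"
PUBLIC_URL="https://mirrors.aliyun.com/erdma/erdma_installer-latest.tar.gz"

# Try each URL in order and print the one that was downloaded successfully.
# FETCH can be overridden (hypothetical knob, for example for a dry run).
download_installer() {
  fetch="${FETCH:-wget -q --timeout=10 --tries=1}"
  for url in "$INTERNAL_URL" "$PUBLIC_URL"; do
    if $fetch "$url"; then
      echo "$url"
      return 0
    fi
  done
  echo "failed to download the eRDMA driver installation package" >&2
  return 1
}

# Usage on an instance:
#   download_installer && tar -xvf erdma_installer-latest.tar.gz
```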
Run the following command to decompress the installation package and then go to the directory to which the installation package is decompressed:
tar -xvf erdma_installer-latest.tar.gz && cd erdma_installer
Use one of the following methods to install the eRDMA driver:
Method 1: Run the following command to install the eRDMA driver. During the installation, confirm the uninstallation steps and the automatic installation steps when prompted.
sudo sh install.sh
Method 2: Run the following command to automatically install the eRDMA driver:
sudo sh install.sh --batch
View the command output to check whether the driver is installed.
The following command output indicates that the eRDMA driver is installed.
The following command output indicates that the eRDMA driver failed to be installed. Perform operations as prompted and then re-install the eRDMA driver.
Note: If the ECS instance runs CentOS 7 and you receive an error message indicating that packages are missing when you re-install the driver, the yum commands may fail to obtain the packages. In this case, you may need to run the yum install -y epel-release command to install the Extra Packages for Enterprise Linux (EPEL) repository before you obtain the packages.
Bind an ERI to the ECS instance.
The maximum number of ERIs that you can bind to an enterprise-level ECS instance depends on the instance type. For more information, see the Limits section of this topic.
Enable the ERI feature for an ENI that is bound to an ECS instance
You can enable the ERI feature for an ENI that is bound to an ECS instance by modifying the attributes of the ENI. For more information, see the Change the status of the ERI feature for an existing ENI section of the "ERIs" topic.
Create an ERI and bind the ERI to an ECS instance
For information about how to create an ERI, see the Separately create an ERI section of the "ERIs" topic.
For information about how to bind an ERI to an ECS instance, see Bind a secondary ENI.
Call API operations to create an ERI and bind the ERI to an ECS instance
Perform the following steps:
Call an API operation to create an ERI.
Call the CreateNetworkInterface operation to create an ENI and set the NetworkInterfaceTrafficMode parameter to HighPerformance to enable the ERI feature for the ENI.
After the call is successful, record the return value of the NetworkInterfaceId parameter, which is the ERI ID.
Call the AttachNetworkInterface operation to bind the ERI to the ECS instance. Set the NetworkInterfaceId parameter to the ERI ID recorded in the preceding step and the InstanceId parameter to the ID of the ECS instance.
Important: If the instance type of the ECS instance supports multiple ERIs per instance, we recommend that you set the NetworkCardIndex parameter to a different value for each ERI when you bind multiple ERIs to the instance. This ensures that the ERIs are bound to different channels and the maximum network bandwidth is achieved for the instance. For more information, see the Request parameters section of the "AttachNetworkInterface" topic.
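The two API calls can be sketched with the Alibaba Cloud CLI. This is a minimal sketch, not a definitive implementation: it assumes that the aliyun CLI is installed and configured, and the RegionId, VSwitchId, and SecurityGroupId request parameters are assumptions that do not appear in this topic. The helper names extract_eni_id, create_eri, and attach_eri are hypothetical.

```shell
#!/usr/bin/env bash
# CLI executable; can be overridden (hypothetical knob, for example for a dry run).
ALIYUN="${ALIYUN:-aliyun}"

# Extract the NetworkInterfaceId value from a JSON response on stdin.
extract_eni_id() {
  sed -n 's/.*"NetworkInterfaceId"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# create_eri <region_id> <vswitch_id> <security_group_id>
# Creates an ENI with the ERI feature enabled and prints the ERI ID.
create_eri() {
  $ALIYUN ecs CreateNetworkInterface \
    --RegionId "$1" --VSwitchId "$2" --SecurityGroupId "$3" \
    --NetworkInterfaceTrafficMode HighPerformance | extract_eni_id
}

# attach_eri <region_id> <eri_id> <instance_id> [network_card_index]
# Binds the ERI to the instance; the optional index spreads ERIs across channels.
attach_eri() {
  if [ -n "${4:-}" ]; then
    $ALIYUN ecs AttachNetworkInterface \
      --RegionId "$1" --NetworkInterfaceId "$2" --InstanceId "$3" \
      --NetworkCardIndex "$4"
  else
    $ALIYUN ecs AttachNetworkInterface \
      --RegionId "$1" --NetworkInterfaceId "$2" --InstanceId "$3"
  fi
}

# Usage (all IDs are placeholders):
#   eri_id="$(create_eri <region_id> <vswitch_id> <security_group_id>)"
#   attach_eri <region_id> "$eri_id" <instance_id> 1
```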
Test the eRDMA write latency
You can install Perftest and use ib_write_lat to test the write latency between two enterprise-level instances that have eRDMA configured. For information about Perftest tests, see the Perftest test set section of the "Use eRDMA" topic.
Prepare the environment
Create two enterprise-level ECS instances that function as the server and the client. Make sure that eRDMA is configured on both instances: the eRDMA software stack is installed and the ERI feature is enabled.
Make sure that the instances have valid network configurations and can communicate with each other over the internal network. For more information, see Connect ECS instances through an internal network.
Procedure
Connect to the two ECS instances.
For more information, see Use Workbench to connect to a Linux instance over SSH.
Verify and confirm that the eRDMA configurations on both instances are correct.
For more information, see the Verify the correctness of the eRDMA configurations section of the "Use eRDMA" topic.
Install Perftest on each ECS instance.
You can download the perftest package from the official perftest repository and install perftest, or use a Yellowdog Updater, Modified (YUM) or Advanced Packaging Tool (APT) repository to install perftest.
Official perftest repository
Enable public bandwidth for an ECS instance on which you want to install perftest. For more information, see Enable public bandwidth for an ECS instance.
Download the perftest package from the official perftest repository and install perftest.
YUM or APT repository
Note: The repositories of different Linux distributions include different versions of perftest, and the versions may be incompatible with each other. To prevent incompatibility, we recommend that you identify the Linux distribution run by the ECS instance on which you want to install perftest and install the perftest version included in the repository of the same Linux distribution. Otherwise, download the perftest package from the official perftest repository and install perftest.
Alibaba Cloud Linux 3, CentOS, and Anolis OS
sudo yum install perftest -y
Ubuntu
sudo apt install perftest -y
Test whether the eRDMA network latency meets the expected performance.
On the server-side instance, run the following command to start ib_write_lat as a server that listens for connections from the client:
ib_write_lat -R -a -F
On the client-side instance, run the following command to start ib_write_lat and connect to the server:
ib_write_lat -R -a -F <server_ip>
Replace <server_ip> with the private IP address of the ERI bound to the server-side ECS instance. For information about how to query IP addresses, see View IP addresses.
Check the test results.
After the test on the client is complete, ib_write_lat outputs the test configuration information, connection information, and performance test results. The statistics include the minimum, maximum, and average latency. For more information, see the Latency data in the ib_write_lat test results section of this topic.
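To pull the latency figures out of the results table, the output can be filtered with awk. The following is a minimal sketch that assumes the column layout printed by recent perftest versions (#bytes, #iterations, t_min, t_max, t_typical, t_avg, and so on, in microseconds). Older versions may print fewer columns, so check the header line of your own output first. The helper name summarize_write_lat is hypothetical.

```shell
#!/usr/bin/env bash
# Read ib_write_lat output from stdin and print one summary line per
# message size.
# Assumption: data rows start with the message size and use the column order
#   #bytes #iterations t_min t_max t_typical t_avg ...
summarize_write_lat() {
  awk '$1 ~ /^[0-9]+$/ && NF >= 6 {
    printf "%s bytes: min=%s max=%s avg=%s usec\n", $1, $3, $4, $6
  }'
}

# Usage on the client-side instance:
#   ib_write_lat -R -a -F <server_ip> | summarize_write_lat
```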