You can configure elastic Remote Direct Memory Access (eRDMA) on eRDMA-capable, enterprise-level Elastic Compute Service (ECS) instances to get a low-latency, high-throughput, and highly scalable RDMA network service and improve network performance without modifying the network architecture. This topic describes how to configure eRDMA on an enterprise-level ECS instance.
Limits
The following table describes the limits on eRDMA in terms of regions, instance families, images, the number of eRDMA devices, and networks.
Item | Description |
Region | eRDMA is supported in the following regions: China (Beijing), China (Shanghai), China (Hangzhou), China (Shenzhen), China (Guangzhou), China (Ulanqab), and China (Heyuan). |
Instance family | The following instance families support eRDMA: |
Image |
Note: The available images vary based on the instance type. They are displayed on the instance buy page when you select an instance type that supports eRDMA. |
Number of eRDMA devices | Each ECS instance supports only one elastic RDMA interface (ERI). |
Network |
Procedure
You can enable eRDMA on an eRDMA-capable ECS instance only if the instance meets the following conditions: The eRDMA driver is installed on the instance, and an ERI is bound to the instance.
Configure eRDMA when you create an ECS instance
When you create an eRDMA-capable instance that runs Alibaba Cloud Linux, Ubuntu, or Anolis OS, you can enable eRDMA by selecting the Auto-install eRDMA Driver option, which automatically installs the eRDMA driver, and enabling the ERI feature for the primary ENI. If the Auto-install eRDMA Driver option is unavailable for the operating system version that you select, or if the automatic driver installation fails, install the driver manually or by using a script after the instance is created. For more information, see the Configure eRDMA on an existing ECS instance section in this topic.
Create an ECS instance that supports ERIs. When you create the ECS instance, take note of the following parameters or options. For information about other parameters on the ECS instance buy page, see Create an instance on the Custom Launch tab.
Instance and Image: Select an instance type that supports eRDMA and an image. For more information, see the Limits section in this topic.
Auto-install eRDMA Driver: Select this option to automatically install the eRDMA driver during ECS instance creation.
ENI: Select eRDMA Interface to the right of the primary ENI.
Important: When you create an ECS instance, you can enable the ERI feature only for the primary ENI. You can bind only one ERI to each ECS instance. If you want to use a secondary ENI to configure eRDMA on an ECS instance, create a secondary ENI for which the ERI feature is enabled and bind the ENI to the ECS instance after the ECS instance is created. For more information, see Create an ENI and Bind an ENI.
Configure eRDMA on an existing ECS instance
Log on to the ECS console. Find the ECS instance on which you want to configure eRDMA and click the instance ID to go to the Instance Details page. Click the ENIs tab and check whether an ERI is bound to the instance.
If an ERI is bound to the ECS instance, skip the step that enables the ERI feature and go to the Install the eRDMA driver step.
If no ERI is bound to the ECS instance, proceed to the next step.
Enable the ERI feature for the primary ENI or a secondary ENI of the ECS instance.
Important: Each ECS instance can have only one ERI, so enable the ERI feature for either the primary ENI or one secondary ENI, not both. We recommend that you enable the ERI feature for the primary ENI.
To enable the ERI feature for the primary ENI or a secondary ENI of an ECS instance, you can call the ModifyNetworkInterfaceAttribute operation with NetworkInterfaceId set to the ENI ID and NetworkInterfaceTrafficMode set to HighPerformance.
Create a secondary ENI, enable the ERI feature for the ENI, and then attach the ENI to the ECS instance.
Note: You can enable the ERI feature for a secondary ENI only when you separately create the ENI. You cannot enable the ERI feature for a secondary ENI when you create an ECS instance or after the ENI is created.
Create a secondary ENI. For more information, see Create a secondary ENI.
VPC and vSwitch: Select the VPC in which the ECS instance is deployed and the vSwitch to which the ECS instance is connected.
eRDMA Interface: Turn on this switch.
Bind the secondary ENI to the ECS instance. For more information, see Bind a secondary ENI.
Note: Before you bind the secondary ENI to the ECS instance, make sure that the primary ENI of the instance and the secondary ENI are not connected to the same vSwitch. Otherwise, the RDMA functionality of the secondary ENI may be unavailable in some cases due to the default route.
If you want to unbind a secondary ENI for which the ERI feature is enabled from an ECS instance, stop the instance before you unbind the ENI. For more information, see Stop an instance.
Run the ifconfig command to view the secondary ENI. If information about the secondary ENI is not displayed in the command output, configure the secondary ENI. For more information, see Configure a secondary ENI. If information about the secondary ENI is displayed in the command output, skip this step.
Note: After secondary ENIs are bound to ECS instances, specific images used by the ECS instances cannot recognize the new secondary ENIs.
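The ModifyNetworkInterfaceAttribute call described earlier in this step can also be issued from a terminal. The following is a minimal sketch that assumes the aliyun command-line tool is installed and configured with credentials. The region ID and ENI ID are placeholders, and the helper name erdma_enable_cmd is hypothetical. Only the operation name and the NetworkInterfaceId and NetworkInterfaceTrafficMode parameters come from this topic. The script prints the command by default so that you can review it before you run it.

```shell
#!/bin/sh
# Sketch: build the CLI call that sets NetworkInterfaceTrafficMode to
# HighPerformance on an ENI. Assumes the aliyun CLI is installed and configured.
# The region ID and ENI ID used below are placeholders.

erdma_enable_cmd() {
  # $1 = region ID, $2 = ENI ID
  printf 'aliyun ecs ModifyNetworkInterfaceAttribute --RegionId %s --NetworkInterfaceId %s --NetworkInterfaceTrafficMode HighPerformance\n' "$1" "$2"
}

cmd=$(erdma_enable_cmd "cn-hangzhou" "eni-xxxxxxxx")

# Dry run by default: print the command. Set RUN=1 to execute it.
if [ "${RUN:-0}" = "1" ]; then
  $cmd
else
  echo "$cmd"
fi
```

Printing the command first makes it easy to confirm the ENI ID before the change is applied, which matters because each instance can have only one ERI.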
Install the eRDMA driver.
Install the eRDMA driver manually or by using a script based on the actual scenario.
If you use a script to install the eRDMA driver, the installation package for the latest stable eRDMA driver version is automatically downloaded.
If you want to manually install the eRDMA driver, you can download the package for a specific eRDMA driver version.
Use a script to install the eRDMA driver
Run the following commands to execute a script to install the eRDMA driver:
curl -O http://mirrors.cloud.aliyuncs.com/erdma/env_setup.sh
sudo /bin/bash -c '/bin/bash env_setup.sh > /var/log/erdma_install.log 2>&1'
The script automatically installs the dependency packages that are required by the eRDMA driver, downloads the eRDMA driver package, and installs the eRDMA driver. Wait for the script execution to complete.
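Because the script output is redirected to a log file, a failure is easy to miss. The following sketch wraps a command, writes its output to a log file, and prints the last lines of the log if the command fails. The helper name run_logged is hypothetical, and the sketch assumes that env_setup.sh returns a non-zero exit status on failure, which this topic does not state.

```shell
#!/bin/sh
# Sketch: run a command with its output captured in a log file and
# show the end of the log when the command fails.

run_logged() {
  # $1 = log file, remaining arguments = command to run
  log=$1
  shift
  if "$@" >"$log" 2>&1; then
    echo "finished, full log in $log"
  else
    status=$?
    echo "failed with exit status $status, last lines of $log:" >&2
    tail -n 20 "$log" >&2
    return "$status"
  fi
}

# Example (run as root so that the log file under /var/log is writable):
#   run_logged /var/log/erdma_install.log /bin/bash env_setup.sh
```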
Note: If the eRDMA driver fails to be installed by using the script, check the installation logs in the following path: /var/log/erdma_install.log.
Manually install the eRDMA driver
Update the prerequisite package.
For Alibaba Cloud Linux 3, CentOS, and Anolis OS, run the following command:
sudo yum update -y
For Ubuntu, skip this step.
Run the following commands in sequence to query the latest kernel package version and operating system kernel version:
rpm -qa | grep kernel   # Query the latest kernel package version.
uname -r                # Query the operating system kernel version.
The command outputs shown in the following figure indicate that the kernel package version is the same as the operating system kernel version. In this case, you do not need to perform additional operations. If the versions are different, restart the ECS instance to make the versions the same.
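The comparison described above can be automated on RPM-based systems. The following sketch extracts the newest installed kernel package version from rpm output and compares it with the running kernel. The helper names are hypothetical, and the sketch assumes that kernel packages are named kernel-<version>-<release>.<arch> and that GNU sort -V orders them closely enough to RPM version ordering for this check.

```shell
#!/bin/sh
# Sketch: decide whether a restart is needed because the running kernel
# differs from the newest installed kernel package.

newest_kernel() {
  # Reads rpm package names on stdin. Keeps only "kernel-<digit>..." entries
  # (this skips kernel-devel, kernel-headers, and similar packages),
  # strips the "kernel-" prefix, and prints the highest version.
  sed -n 's/^kernel-\([0-9].*\)$/\1/p' | sort -V | tail -n 1
}

kernel_reboot_needed() {
  # $1 = running kernel (uname -r), $2 = newest installed kernel
  [ "$1" != "$2" ]
}

if command -v rpm >/dev/null 2>&1; then
  running=$(uname -r)
  newest=$(rpm -qa | newest_kernel)
  if [ -z "$newest" ]; then
    echo "no kernel package found in rpm output"
  elif kernel_reboot_needed "$running" "$newest"; then
    echo "restart needed: running $running, newest installed $newest"
  else
    echo "kernel versions match: $running"
  fi
fi
```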
Install dependency packages.
If the ECS instance is an x86 instance, run one of the following commands based on the instance operating system.
For Alibaba Cloud Linux 3, CentOS, and Anolis OS, run the following command:
sudo yum install gcc-c++ dkms cmake kernel-devel kernel-headers libnl3 libnl3-devel
For Ubuntu, run the following command:
sudo apt-get install dkms cmake libnl-3-dev libnl-route-3-dev linux-headers-$(uname -r)
If the ECS instance is an Arm instance, the driver is built from source code, which requires a large number of dependencies that are subject to change. You can skip this step and run the installation script. If the installation script fails to install dependency packages, you are prompted to install the required packages. Install the dependency packages as prompted and then re-install the eRDMA driver.
Download the eRDMA driver installation package.
Run the following command to download the eRDMA driver installation package from an internal URL:
wget http://mirrors.cloud.aliyuncs.com/erdma/erdma_installer-latest.tar.gz
Run the following command to download the eRDMA driver installation package from a public URL:
wget https://mirrors.aliyun.com/erdma/erdma_installer-latest.tar.gz
In this example, the installation package for the latest eRDMA driver version is downloaded. You can download the installation package for a specific eRDMA driver version based on your business scenarios. The following table describes the release notes for eRDMA driver versions.
Run the following command to decompress the installation package and then go to the directory to which the installation package is decompressed:
tar -xvf erdma_installer-latest.tar.gz && cd erdma_installer
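The internal URL is typically reachable only from within Alibaba Cloud, so a script may want to try it first and fall back to the public URL. The following sketch shows one way to do this with wget. The helper name fetch_first and the FETCH override variable are hypothetical, and the 10-second timeout is an arbitrary choice.

```shell
#!/bin/sh
# Sketch: try each candidate URL in order and stop at the first one
# that downloads successfully.

fetch_first() {
  # $1 = output file, remaining arguments = candidate URLs
  out=$1
  shift
  for url in "$@"; do
    # FETCH can be overridden for testing. By default, wget is used with
    # a 10-second timeout and a single attempt per URL.
    if ${FETCH:-wget -q -T 10 -t 1 -O} "$out" "$url"; then
      echo "$url"
      return 0
    fi
  done
  return 1
}

# Example:
#   fetch_first erdma_installer-latest.tar.gz \
#     http://mirrors.cloud.aliyuncs.com/erdma/erdma_installer-latest.tar.gz \
#     https://mirrors.aliyun.com/erdma/erdma_installer-latest.tar.gz
```

The function prints the URL that succeeded, which is useful when you record where a package came from.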
Install the eRDMA driver.
Method 1: Run the following command to install the eRDMA driver interactively. During the installation, confirm the uninstallation and automatic installation steps when prompted.
sudo sh install.sh
Method 2: Run the following command to automatically install the eRDMA driver:
sudo sh install.sh --batch
View the command output to check whether the driver is installed.
The following command output indicates that the driver is installed.
The following command output indicates that the driver failed to be installed. Perform operations as prompted and then re-install the driver.
Note: If the ECS instance runs CentOS 7 and you receive an error that packages are missing when you re-install the driver, you may fail to obtain the packages by running yum commands. In this case, you may need to run the yum install -y epel-release command to install the Extra Packages for Enterprise Linux (EPEL) repository before you can obtain the packages.
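As an additional check after the installation, you can test whether the driver module is loaded. The following sketch assumes that the kernel module is named erdma and that the ibv_devinfo utility from the rdma-core user-space tools may be available. Neither name is stated in this topic, so treat both as assumptions and adjust them to match your environment.

```shell
#!/bin/sh
# Sketch: check whether a kernel module named "erdma" (assumed name) is loaded.

erdma_module_loaded() {
  # Reads /proc/modules-style text on stdin. Succeeds if the first
  # column of any line is exactly "erdma".
  awk '$1 == "erdma" { found = 1 } END { exit !found }'
}

if [ -r /proc/modules ] && erdma_module_loaded < /proc/modules; then
  echo "erdma kernel module is loaded"
  # ibv_devinfo is optional. If it is installed, list the RDMA devices.
  if command -v ibv_devinfo >/dev/null 2>&1; then
    ibv_devinfo || true
  fi
else
  echo "erdma kernel module is not loaded"
fi
```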
Test the eRDMA performance
The Perftest tool is a benchmark tool that you can use to test the basic performance of eRDMA. For more information, see Perftest documentation.
Install the Perftest tool on a server and a client. Use one of the following methods to install the Perftest tool:
Method 1: Download the Perftest tool from the official perftest repository and install the tool. When you use this method to install the tool on an ECS instance, make sure that the instance can access the Internet.
Method 2: Use the YUM or APT repository to install the Perftest tool. Run one of the following commands based on the instance operating system.
For Alibaba Cloud Linux 3, CentOS, and Anolis OS, run the following command:
sudo yum install perftest -y
For Ubuntu, run the following command:
sudo apt install perftest -y
Note: The repositories of different Linux distributions include different versions of the Perftest tool, and these versions may be incompatible with each other. When you use this method to install the Perftest tool, we recommend that you use the same Linux distribution on the communicating ECS instances. Otherwise, use Method 1.
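When you prepare several instances, it can help to select the install command from the distribution ID instead of choosing it by hand. The following sketch maps the ID field of /etc/os-release to the commands listed in Method 2. The helper name is hypothetical, and the ID values alinux, centos, anolis, and ubuntu are assumptions about what each distribution reports. The script only prints the command.

```shell
#!/bin/sh
# Sketch: choose the Perftest install command from the distribution ID.

perftest_install_cmd() {
  # $1 = ID field from /etc/os-release (assumed values listed below)
  case "$1" in
    alinux|centos|anolis) echo "sudo yum install perftest -y" ;;
    ubuntu)               echo "sudo apt install perftest -y" ;;
    *) echo "unsupported distribution: $1" >&2; return 1 ;;
  esac
}

if [ -r /etc/os-release ]; then
  # Source the file in a subshell so that its variables do not leak.
  os_id=$(. /etc/os-release && echo "$ID")
  perftest_install_cmd "$os_id" || true
fi
```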
Test the eRDMA latency.
Run the following command on the server:
ib_write_lat -R -a -F
Run the following command on the client:
ib_write_lat -R -a -F <server_ip>
<server_ip> specifies the private IP address of the ERI on the ECS instance that is used as the server. For information about how to obtain an IP address, see View IP addresses.
The following command output is returned. The command output includes performance metrics, such as the average latency, maximum latency, and minimum latency, and indicates that eRDMA works as expected.
References
In scenarios in which large-scale data transfers and high-performance network communications are required in containers, you can use eRDMA in Docker environments to allow container applications to bypass the kernel and directly access physical eRDMA devices on hosts. This helps improve the data transfer speeds and communication efficiency. For more information, see Configure eRDMA in Docker.
You can use eRDMA in Alibaba Cloud Container Service for Kubernetes (ACK) clusters to provide low-latency and high-throughput network communication capabilities for all services and applications in the clusters. For more information, see Use eRDMA in ACK clusters.
You can monitor and check the real-time working status of eRDMA. For more information, see Monitor and check eRDMA.