This topic describes the issues that may occur in Shared Memory Communication (SMC) and how to resolve the issues. This topic is applicable to Alibaba Cloud Linux 3.
SMC does not provide application performance improvements over TCP
Problem description
When you use SMC instead of TCP to accelerate the TCP connection of an application, the application performance is not improved.
Cause and solution
The SMC connection that is established for the application falls back to TCP. In this case, you cannot use Remote Direct Memory Access (RDMA) to accelerate network communication. For information about how to troubleshoot and resolve the fallback issue, see the "SMC falls back to TCP and RDMA cannot be used to accelerate communications" section in this topic.
The network communication overhead of the application accounts for a small portion of the overall overhead. For example, the application is CPU-intensive and slightly dependent on network communication.
SMC is incompatible with the network communication model of the application. Example scenarios:
Scenarios in which short-lived connections are frequently established and closed. The establishment of SMC connections involves slow-path operations such as creating and requesting RDMA resources. For applications that predominantly use short-lived connections, SMC offers no performance improvements over TCP.
Scenarios in which resources are limited. The resources required for SMC communications are subject to the memory and eRDMA interface (ERI) specifications of an Elastic Compute Service (ECS) instance. If the resources are insufficient, SMC may fall back to TCP. For more information, see Use SMC.
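To judge whether connection setup cost dominates for a workload that uses short-lived connections, it can help to time connection establishment directly. The following is a minimal sketch, not an official tool: the function name and parameters are hypothetical, and it assumes that you run the same script once without SMC and once with SMC enabled (for example, under smc_run) and then compare the two sets of samples.

```python
import socket
import time


def measure_connect_latency(host, port, attempts=100, timeout=3.0):
    """Return the wall-clock time in seconds of each connect-and-close cycle."""
    samples = []
    for _ in range(attempts):
        start = time.perf_counter()
        # create_connection resolves the host, creates a stream socket, and connects.
        with socket.create_connection((host, port), timeout=timeout):
            pass  # Close immediately: no payload is sent.
        samples.append(time.perf_counter() - start)
    return samples


if __name__ == "__main__":
    import statistics
    import sys

    # Usage (hypothetical): python3 connect_latency.py <host> <port>
    results = measure_connect_latency(sys.argv[1], int(sys.argv[2]))
    print("median connect+close time: %.6f s" % statistics.median(results))
```

If the median time with SMC enabled is clearly higher and the application transfers little data per connection, the workload is likely one of the short-lived connection scenarios described above.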
Communication fails after SMC is enabled
Problem description
After you enable Shared Memory Communications over Remote Direct Memory Access (SMC-R) for an ECS instance that runs Alibaba Cloud Linux 3, specific addresses such as the addresses of specific Internet-facing services can be pinged but cannot be accessed. After you disable SMC-R, the issue is resolved.
Cause
Some servers do not strictly comply with the TCP specification. When such a server processes TCP options, it may replay (echo back) the options that it receives. As a result, the local end incorrectly concludes that the SMC-incapable peer server supports SMC.
A TCP implementation MUST (MUST-6) ignore without error any TCP Option it does not implement, assuming that the option has a length field. For more information, see RFC 9293.
If the TCP option that is used to indicate support for SMC is replayed, the local end misidentifies the peer server as being SMC-capable. In this case, a handshake error occurs. As a result, requests such as cURL requests fail, but pings over the Internet Control Message Protocol (ICMP) succeed.
You can use the check_tcpoption_replay.py tool to diagnose the issue.
Install Python 3 and the Scapy library.
yum install python3 -y
python3 -m pip install scapy
Run the check_tcpoption_replay.py tool.
python3 check_tcpoption_replay.py -i <Server IP address> -p <Server port>
If the server replays the TCP option, the following message is displayed in the command output:
The server has replayed the TCP option
If the server does not replay the TCP option, the following message is displayed in the command output:
The server did not replay the TCP option
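To illustrate what a replayed option looks like on the wire, the following sketch parses the raw TCP options field of a SYN-ACK segment and reports whether it carries the SMC indication. It is not the implementation of check_tcpoption_replay.py. It assumes the encoding defined in RFC 7609, in which SMC capability is signaled by the experimental TCP option (kind 254) that carries the EBCDIC encoding of "SMCR" (0xE2D4C3D9). The function names are hypothetical, and capturing the SYN-ACK is left out of the sketch.

```python
SMC_OPTION_KIND = 254                   # Experimental TCP option kind
SMC_MAGIC = bytes.fromhex("e2d4c3d9")   # "SMCR" in EBCDIC, per RFC 7609


def parse_tcp_options(raw):
    """Split a raw TCP options field into (kind, payload) pairs."""
    options = []
    i = 0
    while i < len(raw):
        kind = raw[i]
        if kind == 0:        # End of Option List
            break
        if kind == 1:        # No-Operation: a single byte without a length field
            i += 1
            continue
        if i + 1 >= len(raw):
            break            # Truncated option
        length = raw[i + 1]
        if length < 2 or i + length > len(raw):
            break            # Malformed length field
        options.append((kind, raw[i + 2:i + length]))
        i += length
    return options


def carries_smc_option(raw):
    """Return True if the options field contains the SMC experimental option."""
    return any(
        kind == SMC_OPTION_KIND and payload[:4] == SMC_MAGIC
        for kind, payload in parse_tcp_options(raw)
    )
```

If a server that is known not to support SMC returns a SYN-ACK for which carries_smc_option is true, the option was most likely replayed rather than ignored.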
Solution
The TCP option replay issue cannot be predicted or resolved on the local end because the TCP options are replayed by intermediate network nodes or peers. We recommend that you do not use SMC when you access the affected services that are described in the preceding "Problem description" section.
SMC fails to be enabled after the smc_run command is run
Problem description
After you run the smc_run ./foo command to enable SMC for an application, you run the smcr l command to view SMC-R link groups, but the command output indicates that no SMC-R link groups are created. Then, you run the smcss -a command to query SMC sockets, but the command output indicates that no SMC connections exist or that an SMC connection falls back to TCP on one side. For more information about the commands, see Use SMC.
Cause
The smc_run command uses the following mechanism to transparently enable SMC: The dynamic link library from smc-tools that is specified in the LD_PRELOAD variable is loaded before other libraries, and the preloaded library then intercepts the socket(2) calls of the application to modify the families and protocols of the sockets. If an application is not dynamically linked, LD_PRELOAD has no effect, and you cannot run the smc_run command to transparently enable SMC for the application.
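One way to check whether an application can be affected by LD_PRELOAD is to look for a PT_INTERP program header in its ELF file, because that header names the dynamic loader that processes LD_PRELOAD. The following is a minimal sketch under that assumption. The function name is hypothetical, and the check is a heuristic rather than a guarantee that smc_run works for the application.

```python
import struct

PT_INTERP = 3  # Program header type that names the dynamic loader


def is_dynamically_linked(image):
    """Return True if the ELF image requests a program interpreter."""
    if len(image) < 0x34 or image[:4] != b"\x7fELF":
        raise ValueError("not an ELF image")
    is_64bit = image[4] == 2                  # EI_CLASS: 1 = 32-bit, 2 = 64-bit
    endian = "<" if image[5] == 1 else ">"    # EI_DATA: 1 = little, 2 = big
    if is_64bit:
        (phoff,) = struct.unpack_from(endian + "Q", image, 0x20)
        phentsize, phnum = struct.unpack_from(endian + "HH", image, 0x36)
    else:
        (phoff,) = struct.unpack_from(endian + "I", image, 0x1C)
        phentsize, phnum = struct.unpack_from(endian + "HH", image, 0x2A)
    for index in range(phnum):
        # p_type is the first 4-byte field of every program header.
        (p_type,) = struct.unpack_from(endian + "I", image, phoff + index * phentsize)
        if p_type == PT_INTERP:
            return True
    # Static and static-PIE binaries have no PT_INTERP header.
    return False
```

For example, is_dynamically_linked(open("./foo", "rb").read()) returning False suggests that smc_run cannot take effect for ./foo.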
Solution
Run the sysctl net.smc.tcp2smc command that is described in Use SMC to enable SMC.
Specific ports become unusable after SMC is enabled
Problem description
After the SMC module is loaded, the 16 ports in the range of 65500 to 65515 become unusable. A bind(2) call on any of these ports returns EADDRINUSE.
Cause
This issue occurs when SMC-R is used together with eRDMA. SMC modules use ports 65500 to 65515 in the net namespace in which ERIs reside to establish out-of-band (OOB) connections. You can run the dmesg command and view the following information in the command output:
smc: smc: load SMC module with reserve_mode
NET: Registered protocol family 43
smc: netns <netns ID> reserved ports [65500 ~ 65515] for eRDMA OOB
smc: adding ib device erdma_0 with port count 1
smc: ib device erdma_0 port 1 has pnetid
If SMC modules fail to occupy the ports, the SMC modules cannot use eRDMA devices.
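To confirm from user space which ports in the range are affected, you can try to bind each port and record the ones that fail with EADDRINUSE. The following is a minimal sketch. The function name and its defaults are hypothetical, and a port that is held by another application is reported in the same way as a port that is reserved by the SMC module.

```python
import errno
import socket


def find_unbindable_ports(first=65500, last=65515, host="0.0.0.0"):
    """Return the ports in [first, last] on which bind(2) fails with EADDRINUSE."""
    blocked = []
    for port in range(first, last + 1):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            try:
                sock.bind((host, port))
            except OSError as exc:
                if exc.errno != errno.EADDRINUSE:
                    raise  # For example, a permission or address error
                blocked.append(port)
    return blocked


if __name__ == "__main__":
    print("ports that return EADDRINUSE:", find_unbindable_ports())
```

Run the script in the same net namespace as the ERI. If all 16 ports are listed while the SMC module is loaded and none is listed after the module is unloaded, the reservation is the cause.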
Solution
Unload the SMC modules to release the ports. For information about how to unload SMC modules, see the Instructions section in the "Use SMC" topic.
SMC falls back to TCP and RDMA cannot be used to accelerate communications
Problem description
After you enable SMC to replace TCP in an application, you run the smcss -a command and the command output indicates that the SMC connection automatically falls back to TCP.
Cause
If an exception causes an SMC connection to fall back to TCP during SMC connection establishment, the SMC connection can still be used for communication, but the application that uses the SMC connection cannot leverage the performance benefits of RDMA. When an SMC-to-TCP fallback occurs, a cause code is returned. You can identify the cause of the fallback based on the code.
Solution
Run the smcss -a command to obtain the cause code of the SMC-to-TCP fallback. Sample command output:
State   UID    Inode    Local Address        Peer Address        Intf Mode
ACTIVE  00000  0156721  192.168.99.21:60188  192.168.99.22:8090  0000 TCP 0x03010000
ACTIVE  00000  1202539  172.16.4.189:44780   172.16.4.190:1811   0000 SMCR
In the first entry, TCP in the Intf Mode column indicates that the SMC connection fell back to TCP. The cause code is 0x03010000. In the second entry, SMCR in the Intf Mode column indicates that the SMC-R connection is established. If two cause codes (example: 0x05000000 and 0x03030001) are displayed in the Intf Mode column, the first code indicates the cause for the local host and the second code indicates the cause for the peer host. In most cases, SMC-to-TCP fallbacks are caused by the peer host.
Identify the causes of SMC-to-TCP fallbacks based on the cause codes and resolve the fallbacks.
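When many connections are involved, the fallback entries can be extracted from the smcss -a output with a short script. The following sketch assumes the column layout of the sample output above (State, UID, Inode, Local Address, Peer Address, Intf, Mode, followed by one or two cause codes). The function name is hypothetical, and other smcss versions may print different columns.

```python
def find_fallbacks(smcss_output):
    """Return (local, peer, cause_codes) for every row whose mode is TCP."""
    fallbacks = []
    for line in smcss_output.splitlines():
        fields = line.split()
        if len(fields) < 7 or "TCP" not in fields:
            continue  # Header, blank line, or a row that did not fall back
        mode_index = fields.index("TCP")
        # One code: local cause. Two codes: local cause, then peer cause.
        codes = [f for f in fields[mode_index + 1:] if f.startswith("0x")]
        fallbacks.append((fields[3], fields[4], codes))
    return fallbacks


if __name__ == "__main__":
    import subprocess

    output = subprocess.run(
        ["smcss", "-a"], capture_output=True, text=True, check=True
    ).stdout
    for local, peer, codes in find_fallbacks(output):
        print(local, "->", peer, "cause codes:", " ".join(codes) or "none")
```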
After you enable SMC, data collected by common network O&M tools does not meet expectations
Problem description
After you enable SMC for an ECS instance that runs Alibaba Cloud Linux 3, common network analysis tools such as tcpdump and Wireshark and network monitoring tools such as the Socket Statistics (ss) and netstat utilities collect network traffic data that does not meet expectations or cannot collect expected traffic data.
Cause
SMC-R is a communication protocol that is based on RDMA. Currently, common network O&M tools analyze or monitor only TCP traffic and cannot identify RDMA packets. As a result, the data displayed in network O&M tools does not match actual network data.
Solution
Use RDMA-related O&M tools to analyze or monitor data. For more information, see Monitor and check eRDMA.
The SMC module that is loaded on a GPU-accelerated or Super Computing Cluster (SCC) instance is unusable
Problem description
The SMC module that is loaded on a GPU-accelerated or SCC instance is unusable.
Cause
Mellanox OpenFabrics Enterprise Distribution (OFED) drivers are installed on GPU-accelerated and SCC instances. The SMC module in the OFED stack is automatically loaded but cannot work. In addition, the installation of Mellanox OFED drivers changes the symbols of RDMA-related functions. As a result, the SMC module that is included in the kernel fails to be loaded, and the Unknown symbol error appears.
Solution
The SMC module in Alibaba Cloud Linux 3 cannot be used on GPU-accelerated or SCC instances.
After you enable SMC, some SOL_SOCKET or SOL_TCP level options for setsockopt and getsockopt calls do not work as expected
Problem description
After you enable SMC to replace TCP in applications, some SOL_SOCKET or SOL_TCP level options that were used for the TCP connections cannot be configured by making setsockopt or getsockopt calls or do not work as expected after configuration.
Cause
After you replace the TCP protocol stack with the SMC protocol stack, a shared buffer is used to transfer data over SMC links. The protocol stack design and data transfer methods of SMC differ greatly from those of TCP. As a result, some SOL_SOCKET or SOL_TCP level options are inapplicable.
Solution
Take note of the SOL_SOCKET or SOL_TCP level options that are supported or not supported by SMC in Alibaba Cloud Linux 3. The following tables describe the support of SMC for SOL_SOCKET or SOL_TCP level options.
The tables use the values Y, M, and N.
Y: The option is supported by SMC. The option can be configured and obtained and works as expected.
M: The option is not supported by SMC. The option can be configured and obtained but does not work as expected due to the differences in design between SMC and TCP.
N: The option is not supported by SMC and cannot be configured or obtained. A fallback to TCP occurs with the cause code 0x03060000 or 0x03010001.
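To see how a specific option behaves on a socket, you can set the option and read it back. The following is a minimal sketch with a hypothetical function name. It reports only whether the calls succeed and whether the value round-trips. It cannot tell a Y option from an M option, and it does not detect a fallback to TCP, so check the smcss -a output after you run it under SMC.

```python
import socket


def probe_sockopt(sock, level, optname, value):
    """Set an integer socket option, read it back, and classify the outcome."""
    try:
        sock.setsockopt(level, optname, value)
        readback = sock.getsockopt(level, optname)
    except OSError as exc:
        return "error: " + (exc.strerror or str(exc))
    if readback == value:
        return "round-trips"
    # Some options are adjusted by the kernel, so a mismatch is not always a fault.
    return "read back %d" % readback


if __name__ == "__main__":
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        print("TCP_NODELAY:", probe_sockopt(sock, socket.IPPROTO_TCP, socket.TCP_NODELAY, 1))
        print("SO_KEEPALIVE:", probe_sockopt(sock, socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1))
```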