Alibaba Cloud Linux: Troubleshoot SMC

Last Updated: May 20, 2024

This topic describes the issues that may occur in Shared Memory Communication (SMC) and how to resolve the issues. This topic is applicable to Alibaba Cloud Linux 3.

SMC does not provide application performance improvements over TCP

Problem description

When you use SMC instead of TCP to accelerate the TCP connection of an application, the application performance is not improved.

Cause and solution

  • The SMC connection that is established for the application falls back to TCP. In this case, you cannot use Remote Direct Memory Access (RDMA) to accelerate network communication. For information about how to troubleshoot and resolve the fallback issue, see the "SMC falls back to TCP and RDMA cannot be used to accelerate communications" section in this topic.

  • The network communication overhead of the application accounts for a small portion of the overall overhead. For example, the application is CPU-intensive and depends only slightly on network communication.

  • SMC is incompatible with the network communication model of the application. Example scenarios:

    • Scenarios in which short-lived connections are frequently established and closed. The establishment of SMC connections involves slow-path operations such as creating and requesting RDMA resources. For applications that predominantly use short-lived connections, SMC offers no performance improvements over TCP.

    • Scenarios in which resources are limited. The resources required for SMC communications are subject to the memory and eRDMA interface (ERI) specifications of an Elastic Compute Service (ECS) instance. If the resources are insufficient, SMC may fall back to TCP. For more information, see Use SMC.

Communication fails after SMC is enabled

Problem description

After you enable Shared Memory Communications over Remote Direct Memory Access (SMC-R) for an ECS instance that runs Alibaba Cloud Linux 3, specific addresses such as the addresses of specific Internet-facing services can be pinged but cannot be accessed. After you disable SMC-R, the issue is resolved.

Cause

Some servers are not strictly compliant with the TCP specifications. When such a server processes TCP options that it does not implement, it may replay the options instead of ignoring them. As a result, the local end incorrectly considers the SMC-incapable peer server to be SMC-capable.

Note

A TCP implementation MUST (MUST-6) ignore without error any TCP Option it does not implement, assuming that the option has a length field. For more information, see RFC 9293.

If the TCP option that is used to indicate support for SMC is replayed, the local end misidentifies the peer server as being SMC-capable. In this case, a handshake error occurs. As a result, requests such as cURL requests fail, but pings over the Internet Control Message Protocol (ICMP) succeed.

You can use the check_tcpoption_replay.py tool to diagnose the issue.

The content of the check_tcpoption_replay.py script is as follows:

from scapy.all import *
import sys  # Required for sys.exit() below
import time
import argparse

# Set up command line arguments
parser = argparse.ArgumentParser(description='Check if the server replays with the same TCP option.')
parser.add_argument('-i', '--ip', required=True, help='Target IP address')
parser.add_argument('-p', '--port', required=True, type=int, help='Target port number')
parser.add_argument('-v', '--verbose', action='store_true', help='Print verbose output')
args = parser.parse_args()

# Target IP and port
target_ip = args.ip  # Get target IP from command line arguments
target_port = args.port  # Get target port from command line arguments
verbose = args.verbose  # Get verbose flag from command line arguments

# Create a TCP SYN packet that includes a special TCP Option
ip = IP(dst=target_ip)
syn = TCP(sport=RandShort(), dport=target_port, flags='S', options=[(254, b'xxxx')])
syn_ack_pkt = sr1(ip/syn, timeout=1, verbose=verbose)

# Check if the returned packet is a TCP SYN-ACK
if syn_ack_pkt and TCP in syn_ack_pkt and (syn_ack_pkt[TCP].flags & 0x12) == 0x12:  # Both SYN and ACK flags are set
    # Check for the special TCP Option
    if any(opt[0] == 254 and opt[1] == b'xxxx' for opt in syn_ack_pkt[TCP].options):
        print("The server has replayed the TCP option")
    else:
        print("The server did not replay the TCP option")
else:
    print("Failed to receive SYN-ACK, please make sure the IP and port are correct")
    sys.exit(1)

# Complete the TCP handshake
if syn_ack_pkt:
    ack = TCP(sport=syn_ack_pkt[TCP].dport, dport=target_port, flags='A', seq=syn_ack_pkt[TCP].ack, ack=syn_ack_pkt[TCP].seq + 1)
    send(ip/ack, verbose=verbose)

# Wait for 1 second before disconnecting
time.sleep(1)

# Send TCP FIN to close the connection
if syn_ack_pkt:
    fin = TCP(sport=syn_ack_pkt[TCP].dport, dport=target_port, flags='FA', seq=syn_ack_pkt[TCP].ack, ack=syn_ack_pkt[TCP].seq + 1)
    last_ack_pkt = sr1(ip/fin, timeout=1, verbose=verbose)

# Complete the four-way connection teardown
if last_ack_pkt and TCP in last_ack_pkt and last_ack_pkt[TCP].flags & 16:  # ACK flag
    last_ack = TCP(sport=syn_ack_pkt[TCP].dport, dport=target_port, flags='A', seq=last_ack_pkt[TCP].ack, ack=last_ack_pkt[TCP].seq + 1)
    send(ip/last_ack, verbose=verbose)
  1. Install Python 3 and the Scapy library.

    yum install python3 -y
    python3 -m pip install scapy
  2. Run the check_tcpoption_replay.py tool.

    python3 check_tcpoption_replay.py -i <Server IP address> -p <Server port>

    If the server replays TCP options, the command output contains "The server has replayed the TCP option". Otherwise, the command output contains "The server did not replay the TCP option".
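To check several servers in one pass, the script can be wrapped in a small helper. The following is a minimal sketch, not part of the original tool: the names classify_output and check_targets are hypothetical, and the sketch assumes that check_tcpoption_replay.py is in the current directory and that the helper runs with the privileges that Scapy needs to send raw packets. The matched strings are the messages that the script prints.

```python
import subprocess

# Messages printed by check_tcpoption_replay.py, mapped to short verdicts.
VERDICTS = {
    "The server has replayed the TCP option": "replayed",
    "The server did not replay the TCP option": "not_replayed",
    "Failed to receive SYN-ACK": "no_syn_ack",
}


def classify_output(output):
    """Map the message printed by the script to a short verdict string."""
    for message, verdict in VERDICTS.items():
        if message in output:
            return verdict
    return "unknown"


def check_targets(targets, script="check_tcpoption_replay.py"):
    """Run the script once per (ip, port) pair and collect the verdicts."""
    results = {}
    for ip, port in targets:
        proc = subprocess.run(
            ["python3", script, "-i", ip, "-p", str(port)],
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            universal_newlines=True,
            check=False,  # The script exits with 1 when no SYN-ACK arrives
        )
        results[(ip, port)] = classify_output(proc.stdout)
    return results
```

For example, check_targets([("192.0.2.10", 443)]) would return a dictionary that maps each target to replayed, not_replayed, no_syn_ack, or unknown. The address in this example is a placeholder.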

Solution

The TCP option replay issue is caused by intermediate network nodes or peers that replay TCP options, and therefore cannot be resolved on the local end. When you access the affected services that are described in the preceding "Problem description" section, we recommend that you do not use SMC.

SMC failed to be enabled after the smc_run command is run

Problem description

After you run the smc_run ./foo command to enable SMC for an application, the smcr l command, which queries SMC-R link groups, shows that no SMC-R link groups are created. The smcss -a command, which queries SMC sockets, shows that no SMC connections exist or that an SMC connection falls back to TCP on one side. For more information about the commands, see Use SMC.

Cause

The smc_run command transparently enables SMC by using the LD_PRELOAD environment variable: the dynamic link library from smc-tools that is specified in LD_PRELOAD is loaded before other libraries, and the preloaded library intercepts socket(2) calls to modify the address families and protocols of the sockets. This mechanism takes effect only for dynamically linked applications. If an application is not dynamically linked, you cannot run the smc_run command to transparently enable SMC for the application.
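To check in advance whether a binary can be affected by LD_PRELOAD, you can test whether the ELF file requests a dynamic loader. The following is a minimal sketch under the assumption that a binary without a PT_INTERP program header is not processed by the dynamic loader and therefore ignores LD_PRELOAD. The function names has_interp_segment and is_dynamically_linked are hypothetical and are not part of smc-tools.

```python
import struct

PT_INTERP = 3  # Program header type that names the dynamic loader


def has_interp_segment(data):
    """Return True if the ELF image in data contains a PT_INTERP header."""
    if len(data) < 52 or data[:4] != b"\x7fELF":
        raise ValueError("not an ELF image")
    is_64bit = data[4] == 2                 # EI_CLASS: 1 = 32-bit, 2 = 64-bit
    endian = "<" if data[5] == 1 else ">"   # EI_DATA: 1 = little endian
    if is_64bit:
        if len(data) < 64:
            raise ValueError("truncated ELF header")
        (phoff,) = struct.unpack_from(endian + "Q", data, 32)
        phentsize, phnum = struct.unpack_from(endian + "HH", data, 54)
    else:
        (phoff,) = struct.unpack_from(endian + "I", data, 28)
        phentsize, phnum = struct.unpack_from(endian + "HH", data, 42)
    for index in range(phnum):
        offset = phoff + index * phentsize
        if offset + 4 > len(data):
            break  # Program header table is truncated
        (p_type,) = struct.unpack_from(endian + "I", data, offset)
        if p_type == PT_INTERP:
            return True
    return False


def is_dynamically_linked(path):
    """Read an executable from disk and check it for a PT_INTERP header."""
    with open(path, "rb") as handle:
        return has_interp_segment(handle.read())
```

If is_dynamically_linked("./foo") returns False, smc_run is unlikely to take effect for the application, and the sysctl-based method in the following solution applies instead.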

Solution

Run the sysctl net.smc.tcp2smc command that is described in Use SMC to enable SMC.

Specific ports become unusable after SMC is enabled

Problem description

After SMC is loaded, 16 ports within the port range of 65500 to 65515 become unusable. After you make a bind(2) call for the ports, EADDRINUSE is returned.
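To confirm which ports in the range are affected on a given instance, you can attempt to bind each port and record the ones that return EADDRINUSE. The following is a minimal sketch. The names find_ports_in_use and SMC_OOB_PORTS are hypothetical, and the sketch probes IPv4 TCP sockets only.

```python
import errno
import socket

# Ports 65500 to 65515 inclusive, as described in this section.
SMC_OOB_PORTS = range(65500, 65516)


def find_ports_in_use(ports, host="0.0.0.0"):
    """Return the ports for which bind(2) fails with EADDRINUSE."""
    busy = []
    for port in ports:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            try:
                sock.bind((host, port))
            except OSError as exc:
                if exc.errno == errno.EADDRINUSE:
                    busy.append(port)
                else:
                    raise  # Report unrelated errors instead of hiding them
    return busy
```

Calling find_ports_in_use(SMC_OOB_PORTS) before and after the SMC modules are loaded shows whether the ports are held by SMC or by another process.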

Cause

SMC-R and eRDMA are used together. SMC modules use ports 65500 to 65515 in the net namespace in which ERIs reside to establish out-of-band (OOB) connections. You can run the dmesg command and view the following information in the command output:

smc: smc: load SMC module with reserve_mode
NET: Registered protocol family 43
smc: netns <netns ID> reserved ports [65500 ~ 65515] for eRDMA OOB
smc: adding ib device erdma_0 with port count 1
smc: ib device erdma_0 port 1 has pnetid

If SMC modules fail to occupy the ports, the SMC modules cannot use eRDMA devices.

Solution

Unload the SMC modules to release the ports. For information about how to unload SMC modules, see the Instructions section in the "Use SMC" topic.

SMC falls back to TCP and RDMA cannot be used to accelerate communications

Problem description

After you enable SMC to replace TCP in an application, you run the smcss -a command and the command output indicates that the SMC connection automatically falls back to TCP.

Cause

If an exception causes an SMC connection to fall back to TCP during SMC connection establishment, the SMC connection can still be used for communication, but the application that uses the SMC connection cannot leverage the performance benefits of RDMA. When an SMC-to-TCP fallback occurs, a cause code is returned. You can identify the cause of the fallback based on the code.

Solution

  1. Run the smcss -a command to obtain the cause code of the SMC-to-TCP fallback.

    Sample command output:

    State          UID   Inode   Local Address           Peer Address            Intf Mode
    ACTIVE         00000 0156721 192.168.99.21:60188     192.168.99.22:8090      0000 TCP 0x03010000
    ACTIVE         00000 1202539 172.16.4.189:44780      172.16.4.190:1811       0000 SMCR

    In the first entry, TCP in the Intf Mode column indicates that the SMC connection fell back to TCP. The cause code is 0x03010000. In the second entry, SMCR in the Intf Mode column indicates that the SMC-R connection is established. If two cause codes (example: 0x05000000 and 0x03030001) are displayed in the Intf Mode column, the first code indicates the cause for the local host and the second code indicates the cause for the peer host. In most cases, SMC-to-TCP fallbacks are caused by the peer host.

  2. Identify the causes of SMC-to-TCP fallbacks based on the cause codes and resolve the fallbacks.

    The following table describes the causes and cause codes of SMC-to-TCP fallbacks and provides solutions to the fallbacks.

    Cause code

    Description

    Possible cause and solution

    0x01010000

    Resources cannot be created due to insufficient memory.

    • Cause: Host memory is insufficient to accommodate the data structures and read and write operations that are required to establish an SMC connection.

    • Solution: Increase available memory by performing operations such as terminating unnecessary processes.

    0x02010000

    When Connection Layer Control (CLC) or Link Layer Control (LLC) messages are sent during a TCP handshake process, the Link Confirm messages that are sent over RDMA links time out.

      • Cause 1: RDMA network interface cards (RNICs) or RDMA links fail. As a result, the responses to LLC messages that are sent over RDMA links time out.

      • Solution 1: Make sure that the RNICs work as expected.

      • Cause 2: Ethernet network interface controllers (NICs) or TCP/IP networks fail. As a result, the responses to CLC messages that are sent over TCP connections time out.

      • Solution 2: Make sure that the Ethernet NICs work as expected.

    0x02020000

    A timeout occurs when LLC messages are sent to establish RDMA links.

    The cause code is not in use.

    0x03000000

    Correct IP addresses cannot be obtained due to incorrect configurations.

    • Cause: When a proposal is created to establish an SMC connection, the IP address that corresponds to the CLC socket cannot be obtained.

    • Solution: Make sure that the TCP-based CLC connection and the corresponding devices work as expected.

    0x03010000

    The peer host does not support or use SMC.

    • Cause: The peer host does not support SMC. If the peer host supports SMC, the SYN or SYNACK packet sent over a CLC connection during a TCP handshake process carries SMC TCP option flags.

    • Solution: Check whether the protocol stacks of applications on the local or peer host are replaced with SMC protocol stacks, and run the smcss command from the smc-tools package to check the status of the SMC connection. If no connections and ports that correspond to the applications exist, replace TCP with SMC and establish an SMC connection.

    0x03020000

    IPsec is not supported.

    • Cause: IPsec is used in an SMC connection, but SMC does not support IPsec.

    • Solution: Do not use IPsec in an SMC connection.

    0x03030000

    No Shared Memory Communications - Direct Memory Access (SMC-D) or SMC-R devices are available.

      • Cause 1: No RDMA devices are available for establishing an SMC-R connection.

      • Solution 1: Run the smcr d command to check whether RDMA devices are available for establishing an SMC-R connection. If no RDMA devices are available in Alibaba Cloud eRDMA scenarios, make sure that ERIs are properly configured in the ECS console and ERI drivers are properly installed on instance operating systems.

      • Cause 2: When multiple Ethernet NICs are used, Ethernet NICs that are used for SMC-R connections are not eRDMA-capable, and eRDMA devices are not found.

      • Solution 2: Run the ibv_devinfo command to query the node GUIDs of eRDMA devices. Run the ip addr command to query the MAC addresses of Ethernet NICs. Then, compare the node GUIDs of eRDMA devices with the MAC addresses of Ethernet NICs to determine whether the Ethernet NICs are eRDMA-capable.

      • Cause 3: If RDMA devices are set to run in exclusive mode, SMC searches for RDMA devices only in the net namespace in which RDMA sockets are created.

      • Solution 3: Run the rdma system command to check the operation mode of an RDMA device. If the RDMA device runs in exclusive mode, netns exclusive is displayed in the command output. To use an RDMA device in a net namespace, run the rdma dev set <RDMA device name> netns <Net namespace name> command to move the RDMA device to the net namespace. If the RDMA device is an RDMA over Converged Ethernet (RoCE) or Internet Wide Area RDMA Protocol (iWARP) device, move the RDMA device and the required Ethernet devices to the net namespace.

      • Cause 4: When eRDMA devices are used, a client attempts to replace an AF_INET6 connection with an SMC connection.

      • Solution 4: eRDMA devices support only the SMC Version 2 (SMCv2) protocol, which does not allow the replacement of AF_INET6 connections. Use TCP for such connections. To use SMC, change the address family of the application to AF_INET.

    0x03030001

    No SMC-D devices are available.

    Alibaba Cloud does not provide SMC-D devices. If the issue occurs, contact Alibaba Cloud technical support.

    0x03030002

    No SMC-R devices are available.

      • Cause 1: During SMC connection establishment, the selected RDMA device becomes invalid.

      • Solution 1: Run the smcr d command to check whether an SMC-R device is available in the system. If the required RDMA devices are Alibaba Cloud ERIs, make sure that the ERIs are added in the ECS console and ERI drivers are properly installed and configured.

      • Cause 2: When multiple Ethernet NICs are used, Ethernet NICs that are used for SMC-R connections are not eRDMA-capable, and eRDMA devices are not found.

      • Solution 2: Run the ibv_devinfo command to query the node GUIDs of eRDMA devices. Run the ip addr command to query the MAC addresses of Ethernet NICs. Then, compare the node GUIDs of eRDMA devices with the MAC addresses of Ethernet NICs to determine whether the Ethernet NICs are eRDMA-capable.

      • Cause 3: If RDMA devices are set to run in exclusive mode, SMC searches for RDMA devices only in the net namespace in which RDMA sockets are created.

      • Solution 3: Run the rdma system command to check the operation mode of an RDMA device. If the RDMA device runs in exclusive mode, netns exclusive is displayed in the command output. To use an RDMA device in a net namespace, run the rdma dev set <RDMA device name> netns <Net namespace name> command to move the RDMA device to the net namespace. If the RDMA device is an RoCE or iWARP device, move the RDMA device and the required Ethernet devices to the net namespace.

    0x03030003

    SMC-D devices do not support the ISMv2 protocol.

    Alibaba Cloud does not provide SMC-D devices. If the issue occurs, contact Alibaba Cloud technical support.

    0x03030004

    The peer host does not support the extension of the SMCv2 protocol.

    • Cause: The SMCv2 protocol is enabled for the local host, but the peer host does not support the SMCv2 protocol. The SMC protocol stack uses the SMCv1 or SMCv2 protocol based on the underlying device capabilities. In Alibaba Cloud eRDMA or RoCE v2 scenarios, SMCv2 is used.

    • Solution: Use the same type of RDMA devices on both communicating hosts to ensure that the same SMC protocol version is used by the hosts. Run the smcr d command to query SMC-R devices. In the command output, the values in the Type column indicate the types of SMC-R devices. The values include RoCE_Express, RoCE_Express2, and 0x107f. The value 0x107f indicates Alibaba Cloud eRDMA.

    0x03030005

    The peer host does not support the extension of the SMC-D v2 protocol.

    Alibaba Cloud does not provide SMC-D devices. If the issue occurs, contact Alibaba Cloud technical support.

    0x03030006

    The peer host does not have a system enterprise ID (SEID).

    The cause code is not in use.

    0x03030007

    No SMC-D v2 devices are available.

    Alibaba Cloud does not provide SMC-D devices. If the issue occurs, contact Alibaba Cloud technical support.

    0x03030008

    The peer host does not have a user-defined enterprise ID (UEID).

    • Cause: The SMCv2 protocol is used, but no UEID is specified.

    • Solution: Run the smcr ueid {show | add | del} command from the smc-tools package to specify the same UEID for both communicating hosts.

    0x03030009

    The SMC version negotiation between the local and peer hosts fails.

    • Cause: The SMC version that is negotiated by the local and peer hosts changes during the CLC handshake.

    • Solution: Make sure that the local and peer hosts run the same operating system distribution.

    0x0303000a

    The negotiation on the maximum number of connections per link group (Max Connections per LGR negotiation) fails.

    • Cause: SMCv2.1 supports Max Connections per LGR negotiation. If the outcome of the negotiation is unacceptable, a fallback occurs. For example, the negotiated number is zero or exceeds the maximum value allowed for the local host.

    • Solution: Make sure that the local and peer hosts run the same operating system distribution.

    0x0303000b

    The negotiation on the maximum number of links per link group (Max Links per LGR negotiation) fails.

    • Cause: SMCv2.1 supports Max Links per LGR negotiation. If the outcome of the negotiation is unacceptable, a fallback occurs. For example, the negotiated number is zero or exceeds the maximum value allowed for the local host.

    • Solution: Make sure that the local and peer hosts run the same operating system distribution.

    0x0303000c

    The negotiation on the SMC vendor feature between the local and peer hosts fails.

    • Cause: The SMC vendor feature negotiated by the local and peer hosts changes during the CLC handshake.

    • Solution: Make sure that the local and peer hosts run the same operating system distribution. Then, run the uname -r command to check the kernel version. If the kernel version is 5.10.134-015, do not change the value of sysctl net.smc.vendor_exp_options during connection establishment. If the kernel version is 5.10.134-016 or later, do not change the value of sysctl net.smc.experiment_vendor_options during connection establishment.

    0x03040000

    The local host and the peer host use different modes of SMC devices.

    Alibaba Cloud does not provide SMC-D devices. If the issue occurs, contact Alibaba Cloud technical support.

    0x03050000

    The remote memory buffer element (RMBE) of the peer host has an eyecatcher.

    The cause code is not in use for Linux.

    0x03060000

    The SMC connection does not support the MSG_FASTOPEN flag.

    • Cause: SMC does not support the MSG_FASTOPEN flag.

    • Solution: When SMC sockets are created, remove the MSG_FASTOPEN flag.

    0x03070000

    The IP prefix or subnet of the local host is different from the IP prefix or subnet of the peer host.

    • Cause: When RDMA devices are RoCEv1 devices, the SMCv1 protocol is automatically used. SMCv1 supports only in-subnet communication. If both communicating hosts are not in the same subnet, a fallback occurs.

    • Solution: If RDMA devices are RoCEv1 devices, make sure that both communicating hosts are in the same subnet. If RDMA devices are eRDMA-capable, the SMCv2 protocol is automatically used and no fallbacks result from subnet restrictions.

    0x03080000

    The VLAN ID of a device cannot be obtained.

    • Cause: During connection establishment, SMC attempts to obtain the VLAN ID of the device that corresponds to an SMC socket, but the attempt fails.

    • Solution: Make sure that the TCP connection identified by the quintuple and corresponding Ethernet devices work as expected.

    0x03090000

    The VLAN ID cannot be registered with an internal shared memory (ISM) device.

    Alibaba Cloud does not provide SMC-D devices. If the issue occurs, contact Alibaba Cloud technical support.

    0x030a0000

    No SMC-R RDMA links are available in the link group.

    • Cause: When an SMC-R connection is established, the connection is assigned a link from the link group to which the connection belongs. If the connection is not assigned a link, RDMA cannot be used to accelerate the connection.

    • Solution: Run the smcr d command to check whether RNICs on ECS instances work as expected. If the required RDMA devices are Alibaba Cloud ERIs, make sure that the ERIs are added in the ECS console and ERI drivers are properly installed and configured.

    0x030b0000

    The client cannot find the RDMA links provided by the server.

    • Cause: The client searches for RDMA links based on the queue pair number (QPN), global identifier (GID), and media access control (MAC) address information provided by the server and establishes a connection to the server. If the client cannot find the required RDMA links, the connection cannot be accelerated by RDMA.

    • Solution: Run the smcr d command to check whether RNICs on ECS instances work as expected. If the required RDMA devices are Alibaba Cloud ERIs, make sure that the ERIs are added in the ECS console and ERI drivers are properly installed and configured.

    0x030c0000

    The SMC version negotiation fails.

    • Cause: The SMC version negotiated by the local and peer hosts is unacceptable.

    • Solution: Make sure that the local and peer hosts run the same operating system distribution.

    0x030d0000

    The maximum number of SMC-D DMBs is reached.

    Alibaba Cloud does not provide SMC-D devices. If the issue occurs, contact Alibaba Cloud technical support.

    0x030e0000

    The peer host cannot connect to the local host over the SMC-R V2 protocol.

    • Cause: When a connection is established over the SMCv2 protocol, the client needs to find routing information based on the IP addresses provided by the server. In this case, the client cannot find routing information based on the IP addresses of the local host and the peer host.

    • Solution: Make sure that the TCP connection identified by the quintuple and corresponding Ethernet devices work as expected and are reachable. For example, make sure that Ethernet NICs, IP configurations, and routing configurations are normal and accessible.

    0x030f0000

    The flag that indicates whether the SMC-R V2 connection is an indirect connection is improperly set.

    • Cause: When a connection is established over the SMCv2 protocol, the client determines that traffic passes through a gateway based on the gateway flag from the server. However, the routing information that is obtained based on the IP addresses of the local and peer hosts indicates that traffic does not pass through a gateway.

    • Solution: Make sure that the TCP connection identified by the quintuple and corresponding Ethernet devices work as expected, are reachable, and use the same network path. For example, make sure that Ethernet NICs, IP configurations, and routing configurations are normal and accessible.

    0x04000000

    The server and the client do not use the same link group.

    • Cause: During connection establishment, the server reuses the link group, but the client wants to create a link group.

    • Solution: Run the smcr d command to check whether RNICs on ECS instances work as expected. If the required RDMA devices are Alibaba Cloud ERIs, make sure that the ERIs are added in the ECS console and ERI drivers are properly installed and configured.

    0x05000000

    The peer host rejects the handshake.

    • Cause: During connection establishment, the peer host responds with a CLC message to reject the RDMA connection.

    • Solution: Run the smcss command, find the rejected RDMA connection based on the quintuple information, and then identify the fallback cause based on the cause code of the peer host.

    0x09990000

    RDMA related resources fail to be created.

    • Cause: RDMA resources fail to be created or initialized.

    • Solution: Use an RDMA monitoring tool to view the statistics about failed resource requests. If Alibaba Cloud eRDMA devices are used, you can run the eadm stat command to view error statistics.

    0x09990001

    The RDMA RToken fails to be added.

    This is an SMC protocol stack issue. If the issue occurs, contact Alibaba Cloud technical support.

    0x09990002

    RDMA queue pairs (QPs) fail to be initialized.

    • Cause: If an RDMA link that requires an RDMA QP needs to be created during connection establishment, SMC calls InfiniBand (IB) verbs interfaces to initialize and modify the RDMA QP. During this process, an issue occurs.

    • Solution: Run the smcr d command to query available SMC-R devices. If the RDMA device is an Alibaba Cloud ERI, make sure that the ERI is properly configured in the ECS console and ERI drivers are properly installed in the operating system.

    0x09990003

    A memory region (MR) fails to be registered with the RDMA device that is used by SMC.

    • Cause: When RDMA is used for communication, MRs must be registered with an RDMA device to access and write data. If the number or size of MRs exceeds the specifications that are supported by the RDMA device, an error is reported.

    • Solution: Run the smcr d command to query the name of the RDMA device that is used by SMC. Run the ibv_devinfo -d <RDMA device name> -v | grep max_mr command. In the command output, max_mr indicates the maximum number of MRs that are supported by the RDMA device, and max_mr_size indicates the maximum size of MRs that are supported by the RDMA device. In most cases, the issue occurs because the maximum number of MRs is reached. To reduce the number of MRs, decrease the number of SMC connections.

    0x09990004

    Credits that are required for SMC flow control cannot be initialized.

    • Cause: RNICs or RDMA links fail. As a result, credit messages cannot be sent over RDMA links.

    • Solution: Make sure that RNICs work as expected.
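The lookup in steps 1 and 2 can be scripted. The following is a minimal sketch that extracts fallback entries from smcss -a output in the column layout of the sample in step 1. The names parse_fallbacks and KNOWN_CAUSES are hypothetical. The sketch assumes that cause codes are printed as 0x followed by eight hexadecimal digits after the TCP mode, and it makes no assumption about the separator between two codes. KNOWN_CAUSES contains only a few descriptions copied from the preceding table and can be extended.

```python
import re

CAUSE_CODE = re.compile(r"0x[0-9a-fA-F]{8}")

# A partial copy of the cause code table in this section.
KNOWN_CAUSES = {
    "0x03010000": "The peer host does not support or use SMC.",
    "0x03030000": "No SMC-D or SMC-R devices are available.",
    "0x05000000": "The peer host rejects the handshake.",
}


def parse_fallbacks(smcss_output):
    """Return one dict per connection whose mode column is TCP."""
    fallbacks = []
    for line in smcss_output.splitlines():
        fields = line.split()
        # Columns: State, UID, Inode, Local Address, Peer Address, Intf, Mode
        if len(fields) < 7 or fields[6] != "TCP":
            continue
        codes = [code.lower() for code in CAUSE_CODE.findall(" ".join(fields[7:]))]
        local_code = codes[0] if codes else None
        peer_code = codes[1] if len(codes) > 1 else None
        fallbacks.append({
            "local": fields[3],
            "peer": fields[4],
            "local_code": local_code,   # First code: cause on the local host
            "peer_code": peer_code,     # Second code: cause on the peer host
            "description": KNOWN_CAUSES.get(local_code, "See the cause code table."),
        })
    return fallbacks
```

The function expects the text of the command output, for example the standard output captured from smcss -a, and skips the header line and entries whose mode is SMCR.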

After you enable SMC, data collected by common network O&M tools does not meet expectations

Problem description

After you enable SMC for an ECS instance that runs Alibaba Cloud Linux 3, common network analysis tools such as tcpdump and Wireshark and network monitoring tools such as the Socket Statistics (ss) and netstat utilities collect network traffic data that does not meet expectations or cannot collect expected traffic data.

Cause

SMC-R is a communication protocol that is based on RDMA. Currently, common network O&M tools analyze or monitor only TCP traffic and cannot identify RDMA packets. As a result, the data displayed in network O&M tools does not match actual network data.

Solution

Use RDMA-related O&M tools to analyze or monitor data. For more information, see Monitor and check eRDMA.

The SMC module that is loaded on a GPU-accelerated or Super Computing Cluster (SCC) instance is unusable

Problem description

The SMC module that is loaded on a GPU-accelerated or SCC instance is unusable.

Cause

Mellanox OpenFabrics Enterprise Distribution (OFED) drivers are installed on GPU-accelerated and SCC instances. The SMC module in the OFED stack is automatically loaded and cannot work. After you install Mellanox OFED drivers, symbols for RDMA-related functions change. As a result, the SMC module that is included in the kernel fails to be loaded, and the Unknown symbol error appears.

Solution

The SMC module in Alibaba Cloud Linux 3 cannot be used on GPU-accelerated or SCC instances.

After you enable SMC, some SOL_SOCKET or SOL_TCP level options for setsockopt and getsockopt calls do not work as expected

Problem description

After you enable SMC to replace TCP in applications, some SOL_SOCKET or SOL_TCP level options that were used for the TCP connections cannot be configured by making setsockopt or getsockopt calls or do not work as expected after configuration.

Cause

After you replace the TCP protocol stack with the SMC protocol stack, shared buffers are used to transfer data over SMC links. The protocol stack design and data transfer methods of SMC greatly differ from those of TCP. As a result, some SOL_SOCKET or SOL_TCP level options are inapplicable.

Solution

Take note of the SOL_SOCKET or SOL_TCP level options that are supported or not supported by SMC in Alibaba Cloud Linux 3. The following tables describe the support of SMC for SOL_SOCKET and SOL_TCP level options.

The tables use the values Y, M, and N:

  • Y: The option is supported by SMC. It can be configured and obtained, and it works as expected.

  • M: The option is not supported by SMC. It can be configured and obtained, but it does not work as expected due to the differences in design between SMC and TCP.

  • N: The option is not supported by SMC and cannot be configured or obtained. A fallback to TCP occurs with the cause code 0x03060000 or 0x03010001.

SOL_SOCKET level options

Option                              Supported by SMC
SO_DEBUG                            Y
SO_REUSEADDR                        Y
SO_TYPE                             Y
SO_ERROR                            Y
SO_DONTROUTE                        M
SO_BROADCAST                        M
SO_SNDBUF                           Y
SO_RCVBUF                           Y
SO_SNDBUFFORCE                      Y
SO_RCVBUFFORCE                      Y
SO_KEEPALIVE                        M
SO_OOBINLINE                        M
SO_NO_CHECK                         M
SO_PRIORITY                         M
SO_LINGER                           Y
SO_BSDCOMPAT                        M
SO_REUSEPORT                        Y
SO_PASSCRED                         M
SO_PEERCRED                         M
SO_RCVLOWAT                         M
SO_SNDLOWAT                         M
SO_RCVTIMEO_OLD                     Y
SO_SNDTIMEO_OLD                     Y
SO_SECURITY_AUTHENTICATION          N
SO_SECURITY_ENCRYPTION_TRANSPORT    N
SO_SECURITY_ENCRYPTION_NETWORK      N
SO_BINDTODEVICE                     N
SO_ATTACH_FILTER                    M
SO_DETACH_FILTER                    M
SO_PEERNAME                         Y
SO_ACCEPTCONN                       M
SO_PEERSEC                          N
SO_PASSSEC                          M
SO_MARK                             M
SO_PROTOCOL                         Y
SO_DOMAIN                           Y
SO_RXQ_OVFL                         M
SO_WIFI_STATUS                      M
SO_PEEK_OFF                         N
SO_NOFCS                            M
SO_LOCK_FILTER                      Y
SO_SELECT_ERR_QUEUE                 M
SO_BUSY_POLL                        M
SO_MAX_PACING_RATE                  M
SO_BPF_EXTENSIONS                   Y
SO_INCOMING_CPU                     M
SO_ATTACH_BPF                       M
SO_ATTACH_REUSEPORT_CBPF            M
SO_ATTACH_REUSEPORT_EBPF            N
SO_CNX_ADVICE                       M
SO_MEMINFO                          M
SO_INCOMING_NAPI_ID                 M
SO_COOKIE                           Y
SO_PEERGROUPS                       N
SO_ZEROCOPY                         N
SO_TXTIME                           M
SO_BINDTOIFINDEX                    N
SO_TIMESTAMP_OLD                    M
SO_TIMESTAMPNS_OLD                  M
SO_TIMESTAMPING_OLD                 M
SO_TIMESTAMP_NEW                    M
SO_TIMESTAMPNS_NEW                  M
SO_TIMESTAMPING_NEW                 M
SO_RCVTIMEO_NEW                     Y
SO_SNDTIMEO_NEW                     Y
SO_DETACH_REUSEPORT_BPF             N

SOL_TCP level options

Option                              Supported by SMC
TCP_NODELAY                         Y
TCP_MAXSEG                          M
TCP_CORK                            Y
TCP_KEEPIDLE                        M
TCP_KEEPINTVL                       M
TCP_KEEPCNT                         M
TCP_SYNCNT                          M
TCP_LINGER2                         M
TCP_DEFER_ACCEPT                    Y
TCP_WINDOW_CLAMP                    M
TCP_INFO                            M
TCP_QUICKACK                        M
TCP_CONGESTION                      M
TCP_MD5SIG                          Y
TCP_THIN_LINEAR_TIMEOUTS            M
TCP_THIN_DUPACK                     M
TCP_USER_TIMEOUT                    M
TCP_REPAIR                          M
TCP_REPAIR_QUEUE                    M
TCP_QUEUE_SEQ                       M
TCP_REPAIR_OPTIONS                  M
TCP_FASTOPEN                        N
TCP_TIMESTAMP                       M
TCP_NOTSENT_LOWAT                   M
TCP_CC_INFO                         M
TCP_SAVE_SYN                        Y
TCP_SAVED_SYN                       Y
TCP_REPAIR_WINDOW                   M
TCP_FASTOPEN_CONNECT                N
TCP_ULP                             N
TCP_MD5SIG_EXT                      Y
TCP_FASTOPEN_KEY                    N
TCP_FASTOPEN_NO_COOKIE              N
TCP_ZEROCOPY_RECEIVE                N
TCP_CM_INQ/TCP_INQ                  M
TCP_TX_DELAY                        M
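
To observe how an individual option from the preceding tables behaves on a given socket, you can set the option and read it back. The following is a minimal sketch, and the function name probe_sockopt is hypothetical. A successful call shows only that the option is not in the N category. It cannot distinguish Y from M, because an M option is accepted but does not take effect.

```python
import errno
import socket


def probe_sockopt(sock, level, optname, value):
    """Set one integer option and read it back.

    Returns ("ok", readback_value) on success or ("error", errno_name)
    if the setsockopt or getsockopt call fails.
    """
    try:
        sock.setsockopt(level, optname, value)
        readback = sock.getsockopt(level, optname)
    except OSError as exc:
        return "error", errno.errorcode.get(exc.errno, str(exc.errno))
    return "ok", readback
```

For example, probe_sockopt(sock, socket.IPPROTO_TCP, socket.TCP_NODELAY, 1) probes a SOL_TCP level option on an existing socket. To interpret the result for SMC, run the probe in an application for which SMC is enabled, and then check the connection mode and cause code in the smcss -a output.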