This topic describes the tools for monitoring and checking Shared Memory Communications over Remote Direct Memory Access (SMC-R) in the kernel Shared Memory Communications (SMC) stack, and explains how to use them. You can use the results returned by the tools to analyze traffic metrics in an SMC network and determine the health status of the network.
Prerequisites
The Alibaba Cloud Linux 3 operating system is used.
Use the smc-tools package to monitor and check SMC
The smc-tools package provided by Alibaba Cloud Linux 3 helps you obtain information about SMC connections, SMC resources, and the SMC stack.
Install the smc-tools package
sudo yum install -y smc-tools
Query information about the SMC-R stack
smcr device: queries information about Remote Direct Memory Access (RDMA) devices that are used by the SMC-R stack.
Sample command output:
# smcr device
Net-Dev  IB-Dev   IB-P  IB-State  Type    Crit  #Links  PNET-ID
eth1     erdma_0  1     ACTIVE    0x107f  No    0
Take note of the parameters in the following table.
Parameter
Description
Net-Dev
The name of the Ethernet device.
IB-Dev
The name of the RDMA device.
IB-P
The port of the RDMA device.
IB-State
The status of the RDMA device.
Type
The type of the RDMA device. If the device is an elastic RDMA (eRDMA) device of Alibaba Cloud, 0x107f is displayed.
#Links
The number of queue pairs (QPs) used by the RDMA device.
PNET-ID
The physical network (PNET) ID of the RDMA device.
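The device check described above can be scripted. The following Python sketch parses the sample smcr device output and lists any RDMA device whose IB-State is not ACTIVE. The function names parse_devices and inactive_devices are hypothetical, and the sketch assumes that columns are separated by whitespace and that only the trailing PNET-ID column can be empty, as in the sample.

```python
# Sample text taken from the `smcr device` output in this topic.
SAMPLE = """\
Net-Dev  IB-Dev   IB-P  IB-State  Type    Crit  #Links  PNET-ID
eth1     erdma_0  1     ACTIVE    0x107f  No    0
"""


def parse_devices(text):
    """Turn whitespace-separated `smcr device` output into a list of dicts."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    header = lines[0].split()
    rows = []
    for ln in lines[1:]:
        values = ln.split()
        # Assumption: a short row means the trailing PNET-ID column is empty.
        values += [""] * (len(header) - len(values))
        rows.append(dict(zip(header, values)))
    return rows


def inactive_devices(rows):
    """Return the names of RDMA devices whose IB-State is not ACTIVE."""
    return [r["IB-Dev"] for r in rows if r["IB-State"] != "ACTIVE"]


devices = parse_devices(SAMPLE)
print(devices[0]["IB-Dev"], devices[0]["IB-State"], devices[0]["Type"])
print("inactive devices:", inactive_devices(devices))
```

On a live system, the text could instead be captured from the command itself, for example with subprocess.run(["smcr", "device"], capture_output=True, text=True).stdout.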
smcr linkgroup: queries information about SMC-R link groups.
Note: A link group in SMC-R consists of RDMA resources, including QPs, protection domains (PDs), and memory regions (MRs). By default, a link group supports 32 SMC connections.
Sample command output:
# smcr linkgroup
LG-ID     LG-Role  LG-Type  VLAN  #Conns  PNET-ID
00000300  SERV     SINGLE   0     0       1234
Take note of the number of link groups, which indicates the number of QPs in use, and of the parameter in the following table.
Parameter
Description
#Conns
The number of SMC connections carried on the link group.
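As a rough illustration of how the link group count and the #Conns column relate, the following Python sketch counts link groups in the sample output, sums their connections, and compares the sum with the default of 32 connections per link group mentioned in the note. The function name linkgroup_usage is hypothetical, and the sketch assumes whitespace-separated columns with no empty values.

```python
# Default number of SMC connections per link group, as stated in this topic.
DEFAULT_CONNS_PER_LINKGROUP = 32

# Sample text taken from the `smcr linkgroup` output in this topic.
SAMPLE = """\
LG-ID     LG-Role  LG-Type  VLAN  #Conns  PNET-ID
00000300  SERV     SINGLE   0     0       1234
"""


def linkgroup_usage(text, per_group=DEFAULT_CONNS_PER_LINKGROUP):
    """Count link groups and compare carried connections with capacity."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    conns_idx = lines[0].split().index("#Conns")
    conns = [int(ln.split()[conns_idx]) for ln in lines[1:]]
    return {
        "link_groups": len(conns),
        "connections": sum(conns),
        "capacity": len(conns) * per_group,
    }


usage = linkgroup_usage(SAMPLE)
print(usage)
```

If per-link-group limits differ in your environment, pass the actual value as per_group.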
smcr stats: queries statistics about the SMC-R stack in the current network namespace.
Sample command output:
# smcr stats
SMC-R Connections Summary
  Total connections handled             7
  SMC connections                       5
  Handshake errors                      0
  Avg requests per SMC conn      518103.6
  TCP fallback                          2

RX Stats
  Data transmitted (Bytes)       18133584 (18.13M)
  Total requests                  1295262
  Buffer full                           0 (0.00%)
            8KB    16KB    32KB    64KB   128KB   256KB   512KB  >512KB
  Bufs        0       0       0       0       0       5       0       0
  Reqs   1.295M       0       0       0       0       0       0       0

TX Stats
  Data transmitted (Bytes)       18133584 (18.13M)
  Total requests                  1295256
  Buffer full                           0 (0.00%)
  Buffer full (remote)                  0 (0.00%)
  Buffer too small                      0 (0.00%)
  Buffer too small (remote)             0 (0.00%)
            8KB    16KB    32KB    64KB   128KB   256KB   512KB  >512KB
  Bufs        0       0       0       0       0       5       0       0
  Reqs   1.295M       0       0       0       0       0       0       0

Extras
  Special socket calls                  5
Take note of the parameters in the following table.
Parameter
Description
Total connections handled
The total number of connections that were handled by the SMC-R stack, which is the sum of the SMC connections, Handshake errors, and TCP fallback values.
SMC connections
The total number of connections that were converted into SMC-R connections.
Handshake errors
The total number of connections that failed due to errors during the handshake phase, for example, because no response was received from the peer.
Avg requests per SMC conn
The average number of requests received or sent per SMC connection.
TCP fallback
The total number of connections that fell back to TCP/IP.
Rx/Data transmitted (Bytes)
The total number of bytes that were received over SMC-R connections.
Rx/Total requests
The total number of requests that were received over SMC-R connections.
Rx/Buffer full
The total number of times that the receive (Rx) buffers for SMC-R connections became full. The Rx buffer allocated to a connection can become full if the user-mode application that uses the connection does not read data from the buffer promptly. While the Rx buffer is full, the sender is subject to backpressure and the receiver cannot receive new data. To decrease the value of this parameter, make sure that the user-mode applications that use SMC-R connections read data from the Rx buffers promptly, or increase the capacity of the Rx buffers.
Rx/Bufs
The distribution of the Rx buffers used by SMC-R connections. SMC-R maintains a memory pool for each link group. When a connection is established, SMC-R allocates an idle memory block of a suitable size from the memory pool to the connection as the Rx buffer. If no idle memory block is available, SMC-R creates a new memory block of a suitable size. After the connection is closed, the memory block is returned to the memory pool. The value of this parameter indicates the total number of times that Rx buffers were allocated from the memory pool to SMC-R connections, broken down by Rx buffer size. The value does not represent the number of Rx buffers that are currently consuming memory.
Rx/Reqs
The distribution of sizes of requests received over SMC-R connections.
Tx/Data transmitted (Bytes)
The total number of bytes that were sent over SMC-R connections.
Tx/Total requests
The total number of requests that were sent over SMC-R connections.
Tx/Buffer full
The total number of times that the transmit (Tx) buffers for SMC-R connections became full. The Tx buffer of a connection can become full if the SMC-R stack does not send the data delivered by the user-mode application to the link promptly. If the percentage is high, increase the capacity of the Tx buffers based on your business requirements.
Tx/Buffer full (remote)
The total number of times that the peer Rx buffers for SMC-R connections became full. If the peer Rx buffer for an SMC-R connection is full, the local end cannot send data to the peer. If the percentage is high, increase the capacity of the peer Rx buffers based on your business requirements.
Tx/Buffer too small
The total number of times that a request sent over an SMC-R connection was larger than the corresponding Tx buffer. A non-zero value indicates that the Tx buffer is too small for the requests being sent. If the percentage is high, increase the capacity of the Tx buffers based on your business requirements.
Tx/Buffer too small (remote)
The total number of times that a request sent over an SMC-R connection was larger than the corresponding peer Rx buffer. A non-zero value indicates that the peer Rx buffer is too small for the requests being sent. If the percentage is high, increase the capacity of the peer Rx buffers based on your business requirements.
Tx/Bufs
The distribution of the Tx buffers used by SMC-R connections. SMC-R maintains a memory pool for each link group. When a connection is established, SMC-R allocates an idle memory block of a suitable size from the memory pool to the connection as the Tx buffer. If no idle memory block is available, SMC-R creates a new memory block of a suitable size. After the connection is closed, the memory block is returned to the memory pool. The value of this parameter indicates the total number of times that Tx buffers were allocated from the memory pool to SMC-R connections, broken down by Tx buffer size. The value does not represent the number of Tx buffers that are currently consuming memory.
Tx/Reqs
The distribution of sizes of requests that were sent over SMC-R connections.
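The relationships among the summary counters can be checked with a short script. The following Python sketch parses the scalar counters from an excerpt of the sample smcr stats output, verifies that Total connections handled equals the sum of SMC connections, Handshake errors, and TCP fallback, and derives the fallback ratio. The names parse_stats and summarize are hypothetical. The avg_requests formula, (Rx total requests + Tx total requests) / SMC connections, is an inference that happens to match the sample value 518103.6 and is not a documented definition. Histogram rows (Bufs and Reqs) are skipped by this sketch.

```python
import re

# Excerpt of the sample `smcr stats` output in this topic.
SAMPLE = """\
SMC-R Connections Summary
  Total connections handled             7
  SMC connections                       5
  Handshake errors                      0
  Avg requests per SMC conn      518103.6
  TCP fallback                          2

RX Stats
  Data transmitted (Bytes)       18133584 (18.13M)
  Total requests                  1295262
  Buffer full                           0 (0.00%)
  Bufs        0       0       0       0       0       5       0       0

TX Stats
  Data transmitted (Bytes)       18133584 (18.13M)
  Total requests                  1295256
  Buffer full                           0 (0.00%)
  Buffer full (remote)                  0 (0.00%)
  Buffer too small                      0 (0.00%)
  Buffer too small (remote)             0 (0.00%)
"""

SECTIONS = {"SMC-R Connections Summary": "summary", "RX Stats": "rx", "TX Stats": "tx"}
# An indented label, at least two spaces, one number, an optional "(...)" suffix.
LINE = re.compile(r"^\s+([A-Za-z][^\d]*?)\s{2,}([\d.]+)(?:\s+\(.*\))?\s*$")


def parse_stats(text):
    """Map (section, label) to the numeric value of each scalar counter."""
    stats, section = {}, None
    for line in text.splitlines():
        if line and not line[0].isspace():
            # Unindented lines are section titles; unknown sections are ignored.
            section = SECTIONS.get(line.strip())
            continue
        match = LINE.match(line)
        if match and section:
            stats[(section, match.group(1))] = float(match.group(2))
    return stats


def summarize(stats):
    """Derive simple health indicators from the parsed counters."""
    total = stats[("summary", "Total connections handled")]
    smc = stats[("summary", "SMC connections")]
    errors = stats[("summary", "Handshake errors")]
    fallback = stats[("summary", "TCP fallback")]
    requests = stats[("rx", "Total requests")] + stats[("tx", "Total requests")]
    return {
        "total_is_sum": total == smc + errors + fallback,
        "fallback_ratio": fallback / total if total else 0.0,
        # Assumption inferred from the sample, not a documented formula.
        "avg_requests": requests / smc if smc else 0.0,
        "rx_buffer_full": stats[("rx", "Buffer full")],
        "tx_buffer_full": stats[("tx", "Buffer full")],
    }


report = summarize(parse_stats(SAMPLE))
print(report)
```

A high fallback_ratio or non-zero buffer-full counters point to the corresponding rows in the table above for tuning guidance.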
Query information about SMC-R connections
smcss: queries basic information about SMC sockets that are being established, are being closed, and are established in the current network namespace.
Note: The output of this command includes the SMC sockets that fell back to TCP/IP.
Sample command output:
# smcss
State   UID    Inode    Local Address    Peer Address        Intf  Mode
ACTIVE  00994  2954337  192.168.4.78:80  192.168.4.79:36000  0000  SMCR
ACTIVE  00994  2954336  192.168.4.78:80  192.168.4.79:35994  0000  SMCR
ACTIVE  00994  2954333  192.168.4.78:80  192.168.4.79:35978  0000  SMCR
ACTIVE  00994  2950860  192.168.4.78:80  192.168.4.79:35972  0000  SMCR
ACTIVE  00994  2953298  192.168.4.78:80  192.168.4.79:35966  0000  SMCR
ACTIVE  00994  2953297  192.168.4.78:80  192.168.4.79:35948  0000  TCP 0x03010000
ACTIVE  00994  2954330  192.168.4.78:80  192.168.4.79:35922  0000  TCP 0x03010000
ACTIVE  00994  2947957  192.168.4.78:80  192.168.4.79:35920  0000  TCP 0x03010000
ACTIVE  00994  2953293  192.168.4.78:80  192.168.4.79:35822  0000  TCP 0x03010000
ACTIVE  00994  2955286  192.168.4.78:80  192.168.4.79:35752  0000  TCP 0x03010000
Take note of the parameters in the following table.
Parameter
Description
State
The status of the socket. Valid values:
INIT: The socket is being initialized.
CLOSED: The socket is closed.
LISTEN: The socket is a listening socket.
ACTIVE: The connection is established.
PEERCLW1: The socket no longer sends data to the peer.
PEERCLW2: The socket no longer sends data to or receives data from the peer.
APPLCLW1: The socket no longer receives data from the peer.
APPLCLW2: The socket no longer receives data from or sends data to the peer.
APPLFINCLW: The socket is closed by the peer.
PEERFINCLW: The socket is closed locally.
PEERABORTW: The socket is unexpectedly closed locally.
PROCESSABORT: The socket is unexpectedly closed by the peer.
Local Address
The local IPv4 address and port number. SMC supports only IPv4 protocols.
Peer Address
The peer IPv4 address and port number. SMC supports only IPv4 protocols.
Mode
The communication mode.
SMCR: uses the SMC-R stack.
TCP <fallback reason>: falls back to use the TCP/IP stack.
Note: A numeric code indicates the fallback reason. For information about the meaning of each numeric code, see the "SMC falls back to TCP and RDMA cannot be used to accelerate communications" section of the "Troubleshoot SMC" topic.
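To see at a glance how many sockets use SMC-R and how many fell back to TCP/IP, the Mode column can be tallied. The following Python sketch counts the modes in the sample smcss output and groups fallback sockets by their numeric reason code. The function name summarize_modes is hypothetical, and the sketch assumes that every data row has the same whitespace-separated columns as the sample; rows with fewer columns are skipped.

```python
from collections import Counter

# Sample text taken from the `smcss` output in this topic.
SAMPLE = """\
State   UID    Inode    Local Address    Peer Address        Intf  Mode
ACTIVE  00994  2954337  192.168.4.78:80  192.168.4.79:36000  0000  SMCR
ACTIVE  00994  2954336  192.168.4.78:80  192.168.4.79:35994  0000  SMCR
ACTIVE  00994  2954333  192.168.4.78:80  192.168.4.79:35978  0000  SMCR
ACTIVE  00994  2950860  192.168.4.78:80  192.168.4.79:35972  0000  SMCR
ACTIVE  00994  2953298  192.168.4.78:80  192.168.4.79:35966  0000  SMCR
ACTIVE  00994  2953297  192.168.4.78:80  192.168.4.79:35948  0000  TCP 0x03010000
ACTIVE  00994  2954330  192.168.4.78:80  192.168.4.79:35922  0000  TCP 0x03010000
ACTIVE  00994  2947957  192.168.4.78:80  192.168.4.79:35920  0000  TCP 0x03010000
ACTIVE  00994  2953293  192.168.4.78:80  192.168.4.79:35822  0000  TCP 0x03010000
ACTIVE  00994  2955286  192.168.4.78:80  192.168.4.79:35752  0000  TCP 0x03010000
"""


def summarize_modes(text):
    """Count sockets per mode and fallback sockets per reason code."""
    modes, reasons = Counter(), Counter()
    for line in text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 7:
            continue  # row without a Mode column, not handled by this sketch
        mode = fields[6]
        modes[mode] += 1
        if mode == "TCP" and len(fields) > 7:
            reasons[fields[7]] += 1
    return modes, reasons


modes, reasons = summarize_modes(SAMPLE)
print("modes:", dict(modes))
print("fallback reasons:", dict(reasons))
```

The reason codes collected in reasons can then be looked up in the "Troubleshoot SMC" topic.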
smcss -l: queries the sockets in the LISTEN state in the current network namespace.
The smcss -l command has the same output parameters as the smcss command.
smcss -R: queries the sockets that run on the SMC-R stack in the current network namespace.
Sample command output:
# smcss -R
State   UID    Inode    Local Address       Peer Address     Intf  Mode  Role  IB-device  Port  Linkid  GID                                      Peer-GID
ACTIVE  00000  1833669  192.168.4.79:33618  192.168.4.78:80  0000  SMCR  CLNT  erdma_0    01    01      0000:0000:0000:0000:0000:ffff:c0a8:044f  0000:0000:0000:0000:0000:ffff:c0a8:044e
ACTIVE  00000  1833667  192.168.4.79:33604  192.168.4.78:80  0000  SMCR  CLNT  erdma_0    01    01      0000:0000:0000:0000:0000:ffff:c0a8:044f  0000:0000:0000:0000:0000:ffff:c0a8:044e
ACTIVE  00000  1828405  192.168.4.79:33590  192.168.4.78:80  0000  SMCR  CLNT  erdma_0    01    01      0000:0000:0000:0000:0000:ffff:c0a8:044f  0000:0000:0000:0000:0000:ffff:c0a8:044e
ACTIVE  00000  1833665  192.168.4.79:33578  192.168.4.78:80  0000  SMCR  CLNT  erdma_0    01    01      0000:0000:0000:0000:0000:ffff:c0a8:044f  0000:0000:0000:0000:0000:ffff:c0a8:044e
ACTIVE  00000  1833663  192.168.4.79:33564  192.168.4.78:80  0000  SMCR  CLNT  erdma_0    01    01      0000:0000:0000:0000:0000:ffff:c0a8:044f  0000:0000:0000:0000:0000:ffff:c0a8:044e
In addition to the preceding basic parameters, take note of the parameters in the following table.
Parameter
Description
IB-device
The name of the RDMA device used for the connection.
Port
The port of the RDMA device used for the connection.
GID
The global ID (GID) of the RDMA device used for the connection.
Peer-GID
The GID of the peer RDMA device.
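In the sample output, the GID and Peer-GID values have the form of IPv4-mapped IPv6 addresses, so they can be matched against the Local Address and Peer Address columns. The following Python sketch decodes such a GID with the standard ipaddress module. The function name gid_to_ipv4 is hypothetical, and the IPv4-mapped form is an observation about this sample only; GIDs in other environments may not follow it, in which case the function returns None.

```python
import ipaddress


def gid_to_ipv4(gid):
    """Return the embedded IPv4 address of an IPv4-mapped GID, else None."""
    mapped = ipaddress.IPv6Address(gid).ipv4_mapped
    return str(mapped) if mapped is not None else None


# GID and Peer-GID values taken from the `smcss -R` sample output.
local_gid = "0000:0000:0000:0000:0000:ffff:c0a8:044f"
peer_gid = "0000:0000:0000:0000:0000:ffff:c0a8:044e"

print(gid_to_ipv4(local_gid))  # 192.168.4.79
print(gid_to_ipv4(peer_gid))   # 192.168.4.78
```

The decoded values match the local address 192.168.4.79 and the peer address 192.168.4.78 shown in the same rows of the sample output.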
smcss -a: queries SMC sockets in all states in the current network namespace, including the SMC sockets that fell back to TCP/IP.
The smcss -a command has the same output parameters as the smcss command.
Note: The numeric code in the Mode field for a connection indicates the fallback reason of the connection. For more information, see the "SMC falls back to TCP and RDMA cannot be used to accelerate communications" section of the "Troubleshoot SMC" topic.
References
For information about how to monitor and maintain SMC-R when eRDMA is used, see Monitor and check eRDMA.
For information about how to resolve SMC issues, such as communication failure, unavailable ports, and no application performance improvement compared with TCP, see Troubleshoot SMC.