This topic describes common Linux kernel network parameters of Elastic Compute Service (ECS) instances and provides answers to frequently asked questions about these parameters.
Considerations
Before you modify kernel parameters, take note of the following items:
We recommend that you modify kernel parameters based only on your business requirements and relevant data.
Before you modify a kernel parameter, make sure that you understand what the parameter does. The effect of a kernel parameter may vary based on the environment type and version.
Back up important data of an ECS instance. For more information, see Create a snapshot for a disk.
Query and modify kernel parameters of a Linux instance
You can use the /proc/sys/ directory or the /etc/sysctl.conf configuration file to modify kernel parameters while the instance is running. The two methods differ in the following ways:
/proc/sys/ is a virtual file system that provides access to kernel parameters. The net directory in the virtual file system stores all network kernel parameters that take effect in the system. You can modify the parameters while the instance is running, but the modifications become invalid after the instance is restarted. Use this method to temporarily verify a modification.
/etc/sysctl.conf is a configuration file in which you can modify the default values of kernel parameters. The modifications remain valid after the instance is restarted.
Files in the /proc/sys/ directory correspond to the full names of parameters in the /etc/sysctl.conf configuration file. For example, the /proc/sys/net/ipv4/tcp_tw_recycle file corresponds to the net.ipv4.tcp_tw_recycle parameter, and the file content is the parameter value.
Note: The net.ipv4.tcp_tw_recycle parameter is removed in Linux kernel 4.12 and later. The parameter can be configured only in kernel versions earlier than 4.12.
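For example, the following two commands read the same kernel parameter, once through the virtual file system and once through the sysctl interface. The net.ipv4.tcp_fin_timeout parameter is used here only as an illustration:
cat /proc/sys/net/ipv4/tcp_fin_timeout
sysctl net.ipv4.tcp_fin_timeout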
View and modify kernel parameters by using files in the /proc/sys/ directory
Log on to the ECS instance that runs a Linux operating system.
For more information, see Connection method overview.
Run the cat command to view the content of the file that contains the kernel parameter. For example, to view the value of the net.ipv4.tcp_tw_recycle parameter, run the following command:
cat /proc/sys/net/ipv4/tcp_tw_recycle
Run the echo command to modify the file that contains the kernel parameter. For example, to change the value of the net.ipv4.tcp_tw_recycle parameter to 0, run the following command:
echo "0" > /proc/sys/net/ipv4/tcp_tw_recycle
View and modify kernel parameters by using the /etc/sysctl.conf configuration file
Log on to the ECS instance that runs a Linux operating system.
For more information, see Connection method overview.
Run the following command to view all kernel parameters that are in effect in the current system:
sysctl -a
The following sample command output shows some of the kernel parameters:
net.ipv4.tcp_app_win = 31
net.ipv4.tcp_adv_win_scale = 2
net.ipv4.tcp_tw_reuse = 0
net.ipv4.tcp_frto = 2
net.ipv4.tcp_frto_response = 0
net.ipv4.tcp_low_latency = 0
net.ipv4.tcp_no_metrics_save = 0
net.ipv4.tcp_moderate_rcvbuf = 1
net.ipv4.tcp_tso_win_divisor = 3
net.ipv4.tcp_congestion_control = cubic
net.ipv4.tcp_abc = 0
net.ipv4.tcp_mtu_probing = 0
net.ipv4.tcp_base_mss = 512
net.ipv4.tcp_workaround_signed_windows = 0
net.ipv4.tcp_challenge_ack_limit = 1000
net.ipv4.tcp_limit_output_bytes = 262144
net.ipv4.tcp_dma_copybreak = 4096
net.ipv4.tcp_slow_start_after_idle = 1
net.ipv4.cipso_cache_enable = 1
net.ipv4.cipso_cache_bucket_size = 10
net.ipv4.cipso_rbm_optfmt = 0
net.ipv4.cipso_rbm_strictvalid = 1
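Note: The full output is long. To check only specific parameters, you can filter the output or query a single parameter. Example:
sysctl -a | grep tcp_tw
sysctl net.ipv4.tcp_max_tw_buckets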
Modify the kernel parameters.
Run the following command to temporarily modify a kernel parameter:
/sbin/sysctl -w kernel.parameter="[$Example]"
Note: Replace kernel.parameter with a kernel parameter name and [$Example] with a value based on your business requirements. For example, run the sysctl -w net.ipv4.tcp_tw_recycle="0" command to change the value of the net.ipv4.tcp_tw_recycle parameter to 0.
Permanently modify kernel parameters.
Run the following command to open the /etc/sysctl.conf configuration file:
vim /etc/sysctl.conf
Press the I key to enter Insert mode.
Modify kernel parameters as needed. For example, make the following modifications to the file:
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1
Press the Esc key and enter :wq to save and close the file.
Run the following command for the configurations to take effect:
/sbin/sysctl -p
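To confirm that a modified parameter has taken effect, you can query it again. For example, based on the sample settings above:
sysctl net.ipv6.conf.all.disable_ipv6
cat /proc/sys/net/ipv6/conf/all/disable_ipv6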
FAQ about the common kernel network parameters of Linux ECS instances
Why does a "Time wait bucket table overflow" error message appear in the /var/log/messages log?
Why does a Linux ECS instance have a large number of TCP connections in the FIN_WAIT2 state?
Why does a Linux ECS instance have a large number of TCP connections in the CLOSE_WAIT state?
What do I do if I cannot connect to a Linux ECS instance and the "nf_conntrack: table full, dropping packet" error message appears in the /var/log/messages log?
Why am I unable to access an ECS instance or an ApsaraDB RDS instance after I configure NAT for my client?
Problem description
You cannot connect to a Linux ECS instance. When you ping the instance, packets are discarded or the ping fails. The following error message frequently appears in the /var/log/messages system log:
Feb 6 16:05:07 i-*** kernel: nf_conntrack: table full, dropping packet.
Feb 6 16:05:07 i-*** kernel: nf_conntrack: table full, dropping packet.
Feb 6 16:05:07 i-*** kernel: nf_conntrack: table full, dropping packet.
Feb 6 16:05:07 i-*** kernel: nf_conntrack: table full, dropping packet.
Cause
ip_conntrack is the connection tracking module that the Linux operating system uses for Network Address Translation (NAT). The module uses a hash table to record established TCP connection entries. When the hash table is full, packets for new connections are discarded, and the nf_conntrack: table full, dropping packet error message appears.
The Linux operating system allocates memory to maintain each tracked connection. The amount of memory is related to the nf_conntrack_buckets and nf_conntrack_max parameters. By default, the value of nf_conntrack_max is four times the value of nf_conntrack_buckets. Maintaining connection entries consumes memory. If the system load is low and memory is sufficient, we recommend that you increase the value of the nf_conntrack_max parameter.
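To check how close the instance is to the limit, you can compare the number of currently tracked connections with the configured maximum. The following commands assume that the kernel uses the nf_conntrack module; on systems that still use the older ip_conntrack module, the equivalent files are located under /proc/sys/net/ipv4/netfilter/:
cat /proc/sys/net/netfilter/nf_conntrack_count
cat /proc/sys/net/netfilter/nf_conntrack_max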
Solution
Use Virtual Network Computing (VNC) to connect to the instance.
For more information, see Connect to an instance by using VNC.
Change the values of the nf_conntrack_max and nf_conntrack_tcp_timeout_established parameters.
Run the following command to open the /etc/sysctl.conf file:
vi /etc/sysctl.conf
Press the I key to enter Insert mode.
Change the value of the nf_conntrack_max parameter. Example: 655350.
net.netfilter.nf_conntrack_max = 655350
Change the value of the nf_conntrack_tcp_timeout_established parameter. Default value: 432000. Unit: seconds. Example: 1200.
net.netfilter.nf_conntrack_tcp_timeout_established = 1200
Press the Esc key and enter :wq to save and close the file.
Run the following command for the configurations to take effect:
sysctl -p
Why does a "Time wait bucket table overflow" error message appear in the /var/log/messages
log?
Problem description
The "kernel: TCP: time wait bucket table overflow" error message frequently appears in the /var/log/messages
logs of a Linux ECS instance.
Feb 18 12:28:38 i-*** kernel: TCP: time wait bucket table overflow
Feb 18 12:28:44 i-*** kernel: printk: 227 messages suppressed.
Feb 18 12:28:44 i-*** kernel: TCP: time wait bucket table overflow
Feb 18 12:28:52 i-*** kernel: printk: 121 messages suppressed.
Feb 18 12:28:52 i-*** kernel: TCP: time wait bucket table overflow
Feb 18 12:28:53 i-*** kernel: printk: 351 messages suppressed.
Feb 18 12:28:53 i-*** kernel: TCP: time wait bucket table overflow
Feb 18 12:28:59 i-*** kernel: printk: 319 messages suppressed.
Cause
The net.ipv4.tcp_max_tw_buckets
parameter is used to specify the maximum number of allowed connections in the TIME_WAIT state in the kernel. When the total number of connections that are in and are about to transition to the TIME_WAIT state on the ECS instance exceeds the net.ipv4.tcp_max_tw_buckets
value, the "kernel: TCP: time wait bucket table overflow" error message appears in the /var/log/messages
log. Then, the kernel terminates excess TCP connections.
Solution
You can increase the net.ipv4.tcp_max_tw_buckets value based on your business requirements. We also recommend that you optimize how TCP connections are created and maintained in your applications. The following section describes how to change the value of the net.ipv4.tcp_max_tw_buckets parameter.
Use VNC to connect to the instance.
For more information, see Connect to an instance by using VNC.
Run the following command to query the number of existing TCP connections:
netstat -antp | awk 'NR>2 {print $6}' | sort | uniq -c
The following command output indicates that 6,300 connections are in the TIME_WAIT state:
6300 TIME_WAIT
40 LISTEN
20 ESTABLISHED
20 CONNECTED
Run the following command to view the value of the net.ipv4.tcp_max_tw_buckets parameter:
cat /etc/sysctl.conf | grep net.ipv4.tcp_max_tw_buckets
In this example, the command output indicates that the value of the net.ipv4.tcp_max_tw_buckets parameter is 20000.
Change the value of the net.ipv4.tcp_max_tw_buckets parameter.
Run the following command to open the /etc/sysctl.conf file:
vi /etc/sysctl.conf
Press the I key to enter Insert mode.
Change the value of the net.ipv4.tcp_max_tw_buckets parameter. Example: 65535.
net.ipv4.tcp_max_tw_buckets = 65535
Press the Esc key and enter :wq to save and close the file.
Run the following command for the configurations to take effect:
sysctl -p
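Note: If the netstat command is not available on your instance, you can obtain a similar count of TIME_WAIT connections with the ss command. Example:
ss -ant | grep -c TIME-WAIT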
Why does a Linux ECS instance have a large number of TCP connections in the FIN_WAIT2 state?
Problem description
A large number of TCP connections on the Linux ECS instance are in the FIN_WAIT2 state.
Cause
This issue may occur because of the following reasons:
In HTTP services, the server proactively terminates connections in specific scenarios. For example, if a keepalive connection times out, the server terminates the connection, and the connection on the server enters the FIN_WAIT2 state.
The TCP/IP protocol stack supports half-open connections. Different from the TIME_WAIT state, the FIN_WAIT2 state does not time out. If the client does not terminate the connection, the connection remains in the FIN_WAIT2 state until the system restarts. An increasing number of connections in the FIN_WAIT2 state can cause the kernel to crash.
Solution
You can decrease the value of the net.ipv4.tcp_fin_timeout parameter to accelerate the termination of TCP connections in the FIN_WAIT2 state.
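You can check how many connections are in the FIN_WAIT2 state before the change, and run the same command again after the change takes effect. Example:
netstat -ant | grep -c FIN_WAIT2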
Use VNC to connect to the instance.
For more information, see Connect to an instance by using VNC.
Change the value of the net.ipv4.tcp_fin_timeout parameter.
Run the following command to open the /etc/sysctl.conf file:
vi /etc/sysctl.conf
Press the I key to enter Insert mode.
Change the value of the net.ipv4.tcp_fin_timeout parameter. Example: 10.
net.ipv4.tcp_fin_timeout = 10
Press the Esc key and enter :wq to save and close the file.
Run the following command for the configurations to take effect:
sysctl -p
Why does a Linux ECS instance have a large number of TCP connections in the CLOSE_WAIT state?
Problem description
A large number of TCP connections on the Linux ECS instance are in the CLOSE_WAIT state.
Cause
This issue occurs when the number of TCP connections in the CLOSE_WAIT state exceeds the normal range.
TCP uses a four-way handshake to terminate a connection, and either end of the connection can initiate the termination. If the peer terminates the connection but the local end does not, the connection on the local end enters the CLOSE_WAIT state. The local end can no longer communicate with the peer over this half-open connection and needs to terminate the connection at the earliest opportunity.
Solution
We recommend that your program detect connections that have been terminated by the peer and close them promptly.
Connect to the ECS instance.
For more information, see Connection method overview.
Check and terminate TCP connections in the CLOSE_WAIT state in the program.
The read and write functions of a programming language can be used to detect TCP connections in the CLOSE_WAIT state. You can use one of the following methods to terminate such connections in Java or C:
Java language
Use the read method on the input stream to check for the end of the stream. If the return value is -1, the end of the stream has been reached.
Use the close method to terminate the connection.
C language
Check the return value of the read function.
If the return value is 0, terminate the connection.
If the return value is less than 0, check the error. If the error is not EAGAIN, terminate the connection.
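To identify the processes that hold connections in the CLOSE_WAIT state, and therefore the programs whose connection handling needs to be fixed, you can run a command such as the following:
netstat -antp | grep CLOSE_WAIT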
Why am I unable to access an ECS instance or an ApsaraDB RDS instance after I configure NAT for my client?
Problem description
After NAT is configured for the client, the client cannot access ECS or ApsaraDB RDS instances on the server side. Such clients include ECS instances that reside in virtual private clouds (VPCs) for which source NAT (SNAT) is enabled.
Cause
This issue may occur because the values of the net.ipv4.tcp_tw_recycle
and net.ipv4.tcp_timestamps
parameters on the server side are set to 1.
If the values of the net.ipv4.tcp_tw_recycle
and net.ipv4.tcp_timestamps
parameters on the server side are set to 1, the server checks the timestamp in each TCP connection packet. If timestamps are not received in ascending order, the server does not respond.
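To confirm this cause, you can check the current values on the server. Note that the net.ipv4.tcp_tw_recycle parameter exists only in kernel versions earlier than 4.12:
sysctl net.ipv4.tcp_tw_recycle net.ipv4.tcp_timestamps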
Solution
You can select an appropriate solution based on the cloud service that is deployed on the server side.
If an ECS instance is deployed as the remote server, set the net.ipv4.tcp_tw_recycle and net.ipv4.tcp_timestamps parameters to 0 on the server.
If an ApsaraDB RDS instance is deployed as the remote server, you cannot modify the kernel parameters on the server. Instead, set the net.ipv4.tcp_tw_recycle and net.ipv4.tcp_timestamps parameters to 0 on the client.
Use VNC to connect to the instance.
For more information, see Connect to an instance by using VNC.
Change the values of the net.ipv4.tcp_tw_recycle and net.ipv4.tcp_timestamps parameters to 0.
Run the following command to open the /etc/sysctl.conf file:
vi /etc/sysctl.conf
Press the I key to enter Insert mode.
Change the values of the net.ipv4.tcp_tw_recycle and net.ipv4.tcp_timestamps parameters to 0.
net.ipv4.tcp_tw_recycle=0
net.ipv4.tcp_timestamps=0
Press the Esc key and enter :wq to save and close the file.
Run the following command for the configurations to take effect:
sysctl -p
Common Linux kernel parameters
Parameter | Description |
net.core.rmem_default | The default size of the socket receive window. Unit: byte. |
net.core.rmem_max | The maximum size of the socket receive window. Unit: byte. |
net.core.wmem_default | The default size of the socket send window. Unit: byte. |
net.core.wmem_max | The maximum size of the socket send window. Unit: byte. |
net.core.netdev_max_backlog | When the kernel processing speed is slower than the network interface controller (NIC) receive speed, excess packets are stored in the receive queue of the NIC. This parameter specifies the maximum number of packets allowed to be sent to a queue in the preceding scenario. |
net.core.somaxconn | This global parameter specifies the maximum length of the listening queue of each port. The value limits the backlog value that an application can pass to the listen() function. |
net.core.optmem_max | The maximum buffer size of each socket. |
net.ipv4.tcp_mem | The memory usage of the TCP stack. The unit is a memory page, which is 4 KB in most cases. |
net.ipv4.tcp_rmem | The receive buffer size. This parameter specifies the size of the memory that the socket uses for automatic configuration. |
net.ipv4.tcp_wmem | The send buffer size. This parameter specifies the size of the memory that the socket uses for automatic configuration. |
net.ipv4.tcp_keepalive_time | The interval at which TCP sends keepalive messages to check whether a TCP connection is valid. Unit: seconds. |
net.ipv4.tcp_keepalive_intvl | The interval at which TCP resends a keepalive message if no response is returned. Unit: seconds. |
net.ipv4.tcp_keepalive_probes | The maximum number of keepalive messages that can be sent before a TCP connection is considered invalid. |
net.ipv4.tcp_sack | This parameter specifies whether to enable TCP selective acknowledgment (SACK). A value of 1 indicates that TCP SACK is enabled. The TCP SACK feature allows the sender to retransmit only the missing packets, which improves performance. We recommend that you enable this feature for wide area network (WAN) communications. Take note that this feature increases CPU utilization. |
net.ipv4.tcp_timestamps | The TCP timestamp, which is 12 bytes in size and carried in the TCP header. The timestamp is used to trigger the calculation of the round-trip time (RTT) in a more accurate manner than the retransmission timeout method (RFC 1323). To improve performance, we recommend that you enable this option. |
net.ipv4.tcp_window_scaling | This parameter specifies whether to enable window scaling that is defined in RFC 1323. To allow the system to use a TCP window larger than 64 KB, set the value to 1 to enable window scaling. The maximum TCP window size is 1 GB. This parameter takes effect only when window scaling is enabled for both ends of a TCP connection. |
net.ipv4.tcp_syncookies | This parameter specifies whether to enable TCP SYN cookies, which help protect the instance against SYN flood attacks when the SYN backlog queue overflows. |
net.ipv4.tcp_tw_reuse | This parameter specifies whether a TIME-WAIT socket (TIME-WAIT port) can be used to establish TCP connections. |
net.ipv4.tcp_tw_recycle | This parameter specifies whether the system recycles TIME-WAIT sockets at the earliest opportunity. |
net.ipv4.tcp_fin_timeout | The time period within which a TCP connection remains in the FIN-WAIT-2 state after the local end disconnects a socket connection. Unit: seconds. During this period of time, the peer may become disconnected, never terminate the connection, or encounter an unexpected process termination. |
net.ipv4.ip_local_port_range | The range of local port numbers that TCP and UDP can use. |
net.ipv4.tcp_max_syn_backlog | The maximum number of connection requests that are in the SYN_RECV state and have not yet been acknowledged by the client, which is the length of the SYN queue. |
net.ipv4.tcp_westwood | This parameter enables the TCP Westwood congestion control algorithm on the sender side. The algorithm maintains an estimate of throughput and attempts to optimize the overall bandwidth usage. We recommend that you enable this algorithm for WAN communications. |
net.ipv4.tcp_bic | This parameter specifies whether binary increase congestion (BIC) control is enabled for fast, long-distance networks to make better use of gigabit links. We recommend that you enable this feature for WAN communications. |
net.ipv4.tcp_max_tw_buckets | The maximum number of allowed connections in the TIME_WAIT state. If the number of connections in the TIME_WAIT state exceeds this value, the excess connections are immediately terminated. The default value varies based on the instance memory size. The maximum default value is 262144. |
net.ipv4.tcp_synack_retries | The number of times that a SYN-ACK packet is retransmitted when a connection is in the SYN_RECV state. |
net.ipv4.tcp_abort_on_overflow | A value of 1 enables the system to send RST packets to terminate connections if the system receives a large number of requests within a short period of time and the relevant applications fail to process the requests. We recommend that you improve processing capabilities by optimizing processing efficiency of applications. Default value: 0. |
net.ipv4.route.max_size | The maximum number of routes allowed by the kernel. |
net.ipv4.ip_forward | This parameter specifies whether the IPv4 packet forwarding feature is enabled. |
net.ipv4.ip_default_ttl | The maximum number of hops through which a packet can pass. |
net.netfilter.nf_conntrack_tcp_timeout_established | The timeout period of established connections in the connection tracking table. If no packets are transmitted over an established connection within this period, iptables clears the connection entry. Unit: seconds. |
net.netfilter.nf_conntrack_max | The maximum number of connections that can be tracked in the connection tracking (conntrack) hash table. |
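The following excerpt shows, for illustration only, how several of the preceding parameters might be combined in the /etc/sysctl.conf file. The values are examples, not recommendations; choose values based on your workload, kernel version, and instance memory, and apply them with sysctl -p.
net.core.somaxconn = 1024
net.core.netdev_max_backlog = 5000
net.ipv4.tcp_max_syn_backlog = 4096
net.ipv4.tcp_fin_timeout = 30
net.ipv4.tcp_keepalive_time = 600
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_max_tw_buckets = 65535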