Disclaimer: This article may contain information about third-party products. Such information is for reference only. Alibaba Cloud does not make any guarantee, express or implied, with respect to the performance and reliability of third-party products, as well as potential impacts of operations on the products.
Question
After health checks are enabled for a TCP listener, network connection errors such as Connection reset by peer are frequently recorded in the service logs of the backend servers. Packet captures show that the requests come from the SLB instance and that the SLB instance sends RST packets to the backend server to terminate the connection.
Cause
This problem is related to the health check mechanism of SLB. TCP is not aware of the state of upper-layer services. To reduce the cost of health checks and their impact on backend services, SLB performs only a TCP three-way handshake when it checks the health of a TCP listener, and then sends an RST packet to terminate the connection without exchanging any service data. An upper-layer application, such as a Java connection pool, treats the connection as abnormal and throws a Connection reset by peer exception. The detailed data interaction process is as follows:
- The SLB instance sends a SYN packet to the backend service port.
- The backend server replies with a SYN-ACK packet if the port is listening normally.
- The SLB instance receives the response from the backend port, concludes that the listening port is normal, and marks the health check as successful.
- The SLB instance sends an RST packet to the backend port to close the connection. The health check ends without any service data being exchanged.
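The sequence above can be reproduced locally to confirm that an application reacts to it the way the logs suggest. The following is a minimal sketch, not the actual SLB implementation: the function name `tcp_health_check` is hypothetical, and the `SO_LINGER` struct layout (`"ii"`) is assumed to be that of Linux or macOS. Setting `SO_LINGER` with a zero timeout is a common way to make `close()` send an RST instead of a FIN.

```python
import socket
import struct


def tcp_health_check(host: str, port: int, timeout: float = 2.0) -> bool:
    """Hypothetical probe: complete a TCP handshake, then abort with RST.

    Returns True if the three-way handshake succeeds, False otherwise.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        # connect() performs SYN -> SYN-ACK -> ACK.
        sock.connect((host, port))
    except OSError:
        # Refused, unreachable, or timed out: the port is not healthy.
        sock.close()
        return False
    # l_onoff=1, l_linger=0: close() discards the connection and sends RST.
    # Assumption: struct linger is two C ints (Linux/macOS layout).
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, struct.pack("ii", 1, 0))
    sock.close()
    return True
```

Running this probe against a test instance of the backend service while watching its logs should show whether the application reports the aborted connection as Connection reset by peer.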
Answer
Alibaba Cloud reminds you that:
- Before you perform operations that may cause risks, such as modifying instance configurations or data, we recommend that you check the disaster recovery and fault tolerance capabilities of the instances to ensure data security.
- If you modify the configuration and data of an instance (including but not limited to ECS and RDS), we recommend that you create snapshots or enable RDS log backup.
- If you have authorized or submitted sensitive information such as the logon account and password in the Alibaba Cloud Management Console, we recommend that you modify such information in a timely manner.
To solve the problem, we recommend that you select one of the following solutions based on your business requirements or actual situation:
- Solution 1: Change the listener type
Change the TCP listener of the SLB instance to an HTTP listener or an HTTPS listener. For more information, see Add an HTTP listener and Add an HTTPS listener.
- Solution 2: Filter logs
In the application, identify connections that come from the SLB health check IP address range and ignore the related error information.
Note: The health check CIDR block of SLB is 100.64.0.0/10.
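For Solution 2, the filter can be as small as a CIDR membership test on the peer address. The sketch below uses Python's standard `ipaddress` module. The function names `is_slb_health_check` and `should_log_reset` are hypothetical, and how the peer IP address is obtained depends on your application framework.

```python
import ipaddress

# Health check CIDR block of SLB, as stated in the note above.
SLB_HEALTH_CHECK_NET = ipaddress.ip_network("100.64.0.0/10")


def is_slb_health_check(peer_ip: str) -> bool:
    """Return True if peer_ip falls inside the SLB health check CIDR block."""
    try:
        return ipaddress.ip_address(peer_ip) in SLB_HEALTH_CHECK_NET
    except ValueError:
        # Not a valid IP address string: treat it as a regular client.
        return False


def should_log_reset(peer_ip: str) -> bool:
    """Hypothetical hook: log a connection reset only for non-SLB peers."""
    return not is_slb_health_check(peer_ip)
```

An exception handler could call `should_log_reset(peer_ip)` before it writes a Connection reset by peer entry, so that resets from real clients are still recorded.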
Application scope
- SLB