If the memory usage of an instance of Tair (Redis OSS-compatible) suddenly spikes, you can refer to this topic to troubleshoot the issue.
Problem description
The memory usage of a Tair (Redis OSS-compatible) instance suddenly increases by a significant amount, in some cases reaching 100%.
Causes
The sudden increase in memory usage may be due to the following reasons:
A large amount of new data is written in a short period of time.
A large number of new connections are established in a short period of time.
Burst access generates a large amount of traffic that exceeds the network bandwidth, resulting in a backlog in the input and output buffers.
The client cannot keep up with the processing speed of Tair (Redis OSS-compatible), resulting in a backlog in the output buffer.
Solutions
Identify the causes of the sudden increase in memory usage and use the suggested solutions to resolve the issue.
Check whether a large amount of new data is written
Troubleshooting method:
On the Performance Monitor page, check the inbound traffic and write queries per second (QPS) of the instance. If the inbound traffic and write QPS follow the same trend as the memory usage, the sudden increase in memory usage is caused by a large amount of written data.
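One way to make "follows the same trend" concrete is to compute a correlation coefficient over samples read from the monitoring charts. The following is a minimal sketch, assuming you have copied equally spaced samples of write QPS and memory usage into two lists. The function name and the sample values are hypothetical and are not part of any Tair tooling.

```python
import math


def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of the same length with at least 2 samples")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # A flat series has no trend to compare against.
        return 0.0
    return cov / math.sqrt(var_x * var_y)


# Hypothetical samples taken at the same timestamps.
write_qps = [1200, 1250, 4800, 9100, 9300]
memory_pct = [41.0, 41.5, 58.0, 83.0, 96.0]
coefficient = pearson(write_qps, memory_pct)
```

A coefficient close to 1 suggests that the two metrics rise together. It does not prove causation, so treat it as a hint to confirm against the charts.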
Solution:
Configure appropriate time-to-live (TTL) values for keys to automatically delete keys that are no longer needed, or manually delete unnecessary keys.
Upgrade the instance specifications by increasing the memory capacity to mitigate the sudden increase in memory usage. For more information, see Change the configurations of an instance.
If your instance is a standard instance and the memory usage remains high after you increase the memory capacity, you can upgrade the instance to a cluster instance. This way, you can distribute data across multiple data shards to reduce the memory pressure on individual data shards. For more information, see Change the configurations of an instance.
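As an illustration of the TTL suggestion above, the following sketch assigns an expiration to keys that match a pattern and currently have none. It assumes a client object that behaves like `redis.Redis` from the redis-py package (`scan_iter`, `ttl`, and `expire`). The function name, key pattern, and TTL value are hypothetical.

```python
def apply_default_ttl(client, pattern, ttl_seconds, batch_size=100):
    """Set a TTL on keys matching `pattern` that currently have no expiration.

    `client` is assumed to expose scan_iter(), ttl(), and expire() in the
    same way as a redis-py `redis.Redis` object.
    Returns the number of keys that received a TTL.
    """
    updated = 0
    # SCAN iterates incrementally, so it avoids blocking the instance
    # the way a single KEYS call could on a large keyspace.
    for key in client.scan_iter(match=pattern, count=batch_size):
        # TTL returns -1 when the key exists but has no expiration.
        if client.ttl(key) == -1:
            if client.expire(key, ttl_seconds):
                updated += 1
    return updated


# Example usage (assumes the redis-py package and a reachable instance):
#   import redis
#   r = redis.Redis(host="<instance-endpoint>", port=6379, password="<password>")
#   apply_default_ttl(r, "session:*", 3600)
```

This sketch issues one TTL call per key, which is simple but slow on large keyspaces. Run it during off-peak hours or adapt it to use pipelining.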
Check whether a large number of new connections are established
Troubleshooting method:
On the Performance Monitor page, view the number of connections to the instance. If the number of connections suddenly increases and follows the same trend as the memory usage, the sudden increase in memory usage is caused by a large number of new connections.
Solution:
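Check whether the client application leaks connections, for example by opening connections without closing them or returning them to the connection pool.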
Configure connection timeout periods to automatically close idle connections. For more information, see Specify a timeout period for client connections.
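Before you choose a timeout period, it can help to see how long existing connections have been idle. The sketch below filters client entries by their `idle` field. It assumes the input is a list of dicts shaped like the entries returned by `client_list()` in redis-py, where values may be strings. The function name and the threshold are hypothetical.

```python
def idle_clients(clients, max_idle_seconds):
    """Return (addr, idle_seconds) for connections idle longer than the limit.

    `clients` is assumed to be a list of dicts with at least the `addr`
    and `idle` fields from CLIENT LIST output.
    """
    stale = []
    for info in clients:
        idle = int(info.get("idle", 0))
        if idle > max_idle_seconds:
            stale.append((info.get("addr", "?"), idle))
    # Longest-idle connections first.
    stale.sort(key=lambda item: item[1], reverse=True)
    return stale


# Hypothetical entries for illustration.
sample_clients = [
    {"addr": "10.0.0.1:5000", "idle": "12"},
    {"addr": "10.0.0.2:5001", "idle": "7200"},
    {"addr": "10.0.0.3:5002", "idle": "900"},
]
stale_connections = idle_clients(sample_clients, 600)
```

Many connections from the same application host that have been idle for a long time can indicate a leak rather than normal pooling.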
Check whether burst traffic causes a backlog in the input and output buffers
Troubleshooting method:
Check whether the inbound and outbound traffic usage of the instance reaches 100%.
Run the MEMORY STATS command to check whether clients.normal occupies an excessive amount of memory.
Note: clients.normal reflects the total amount of memory that is used by the input and output buffers for all normal client connections.
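To judge whether clients.normal is excessive, you can compare it with the total allocated memory from the same MEMORY STATS reply. The following is a minimal sketch, assuming the reply has already been converted to a dict, as `memory_stats()` in redis-py does. The function name and the sample values are hypothetical, and what counts as too large depends on your workload.

```python
def client_buffer_share(stats):
    """Return clients.normal as a fraction of total.allocated (0.0 to 1.0).

    `stats` is assumed to be a dict holding the `clients.normal` and
    `total.allocated` fields of a MEMORY STATS reply, both in bytes.
    """
    total = int(stats.get("total.allocated", 0))
    if total <= 0:
        return 0.0
    return int(stats.get("clients.normal", 0)) / total


# Hypothetical values: 2 GiB allocated, 512 MiB of it in client buffers.
sample_stats = {
    "total.allocated": 2 * 1024 ** 3,
    "clients.normal": 512 * 1024 ** 2,
}
share = client_buffer_share(sample_stats)
```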
Solution:
Identify the cause of the traffic burst.
Increase the network bandwidth of the instance. For more information, see Manually increase the bandwidth of an instance and Enable bandwidth auto scaling.
Upgrade the instance specifications to ensure optimal usage of the input and output buffers. For more information, see Change the configurations of an instance.
Check whether client-side performance issues cause a backlog in the output buffer
Troubleshooting method:
In redis-cli, run the MEMORY DOCTOR command to view the value of big_client_buf. If big_client_buf is set to 1, at least one client has a large output buffer that consumes a significant amount of memory.
Solution:
Run the CLIENT LIST command and check the omem field to identify which client has a large output buffer that consumes a significant amount of memory. Then, check whether the client application has performance issues.
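When the instance has many connections, reading CLIENT LIST output by eye is tedious. The sketch below parses raw CLIENT LIST text, as printed by redis-cli, and ranks clients by `omem`. The function names and the sample output are hypothetical. The parser assumes the space-separated `key=value` layout that CLIENT LIST uses.

```python
def parse_client_list(raw):
    """Parse raw CLIENT LIST text into one dict of fields per client."""
    clients = []
    for line in raw.splitlines():
        fields = {}
        for token in line.split():
            key, sep, value = token.partition("=")
            if sep:
                fields[key] = value
        if fields:
            clients.append(fields)
    return clients


def top_output_buffers(raw, limit=5):
    """Return (id, addr, omem_bytes) for clients with the largest output buffers."""
    clients = [c for c in parse_client_list(raw) if int(c.get("omem", 0)) > 0]
    clients.sort(key=lambda c: int(c["omem"]), reverse=True)
    return [(c.get("id"), c.get("addr"), int(c["omem"])) for c in clients[:limit]]


# Hypothetical CLIENT LIST output, shortened to a few fields per line.
sample_output = """\
id=11 addr=10.0.0.5:40112 name= age=300 idle=0 qbuf=0 obl=0 oll=0 omem=0 cmd=get
id=12 addr=10.0.0.6:40250 name= age=120 idle=2 qbuf=0 obl=0 oll=410 omem=8421376 cmd=lrange
id=13 addr=10.0.0.7:40388 name= age=95 idle=1 qbuf=0 obl=0 oll=12 omem=245760 cmd=hgetall
"""
largest = top_output_buffers(sample_output, limit=2)
```

The `addr` value of each result identifies the client host to investigate, and the `cmd` field in the raw output shows the last command that the client ran.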