Keys that are frequently accessed in Redis are known as hotkeys. If hotkeys are improperly managed, Redis processes may be blocked and your service may be interrupted. This topic describes the solutions that ApsaraDB for Redis provides to resolve the hotkey issue.
Overview
Causes
The hotkey issue can have the following two causes:
- The amount of data consumed by users is much greater than the amount of data produced, as in the cases of hot sale items, hot news, hot comments, and celebrity live streaming.
The hotkey issue tends to occur unexpectedly, for example, when popular commodities are sold at promotional prices during Double 11. If one of these commodities is browsed or purchased tens of thousands of times, a large number of requests must be processed, which causes the hotkey issue. Similarly, the hotkey issue tends to occur in scenarios where far more read requests than write requests are processed, such as hot news, hot comments, and celebrity live streaming.
- Requests are sharded in a concentrated manner and exceed the performance threshold of a single server.
In these cases, hotkeys are accessed much more frequently than other keys, so most of the user traffic is concentrated on a specific Redis instance, and that instance may reach a performance bottleneck. When a piece of data is accessed, the data is sharded on the server, and the corresponding key is routed to a specific server. When the load exceeds the performance threshold of that server, the hotkey issue occurs.
Impacts of the hotkey issue
- The traffic is aggregated and reaches the upper limit of the physical network adapter.
- Excessive requests queue up, and the cache sharding service stops responding.
- The database is overloaded and the service is interrupted.
When the number of hotkey requests on a server exceeds the upper limit of the network adapter of that server, the concentrated traffic prevents the server from providing other services. If hotkeys are densely distributed, a large number of hotkeys are cached. When the cache capacity is exhausted, the cache sharding service stops responding. After the caching service stops responding, the newly generated requests are sent directly to the backend database. Because the performance of the database is limited, the database is prone to being overwhelmed by a large number of requests. When the database is overwhelmed, services are interrupted and performance degrades dramatically.
Common solutions
These solutions modify the server or the client to improve performance.
Use a server cache
In this solution, the client sends requests to the server. The server runs a multi-threaded service and maintains a local cache space that is based on the LRU policy. When the server itself is congested, the server directly responds to the requests instead of forwarding them to the database. Only after the congestion is cleared does the server send the client requests to the database and rewrite the returned data to the cache. In this way, the cache is accessed and rebuilt.
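The following sketch illustrates this pattern in Python. It is a minimal model, not a production implementation: query_database() is a hypothetical placeholder for the backend lookup, the capacity value is arbitrary, and the concurrency issues listed below are not handled.

```python
# Minimal sketch of a server-side LRU cache in front of the database.
# query_database() and CACHE_CAPACITY are illustrative assumptions.
from collections import OrderedDict

CACHE_CAPACITY = 1024          # assumed capacity; tune for the workload
cache = OrderedDict()          # key -> value, ordered by recency of use

def query_database(key):
    # Placeholder for the real database lookup (hypothetical).
    return f"value-for-{key}"

def get(key):
    # Cache hit: respond directly without forwarding the request to the database.
    if key in cache:
        cache.move_to_end(key)          # mark the key as most recently used
        return cache[key]
    # Cache miss: read from the database and rebuild the cache entry.
    value = query_database(key)
    cache[key] = value
    if len(cache) > CACHE_CAPACITY:
        cache.popitem(last=False)       # evict the least recently used key
    return value
```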
However, this solution has the following issues:
- Cache building by the multi-threaded service when the cache becomes invalid
- Cache building when data is missing from the cache
- Dirty reads
Use Memcache and Redis
In this solution, a separate cache is deployed on the client to resolve the hotkey issue. The client first accesses the service layer and then the cache layer of the same server. This solution has the following advantages: nearby access, high speed, and no bandwidth limit. However, it has the following disadvantages:
- Wasted memory resources
- Dirty reading
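A minimal sketch of such a client-side cache in front of an ApsaraDB for Redis instance is shown below. It assumes the open source redis-py client; the endpoint, the TTL value, and the local_cache structure are illustrative, and the short TTL only bounds, rather than eliminates, dirty reads.

```python
# Minimal sketch of a client-side (near) cache in front of Redis.
# The endpoint and TTL are placeholders, not real values.
import time
import redis

r = redis.Redis(host="r-example.redis.rds.aliyuncs.com", port=6379)
LOCAL_TTL_SECONDS = 3            # short TTL to limit how long dirty reads can last
local_cache = {}                 # key -> (value, expiration timestamp)

def get(key):
    entry = local_cache.get(key)
    if entry and entry[1] > time.time():
        return entry[0]                                   # served from local memory
    value = r.get(key)                                    # fall back to Redis on a miss
    local_cache[key] = (value, time.time() + LOCAL_TTL_SECONDS)
    return value
```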
Use a local cache
Using a local cache causes the following issues:
- Hotkeys must be detected in advance.
- The cache capacity is limited.
- The inconsistency duration is long.
- Hotkeys may be omitted.
If traditional hotkey solutions are all defective, how can the hotkey issue be resolved?
ApsaraDB for Redis provides solutions to the hotkey issue
Read/write splitting solution
The nodes in the architecture serve the following purposes:
- Load balancing is implemented at the Server Load Balancer (SLB) layer.
- Read/write splitting and automatic routing are implemented at the proxy layer.
- Write requests are processed by the master node.
- Read requests are processed by the read replica nodes.
- High availability (HA) is implemented on the replica node and the master node.
In practice, the client sends requests to SLB, and SLB distributes these requests to multiple proxy nodes. The proxy nodes identify and classify the requests and then distribute them. For example, a proxy node sends all write requests to the master node and all read requests to the read replica nodes. The read replica nodes in this architecture can be scaled out to resolve the issue of hotkey reads. Read/write splitting supports flexible scaling for hotkey reads, can store a large number of hotkeys, and is client-friendly.
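The routing performed at the proxy layer can be pictured with the following sketch. It assumes the redis-py client and a hand-picked set of read commands; the endpoints and the classification are illustrative and do not describe the actual proxy implementation.

```python
# Minimal sketch of read/write splitting at the proxy layer.
# Endpoints and the read-command set are illustrative assumptions.
import itertools
import redis

master = redis.Redis(host="master.example.internal", port=6379)
replicas = [redis.Redis(host=h, port=6379)
            for h in ("replica-1.example.internal", "replica-2.example.internal")]
replica_cycle = itertools.cycle(replicas)      # simple round-robin over read replicas

READ_COMMANDS = {"GET", "MGET", "HGET", "ZRANGE", "LRANGE", "EXISTS"}

def route(command, *args):
    # Write requests go to the master node; read requests are spread across
    # the read replica nodes, which can be scaled out to absorb hotkey reads.
    node = next(replica_cycle) if command.upper() in READ_COMMANDS else master
    return node.execute_command(command, *args)
```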
Hot data solution
In this solution, hotkeys are actively discovered and stored to resolve the hotkey issue. The client accesses an SLB instance and requests are distributed to a proxy node through the SLB instance. Then, the proxy node forwards the requests to the backend Redis instances.
A cache is added on the server side: a local cache is added to each proxy node, and this cache uses the LRU algorithm to cache hot data. In addition, a hotkey computing module is added to the backend data node to return the hot data.
The proxy architecture has the following benefits:
- The proxy nodes cache the hot data, and their read capability can be scaled out.
- The database node computes the hot data set at a specified time.
- The database returns the hot data to the proxy nodes.
- The proxy architecture is transparent to the client, so no compatibility changes are required on the client.
Process hotkeys
Read hot data
The processing of hotkeys is divided into two jobs: writing and reading. During the data writing process, SLB receives data K1 and writes it to a Redis database through a proxy node. If K1 becomes a hotkey after the calculation conducted by the backend hotkey computing module, the proxy node caches the hotkey. In this way, the client can directly access K1 without using Redis. The proxy node can be scaled out. Therefore, the accessibility of the hot data can be enhanced.
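The following sketch outlines how a proxy node might serve a hotkey such as K1 from its local cache. The hot_keys set (fed by the backend hotkey computing module), forward_to_redis(), and the capacity value are hypothetical names used only for illustration.

```python
# Minimal sketch of the proxy-side hot data path.
# hot_keys, forward_to_redis(), and HOT_CACHE_CAPACITY are illustrative.
from collections import OrderedDict

HOT_CACHE_CAPACITY = 512
hot_cache = OrderedDict()        # proxy-local LRU cache for hot data
hot_keys = set()                 # keys reported as hot by the backend data node

def forward_to_redis(command, key, value=None):
    # Placeholder for forwarding a request to the backend Redis instance (hypothetical).
    return f"redis-value-for-{key}" if command == "GET" else None

def handle_read(key):
    # Hot keys such as K1 are served from the proxy cache without touching Redis.
    if key in hot_cache:
        hot_cache.move_to_end(key)
        return hot_cache[key]
    value = forward_to_redis("GET", key)
    if key in hot_keys:                          # cache only keys marked as hot
        hot_cache[key] = value
        if len(hot_cache) > HOT_CACHE_CAPACITY:
            hot_cache.popitem(last=False)
    return value

def handle_write(key, value):
    # Writes always go through to Redis; refresh the cached copy if the key is hot.
    forward_to_redis("SET", key, value)
    if key in hot_cache:
        hot_cache[key] = value
```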
Discover hot data
The database first counts the requests that occur in a specified cycle. When the number of requests reaches a threshold, the database detects the hotkeys and stores them in an LRU list. When a client attempts to access data by sending a request to proxy nodes, Redis enters the feedback phase and marks the data if it finds that the destination is a hotkey.
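The following sketch models this detection logic with assumed values for the statistical cycle, the threshold, and the size of the LRU list; it is a simplified illustration rather than the actual implementation in ApsaraDB for Redis.

```python
# Minimal sketch of hotkey detection on the data node.
# Cycle length, threshold, and list capacity are assumed values.
import time
from collections import OrderedDict

CYCLE_SECONDS = 10               # assumed statistical cycle
HOT_THRESHOLD = 5000             # assumed number of requests that marks a key as hot
MAX_HOT_KEYS = 128               # capacity of the LRU list of hotkeys

counters = {}                    # key -> requests seen in the current cycle
cycle_start = time.time()
hot_keys = OrderedDict()         # LRU list of detected hotkeys

def record_access(key):
    global cycle_start
    now = time.time()
    if now - cycle_start >= CYCLE_SECONDS:       # start a new statistical cycle
        counters.clear()
        cycle_start = now
    counters[key] = counters.get(key, 0) + 1
    if counters[key] >= HOT_THRESHOLD:           # the key qualifies as a hotkey
        hot_keys[key] = counters[key]
        hot_keys.move_to_end(key)
        if len(hot_keys) > MAX_HOT_KEYS:
            hot_keys.popitem(last=False)

def is_hot(key):
    # Used in the feedback phase to mark requests whose destination is a hotkey.
    return key in hot_keys
```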
The database uses the following methods to compute the hot data:
- Hot data statistics based on statistical thresholds
- Hot data statistics based on statistical cycles
- Statistics collection based on version numbers, which does not require initial values to be reset (see the sketch below)
Computing hotkeys on the database has a minor impact on performance and occupies only a small amount of memory.
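One way to read the version-number-based method is as a lazy reset: each statistical cycle gets a version number, and a counter whose stored version is stale is treated as zero instead of being physically cleared at the cycle boundary. The following sketch is an assumed illustration of that idea, not the actual implementation.

```python
# Sketch of version-based counting without resetting initial values (assumed design).
current_version = 0
counters = {}                    # key -> (version, count)

def next_cycle():
    # Advancing the version starts a new cycle without touching any counter.
    global current_version
    current_version += 1

def incr(key):
    version, count = counters.get(key, (current_version, 0))
    if version != current_version:               # stale entry from an earlier cycle
        count = 0
    count += 1
    counters[key] = (current_version, count)
    return count
```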
Comparison of two solutions
The preceding analysis shows that, compared with the traditional solutions, Alibaba Cloud has made significant improvements in resolving the hotkey issue. Both the read/write splitting solution and the hot data solution support flexible scaling. The two solutions are transparent to the client, although neither can ensure complete data consistency. The read/write splitting solution can store a larger amount of hot data, whereas the proxy-based hot data solution is more cost-effective.