High-Concurrency Practices of Redis: Snap-Up System

By Qianyi

1. I/O Models and Their Problems

2. Resource Contention and Distributed Locks

3. Redis Snap-Up System Instances

1. I/O Models and Their Problems

1) Run-to-Completion in a Solo Thread

The I/O model of Redis Community Edition is simple: all commands are parsed and processed by a single I/O thread.

The drawback is that one slow command makes every other query queue up behind it. Sentinel-based failover does not help here, because Sentinel's ping command sits in the same queue and is delayed by the slow query too. If the engine is stuck, the ping fails, and Sentinel cannot tell whether the service is really unavailable or just busy, so it risks a misjudgment.

If the service stops responding and the Master is switched to the Slave, the same slow queries will then slow down the Slave as well, and its ping may be misjudged in the same way. As a result, service availability is hard to monitor.

Summary

All user requests from different clients are queued and executed by a single thread. The next event is not processed until the current one has finished.
Single-threaded run-to-completion means there is no dispatcher and no pool of worker threads in the backend.

If one request is held up by a slow query, such as keys, lrange, or hgetall, all subsequent requests are delayed as well.
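
The queuing effect summarized above can be illustrated with a minimal sketch. It is a toy model, not Redis code: the command names and per-command costs in milliseconds are made-up assumptions, and completion_times is a hypothetical helper.

```python
from typing import List, Tuple


def completion_times(commands: List[Tuple[str, float]]) -> List[Tuple[str, float]]:
    """Run commands one after another on a single thread.

    Each command is (name, cost_ms). Returns (name, finish_time_ms),
    assuming every command arrived at time 0.
    """
    clock = 0.0
    finished = []
    for name, cost_ms in commands:
        clock += cost_ms  # the next command cannot start until this one ends
        finished.append((name, clock))
    return finished


# Hypothetical workload: one slow scan queued ahead of two cheap commands.
queue = [("keys *", 500.0), ("get k1", 1.0), ("ping", 1.0)]
for name, done_at in completion_times(queue):
    print(f"{name!r} finished at {done_at:.1f} ms")
```

In this model the cheap get and ping commands finish only after the slow scan, which is the same reason a health-check ping can time out while the engine is merely busy.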

Defects of Using Sentinel for Revitalization

Ping Command Misjudgment: The ping command is also delayed by slow queries and fails if the engine is stuck.
Duplex Failure: If the same slow queries arrive again after a Master/Slave switchover, the new Master stalls too, and Sentinel's ping fails again.

2) Make It a Cluster

This problem also exists in a cluster containing multiple shards. If a slow query stalls one shard, any cross-shard command that touches it, such as mget, is stuck waiting on the problematic shard, and all subsequent commands on that shard are blocked.

Summary

Likewise, a cluster cannot fix a stall inside a single DB.
Query Hole: If a user calls a cross-shard command like mget, the whole call is stuck waiting on the problematic shard.

3) “Could Not Get a Resource from the Pool”

Common Redis clients like Jedis provide connection pools. When a service thread accesses Redis, it takes a persistent connection from the pool for each query and returns it only after the result arrives. If the query is slow, the connection stays checked out, and no other thread can use it in the meantime.

If all queries are slow, each service thread checks out another persistent connection, and the pool is gradually drained. Because the Redis server is single-threaded, once one client connection is blocked by a slow query, requests on the other connections are not processed in time either, so none of them can be released back to the pool. Eventually, the client reports the exception “Could not get a resource from the pool.”

The Redis protocol does not support connection convergence, because a Message carries no ID, so a Response cannot be associated with its Request. That is why the connection pool is used. To implement asynchronous requests, the client puts a callback into a queue dedicated to each connection when a request is sent, and retrieves and executes the callback when the reply returns. This is the FIFO model. The server cannot return replies out of order on a connection, because the client would have no way to match them. Generally, the client uses blocking I/O (BIO), which ties up a connection and only allows other threads to use it after the request returns.

However, asynchronous requests cannot improve efficiency, which is limited by the single-threaded server. Even if the client changes its access method so that multiple connections send requests at the same time, these requests still have to queue up in the server. Slow queries still block the other persistent connections.
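
The per-connection FIFO callback queue described above can be sketched as follows. This is a toy model under stated assumptions: PipelinedConnection and its methods are hypothetical names, no real socket is involved, and replies are fed in by hand.

```python
from collections import deque
from typing import Callable, Deque, List


class PipelinedConnection:
    """Toy model of one connection with a FIFO callback queue."""

    def __init__(self) -> None:
        self._callbacks: Deque[Callable[[str], None]] = deque()
        self.sent: List[str] = []  # stands in for bytes written to the socket

    def send(self, command: str, callback: Callable[[str], None]) -> None:
        # Enqueue the callback before the request goes out.
        self._callbacks.append(callback)
        self.sent.append(command)

    def on_reply(self, reply: str) -> None:
        # Replies carry no ID, so the oldest pending callback must own this one.
        callback = self._callbacks.popleft()
        callback(reply)


results: List[str] = []
conn = PipelinedConnection()
conn.send("get a", lambda r: results.append(f"a={r}"))
conn.send("get b", lambda r: results.append(f"b={r}"))
conn.on_reply("1")  # the server answers in request order
conn.on_reply("2")
print(results)
```

The matching works only because replies arrive in the order the requests were sent. If the first reply is slow, the second callback waits behind it, which is why this model does not remove the blocking described in the text.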

Another serious problem comes from the Redis thread model. Performance drops when the I/O thread handles more than 10,000 connections. For example, with 20,000 to 30,000 persistent connections, the performance becomes unacceptable for the business. On the business side, 300 to 500 machines with 50 persistent connections each are enough to reach this bottleneck.
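
The arithmetic behind that last claim is easy to check. The sketch below uses the machine counts and pool size from the text; total_connections and LIMIT are illustrative names, and it assumes the worst case in which every pool is fully used.

```python
def total_connections(machines: int, pool_size: int) -> int:
    """Worst-case persistent connections if every client pool is full."""
    return machines * pool_size


LIMIT = 10_000  # connection count at which the text says performance drops

for machines in (300, 500):
    total = total_connections(machines, 50)
    print(machines, "machines ->", total, "connections; over limit:", total > LIMIT)
```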

Summary

The connection pool is used because the Redis protocol does not support connection convergence.

Request and Response cannot be associated because Message has no ID.
It is similar to HTTP 1.x messages.

When a slow query exists in the Engine layer, the request returns slowly, which means:

Connections in the connection pool can be depleted easily.
When a large number of application machines are in use, the bottleneck is reached easily. For example, if max_conn in each client connection pool is set to 50, the limit of 10,000 connections may be reached quickly, and callbacks slow down.

For each query, a connection needs to be retrieved from the connection pool and is only released when the request returns.
The number of connections in use stays low as long as results are returned promptly.

If a reply has not returned when a new request arrives, the only option is to check out another connection.
When all connections in the connection pool are checked out, an exception of “Could not get a resource from the pool” is reported.

Here are some ways to implement asynchronous interfaces on top of the Redis protocol:

As mentioned above, each connection can be given a callback queue. The callback is put into the queue before an asynchronous request is sent and retrieved after the result is returned. This is the common method, generally implemented by mainstream clients that support asynchronous requests.
There are also some tricky methods, like using the Multi-Exec and ping commands to encapsulate the request. For example, a set k v command is encapsulated in the following form:

multi
ping {id}
set k v
exec

The reply to exec then contains the following:

{id}
OK

This tricky method uses the atomicity of Multi-Exec and the fact that the ping command returns its parameter unmodified to “attach” message IDs to the protocol. However, practically no client uses this method.

4) Redis 2.x/4.x/5.x Thread Models

The thread model remained the same in the well-known versions up to Redis 5.x. All commands are processed by a single thread, and all reading, processing, and writing run in the same main I/O thread. Several BIO threads close or flush files in the background.

In Redis 4.0 and later, LAZY_FREE was added, which allows certain big keys to be released asynchronously so that they do not block the synchronous processing of tasks. In contrast, Redis 2.8 gets stuck when evicting or expiring large keys. Therefore, Redis 4.0 or later is recommended.

5) Flame Graph of Redis 5.x

The following figure shows the performance analysis result. The two parts on the left are command processing, the middle part is “reading,” and the rightmost part is “writing,” which accounts for 61.16%. Thus, we can tell that most of the time is spent on network I/O.

6) Redis 6.x Thread Model

With the improved model of Redis 6.x, after a readable event is triggered in the main thread, the “reading” task can be delegated to I/O threads. After the reading task finishes, the result is handed back to the main thread for further processing. The “writing” task can also be distributed to I/O threads, which improves the performance.

The improvement can be really impressive for O(1) commands such as simple reads and writes, even though only one thread executes commands. For complex commands, the improvement is rather limited, because the DB still has only one execution thread.

Another problem is waiting time. After distributing each “reading” or “writing” task, the main thread has to wait for the result. During that wait, the main thread sits idle and cannot provide service. More improvements to the Redis 6.x model are therefore expected.

7) Thread Model of Alibaba Cloud Redis Enterprise Edition (Performance-Enhanced Tair)

The model of Alibaba Cloud Redis Enterprise Edition splits the entire event into parts. The main thread is only responsible for command processing, while all reading and writing tasks are processed by the I/O thread. Connections are no longer retained only in the main thread. The main thread only needs to read once after the event starts. After the client is connected, the reading tasks are handed over to other I/O threads. Thus, the main thread does not care about readable and writable events from clients.

When a command arrives, the I/O thread will forward the command to the main thread for processing. Later, the processing result will be passed to the I/O thread through notification for writing. By doing so, the waiting time of the main thread is reduced as much as possible to enhance the performance further.

The same disadvantage applies. Only one thread is used in command processing. The improvement is great for O(1) commands but not enough for commands that consume many CPU resources.

8) Performance Comparison Test

In the following figure, the gray color on the left stands for Redis Community Edition 5.0.7, and the orange color on the right stands for Redis Performance-Enhanced Edition. The multi-thread performance of Redis 6.x is in between. The test uses a read command, which is bound by I/O rather than CPU, so the performance improvement is large. If the command used consumes many CPU resources, the difference between the two editions shrinks to nothing.

It is worth mentioning that the Redis Community Edition 7 is in the planning phase. Currently, it is being designed to adopt a modification scheme similar to the one used by Alibaba Cloud. With this scheme, it can gradually approach the performance bottleneck of a single main thread.

Performance improvement is just one of the benefits. Another benefit of distributing connections to I/O threads is that the number of supported connections scales linearly: you can add I/O threads to handle more connections. Redis Enterprise Edition supports tens of thousands of connections by default. It can support even more, such as 50,000 or 60,000 persistent connections, which solves the problem of running out of connections when the business layer scales out to a large number of machines.

2. Resource Contention and Distributed Lock

1) CAS/CAD High-Performance Distributed Lock

The SET command for Redis strings has a parameter called NX, which specifies that the value is written only if the key does not already exist. This naturally behaves like a lock. Setting a random value with the NX parameter acquires the lock atomically.

An “EX” parameter is added to set an expiration time, which ensures the lock is released if the business machine fails. Without it, a lock held by a machine that fails or is disabled for some reason would never be unlocked.

The parameter “5” is only an example. It does not have to be 5 seconds; the right value depends on the specific tasks the business machine has to do.
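
The locking step described above can be sketched as follows. This is a minimal illustration, not a production lock. InMemoryStore is a stand-in written for this example so that the code runs without a server; its set method only mirrors the nx and ex keyword arguments that the redis-py client uses for SET key value NX EX seconds. The acquire_lock helper and the key name are hypothetical.

```python
import secrets
import time
from typing import Dict, Optional, Tuple


class InMemoryStore:
    """Stand-in for a Redis client; models only SET with NX and EX."""

    def __init__(self) -> None:
        self._data: Dict[str, Tuple[str, float]] = {}  # key -> (value, expires_at)

    def set(self, name: str, value: str, nx: bool = False, ex: Optional[int] = None):
        now = time.monotonic()
        current = self._data.get(name)
        exists = current is not None and current[1] > now
        if nx and exists:
            return None  # key is already held; SET ... NX writes nothing
        expires_at = now + ex if ex is not None else float("inf")
        self._data[name] = (value, expires_at)
        return True


def acquire_lock(client, key: str, ttl_seconds: int = 5) -> Optional[str]:
    """Try to take the lock; return the random token on success, else None."""
    token = secrets.token_hex(16)  # random value that identifies this holder
    if client.set(key, token, nx=True, ex=ttl_seconds):
        return token
    return None


store = InMemoryStore()
first = acquire_lock(store, "lock:order:1001")
second = acquire_lock(store, "lock:order:1001")
print("first attempt got the lock:", first is not None)
print("second attempt got the lock:", second is not None)
```

The caller keeps the returned token, because it is needed later to prove ownership when the lock is released or renewed.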

Releasing a distributed lock can be troublesome. Here is a case. A machine holding the lock hangs or loses contact because of some sudden incident. Five seconds later, the lock has expired and another machine has acquired it, and then the once-failed machine becomes available again. When it finishes processing and deletes the Key, it removes a lock that no longer belongs to it. Therefore, the deletion requires a check: only when the value is equal to the previously written one can the lock be removed. Currently, Redis does not have such a command, so Lua is usually used.

When the value is equal to the value in the engine, the Key is deleted by using the CAD (“Compare And Delete”) command. For open-source CAS/CAD commands and TairString in Module form, please see this GitHub page. Users can load the module directly to use these APIs on all Redis versions that support the Module mechanism.

When locking, we set an expiration time, such as “5 seconds.” If a thread has not finished processing as the expiration approaches (for example, the transaction is still not completed after 3 seconds), a mechanism is required to renew the lock. Similar to deletion, we cannot renew unconditionally. Only when the value is equal to the value in the engine, meaning the lock is still held by the current thread, can we renew it. This is a CAS operation. Similarly, if there is no such API, another Lua script is needed to renew the lock.

Distributed locks are not perfect. As mentioned above, if the machine holding the lock is lost, the lock expires and is acquired by another machine. When the lost machine suddenly becomes available again, its code does not recheck whether the lock is still held by the current thread and continues running, so two holders may be inside the critical section at once. Therefore, Redis distributed locks, as well as other distributed locks, are not 100% reliable.

Summary

CAS/CAD is an extension of Redis String.
The implementation of distributed locks
Renewal through CAS
For more details, please visit the Document Center
For open-source CAS/CAD and TairString in Module form, please see this GitHub page.

2) Lua Implementation of CAS/CAD

If no CAS/CAD command is available, a Lua script is needed to read the Key and, if the two values are the same, delete or renew the lock.
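
A minimal sketch of such scripts is shown below, held as Python string constants. They are common formulations of compare-and-delete and compare-and-renew, not scripts taken from the original article, and the constant and function names are assumptions. The two Python functions are pure-Python models of the same logic over a dict, included only so the behavior can be checked without a server.

```python
from typing import Dict, Tuple

# KEYS[1] = lock key, ARGV[1] = token written when the lock was taken.
COMPARE_AND_DELETE_LUA = """
if redis.call("get", KEYS[1]) == ARGV[1] then
    return redis.call("del", KEYS[1])
else
    return 0
end
"""

# ARGV[2] = new time to live in seconds.
COMPARE_AND_RENEW_LUA = """
if redis.call("get", KEYS[1]) == ARGV[1] then
    return redis.call("expire", KEYS[1], ARGV[2])
else
    return 0
end
"""

Locks = Dict[str, Tuple[str, int]]  # key -> (token, ttl_seconds)


def compare_and_delete(locks: Locks, key: str, token: str) -> int:
    """Model of COMPARE_AND_DELETE_LUA: delete only if the token matches."""
    held = locks.get(key)
    if held is not None and held[0] == token:
        del locks[key]
        return 1
    return 0


def compare_and_renew(locks: Locks, key: str, token: str, ttl_seconds: int) -> int:
    """Model of COMPARE_AND_RENEW_LUA: extend the TTL only for the holder."""
    held = locks.get(key)
    if held is not None and held[0] == token:
        locks[key] = (token, ttl_seconds)
        return 1
    return 0


locks: Locks = {"lock:order:1001": ("token-A", 5)}
print(compare_and_delete(locks, "lock:order:1001", "token-B"))  # wrong holder
print(compare_and_renew(locks, "lock:order:1001", "token-A", 5))
print(compare_and_delete(locks, "lock:order:1001", "token-A"))
```

With the redis-py client, a script like these would typically be run as client.eval(script, 1, key, token), with the key and token passed as parameters rather than written into the script text.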

Note: Any value that changes between calls must be passed to the script as a parameter, because Redis caches every distinct script. As of Redis Community Edition 6.2, the script cache has neither a size limit nor an eviction policy. Executing the script flush command to clear the cache is also a synchronous operation, so be sure to avoid an overlarge script cache. Asynchronous cache deletion was added to Redis Community Edition by Alibaba Cloud engineers, and the script flush async command is supported in Redis 6.2 and later versions.

To use the Lua implementation of CAS/CAD, first run the script load command to load the Lua script into the Redis instance. Then, call the script with the evalsha command and pass the parameters. This reduces network bandwidth and avoids loading a different script each time. Note: The evalsha command may return a “no script exists” error, which can be resolved by running the script load command again.
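
The load-then-call flow with a reload on the “no script exists” error can be sketched as follows. FakeScriptServer and NoScriptError are stand-ins written for this example so that it runs without a server, and call_script is a hypothetical helper. The toy server caches scripts by their SHA1 digest, which is how Redis identifies cached scripts, but it does not execute Lua.

```python
import hashlib
from typing import Dict


class NoScriptError(Exception):
    """Stand-in for the error a real client raises when the script is unknown."""


class FakeScriptServer:
    """Toy server that caches scripts by SHA1 and echoes its inputs."""

    def __init__(self) -> None:
        self._cache: Dict[str, str] = {}

    def script_load(self, script: str) -> str:
        sha = hashlib.sha1(script.encode()).hexdigest()
        self._cache[sha] = script
        return sha

    def script_flush(self) -> None:
        self._cache.clear()

    def evalsha(self, sha: str, numkeys: int, *keys_and_args: str) -> str:
        if sha not in self._cache:
            raise NoScriptError(sha)
        # A real server would run the script; the toy only reports the call.
        return f"ran {sha[:8]} with {list(keys_and_args)}"


def call_script(client, script: str, numkeys: int, *keys_and_args: str):
    """Call by hash; reload once if the server has lost the script."""
    sha = hashlib.sha1(script.encode()).hexdigest()
    try:
        return client.evalsha(sha, numkeys, *keys_and_args)
    except NoScriptError:
        client.script_load(script)
        return client.evalsha(sha, numkeys, *keys_and_args)


server = FakeScriptServer()
script = 'return redis.call("get", KEYS[1])'
print(call_script(server, script, 1, "k"))  # first call triggers a load
server.script_flush()                       # simulate a restart or switchover
print(call_script(server, script, 1, "k"))  # reloads transparently
```

Because the script text is constant and the changing values travel as parameters, the server caches a single script no matter how many different keys or tokens are used.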

More information about the Lua implementation of CAS/CAD is listed below:

Given the data consistency and downtime recovery capabilities of Redis, distributed locks built on it are not fully reliable.
The Redlock algorithm proposed by the author of Redis is still controversial. For information about the Redlock algorithm, please see reference 1, 2, and 3
Other solutions (like ZooKeeper) can be considered if higher reliability is necessary, at the cost of lower performance.
Alternatively, use message queues to serialize mutually exclusive operations in the business system.

3) Lua Usage in Redis

Before executing a Lua script, Redis needs to parse and translate it first. Generally, Lua usage is not recommended in Redis for two reasons.

First, using Lua in Redis means calling Lua from C and then calling C back from Lua. Each returned value is converted twice: from a value compatible with the Redis protocol to a Lua object, and then to C data.

Second, the execution process involves a lot of Lua script parsing and VM processing (including the memory usage of lua.vm), so it takes longer than common commands. Therefore, keep Lua scripts as simple as possible, such as a single if statement. Loops, repeated operations, and massive data access should be avoided as much as possible. Remember that the engine only has one thread. If the majority of CPU resources are consumed by Lua script execution, few are left for business command processing.

Summary

“The LUA Iceberg inside Redis”

The compile-load-run-unload operation on the script consumes a large amount of CPU resources. The execution of Lua scripts is similar to pushing complex transactions to Redis for execution. The memory will be depleted if any exceptions occur, and Redis will stop operating after the engine's computational power is exhausted.

“Script + EVALSHA”

If we pre-compile and load the script into Redis (without an unload or clean operation) and execute it with EVALSHA, this saves CPU resources compared with using EVAL alone. However, this is still a defective solution: after Redis restarts, switches over, or has its code cache configuration changed, the cached script is gone and EVALSHA fails until the script is reloaded. Complex data structures or modules are better alternatives to Lua.

When applying the JIT technology to the storage engine, remember that EVAL is evil. Try to avoid using Lua since it consumes memory and computing resources.
Many advanced features of some SDKs (such as Redisson) use Lua internally. Developers should therefore take care, as it is possible to trigger a CPU storm unintentionally.

3. Redis Snap-Up System Instances

1) Characteristics of the Snap-Up/Flash Sale Scenario

A flash sale sells a specified quantity of commodities at a special price for a limited time. This attracts a large number of buyers, but only a few of them place orders during the promotion. Thus, within a short time, a flash sale generates visit and order traffic dozens or even hundreds of times higher than usual.

A flash sale scenario is divided into three phases:

Before the Promotion: Buyers refresh the commodity details page continuously. The traffic of requests for this page spikes.
During the Promotion: Buyers place orders. The number of order requests reaches a peak.
After the Promotion: Buyers that have placed orders continue to query the status of orders or cancel orders. Most buyers keep refreshing the commodity details page, waiting for opportunities to place orders once other buyers cancel their orders.

2) General Method for Dealing with Snap-Up/Flash Sale Scenario

Snap-up/flash sale scenarios fundamentally concern highly concurrent reading and writing of hot spot data.

Snap-up/flash sale is a process of continuously pruning requests:

Step 1: Minimize user reading/writing requests to the application server through client interception
Step 2: Reduce the number of requests that the application server sends to the backend storage system through LocalCache interception on the server
Step 3: For requests sent to the storage system, use Redis to intercept the vast majority of requests to minimize access to the database
Step 4: Send the final requests to the database. The application server can also use a message queue as a fallback in case the backend storage system does not respond.

Basic Principles

1. Less Data (Staticizing, CDN, frontend resource merge, dynamic and static data separation on page, and LocalCache)

Reduce the dynamic parts of the page in every way possible. If the frontend page is mostly static, use CDN or other mechanisms to keep those requests from reaching the server. Thus, requests on the server are greatly reduced in both quantity and bytes.

2. Short Path (Shorten frontend-to-end path as much as possible, minimize the dependency on different systems, and support throttling and degradation.)

In the path from the user side to the final backend, depend on as few business systems as possible. Bypass systems should compete less for resources, and each layer must support throttling and degradation. After throttling or degradation, the prompts shown on the frontend also need to be optimized.

3. Single Point Prohibition (Achieve stateless application services scaling horizontally and avoid hot spots for storage services.)

Stateless scale-out must be supported everywhere in the service. Stateful storage services must avoid hot spots, generally some reading and writing hot spots.

Timing of Inventory Deduction

Deduction during Order Placing: Avoid malicious non-payment orders and negative inventory when high-concurrent requests are sent
Deduction during Payment: Avoid negative experience due to payment failure after placing orders
Pre-Deducting and Releasing When Timeout: This can be integrated into frameworks (like Quartz) with attention paid to security and anti-fraud.

The third method is commonly selected, as the first two both have defects. With the first, it is difficult to prevent malicious orders that are never paid. With the second, payment can fail after an order is placed because the inventory has run out. So, the experience with the first two methods is very poor. Usually, the inventory is deducted in advance and released when the order times out. A scheduling framework (such as Quartz) is used for the timeout, and a security and anti-fraud mechanism is also established.
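
The pre-deduct and release-on-timeout approach can be sketched with a small in-memory model. This is an illustration under stated assumptions, not the article's implementation: the Inventory class and its method names are hypothetical, time is passed in explicitly as a number of seconds, and a real system would run release_expired from a scheduler.

```python
from typing import Dict, Tuple


class Inventory:
    """Toy inventory with pre-deduction and timeout-based release."""

    def __init__(self, stock: int, hold_seconds: float) -> None:
        self.stock = stock
        self.hold_seconds = hold_seconds
        self._holds: Dict[str, Tuple[int, float]] = {}  # order_id -> (qty, deadline)

    def reserve(self, order_id: str, qty: int, now: float) -> bool:
        """Pre-deduct stock when the order is placed."""
        if order_id in self._holds or qty <= 0 or qty > self.stock:
            return False  # never let the stock go negative
        self.stock -= qty
        self._holds[order_id] = (qty, now + self.hold_seconds)
        return True

    def pay(self, order_id: str, now: float) -> bool:
        """Confirm the deduction if the hold is still valid."""
        hold = self._holds.get(order_id)
        if hold is None or now > hold[1]:
            return False
        del self._holds[order_id]
        return True

    def release_expired(self, now: float) -> int:
        """Return stock from unpaid orders whose hold has timed out."""
        expired = [oid for oid, (_, deadline) in self._holds.items() if now > deadline]
        released = 0
        for oid in expired:
            qty, _ = self._holds.pop(oid)
            self.stock += qty
            released += qty
        return released


inv = Inventory(stock=2, hold_seconds=10.0)
print(inv.reserve("order-1", 1, now=0.0), inv.reserve("order-2", 1, now=0.0))
print(inv.pay("order-1", now=5.0))        # paid in time
print(inv.release_expired(now=11.0))      # order-2 was never paid
print(inv.stock)
```

Unpaid holds return their units to the pool after the timeout, so a malicious order can tie up stock only for the hold period, and a buyer who pays in time never finds the stock gone.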

Common Implementation of Redis

1. String Structure

Use incr/decr/incrby/decrby directly. Note: Currently, these Redis commands do not support upper and lower bounds.
Lua can be used to avoid negative inventory or to deduct the inventories of associated SKUs together.

2. List Structure

Each commodity is a List, and each Node is an inventory unit.
Use the lpop or rpop command to deduct inventory until nil is returned (the key no longer exists).

The List structure has some disadvantages. For example, it occupies more memory. If multiple inventory units are deducted at a time, lpop needs to be called multiple times, which affects the performance.

3. Set/Hash Structure

Applicable to deduplication. To restrict user purchase quantity, use hincrby to count and hget to judge the purchased quantity.
Note: You need to map the user UIDs to multiple keys for reading and writing. You must not put all UIDs in one key (hot spot) since typical reading and writing bottlenecks of hot key will directly cause business bottlenecks.

4. (If Service Scenarios Allow) Multiple Keys (key_1, key_2, key_3...) for Hot Commodities

Random selection
User UID mapping (different inventory can also be set according to user levels.)
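
Mapping user UIDs to one of several keys can be sketched as follows. The shard_key helper and the key naming are assumptions made for illustration. A stable hash (CRC32 here) is used instead of Python's built-in hash so that the same UID maps to the same key in every process.

```python
import zlib


def shard_key(base: str, uid: str, shards: int) -> str:
    """Map a user ID to one of `shards` keys: base_1 ... base_N."""
    index = zlib.crc32(uid.encode("utf-8")) % shards
    return f"{base}_{index + 1}"


# Hypothetical usage: spread purchase counters for one item over 8 keys.
for uid in ("user-1", "user-2", "user-3"):
    print(uid, "->", shard_key("item:1001:buyers", uid, 8))
```

For the random selection variant, the index would instead be drawn at random per request, which spreads load evenly but means a given user is not tied to a fixed key.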

3) TairString: String That Supports High-Concurrency CAS

TairString is another structure in the module. It is a modified Redis String that carries a Version and supports high-concurrency CAS. The Version value makes it possible to implement an optimistic lock during reading and writing. Note: This String structure is a different type and cannot be used together with the common String structure in Redis.

As shown in the above figure, the client first calls exGet, which returns (value, version). Then, it operates on the value and writes it back with the version it read. If the versions are the same, the update succeeds. Otherwise, the client re-reads and modifies the value before updating again, which implements a CAS operation. This is called an optimistic lock on the server.

The scenario mentioned above can be optimized further with the exCAS interface. exCAS is similar to exSet, but when it encounters a version conflict, it returns the version mismatch error together with the new value and its updated version. Thus, one API call is saved. By applying exCAS after exSet, and performing exSet -> exCAS again when the call fails, network interaction is reduced, and so is the access volume to Redis.
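
The optimistic-lock loop can be sketched with an in-memory model. VersionedString, ex_get, ex_cas, and update_with_retry are hypothetical names that only imitate the behavior described in the text. They are not the TairString client API, and the return shape of ex_cas is an assumption chosen for the example.

```python
from typing import Callable, Tuple


class VersionedString:
    """In-memory model of a value that carries a version number."""

    def __init__(self, value: int) -> None:
        self._value = value
        self._version = 1

    def ex_get(self) -> Tuple[int, int]:
        return self._value, self._version

    def ex_cas(self, new_value: int, expected_version: int) -> Tuple[bool, int, int]:
        """Update only if the version matches.

        Always returns the current (value, version), so a caller that
        lost the race can retry without issuing a separate read.
        """
        if expected_version != self._version:
            return False, self._value, self._version
        self._value = new_value
        self._version += 1
        return True, self._value, self._version


def update_with_retry(item: VersionedString, change: Callable[[int], int],
                      max_attempts: int = 10) -> int:
    """Optimistic update: read once, then loop on compare-and-set."""
    value, version = item.ex_get()
    for _ in range(max_attempts):
        ok, value, version = item.ex_cas(change(value), version)
        if ok:
            return value
    raise RuntimeError("too many version conflicts")


stock = VersionedString(10)
print(update_with_retry(stock, lambda v: v - 1))
print(stock.ex_cas(100, 1))  # stale version: rejected, current state returned
```

Because a failed compare-and-set already carries the latest value and version, each retry costs one round trip instead of a read followed by a write.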

Summary

TairString is a string that supports high-concurrency CAS.

Version-Carried String

Ensure the atomicity of concurrent updates
Implement updates and optimistic lock based on Version
It cannot be used together with other common String structures in Redis.

More Semantics

exIncr/exIncrBy: Snap-up/flash sale (with upper and lower bounds)
exSet -> exCAS: Reduce network interactions
For more details, please visit the Document Center
For open-source TairString in module form, please see this GitHub page.

4) Comparisons between String and exString Atomic Count

A plain String counter relies on INCRBY, which has no upper or lower bound. An exString counter relies on EXINCRBY, which accepts various parameters, including upper and lower bounds. For example, if the minimum value is set to 0, the value cannot be reduced further once it equals 0. exString also supports specifying an expiration time. For example, a product can be snapped up only within a specified period and not after it expires. The business system may also want cached entries cleared after a specified time. If the inventory is limited, the goods are removed from the cache after ten seconds if no order is placed, and the cache is renewed if new orders keep coming. To achieve this, EXINCRBY takes a parameter that renews the expiration each time it is called. Thus, the hit rate can be improved.

What Is the Function of Counter Expiration Time?

A commodity can be snapped up within a certain period and cannot afterward.
If the cache inventory is limited, commodities with no orders will expire and be deleted. If a new order is placed, the cache will be renewed automatically for a period to increase the cache hit rate.
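
A counter with a lower bound and a renewable expiration can be modeled in a few lines. This is an in-memory sketch of the behavior described above, not the EXINCRBY implementation: BoundedCounter and incr_by are hypothetical names, time is passed in explicitly, and a refused update is reported as None rather than as an error reply.

```python
from typing import Optional


class BoundedCounter:
    """In-memory model of a counter with a lower bound and a sliding TTL."""

    def __init__(self, value: int, min_value: int, ttl_seconds: float, now: float) -> None:
        self.value = value
        self.min_value = min_value
        self.ttl_seconds = ttl_seconds
        self.expires_at = now + ttl_seconds

    def incr_by(self, delta: int, now: float) -> Optional[int]:
        """Apply delta; return the new value, or None if the update is refused."""
        if now >= self.expires_at:
            return None  # counter expired: the sale window is over
        new_value = self.value + delta
        if new_value < self.min_value:
            return None  # the update would cross the lower bound
        self.value = new_value
        self.expires_at = now + self.ttl_seconds  # each order renews the cache
        return new_value


counter = BoundedCounter(value=2, min_value=0, ttl_seconds=10.0, now=0.0)
print(counter.incr_by(-1, now=5.0))
print(counter.incr_by(-1, now=14.0))  # alive only because of the renewal at t=5
print(counter.incr_by(-1, now=15.0))  # refused: already at the lower bound
```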

As shown in the following figure, a plain Redis String needs a Lua script for this. If the value returned by “get” on KEYS[1] is larger than “0,” the script uses “decrby” to subtract “1.” Otherwise, it returns the “overflow” error, so a value that has been reduced to “0” cannot be decreased further. In the example, ex_item is set to “3.” It is decremented 3 times, and the “overflow” error is returned once the value is “0.”

exString is very simple to use. Users just need to exset a value and then execute “exincrby k -1.” Remember that String and TairString are different types, and their APIs cannot be used together.
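
Since the figure is not reproduced here, the following is a sketch of a script matching that description, held as a Python string. It is a reconstruction from the prose, not the script from the original figure. The decr_if_positive function is a pure-Python model of the same logic, included so that the ex_item example can be traced without a server.

```python
from typing import Dict

# KEYS[1] = inventory key. Decrement by 1 only while the value is above 0.
DECR_IF_POSITIVE_LUA = """
local stock = tonumber(redis.call("get", KEYS[1]))
if stock and stock > 0 then
    return redis.call("decrby", KEYS[1], 1)
else
    return redis.error_reply("overflow")
end
"""


def decr_if_positive(counters: Dict[str, int], key: str) -> int:
    """Model of DECR_IF_POSITIVE_LUA over a plain dict."""
    stock = counters.get(key)
    if stock is not None and stock > 0:
        counters[key] = stock - 1
        return counters[key]
    raise OverflowError("overflow")


counters = {"ex_item": 3}
for _ in range(3):
    print(decr_if_positive(counters, "ex_item"))
try:
    decr_if_positive(counters, "ex_item")
except OverflowError as exc:
    print("error:", exc)
```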
