By Zhongfei
After nearly a year of development and three release candidates, Redis 7.0 has finally been released. It changes more than any previous version: it adds more than 50 commands and many core features, resolves long-standing usage problems, and expands the scenarios in which Redis can be used.
Redis 7.0 tries many new things, but stability remains the cornerstone. The Redis team stressed this in the 7.0 GA blog post: "While user-facing features are easy to boast about, the real unsung heroes in this version are efforts to make Redis more performant, stable, and lean." Stability has been the project's highest priority since its birth, so there is little reason to worry about the stability of 7.0.
Let's learn about some of the new core features of Redis 7.0!
Function is a new implementation of scripting in Redis. Before Redis 7.0, users could only run Lua scripts through the EVAL command family. However, Redis never clearly defined how Lua scripts are persisted or replicated from master to slave, and the behavior varied from one version to the next. The community therefore asks users to keep a local copy of their Lua scripts (the safest approach) so that scripts are not lost when an instance restarts or a master-slave switchover occurs. Maintaining Lua scripts in Redis has always been a burden for users.
Function complements Lua scripts well. It allows users to load custom function libraries into Redis. On the one hand, a user-defined function name carries clearer semantics than the SHA1 digest passed to EVALSHA. On the other hand, libraries loaded through Function are replicated from master to slave and persisted, which removes the ambiguity that surrounded the persistence of Lua scripts.
Since 7.0, the behavior of Function and of the EVAL command family is clearly defined. FUNCTION LOAD automatically replicates the function library to slaves and persists it. SCRIPT LOAD does neither: the script is stored only on the node that executes the command. The community also plans to have Function support more languages, such as JavaScript and Python. Stay tuned for more information!
In short, a Function is treated as part of the data in 7.0, so it can be stored in RDB and AOF files and is replicated from the master to all slaves, which solves the earlier problem of lost Lua scripts. We recommend replacing Lua scripts in Redis with Function.
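To make the calling style concrete, here is a minimal sketch that assembles the raw arguments for FUNCTION LOAD and FCALL. The library name `mylib`, the function name `my_hset`, and the two Python helpers are hypothetical names chosen for this illustration. The commands are only built, not sent, so no server or client library is assumed.

```python
# Illustrative only: "mylib" and "my_hset" are hypothetical names.
# Lua source of a function library. The first line is the shebang that
# names the engine (lua) and the library.
LIBRARY_SOURCE = """#!lua name=mylib
redis.register_function('my_hset', function(keys, args)
  -- keys[1] is the hash key; args holds field/value pairs
  return redis.call('HSET', keys[1], unpack(args))
end)
"""


def function_load(source, replace=False):
    """Build the argument list for FUNCTION LOAD [REPLACE] <code>."""
    command = ["FUNCTION", "LOAD"]
    if replace:
        command.append("REPLACE")
    command.append(source)
    return command


def fcall(function_name, keys, args):
    """Build the argument list for FCALL <function> <numkeys> keys... args..."""
    return ["FCALL", function_name, str(len(keys)), *keys, *args]


load_cmd = function_load(LIBRARY_SOURCE, replace=True)
call_cmd = fcall("my_hset", ["user:1"], ["name", "alice"])
print(load_cmd[:3])
print(call_cmd)
```

The caller refers to the function by a readable name instead of a script digest, and the library only has to be loaded once because it is stored with the data.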
AOF is the core solution for Redis data persistence. In essence, it appends a redo log entry for every operation that modifies data. Because the file grows continuously, it has to be compacted from time to time; Redis calls this AOF rewrite.
However, handling the incremental data that arrives during an AOF rewrite has long been a problem. In the past, this incremental data had to be held in memory for the duration of the rewrite and, once the rewrite completed, written into the new AOF file to ensure data integrity. An AOF rewrite therefore consumed extra memory and disk I/O, which was a clear deficiency. Many improvements were made over the years, but the problem was never fully solved.
Alibaba Cloud Tair, the enhanced Redis®*, ran into the same problem early on, before it was officially released. After many internal iterations, the problem was solved with the Multi-Part AOF mechanism, which was also contributed to the community and released in Redis 7.0. The approach is to store base (full data) and inc (incremental data) in separate files, which reduces the waste. It also supports storing and managing historical AOF files. Combined with the time information in AOF files, this enables point-in-time recovery (PITR, supported by Alibaba Cloud Tair), which enhances the data reliability of Redis® and meets data restoration needs.
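The base + inc idea can be sketched with a toy in-memory model. This is not the Redis or Tair implementation: the class and method names are invented for illustration, the "files" are plain Python lists, and only simple key/value writes are modeled. It is meant to show one thing, namely that writes arriving during a rewrite go straight into a new incremental part instead of being buffered and copied afterwards.

```python
class MultiPartAof:
    """Toy model of a base + incremental AOF layout (hypothetical names)."""

    def __init__(self):
        self.base = []          # compacted snapshot (the "base" part)
        self.incr = []          # writes appended since the snapshot (the "inc" part)
        self.history = []       # retired (base, incr) pairs kept for recovery
        self._next_incr = None  # new incremental part opened when a rewrite starts

    def append(self, key, value):
        # During a rewrite, new writes land in the NEW incremental part.
        target = self.incr if self._next_incr is None else self._next_incr
        target.append((key, value))

    def start_rewrite(self):
        # The snapshot reflects everything logged so far.
        snapshot = sorted(self.replay().items())
        self._next_incr = []
        return snapshot

    def finish_rewrite(self, snapshot):
        # Retire the old parts and switch to the new base + incremental pair.
        self.history.append((self.base, self.incr))
        self.base, self.incr = snapshot, self._next_incr
        self._next_incr = None

    def replay(self):
        # Rebuild the dataset: base first, then incremental writes in order.
        data = {}
        for key, value in self.base + self.incr + (self._next_incr or []):
            data[key] = value
        return data


aof = MultiPartAof()
aof.append("a", 1)
aof.append("a", 2)
aof.append("b", 3)
snapshot = aof.start_rewrite()
aof.append("c", 4)  # arrives while the rewrite is in progress
aof.finish_rewrite(snapshot)
print(aof.base, aof.incr)
print(aof.replay())
```

In this model the retired parts stay in `history`, which mirrors the article's point that historical AOF files can be kept and managed for later recovery.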
You can find references about the implementation at the end of this article.
Redis has supported the publish-subscribe mechanism since 2.0. With the pubsub command family, users can build a message subscription system. However, Redis pubsub has some problems in cluster mode, the most significant of which is the broadcast storm that arises in large clusters.
Redis pubsub publishes and subscribes by channel. However, channels are not treated as data in cluster mode: they do not take part in hash calculation and cannot be distributed by slot. As a result, in cluster mode Redis broadcasts every published message to all nodes.
The problem is clear. If a cluster has 100 nodes and a user publishes a message to a channel on node 1, that node must broadcast the message to the other 99 nodes. If only a few of those nodes have subscribers for the channel, most of the broadcasts are useless and waste network, CPU, and other resources.
Sharded pubsub is designed to solve this problem. It distributes channels across shards, so a shard's nodes handle only their own channels instead of broadcasting to the whole cluster, which avoids the wasted resources.
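In Redis 7.0, shard channels (used through commands such as SSUBSCRIBE and SPUBLISH) are mapped to slots with the same algorithm as keys: CRC16 of the name modulo 16384, with hash tags honored. The sketch below reproduces that mapping locally so you can see which slot a channel would land in. The channel names are made up, and `binascii.crc_hqx` with an initial value of 0 is used here as the CRC16 (XMODEM) routine.

```python
import binascii

SLOT_COUNT = 16384  # number of hash slots in a Redis Cluster


def hash_slot(name: bytes) -> int:
    """Slot for a key or shard channel: CRC16 (XMODEM variant) mod 16384."""
    # Hash tags: if the name contains a non-empty "{...}" section, only the
    # text between the first "{" and the following "}" is hashed.
    start = name.find(b"{")
    if start != -1:
        end = name.find(b"}", start + 1)
        if end > start + 1:
            name = name[start + 1:end]
    return binascii.crc_hqx(name, 0) % SLOT_COUNT


print(hash_slot(b"foo"))  # 12182
# Channels sharing a hash tag map to the same slot, hence the same shard.
print(hash_slot(b"{room42}.chat") == hash_slot(b"{room42}.typing"))  # True

cluster_nodes = 100
print("classic PUBLISH reaches", cluster_nodes - 1, "other nodes")
```

Because a shard channel belongs to exactly one slot, a message published to it only needs to reach the shard that owns that slot rather than all 99 other nodes in the example above.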
Redis lets you configure a memory limit. Most users are familiar with maxmemory and maxmemory-policy, but they are worth explaining again. maxmemory limits the overall runtime memory of Redis, not just the memory used by data: non-data memory such as client buffers, the Lua cache, the function cache, and db metadata is also counted. If runtime memory exceeds maxmemory, eviction is triggered and data is deleted, which is a recurring problem for Redis users.
Of this non-data memory, client buffers consume the most. Under heavy traffic, Redis has to buffer a large amount of read and write data for its clients. (Imagine the result of a KEYS command, which must sit in the client output buffer before it is sent to the user.) Memory consumed by network traffic can therefore trigger eviction and delete data again and again. Redis has long supported the client-output-buffer-limit configuration item, but it only limits the output buffer of a single connection; there was no global accounting of, or limit on, the memory used by all clients. Redis 7.0 therefore adds the maxmemory-clients configuration item to cap the memory used by all clients. If the limit is exceeded, the client that consumes the most memory is disconnected first to relieve memory pressure.
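The eviction policy described above can be illustrated with a small simulation. This is a simplified sketch, not the server's actual bookkeeping: the function name, the client ids, and the memory figures are all hypothetical, and the only rule modeled is "disconnect the largest consumers until the total fits under the limit".

```python
MB = 1024 * 1024


def clients_to_evict(client_memory, maxmemory_clients):
    """Return client ids to disconnect, largest memory consumers first.

    client_memory maps a client id to the bytes used by its buffers.
    """
    total = sum(client_memory.values())
    evicted = []
    by_usage = sorted(client_memory.items(), key=lambda item: item[1], reverse=True)
    for client_id, used in by_usage:
        if total <= maxmemory_clients:
            break  # the aggregate is back under the limit
        evicted.append(client_id)
        total -= used
    return evicted


clients = {"c1": 10 * MB, "c2": 300 * MB, "c3": 50 * MB, "c4": 200 * MB}
evicted = clients_to_evict(clients, maxmemory_clients=256 * MB)
print(evicted)  # ['c2', 'c4']
```

The point of a global limit is visible here: no single connection has to exceed a per-connection limit for the total (560 MB in this example) to grow far beyond what the operator intended.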
Client eviction does not solve every problem. Metadata memory usage can still cause trouble for users. Redis is a memory-based database, so the memory of each module needs to be accurately measured and controlled so that users can clearly understand and plan their data storage.
Redis has been through seven major versions and has added many features along the way. Cluster mode has been supported since 3.0. Lazyfree and PSYNC2, introduced in 4.0, solved the long-standing problems of blocking when large keys are deleted and of replication that could not resume after an interruption. In 5.0, the new Stream data type gave Redis complete, lightweight message queue capabilities. 6.0 brought many enterprise-level features, such as threaded I/O, TLS, and ACL, improving the performance and security of Redis.
Redis was built and developed by developers. Its popularity rests mainly on two features: high performance, and rich, easy-to-use data modules that reduce the complexity of developing services.
Redis Labs and many other vendors in the market are working on enriching Redis® data modules, such as RediSearch, RedisJSON, RedisGraph, RedisTimeSeries, and RedisBloom.
Similarly, Tair, the Alibaba Cloud cloud-native in-memory database, has long been developing a variety of enhanced data modules. We now offer extended enterprise-class data modules in our public cloud Tair service, some of which are more powerful than the open source Redis®. Read the references at the end of this article.
A growing number of users now build applications on Tair's enhanced data modules, improving development efficiency and enabling features that could not be built before. 2022 was the year in which Redis® data module extensions took off: the industry has moved past the acceptance period and into high gear. We believe that richer data modules will take Tair further, from a cache to a high-performance in-memory database that adapts to more scenarios and problems.
Disk consistency and replica consistency are two topics that cannot be overlooked when using a database. For a long time, many people (especially users outside China) have regarded Redis only as a cache. Yet Redis has supported the RDB and AOF persistence mechanisms since its inception, and AOF offers several levels of persistence semantics. For example, when appendfsync is set to the strictest level, always, every write is flushed to disk so that no data is lost, which gives the same strong disk consistency as a traditional database.
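What "flushed to disk" means can be shown with a small, generic sketch of a durable append. This is not Redis code: the helper name and the file name are hypothetical, and the record format is invented. It only contrasts forcing an fsync after every write (the spirit of appendfsync always) with leaving the flush to the operating system (the spirit of appendfsync no; the third setting, everysec, syncs about once per second).

```python
import os
import tempfile


def append_record(path, record, fsync_always=True):
    """Append one record; with fsync_always, force it to disk before returning."""
    with open(path, "ab") as log:
        log.write(record)
        log.flush()  # hand the bytes from Python's buffer to the OS
        if fsync_always:
            os.fsync(log.fileno())  # ask the OS to write its cache to the device


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.aof")  # hypothetical file name
    append_record(path, b"SET k1 v1\n")                      # durable on return
    append_record(path, b"SET k2 v2\n", fsync_always=False)  # may sit in the OS cache
    with open(path, "rb") as log:
        contents = log.read()

print(contents)
```

Both records are readable afterwards, but only the first is guaranteed to survive a power failure at the moment the call returns. That guarantee is what costs write performance when the strictest setting is enabled.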
Multi-replica consistency mainly refers to consistency between the primary and the secondary. Native Redis still uses asynchronous replication: a write returns as soon as the modification completes locally. Unlike some other databases, it offers no data consistency guarantee between replicas. This limits the scenarios in which Redis can be used, and Redis has not been fully accepted by users with high data reliability requirements, such as the financial industry and traditional industries.
The industry is actively working on persistence for Redis®, with Alibaba Cloud Tair as a typical example. Tair provides a Redis®-like rocksdb-based system as well as a capacity storage version suited to massive storage.
| | Redis Community Edition | Tair Persistent Memory | Redis®-like rocksdb Open-Source-Based System |
|---|---|---|---|
| Disk consistency | Configurable | Optane full persistence | None, partially configurable |
| Write performance with strong disk consistency enabled | Relatively low, about 60% for large writes | Slightly low, about 90% (Note 1) | Low, large writes easily stall and stop service |
| Replica consistency (primary/secondary) | None | Strong consistency: semi-synchronous | None |
| Write performance with replica consistency enabled | - | Relatively low, about 70% (Note 2) | - |
Table 1: Comparison of disk consistency and replica consistency between community Redis and other commercial and open-source products
Note 1: Comparison with the open-source Redis community edition
Note 2: Comparison of memory-based usage costs with open-source Redis
On the whole, as the application scope of Redis is expanded, the requirements for capacity, cost, and data reliability are increasing. These have become important indicators to measure the enterprise-level capabilities of Redis. Major manufacturers and open-source products have also proposed many solutions for building these capabilities. Some typical products and solutions are described below.
For persistence, Alibaba Cloud has introduced persistent memory, a new storage medium, to address cost, capacity, and persistence together. The challenge is that designing storage structures for persistent memory is complex: performance degradation must be kept under control and compatibility must be preserved. Tair persistent memory solves these problems well. More importantly, it is compatible with the Redis® API, which reduces the switching cost for users.
Tair persistent memory also supports semi-synchronous replication and strong write consistency, so it meets the data fault tolerance requirements of in-memory databases.
There are some original and excellent open-source products (Redis®-like systems) in the market. Most of them are based on LSM storage structures (such as rocksdb). Their main advantage is that disk is cheaper than memory, but for now they also have many disadvantages: high O&M complexity, which maps directly to O&M cost; a KV engine that cannot natively support Redis® data modules; and the weakening of Redis® strong types into weak types.
There is still much room for improvement in the consistency and fault tolerance of such systems. In practice, many users place Redis®-like systems on the synchronous path of their business, so throughput jitter and latency spikes in the LSM KV engine show up directly in the application, which makes such systems hard to use as general-purpose products. The Tair capacity storage type (previously called the hybrid storage version) has the same deficiencies and also needs further optimization of storage and compatibility.
In summary, the capacity version can solve the cost problem, but the usage scenarios of Redis® systems can only be extended to the enterprise level once disk and replica consistency are solved. Cloud vendors are competing in a field with a high technical barrier.
Redis Labs was renamed Redis in August 2021, and the homepage of the community version of Redis has been rebuilt. The commercialization of Redis is progressing fast: Redis Stack is now featured on the homepage, for example, and after acquiring the commonly used SDKs on GitHub, the company added support for its commercial products to them. In the end, Redis may become fully commercialized like MongoDB and Elasticsearch, but the community is still open and active.
Finally, you are welcome to try Redis 7.0!
[1] Design and Implementation of Redis 7.0 Multi Part AOF (Article in Chinese): https://developer.aliyun.com/article/866957
[2] Tair Extended Data Module Overview: https://www.alibabacloud.com/help/en/apsaradb-for-redis/latest/integration-with-multiple-redis-modules
[3] Tair Persistent Memory Performance Whitepaper: https://www.alibabacloud.com/help/en/apsaradb-for-redis/latest/performance-enhanced-instances-of-apsaradb-for-redis-enhanced-edition
*Redis is a registered trademark of Redis Ltd. Any rights therein are reserved to Redis Ltd. Any use by Alibaba Cloud is for referential purposes only and does not indicate any sponsorship, endorsement or affiliation between Redis and Alibaba Cloud.