By Qinxia
Disclaimer: This is a translated work of Qinxia's 漫谈分布式系统. All rights reserved to the original author.
In the previous article, we finally brought up the second core problem of distributed systems: availability. We also saw that replication is the only way to achieve high availability.
As mentioned at the end of the previous article, besides providing high availability, replication may also bring serious side effects.
Similar problems can be collectively referred to as data consistency problems.
In the previous article, we focused on two features of replication: master-slave and timeliness. Combining the two features may cause the following data consistency risks:
| Data Synchronization | Master Mode | Consistency Risk |
| --- | --- | --- |
| sync | single master | None |
| sync | multi-master | Yes |
| async | single master | Yes |
| async | multi-master | Yes |
The multi-master and asynchronous modes may bring data consistency risks. (Leaderless replication can be regarded as full-master replication.)
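To make the "async + single master" row concrete, here is a minimal in-memory sketch of how replication lag can produce a stale read. The class names `Replica` and `AsyncSingleMaster` and the explicit `replicate()` step are assumptions made for illustration; they do not come from any real system.

```python
class Replica:
    """A node that stores one value per key (hypothetical, in-memory)."""

    def __init__(self):
        self.data = {}


class AsyncSingleMaster:
    """Single master that acknowledges writes before the slave has them."""

    def __init__(self):
        self.master = Replica()
        self.slave = Replica()
        self.pending = []  # replication log not yet applied on the slave

    def write(self, key, value):
        self.master.data[key] = value
        self.pending.append((key, value))  # shipped later, not now
        return "ok"  # acknowledged before the slave is updated

    def replicate(self):
        """Apply the pending log on the slave; the replication lag ends here."""
        for key, value in self.pending:
            self.slave.data[key] = value
        self.pending.clear()


cluster = AsyncSingleMaster()
cluster.write("balance", 100)
cluster.replicate()
cluster.write("balance", 99)

stale = cluster.slave.data["balance"]   # read from the slave during the lag
fresh = cluster.master.data["balance"]  # read from the master
cluster.replicate()
converged = cluster.slave.data["balance"]
print(stale, fresh, converged)
```

Until `replicate()` runs, the master and the slave answer the same question differently, which is the inconsistency the table warns about.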
Once data consistency is not guaranteed, many practical problems may occur both inside and outside the system. For example:
These practical problems make the system untrustworthy at the application layer. The cost of losing trust is very high.
Therefore, solving the consistency problem has become a major issue for distributed systems.
There are two main types of solutions:
From the point of view of convergence, the first type of method forces the real-time convergence of inconsistent data, while the second type of method allows inconsistent data to diverge first and gradually converge.
From the perspective of message order, the strong consistency provided by the first type of method (prevention) ensures that, on any node, data generated earlier is never placed after data generated later because of problems such as replication lag. In other words, the first type of method maintains the linearizability of messages across the entire system, while the second type of method is non-linearizable. (The order of messages is very important and will be discussed in later articles.)
As the saying goes, nip it in the bud. Avoiding problems from the source is naturally the ideal goal.
In particular, data inconsistency is such a serious and hard-to-resolve problem that it should be avoided by all means.
So let's look at the preventive solutions to data inconsistency.
The simplest solution is the single leader + synchronous replication mode mentioned earlier.
This way, we can achieve the strong consistency we want, and the entire distributed system looks like a standalone system with no replicas. Users can get a consistent experience when accessing the system from anywhere at any time. Therefore, this consistency is also called single-copy consistency.
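As a rough illustration of single-copy consistency, the sketch below acknowledges a write only after the leader and every follower have applied it, so a read from any node returns the same value. `SyncSingleLeader` and its methods are hypothetical names, and the replicas are plain dictionaries standing in for real nodes.

```python
class SyncSingleLeader:
    """Single leader with synchronous replication (in-memory sketch)."""

    def __init__(self, replica_count=2):
        self.leader = {}
        self.followers = [{} for _ in range(replica_count)]

    def write(self, key, value):
        # Acknowledge only after the leader and every follower applied the write.
        self.leader[key] = value
        for follower in self.followers:
            follower[key] = value
        return "ok"

    def read(self, key, node=None):
        """Read from the leader (node=None) or from follower number `node`."""
        store = self.leader if node is None else self.followers[node]
        return store.get(key)


system = SyncSingleLeader()
system.write("balance", 100)

# Reads from the leader and from each follower all agree.
reads = {system.read("balance"), system.read("balance", 0), system.read("balance", 1)}
print(reads)
```

The sketch deliberately ignores failures in the middle of `write`, which is where the corner cases come from.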
However, if we delve into it, it seems there are still some corner cases.
Example A:
In this case, the client thinks the system has not successfully processed the request, but both the master and slave have persisted data, so the client and server have different perceptions.
Example B:
At this time, there will be two masters (the so-called split-brain phenomenon), and even the premise of single-master has been destroyed.
In this case, the system unexpectedly becomes multi-leader. Strong consistency is difficult to guarantee even in a carefully designed multi-leader system, let alone in an accidental one like this.
Example C:
This way, the data between replicas is inconsistent. 1 yuan will be deducted from the account on the first replica, while 2 yuan will be deducted from the account on the second replica.
Therefore, single-master synchronous replication does not provide absolute strong consistency. It provides only best-effort consistency, which holds in normal cases.
(The corner cases above are also related to the exactly-once problem. That problem is not discussed here; later articles in this series are devoted to it.)
Several of the corner cases above (such as split-brain) seem very special, but there may be a very common cause behind them.
What causes the node to be misjudged as dead, resulting in a split-brain?
Such reasons cause the communication between nodes to be inaccessible, at least in the short term.
The technical term is network partition: a cluster is divided into several partitions with no network connection between them.
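The sketch below shows one way a partition can turn into split-brain, assuming the slave uses a simple heartbeat timeout to decide whether the master is dead. The timeout-based detector, the `Node` class, and the helper names are assumptions for illustration only, not a description of any particular system.

```python
class Node:
    def __init__(self, name, is_master=False):
        self.name = name
        self.is_master = is_master


def peer_looks_dead(last_heartbeat_at, now, timeout):
    """Timeout-based failure detection: silence longer than `timeout` means 'dead'."""
    return now - last_heartbeat_at > timeout


def maybe_promote(slave, last_heartbeat_at, now, timeout):
    """Promote the slave to master if the old master looks dead."""
    if peer_looks_dead(last_heartbeat_at, now, timeout):
        slave.is_master = True
    return slave.is_master


master = Node("n1", is_master=True)
slave = Node("n2")

# The last heartbeat arrived at t=10. Then the network is partitioned,
# but the master process itself keeps running.
maybe_promote(slave, last_heartbeat_at=10.0, now=12.0, timeout=5.0)
promoted_early = slave.is_master  # still within the timeout

maybe_promote(slave, last_heartbeat_at=10.0, now=16.0, timeout=5.0)
masters = [n.name for n in (master, slave) if n.is_master]
print(promoted_early, masters)
```

From the slave's side, a dead master and an unreachable master look identical, so after the timeout both nodes consider themselves master.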
This leads to the famous CAP theorem.
Of consistency (C), availability (A), and partition tolerance (P), at most two can be satisfied at the same time.
We have talked a lot about consistency and availability. We want availability, so we introduce the replica mechanism, which leads to a consistency crisis. Now, there is a network partition problem that may need to be solved.
However, the CAP theorem tells us we can't solve the problem.
Then let's deduce it.
In this analysis, the three cannot be satisfied at the same time.
In addition, each case in the analysis above starts from the same initial condition, "if a network partition occurs", which shows that P is different from the other two.
C, A, and P are not at the same level. C and A are targets, while P is an unavoidable precondition, although partition tolerance is also a target. Numerous production accidents have shown that a network partition can occur anytime and anywhere.
Therefore, a system without P does not have real high availability.
In practice, applying CAP to the design of production-level distributed systems is more about making a trade-off between C and A on the premise of P.
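The trade-off can be sketched with two replicas that behave differently once a partition occurs: a "CP" pair rejects writes it cannot replicate, while an "AP" pair accepts them locally and lets the replicas diverge. The `Replica` class, the `mode` flag, and the `run` helper are hypothetical and exist only to make the choice visible.

```python
class Replica:
    def __init__(self, mode):
        self.mode = mode  # "CP" or "AP" (assumed labels for this sketch)
        self.value = None
        self.peer = None
        self.partitioned = False

    def write(self, value):
        if self.partitioned:
            if self.mode == "CP":
                # Give up availability to keep the replicas identical.
                raise RuntimeError("unavailable: cannot reach peer")
            # AP: stay available, accept the write locally, and diverge.
            self.value = value
            return "ok (local only)"
        # No partition: replicate synchronously to the peer.
        self.value = value
        self.peer.value = value
        return "ok"


def run(mode):
    """Write once normally, once under partition; report (available, consistent)."""
    a, b = Replica(mode), Replica(mode)
    a.peer, b.peer = b, a
    a.write(1)
    a.partitioned = b.partitioned = True
    try:
        a.write(2)
        available = True
    except RuntimeError:
        available = False
    return available, a.value == b.value


print(run("CP"))
print(run("AP"))
```

Under partition, neither mode reports both availability and consistency, which matches the conclusion above.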
We have introduced the replica mechanism for high availability, but the side effect of the replica mechanism is that it will cause data consistency problems.
For the example C above, we can interpret it differently: Data is copied from the master to multiple slaves, which can be regarded as several independent events of writing data to different nodes. It is the partial success of these events that leads to data inconsistency.
If all events fail, we can simply retry. However, if some events succeed and some fail, retrying all of them may cause data inconsistency.
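The following sketch replays the deduction scenario under the assumption that the first replica is unreachable during the first attempt and that the retry blindly re-sends the deduction to every replica. `Account` and `replicate_deduct` are illustrative names, not part of any real API.

```python
class Account:
    """One replica of an account balance (hypothetical, in-memory)."""

    def __init__(self, balance):
        self.balance = balance
        self.reachable = True

    def deduct(self, amount):
        if not self.reachable:
            raise ConnectionError("replica unreachable")
        self.balance -= amount


def replicate_deduct(replicas, amount):
    """Apply the deduction on each replica independently.

    Returns True only if every replica succeeded. There is no rollback,
    so a partial success stays applied.
    """
    all_ok = True
    for replica in replicas:
        try:
            replica.deduct(amount)
        except ConnectionError:
            all_ok = False
    return all_ok


first, second = Account(10), Account(10)

first.reachable = False  # first attempt: only the second replica succeeds
ok = replicate_deduct([first, second], 1)

first.reachable = True   # naive retry: re-send to every replica
if not ok:
    replicate_deduct([first, second], 1)

print(first.balance, second.balance)
```

The first replica ends up 1 yuan lower and the second 2 yuan lower, because the retry cannot tell which of the independent writes had already succeeded.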
Avoiding partial success across multiple events, in other words maintaining their atomicity (either all succeed or all fail), is the essence of the problem, and there is already a reliable solution for it: transactions.
Specifically, what we need are distributed transactions.
In the next article, let's learn about distributed transactions.
This is a carefully conceived series of 20-30 articles. I hope to give everyone a core grasp of the distributed system in a storytelling way. Stay tuned for the next one!
Learning about Distributed Systems - Part 8: Improve Availability with Replications
Learning about Distributed Systems – Part 10: An Exploration of Distributed Transactions