A MongoDB replica set (v3.0) synchronizes member status through heartbeats. Each node periodically sends heartbeat requests to the other members of the replica set, carrying status information such as what the rs.status() method displays. The node that initiates a heartbeat request is called the source, and the member that receives it is the target. Heartbeat processing is divided into three phases.
Let us examine the main state synchronization logic in these three phases.
In the first phase, under the default configuration, each node of a replica set sends a heartbeat request (a replSetHeartbeat command) to every other member once every two seconds. The content of the heartbeat request is similar to the one below (obtained through mongosniff packet capturing). It mainly contains the replica set name (replSetName), the address of the sending node, and the version of the sender's replica set configuration.
```
command: replSetHeartbeat database: admin
metadata: { $replData: 1 }
commandArgs: {
    replSetHeartbeat: "mongo-9552",
    pv: 1,
    v: 22,
    from: "10.101.72.137:9552",
    fromId: 3,
    checkEmpty: false
}
```
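Conceptually, the sending side is just a timer loop. The following C++ sketch illustrates a source node dispatching the fields seen in the capture above every two seconds; the HeartbeatRequest struct and the sendReplSetHeartbeat function are illustrative stand-ins, not MongoDB internals.

```cpp
// Minimal sketch of a heartbeat sender loop. HeartbeatRequest and
// sendReplSetHeartbeat are illustrative, not MongoDB's actual classes.
#include <chrono>
#include <iostream>
#include <string>
#include <thread>
#include <vector>

struct HeartbeatRequest {
    std::string setName;  // replSetHeartbeat: "mongo-9552"
    int protocolVersion;  // pv: 1
    int configVersion;    // v: 22, the version of the sender's rs.conf()
    std::string from;     // address of the sending node
    int fromId;           // member id of the sending node
};

// Stand-in for the real network call that ships the command to a member.
void sendReplSetHeartbeat(const std::string& target, const HeartbeatRequest& req) {
    std::cout << "replSetHeartbeat -> " << target
              << " (config v" << req.configVersion << ")\n";
}

int main() {
    const auto kHeartbeatInterval = std::chrono::seconds(2);  // default interval
    const std::vector<std::string> members = {"10.101.72.137:9553",
                                              "10.101.72.137:9554"};
    HeartbeatRequest req{"mongo-9552", 1, 22, "10.101.72.137:9552", 3};

    for (int round = 0; round < 3; ++round) {  // a few rounds for the demo
        for (const auto& target : members)
            sendReplSetHeartbeat(target, req);
        std::this_thread::sleep_for(kHeartbeatInterval);
    }
}
```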
In the second phase, the member that receives the heartbeat request processes it and returns the result to the source. If the target node is not running in replica set mode, or the replica set names do not match, it returns an error response. If the configuration version of the source (the version of its rs.conf()) is lower than that of the target, the target attaches its own configuration to the heartbeat response; it also includes its oplog position and other status information in the response. If the target itself is uninitialized, it immediately sends a heartbeat request back to the source to obtain and update its own replica set configuration. The heartbeat response is similar to the following:
```
commandReply: {
    ok: 1.0,
    time: 1460705698,
    electionTime: new Date(6273289095791771649),
    e: true,
    rs: true,
    state: 1,
    v: 22,
    hbmsg: "",
    set: "mongo-9552",
    opTime: new Date(6272251740930703361)
}
metadata: {
    $replData: {
        term: -1,
        lastOpCommitted: { ts: Timestamp 1460372410000|1, t: -1 },
        lastOpVisible: { ts: Timestamp 0|0, t: -1 },
        configVersion: 22,
        primaryIndex: 2,
        syncSourceIndex: -1
    }
}
```
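The target-side checks can be summarized in a short sketch. All of the types below (HeartbeatArgs, NodeState, HeartbeatResponse) are made up for illustration; MongoDB's real implementation lives in its replication module and is considerably more involved.

```cpp
// Sketch of phase-2 processing on the target node. All types here are
// illustrative, not MongoDB's actual classes.
#include <iostream>
#include <string>

struct HeartbeatArgs {
    std::string setName;  // replica set name claimed by the source
    int configVersion;    // version of the source's rs.conf()
};

struct NodeState {
    bool isReplSetMember;  // started with --replSet?
    bool initialized;      // has a replica set config yet?
    std::string setName;
    int configVersion;     // version of our own rs.conf()
};

struct HeartbeatResponse {
    bool ok;
    std::string hbmsg;
    bool includeConfig;  // ship our full config back to the source
    // member state, oplog position, etc. would also be attached here
};

HeartbeatResponse processHeartbeat(const NodeState& self, const HeartbeatArgs& args) {
    // Reject requests from a non-replica-set node or the wrong set.
    if (!self.isReplSetMember || self.setName != args.setName)
        return {false, "replica set name or mode mismatch", false};

    HeartbeatResponse resp{true, "", false};
    // The source runs an older config: attach ours so it can catch up.
    if (args.configVersion < self.configVersion)
        resp.includeConfig = true;
    // An uninitialized target would immediately heartbeat the source back
    // to fetch a configuration (omitted here).
    return resp;
}

int main() {
    NodeState self{true, true, "mongo-9552", 22};
    HeartbeatResponse r = processHeartbeat(self, {"mongo-9552", 21});
    std::cout << "ok=" << r.ok << " includeConfig=" << r.includeConfig << "\n";
}
```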
The third phase is the most important part of the processing. After receiving the heartbeat response, the source node updates its view of the peer's status according to the response message and, based on the resulting state, decides whether an election needs to be triggered.
When an error response to the heartbeat request is received (a response timeout also counts as an error response), and the current number of retries is at most kMaxHeartbeatRetries (two by default) while the last heartbeat request was sent within kDefaultHeartbeatTimeoutPeriod (10 seconds by default), the next heartbeat request is sent immediately. Once the number of retries exceeds kMaxHeartbeatRetries, or kDefaultHeartbeatTimeoutPeriod has elapsed since the last heartbeat, the peer node is considered down.

On a successful response, the source proceeds as follows (see the sketch after this list):

- If the replica set configuration version of the peer is higher than its own, the node updates its configuration, persists it in the local database, and refreshes the peer's status information based on the response message.
- If the node itself is the primary and finds another member with a higher priority that is eligible to become primary, it steps down to a secondary on its own initiative.
- If the node itself is a secondary but finds that it has a higher priority and is eligible to become primary, it asks the current primary to step down. (This logic still contained some bugs, so the primary's self-step-down takes precedence, ensuring that the node with the highest priority ends up as primary.)
- If there is currently no primary, the node triggers an election; a new primary takes office once a majority of the members approve the election result.
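The retry rule and the priority arbitration above condense into the following sketch. The two constants mirror the defaults just described; everything else, including the function names, is illustrative rather than MongoDB's actual code.

```cpp
// Sketch of the phase-3 decisions on the source node. The constants mirror
// the defaults described above; the other names are illustrative.
#include <chrono>
#include <iostream>

constexpr int kMaxHeartbeatRetries = 2;                             // default
constexpr std::chrono::seconds kDefaultHeartbeatTimeoutPeriod{10};  // default

enum class RetryAction { RetryNow, MarkPeerDown };

// Called when a heartbeat returns an error (a timeout counts as an error).
RetryAction onHeartbeatError(int retries,
                             std::chrono::steady_clock::time_point lastSent,
                             std::chrono::steady_clock::time_point now) {
    if (retries <= kMaxHeartbeatRetries &&
        now - lastSent < kDefaultHeartbeatTimeoutPeriod)
        return RetryAction::RetryNow;  // resend the heartbeat immediately
    return RetryAction::MarkPeerDown;  // the peer is considered down
}

enum class Decision { None, StepDownSelf, AskPrimaryToStepDown, StartElection };

// Priority-based arbitration after the peer status has been updated.
Decision evaluatePriorities(bool selfIsPrimary, bool primaryExists,
                            double selfPriority, double bestOtherPriority) {
    if (selfIsPrimary && bestOtherPriority > selfPriority)
        return Decision::StepDownSelf;          // yield to the higher priority
    if (!selfIsPrimary && primaryExists && selfPriority > bestOtherPriority)
        return Decision::AskPrimaryToStepDown;  // request the primary to step down
    if (!primaryExists)
        return Decision::StartElection;         // majority vote picks a new primary
    return Decision::None;
}

int main() {
    auto now = std::chrono::steady_clock::now();
    auto a = onHeartbeatError(1, now - std::chrono::seconds(3), now);
    std::cout << "retry now? " << (a == RetryAction::RetryNow) << "\n";
    std::cout << "no primary -> election? "
              << (evaluatePriorities(false, false, 1.0, 1.0) ==
                  Decision::StartElection)
              << "\n";
}
```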
MongoDB synchronizes state between nodes through heartbeats and triggers elections to achieve eventual consistency within the replica set. However, this protocol has no formal theoretical proof of correctness. MongoDB 3.2 introduces a new version of the replica set communication protocol in which elections are based on Raft, further shortening the time needed to detect failures and restore service.