ZooKeeper Practice: Leader Election Caused by zxid Overflow

This article describes the leader election that ZooKeeper triggers when its transaction ID (zxid) overflows, and explains the cause of the issue and how to address it.

By Zikui

Background

Flink users in production use ZooKeeper as the metadata center and for leader election. In some versions of Flink, however, a leader election inside ZooKeeper causes the job to restart. Specifically, ZooKeeper actively triggers a leader election when the zxid overflows, and the resulting unexpected restart of Flink jobs leads to business losses. This article analyzes the issue from both the underlying principle and best practices, and provides a solution. The issue can be identified by the following message in the ZooKeeper server logs:

zxid lower 32 bits have rolled over, forcing re-election, and therefore new epoch start

Solution

ZooKeeper exposes the largest zxid that it has processed so far, and you can view it through the stat command (one of the four-letter-word commands). With this value, you can calculate how far the current zxid is from the overflow value. In addition, MSE provides risk management and leader election alerts, so that you can detect the risk in advance and avoid business losses.
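The gap calculation mentioned above can be sketched as follows. This is a minimal illustration, assuming the stat output contains a line of the form `Zxid: 0x...`; the sample text and the helper names `parse_zxid` and `remaining_before_rollover` are hypothetical and not part of ZooKeeper or MSE.

```python
import re

COUNTER_MASK = 0xFFFFFFFF  # the low-order 32 bits of the zxid hold the counter


def parse_zxid(stat_output: str) -> int:
    """Extract the zxid from text that contains a line such as 'Zxid: 0x500000a3f'."""
    match = re.search(r"^Zxid:\s*(0x[0-9a-fA-F]+)\s*$", stat_output, re.MULTILINE)
    if match is None:
        raise ValueError("no 'Zxid:' line found in the stat output")
    return int(match.group(1), 16)


def remaining_before_rollover(zxid: int) -> int:
    """Number of counter values left before the low 32 bits reach their maximum."""
    return COUNTER_MASK - (zxid & COUNTER_MASK)


# Hypothetical, shortened sample of a stat response (assumption for illustration).
sample = """Latency min/avg/max: 0/0/12
Zxid: 0x5fffffff0
Mode: leader
"""

zxid = parse_zxid(sample)
print(hex(zxid), remaining_before_rollover(zxid))  # 0x5fffffff0 15
```

Sampling this value periodically and dividing the remaining count by the observed write rate gives a rough estimate of when the next forced election will happen.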

You can predict the risk by using MSE ZooKeeper risk management and leader election time alerts.

MSE ZooKeeper offers a risk management capability that regularly scans instances for risks and notifies users. zxid overflow is one of these risks. The scan reports the risk before the zxid reaches the overflow value, so you can take measures in advance to avoid it.

The risk management feature scans the instance for various risks on a daily basis, and you can also manually trigger a health check with one click to diagnose risks.

In addition, the leader election time alerts of MSE ZooKeeper let you monitor the leader election duration and prevent business losses caused by a long election. You can create an MSE alert rule in the Alert management pane and configure the parameters related to the leader election duration.

Cause

What zxid Is and How It Is Generated

First, let's understand what the zxid is and how it is generated. The zxid is the globally unique ID of a transaction in ZooKeeper, and it defines the total order of transactions. Client changes to the data inside ZooKeeper are carried out by propagating and processing transactions within the ZooKeeper instance, so every data change produces a transaction with its own zxid. The zxid gives the position of that transaction in the global sequence, and no two distinct transactions share the same zxid (total order).

The zxid is a 64-bit number that consists of two parts: the current election cycle (epoch, the high-order 32 bits) and the counting part (counter, the low-order 32 bits). The epoch reflects changes in leadership. Whenever the cluster elects a new leader, a new epoch is generated to represent the cycle of the current leader. After the ZooKeeper instance elects a leader successfully, it is guaranteed that there is only one leader and that this leader's epoch has not been used before. This ensures that only one leader uses the epoch generated during this election. On this basis, whenever a client changes data, the current counter is increased by 1 to generate the zxid of the new transaction, and the leader uses this zxid to synchronize the transaction within the instance, thus ensuring the total order of transactions.
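The layout described above can be illustrated with a few lines of bit arithmetic. This is a sketch of the encoding only, not ZooKeeper's own code; the function names `make_zxid`, `epoch_of`, and `counter_of` are chosen for illustration.

```python
EPOCH_SHIFT = 32           # the epoch occupies the high-order 32 bits
COUNTER_MASK = 0xFFFFFFFF  # the counter occupies the low-order 32 bits


def make_zxid(epoch: int, counter: int) -> int:
    """Combine an epoch and a counter into a 64-bit zxid."""
    return (epoch << EPOCH_SHIFT) | (counter & COUNTER_MASK)


def epoch_of(zxid: int) -> int:
    """Return the high-order 32 bits (the election cycle)."""
    return zxid >> EPOCH_SHIFT


def counter_of(zxid: int) -> int:
    """Return the low-order 32 bits (the per-epoch transaction counter)."""
    return zxid & COUNTER_MASK


zxid = make_zxid(5, 0xA3F)
print(hex(zxid))                                  # 0x500000a3f
next_zxid = zxid + 1                              # the next transaction in the same epoch
print(epoch_of(next_zxid), counter_of(next_zxid)) # 5 2624
```

Because the epoch sits in the high-order bits, comparing two zxids as plain integers orders them first by epoch and then by counter.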

Why Leader Re-election Is Needed when a zxid Overflow Occurs

Given the composition of the zxid, if too many transactions are processed within a single epoch, the counter of the current epoch exceeds the maximum 32-bit value, and if counting continued, the carry would increase the epoch by 1. If an election later made another server the leader, the new epoch it generates could coincide with the epoch already present in the current zxid. As a result, different transactions would have the same zxid, which breaks the total order of transactions and may result in dirty data. Therefore, when the low-order 32 bits reach the maximum counter value, ZooKeeper actively triggers a leader election to avoid this problem.
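The carry into the epoch bits is easy to reproduce with plain integer arithmetic. The following sketch assumes the 32/32-bit layout described earlier; `must_reelect` only mirrors the condition described in this section and is not ZooKeeper's actual implementation.

```python
COUNTER_MASK = 0xFFFFFFFF  # low-order 32 bits


def split(zxid: int) -> tuple:
    """Return (epoch, counter) for a 64-bit zxid."""
    return zxid >> 32, zxid & COUNTER_MASK


def must_reelect(zxid: int) -> bool:
    """True when the counter is at its maximum and cannot be incremented safely."""
    return (zxid & COUNTER_MASK) == COUNTER_MASK


last_in_epoch_5 = (5 << 32) | COUNTER_MASK
naive_next = last_in_epoch_5 + 1  # unchecked increment

print(split(last_in_epoch_5))  # (5, 4294967295)
print(split(naive_next))       # (6, 0): the carry silently changed the epoch

# A leader elected later with epoch 6 would number its transactions in the same range.
print(naive_next + 1 == (6 << 32) | 1)  # True: two different transactions, one zxid
print(must_reelect(last_in_epoch_5))    # True
```

Forcing an election at this point lets the new leader pick a fresh epoch and restart the counter, so the unchecked increment never happens.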

What the Impact Is when the ZooKeeper Instance Performs a Leader Election

In general, when ZooKeeper is used as a registry and configuration center, the client is unaware of the leader election. After the election is completed, the client automatically reconnects and recovers. However, applications that depend on the ZooKeeper Disconnected event may be affected, because the client receives a Disconnected event during the leader election. For example, when the Curator LeaderLatch recipe is used, the recipe re-elects another registered instance as the leader.


Alibaba Cloud Native Community
