This topic describes the logical replication slot failover feature of PolarDB for PostgreSQL. This feature syncs logical replication slots from the primary node to the read-only node so that the slots are not lost if the primary node fails.
Prerequisites
The PolarDB for PostgreSQL cluster runs the following engine:
PostgreSQL 11 (revision version 1.1.27 or later)
You can execute the following statement to view the revision version of your PolarDB for PostgreSQL cluster:
show polar_version;
Background information
According to PostgreSQL's streaming replication protocol, the logical replication slot on the primary node is not synced to the read-only node that serves as the standby. In the event of a failover, the logical replication slot is lost, and consequently the logical subscriptions are disrupted. To resolve this issue, PolarDB for PostgreSQL enhances the system by supporting the failover of the logical replication slot.
The failover of the physical replication slot is not supported.
For more information, see the Replication slots section of PostgreSQL's Logical decoding concepts.
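To check which slots would be affected by a failover, you can list the logical replication slots on the node you are connected to. The following query is a minimal sketch that uses the standard PostgreSQL pg_replication_slots view. It assumes that PolarDB for PostgreSQL exposes this view with the same columns as community PostgreSQL 11.

```sql
-- List logical replication slots on the current node.
-- Assumption: pg_replication_slots has the community PostgreSQL 11 columns.
SELECT slot_name,
       plugin,
       database,
       active,
       restart_lsn,
       confirmed_flush_lsn
FROM pg_replication_slots
WHERE slot_type = 'logical';
```

Running the same query on the new primary node after a failover is one way to confirm that a slot was carried over.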
Enable or disable the logical replication slot failover feature
You can enable or disable the logical replication slot failover feature by configuring the polar_failover_slot_mode parameter. Valid values are as follows:
sync: enables logical replication slot failover and sets it to the synchronous mode.
Note: The synchronous mode ensures that no logical replication data is lost during the failover of a cluster.
It does so by ensuring that the logical replication client never receives changes before the read-only node does. However, if the read-only node is disconnected for a long time and then reconnects, or if it is rebuilt, the primary-standby sync latency can become high enough that the logical replication client receives changes before the read-only node. If a failover occurs during this time window, logical replication data may be lost. Data that is generated on the new primary node may also be lost.
To avoid such data loss, you can set the polar_priority_replication_force_wait parameter to on. When the read-only node is disconnected, the primary node then does not send changes to the logical replication client until the read-only node is reconnected or rebuilt. However, this reduces the availability of logical replication. Proceed with caution.
async (default): enables logical replication slot failover and sets it to the asynchronous mode.
Note: In the asynchronous mode, no logical replication data is lost during the failover of an HA cluster. However, duplicate data may be sent to the client. In addition, if the client has newer data than the new primary node after failover completes, the client may have retained data, if any, that was lost during failover.
If this inconsistency affects your business, we recommend that you use the synchronous mode.
off: disables logical replication slot failover.
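To confirm which mode is currently in effect, you can read the parameters named above with the standard SHOW statement. This is a sketch that assumes both parameters are readable from an ordinary SQL session. How the values are changed (for example, through the console) is not covered here.

```sql
-- Show the current failover mode: sync, async, or off.
SHOW polar_failover_slot_mode;

-- Show whether the primary node waits for the read-only node
-- before sending changes to logical replication clients.
SHOW polar_priority_replication_force_wait;
```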