×
Community Blog How to Determine Whether the Current PostgreSQL Database is in a Consistent State?

How to Determine Whether the Current PostgreSQL Database is in a Consistent State?

This article introduces how to determine whether the current PostgreSQL database is in a consistent state.

By Digoal

Background

1. How to determine whether the current database is in a consistent state?

When taking over a database, a concern is if "dangerous" parameters such as fpw=off and fsync=off are being used. How do you determine if the current database is consistent and if there are any block corruption issues?

Or having encountered issues like HA switching, server downtime, and forced shutdown and restart, even though these have been resolved, there's still uncertainty about the current database's consistency or potential block corruption.

2. What is a point of consistency?

The database regards all data blocks as consistent and there is no partial write issue (half new and half old). You can use the checksum to check it.

Additionally, there is no case where a transaction committed before this point does not exist or a transaction committed after this point exists. You can print the state by restoring logs.

3. How to check whether the current database is in a consistent state?

If the checksum is enabled when your database is initialized, you can stop the database and use pg_verify_checksums to check the state. At this time, the contents of all blocks are correct.

However, if the checksum is disabled when your database is initialized, you cannot check the state. Only when you find a block corruption can you know it through the error. At this time, you can skip the error by using zero_damaged_pages.

4. Has the point of consistency been restored when PITR is used?

After completing a collection of full and archived backups through normal methods, is the database in a consistent state when it is restored to a certain point?

The status can be verified by checking the log at the end of the recovery. If it has been restored to a point of consistency, then it is considered okay.

Sample code:

src/backend/access/transam/xlog.c

Example with a consistent state

reachedConsistency = true;    
ereport(LOG,    
                (errmsg("consistent recovery state reached at %X/%X",    
                                LSN_FORMAT_ARGS(lastReplayedEndRecPtr))));    
/* Check if we should stop as soon as reaching consistency */    
if (recoveryTarget == RECOVERY_TARGET_IMMEDIATE && reachedConsistency)    
{    
        ereport(LOG,    
                        (errmsg("recovery stopping after reaching consistency")));    

Example with an inconsistent state

if (reachedRecoveryTarget)    
{    
        if (!reachedConsistency)    
                ereport(FATAL,    
                                (errmsg("requested recovery stop point is before consistent recovery point")));    

5. Assuming that the database is healthy (that is, consistent) during online backup, under what circumstances can the backup set be restored to a consistent state?

WAL archiving is sufficient (all WALs from the start of backup to the end of backup) to restore it to the target WAL lsn point >= minimum recovery point (the WAL point after backup is stopped).

Reference:

Three Key Details and Principles of PostgreSQL Online Backup & Recovery

6. When the database crashes and recovers (including database crashes caused by kill 9, process crashes, shutdown immediate, or server power failure, but the file system is not damaged, the storage is not damaged, or the storage in the cache is not lost due to power loss), under what circumstances can it be restored to a consistent state?

If fsync is enabled and the full page write (FPW) is enabled (FPW can be disabled in the cow file system), it can be restored to a consistent state.

7. You can use the parameter to specify that the recovery should end as soon as a consistent state is reached, i.e., as early as possible.

recovery_target = 'immediate'

With this parameter, you do not need to set the target time point, target name, and target XID for recovery. When restoring from an online backup, this means the point where taking the backup ended.

Technically, this is a string parameter, but 'immediate' is currently the only allowed value.

Sample code:

/*    
 * For point-in-time recovery, this function decides whether we want to    
 * stop applying the XLOG before the current record.    
 *    
 * Returns true if we are stopping, false otherwise. If stopping, some    
 * information is saved in recoveryStopXid et al for use in annotating the    
 * new timeline's history file.    
 */    
static bool    
recoveryStopsBefore(XLogReaderState *record)    
{    
        bool            stopsHere = false;    
        uint8           xact_info;    
        bool            isCommit;    
        TimestampTz recordXtime = 0;    
        TransactionId recordXid;    
    
        /*    
         * Ignore recovery target settings when not in archive recovery (meaning    
         * we are in crash recovery).    
         */    
        if (!ArchiveRecoveryRequested)    
                return false;    
    
        /* Check if we should stop as soon as reaching consistency */    
        if (recoveryTarget == RECOVERY_TARGET_IMMEDIATE && reachedConsistency)    
        {    
                ereport(LOG,    
                                (errmsg("recovery stopping after reaching consistency")));    
    
                recoveryStopAfter = false;    
                recoveryStopXid = InvalidTransactionId;    
                recoveryStopLSN = InvalidXLogRecPtr;    
                recoveryStopTime = 0;    
                recoveryStopName[0] = '\0';    
                return true;    
        }    
/*    
 * Same as recoveryStopsBefore, but called after applying the record.    
 *    
 * We also track the timestamp of the latest applied COMMIT/ABORT    
 * record in XLogCtl->recoveryLastXTime.    
 */    
static bool    
recoveryStopsAfter(XLogReaderState *record)    
{    
...    
        /* Check if we should stop as soon as reaching consistency */    
        if (recoveryTarget == RECOVERY_TARGET_IMMEDIATE && reachedConsistency)    
        {    
                ereport(LOG,    
                                (errmsg("recovery stopping after reaching consistency")));    
    
                recoveryStopAfter = true;    
                recoveryStopXid = InvalidTransactionId;    
                recoveryStopTime = 0;    
                recoveryStopLSN = InvalidXLogRecPtr;    
                recoveryStopName[0] = '\0';    
                return true;    
        }    
    
    
    
    
    
                        if (reachedRecoveryTarget)    
                        {    
                                if (!reachedConsistency)    
                                        ereport(FATAL,    
                                                        (errmsg("requested recovery stop point is before consistent recovery point")));    
/*    
 * Have we reached a consistent database state? In crash recovery, we have    
 * to replay all the WAL, so reachedConsistency is never set. During archive    
 * recovery, the database is consistent once minRecoveryPoint is reached.    
 */    
bool            reachedConsistency = false;    
0 1 0
Share on

digoal

282 posts | 25 followers

You may also like

Comments