All Products
Search
Document Center

ApsaraDB RDS:Use the pg_repack extension to clear tablespaces

Last Updated:Nov 11, 2024

This topic describes how to clear the tablespaces of an ApsaraDB RDS for PostgreSQL instance by using the pg_repack extension. If you perform a large number of operations, such as UPDATE, on an entire table, a table bloat may occur. In this case, you can use this extension to mitigate the table bloat. When you use this extension to handle a table bloat issue, this extension does not need to hold an exclusive lock on the bloated table. This extension is more lightweight than the CLUSTER and VACUUM FULL statements.

Prerequisites

  • The RDS instance runs PostgreSQL 10.0 or a later version.

    Note

    This extension is not supported by ApsaraDB RDS for PostgreSQL instances that run PostgreSQL 17.

  • The RDS instance runs a minor engine version of 20240229 or later. For more information about how to update the minor engine version, see Update the minor engine version.

    Note

    If the RDS instance runs a minor engine version earlier than 20240229, the error message ERROR: must be superuser to use xxx function is displayed when you use the extension.

Usage notes

  • The pg_repack extension requires additional storage. A full-table repack requires the available storage to be at least twice the size of the table that you want to repack.

  • The pg_repack extension cannot remove bloat from temporary tables.

  • The pg_repack extension cannot remove bloat from GiST indexes.

  • When a table is being repacked by the pg_repack extension, you cannot execute DDL statements on the table. The pg_repack extension holds an ACCESS SHARE lock on the bloated table to prohibit the execution of DDL statements on the bloated table.

  • The pg_repack extension consumes a large number of disk I/O resources to create tables and indexes. Before you use the pg_repack extension, you must evaluate whether the repack operation can interrupt your workloads. For example, if your RDS instance uses enhanced SSDs (ESSDs) of performance level 1 (PL1) and you want to repack a 100-GB table, the IO throughput of the RDS instance can reach the maximum IO throughput of 250 MB/s.

Feature description

The pg_repack extension supports full-table repack and index repack.

  • The following procedure shows how the extension repacks an entire table:

    1. Creates a log table. The log table is used to record the changes that are made to the original table during the repack process.

    2. Creates triggers on the original table. The triggers are used to record the INSERT, UPDATE, and DELETE statements that are executed on the original table and insert the records into the log table during the repack process.

    3. Creates a table. The new table has the same rows and columns as the original table.

    4. Creates indexes in the new table.

    5. Applies the data changes in the log table to the new table.

    6. Switches the original and new tables in the system catalog.

    7. Deletes the original table.

    Note

    The pg_repack extension holds the ACCESS EXCLUSIVE lock to prohibit operations on the original table in Steps a, b, f, and g. In other steps, the extension holds the ACCESS SHARE lock. The lock does not prohibit the execution of INSERT, UPDATE, and DELETE statements on the original table.

  • The following procedure shows how the extension repacks the indexes on a table:

    1. Concurrently creates indexes.

    2. Switches the original and new indexes in the system catalog.

    3. Deletes the original indexes.

Enable or disable the pg_repack extension

Note

You must use the privileged account to execute the statements in this section.

  • Enable the pg_repack extension.

    CREATE EXTENSION pg_repack;
  • Disable the pg_repack extension.

    DROP EXTENSION pg_repack;

Install the client utility of the pg_repack extension

You must install the client utility of the pg_repack extension. If your Elastic Compute Service (ECS) instance runs Alibaba Cloud Linux 3.2104 or later, perform the following steps to install the client utility:

  1. Install environment dependencies.

    sudo yum install postgresql* redhat-rpm-config libpq* openssl-devel readline-devel -y
  2. Add environment variables.

    export PATH=$PATH:/usr/lib64/pgsql/postgresql-12/bin
    Note

    The path and PostgreSQL version in the preceding command are for reference only. Replace them with actual values in your business.

  3. Download the client utility, compile the client utility, and then install the client utility.

    wget https://github.com/reorg/pg_repack/archive/refs/tags/ver_1.4.6.tar.gz
    tar zxvf ver_1.4.6.tar.gz
    cd pg_repack-ver_1.4.6
    make && make install

Examples

-- Check but do not perform the repack operation: --dry-run
pg_repack --dry-run --no-superuser-check --echo --no-order -h Endpoint -p Port -d database1 -U user --table schema1.table1

Check and perform the repack operation:
pg_repack  --no-superuser-check --echo --no-order -h Endpoint -p Port -d database1 -U user --table schema1.table1

FAQ

What do I do if the "ERROR: pg_repack failed with error: You must be a superuser to use pg_repack" message is returned?

Use the -k or --no-superuser-check option to skip the superuser permission check. This way, you can prevent this type of permission error.

Important

When you repack a full table, you cannot use this method to obtain permissions.

References

For more information about the pg_repack extension, see Reorganize tables in PostgreSQL databases with minimal locks.