PolarDB: Read-ahead and pre-extension

Last Updated: May 17, 2024

This topic describes heap table read-ahead, heap table pre-extension, and index creation pre-extension.

Prerequisites

Your PolarDB for PostgreSQL cluster runs one of the following engines:

  • PostgreSQL 14 (revision version 14.5.1.0 or later)

  • PostgreSQL 11 (revision version 1.1.1 or later)

Note

You can execute one of the following statements to view the revision version of your PolarDB for PostgreSQL cluster:

  • PostgreSQL 14

    select version();
  • PostgreSQL 11

    show polar_version;

Background information

PolarDB for PostgreSQL uses PolarFileSystem (PFS) as its file system. Unlike standalone file systems such as ext4, PFS incurs high overhead for metadata updates when a file is extended. In PFS, the minimum granularity for page extension is 4 MB. In PostgreSQL, however, the granularity for page extension is 8 KB. This granularity does not suit PFS and degrades performance when tables are written or indexes are created. In addition, PFS is more I/O efficient when it reads data in large blocks.

Based on the preceding characteristics, PolarDB for PostgreSQL provides heap table read-ahead, heap table pre-extension, and index creation pre-extension to deliver better performance on PFS.

Overview

  • Heap table read-ahead

    When PostgreSQL reads a heap table, it reads pages from the file system to the memory buffer pool in units of 8 KB. PFS is not efficient for I/O operations with small amounts of data. Therefore, PolarDB for PostgreSQL uses heap table read-ahead to adapt to PFS.

    When two or more pages are to be read, batch read-ahead is triggered, and 128 KB of data is read in each I/O to the buffer pool. Batch read-ahead doubles the performance of sequential scan and vacuum, and increases the performance for creating indexes by 18%.

  • Heap table pre-extension

    In PostgreSQL, 8 KB pages are allocated and extended one at a time when a table grows. Even when PostgreSQL extends pages in batches, extending N pages still requires N I/O operations. This does not suit PFS, whose minimum page extension size is 4 MB. Therefore, PolarDB for PostgreSQL uses heap table pre-extension.

    With heap table pre-extension, each extension I/O extends the table by 4 MB of pages. In scenarios where tables are frequently written (for example, when data is loaded), performance can double.

  • Index creation pre-extension

    Index creation pre-extension is similar to heap table pre-extension, but it optimizes the index creation process for PFS. Each extension I/O extends the index by 4 MB of pages. This can improve index creation performance by 30%.

    Note

    Index creation pre-extension supports only B-tree indexes. Other index types are not supported.
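
As a rough check of the sizes quoted above, the following query works out how many 8 KB pages fit into one read-ahead I/O and one pre-extension I/O, and how many I/O requests a 400 GB table would need at each granularity. It is only a sketch: it assumes the default PostgreSQL page size of 8 KB and perfectly batched I/O, and the column names are illustrative.

```sql
-- Page and I/O counts implied by the sizes in this topic.
-- Assumption: the page size is the PostgreSQL default of 8 KB (8192 bytes).
SELECT
    (128 * 1024) / 8192                                    AS pages_per_read_ahead_io,     -- 16
    (4 * 1024 * 1024) / 8192                               AS pages_per_pre_extension_io,  -- 512
    (400::bigint * 1024 * 1024 * 1024) / 8192              AS page_sized_ios_for_400gb,    -- 52428800
    (400::bigint * 1024 * 1024 * 1024) / (128 * 1024)      AS read_ahead_ios_for_400gb,    -- 3276800
    (400::bigint * 1024 * 1024 * 1024) / (4 * 1024 * 1024) AS pre_extension_ios_for_400gb; -- 102400
```

These are idealized request counts, not measured values. They only illustrate why fewer, larger I/O operations suit PFS better than page-by-page access.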

How it works

  • Heap table read-ahead

    Heap table read-ahead is implemented in four steps:

    1. Apply for N buffers from the buffer pool.

    2. Use palloc to allocate a memory area of N × page size, referred to as p.

    3. Use PFS to read N × page size of data from the heap table into p.

    4. Copy the N pages in p to the N buffers that were obtained from the buffer pool.

    Subsequent reads of these pages hit the buffers directly. The following figure shows the data flow: [Figure: heap table read-ahead]

  • Heap table pre-extension

    Heap table pre-extension is implemented in three steps:

    1. Apply for N buffers from the buffer pool without triggering page extension of the file system.

    2. Perform batch page extension by using the PFS file write interface and write all-zero pages.

    3. Initialize the extended pages one by one, record the available space of each page, and end the pre-extension process.

  • Index creation pre-extension

    Index creation pre-extension is implemented in a way similar to heap table pre-extension, but it does not need to apply for buffers. It is implemented in two steps:

    1. Perform batch page extension by using the PFS file write interface and write all-zero pages.

    2. Write the index pages that are built in the buffer pool to the file system.
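
One way to look at the effect of heap table pre-extension from SQL is to load a small amount of data and inspect the size of the table file. The following is a minimal sketch, not a verified test: the table name pre_extend_demo and its columns are hypothetical, and this topic does not state how pre-extended pages are reported by pg_relation_size, so treat the result as something to check on your own cluster.

```sql
-- Hypothetical demo table; the name and columns are illustrative only.
CREATE TABLE pre_extend_demo (id int, payload text);

-- Insert a small amount of data (well under 4 MB).
INSERT INTO pre_extend_demo
SELECT g, repeat('x', 100)
FROM generate_series(1, 1000) AS g;

-- Report the size of the main data file in bytes and in 8 KB pages.
SELECT pg_relation_size('pre_extend_demo')        AS bytes,
       pg_relation_size('pre_extend_demo') / 8192 AS pages;

-- Clean up the demo table.
DROP TABLE pre_extend_demo;
```

If the file is extended in batches as described above, the reported size may be larger than the pages that the rows actually occupy. Running the same statements with polar_bulk_extend_size set to 0 gives a baseline for comparison.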

Usage

  • Heap table read-ahead

    The polar_bulk_read_size parameter controls heap table read-ahead. Heap table read-ahead is enabled by default, and the default value of the parameter is 128 KB.

    Note

    We recommend that you do not modify the parameter value. 128 KB is the optimal value for PFS.

    • Disable heap table read-ahead.

      ALTER SYSTEM SET polar_bulk_read_size = 0;
      SELECT pg_reload_conf();
    • Enable heap table read-ahead and set the read-ahead size to 128 KB.

      ALTER SYSTEM SET polar_bulk_read_size = '128kB';
      SELECT pg_reload_conf();
  • Heap table pre-extension

    The polar_bulk_extend_size parameter controls heap table pre-extension. Heap table pre-extension is enabled by default, and the default value of the parameter is 4 MB.

    Note

    We recommend that you do not modify the parameter value. 4 MB is the optimal value for PFS.

    • Disable heap table pre-extension.

      ALTER SYSTEM SET polar_bulk_extend_size = 0;
      SELECT pg_reload_conf();
    • Enable heap table pre-extension and set the pre-extension size to 4 MB.

      ALTER SYSTEM SET polar_bulk_extend_size = '4 MB';
      SELECT pg_reload_conf();
  • Index creation pre-extension

    The polar_index_create_bulk_extend_size parameter controls index creation pre-extension. Index creation pre-extension is enabled by default, and the default value of the parameter is 4 MB.

    Note

    We recommend that you do not modify the parameter value. 4 MB is the optimal value for PFS.

    • Disable index creation pre-extension.

      ALTER SYSTEM SET polar_index_create_bulk_extend_size = 0;
      SELECT pg_reload_conf();
    • Enable index creation pre-extension and set the pre-extension size to 4 MB.

      ALTER SYSTEM SET polar_index_create_bulk_extend_size = '4 MB';
      SELECT pg_reload_conf();

Performance comparison

To show the effect of heap table read-ahead, heap table pre-extension, and index creation pre-extension, a performance test was run on a PolarDB for PostgreSQL cluster that runs PostgreSQL 14.

  • Cluster specifications: 8 cores and 32 GB of memory.

  • Test environment: pgbench with 400 GB of data.

  • Heap table read-ahead

    • Performance comparison for vacuum on a 400 GB table: [Figure: vacuum performance comparison]

    • Performance comparison for sequential scan on a 400 GB table: [Figure: sequential scan performance comparison]

    Conclusion:

    • Heap table read-ahead doubles or triples the performance for vacuum and sequential scan.

    • When the read-ahead size exceeds the default value of 128 KB, no significant performance improvement occurs.

  • Heap table pre-extension

    Performance comparison for data loading on a 400 GB table: [Figure: data loading performance comparison]

    Conclusion:

    • Heap table pre-extension doubles the performance for data loading.

    • When the pre-extension size exceeds the default value of 4 MB, no significant performance improvement occurs.

  • Index creation pre-extension

    Performance comparison for creating indexes on a 400 GB table: [Figure: index creation performance comparison]

    Conclusion:

    • Index creation pre-extension increases the performance for creating indexes by 30%.

    • When the pre-extension size exceeds the default value of 4 MB, no significant performance improvement occurs.
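
A comparison of this kind can be approximated with plain SQL. The following is a hedged sketch rather than the procedure used for the figures above: it assumes that the test data was created by pgbench, so that the large table is pgbench_accounts, and it toggles read-ahead with the same statements shown in the Usage section.

```sql
-- Baseline: disable heap table read-ahead, then time a sequential scan.
ALTER SYSTEM SET polar_bulk_read_size = 0;
SELECT pg_reload_conf();
-- The reload is asynchronous; wait a moment before running the next statement.
EXPLAIN (ANALYZE, BUFFERS)
SELECT count(*) FROM pgbench_accounts;

-- Restore the default read-ahead size and repeat the same scan.
ALTER SYSTEM SET polar_bulk_read_size = '128kB';
SELECT pg_reload_conf();
EXPLAIN (ANALYZE, BUFFERS)
SELECT count(*) FROM pgbench_accounts;
```

Compare the execution time reported by the two runs. Pages that are already cached from the first run can make the second run look faster, so the comparison is most meaningful when the table is much larger than memory, as in the 400 GB test above.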