By digoal
PostgreSQL v12 — B-tree index enhanced (duplicate key sort by ctid)
PostgreSQL v12 NBtree has evolved into the fourth version, with three main enhancements:
1) The heap ctid is added to the key value of the index leaf page to ensure that the heap tuples of the duplicate value are stored in the leaf page of the index according to the physical order of the heap rows.
The index scan is now more efficient (index & heap correlate = 1) in improving scan or return of heap tuples with multiple repeated values. Previously, bitmap scan (sort blockid) was used for improvement while introducing the CPU recheck index cond.
2) With ctid in the leaf page, the key value in the leaf page is unique (because ctid is unique). When you insert the duplicate index field values, the rightmost leaf page containing the field value is split, significantly reducing the wasted space of the leaf page. In the earlier PG versions, the leaf page that satisfied the conditions was split at random, which could be the right page or the middle page.
Splitting the leaf page in versions earlier than PG v12:
Splitting the leaf page in PG v12:
3) For the NBtree index of PG v12, the internal page (or branch page) will truncate unnecessary key values, such as multi-field indexes. This function is similar to that of include index. In PG v11, only the columns in the include are not displayed in the index internal page. In PG v12 as well, the key columns may not be displayed in the internal page, making the index smaller.
As described above, PostgreSQL v12 introduced the TID as part of the index key, which would waste an inordinate amount of space in the internal index pages. So the same commit introduced truncation of ““redundant” index attributes from internal pages. The TID is redundant, so are non-key attributes from an INCLUDE clause (these were also removed from the internal index pages in v11). But PostgreSQL v12 can also truncate those index attributes not needed for table row identification.
In our primary key index, bid is a redundant column and is truncated from internal index pages, which saves 8 bytes of space per index entry. Let's examine an internal index page with the pageinspect extension:
PG v11:
drop table abc;
create table abc (c1 int, c2 int);
create index idx_abc_1 on abc(c1,c2);
create index idx_abc_2 on abc(c1) include (c2);
insert into abc select random()*100, generate_series(1,1000000);
postgres=# \di+ idx_abc_1
List of relations
Schema | Name | Type | Owner | Table | Size | Description
--------+-----------+-------+----------+-------+-------+-------------
public | idx_abc_1 | index | postgres | abc | 40 MB |
(1 row)
postgres=# \di+ idx_abc_2
List of relations
Schema | Name | Type | Owner | Table | Size | Description
--------+-----------+-------+----------+-------+-------+-------------
public | idx_abc_2 | index | postgres | abc | 27 MB |
(1 row)
PG v12:
postgres=# \di+ idx_abc_1
List of relations
Schema | Name | Type | Owner | Table | Size | Description
--------+-----------+-------+----------+-------+-------+-------------
public | idx_abc_1 | index | postgres | abc | 23 MB |
(1 row)
postgres=# \di+ idx_abc_2
List of relations
Schema | Name | Type | Owner | Table | Size | Description
--------+-----------+-------+----------+-------+-------+-------------
public | idx_abc_2 | index | postgres | abc | 20 MB |
(1 row)
You can see through pageinspect that PG v12 truncates some redundant key values in the internal page.
CREATE TABLE rel (
aid bigint NOT NULL,
bid bigint NOT NULL
);
ALTER TABLE rel
ADD CONSTRAINT rel_pkey PRIMARY KEY (aid, bid);
CREATE INDEX rel_bid_idx ON rel (bid);
\d rel
Table "public.rel"
Column | Type | Collation | Nullable | Default
--------+--------+-----------+----------+---------
aid | bigint | | not null |
bid | bigint | | not null |
Indexes:
"rel_pkey" PRIMARY KEY, btree (aid, bid)
"rel_bid_idx" btree (bid)
INSERT INTO rel (aid, bid)
SELECT i, i / 10000
FROM generate_series(1, 20000000) AS i;
/* set hint bits and calculate statistics */
VACUUM (ANALYZE) rel;
PG v11:
SELECT * FROM bt_page_items('rel_pkey', 2550);
itemoffset | ctid | itemlen | nulls | vars | data
------------+------------+---------+-------+------+-------------------------------------------------
1 | (2667,88) | 24 | f | f | cd 8f 0a 00 00 00 00 00 45 00 00 00 00 00 00 00
2 | (2462,0) | 8 | f | f |
3 | (2463,15) | 24 | f | f | d6 c0 09 00 00 00 00 00 3f 00 00 00 00 00 00 00
4 | (2464,91) | 24 | f | f | db c1 09 00 00 00 00 00 3f 00 00 00 00 00 00 00
5 | (2465,167) | 24 | f | f | e0 c2 09 00 00 00 00 00 3f 00 00 00 00 00 00 00
6 | (2466,58) | 24 | f | f | e5 c3 09 00 00 00 00 00 3f 00 00 00 00 00 00 00
7 | (2467,134) | 24 | f | f | ea c4 09 00 00 00 00 00 40 00 00 00 00 00 00 00
8 | (2468,25) | 24 | f | f | ef c5 09 00 00 00 00 00 40 00 00 00 00 00 00 00
9 | (2469,101) | 24 | f | f | f4 c6 09 00 00 00 00 00 40 00 00 00 00 00 00 00
10 | (2470,177) | 24 | f | f | f9 c7 09 00 00 00 00 00 40 00 00 00 00 00 00 00
...
205 | (2666,12) | 24 | f | f | c8 8e 0a 00 00 00 00 00 45 00 00 00 00 00 00 00
(205 rows)
PG v12:
SELECT * FROM bt_page_items('rel_pkey', 2700);
itemoffset | ctid | itemlen | nulls | vars | data
------------+----------+---------+-------+------+-------------------------
1 | (2862,1) | 16 | f | f | ab 59 0b 00 00 00 00 00
2 | (2576,0) | 8 | f | f |
3 | (2577,1) | 16 | f | f | 1f 38 0a 00 00 00 00 00
4 | (2578,1) | 16 | f | f | 24 39 0a 00 00 00 00 00
5 | (2579,1) | 16 | f | f | 29 3a 0a 00 00 00 00 00
6 | (2580,1) | 16 | f | f | 2e 3b 0a 00 00 00 00 00
7 | (2581,1) | 16 | f | f | 33 3c 0a 00 00 00 00 00
8 | (2582,1) | 16 | f | f | 38 3d 0a 00 00 00 00 00
9 | (2583,1) | 16 | f | f | 3d 3e 0a 00 00 00 00 00
10 | (2584,1) | 16 | f | f | 42 3f 0a 00 00 00 00 00
...
286 | (2861,1) | 16 | f | f | a6 58 0b 00 00 00 00 00
(286 rows)
In addition to the preceding three enhancements, PG v12 NBtree also features the following enhancements:
Select PG v12 NBtree v4 among pg_upgrade for updating to PG v12. We recommend using reindex CONCURRENTLY to rebuild indexes (this does not block the dml).
How PostgreSQL 12 Generates Less Log Data When Monitoring and Detecting Incomplete Startup Packet
digoal - May 25, 2021
digoal - May 19, 2021
ApsaraDB - October 20, 2020
ApsaraDB - December 27, 2023
Alibaba Clouder - November 6, 2018
digoal - February 17, 2022
Alibaba Cloud PolarDB for PostgreSQL is an in-house relational database service 100% compatible with PostgreSQL and highly compatible with the Oracle syntax.
Learn MoreAn online MPP warehousing service based on the Greenplum Database open source program
Learn MoreCreate, delete and manage encryption keys with Alibaba Cloud Key Management Service
Learn MoreAn on-demand database hosting service for PostgreSQL with automated monitoring, backup and disaster recovery capabilities
Learn MoreMore Posts by digoal