Global indexing is an indexing technique for partitioned tables. You can create global indexes on partitioned tables by using non-partition keys. Global indexes can provide unique constraints.
Prerequisites
The feature is supported on a PolarDB for PostgreSQL cluster that runs the following engine:
PostgreSQL 14 (revision version 14.6.4.0 or later)
You can execute the following statement to view the revision version of your PolarDB for PostgreSQL cluster:
select version();
Background information
As business data increases, data partitioning becomes a key feature of enterprise-level databases and an important means to reduce data scale. A partitioned table is split into several independent sub-tables based on partition keys. The sub-tables are managed separately to improve manageability, overall performance, and load balancing.
PolarDB for PostgreSQL allows you to manage partitions in a partitioned table. Typically, partitions are managed in the dimension of time.
Time is the partition keys of partitioned tables.
New partitions are created periodically (weekly or monthly), and new data is added to new partitions.
Old partitions are archived on a regular basis to reduce O&M costs.
In the preceding scenario, time is the partition key, instead of the primary key or unique ID. This causes the following problems:
In queries on the data in non-partition keys, all partitions are scanned because it is not clear which partition the data is in.
If you modify the data in non-partition keys, the data may not be unique in the partitioned table.
PolarDB for PostgreSQL provides the global index feature. Global indexes are created on a partitioned table. Unlike local indexes that are created on each partition by default, global indexes are for the data of the entire partitioned table (each index corresponding to multiple partitions). This can provide globally unique constraints on non-partition keys and greatly improve the query performance of non-partition keys.
Limits
You still can
attach
ordetach
partitions in partitioned tables where global indexes are created.To create a global index, add the
GLOBAL
keyword in the statement to create an index. If you do not add this keyword, a local index is created by default.You can concurrently create global indexes by adding the
CONCURRENTLY
keyword.You cannot create global indexes on non-partitioned tables or subpartitioned tables.
Global indexes cannot be expression indexes.
You cannot create global indexes on the partition key columns of partitioned tables.
Syntax
Create a global index.
CREATE [ UNIQUE ] INDEX [ CONCURRENTLY ] [ [ IF NOT EXISTS ] name ] ON [ ONLY ] table_name [ USING method ]
( { column_name | ( expression ) } [ COLLATE collation ] [ opclass [ ( opclass_parameter = value [, ... ] ) ] ] [ ASC | DESC ] [ NULLS { FIRST | LAST } ] [, ...] )
[ INCLUDE ( column_name [, ...] ) ]
[ WITH ( storage_parameter [= value] [, ... ] ) ]
[ GLOBAL/LOCAL ]
[ TABLESPACE tablespace_name ]
[ WHERE predicate ]
Examples
Accelerate non-partition key queries
Create a partitioned table with a time column as the partition key.
CREATE TABLE partition_range ( id INT, a INT, b INT, created_date TIMESTAMP WITHOUT TIME ZONE ) PARTITION BY RANGE (created_date); CREATE TABLE partition_range_part01 PARTITION OF partition_range FOR VALUES FROM (MINVALUE) TO ('2020-01-01 00:00:00'); CREATE TABLE partition_range_part02 PARTITION OF partition_range FOR VALUES FROM ('2020-01-01 00:00:00') TO ('2020-02-01 00:00:00'); CREATE TABLE partition_range_part03 PARTITION OF partition_range FOR VALUES FROM ('2020-02-01 00:00:00') TO ('2020-03-01 00:00:00');
Query the partition table based on non-partition key conditions.
EXPLAIN (COSTS OFF) SELECT * FROM partition_range WHERE id = 1;
The results indicate that all partitions are scanned and that the partition pruning feature cannot be used.
QUERY PLAN ------------------------------------------------------------ Append -> Seq Scan on partition_range_part01 partition_range_1 Filter: (id = 1) -> Seq Scan on partition_range_part02 partition_range_2 Filter: (id = 1) -> Seq Scan on partition_range_part03 partition_range_3 Filter: (id = 1) (7 rows)
Create local indexes on the partitioned table and perform the query again.
CREATE INDEX partition_range_idx_local ON partition_range(id); EXPLAIN (COSTS OFF) SELECT * FROM partition_range WHERE id = 1;
The results indicate that the local indexes of all partitions are scanned because local indexes are created on individual partitions.
QUERY PLAN -------------------------------------------------------------------------------------------------- Append -> Index Scan using partition_range_part01_id_idx on partition_range_part01 partition_range_1 Index Cond: (id = 1) -> Index Scan using partition_range_part02_id_idx on partition_range_part02 partition_range_2 Index Cond: (id = 1) -> Index Scan using partition_range_part03_id_idx on partition_range_part03 partition_range_3 Index Cond: (id = 1) (7 rows)
Create a global index on the partitioned table by using the
GLOBAL
keyword and perform the query again.CREATE INDEX partition_range_idx_global ON partition_range(id) GLOBAL; EXPLAIN (COSTS OFF) SELECT * FROM partition_range WHERE id = 1;
The results indicate that the global index is used to find the partition that contains the data.
QUERY PLAN ----------------------------------------------------------------------- Global Index Scan using partition_range_idx_global on partition_range Index Cond: (id = 1) (2 rows)
Unique constraint for non-partition keys
The partitioned table in the preceding example is still used. The partition key is created_date
, but the unique constraint column is id
.
CREATE UNIQUE INDEX partition_range_id_unique_idx ON partition_range(id);
ERROR: unique constraint on partitioned table must include all partitioning columns
DETAIL: UNIQUE constraint on table "partition_range" lacks column "created_date" which is part of the partition key.
If you create the unique constraint for a local index of a non-partition key, an error message is returned, indicating that a partition key must be included in the index.
If you add a unique constraint to a global index, such a restriction is removed.
CREATE UNIQUE INDEX partition_range_id_unique_idx ON partition_range(id) GLOBAL;
Performance test
Use pgbench to generate 80,000
rows of data, and create a partitioned table and a non-partitioned table.
Point query performance on non-partition keys
Item | TPS | |||||
Prepared Statement | Not used | Used | ||||
Concurrency | 1 | 32 | 64 | 1 | 32 | 64 |
Common table | 27,732 | 494,433 | 430,848 | 53,935 | 985,880 | 886,882 |
Partitioned table and local index | 367 | 4,155 | 3,688 | 856 | 8,742 | 6,790 |
Partitioned table and global index | 19,006 | 308,128 | 262,941 | 45,090 | 820,924 | 731,557 |
TPC-B performance on non-partition keys
Both point queries and DML are included.
Item | TPS | |||||
Prepared Statement | Not used | Used | ||||
Concurrency | 1 | 32 | 64 | 1 | 32 | 64 |
Common table | 1,115 | 51,025 | 60,409 | 4,822 | 90,312 | 100,802 |
Partitioned table and local index | 271 | 2,903 | 2,524 | 550 | 5,276 | 4,237 |
Partitioned table and global index | Not supported | 4,334 | 69,040 | 75,232 |
Conclusions
Global indexes can exponentially improve performance for point queries and DML.