By digoal
1. Create 2 partition tables, 1024 and 4 partitions, respectively:
create table a (id int, info text) partition by hash (id);
do language plpgsql $$
declare
begin
for i in 0..1023 loop
execute format('create table a%s partition of a for values with (MODULUS %s, REMAINDER %s)', i, 1024, i);
end loop;
end;
$$;
create table b (id int, info text) partition by hash (id);
do language plpgsql $$
declare
begin
for i in 0..3 loop
execute format('create table b%s partition of b for values with (MODULUS %s, REMAINDER %s)', i, 4, i);
end loop;
end;
$$;
2. Insert Data into the Partition Table of Four Partitions:
postgres=# insert into b select generate_series(1,1000000), random()::text;
INSERT 0 1000000
postgres=# \dt+ b*
List of relations
Schema | Name | Type | Owner | Persistence | Size | Description
--------+------+-------------------+----------+-------------+---------+-------------
public | b | partitioned table | postgres | permanent | 0 bytes |
public | b0 | table | postgres | permanent | 13 MB |
public | b1 | table | postgres | permanent | 13 MB |
public | b2 | table | postgres | permanent | 13 MB |
public | b3 | table | postgres | permanent | 13 MB |
(5 rows)
3. Query the data of the b0 subpartition, and you will find that the value of ID is not directly related to the result of the direct hash modulo.
postgres=# select * from b0 limit 10;
id | info
----+----------------------
1 | 0.5017019882606526
12 | 0.4488983802956099
14 | 0.036601106391117355
16 | 0.8031411032108977
17 | 0.2561659714681248
26 | 0.8746840737824613
28 | 0.9703716579314445
30 | 0.597006160271178
32 | 0.4628211184001785
34 | 0.9344181322067726
(10 rows)
If you use the hashvalue of hashint4, the result of the remainder is not related to the REMAINDER of the subpartition. What is the hash partition algorithm?
postgres=# select hashint4(id),* from b0 limit 10;
hashint4 | id | info
-------------+----+----------------------
-1905060026 | 1 | 0.5017019882606526
1938269380 | 12 | 0.4488983802956099
-1018458207 | 14 | 0.036601106391117355
-771555569 | 16 | 0.8031411032108977
-917053276 | 17 | 0.2561659714681248
-1668038584 | 26 | 0.8746840737824613
496699940 | 28 | 0.9703716579314445
1660582443 | 30 | 0.597006160271178
1125460419 | 32 | 0.4628211184001785
800709825 | 34 | 0.9344181322067726
(10 rows)
postgres=# select mod(hashint4(id),4),* from b0 limit 10;
mod | id | info
-----+----+----------------------
-2 | 1 | 0.5017019882606526
0 | 12 | 0.4488983802956099
-3 | 14 | 0.036601106391117355
-1 | 16 | 0.8031411032108977
0 | 17 | 0.2561659714681248
0 | 26 | 0.8746840737824613
0 | 28 | 0.9703716579314445
3 | 30 | 0.597006160271178
3 | 32 | 0.4628211184001785
1 | 34 | 0.9344181322067726
(10 rows)
Code related to hash partitioning and sharding computing:
https://doxygen.postgresql.org/partbounds_8c_source.html#l04789
https://doxygen.postgresql.org/partition_8h.html
If you want to insert the hash value into the corresponding partition, you need to write a function of the SQL interface to convert the original value to the hash shard ID. Please see
Partition Table of PostgreSQL Hash and Partition ID Calculation for more information about the INT type.
4. Write b0 data into A and observe which subpartitions are written to
The written subpartition is modulo by 4=0, which is in line with expectations. This is more convenient for splitting and merging partitions:
postgres=# insert into a select * from b0;
postgres=# select distinct mod(substring(x,'a(\d*)')::int, 4) from ( select distinct tableoid::regclass::text x from a order by 1) t;
mod
-----
0
(1 row)
For example, you can align MODULUS and REMAINDER to avoid recalculation of all data as long as the scaling value is a multiple.
For example, 1024 partitions are merged into 512 partitions.
REMAINDER 0,512 Merged to 0
REMAINDER 1,513 Merged to 1
...
REMAINDER 511,1023 Merged to 511
Even a partition table might contain different MODULUS subpartitions:
create table c (id int, info text) partition by hash (id);
create table c2_0 partition of c for values with (MODULUS 2, REMAINDER 0);
create table c4_1 partition of c for values with (MODULUS 4, REMAINDER 1);
create table c4_3 partition of c for values with (MODULUS 4, REMAINDER 3);
postgres=# \d+ c
Partitioned table "public.c"
Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
--------+---------+-----------+----------+---------+----------+--------------+-------------
id | integer | | | | plain | |
info | text | | | | extended | |
Partition key: HASH (id)
Partitions: c2_0 FOR VALUES WITH (modulus 2, remainder 0),
c4_1 FOR VALUES WITH (modulus 4, remainder 1),
c4_3 FOR VALUES WITH (modulus 4, remainder 3)
postgres=# create table c4_0 partition of c for values with (MODULUS 4, REMAINDER 0);
ERROR: 42P17: partition "c4_0" would overlap partition "c2_0"
LOCATION: check_new_partition_bound, partbounds.c:3077
https://www.postgresql.org/docs/current/sql-createtable.html
Example of splitting partitions and extending partitions:
set lock_timeout='1s';
begin;
alter table c detach partition c2_0;
create table c4_0 partition of c for values with (MODULUS 4, REMAINDER 0);
create table c4_2 partition of c for values with (MODULUS 4, REMAINDER 2);
insert into c select * from c2_0;
end;
Example of merging partitions and shrinking partitions:
set lock_timeout='1s';
begin;
alter table c detach partition c4_0;
alter table c detach partition c4_2;
create table c2_0 partition of c for values with (MODULUS 2, REMAINDER 0);
insert into c select * from c4_0;
insert into c select * from c4_2;
end;
Partition Table of PostgreSQL Hash and Partition ID Calculation
PostgreSQL Recursive Query – Examples of Depth-First and Breadth-First Search
ApsaraDB - July 26, 2024
digoal - April 30, 2021
digoal - June 26, 2019
digoal - September 22, 2022
digoal - April 20, 2021
digoal - August 6, 2021
Alibaba Cloud PolarDB for PostgreSQL is an in-house relational database service 100% compatible with PostgreSQL and highly compatible with the Oracle syntax.
Learn MoreAn online MPP warehousing service based on the Greenplum Database open source program
Learn MoreA fully managed NoSQL cloud database service that enables storage of massive amount of structured and semi-structured data
Learn MoreAn on-demand database hosting service for PostgreSQL with automated monitoring, backup and disaster recovery capabilities
Learn MoreMore Posts by digoal