全部產品
Search
文件中心

PolarDB:pg_pathman(分區管理)

更新時間:Nov 20, 2024

本文介紹pg_pathman外掛程式的一些常見用法。

背景資訊

為了提高分區表的效能,PolarDB PostgreSQL版引入pg_pathman外掛程式。該外掛程式是一款分區管理外掛程式,提供了分區最佳化機制。

建立pg_pathman外掛程式

說明

如您需要使用pg_pathman外掛程式的分區管理功能,請聯絡我們操作。

CREATE EXTENSION IF NOT EXISTS pg_pathman;

建立外掛程式成功後可通過以下SQL語句查看版本:

SELECT extname,extversion FROM pg_extension WHERE extname = 'pg_pathman';

返回結果如下:

  extname   | extversion 
------------+------------
 pg_pathman | 1.5
(1 row)

外掛程式升級

PolarDB PostgreSQL版會定期對外掛程式進行升級,以提供更優質的資料庫服務。而當您需要升級外掛程式版本時,需要將叢集升級到最新版本。

外掛程式特性

  • 目前支援HASH分區、RANGE分區。

  • 支援自動分區管理(通過函數介面建立分區,自動將主表資料移轉到分區表),或手工分區管理(通過函數實現,將已有的表綁定到分區表,或者從分區表剝離)。

  • 支援的分區欄位類型包括int、float、date以及其他常用類型,包括自訂的domain。

  • 有效分區表查詢計劃(JOINs、subselects 等)。

  • 使用RuntimeAppend & RuntimeMergeAppend 自訂計劃節點實現了動態分區選擇。

  • PartitionFilter是一種有效插入觸發器替換方法。

  • 支援自動新增分區(目前僅支援RANGE分區表)。

  • 支援copy from/to直接讀取或寫入分區表,提高效率。

  • 支援分區欄位的更新,需要添加觸發器,如果不需要更新分區欄位,則不建議添加這個觸發器,對應會產生一定的效能影響。

  • 允許使用者自訂回呼函數,在建立分區時會自動觸發。

  • 非堵塞式建立分區表,以及後台自動將主表資料非堵塞式遷移到分區表。

  • 支援FDW,通過配置參數pg_pathman.insert_into_fdw=(disabled | postgres | any_fdw)支援postgres_fdw或任意FDW。

外掛程式用法

更多用法,請參見GitHub

相關視圖和表

pg_pathman利用函數來維護分區表,並建立了一些視圖,以便於查看分區表的狀態。具體如下:

  • pathman_config

    CREATE TABLE IF NOT EXISTS pathman_config (
        partrel         REGCLASS NOT NULL PRIMARY KEY,  -- 主表oid
        attname         TEXT NOT NULL,  -- 分區列名
        parttype        INTEGER NOT NULL,  -- 分區類型(hash or range)
        range_interval  TEXT,  -- range分區的interval
    
        CHECK (parttype IN (1, 2)) /* check for allowed part types */ );
  • pathman_config_params

    CREATE TABLE IF NOT EXISTS pathman_config_params (
        partrel        REGCLASS NOT NULL PRIMARY KEY,  -- 主表oid
        enable_parent  BOOLEAN NOT NULL DEFAULT TRUE,  -- 是否在最佳化器中過濾主表
        auto           BOOLEAN NOT NULL DEFAULT TRUE,  -- insert時是否自動擴充不存在的分區
        init_callback  REGPROCEDURE NOT NULL DEFAULT 0);  -- create partition時的回呼函數oid
  • pathman_concurrent_part_tasks

    -- helper SRF function
    CREATE OR REPLACE FUNCTION show_concurrent_part_tasks()  
    RETURNS TABLE (
        userid     REGROLE,
        pid        INT,
        dbid       OID,
        relid      REGCLASS,
        processed  INT,
        status     TEXT)
    AS 'pg_pathman', 'show_concurrent_part_tasks_internal'
    LANGUAGE C STRICT;
    
    CREATE OR REPLACE VIEW pathman_concurrent_part_tasks
    AS SELECT * FROM show_concurrent_part_tasks();
  • pathman_partition_list

    -- helper SRF function
    CREATE OR REPLACE FUNCTION show_partition_list()
    RETURNS TABLE (
        parent     REGCLASS,
        partition  REGCLASS,
        parttype   INT4,
        partattr   TEXT,
        range_min  TEXT,
        range_max  TEXT)
    AS 'pg_pathman', 'show_partition_list_internal'
    LANGUAGE C STRICT;
    
    CREATE OR REPLACE VIEW pathman_partition_list
    AS SELECT * FROM show_partition_list();

分區管理

RANGE分區

有四個管理函數用來建立定界分割。其中兩個可以指定起始值、間隔、分區個數,其函數定義如下:

create_range_partitions(relation       REGCLASS,  -- 主表OID
                        attribute      TEXT,      -- 分區列名
                        start_value    ANYELEMENT,  -- 開始值
                        p_interval     ANYELEMENT,  -- 間隔;任意類型,適合任意類型的分區表
                        p_count        INTEGER DEFAULT NULL,   --  分多少個區
                        partition_data BOOLEAN DEFAULT TRUE)   --  是否立即將資料從主表遷移到分區, 不建議這麼使用, 建議使用非堵塞式的遷移( 調用partition_table_concurrently() )

create_range_partitions(relation       REGCLASS,  -- 主表OID
                        attribute      TEXT,      -- 分區列名
                        start_value    ANYELEMENT,  -- 開始值
                        p_interval     INTERVAL,    -- 間隔;interval 類型,用於時間分區表
                        p_count        INTEGER DEFAULT NULL,   --  分多少個區
                        partition_data BOOLEAN DEFAULT TRUE)   --  是否立即將資料從主表遷移到分區, 不建議這麼使用, 建議使用非堵塞式的遷移( 調用partition_table_concurrently() )

另外兩個可以指定起始值、終值、間隔,其定義如下:

create_partitions_from_range(relation       REGCLASS,  -- 主表OID
                             attribute      TEXT,      -- 分區列名
                             start_value    ANYELEMENT,  -- 開始值
                             end_value      ANYELEMENT,  -- 結束值
                             p_interval     ANYELEMENT,  -- 間隔;任意類型,適合任意類型的分區表
                             partition_data BOOLEAN DEFAULT TRUE)   --  是否立即將資料從主表遷移到分區, 不建議這麼使用, 建議使用非堵塞式的遷移( 調用partition_table_concurrently() )

create_partitions_from_range(relation       REGCLASS,  -- 主表OID
                             attribute      TEXT,      -- 分區列名
                             start_value    ANYELEMENT,  -- 開始值
                             end_value      ANYELEMENT,  -- 結束值
                             p_interval     INTERVAL,    -- 間隔;interval 類型,用於時間分區表
                             partition_data BOOLEAN DEFAULT TRUE)   --  是否立即將資料從主表遷移到分區, 不建議這麼使用, 建議使用非堵塞式的遷移( 調用partition_table_concurrently() )

樣本如下所示:

  1. 建立需要分區的主表,並插入測試資料。

    --- 建立需要分區的主表
    CREATE TABLE part_test(id int, info text, crt_time timestamp not null);  -- 分區列必須有not null約束  
    
    --- 插入測試資料,類比已存在資料的主表
    INSERT INTO part_test SELECT id,md5(random()::text),clock_timestamp() + (id||' hour')::interval from generate_series(1,10000) t(id); 

    查詢主表的資料情況:

    SELECT * FROM part_test limit 10;

    返回結果如下:

     id |               info               |          crt_time          
    ----+----------------------------------+----------------------------
      1 | 36fe1adedaa5b848caec4941f87d443a | 2016-10-25 10:27:13.206713
      2 | c7d7358e196a9180efb4d0a10269c889 | 2016-10-25 11:27:13.206893
      3 | 005bdb063550579333264b895df5b75e | 2016-10-25 12:27:13.206904
      4 | 6c900a0fc50c6e4da1ae95447c89dd55 | 2016-10-25 13:27:13.20691
      5 | 857214d8999348ed3cb0469b520dc8e5 | 2016-10-25 14:27:13.206916
      6 | 4495875013e96e625afbf2698124ef5b | 2016-10-25 15:27:13.206921
      7 | 82488cf7e44f87d9b879c70a9ed407d4 | 2016-10-25 16:27:13.20693
      8 | a0b92547c8f17f79814dfbb12b8694a0 | 2016-10-25 17:27:13.206936
      9 | 2ca09e0b85042b476fc235e75326b41b | 2016-10-25 18:27:13.206942
     10 | 7eb762e1ef7dca65faf413f236dff93d | 2016-10-25 19:27:13.206947
    (10 rows)
  2. 建立分區,每個分區包含1個月的跨度資料 。

    --- 建立分區,每個分區包含1個月的跨度資料  
    SELECT                                             
    create_range_partitions('part_test'::regclass,             -- 主表OID
                            'crt_time',                        -- 分區列名
                            '2016-10-25 00:00:00'::timestamp,  -- 開始值
                            interval '1 month',                -- 間隔;interval 類型,用於時間分區表
                            24,                                -- 分多少個區
                            false) ;                           -- 不遷移資料
  3. 使用非堵塞式的遷移介面,遷移主表資料。

    --- 未遷移資料時,資料還在主表
    SELECT count(*) FROM ONLY part_test;
     count 
    -------
     10000
    (1 row)
    
    
    --- 非堵塞式的遷移介面  
    partition_table_concurrently(relation   REGCLASS,              -- 主表OID
                                 batch_size INTEGER DEFAULT 1000,  -- 一個事務批量遷移多少記錄
                                 sleep_time FLOAT8 DEFAULT 1.0)    -- 獲得行鎖失敗時,休眠多久再次擷取,重試60次退出任務。
    
    
    --- 使用非堵塞式的遷移介面,遷移主表資料
    SELECT partition_table_concurrently('part_test'::regclass,
                                 10000,
                                 1.0);
    
    
    --- 遷移結束後,主表資料已不存在,所有資料均已轉移至分區中
    SELECT count(*) FROM ONLY part_test;
     count 
    -------
         0
    (1 row)
  4. 資料移轉完成後,建議禁用主表,以避免在執行計畫中出現主表。

    --- 禁用主表
    SELECT set_enable_parent('part_test'::regclass, false);
    
    --- 驗證
    EXPLAIN SELECT * FROM part_test WHERE crt_time = '2016-10-25 00:00:00'::timestamp;
                                       QUERY PLAN                                    
    ---------------------------------------------------------------------------------
     Append  (cost=0.00..16.18 rows=1 width=45)
       ->  Seq Scan on part_test_1  (cost=0.00..16.18 rows=1 width=45)
             Filter: (crt_time = '2016-10-25 00:00:00'::timestamp without time zone)
    (3 rows)
說明

在RANGE分區表使用過程中,建議您遵循以下原則:

  • 分區列必須有not null約束。

  • 分區個數必須能覆蓋已有的所有記錄。

  • 使用非堵塞式遷移介面。

  • 資料移轉完成後,禁用主表。

HASH分區

有一個管理函數用來建立定界分割,可以指定起始值、間隔、分區個數,具體如下:

create_hash_partitions(relation         REGCLASS,  -- 主表OID
                       attribute        TEXT,      -- 分區列名
                       partitions_count INTEGER,   -- 打算建立多少個分區
                       partition_data   BOOLEAN DEFAULT TRUE)   --  是否立即將資料從主表遷移到分區, 不建議這麼使用, 建議使用非堵塞式的遷移( 調用partition_table_concurrently() )

樣本如下所示:

  1. 建立需要分區的主表,並插入測試資料。

    --- 建立需要分區的主表
    CREATE TABLE part_test(id int, info text, crt_time timestamp not null);    -- 分區列必須有not null約束  
    
    --- 插入測試資料,類比已存在資料的主表
    INSERT INTO part_test SELECT id,md5(random()::text),clock_timestamp() + (id||' hour')::interval FROM generate_series(1,10000) t(id); 

    查詢主表的資料情況:

    SELECT * FROM part_test limit 10;

    返回結果如下:

     id |               info               |          crt_time          
    ----+----------------------------------+----------------------------
      1 | 29ce4edc70dbfbe78912beb7c4cc95c2 | 2016-10-25 10:47:32.873879
      2 | e0990a6fb5826409667c9eb150fef386 | 2016-10-25 11:47:32.874048
      3 | d25f577a01013925c203910e34470695 | 2016-10-25 12:47:32.874059
      4 | 501419c3f7c218e562b324a1bebfe0ad | 2016-10-25 13:47:32.874065
      5 | 5e5e22bdf110d66a5224a657955ba158 | 2016-10-25 14:47:32.87407
      6 | 55d2d4fd5229a6595e0dd56e13d32be4 | 2016-10-25 15:47:32.874076
      7 | 1dfb9a783af55b123c7a888afe1eb950 | 2016-10-25 16:47:32.874081
      8 | 41eeb0bf395a4ab1e08691125ae74bff | 2016-10-25 17:47:32.874087
      9 | 83783d69cc4f9bb41a3978fe9e13d7fa | 2016-10-25 18:47:32.874092
     10 | affc9406d5b3412ae31f7d7283cda0dd | 2016-10-25 19:47:32.874097
    (10 rows)
  2. 建立分區。

    --- 建立128個分區
    SELECT                                              
    create_hash_partitions('part_test'::regclass,              -- 主表OID
                            'crt_time',                        -- 分區列名
                            128,                               -- 計劃建立分區個數
                            false) ;                           -- 不遷移資料
  3. 使用非堵塞式的遷移介面,遷移主表資料。

    --- 未遷移資料時,資料還在主表
    SELECT count(*) FROM ONLY part_test;
     count 
    -------
     10000
    (1 row)
    
    
    --- 非堵塞式的遷移介面  
    partition_table_concurrently(relation   REGCLASS,              -- 主表OID
                                 batch_size INTEGER DEFAULT 1000,  -- 一個事務批量遷移多少記錄
                                 sleep_time FLOAT8 DEFAULT 1.0)    -- 獲得行鎖失敗時,休眠多久再次擷取,重試60次退出任務。
    
    --- 使用非堵塞式的遷移介面,遷移主表資料
    SELECT partition_table_concurrently('part_test'::regclass,
                                 10000,
                                 1.0);
    
    --- 遷移結束後,主表資料已不存在,所有資料均已轉移至分區中
    SELECT count(*) FROM ONLY part_test;
     count 
    -------
         0
    (1 row)
  4. 資料移轉完成後,建議禁用主表,以避免在執行計畫中出現主表。

    --- 禁用主表
    SELECT set_enable_parent('part_test'::regclass, false);

    驗證執行計畫操作:

    --- 查詢單個分區
    EXPLAIN SELECT * FROM part_test WHERE crt_time = '2016-10-25 00:00:00'::timestamp;
                                       QUERY PLAN                                    
    ---------------------------------------------------------------------------------
     Append  (cost=0.00..1.91 rows=1 width=45)
       ->  Seq Scan on part_test_122  (cost=0.00..1.91 rows=1 width=45)
             Filter: (crt_time = '2016-10-25 00:00:00'::timestamp without time zone)
    (3 rows)

    分區資料表條件約束如下,可以看出pg_pathman自動完成了轉換,如果是傳統的繼承,SELECT * FROM part_test WHERE crt_time = '2016-10-25 00:00:00'::timestamp;這種寫法無法篩選分區。

    \d+ part_test_122
                                    Table "public.part_test_122"
      Column  |            Type             | Modifiers | Storage  | Stats target | Description 
    ----------+-----------------------------+-----------+----------+--------------+-------------
     id       | integer                     |           | plain    |              | 
     info     | text                        |           | extended |              | 
     crt_time | timestamp without time zone | not null  | plain    |              | 
    Check constraints:
        "pathman_part_test_122_3_check" CHECK (get_hash_part_idx(timestamp_hash(crt_time), 128) = 122)
    Inherits: part_test
說明

在HASH分區表使用過程中,建議您:

  • 分區列必須有not null約束。

  • 使用非堵塞式遷移介面。

  • 資料移轉完成後,禁用主表。

  • pg_pathman不會受制於運算式的寫法,所以select * from part_test where crt_time = '2016-10-25 00:00:00'::timestamp;這樣的寫法也能用於HASH分區的。

  • HASH分區列不局限於int類型的列,會使用HASH函數自動轉換。

資料移轉到分區

如果建立分區表時,未將主表資料移轉到分區,那麼可以使用非堵塞式的遷移介面,將資料移轉到分區。用法如下:

WITH tmp AS (DELETE FROM 主表 limit xx nowait returning *) INSERT INTO 分區 SELECT * FROM tmp;

或者使用如下語句進行標示,然後執行DELETE和INSERT。

SELECT array_agg(ctid) FROM 主表 limit xx FOR UPDATE nowati;

函數介面如下:

partition_table_concurrently(relation   REGCLASS,              -- 主表OID
                             batch_size INTEGER DEFAULT 1000,  -- 一個事務批量遷移多少記錄
                             sleep_time FLOAT8 DEFAULT 1.0)    -- 獲得行鎖失敗時,休眠多久再次擷取,重試60次退出任務。

樣本如下所示:

SELECT partition_table_concurrently('part_test'::regclass,
                             10000,
                             1.0);

查看背景資料移轉任務。

SELECT * FROM pathman_concurrent_part_tasks;

分裂定界分割

如果某個分區太大,想分裂為兩個分區,可以使用如下方法(目前僅支援RANGE分區表):

split_range_partition(partition      REGCLASS,            -- 分區oid
                      split_value    ANYELEMENT,          -- 分裂值
                      partition_name TEXT DEFAULT NULL)   -- 分裂後新增的分區表名

樣本如下所示:

  1. 使用RANGE分區樣本的分區表,分區表結構如下所示。

    \d+ part_test
                                      Table "public.part_test"
      Column  |            Type             | Modifiers | Storage  | Stats target | Description 
    ----------+-----------------------------+-----------+----------+--------------+-------------
     id       | integer                     |           | plain    |              | 
     info     | text                        |           | extended |              | 
     crt_time | timestamp without time zone | not null  | plain    |              | 
    Child tables: part_test_1,
                  part_test_10,
                  part_test_11,
                  part_test_12,
                  part_test_13,
                  part_test_14,
                  part_test_15,
                  part_test_16,
                  part_test_17,
                  part_test_18,
                  part_test_19,
                  part_test_2,
                  part_test_20,
                  part_test_21,
                  part_test_22,
                  part_test_23,
                  part_test_24,
                  part_test_3,
                  part_test_4,
                  part_test_5,
                  part_test_6,
                  part_test_7,
                  part_test_8,
                  part_test_9
    
    \d+ part_test_1
                                     Table "public.part_test_1"
      Column  |            Type             | Modifiers | Storage  | Stats target | Description 
    ----------+-----------------------------+-----------+----------+--------------+-------------
     id       | integer                     |           | plain    |              | 
     info     | text                        |           | extended |              | 
     crt_time | timestamp without time zone | not null  | plain    |              | 
    Check constraints:
        "pathman_part_test_1_3_check" CHECK (crt_time >= '2016-10-25 00:00:00'::timestamp without time zone AND crt_time < '2016-11-25 00:00:00'::timestamp without time zone)
    Inherits: part_test
  2. 分區分裂。

    SELECT split_range_partition('part_test_1'::regclass,              -- 分區oid
                          '2016-11-10 00:00:00'::timestamp,     -- 分裂值
                          'part_test_1_2');                     -- 分區表名

    分裂後的兩個表如下:

    \d+ part_test_1
                                     Table "public.part_test_1"
      Column  |            Type             | Modifiers | Storage  | Stats target | Description 
    ----------+-----------------------------+-----------+----------+--------------+-------------
     id       | integer                     |           | plain    |              | 
     info     | text                        |           | extended |              | 
     crt_time | timestamp without time zone | not null  | plain    |              | 
    Check constraints:
        "pathman_part_test_1_3_check" CHECK (crt_time >= '2016-10-25 00:00:00'::timestamp without time zone AND crt_time < '2016-11-10 00:00:00'::timestamp without time zone)
    Inherits: part_test
    
    \d+ part_test_1_2 
                                    Table "public.part_test_1_2"
      Column  |            Type             | Modifiers | Storage  | Stats target | Description 
    ----------+-----------------------------+-----------+----------+--------------+-------------
     id       | integer                     |           | plain    |              | 
     info     | text                        |           | extended |              | 
     crt_time | timestamp without time zone | not null  | plain    |              | 
    Check constraints:
        "pathman_part_test_1_2_3_check" CHECK (crt_time >= '2016-11-10 00:00:00'::timestamp without time zone AND crt_time < '2016-11-25 00:00:00'::timestamp without time zone)
    Inherits: part_test

    此時,資料會自動遷移到另一個分區。

    SELECT count(*) FROM part_test_1;
     count 
    -------
       373
    (1 row)
    
    SELECT count(*) FROM part_test_1_2;
     count 
    -------
       360
    (1 row)

    繼承關係如下:

    \d+ part_test
                                      Table "public.part_test"
      Column  |            Type             | Modifiers | Storage  | Stats target | Description 
    ----------+-----------------------------+-----------+----------+--------------+-------------
     id       | integer                     |           | plain    |              | 
     info     | text                        |           | extended |              | 
     crt_time | timestamp without time zone | not null  | plain    |              | 
    Child tables: part_test_1,
                  part_test_10,
                  part_test_11,
                  part_test_12,
                  part_test_13,
                  part_test_14,
                  part_test_15,
                  part_test_16,
                  part_test_17,
                  part_test_18,
                  part_test_19,
                  part_test_1_2,    -- 新增的表
                  part_test_2,
                  part_test_20,
                  part_test_21,
                  part_test_22,
                  part_test_23,
                  part_test_24,
                  part_test_3,
                  part_test_4,
                  part_test_5,
                  part_test_6,
                  part_test_7,
                  part_test_8,
                  part_test_9

合并定界分割

目前僅支援RANGE分區,且必須為相鄰分區。調用如下介面:

--- 指定兩個需要合并分區  
merge_range_partitions(partition1 REGCLASS, partition2 REGCLASS)

樣本如下所示:

  1. 使用分裂定界分割樣本,進行合并。

    SELECT merge_range_partitions('part_test_1'::regclass, 'part_test_1_2'::regclass);
    說明

    非相鄰分區執行合并將報錯,報錯如下。

    SELECT merge_range_partitions('part_test_2'::regclass, 'part_test_12'::regclass) ;
    ERROR:  merge failed, partitions must be adjacent
    CONTEXT:  PL/pgSQL function merge_range_partitions_internal(regclass,regclass,regclass,anyelement) line 27 at RAISE
    SQL statement "SELECT public.merge_range_partitions_internal($1, $2, $3, NULL::timestamp without time zone)"
    PL/pgSQL function merge_range_partitions(regclass,regclass) line 44 at EXECUTE
  2. 合并後,會刪處其中一個分區表。

    \d part_test_1_2
    Did not find any relation named "part_test_1_2".
    
    \d part_test_1
                 Table "public.part_test_1"
      Column  |            Type             | Modifiers 
    ----------+-----------------------------+-----------
     id       | integer                     | 
     info     | text                        | 
     crt_time | timestamp without time zone | not null
    Check constraints:
        "pathman_part_test_1_3_check" CHECK (crt_time >= '2016-10-25 00:00:00'::timestamp without time zone AND crt_time < '2016-11-25 00:00:00'::timestamp without time zone)
    Inherits: part_test
    
    SELECT count(*) FROM part_test_1;
     count 
    -------
       733
    (1 row)

添加定界分割

如果主表已完成分區,將來若需增加分區,可以採用多種方法。以下將介紹三種方法:向後添加定界分割、向前添加定界分割以及指定分區起始值添加定界分割。

向後添加定界分割

向後添加定界分割時(即在末尾追加分區),會使用初次建立該分區表時的interval作為間隔。可以在pathman_config中查詢每個分區表初次建立時的interval,如下:

SELECT * FROM pathman_config;
  partrel  | attname  | parttype | range_interval 
-----------+----------+----------+----------------
 part_test | crt_time |        2 | 1 mon
(1 row)

添加分區介面(目前不支援指定資料表空間)。

append_range_partition(parent         REGCLASS,            -- 主表OID
                       partition_name TEXT DEFAULT NULL,   -- 新增的分區表名, 預設不需要輸入
                       tablespace     TEXT DEFAULT NULL)   -- 新增的分區表放到哪個資料表空間, 預設不需要輸入

樣本如下所示:

SELECT append_range_partition('part_test'::regclass);

\d+ part_test_25
                                Table "public.part_test_25"
  Column  |            Type             | Modifiers | Storage  | Stats target | Description 
----------+-----------------------------+-----------+----------+--------------+-------------
 id       | integer                     |           | plain    |              | 
 info     | text                        |           | extended |              | 
 crt_time | timestamp without time zone | not null  | plain    |              | 
Check constraints:
    "pathman_part_test_25_3_check" CHECK (crt_time >= '2018-10-25 00:00:00'::timestamp without time zone AND crt_time < '2018-11-25 00:00:00'::timestamp without time zone)
Inherits: part_test

\d+ part_test_24
                                Table "public.part_test_24"
  Column  |            Type             | Modifiers | Storage  | Stats target | Description 
----------+-----------------------------+-----------+----------+--------------+-------------
 id       | integer                     |           | plain    |              | 
 info     | text                        |           | extended |              | 
 crt_time | timestamp without time zone | not null  | plain    |              | 
Check constraints:
    "pathman_part_test_24_3_check" CHECK (crt_time >= '2018-09-25 00:00:00'::timestamp without time zone AND crt_time < '2018-10-25 00:00:00'::timestamp without time zone)
Inherits: part_test

向前添加定界分割

向前添加定界分割,介面如下:

prepend_range_partition(parent         REGCLASS,
                        partition_name TEXT DEFAULT NULL,
                        tablespace     TEXT DEFAULT NULL)

樣本如下所示:

SELECT prepend_range_partition('part_test'::regclass);

\d+ part_test_26
                                Table "public.part_test_26"
  Column  |            Type             | Modifiers | Storage  | Stats target | Description 
----------+-----------------------------+-----------+----------+--------------+-------------
 id       | integer                     |           | plain    |              | 
 info     | text                        |           | extended |              | 
 crt_time | timestamp without time zone | not null  | plain    |              | 
Check constraints:
    "pathman_part_test_26_3_check" CHECK (crt_time >= '2016-09-25 00:00:00'::timestamp without time zone AND crt_time < '2016-10-25 00:00:00'::timestamp without time zone)
Inherits: part_test

\d+ part_test_1
                                 Table "public.part_test_1"
  Column  |            Type             | Modifiers | Storage  | Stats target | Description 
----------+-----------------------------+-----------+----------+--------------+-------------
 id       | integer                     |           | plain    |              | 
 info     | text                        |           | extended |              | 
 crt_time | timestamp without time zone | not null  | plain    |              | 
Check constraints:
    "pathman_part_test_1_3_check" CHECK (crt_time >= '2016-10-25 00:00:00'::timestamp without time zone AND crt_time < '2016-11-25 00:00:00'::timestamp without time zone)
Inherits: part_test

指定分區起始值添加分區

指定分區起始值的方式添加定界分割,只要新建立的分區與現有分區之間不存在資料重疊,即可成功建立。也就是說,採用這種方法並不要求強制建立連續的分區。例如已有分區覆蓋了2010-2015的範圍,您可以直接建立一個2020年的分區表,不需要覆蓋2015到2020的範圍。介面如下:

add_range_partition(relation       REGCLASS,    -- 主表OID
                    start_value    ANYELEMENT,  -- 起始值
                    end_value      ANYELEMENT,  -- 結束值
                    partition_name TEXT DEFAULT NULL,  -- 分區名
                    tablespace     TEXT DEFAULT NULL)  -- 分區建立在哪個資料表空間下

樣本如下所示:

SELECT add_range_partition('part_test'::regclass,    -- 主表OID
                    '2020-01-01 00:00:00'::timestamp,  -- 起始值
                    '2020-02-01 00:00:00'::timestamp); -- 結束值

\d+ part_test_27
                                Table "public.part_test_27"
  Column  |            Type             | Modifiers | Storage  | Stats target | Description 
----------+-----------------------------+-----------+----------+--------------+-------------
 id       | integer                     |           | plain    |              | 
 info     | text                        |           | extended |              | 
 crt_time | timestamp without time zone | not null  | plain    |              | 
Check constraints:
    "pathman_part_test_27_3_check" CHECK (crt_time >= '2020-01-01 00:00:00'::timestamp without time zone AND crt_time < '2020-02-01 00:00:00'::timestamp without time zone)
Inherits: part_test

刪除分區

  • 刪除單個定界分割,介面如下:

    drop_range_partition(partition TEXT,   -- 分區名稱
                        delete_data BOOLEAN DEFAULT TRUE)  -- 是否刪除分區資料,如果false,表示分區資料移轉到主表。  
    
    Drop RANGE partition and all of its data if delete_data is true.
  • 刪除所有分區,並且指定是否要將資料移轉到主表。介面如下:

    drop_partitions(parent      REGCLASS,
                    delete_data BOOLEAN DEFAULT FALSE)
    
    Drop partitions of the parent table (both foreign and local relations). 
    If delete_data is false, the data is copied to the parent table first. 
    Default is false.

樣本如下所示:

  • 刪除分區,資料移轉到主表。

    SELECT drop_range_partition('part_test_1',false);
    SELECT drop_range_partition('part_test_2',false);

    查詢當前主表資料量:

    SELECT count(*) FROM part_test;
     count 
    -------
     10000
    (1 row)
  • 刪除分區,同時刪除分區資料,分區資料不遷移至主表。

    SELECT drop_range_partition('part_test_3',true);

    查詢當前主表資料量:

    SELECT count(*) FROM part_test;
     count 
    -------
      9256
    (1 row)
    
    SELECT count(*) FROM ONLY part_test;
     count 
    -------
      1453
    (1 row)
  • 刪除所有分區。

    SELECT drop_partitions('part_test'::regclass, false);  -- 刪除所有分區表,並將資料移轉到主表

    查詢主表資料:

    SELECT count(*) FROM part_test;
     count 
    -------
      9256
    (1 row)

綁定分區

將已有的表,綁定到已有的某個分區主表。已有的表與主表要保持一致的結構,包括dropped columns(查看pg_attribute的一致性)。介面如下:

attach_range_partition(relation    REGCLASS,    -- 主表OID
                       partition   REGCLASS,    -- 分區表OID
                       start_value ANYELEMENT,  -- 起始值
                       end_value   ANYELEMENT)  -- 結束值

樣本如下所示:

  1. 建立作為分區的表。

    CREATE TABLE part_test_1 (like part_test including all);
  2. 將已有的表綁定至主表。

    \d+ part_test
                                      Table "public.part_test"
      Column  |            Type             | Modifiers | Storage  | Stats target | Description 
    ----------+-----------------------------+-----------+----------+--------------+-------------
     id       | integer                     |           | plain    |              | 
     info     | text                        |           | extended |              | 
     crt_time | timestamp without time zone | not null  | plain    |              | 
    
    \d+ part_test_1
                                     Table "public.part_test_1"
      Column  |            Type             | Modifiers | Storage  | Stats target | Description 
    ----------+-----------------------------+-----------+----------+--------------+-------------
     id       | integer                     |           | plain    |              | 
     info     | text                        |           | extended |              | 
     crt_time | timestamp without time zone | not null  | plain    |              | 
    
    SELECT attach_range_partition('part_test'::regclass, 'part_test_1'::regclass, '2019-01-01 00:00:00'::timestamp, '2019-02-01 00:00:00'::timestamp);

    綁定分區時,將自動建立繼承關係,自動建立約束。

    \d+ part_test_1
                                     Table "public.part_test_1"
      Column  |            Type             | Modifiers | Storage  | Stats target | Description 
    ----------+-----------------------------+-----------+----------+--------------+-------------
     id       | integer                     |           | plain    |              | 
     info     | text                        |           | extended |              | 
     crt_time | timestamp without time zone | not null  | plain    |              | 
    Check constraints:
        "pathman_part_test_1_3_check" CHECK (crt_time >= '2019-01-01 00:00:00'::timestamp without time zone AND crt_time < '2019-02-01 00:00:00'::timestamp without time zone)
    Inherits: part_test

解除綁定分區

將分區從主表的繼承關係中刪除,不刪除資料,刪除繼承關係,刪除約束。介面如下:

detach_range_partition(partition REGCLASS)  -- 指定分區名,轉換為普通表

樣本如下所示:

  1. 查詢當前主表和分區表資料量。

    SELECT count(*) FROM part_test;
     count 
    -------
      9256
    (1 row)
    
    SELECT count(*) FROM part_test_2;
     count 
    -------
       733
    (1 row)
  2. 解除綁定分區。

    SELECT detach_range_partition('part_test_2');

    查詢當前主表和分區表資料量。

    SELECT count(*) FROM part_test_2;
     count 
    -------
       733
    (1 row)
    
    SELECT count(*) FROM part_test;
     count 
    -------
      8523
    (1 row)

禁用pg_pathman

您可以針對單個分區主表禁用pg_pathman。介面函數如下:

重要

disable_pathman_for沒有可逆操作,請慎使用。

\sf disable_pathman_for
CREATE OR REPLACE FUNCTION public.disable_pathman_for(parent_relid regclass)
 RETURNS void
 LANGUAGE plpgsql
 STRICT
AS $function$
BEGIN
        PERFORM public.validate_relname(parent_relid);

        DELETE FROM public.pathman_config WHERE partrel = parent_relid;
        PERFORM public.drop_triggers(parent_relid);

        /* Notify backend about changes */
        PERFORM public.on_remove_partitions(parent_relid);
END
$function$

樣本如下所示:

SELECT disable_pathman_for('part_test');

\d+ part_test
                                  Table "public.part_test"
  Column  |            Type             | Modifiers | Storage  | Stats target | Description 
----------+-----------------------------+-----------+----------+--------------+-------------
 id       | integer                     |           | plain    |              | 
 info     | text                        |           | extended |              | 
 crt_time | timestamp without time zone | not null  | plain    |              | 
Child tables: part_test_10,
              part_test_11,
              part_test_12,
              part_test_13,
              part_test_14,
              part_test_15,
              part_test_16,
              part_test_17,
              part_test_18,
              part_test_19,
              part_test_20,
              part_test_21,
              part_test_22,
              part_test_23,
              part_test_24,
              part_test_25,
              part_test_26,
              part_test_27,
              part_test_28,
              part_test_29,
              part_test_3,
              part_test_30,
              part_test_31,
              part_test_32,
              part_test_33,
              part_test_34,
              part_test_35,
              part_test_4,
              part_test_5,
              part_test_6,
              part_test_7,
              part_test_8,
              part_test_9

\d+ part_test_10
                                Table "public.part_test_10"
  Column  |            Type             | Modifiers | Storage  | Stats target | Description 
----------+-----------------------------+-----------+----------+--------------+-------------
 id       | integer                     |           | plain    |              | 
 info     | text                        |           | extended |              | 
 crt_time | timestamp without time zone | not null  | plain    |              | 
Check constraints:
    "pathman_part_test_10_3_check" CHECK (crt_time >= '2017-06-25 00:00:00'::timestamp without time zone AND crt_time < '2017-07-25 00:00:00'::timestamp without time zone)
Inherits: part_test

禁用pg_pathman外掛程式後,繼承關係和約束不會變化,只是pg_pathman外掛程式不介入custom scan執行計畫。禁用pg_pathman外掛程式後的執行計畫如下:

EXPLAIN SELECT * FROM part_test WHERE crt_time='2017-06-25 00:00:00'::timestamp;
                                   QUERY PLAN                                    
---------------------------------------------------------------------------------
 Append  (cost=0.00..16.00 rows=2 width=45)
   ->  Seq Scan on part_test  (cost=0.00..0.00 rows=1 width=45)
         Filter: (crt_time = '2017-06-25 00:00:00'::timestamp without time zone)
   ->  Seq Scan on part_test_10  (cost=0.00..16.00 rows=1 width=45)
         Filter: (crt_time = '2017-06-25 00:00:00'::timestamp without time zone)
(5 rows)

進階分區管理

禁用主表

當主表的資料全部遷移到分區後,可以禁用主表。介面函數如下:

set_enable_parent(relation REGCLASS, value BOOLEAN)


Include/exclude parent table into/from query plan. 

In original PostgreSQL planner parent table is always included into query plan even if it's empty which can lead to additional overhead. 

You can use disable_parent() if you are never going to use parent table as a storage. 

Default value depends on the partition_data parameter that was specified during initial partitioning in create_range_partitions() or create_partitions_from_range() functions. 

If the partition_data parameter was true then all data have already been migrated to partitions and parent table disabled. 

Otherwise it is enabled.

樣本如下所示:

SELECT set_enable_parent('part_test', false);

自動擴充分區

定界分割表,允許自動擴充分區。如果新插入的資料不在已有的分區範圍內,會自動建立分區。

set_auto(relation REGCLASS, value BOOLEAN)

Enable/disable auto partition propagation (only for RANGE partitioning). 

It is enabled by default.

樣本如下所示:

  1. 查詢當前測試表格分區情況。

    \d+ part_test
                                      Table "public.part_test"
      Column  |            Type             | Modifiers | Storage  | Stats target | Description 
    ----------+-----------------------------+-----------+----------+--------------+-------------
     id       | integer                     |           | plain    |              | 
     info     | text                        |           | extended |              | 
     crt_time | timestamp without time zone | not null  | plain    |              | 
    Child tables: part_test_10,
                  part_test_11,
                  part_test_12,
                  part_test_13,
                  part_test_14,
                  part_test_15,
                  part_test_16,
                  part_test_17,
                  part_test_18,
                  part_test_19,
                  part_test_20,
                  part_test_21,
                  part_test_22,
                  part_test_23,
                  part_test_24,
                  part_test_25,
                  part_test_26,
                  part_test_3,
                  part_test_4,
                  part_test_5,
                  part_test_6,
                  part_test_7,
                  part_test_8,
                  part_test_9
    
    \d+ part_test_26
                                    Table "public.part_test_26"
      Column  |            Type             | Modifiers | Storage  | Stats target | Description 
    ----------+-----------------------------+-----------+----------+--------------+-------------
     id       | integer                     |           | plain    |              | 
     info     | text                        |           | extended |              | 
     crt_time | timestamp without time zone | not null  | plain    |              | 
    Check constraints:
        "pathman_part_test_26_3_check" CHECK (crt_time >= '2018-09-25 00:00:00'::timestamp without time zone AND crt_time < '2018-10-25 00:00:00'::timestamp without time zone)
    Inherits: part_test
    
    \d+ part_test_25
                                    Table "public.part_test_25"
      Column  |            Type             | Modifiers | Storage  | Stats target | Description 
    ----------+-----------------------------+-----------+----------+--------------+-------------
     id       | integer                     |           | plain    |              | 
     info     | text                        |           | extended |              | 
     crt_time | timestamp without time zone | not null  | plain    |              | 
    Check constraints:
        "pathman_part_test_25_3_check" CHECK (crt_time >= '2018-08-25 00:00:00'::timestamp without time zone AND crt_time < '2018-09-25 00:00:00'::timestamp without time zone)
    Inherits: part_test
  2. 插入一個不在已有分區範圍的值,將根據建立分區時的interval自動擴充若干個分區,需要提醒您的是這個操作耗時可能較長。

    INSERT INTO part_test VALUES (1,'test','2222-01-01'::timestamp);

    查詢此時的分區情況:

    \d+ part_test
                                      Table "public.part_test"
      Column  |            Type             | Modifiers | Storage  | Stats target | Description 
    ----------+-----------------------------+-----------+----------+--------------+-------------
     id       | integer                     |           | plain    |              | 
     info     | text                        |           | extended |              | 
     crt_time | timestamp without time zone | not null  | plain    |              | 
    Child tables: part_test_10,
                  part_test_100,
                  part_test_1000,
                  part_test_1001,
                  ......
說明

不建議開啟自動擴充定界分割,不合理的自動擴充可能會消耗大量的時間。

回呼函數

回呼函數是在每建立一個分區時會自動觸發調用的函數。例如,可以用在DDL邏輯複製中,將DDL語句記錄下來,存放到表中。回呼函數如下:

set_init_callback(relation REGCLASS, callback REGPROC DEFAULT 0)

Set partition creation callback to be invoked for each attached or created partition (both HASH and RANGE). 

The callback must have the following signature: 

part_init_callback(args JSONB) RETURNS VOID. 

Parameter arg consists of several fields whose presence depends on partitioning type:

/* RANGE-partitioned table abc (child abc_4) */
{
    "parent":    "abc",
    "parttype":  "2",
    "partition": "abc_4",
    "range_max": "401",
    "range_min": "301"
}

/* HASH-partitioned table abc (child abc_0) */
{
    "parent":    "abc",
    "parttype":  "1",
    "partition": "abc_0"
}

樣本如下所示:

  1. 建立回呼函數。

    CREATE OR REPLACE FUNCTION f_callback_test(jsonb) RETURNS void AS
    $$
    DECLARE
    BEGIN
      CREATE TABLE if NOT EXISTS rec_part_ddl(id serial primary key, parent name, parttype int, partition name, range_max text, range_min text);
      if ($1->>'parttype')::int = 1 then
        raise notice 'parent: %, parttype: %, partition: %', $1->>'parent', $1->>'parttype', $1->>'partition';
        INSERT INTO rec_part_ddl(parent, parttype, partition) values (($1->>'parent')::name, ($1->>'parttype')::int, ($1->>'partition')::name);
      elsif ($1->>'parttype')::int = 2 then
        raise notice 'parent: %, parttype: %, partition: %, range_max: %, range_min: %', $1->>'parent', $1->>'parttype', $1->>'partition', $1->>'range_max', $1->>'range_min';
        INSERT INTO rec_part_ddl(parent, parttype, partition, range_max, range_min) values (($1->>'parent')::name, ($1->>'parttype')::int, ($1->>'partition')::name, $1->>'range_max', $1->>'range_min');
      END if;
    END;
    $$ LANGUAGE plpgsql strict;
  2. 準備測試表。

    CREATE TABLE tt(id int, info text, crt_time timestamp not null);
    
    --- 設定測試表的回呼函數
    SELECT set_init_callback('tt'::regclass, 'f_callback_test'::regproc);
    
    --- 建立分區
    SELECT                                                           
    create_range_partitions('tt'::regclass,                    -- 主表OID
                            'crt_time',                        -- 分區列名
                            '2016-10-25 00:00:00'::timestamp,  -- 開始值
                            interval '1 month',                -- 間隔;interval 類型,用於時間分區表
                            24,                                -- 分多少個區
                            false) ;
  3. 檢查回呼函數是否已調用。

    SELECT * FROM rec_part_ddl;

    返回結果如下:

     id | parent | parttype | partition |      range_max      |      range_min      
    ----+--------+----------+-----------+---------------------+---------------------
      1 | tt     |        2 | tt_1      | 2016-11-25 00:00:00 | 2016-10-25 00:00:00
      2 | tt     |        2 | tt_2      | 2016-12-25 00:00:00 | 2016-11-25 00:00:00
      3 | tt     |        2 | tt_3      | 2017-01-25 00:00:00 | 2016-12-25 00:00:00
      4 | tt     |        2 | tt_4      | 2017-02-25 00:00:00 | 2017-01-25 00:00:00
      5 | tt     |        2 | tt_5      | 2017-03-25 00:00:00 | 2017-02-25 00:00:00
      6 | tt     |        2 | tt_6      | 2017-04-25 00:00:00 | 2017-03-25 00:00:00
      7 | tt     |        2 | tt_7      | 2017-05-25 00:00:00 | 2017-04-25 00:00:00
      8 | tt     |        2 | tt_8      | 2017-06-25 00:00:00 | 2017-05-25 00:00:00
      9 | tt     |        2 | tt_9      | 2017-07-25 00:00:00 | 2017-06-25 00:00:00
     10 | tt     |        2 | tt_10     | 2017-08-25 00:00:00 | 2017-07-25 00:00:00
     11 | tt     |        2 | tt_11     | 2017-09-25 00:00:00 | 2017-08-25 00:00:00
     12 | tt     |        2 | tt_12     | 2017-10-25 00:00:00 | 2017-09-25 00:00:00
     13 | tt     |        2 | tt_13     | 2017-11-25 00:00:00 | 2017-10-25 00:00:00
     14 | tt     |        2 | tt_14     | 2017-12-25 00:00:00 | 2017-11-25 00:00:00
     15 | tt     |        2 | tt_15     | 2018-01-25 00:00:00 | 2017-12-25 00:00:00
     16 | tt     |        2 | tt_16     | 2018-02-25 00:00:00 | 2018-01-25 00:00:00
     17 | tt     |        2 | tt_17     | 2018-03-25 00:00:00 | 2018-02-25 00:00:00
     18 | tt     |        2 | tt_18     | 2018-04-25 00:00:00 | 2018-03-25 00:00:00
     19 | tt     |        2 | tt_19     | 2018-05-25 00:00:00 | 2018-04-25 00:00:00
     20 | tt     |        2 | tt_20     | 2018-06-25 00:00:00 | 2018-05-25 00:00:00
     21 | tt     |        2 | tt_21     | 2018-07-25 00:00:00 | 2018-06-25 00:00:00
     22 | tt     |        2 | tt_22     | 2018-08-25 00:00:00 | 2018-07-25 00:00:00
     23 | tt     |        2 | tt_23     | 2018-09-25 00:00:00 | 2018-08-25 00:00:00
     24 | tt     |        2 | tt_24     | 2018-10-25 00:00:00 | 2018-09-25 00:00:00
    (24 rows)