Time series scenarios usually have the following requirements:
1) Expired historical data must automatically clear up as fast as possible to prevent write-ahead logging (WAL). Either the drop or truncate method can be used.
2) PG supports partition tables. It is better to use partitions. However, this requires the constant creation of partitions. Currently, pg_pathman
automatically creates partitions. But partitions aren't automatically deleted.
So, now the critical question is how to proceed ahead?
Let's take a look at the proposed solution to fix the issue raised in the preceding section:
1) Specify a fixed number of partition tables (similar to list partitions) in a period. For example, there's a need for 24 tables ending with hours, 7 weekly tables, and 60 minute-based tables.
2) Trigger and clean (truncate) old data based on data insertion time.
This method fixes the written code without necessitating any further changes and creates all the necessary data tables at one time.
Step 1) Create all detailed partition tables required in a period. Create a total of six tables, each for 10 minutes as shown below.
create table t1_0 (id int, info text, crt_time timestamp, check (substring(to_char(crt_time,'yyyymmddhh24miss'), 11, 1)='0'));
create table t1_1 (id int, info text, crt_time timestamp, check (substring(to_char(crt_time,'yyyymmddhh24miss'), 11, 1)='1'));
create table t1_2 (id int, info text, crt_time timestamp, check (substring(to_char(crt_time,'yyyymmddhh24miss'), 11, 1)='2'));
create table t1_3 (id int, info text, crt_time timestamp, check (substring(to_char(crt_time,'yyyymmddhh24miss'), 11, 1)='3'));
create table t1_4 (id int, info text, crt_time timestamp, check (substring(to_char(crt_time,'yyyymmddhh24miss'), 11, 1)='4'));
create table t1_5 (id int, info text, crt_time timestamp, check (substring(to_char(crt_time,'yyyymmddhh24miss'), 11, 1)='5'));
Step 2) Create a constraint table or a scheduling table (because it's not possible to trigger a truncate event for each record). Also, clean data generated more than 10 minutes ago at every 10-minute interval as shown below.
create table t_const(crt_time timestamp primary key);
Step 3) Create a trigger for the partition table, truncate data by period, and write the data into the constraint table.
create or replace function tg() returns trigger as $$
declare
begin
insert into t_const values (to_timestamp(substring(to_char(NEW.crt_time,'yyyymmddhh24miss'), 1, 11)||'000','yyyymmddhh24miss')) on conflict (crt_time) do nothing;
return null;
end;
$$ language plpgsql strict;
create trigger tg1 after insert on t1_0 for each row execute procedure tg();
create trigger tg1 after insert on t1_1 for each row execute procedure tg();
create trigger tg1 after insert on t1_2 for each row execute procedure tg();
create trigger tg1 after insert on t1_3 for each row execute procedure tg();
create trigger tg1 after insert on t1_4 for each row execute procedure tg();
create trigger tg1 after insert on t1_5 for each row execute procedure tg();
Step 4) Execute the following commands to create a constraint table trigger to trigger the rules for cleaning detailed partitions.
create or replace function tg_truncate() returns trigger as $$
declare
suffix int := substring(to_char(NEW.crt_time,'yyyymmddhh24miss'), 11, 1)::int;
begin
set lock_timeout = '1s';
for i in 0..5 loop
if i=suffix then
continue;
end if;
if suffix=0 and i=5 then
continue;
end if;
if i=suffix-1 then
continue;
end if;
execute 'truncate t1_'||i;
raise notice 'truncated %', 't1_'||i;
end loop;
return null;
end;
$$ language plpgsql strict;
create trigger tg1 after insert on t_const for each row execute procedure tg_truncate();
With this the automatic cleanup is complete! Do give a try yourself.
The following snippet displays the automatic cleanup.
postgres=# insert into t1_0 values (1,now(),'2018-01-02 10:00:40');
NOTICE: truncated t1_1
NOTICE: truncated t1_2
NOTICE: truncated t1_3
NOTICE: truncated t1_4
INSERT 0 1
If data is written to the same partition later, automatic cleanup will not be triggered. This complies with requirements. Automatic cleanup is triggered only by the first piece of written data.
postgres=# insert into t1_0 values (1,now(),'2018-01-02 10:00:40');
INSERT 0 1
postgres=# insert into t1_0 values (1,now(),'2018-01-02 10:01:50');
INSERT 0 1
Data can be stored by partition. If you use a native partition table, a shared lock is added to the primary table and all its partitions during data writing, and hence the user cannot perform truncate. We recommend using the pg_pathman
partition or directly write data into a partition.
Also, note that the truncate method prevents WAL log generation during data cleanup, and hence triggers are used to enable automatic data cleanup.
Note: The above articles are in Chinese
Compatibility between PostgreSQL and Oracle - Data Sampling and Desensitization
Use of the PostgreSQL Upsert (INSERT ON CONFLICT DO) Function
digoal - March 25, 2020
amap_tech - March 16, 2021
digoal - May 16, 2019
digoal - May 16, 2019
digoal - May 16, 2019
digoal - May 16, 2019
Alibaba Cloud PolarDB for PostgreSQL is an in-house relational database service 100% compatible with PostgreSQL and highly compatible with the Oracle syntax.
Learn MoreAn online MPP warehousing service based on the Greenplum Database open source program
Learn MoreA fully managed NoSQL cloud database service that enables storage of massive amount of structured and semi-structured data
Learn MoreAn on-demand database hosting service for PostgreSQL with automated monitoring, backup and disaster recovery capabilities
Learn MoreMore Posts by digoal