By digoal
Time series data can be stored in rows or columns, but how do I choose which one is suitable for me?
How do column stores spell out the same row of records?
The storage format of PostgreSQL is the HEAP table by default. How can we convert a wide table into a point table?
Example:
Create a wide table (indicating all dimensions (points) collected by the sensor. For example, there are eight dimensions plus a time field).
create table t (ts timestamp primary key, c1 int, c2 int, c3 int, c4 text, c5 int, c6 int, c7 int, c8 int);
Create a point table:
create table t1 (ts timestamp primary key, c1 int);
create table t2 (ts timestamp primary key, c2 int);
create table t3 (ts timestamp primary key, c3 int);
create table t4 (ts timestamp primary key, c4 text);
create table t5 (ts timestamp primary key, c5 int);
create table t6 (ts timestamp primary key, c6 int);
create table t7 (ts timestamp primary key, c7 int);
create table t8 (ts timestamp primary key, c8 int);
Create a rule. Distribute data to a point table when writing to a wide table, with TS fields to string the columns of the same row:
create rule r1 as on insert to t do instead (
insert into t1 values (NEW.ts, NEW.c1);
insert into t2 values (NEW.ts, NEW.c2);
insert into t3 values (NEW.ts, NEW.c3);
insert into t4 values (NEW.ts, NEW.c4);
insert into t5 values (NEW.ts, NEW.c5);
insert into t6 values (NEW.ts, NEW.c6);
insert into t7 values (NEW.ts, NEW.c7);
insert into t8 values (NEW.ts, NEW.c8)
);
Write test and the wide table, and the data will be automatically converted to the point table:
postgres=# insert into t values (now(),1,1,1,'test',1,1,1,1);
INSERT 0 1
postgres=# select * from t;
ts | c1 | c2 | c3 | c4 | c5 | c6 | c7 | c8
----+----+----+----+----+----+----+----+----
(0 rows)
postgres=# select * from t1;
ts | c1
----------------------------+----
2022-06-10 15:22:41.916042 | 1
(1 row)
postgres=# select * from t2;
ts | c2
----------------------------+----
2022-06-10 15:22:41.916042 | 1
(1 row)
postgres=# select * from t3;
ts | c3
----------------------------+----
2022-06-10 15:22:41.916042 | 1
(1 row)
postgres=# select * from t4;
ts | c4
----------------------------+------
2022-06-10 15:22:41.916042 | test
(1 row)
postgres=# select * from t5;
ts | c5
----------------------------+----
2022-06-10 15:22:41.916042 | 1
(1 row)
postgres=# select * from t6;
ts | c6
----------------------------+----
2022-06-10 15:22:41.916042 | 1
(1 row)
postgres=# select * from t7;
ts | c7
----------------------------+----
2022-06-10 15:22:41.916042 | 1
(1 row)
postgres=# select * from t8;
ts | c8
----------------------------+----
2022-06-10 15:22:41.916042 | 1
(1 row)
How is the performance?
postgres=# \timing
Timing is on.
postgres=# insert into t select clock_timestamp(),1,1,1,'test',1,1,1,1 from generate_series(1,100000);
INSERT 0 100000
Time: 791.200 ms
postgres=# insert into t select clock_timestamp(),1,1,1,'test',1,1,1,1 from generate_series(1,300000);
INSERT 0 300000
Time: 2233.316 ms (00:02.233)
public | t | table | postgres | permanent | heap | 8192 bytes |
public | t1 | table | postgres | permanent | heap | 17 MB |
public | t2 | table | postgres | permanent | heap | 17 MB |
public | t3 | table | postgres | permanent | heap | 17 MB |
public | t4 | table | postgres | permanent | heap | 17 MB |
public | t5 | table | postgres | permanent | heap | 17 MB |
public | t6 | table | postgres | permanent | heap | 17 MB |
public | t7 | table | postgres | permanent | heap | 17 MB |
public | t8 | table | postgres | permanent | heap | 17 MB |
If you convert it by points, the processing performance is 1.01 million points per second.
The hardware is listed below:
Apple Notebook
MacBook Pro (13-inch, 2019, Four Thunderbolt 3 ports)
2.4GHz quad-core Intel Core i5
16GB 2133 MHz LPDDR3
256 GB disk
Isolation of PostgreSQL CTID Physical Line Numbers in Concurrent DML Operations
digoal - December 23, 2020
ApsaraDB - October 16, 2024
ApsaraDB - June 7, 2022
ApsaraDB - July 4, 2022
digoal - September 27, 2022
digoal - April 12, 2019
A cost-effective online time series database service that offers high availability and auto scaling features
Learn MoreTSDB is a stable, reliable, and cost-effective online high-performance time series database service.
Learn MoreAlibaba Cloud PolarDB for PostgreSQL is an in-house relational database service 100% compatible with PostgreSQL and highly compatible with the Oracle syntax.
Learn MoreBuild business monitoring capabilities with real time response based on frontend monitoring, application monitoring, and custom business monitoring capabilities
Learn MoreMore Posts by digoal