×
Community Blog Comparison Between PostgreSQL 10 and Greenplum in Spatiotemporal Object Selection and Data Pivoting

Comparison Between PostgreSQL 10 and Greenplum in Spatiotemporal Object Selection and Data Pivoting

In this article, the author discusses spatiotemporal object selection and data pivoting and extensively compares PostgreSQL 10 and Greenplum in this respect with example code.

By digoal

Background

Generally, a person's address may include home, children's home, parents' home, family relative's home, girlfriend's or boyfriend's home, and company.

The address data can be used in many scenarios. Examples:

1) Missing people search

2) Target customer discovery for offline advertising, site selection, and target customer profiling for businesses

3) Location-based user profile pivoting, user group-based location pivoting, and data pivoting based on the above requirements combined with spatiotemporal conditions. Example:

  • You can display the average income and consumption level of each box and grid on a map, for example, within a radius of 100 meters. Then use WIDTH_BUCKET or a user-defined function (UDF) to obtain these box IDs and use shades of colors to indicate the values of income and consumption levels.
  • We can divide the consumption field into sub-fields such as food, clothing, electronic products, and books. Now, isn't it easy for us to determine the store type to set up? Of course, with today’s e-commerce prosperity, we must consider the combination of online and offline stores.

Greenplum is undoubtedly a good choice (simple, straight, and functional) in scenarios such as spatial target selection + user data pivoting. We can also use PostgreSQL 10 in these scenarios. So, which one should we choose?

Features and Selection of Greenplum and PostgreSQL

1) PostgreSQL 10 is Suitable for OLTP-based OLAP-supplemented Scenarios with 10 - 100 TB Data

It is very similar to the scenario covered by Oracle.

Compatible with SQL: 2011, with 1 million+ tpmC.

Supports multi-core parallel computing.

Supports readable and writable OSS external tables.

Supports common and extended data types, such as JSON (B), Hstore (KV), PostGIS spatial database, pgrouting (route, graph search), array, ltree, HLL valuation, smlar, and imgsmlr.

Supports SQL stream computing plug-ins.

Supports time series plug-ins.

Supports B-tree, Hash, GIN, GiST, SP-GiST, Bloom, BRIN, and other indexes.

Supports plpgsql and sql server-side programming.

Supports analytic syntaxes (such as multidimensional computing and window query) and recursive queries (such as tree query and graph search). Supports full-text search, fuzzy query, similarity query, and regular expression query. Supports for array similarity queries and graph similarity queries.

1.1) Applicable Business Scenarios

  • TB-level OLTP (Online Transaction Processing) + OLAP (Online Analytical Processing)
  • Fuzzy query, similarity search, and regular search
  • Full-text Search
  • IoT
  • Streaming Data Processing
  • Social Activity
  • Graph Search
  • Independent Event Analysis
  • Cold/hot Data Separation
  • Asynchronous Messages
  • Similarity search of multi-value types and image characteristics
  • Real-time Data Cleansing
  • GIS Applications
  • Real-time Search of Random Field
  • ... ...

1.2) Main Features: Functionality, Stability, Performance, High Availability, Reliability, Oracle Compatibility, and HTAP

2) HybridDB for PostgreSQL

It is developed based on GPDB, an open-source version of Greenplum and is suitable for PB-level real-time OLAP. It is a very typical data warehouse for massive amounts of data.

Compatible with SQL: 2008, TPC-H, and TPC-DS. It has decades of commercialization experience.

Supports readable and writable OSS external tables.

Supports common and extended data types, such as JSON, PostGIS spatial database, array, and HLL valuation.

Supports bitmap, hash, and B-tree index.

Supports pljava server-side programming

Supports analytic syntaxes (such as multidimensional computing, window query, and MADlib machine learning) and the full-text search syntax.

Supports column-store, row-store, compression, and hybrid storage.

Supports 4-stage aggregation and automatic redistribution between nodes.

Supports horizontal resizing.

2.1) Applicable Business Scenarios

PB+ level real-time analysis (conventional statistics; time, space, attribute — multi-dimensional data pivoting and target selection; as well as random table and random dimensional JOIN).

2.2) Main Features: Analytic SQL Compatibility, Functionality, Stability, Performance, High Availability, and Scalability

Spatial Target Selection + Data Pivoting DEMO

Schema Design

1) Schema design 1: Wide table (when the tag types are less than 1600)

create table tbl_pos (  
  uid int8,   -- 用户ID  
  att1 int8,  -- 用户标签1  
  att2 int8,  -- 用户标签2  
  att3 int8,  -- 用户标签3  
  ....  
  pos1 geometry,  -- 用户家庭位置  
  pos2 geometry,  -- 用户公司位置  
  pos3 geometry,  -- 用户xx位置  
  pos4 geometry,  -- 用户xxx位置  
  ...  
);  

or

create table tbl_tag (  
  uid int8,   -- 用户ID  
  att1 int8,  -- 用户标签1  
  att2 int8,  -- 用户标签2  
  att3 int8,  -- 用户标签3  
  ....  
);  
  
create table tbl_pos (  
  uid int8,  
  pos_att int2,   -- 位置属性,(家、公司、。。。)  
  pos geometry,   -- 位置  
);  

2) Schema design 2: Using JSONB as a tag field when the number of tag types is greater than 1600.

create table tbl_pos (  
  uid int8,   -- 用户ID  
  att jsonb,  -- 用户标签,用JSONB表示  
  ....  
  pos1 geometry,  -- 用户家庭位置  
  pos2 geometry,  -- 用户公司位置  
  pos3 geometry,  -- 用户xx位置  
  pos4 geometry,  -- 用户xxx位置  
  ...  
);  

3) Schema design 3: Storing tags in an array when the number of tag types is greater than 1600.

create table tbl_tag (  
  uid int8,  
  tag text[],  
  ...  
  pos1 geometry,  -- 用户家庭位置  
  pos2 geometry,  -- 用户公司位置  
  pos3 geometry,  -- 用户xx位置  
  pos4 geometry,  -- 用户xxx位置  
  ...  
);  

4) Schema design 4: Tag inverted index design. When the number of tag types exceeds 1600 and the tag value is YES or NO, tags need to be modified through merging and UDFs, only applicable to PostgreSQL.

create table tbl_pos (  
  uid int8,  
  pos1 geometry,  -- 用户家庭位置  
  pos2 geometry,  -- 用户公司位置  
  pos3 geometry,  -- 用户xx位置  
  pos4 geometry,  -- 用户xxx位置  
  ...  
);  
  
create table tbl_tag (  
  tag int,  
  userbits varbit  
);  

Each design has its benefits, drawbacks, and applicable scenarios. We can select a design based on the specific scenario.

Next, we will compare PostgreSQL 10 with the Greenplum database by using the first design. There are 200 numeric tag types and ten common addresses in the use case.

PostgreSQL 10

Data Preparation

1) Create a tag table and a location table, and write data of 1 billion tags and 10 billion location data records.

create extension postgis;  
  
create or replace function ct1 () returns void as $$  
declare  
  sql text := '';  
begin  
  sql := 'create table tbl_tag(uid int8,';  
  for i in 1..200 loop  
    sql := sql||'c'||i||' int2 default random()*32767,';  
  end loop;  
  sql := rtrim(sql, ',');  
  sql := sql||')';  
  execute sql;  
end;  
$$ language plpgsql strict;  
  
  
create table tbl_pos(  
  uid int8 primary key,   
  pos_att int2,    
  pos geometry default st_setsrid(st_makepoint(73+random()*62, 3+random()*50), 4326)  
);  
  
  
create table tbl_pos1 (like tbl_pos including all) inherits (tbl_pos);  
create table tbl_pos2 (like tbl_pos including all) inherits (tbl_pos);  
create table tbl_pos3 (like tbl_pos including all) inherits (tbl_pos);  
create table tbl_pos4 (like tbl_pos including all) inherits (tbl_pos);  
create table tbl_pos5 (like tbl_pos including all) inherits (tbl_pos);  
create table tbl_pos6 (like tbl_pos including all) inherits (tbl_pos);  
create table tbl_pos7 (like tbl_pos including all) inherits (tbl_pos);  
create table tbl_pos8 (like tbl_pos including all) inherits (tbl_pos);  
create table tbl_pos9 (like tbl_pos including all) inherits (tbl_pos);  
create table tbl_pos10 (like tbl_pos including all) inherits (tbl_pos);  
  
  
select ct1();  
  
  
nohup psql -c "insert into tbl_tag select generate_series(1,1000000000)" >/dev/null 2>&1 &  
nohup psql -c "insert into tbl_pos1 select generate_series(1,1000000000),1" >/dev/null 2>&1 &  
nohup psql -c "insert into tbl_pos2 select generate_series(1,1000000000),2" >/dev/null 2>&1 &  
nohup psql -c "insert into tbl_pos3 select generate_series(1,1000000000),3" >/dev/null 2>&1 &  
nohup psql -c "insert into tbl_pos4 select generate_series(1,1000000000),4" >/dev/null 2>&1 &  
nohup psql -c "insert into tbl_pos5 select generate_series(1,1000000000),5" >/dev/null 2>&1 &  
nohup psql -c "insert into tbl_pos6 select generate_series(1,1000000000),6" >/dev/null 2>&1 &  
nohup psql -c "insert into tbl_pos7 select generate_series(1,1000000000),7" >/dev/null 2>&1 &  
nohup psql -c "insert into tbl_pos8 select generate_series(1,1000000000),8" >/dev/null 2>&1 &  
nohup psql -c "insert into tbl_pos9 select generate_series(1,1000000000),9" >/dev/null 2>&1 &  
nohup psql -c "insert into tbl_pos10 select generate_series(1,1000000000),10" >/dev/null 2>&1 &  

2) Organize the location table according to the locations (because the user's location data is calculated by behavior and changes rarely, it can be regarded as semi-static data and is suitable for sorting).

create table tbl_pos_1 (like tbl_pos including all, check (pos_att=1)) inherits(tbl_pos);  
create table tbl_pos_2 (like tbl_pos including all, check (pos_att=2)) inherits(tbl_pos);  
create table tbl_pos_3 (like tbl_pos including all, check (pos_att=3)) inherits(tbl_pos);  
create table tbl_pos_4 (like tbl_pos including all, check (pos_att=4)) inherits(tbl_pos);  
create table tbl_pos_5 (like tbl_pos including all, check (pos_att=5)) inherits(tbl_pos);  
create table tbl_pos_6 (like tbl_pos including all, check (pos_att=6)) inherits(tbl_pos);  
create table tbl_pos_7 (like tbl_pos including all, check (pos_att=7)) inherits(tbl_pos);  
create table tbl_pos_8 (like tbl_pos including all, check (pos_att=8)) inherits(tbl_pos);  
create table tbl_pos_9 (like tbl_pos including all, check (pos_att=9)) inherits(tbl_pos);  
create table tbl_pos_10 (like tbl_pos including all, check (pos_att=10)) inherits(tbl_pos);  
  
-- 10位已经精确到米级, 足够使用  
  
nohup psql -c "insert into tbl_pos_1 select * from tbl_pos1 order by pos_att, st_geohash(pos, 10);  " >/dev/null 2>&1 &    
nohup psql -c "insert into tbl_pos_2 select * from tbl_pos2 order by pos_att, st_geohash(pos, 10);  " >/dev/null 2>&1 &    
nohup psql -c "insert into tbl_pos_3 select * from tbl_pos3 order by pos_att, st_geohash(pos, 10);  " >/dev/null 2>&1 &    
nohup psql -c "insert into tbl_pos_4 select * from tbl_pos4 order by pos_att, st_geohash(pos, 10);  " >/dev/null 2>&1 &    
nohup psql -c "insert into tbl_pos_5 select * from tbl_pos5 order by pos_att, st_geohash(pos, 10);  " >/dev/null 2>&1 &    
nohup psql -c "insert into tbl_pos_6 select * from tbl_pos6 order by pos_att, st_geohash(pos, 10);  " >/dev/null 2>&1 &    
nohup psql -c "insert into tbl_pos_7 select * from tbl_pos7 order by pos_att, st_geohash(pos, 10);  " >/dev/null 2>&1 &    
nohup psql -c "insert into tbl_pos_8 select * from tbl_pos8 order by pos_att, st_geohash(pos, 10);  " >/dev/null 2>&1 &    
nohup psql -c "insert into tbl_pos_9 select * from tbl_pos9 order by pos_att, st_geohash(pos, 10);  " >/dev/null 2>&1 &    
nohup psql -c "insert into tbl_pos_10 select * from tbl_pos10 order by pos_att, st_geohash(pos, 10);" >/dev/null 2>&1 &  

3) Create a Geohash BRIN index on the table to accelerate spatial data selection.

create index idx_tbl_pos1_pos on tbl_pos1 using brin( pos );    
create index idx_tbl_pos2_pos on tbl_pos2 using brin( pos );    
create index idx_tbl_pos3_pos on tbl_pos3 using brin( pos );    
create index idx_tbl_pos4_pos on tbl_pos4 using brin( pos );    
create index idx_tbl_pos5_pos on tbl_pos5 using brin( pos );    
create index idx_tbl_pos6_pos on tbl_pos6 using brin( pos );    
create index idx_tbl_pos7_pos on tbl_pos7 using brin( pos );    
create index idx_tbl_pos8_pos on tbl_pos8 using brin( pos );    
create index idx_tbl_pos9_pos on tbl_pos9 using brin( pos );    
create index idx_tbl_pos10_pos on tbl_pos10 using brin( pos );    

Remove the inheritance of the unorganized table (we only need to retain the organized data).

alter table tbl_pos1 no inherit tbl_pos;  
alter table tbl_pos2 no inherit tbl_pos;  
alter table tbl_pos3 no inherit tbl_pos;  
alter table tbl_pos4 no inherit tbl_pos;  
alter table tbl_pos5 no inherit tbl_pos;  
alter table tbl_pos6 no inherit tbl_pos;  
alter table tbl_pos7 no inherit tbl_pos;  
alter table tbl_pos8 no inherit tbl_pos;  
alter table tbl_pos9 no inherit tbl_pos;  
alter table tbl_pos10 no inherit tbl_pos;  

4) Create a B-tree index for tag data.

for ((i=1;i<=200;i++))   
do  
  nohup psql -c "set maintenance_work_mem='512MB'; create index idx_tbl_tag_$i on tbl_tag (c$i);" >/dev/null 2>&1 &  
done  

5) Space usage

Data: The size of the tag table of 1 billion records is 424 GB, and that of the location table of 10 billion records is 640 GB.

postgres=# \dt+  
                           List of relations  
 Schema |      Name       | Type  |  Owner   |    Size    | Description   
--------+-----------------+-------+----------+------------+-------------  
 public | tbl_pos         | table | postgres | 8192 bytes |   
 public | tbl_pos_1       | table | postgres | 64 GB      |   
 public | tbl_pos_10      | table | postgres | 64 GB      |   
 public | tbl_pos_2       | table | postgres | 64 GB      |   
 public | tbl_pos_3       | table | postgres | 64 GB      |   
 public | tbl_pos_4       | table | postgres | 64 GB      |   
 public | tbl_pos_5       | table | postgres | 64 GB      |   
 public | tbl_pos_6       | table | postgres | 64 GB      |   
 public | tbl_pos_7       | table | postgres | 64 GB      |   
 public | tbl_pos_8       | table | postgres | 64 GB      |   
 public | tbl_pos_9       | table | postgres | 64 GB      |   
 public | tbl_tag         | table | postgres | 424 GB     |   

Index:

The tag table’s index size is 21 GB, and the total size is 4200 GB.

                                     List of relations  
 Schema |         Name         | Type  |  Owner   |      Table      |  Size  | Description   
--------+----------------------+-------+----------+-----------------+--------+-------------  
 public | idx_tbl_tag_1        | index | postgres | tbl_tag         | 21 GB  |   

The location table’s BRIN index size is 2.7 MB, and the table’s total size is 27 MB.

postgres=# \di+  
                                     List of relations  
 Schema |         Name         | Type  |  Owner   |      Table      |  Size   | Description   
--------+----------------------+-------+----------+-----------------+---------+-------------  
 public | idx_tbl_pos10_pos    | index | postgres | tbl_pos10       | 2728 kB |   
 public | idx_tbl_pos1_pos     | index | postgres | tbl_pos1        | 2728 kB |   
 public | idx_tbl_pos2_pos     | index | postgres | tbl_pos2        | 2728 kB |   
 public | idx_tbl_pos3_pos     | index | postgres | tbl_pos3        | 2728 kB |   
 public | idx_tbl_pos4_pos     | index | postgres | tbl_pos4        | 2728 kB |   
 public | idx_tbl_pos5_pos     | index | postgres | tbl_pos5        | 2728 kB |   
 public | idx_tbl_pos6_pos     | index | postgres | tbl_pos6        | 2728 kB |   
 public | idx_tbl_pos7_pos     | index | postgres | tbl_pos7        | 2728 kB |   
 public | idx_tbl_pos8_pos     | index | postgres | tbl_pos8        | 2728 kB |   
 public | idx_tbl_pos9_pos     | index | postgres | tbl_pos9        | 2728 kB | 

Performance Test of Space and Attribute-based Target Selection + Data Pivoting

1) For 10 billion records of spatial data, it takes 400 milliseconds to select about 10 million persons by space.

实际可以按游标返回  
  
postgres=# select count(*) from tbl_pos where pos_att=1 and st_within(pos, st_setsrid(st_makebox2d(st_makepoint(120,5), st_makepoint(125.5,10.5)),4326));  
  count    
---------  
 9757154  
(1 row)  
Time: 399.846 ms  

By performing parallel scanning using the BRIN index, the implementation code is as follows:

postgres=# explain select count(*) from tbl_pos where pos_att=1 and st_within(pos, st_setsrid(st_makebox2d(st_makepoint(120,5), st_makepoint(125.5,10.5)),4326));  
                                           
        QUERY PLAN         
-------------------------------------------------------------------------------------------  
 Finalize Aggregate  (cost=10016617202.13..10016617202.14 rows=1 width=8)  
   ->  Gather  (cost=10016617202.04..10016617202.05 rows=32 width=8)  
         Workers Planned: 32  
         ->  Partial Aggregate  (cost=10016617202.04..10016617202.05 rows=1 width=8)  
               ->  Parallel Append  (cost=0.00..10016617175.99 rows=10418 width=0)  
                     ->  Parallel Seq Scan on tbl_pos  (cost=10000000000.00..10000000000.00 rows=1 width=0)  
                           Filter: (('0103000020E610000001000000050000000000000000005E4000000000000014400000000000005E4000000000000025400000000000605F4000000000000025400000000000605F4000000000000014400000000000005E400000000000001440'::geometry ~ pos) AND (pos_att = 1) AND _st_contains('0103000020E610000001000000050000000000000000005E4000000000000014400000000000005E4000000000000025400000000000605F4000000000000025400000000000605F4000000000000014400000000000005E400000000000001440'::geometry, pos))  
                     ->  Parallel Bitmap Heap Scan on tbl_pos_1  (cost=2591.99..16617175.99 rows=10417 width=0)  
                           Recheck Cond: ('0103000020E610000001000000050000000000000000005E4000000000000014400000000000005E4000000000000025400000000000605F4000000000000025400000000000605F4000000000000014400000000000005E400000000000001440'::geometry ~ pos)  
                           Filter: ((pos_att = 1) AND _st_contains('0103000020E610000001000000050000000000000000005E4000000000000014400000000000005E4000000000000025400000000000605F4000000000000025400000000000605F4000000000000014400000000000005E400000000000001440'::geometry, pos))  
                           ->  Bitmap Index Scan on idx_tbl_pos1_posbn  (cost=0.00..2508.66 rows=1000000000 width=0)  
                                 Index Cond: ('0103000020E610000001000000050000000000000000005E4000000000000014400000000000005E4000000000000025400000000000605F4000000000000025400000000000605F4000000000000014400000000000005E400000000000001440'::geometry ~ pos)  
(12 rows)  

2) For 10 billion records of spatial data, it takes 7 seconds to select about 10 million people by space, JOIN 1 billion records of tag data, and pivot tag attributes of these people.

Using parallel BRIN scanning and parallel nested-loop JOIN, the implementation code is as follows:

postgres=# explain analyze select c1,count(*),avg(c2),max(c3) from tbl_tag t2 join (select uid from tbl_pos where pos_att=1 and st_within(pos, st_setsrid(st_makebox2d(st_makepoint(120,5), st_makepoint(125.5,10.5)),4326))) t1 on (t1.uid=t2.uid) group by c1;  
                                                                         
              QUERY PLAN                  
--------------------------------------------------------------------------  
 Finalize GroupAggregate  (cost=10016650419.09..10016663638.78 rows=32873 width=44) (actual time=5417.105..6404.328 rows=32768 loops=1)  
   Group Key: t2.c1  
   ->  Gather Merge  (cost=10016650419.09..10016659894.42 rows=333344 width=44) (actual time=5417.071..6212.057 rows=1081163 loops=1)  
         Workers Planned: 32  
         Workers Launched: 32  
         ->  Partial GroupAggregate  (cost=10016650418.26..10016650652.64 rows=10417 width=44) (actual time=5392.695..5506.923 rows=32763 loops=33)  
               Group Key: t2.c1  
               ->  Sort  (cost=10016650418.26..10016650444.30 rows=10417 width=6) (actual time=5392.676..5442.197 rows=295671 loops=33)  
                     Sort Key: t2.c1  
                     Sort Method: quicksort  Memory: 30914kB  
                     ->  Nested Loop  (cost=0.57..10016649723.09 rows=10417 width=6) (actual time=8.413..5277.270 rows=295671 loops=33)  
                           ->  Parallel Append  (cost=0.00..10016617175.99 rows=10418 width=8) (actual time=8.342..407.141 rows=295671 loops=33)  
                                 ->  Parallel Seq Scan on tbl_pos  (cost=10000000000.00..10000000000.00 rows=1 width=8) (actual time=0.000..0.000 rows=0 loops=33)  
                                       Filter: (('0103000020E610000001000000050000000000000000005E4000000000000014400000000000005E4000000000000025400000000000605F4000000000000025400000000000605F4000000000000014400000000000005E400000000000001440'::geometry ~ pos) AND (pos_att = 1) AND _st_contains('0103000020E610000001000000050000000000000000005E4000000000000014400000000000005E4000000000000025400000000000605F4000000000000025400000000000605F4000000000000014400000000000005E400000000000001440'::geometry, pos))  
                                 ->  Parallel Bitmap Heap Scan on tbl_pos_1  (cost=2591.99..16617175.99 rows=10417 width=8) (actual time=8.341..381.660 rows=295671 loops=33)  
                                       Recheck Cond: ('0103000020E610000001000000050000000000000000005E4000000000000014400000000000005E4000000000000025400000000000605F4000000000000025400000000000605F4000000000000014400000000000005E400000000000001440'::geometry ~ pos)  
                                       Rows Removed by Index Recheck: 32474  
                                       Filter: ((pos_att = 1) AND _st_contains('0103000020E610000001000000050000000000000000005E4000000000000014400000000000005E4000000000000025400000000000605F4000000000000025400000000000605F4000000000000014400000000000005E400000000000001440'::geometry, pos))  
                                       Heap Blocks: lossy=3688  
                                       ->  Bitmap Index Scan on idx_tbl_pos1_posbn  (cost=0.00..2508.66 rows=1000000000 width=0) (actual time=19.182..19.182 rows=902400 loops=1)  
                                             Index Cond: ('0103000020E610000001000000050000000000000000005E4000000000000014400000000000005E4000000000000025400000000000605F4000000000000025400000000000605F4000000000000014400000000000005E400000000000001440'::geometry ~ pos)  
                           ->  Index Scan using idx on tbl_tag t2  (cost=0.57..3.11 rows=1 width=14) (actual time=0.016..0.016 rows=1 loops=9757154)  
                                 Index Cond: (uid = tbl_pos.uid)  
 Planning time: 0.690 ms  
 Execution time: 7098.662 ms  
(25 rows)  

3) For 1 billion tags, it takes 14.5 seconds to select about 10 million people by tag and perform parallel bitmap or scanning.

postgres=# select count(*) from tbl_tag where c1=1 or c2 between 1 and 100 or c13=100 or c4 between 1 and 200;  
  count    
---------  
 9196602  
(1 row)  
Time: 14491.705 ms (00:14.492)  

The implementation code is as follows:

postgres=# explain analyze select count(*) from tbl_tag where c1=1 or c2 between 1 and 100 or c13=100 or c4 between 1 and 200;  
                                                                             QUERY PLAN                                                                               
--------------------------------------------------------------------------------------------------------------------------------------------------------------------  
 Finalize Aggregate  (cost=10080648.93..10080648.94 rows=1 width=8) (actual time=16170.482..16170.482 rows=1 loops=1)  
   ->  Gather  (cost=10080648.84..10080648.85 rows=32 width=8) (actual time=16148.118..16170.475 rows=33 loops=1)  
         Workers Planned: 32  
         Workers Launched: 32  
         ->  Partial Aggregate  (cost=10080648.84..10080648.85 rows=1 width=8) (actual time=16135.754..16135.754 rows=1 loops=33)  
               ->  Parallel Bitmap Heap Scan on tbl_tag  (cost=132192.09..10079940.37 rows=283386 width=0) (actual time=2174.588..16107.422 rows=278685 loops=33)  
                     Recheck Cond: ((c1 = 1) OR ((c2 >= 1) AND (c2 <= 100)) OR (c13 = 100) OR ((c4 >= 1) AND (c4 <= 200)))  
                     Rows Removed by Index Recheck: 4311619  
                     Heap Blocks: exact=3516 lossy=293793  
                     ->  BitmapOr  (cost=132192.09..132192.09 rows=9087533 width=0) (actual time=2094.773..2094.773 rows=0 loops=1)  
                           ->  Bitmap Index Scan on idx_tbl_tag_1  (cost=0.00..333.62 rows=30020 width=0) (actual time=9.718..9.718 rows=30332 loops=1)  
                                 Index Cond: (c1 = 1)  
                           ->  Bitmap Index Scan on idx_tbl_tag_2  (cost=0.00..43418.50 rows=3200783 width=0) (actual time=787.952..787.952 rows=3053594 loops=1)  
                                 Index Cond: ((c2 >= 1) AND (c2 <= 100))  
                           ->  Bitmap Index Scan on idx_tbl_tag_13  (cost=0.00..332.99 rows=29936 width=0) (actual time=3.662..3.662 rows=30554 loops=1)  
                                 Index Cond: (c13 = 100)  
                           ->  Bitmap Index Scan on idx_tbl_tag_4  (cost=0.00..79038.62 rows=5826795 width=0) (actual time=1293.437..1293.437 rows=6101279 loops=1)  
                                 Index Cond: ((c4 >= 1) AND (c4 <= 200))  
 Planning time: 0.289 ms  
 Execution time: 16733.719 ms  
(20 rows)  

The multidimensional MetaScan feature of Alibaba Cloud can solve the problem of excessive scanned data blocks and reduce the time to less than 500 milliseconds.

4) For 1 billion tags, it takes 203 seconds to select about 10 million people by tag, pivot space attributes of these people, and perform hash join and parallelization.

explain (analyze,verbose,timing,costs,buffers)   
select st_geohash(t1.pos,6), count(*) from   
  tbl_pos_1 t1 join (select uid from tbl_tag where c1=1 or c2 between 1 and 100 or c13=100 or c4 between 1 and 200) t2  
  on (t1.uid=t2.uid)   
group by st_geohash(t1.pos, 6);   
  
                                                                                         QUERY PLAN                                                                                            
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------  
 Finalize GroupAggregate  (cost=292262288.63..292678044.97 rows=9068342 width=40) (actual time=187392.324..197157.012 rows=8421096 loops=1)  
   Output: (st_geohash(t1.pos, 6)), count(*)  
   Group Key: (st_geohash(t1.pos, 6))  
   Buffers: shared hit=38723554 read=635308  
   ->  Gather Merge  (cost=292262288.63..292519348.94 rows=9068352 width=40) (actual time=187392.317..194293.700 rows=9171920 loops=1)  
         Output: (st_geohash(t1.pos, 6)), (PARTIAL count(*))  
         Workers Planned: 32  
         Workers Launched: 32  
         Buffers: shared hit=38723554 read=635308  
         ->  Partial GroupAggregate  (cost=292262287.80..292267955.52 rows=283386 width=40) (actual time=187348.727..187501.151 rows=277937 loops=33)  
               Output: (st_geohash(t1.pos, 6)), PARTIAL count(*)  
               Group Key: (st_geohash(t1.pos, 6))  
               Buffers: shared hit=1272770058 read=11675191  
               Worker 0: actual time=187342.771..187498.250 rows=282452 loops=1  
                 Buffers: shared hit=39055272 read=348022  
...... 并行聚合  
               ->  Sort  (cost=292262287.80..292262996.26 rows=283386 width=32) (actual time=187348.715..187401.757 rows=278685 loops=33)  
                     Output: (st_geohash(t1.pos, 6))  
                     Sort Key: (st_geohash(t1.pos, 6))  
                     Sort Method: quicksort  Memory: 25570kB  
                     Buffers: shared hit=1272770058 read=11675191  
                     Worker 0: actual time=187342.758..187396.443 rows=283206 loops=1  
                       Buffers: shared hit=39055272 read=348022  
...... 并行SORT  
                     ->  Hash Join  (cost=10413383.91..292236623.78 rows=283386 width=32) (actual time=79890.153..186716.320 rows=278685 loops=33)  
                           Output: st_geohash(t1.pos, 6)  
                           Hash Cond: (t1.uid = tbl_tag.uid)  
                           Buffers: shared hit=1272769802 read=11675191  
                           Worker 0: actual time=81406.113..186712.149 rows=283206 loops=1  
                             Buffers: shared hit=39055264 read=348022  
...... 并行索引扫描  
                           ->  Parallel Index Scan using idx1 on public.tbl_pos_1 t1  (cost=0.57..281390010.62 rows=31250000 width=40) (actual time=0.040..92949.279 rows=30303030 loops=33)  
                                 Output: t1.uid, t1.pos_att, t1.pos  
                                 Buffers: shared hit=991056941 read=11675191  
                                 Worker 0: actual time=0.078..91228.338 rows=30782430 loops=1  
                                   Buffers: shared hit=30518510 read=348022  
...... 并行HASH  
                           ->  Hash  (cost=10300029.06..10300029.06 rows=9068342 width=8) (actual time=77789.991..77789.991 rows=9196602 loops=33)  
                                 Output: tbl_tag.uid  
                                 Buckets: 16777216  Batches: 1  Memory Usage: 490315kB  
                                 Buffers: shared hit=281712413  
                                 Worker 0: actual time=79153.913..79153.913 rows=9196602 loops=1  
                                   Buffers: shared hit=8536740  
...... 并行bitmap扫描  
                                 ->  Bitmap Heap Scan on public.tbl_tag  (cost=132192.09..10300029.06 rows=9068342 width=8) (actual time=44896.981..74587.551 rows=9196602 loops=33)  
                                       Output: tbl_tag.uid  
                                       Recheck Cond: ((tbl_tag.c1 = 1) OR ((tbl_tag.c2 >= 1) AND (tbl_tag.c2 <= 100)) OR (tbl_tag.c13 = 100) OR ((tbl_tag.c4 >= 1) AND (tbl_tag.c4 <= 200)))  
                                       Heap Blocks: exact=8511538  
                                       Buffers: shared hit=281712413  
                                       Worker 0: actual time=45358.544..75896.906 rows=9196602 loops=1  
                                         Buffers: shared hit=8536740  
...... 并行bitmap扫描  
                                       ->  BitmapOr  (cost=132192.09..132192.09 rows=9087533 width=0) (actual time=38429.522..38429.522 rows=0 loops=33)  
                                             Buffers: shared hit=831659  
                                             Worker 0: actual time=38869.151..38869.151 rows=0 loops=1  
                                               Buffers: shared hit=25202  
...... 并行bitmap扫描  
                                             ->  Bitmap Index Scan on idx_tbl_tag_1  (cost=0.00..333.62 rows=30020 width=0) (actual time=9.922..9.922 rows=30332 loops=33)  
                                                   Index Cond: (tbl_tag.c1 = 1)  
                                                   Buffers: shared hit=2999  
                                                   Worker 0: actual time=10.045..10.045 rows=30332 loops=1  
                                                     Buffers: shared hit=91  
...... 并行bitmap扫描  
                                             ->  Bitmap Index Scan on idx_tbl_tag_2  (cost=0.00..43418.50 rows=3200783 width=0) (actual time=9529.886..9529.886 rows=3053594 loops=33)  
                                                   Index Cond: ((tbl_tag.c2 >= 1) AND (tbl_tag.c2 <= 100))  
                                                   Buffers: shared hit=275483  
                                                   Worker 0: actual time=9710.639..9710.639 rows=3053594 loops=1  
                                                     Buffers: shared hit=8348  
...... 并行bitmap扫描  
                                             ->  Bitmap Index Scan on idx_tbl_tag_13  (cost=0.00..332.99 rows=29936 width=0) (actual time=9019.691..9019.691 rows=30554 loops=33)  
                                                   Index Cond: (tbl_tag.c13 = 100)  
                                                   Buffers: shared hit=2903  
                                                   Worker 0: actual time=9143.024..9143.024 rows=30554 loops=1  
                                                     Buffers: shared hit=88  
...... 并行bitmap扫描  
                                             ->  Bitmap Index Scan on idx_tbl_tag_4  (cost=0.00..79038.62 rows=5826795 width=0) (actual time=19870.013..19870.013 rows=6101279 loops=33)  
                                                   Index Cond: ((tbl_tag.c4 >= 1) AND (tbl_tag.c4 <= 200))  
                                                   Buffers: shared hit=550274  
                                                   Worker 0: actual time=20005.432..20005.432 rows=6101279 loops=1  
                                                     Buffers: shared hit=16675  
......  
 Planning time: 0.302 ms  
 Execution time: 203637.896 ms  
(754 rows)  

Greenplum

Data Preparation

create or replace function ct1 () returns void as $$  
declare  
  sql text := '';  
begin  
  sql := 'create table tbl_tag(uid int8,';  
  for i in 1..200 loop  
    sql := sql||'c'||i||' int2 default random()*32767,';  
  end loop;  
  sql := rtrim(sql, ',');  
  sql := sql||') with (APPENDONLY=true, ORIENTATION=column, COMPRESSTYPE=zlib, CHECKSUM=false)';  
  execute sql;  
end;  
$$ language plpgsql strict;  
  
select ct1();  
  
create table tbl_pos(  
  uid int8,   
  pos_att int2,  
  pos geometry default st_setsrid(st_makepoint(73+random()*62, 3+random()*50), 4326)  
)  
with (APPENDONLY=true, ORIENTATION=row, COMPRESSTYPE=zlib, CHECKSUM=false)  
partition by list (pos_att)  
(  
  partition p1 values (1),  
  partition p2 values (2),  
  partition p3 values (3),  
  partition p4 values (4),  
  partition p5 values (5),  
  partition p6 values (6),  
  partition p7 values (7),  
  partition p8 values (8),  
  partition p9 values (9),  
  partition p10 values (10)  
)  
;  
  
nohup psql -c "insert into tbl_tag select generate_series(1,1000000000)" >/dev/null 2>&1 &  
  
nohup psql -c "insert into tbl_pos select generate_series(1,1000000000),1" >/dev/null 2>&1 &  
nohup psql -c "insert into tbl_pos select generate_series(1,1000000000),2" >/dev/null 2>&1 &  
nohup psql -c "insert into tbl_pos select generate_series(1,1000000000),3" >/dev/null 2>&1 &  
nohup psql -c "insert into tbl_pos select generate_series(1,1000000000),4" >/dev/null 2>&1 &  
nohup psql -c "insert into tbl_pos select generate_series(1,1000000000),5" >/dev/null 2>&1 &  
nohup psql -c "insert into tbl_pos select generate_series(1,1000000000),6" >/dev/null 2>&1 &  
nohup psql -c "insert into tbl_pos select generate_series(1,1000000000),7" >/dev/null 2>&1 &  
nohup psql -c "insert into tbl_pos select generate_series(1,1000000000),8" >/dev/null 2>&1 &  
nohup psql -c "insert into tbl_pos select generate_series(1,1000000000),9" >/dev/null 2>&1 &  
nohup psql -c "insert into tbl_pos select generate_series(1,1000000000),10" >/dev/null 2>&1 &  
nohup psql -c "copy (select uid,pos_att,st_geohash(pos, 10) from tbl_pos) to stdout"|psql -c "copy tbl_pos from stdin" >/dev/null 2>&1 &

You can use Alibaba Cloud MetaScan, a feature similar to the PostgreSQL BRIN index, to accelerate the filtering of tbl_pos.pos fields.

Performance Test of Space and Attribute-based Target Selection + Data Pivoting

The PostGIS spatial plug-in is not used in the Greenplum database test. We use the Geohash text code instead. The following is the test result:

1) For 10 billion pieces of spatial data, it takes 21 seconds to circle about 10 million people by space.

select count(*) from tbl_pos where pos_att=1 and st_within(pos, st_setsrid(st_makebox2d(st_makepoint(120,5), st_makepoint(125.5,10.5)),4326));  
  
postgres=# select count(*) from tbl_pos where pos_att=1 and pos between 't9m' and 'tbbd' ;  
  count    
---------  
 9635855  
(1 row)  
  
Time: 21371.543 ms  

2) For 10 billion pieces of spatial data, it takes 29.3 seconds to circle about 10 million people by space, JOIN 1 billion pieces of tag data, and pivot tag attributes of these people.

explain analyze select c1,count(*),avg(c2),max(c3) from tbl_tag t2 join (select uid from tbl_pos where pos_att=1 and st_within(pos, st_setsrid(st_makebox2d(st_makepoint(120,5), st_makepoint(125.5,10.5)),4326))) t1 on (t1.uid=t2.uid) group by c1;  
  
postgres=# explain analyze select c1,count(*),avg(c2),max(c3) from tbl_tag t2 join (select uid from tbl_pos where pos_att=1 and pos between 't9m' and 'tbbd') t1 on (t1.uid=t2.uid) group by c1;  
                                                                             QUERY PLAN                                                                               
--------------------------------------------------------------------------------------------------------------------------------------------------------------------  
 Gather Motion 48:1  (slice2; segments: 48)  (cost=43547383.92..43547955.26 rows=32648 width=44)  
   Rows out:  32768 rows at destination with 28854 ms to end, start offset by 448 ms.  
   ->  HashAggregate  (cost=43547383.92..43547955.26 rows=681 width=44)  
         Group By: t2.c1  
         Rows out:  Avg 682.7 rows x 48 workers.  Max 689 rows (seg6) with 0.001 ms to first row, 11625 ms to end, start offset by 466 ms.  
         ->  Redistribute Motion 48:48  (slice1; segments: 48)  (cost=43546078.00..43546730.96 rows=681 width=44)  
               Hash Key: t2.c1  
               Rows out:  Avg 32694.9 rows x 48 workers at destination.  Max 33008 rows (seg31) with 17172 ms to end, start offset by 494 ms.  
               ->  HashAggregate  (cost=43546078.00..43546078.00 rows=681 width=44)  
                     Group By: t2.c1  
                     Rows out:  Avg 32694.9 rows x 48 workers.  Max 32719 rows (seg22) with 0.009 ms to first row, 82 ms to end, start offset by 491 ms.  
                     ->  Hash Join  (cost=18492191.00..43506178.00 rows=83125 width=6)  
                           Hash Cond: t2.uid = postgres.tbl_pos.uid  
                           Rows out:  Avg 200747.0 rows x 48 workers.  Max 201863 rows (seg25) with 0.044 ms to first row, 25419 ms to end, start offset by 494 ms.  
                           Executor memory:  6274K bytes avg, 6309K bytes max (seg25).  
                           Work_mem used:  6274K bytes avg, 6309K bytes max (seg25). Workfile: (0 spilling, 0 reused)  
                           ->  Append-only Columnar Scan on tbl_tag t2  (cost=0.00..22464112.00 rows=20833334 width=14)  
                                 Rows out:  0 rows (seg0) with 0.004 ms to end, start offset by 501 ms.  
                           ->  Hash  (cost=17993441.00..17993441.00 rows=831251 width=8)  
                                 Rows in:  (No row requested) 0 rows (seg0) with 0 ms to end.  
                                 ->  Append  (cost=0.00..17993441.00 rows=831251 width=8)  
                                       Rows out:  0 rows (seg0) with 15 ms to end, start offset by 503 ms.  
                                       ->  Append-only Scan on tbl_pos_1_prt_p1 tbl_pos  (cost=0.00..17993441.00 rows=831251 width=8)  
                                             Filter: pos_att = 1 AND pos >= 't9m'::text AND pos <= 'tbbd'::text  
                                             Rows out:  Avg 200747.0 rows x 48 workers.  Max 201863 rows (seg25) with 48 ms to end, start offset by 494 ms.  
 Slice statistics:  
   (slice0)    Executor memory: 501K bytes.  
   (slice1)    Executor memory: 1613K bytes avg x 48 workers, 1613K bytes max (seg0).  Work_mem: 6309K bytes max.  
   (slice2)    Executor memory: 524K bytes avg x 48 workers, 524K bytes max (seg0).  
 Statement statistics:  
   Memory used: 128000K bytes  
 Settings:  optimizer=off  
 Optimizer status: legacy query optimizer  
 Total runtime: 29302.351 ms  
(34 rows)  
  
Time: 29306.897 ms  

3) For 1 billion pieces of tag data, it takes 3.4 seconds to select about 10 million people by tag.

select count(*) from tbl_tag where c1=1 or c2 between 1 and 100 or c13=100 or c4 between 1 and 200;  
  
postgres=# select count(*) from tbl_tag where c1=1 or c2 between 1 and 100 or c13=100 or c4 between 1 and 200;  
  count    
---------  
 9198749  
(1 row)  
  
Time: 3426.337 ms  

4) For 1 billion pieces of tag data, it takes 26.2 seconds to select about 10 million people by tag and pivot tag attributes of these people.

explain (analyze,verbose,timing,costs,buffers)   
select st_geohash(t1.pos,6), count(*) from   
  tbl_pos_1 t1 join (select uid from tbl_tag where c1=1 or c2 between 1 and 100 or c13=100 or c4 between 1 and 200) t2  
  on (t1.uid=t2.uid)   
group by st_geohash(t1.pos, 6);   
  
postgres=# explain analyze   
postgres-# select substring(pos,1,6), count(*) from   
postgres-#   tbl_pos t1 join (select uid from tbl_tag where c1=1 or c2 between 1 and 100 or c13=100 or c4 between 1 and 200) t2  
postgres-#   on (t1.uid=t2.uid and t1.pos_att=1)   
postgres-# group by substring(pos,1,6);   
                                                                                     QUERY PLAN                                                                                       
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------  
 Gather Motion 48:1  (slice2; segments: 48)  (cost=53124325.91..53135748.71 rows=913824 width=40)  
   Rows out:  8421444 rows at destination with 25714 ms to end, start offset by 449 ms.  
   ->  HashAggregate  (cost=53124325.91..53135748.71 rows=19038 width=40)  
         Group By: "?column1?"  
         Rows out:  Avg 175446.8 rows x 48 workers.  Max 176265 rows (seg2) with 0.001 ms to first row, 8243 ms to end, start offset by 466 ms.  
         ->  Redistribute Motion 48:48  (slice1; segments: 48)  (cost=53090057.51..53110618.55 rows=19038 width=40)  
               Hash Key: unnamed_attr_1  
               Rows out:  Avg 191284.2 rows x 48 workers at destination.  Max 192297 rows (seg37) with 15634 ms to end, start offset by 529 ms.  
               ->  HashAggregate  (cost=53090057.51..53092342.07 rows=19038 width=40)  
                     Group By: "substring"(t1.pos, 1, 6)  
                     Rows out:  Avg 191284.2 rows x 48 workers.  Max 191966 rows (seg1) with 0.006 ms to first row, 134 ms to end, start offset by 468 ms.  
                     ->  Hash Join  (cost=37578340.02..53085488.39 rows=19039 width=11)  
                           Hash Cond: t1.uid = tbl_tag.uid  
                           Rows out:  Avg 191640.6 rows x 48 workers.  Max 192331 rows (seg1) with 0.039 ms to first row, 18171 ms to end, start offset by 468 ms.  
                           Executor memory:  5989K bytes avg, 6011K bytes max (seg1).  
                           Work_mem used:  5989K bytes avg, 6011K bytes max (seg1). Workfile: (0 spilling, 0 reused)  
                           ->  Append  (cost=0.00..12993441.00 rows=20833334 width=19)  
                                 Rows out:  0 rows (seg0) with 1228 ms to end, start offset by 531 ms.  
                                 ->  Append-only Scan on tbl_pos_1_prt_p1 t1  (cost=0.00..12993441.00 rows=20833334 width=19)  
                                       Filter: pos_att = 1  
                                       Rows out:  Avg 20833333.3 rows x 48 workers.  Max 20833547 rows (seg37) with 0.005 ms to first row, 0.006 ms to end, start offset by 531 ms.  
                           ->  Hash  (cost=37464112.00..37464112.00 rows=190381 width=8)  
                                 Rows in:  (No row requested) 0 rows (seg0) with 0 ms to end.  
                                 ->  Append-only Columnar Scan on tbl_tag  (cost=0.00..37464112.00 rows=190381 width=8)  
                                       Filter: c1 = 1 OR (c2 >= 1 AND c2 <= 100) OR c13 = 100 OR (c4 >= 1 AND c4 <= 200)  
                                       Rows out:  0 rows (seg0) with 57 ms to end, start offset by 528 ms.  
 Slice statistics:  
   (slice0)    Executor memory: 487K bytes.  
   (slice1)    Executor memory: 1725K bytes avg x 48 workers, 1725K bytes max (seg0).  Work_mem: 6011K bytes max.  
   (slice2)    Executor memory: 524K bytes avg x 48 workers, 524K bytes max (seg0).  
 Statement statistics:  
   Memory used: 128000K bytes  
 Settings:  optimizer=off  
 Optimizer status: legacy query optimizer  
 Total runtime: 26166.164 ms  
(35 rows)  
  
Time: 26170.031 ms  

Summary

Space

Database Table Record count Size
PostgreSQL 10 Tag table (201 fields) 1 billion 424 GB
Greenplum Tag table (201 fields) 1 billion 369 GB
PostgreSQL 10 Location table (12 fields) 10 billion 640 GB
Greenplum Location table (12 fields) 10 billion 150 GB
Database Index Index type SIZE
PostgreSQL 10 Tag table B-tree 4200 GB
PostgreSQL 10 Location table BRIN 27 MB

Performance

Database Service Time spent
PostgreSQL 10 Select about 10 million people by space from 10 billion spatial data records. 400 milliseconds
Greenplum Select about 10 million people by space from 10 billion spatial data records. 21 seconds
PostgreSQL 10 Select about 10 million people by space from 10 billion spatial data records. JOIN 1 billion tag data records, and pivot the tag attributes of these people. 7 seconds
Greenplum Select about 10 million people by space from 10 billion spatial data records. JOIN 1 billion tag data records, and pivot the tag attributes of these people. 29.3 seconds
PostgreSQL 10 Select about 10 million people by tag from 1 billion tag data records. 14.5 seconds (can be reduced to less than 500 ms through MetaScan)
Greenplum Select about 10 million people by tag from 1 billion tag data records. 3.4 seconds
PostgreSQL 10 Select about 10 million people by tag from 1 billion tag data records and pivot space attributes of these people. 203 seconds (performance can be greatly improved through the merge, join, and partial scan of PostgreSQL 11)
Greenplum Select about 10 million people by tag from 1 billion tag data records and pivot space attributes of these people. 26.2 seconds

Future Optimization Points of PostgreSQL 10

With the BRIN index and bitmap scan, PostgreSQL 10’s performance has exceeded that of Greenplum in some scenarios based on the same resources.

With the introduction of the column-store engine, parallel hash patches, and range merge join patches, PostgreSQL 10 will provide similar or even better performance than Greenplum in other scenarios based on the same resources.

1) Parallel hash join: https://commitfest.postgresql.org/14/871/

2) Range merge join: https://commitfest.postgresql.org/14/1106/

3) Parallel append scan: https://commitfest.postgresql.org/14/987/

The strength of Greenplum lies in the processing of larger data volumes. For example, Greenplum supports PB-level data analysis in real time through discretization.

PostgreSQL is more suitable for scenarios based on transaction processing (TP) and supported by analytical processing (AP), that is, scenarios covered by Oracle databases.

For more information about how to select Greenplum and PostgreSQL, see the preceding sections.

0 0 0
Share on

digoal

282 posts | 24 followers

You may also like

Comments