By digoal
How can a single computer discover target customers across up to 100 TB of data in milliseconds? You can do this with PostgreSQL.
PostgreSQL is the most advanced open-source database globally and offers many options to solve our data-related problems. We have always been able to solve business problems based on advanced PostgreSQL features.
Compared with other index types, a Block Range Index (BRIN) is far more cost-effective, so it is the recommended choice in PostgreSQL. All it requires is that the data follow an ordering rule. Moreover, the data used in advertising services is generally append-only and static, so a data normalization step can be added to the business logic without complicating it, and this approach has been verified on the application side.
The business scenario is the same as that introduced in the previous two documents.
For example, how are a store's target customers discovered?
We first need to collect some data to answer this question. For example:
1) Users who browse and make purchases at this and other similar stores.
Browsing online stores generates behavior records, for instance, the stores and commodities we browse, the products we purchase, the stores where we make those purchases, and the times at which all of this happens.
Then for each store, we can extract a sample of users who browsed or made purchases.
2) After obtaining these user groups, we filter them for similar purchase intentions or other attributes.
We can analyze the attributes of these groups to form a larger target group and market to them.
1) 100 million daily active users
2) 100 million daily active stores
3) Each user visits 64 stores daily on average
4) 6.4 billion data points
1) 500 million weekly active users
2) 200 million weekly active stores
3) Each user views 256 stores each week on average
4) 128 billion data points in total
We assume that the volume may grow up to 10 times during promotional operations, reaching a maximum of about 1.28 trillion data points.
1) New data is generated in the user-visit statistics table every day by an analytics system. This data is imported in parallel into Alibaba Cloud RDS for PostgreSQL through its Object Storage Service (OSS) external table interface.
create table bi_user_tmall_vis(
uid int8, -- user ID
bid int8, -- store ID (product IDs are kept in other tables; their structure and query requirements are similar, so they are not covered here)
cnt int -- number of views (product view counts and purchase counts are kept in other tables; their structure and query requirements are similar, so they are not covered here)
);
1) Querying the user IDs that have browsed a specified store at, within, or above a given frequency.
This operation discovers the target customers of a specified store. The idea is that customers who have viewed your products must have some interest in them.
2) The same kind of condition as above, but with several branches combined by OR (for example, several stores at once).
The data must be normalized (physically reordered) by store ID and view count to make queries more efficient.
The normalization method is as follows:
insert into bi_user_tmall_vis1 select * from bi_user_tmall_vis order by bid,cnt;
After normalization, a composite BRIN index on bid and cnt is created.
create index idx_bi1 on bi_user_tmall_vis1 using brin (bid,cnt) WITH (pages_per_range='256');
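After the normalized load, it is worth confirming that the physical correlation of bid is actually close to 1, since that is what makes each BRIN block range tight. A minimal check against the standard planner statistics (nothing here is specific to this demo beyond the table name):
-- Refresh statistics, then inspect the physical correlation of the sort keys.
-- After loading with "order by bid, cnt", bid's correlation should be close to 1.
analyze bi_user_tmall_vis1;
select attname, n_distinct, correlation
from pg_stats
where tablename = 'bi_user_tmall_vis1'
  and attname in ('bid', 'cnt');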
We may also need to filter other customer attributes, such as gender, purchasing frequency, and age range.
That is to say, the real table structure is not as simple as the one above. Nevertheless, we have trimmed it down to keep the demo simple.
We can use three optimization methods when we have multiple query requirements:
1) Using a composite (multi-column) index
2) Creating multiple indexes and letting PostgreSQL combine them with bitmapAnd and bitmapOr, then scanning only the matching blocks
3) Using table partitioning, i.e., using the other query conditions as partitioning fields
All of these methods aim to reduce the scanning volume and improve query efficiency.
Alibaba Cloud AnalyticDB for PostgreSQL supports partitioning syntax on multiple levels.
PostgreSQL can achieve multi-level partitioning using multi-level inheritance. Meanwhile, both PostgreSQL 10 and the plug-in pg_pathman support multi-level partitioning.
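As a rough sketch only, this is what a two-level layout could look like with the declarative syntax available since PostgreSQL 10; the gender column and the partition boundaries are hypothetical, borrowed from the extra filter attributes mentioned above rather than from the demo table:
-- Hypothetical two-level layout: list-partition by gender, then range-partition by bid.
create table bi_user_tmall_vis_p (
  uid int8,
  bid int8,
  cnt int,
  gender int2
) partition by list (gender);

-- Each first-level partition is itself partitioned by store ID ranges.
create table bi_user_tmall_vis_p_m partition of bi_user_tmall_vis_p
  for values in (0) partition by range (bid);

create table bi_user_tmall_vis_p_m_1 partition of bi_user_tmall_vis_p_m
  for values from (0) to (50000000);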
Method 2 leverages PostgreSQL's built-in bitmapAnd and bitmapOr scans, which combine multiple indexes.
Example:
create table test(c1 int , c2 int, c3 int);
Index 1
create index idx1 on test (c1);
Index 2
create index idx3 on test (c3);
If you run a query based on these two indexes, PostgreSQL automatically combines them.
-- bitmapAnd scan
select * from test where c1 = ? and c3=?;
-- bitmapOr scan
select * from test where c1 = ? or c3=?;
1) Write 6.4 billion pieces of test data.
vi test.sql
insert into bi_user_tmall_vis select random()*2000000000,random()*100000000,random()*1000 from generate_series(1,10000);
pgbench -M prepared -n -r -P 1 -f ./test.sql -c 32 -j 32 -t 20000
Table size:
postgres=# \dt+ bi_user_tmall_vis
List of relations
Schema | Name | Type | Owner | Size | Description
--------+-------------------+-------+----------+--------+-------------
public | bi_user_tmall_vis | table | postgres | 311 GB |
(1 row)
2) Perform data normalization.
create table bi_user_tmall_vis1 (like bi_user_tmall_vis);
nohup psql -c "set work_mem='128GB';set maintenance_work_mem='128GB';insert into bi_user_tmall_vis1 select * from bi_user_tmall_vis order by bid,cnt;" >/dev/null 2>&1 &
3) Create a BRIN index.
create index idx_bi on bi_user_tmall_vis1 using brin (bid,cnt) WITH (pages_per_range='512');
Size of the index:
public | idx_bi | index | postgres | bi_user_tmall_vis1 | 3336 kB |
Only about 3 MB. Remarkable: for a 311 GB table, the index is just 3 MB in size.
4) Filter the users who have browsed a given store between N and M times.
postgres=# explain (analyze,timing,costs,buffers,verbose) select * from bi_user_tmall_vis1 where bid=1 and cnt between 1 and 100;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on public.bi_user_tmall_vis1 (cost=521.47..105255.40 rows=7351 width=20) (actual time=16.024..25.791 rows=4 loops=1)
Output: uid, bid, cnt
Recheck Cond: ((bi_user_tmall_vis1.bid = 1) AND (bi_user_tmall_vis1.cnt >= 1) AND (bi_user_tmall_vis1.cnt <= 100))
Rows Removed by Index Recheck: 80380
Heap Blocks: lossy=512
Buffers: shared hit=529 read=511
-> Bitmap Index Scan on idx_bi (cost=0.00..519.63 rows=80384 width=0) (actual time=16.010..16.010 rows=5120 loops=1)
Index Cond: ((bi_user_tmall_vis1.bid = 1) AND (bi_user_tmall_vis1.cnt >= 1) AND (bi_user_tmall_vis1.cnt <= 100))
Buffers: shared hit=528
Planning time: 0.238 ms
Execution time: 25.822 ms
(11 rows)
postgres=# explain (analyze,timing,costs,buffers,verbose) select * from bi_user_tmall_vis1 where (bid=1 and cnt between 1 and 100) or (bid=2000 and cnt <10000) or (bid=12000 and cnt <10000);
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on public.bi_user_tmall_vis1 (cost=1674.17..315338.06 rows=153721 width=20) (actual time=47.115..78.014 rows=138 loops=1)
Output: uid, bid, cnt
Recheck Cond: (((bi_user_tmall_vis1.bid = 1) AND (bi_user_tmall_vis1.cnt >= 1) AND (bi_user_tmall_vis1.cnt <= 100)) OR ((bi_user_tmall_vis1.bid = 2000) AND (bi_user_tmall_vis1.cnt < 10000)) OR ((bi_user_tmall_vis1.bid = 12000) AND (bi_user_tmall_vis1.cnt < 10000)))
Rows Removed by Index Recheck: 241014
Heap Blocks: lossy=1536
Buffers: shared hit=2608 read=512
-> BitmapOr (cost=1674.17..1674.17 rows=241151 width=0) (actual time=47.099..47.099 rows=0 loops=1)
Buffers: shared hit=1584
-> Bitmap Index Scan on idx_bi (cost=0.00..519.63 rows=80384 width=0) (actual time=16.167..16.167 rows=5120 loops=1)
Index Cond: ((bi_user_tmall_vis1.bid = 1) AND (bi_user_tmall_vis1.cnt >= 1) AND (bi_user_tmall_vis1.cnt <= 100))
Buffers: shared hit=528
-> Bitmap Index Scan on idx_bi (cost=0.00..519.63 rows=80384 width=0) (actual time=15.494..15.494 rows=5120 loops=1)
Index Cond: ((bi_user_tmall_vis1.bid = 2000) AND (bi_user_tmall_vis1.cnt < 10000))
Buffers: shared hit=528
-> Bitmap Index Scan on idx_bi (cost=0.00..519.63 rows=80384 width=0) (actual time=15.437..15.437 rows=5120 loops=1)
Index Cond: ((bi_user_tmall_vis1.bid = 12000) AND (bi_user_tmall_vis1.cnt < 10000))
Buffers: shared hit=528
Planning time: 0.145 ms
Execution time: 78.062 ms
(19 rows)
| Indexing Accuracy | Rows in a Single Table | Size of a Single Table | Index Size | 1 Condition | 2 Conditions | 3 Conditions |
| --- | --- | --- | --- | --- | --- | --- |
| pages_per_range=1 | 6.4 billion | 311 GB | 1.6 GB | 8.2 s | - | - |
| pages_per_range=128 | 6.4 billion | 311 GB | 13 MB | 62 ms | - | 191 ms |
| pages_per_range=256 | 6.4 billion | 311 GB | 6 MB | 33 ms | - | 105 ms |
| pages_per_range=512 | 6.4 billion | 311 GB | 3 MB | 25 ms | - | 78 ms |
| pages_per_range=sqrt(pg_class.relpages)=6384 | 6.4 billion | 311 GB | 300 KB | 97 ms | 112 ms | 139 ms |
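Each row above can be reproduced with the same approach as before; here is a sketch for pages_per_range=128 (rebuild the index at the desired granularity, then re-run the test queries):
drop index if exists idx_bi;
create index idx_bi on bi_user_tmall_vis1 using brin (bid,cnt) WITH (pages_per_range='128');
explain (analyze,timing,costs,buffers,verbose) select * from bi_user_tmall_vis1 where bid=1 and cnt between 1 and 100;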
As explained above, the I/O cost for BRIN indexing is divided into two parts:
1) Scanning BRIN index blocks
2) Scanning HEAP table blocks
These two scans combined represent the I/O cost; the remaining cost comes from the CPU work needed to filter each record.
I) For a single-condition query, the number of HEAP data blocks that will be hit is determined by two factors; the greater of the two values is taken, which here works out to about 5,000 HEAP data blocks per query.
II) For a single-condition query, the number of BRIN index blocks that must be scanned is determined by the index size.
When pages_per_range=512, the BRIN index size is about 3 MB.
III) So the estimated I/O cost of a single-condition BRIN scan is reading about 3 MB of index blocks plus about 5,000 HEAP blocks.
IV) The estimation method for a multi-condition query is similar.
Taking a three-condition query as an example, the scanning cost (HEAP blocks + BRIN index blocks) is roughly three times that of a single condition.
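The first component can be sanity-checked straight from the catalog; the second is visible as the lossy "Heap Blocks" line in the EXPLAIN output above. A minimal check:
-- BRIN index size, also expressed in 8 KB blocks (about 3 MB here, i.e., a few hundred blocks).
select pg_size_pretty(pg_relation_size('idx_bi')) as brin_index_size,
       pg_relation_size('idx_bi') / 8192          as brin_index_blocks;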
So, how shall we select the BRIN index accuracy parameter pages_per_range?
For example, given a performance target such as millisecond-level execution, how do we compute the block cost of a ten-condition query?
1) Estimate how many records are matched by one equality condition (A):
a) When pg_stats.n_distinct >= 1: A = pg_class.reltuples / pg_stats.n_distinct
b) When pg_stats.n_distinct < 1: A = (pg_class.reltuples * pg_stats.n_distinct) / pg_class.reltuples
c) When pg_stats.n_distinct = -1: A = 1
2) Estimate the correlation (B):
B = abs(pg_stats.correlation)
3) Estimate how many HEAP blocks are occupied by an equivalent condition (C):
C = A / B
4) Estimate how many data blocks are occupied by the BRIN index when pages_per_range=1 (D):
D = pg_class.relpages/(pg_class.reltuples/pg_class.relpages)
5) Estimate how many data blocks are occupied by the BRIN index when pages_per_range=n (E):
E = D / n
6) Estimate how many BRIN index blocks need to be scanned in a ten-condition query when pages_per_range=n (F):
F = 10 * E
7) Estimate how many HEAP blocks need to be scanned in a ten-condition query when pages_per_range=n (G):
G = 10 * C
8) Estimate how many blocks in total need to be scanned in a ten-condition query when pages_per_range=n (H):
H = F + G
You can use these formulas to work out what pages_per_range should be set to in order to achieve millisecond-level responses for a ten-condition query.
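As a hedged sketch, the whole estimate can be computed in one catalog query. The table and column names are the ones from this article, n is set to 512 and the condition count to 10, and the n_distinct handling follows step 1 verbatim:
-- Note: assumes a non-zero correlation for the bid column.
with s as (
  select c.reltuples::numeric         as reltuples,
         c.relpages::numeric          as relpages,
         st.n_distinct::numeric       as n_distinct,
         abs(st.correlation)::numeric as b
  from pg_class c
  join pg_stats st on st.tablename = c.relname
  where c.relname = 'bi_user_tmall_vis1'
    and st.attname = 'bid'
), est as (
  select
    case
      when n_distinct >= 1 then reltuples / n_distinct   -- step 1a
      when n_distinct = -1 then 1                        -- step 1c
      else (reltuples * n_distinct) / reltuples          -- step 1b
    end as a,
    b,                                                   -- step 2
    relpages / (reltuples / relpages) as d               -- step 4
  from s
)
select a                             as a_rows_per_condition,
       b                             as b_correlation,
       a / b                         as c_heap_blocks_per_condition,  -- step 3
       d / 512                       as e_brin_blocks,                -- step 5, n = 512
       10 * (d / 512)                as f_brin_blocks_10_conditions,  -- step 6
       10 * (a / b)                  as g_heap_blocks_10_conditions,  -- step 7
       10 * (d / 512) + 10 * (a / b) as h_total_blocks_10_conditions  -- step 8
from est;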
According to the weekly statistics above, the volume is usually about 128 billion data points and can reach roughly 1.28 trillion during promotional operations.
We tested a single table with 6.4 billion rows above and got millisecond-level query responses.
So what about a table with 1.28 trillion data points? This can be handled with partition tables, partitioned by store or product ID.
Take hash partitioning as an example:
Hash-partition by store into 64 partitions, giving about 100 million data points per partition for the daily data.
Hash-partition by store into 640 partitions, giving about 200 million to 2 billion data points per partition for the weekly data, up to the operational peak.
A single table with 6.4 billion data points has already been tested and its performance proven, so there is nothing to worry about for partitions that each hold only a few hundred million data points.
For an introduction to how to use partitioning in PostgreSQL with pg_pathman, see: https://github.com/postgrespro/pg_pathman
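If pg_pathman is used, here is a hedged sketch of the 64-partition hash layout discussed above (create_hash_partitions is a documented pg_pathman function; by default it also moves the existing rows into the partitions):
create extension if not exists pg_pathman;
-- 64 hash partitions keyed on the store ID.
select create_hash_partitions('bi_user_tmall_vis1'::regclass, 'bid', 64);
-- Normalization and BRIN indexes can then be applied per partition as shown earlier.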
The following are the key technical points behind this document:
1) BRIN, which filters data efficiently at almost no storage cost and delivers millisecond-level responses when discovering the target customers of any given store in a single table with 6.4 billion data points.
2) Data normalization, which improves the linear (physical) correlation of fields and makes each BRIN block range almost perfectly tight.
Data rules and acceleration methods:
a) Writing the targets in order when importing.
b) PostgreSQL 11 parallel sorting (see the sketch after this list).
c) Partition table parallel sorting.
3) Partitioning + data normalization, a data optimization method used for volumes with 1 trillion or more data points.
4) HDB normalization + metascan. Metascan is similar to BRIN; it is a kernel feature added by Alibaba Cloud to HDB for PostgreSQL that the original Greenplum does not have.
5) Parallel append scan. After being split into multiple partition tables, PostgreSQL can then parallel-scan the partition tables. This can improve overall performance.
6) Multi-field indexing that improves data filter accuracy, reduces the amount scanned, and improves query performance by combining bitmapAnd and bitmapOr.
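For point 2b above, a minimal sketch of how the normalization sort could be parallelized: PostgreSQL 11 allows CREATE TABLE ... AS to use a parallel plan, so the ORDER BY may be served by a parallel sort. The settings below are illustrative, not tuning advice:
-- Illustrative settings only; size them to the machine.
set max_parallel_workers_per_gather = 16;
set work_mem = '1GB';
-- The sort feeding this CTAS may run in parallel on PostgreSQL 11.
create table bi_user_tmall_vis1 as
select * from bi_user_tmall_vis order by bid, cnt;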
Related articles:
1) Comparison Between PostgreSQL 10 and Greenplum in Spatiotemporal Object Selection and Data Pivoting
2) Spatial Search: Geometry and GiST Combination Outperforms Geohash and B-tree