AnalyticDB for PostgreSQL has the features of Greenplum Database and supports the hll extension. AnalyticDB for PostgreSQL is suited for industries such as Internet advertising and estimation analysis that require quick estimation of business metrics such as page views (PVs) and unique visitors (UVs).
Install the hll extension
Before you use the hll extension in an AnalyticDB for PostgreSQL instance, install the hll extension on the Extensions page of the instance. For more information, see Install, update, and uninstall extensions.
Basic types
Execute the following statement to create a table that contains the hll field:
create table agg (id int primary key,userids hll);
Execute the following statement to convert the int type into the hll_hashval type:
select 1::hll_hashval;
Basic operators
The hll type supports =, !=, <>, ||, and #.
select hll_add_agg(1::hll_hashval) = hll_add_agg(2::hll_hashval); select hll_add_agg(1::hll_hashval) || hll_add_agg(2::hll_hashval); select #hll_add_agg(1::hll_hashval);
The hll_hashval type supports =, !=, and <>.
select 1::hll_hashval = 2::hll_hashval; select 1::hll_hashval <> 2::hll_hashval;
Basic functions
Hash functions such as hll_hash_boolean, hll_hash_smallint, and hll_hash_bigint.
select hll_hash_boolean(true); select hll_hash_integer(1);
hll_add_agg: converts a value of the int type into the hll_hashval type and aggregates the value.
select hll_add_agg(1::hll_hashval);
hll_union: aggregates the hll fields.
select hll_union(hll_add_agg(1::hll_hashval),hll_add_agg(2::hll_hashval));
hll_set_defaults: sets the precision.
select hll_set_defaults(15,5,-1,1);
hll_print: displays debugging information.
select hll_print(hll_add_agg(1::hll_hashval));
Examples
create table access_date (acc_date date unique, userids hll);
insert into access_date select current_date, hll_add_agg(hll_hash_integer(user_id)) from generate_series(1,10000) t(user_id);
insert into access_date select current_date-1, hll_add_agg(hll_hash_integer(user_id)) from generate_series(5000,20000) t(user_id);
insert into access_date select current_date-2, hll_add_agg(hll_hash_integer(user_id)) from generate_series(9000,40000) t(user_id);
postgres=# select #userids from access_date where acc_date=current_date;
?column?
------------------
9725.85273370708
(1 row)
postgres=# select #userids from access_date where acc_date=current_date-1;
?column?
------------------
14968.6596883279
(1 row)
postgres=# select #userids from access_date where acc_date=current_date-2;
?column?
------------------
29361.5209149911
(1 row)