By digoal
A GIN index is an inverted index for multi-value types in PostgreSQL, so a single record can contribute keys to many GIN index entries. Merging those keys into the index synchronously at write time would sharply increase I/O and write latency (RT). To improve write throughput, PostgreSQL (hereinafter referred to as PG) allows GIN index merging to be deferred: when this is enabled, newly written entries first go into a pending list rather than directly into the index pages. When the pending list reaches a certain size, or when autovacuum processes the table, the pending list is merged into the index.
During a query, if the pending list has not yet been merged into the index, both the pending list and the main index are scanned.
Under heavy writes, if the pending list grows large and the merge (performed by the autovacuum worker) cannot keep up, GIN index query performance degrades.
Once the cause is clear, the problem is easy to solve. Two GIN parameters control this behavior.
1) fastupdate
This setting controls the usage of the fast update technique described in Section 66.4.1. It is a Boolean parameter: ON enables fast update, and OFF disables it. Alternative spellings of ON and OFF are allowed as described in Section 19.1. The default is ON.
Note:
Turning fastupdate off via ALTER INDEX prevents future insertions from going into the list of pending index entries but does not flush previous entries. You might want to VACUUM the table or call the gin_clean_pending_list function afterward to ensure the pending list is emptied.
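A minimal sketch of both steps, using the demo table t and index idx_t_1 created later in this article (adapt the names to your own schema):
-- Stop new entries from going into this GIN index's pending list
ALTER INDEX idx_t_1 SET (fastupdate = off);
-- Flush whatever is already sitting in the pending list
SELECT gin_clean_pending_list('idx_t_1');
-- or simply: VACUUM t;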
2) gin_pending_list_limit
The gin_pending_list_limit parameter sets the maximum size of a GIN index's pending list; the value is specified in kilobytes. When the pending list exceeds this size, its entries are flushed into the main index structure.
Current setting:
postgres=# show gin_pending_list_limit ;
gin_pending_list_limit
------------------------
4MB
(1 row)
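The limit can also be overridden per index through a storage parameter of the same name, which takes precedence over the server setting. A minimal sketch, again using the demo index idx_t_1 created below (the 1 MB value is arbitrary):
-- Session-level setting, in kilobytes
SET gin_pending_list_limit = 1024;
-- Per-index override via the storage parameter
ALTER INDEX idx_t_1 SET (gin_pending_list_limit = 1024);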
BRIN indexes have analogous storage parameters:
1) pages_per_range
Defines the number of table blocks that make up one block range for each entry of a BRIN index (see Section 67.1 for more details). The default is 128.
2) autosummarize
Defines whether a summarization run is invoked for the previous page range after detecting an insertion on the next one.
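A hedged sketch of setting both BRIN parameters at index creation time (the table t_brin and its column ts are hypothetical names used only for illustration):
-- Hypothetical example: one index entry per 32 heap blocks, with automatic summarization
CREATE INDEX idx_t_brin ON t_brin USING brin (ts) WITH (pages_per_range = 32, autosummarize = on);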
You can learn more at this link: https://www.postgresql.org/docs/11/static/pageinspect.html
postgres=# create extension pageinspect ;
CREATE EXTENSION
1) Create a table.
postgres=# create table t(id int, arr int[]);
CREATE TABLE
2) Create an inverted index.
postgres=# create index idx_t_1 on t using gin (arr);
CREATE INDEX
3) Create a function which generates a random array.
postgres=# create or replace function gen_rand_arr() returns int[] as $$
select array(select (100*random())::int from generate_series(1,64));
$$ language sql strict;
CREATE FUNCTION
4) Write test data.
postgres=# insert into t select generate_series(1,100000), gen_rand_arr();
INSERT 0 100000
postgres=# insert into t select generate_series(1,1000000), gen_rand_arr();
INSERT 0 1000000
5) Check the size of the current GIN index's pending list with the pageinspect plug-in. There are 356 pending pages containing 2,484 pending tuples.
With this many records in the pending list, query performance drops significantly.
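The pending-list state is read from the GIN metapage, which lives in block 0 of the index, using the same gin_metapage_info call shown in step 8 below:
SELECT * FROM gin_metapage_info(get_raw_page('idx_t_1', 0));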
6) Query Test One (pending list is greater than 0).
postgres=# explain (analyze,verbose,timing,costs,buffers) select * from t where arr @> array[1,2,3];
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on public.t (cost=82.38..262.28 rows=11373 width=284) (actual time=82.444..141.559 rows=114906 loops=1)
Output: id, arr
Recheck Cond: (t.arr @> '{1,2,3}'::integer[])
Heap Blocks: exact=41304
Buffers: shared hit=42043
-> Bitmap Index Scan on idx_t_1 (cost=0.00..79.92 rows=11373 width=0) (actual time=75.902..75.902 rows=114906 loops=1)
Index Cond: (t.arr @> '{1,2,3}'::integer[])
Buffers: shared hit=739
Planning Time: 0.092 ms
Execution Time: 152.260 ms
(10 rows)
7) Vacuum the table to forcibly merge the pending list.
set vacuum_cost_delay=0;
postgres=# vacuum t;
VACUUM
8) After merging the pending list, n_pending_tuples is 0.
postgres=# SELECT * FROM gin_metapage_info(get_raw_page('idx_t_1', 0));
pending_head | pending_tail | tail_free_size | n_pending_pages | n_pending_tuples | n_total_pages | n_entry_pages | n_data_pages | n_entries | version
--------------+--------------+----------------+-----------------+------------------+---------------+---------------+--------------+-----------+---------
4294967295 | 4294967295 | 0 | 0 | 0 | 9978 | 41 | 9421 | 101 | 2
(1 row)
9) Query Test Two (pending list = 0).
postgres=# explain (analyze,verbose,timing,costs,buffers) select * from t where arr @> array[1,2,3];
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on public.t (cost=792.36..1699.10 rows=117244 width=284) (actual time=79.861..139.603 rows=114906 loops=1)
Output: id, arr
Recheck Cond: (t.arr @> '{1,2,3}'::integer[])
Heap Blocks: exact=41304
Buffers: shared hit=41687
-> Bitmap Index Scan on idx_t_1 (cost=0.00..766.95 rows=117244 width=0) (actual time=73.360..73.360 rows=114906 loops=1)
Index Cond: (t.arr @> '{1,2,3}'::integer[])
Buffers: shared hit=383 -- greatly reduced
Planning Time: 0.135 ms
Execution Time: 150.656 ms
(10 rows)
The process is similar for BRIN indexes: under heavy writes, un-summarized page ranges accumulate and are only summarized by VACUUM, by autovacuum, or on the fly when autosummarize is enabled.
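For BRIN, one analogous manual step is to force summarization of the page ranges that have not been summarized yet; a minimal sketch (idx_t_brin is a hypothetical BRIN index name):
-- Summarize all not-yet-summarized page ranges of a BRIN index
SELECT brin_summarize_new_values('idx_t_brin');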
In one common scenario, PG serves as the business search engine for arbitrary field-combination queries, fuzzy queries, full-text search, and so on. A large volume of data is updated by key value, and each update is preceded by a query (also by key value). For convenience, however, some users put all fields, including the key field, into the GIN index.
What are the consequences?
1) All queries, including the key lookups, go through the GIN index.
2) The large number of updates keeps growing the GIN pending list.
3) With a long pending list, scanning the list consumes CPU and increases total query time.
4) The GIN index is accessed through a bitmap scan, which consumes more CPU in total than an index scan because of the recheck step. Under high concurrency, the CPU quickly becomes the bottleneck.
Solution: take the key field out of the GIN index and put it into a B-tree index. The planner prefers the B-tree index for key lookups, which are faster and use less CPU. If the key is also needed in other random field-combination queries, it can still be kept in the GIN index as well; that is, the key field may appear in both the GIN index and the B-tree index.
1) Install the plug-in.
create extension btree_gin;
create extension pg_trgm;
create extension pageinspect;
2) Create the problematic table and index: put all fields into the GIN index and load 10 million records.
create table test(id int, c1 int, c2 int, c3 int, c4 int, c5 text);
insert into test select generate_series(1,10000000), random()*100, random()*1000, random()*10000, random()*100000, md5(random()::text);
create index idx_test_1 on test using gin (id,c1,c2,c3,c4,c5 gin_trgm_ops);
3) Check the execution plans.
explain (analyze,verbose,timing,costs,buffers) select * from test where id=1;
explain (analyze,verbose,timing,costs,buffers) update test set c1=random()*100, c2=random()*1000, c3=random()*10000, c4=random()*100000 where id=1;
postgres=# explain (analyze,verbose,timing,costs,buffers) select * from test where id=1;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on public.test (cost=0.10..0.12 rows=1 width=53) (actual time=0.143..0.143 rows=1 loops=1)
Output: id, c1, c2, c3, c4, c5
Recheck Cond: (test.id = 1)
Heap Blocks: exact=1
Buffers: shared hit=5
-> Bitmap Index Scan on idx_test_1 (cost=0.00..0.10 rows=1 width=0) (actual time=0.098..0.098 rows=1 loops=1)
Index Cond: (test.id = 1)
Buffers: shared hit=4
Planning Time: 2.089 ms
Execution Time: 0.922 ms
(10 rows)
postgres=# explain (analyze,verbose,timing,costs,buffers) update test set c1=random()*100, c2=random()*1000, c3=random()*10000, c4=random()*100000 where id=1;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Update on public.test (cost=0.10..0.15 rows=1 width=59) (actual time=1.342..1.343 rows=0 loops=1)
Buffers: shared hit=9 read=1 dirtied=1
-> Bitmap Heap Scan on public.test (cost=0.10..0.15 rows=1 width=59) (actual time=0.053..0.054 rows=1 loops=1)
Output: id, (random() * '100'::double precision), (random() * '1000'::double precision), (random() * '10000'::double precision), (random() * '100000'::double precision), c5, ctid
Recheck Cond: (test.id = 1)
Heap Blocks: exact=1
Buffers: shared hit=5
-> Bitmap Index Scan on idx_test_1 (cost=0.00..0.10 rows=1 width=0) (actual time=0.021..0.021 rows=1 loops=1)
Index Cond: (test.id = 1)
Buffers: shared hit=4
Planning Time: 0.149 ms
Execution Time: 1.496 ms
(12 rows)
4) Stress test the updates with id in the GIN index: about 4,960 TPS.
vi test.sql
\set id random(1,10000000)
update test set c1=random()*100, c2=random()*1000, c3=random()*10000, c4=random()*100000 where id=:id;
pgbench -M prepared -n -r -P 1 -f ./test.sql -c 64 -j 64 -T 120
transaction type: ./test.sql
scaling factor: 1
query mode: prepared
number of clients: 64
number of threads: 64
duration: 120 s
number of transactions actually processed: 595656
latency average = 12.887 ms
latency stddev = 27.841 ms
tps = 4960.755350 (including connections establishing)
tps = 4963.713963 (excluding connections establishing)
statement latencies in milliseconds:
0.002 \set id random(1,10000000)
12.924 update test set c1=random()*100, c2=random()*1000, c3=random()*10000, c4=random()*100000 where id=:id;
During the stress test, you can see that the GIN index accumulates a large number of pending pages.
SELECT * FROM gin_metapage_info(get_raw_page('idx_test_1', 0));
postgres=# SELECT * FROM gin_metapage_info(get_raw_page('idx_test_1', 0));
-[ RECORD 1 ]----+---------
pending_head | 175345
pending_tail | 178885
tail_free_size | 6640
n_pending_pages | 4627
n_pending_tuples | 30024
n_total_pages | 175024
n_entry_pages | 76904
n_data_pages | 98119
n_entries | 10231456
version | 2
Now take the key field out of the GIN index and keep the fuzzy-query field in it. If necessary, c1, c2, c3, and c4 can also be retained in the GIN index.
create table test1(id int, c1 int, c2 int, c3 int, c4 int, c5 text);
insert into test1 select * from test;
create index idx_test1_1 on test1 using btree(id);
create index idx_test1_2 on test1 using btree(c1);
create index idx_test1_3 on test1 using btree(c2);
create index idx_test1_4 on test1 using btree(c3);
create index idx_test1_5 on test1 using btree(c4);
create index idx_test1_6 on test1 using gin(c5 gin_trgm_ops);
Or:
create index idx_test1_1 on test1 using btree(id);
create index idx_test1_2 on test1 using gin(id,c1,c2,c3,c4,c5 gin_trgm_ops);
Check the execution plans.
explain (analyze,verbose,timing,costs,buffers) select * from test1 where id=1;
explain (analyze,verbose,timing,costs,buffers) update test1 set c1=random()*100, c2=random()*1000, c3=random()*10000, c4=random()*100000 where id=1;
postgres=# explain (analyze,verbose,timing,costs,buffers) select * from test1 where id=1;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------
Index Scan using idx_test1_1 on public.test1 (cost=0.38..0.42 rows=1 width=53) (actual time=0.089..0.090 rows=1 loops=1)
Output: id, c1, c2, c3, c4, c5
Index Cond: (test1.id = 1)
Buffers: shared hit=1 read=3
Planning Time: 0.696 ms
Execution Time: 0.115 ms
(6 rows)
postgres=#
postgres=# explain (analyze,verbose,timing,costs,buffers) update test1 set c1=random()*100, c2=random()*1000, c3=random()*10000, c4=random()*100000 where id=1;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Update on public.test1 (cost=0.38..0.44 rows=1 width=59) (actual time=1.183..1.183 rows=0 loops=1)
Buffers: shared hit=11 read=13 dirtied=6
-> Index Scan using idx_test1_1 on public.test1 (cost=0.38..0.44 rows=1 width=59) (actual time=0.019..0.021 rows=1 loops=1)
Output: id, (random() * '100'::double precision), (random() * '1000'::double precision), (random() * '10000'::double precision), (random() * '100000'::double precision), c5, ctid
Index Cond: (test1.id = 1)
Buffers: shared hit=4
Planning Time: 0.135 ms
Execution Time: 1.246 ms
(8 rows)
Stress test again: about 43,231 TPS, nearly 9 times the previous throughput.
vi test1.sql
\set id random(1,10000000)
update test1 set c1=random()*100, c2=random()*1000, c3=random()*10000, c4=random()*100000 where id=:id;
pgbench -M prepared -n -r -P 1 -f ./test1.sql -c 64 -j 64 -T 120
transaction type: ./test1.sql
scaling factor: 1
query mode: prepared
number of clients: 64
number of threads: 64
duration: 120 s
number of transactions actually processed: 5187765
latency average = 1.480 ms
latency stddev = 30.987 ms
tps = 43212.076731 (including connections establishing)
tps = 43231.697380 (excluding connections establishing)
statement latencies in milliseconds:
0.001 \set id random(1,10000000)
1.479 update test1 set c1=random()*100, c2=random()*1000, c3=random()*10000, c4=random()*100000 where id=:id;
During the stress test, the GIN index still accumulates pending-list entries. However, since queries no longer go through the GIN index, update efficiency does not degrade.
postgres=# SELECT * FROM gin_metapage_info(get_raw_page('idx_test1_6', 0));
-[ RECORD 1 ]----+-------
pending_head | 125629
pending_tail | 177046
tail_free_size | 6840
n_pending_pages | 77183
n_pending_tuples | 483977
n_total_pages | 80904
n_entry_pages | 23
n_data_pages | 80880
n_entries | 9248
version | 2
The database defers index merging to reduce the write RT overhead introduced by the GIN index. Under sustained heavy write pressure, the unmerged pending list can grow large and hurt query performance.
You can use the pageinspect plug-in to check the size of the unmerged pending list, and you can forcibly merge it with VACUUM (or gin_clean_pending_list) to restore query performance.
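As a hedged sketch, the same check can be extended to every GIN index in the current database (this assumes the pageinspect extension is installed and the role is allowed to read raw pages; temporary indexes of other sessions would need to be filtered out):
-- Sketch: pending-list size of every GIN index in the current database
SELECT c.oid::regclass AS index_name,
       m.n_pending_pages,
       m.n_pending_tuples
FROM pg_class c
JOIN pg_am am ON am.oid = c.relam
CROSS JOIN LATERAL gin_metapage_info(get_raw_page(c.oid::regclass::text, 0)) m
WHERE am.amname = 'gin';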