Upsert (INSERT ON CONFLICT DO) is a new function of PostgreSQL 9.5. When a constraint error occurs during data insertion, data insertion is rolled back or changed to update. The syntax for the same is as follows:
Command: INSERT
Description: create new rows in a table
Syntax:
[ WITH [ RECURSIVE ] with_query [, ...] ]
INSERT INTO table_name [ AS alias ] [ ( column_name [, ...] ) ]
{ DEFAULT VALUES | VALUES ( { expression | DEFAULT } [, ...] ) [, ...] | query }
[ ON CONFLICT [ conflict_target ] conflict_action ]
[ RETURNING * | output_expression [ [ AS ] output_name ] [, ...] ]
where conflict_target can be one of:
( { index_column_name | ( index_expression ) } [ COLLATE collation ] [ opclass ] [, ...] ) [ WHERE index_predicate ]
ON CONSTRAINT constraint_name
and conflict_action is one of:
DO NOTHING
DO UPDATE SET { column_name = { expression | DEFAULT } |
( column_name [, ...] ) = ( { expression | DEFAULT } [, ...] ) |
( column_name [, ...] ) = ( sub-SELECT )
} [, ...]
[ WHERE condition ]
For versions earlier than PostgreSQL 9.5, use functions or the with
syntax to implement functions similar to upsert.
Execute the following command to create a test table with one field as the unique key or primary key.
create table test(id int primary key, info text, crt_time timestamp);
Step 1) If a table does not exist, insert it. Otherwise, update it.
test03=# insert into test values (1,'test',now()) on conflict (id) do update set info=excluded.info,crt_time=excluded.crt_time;
INSERT 0 1
test03=# select * from test;
id | info | crt_time
----+------+----------------------------
1 | test | 2017-04-24 15:27:25.393948
(1 row)
test03=# insert into test values (1,'hello digoal',now()) on conflict (id) do update set info=excluded.info,crt_time=excluded.crt_time;
INSERT 0 1
test03=# select * from test;
id | info | crt_time
----+--------------+----------------------------
1 | hello digoal | 2017-04-24 15:27:39.140877
(1 row)
Step 2) If the data does not exist, insert it. Otherwise, do not perform any operation.
test03=# insert into test values (1,'hello digoal',now()) on conflict (id) do nothing;
INSERT 0 0
test03=# insert into test values (1,'pu',now()) on conflict (id) do nothing;
INSERT 0 0
test03=# insert into test values (2,'pu',now()) on conflict (id) do nothing;
INSERT 0 1
test03=# select * from test;
id | info | crt_time
----+--------------+----------------------------
1 | hello digoal | 2017-04-24 15:27:39.140877
2 | pu | 2017-04-24 15:28:20.37392
(2 rows)
Use different methods to meet the actual requirements.
test03=# create or replace function f_upsert(int,text,timestamp) returns void as $$
declare
res int;
begin
update test set info=$2,crt_time=$3 where id=$1;
if not found then
insert into test (id,info,crt_time) values ($1,$2,$3);
end if;
exception when SQLSTATE '23505' then
-- return; -- on conflict (id) do nothing
update test set info=$2,crt_time=$3 where id=$1; -- insert into test values ($1,$2,$3) on conflict (id) do update set info=excluded.info,crt_time=excluded.crt_time
end;
$$ language plpgsql strict;
CREATE FUNCTION
test03=# select f_upsert(1,'digoal',now()::timestamp);
f_upsert
----------
(1 row)
test03=# select * from test;
id | info | crt_time
----+--------+----------------------------
2 | pu | 2017-04-24 15:28:20.37392
1 | digoal | 2017-04-24 15:31:29.254325
(2 rows)
test03=# select f_upsert(1,'digoal001',now()::timestamp);
f_upsert
----------
(1 row)
test03=# select * from test;
id | info | crt_time
----+-----------+---------------------------
2 | pu | 2017-04-24 15:28:20.37392
1 | digoal001 | 2017-04-24 15:31:38.0529
(2 rows)
test03=# select f_upsert(3,'hello',now()::timestamp);
f_upsert
----------
(1 row)
test03=# select * from test;
id | info | crt_time
----+-----------+---------------------------
2 | pu | 2017-04-24 15:28:20.37392
1 | digoal001 | 2017-04-24 15:31:38.0529
3 | hello | 2017-04-24 15:31:49.14291
(3 rows)
Use functions to upsert data. To write data in batches, create variables as shown below.
create table batch (id int primary key, info text, crt_time timestamp);
create or replace function merge_batch(VARIADIC i batch[])
returns void as $$
declare var batch;
begin
foreach var in array i loop
update batch set info=var.info,crt_time=var.crt_time where id=var.id;
if not found then
insert into batch values (var.*);
end if;
end loop;
exception when others then
return;
end;
$$ language plpgsql strict;
select merge_batch((1,'abc','2000-01-01'),(1,'abcd','2017-02-02'));
create table test(id int primary key, info text, crt_time timestamp);
If the exists, update it. Otherwise, insert it.
with upsert as (update test set info=$info,crt_time=$crt_time where id=$id returning *) insert into test select $id,$info,$crt_time where not exists (select 1 from upsert where id=$id);
Replace the variables and perform a test.
with upsert as (update test set info='test',crt_time=now() where id=1 returning *) insert into test select 1,'test',now() where not exists (select 1 from upsert where id=1);
When a nonexistent value is inserted, only one session has the successfully inserted value, whereas the other session returns a primary key constraint error.
Ensure concurrency even if the table does not have a primary key or a unique constraint.
create table test(id int, info text, crt_time timestamp);
When the same data item is updated, the first established session locks the record, whereas the session that is established later enters the waiting state.
with
w1 as(select ('x'||substr(md5('$id'),1,16))::bit(64)::bigint as tra_id),
upsert as (update test set info=$info,crt_time=$crt_time where id=$id returning *)
insert into test select $id, $info, $crt_time from w1
where pg_try_advisory_xact_lock(tra_id) and not exists (select 1 from upsert where id=$id);
Next, replace the variables and perform a test.
with
w1 as(select ('x'||substr(md5('1'),1,16))::bit(64)::bigint as tra_id),
upsert as (update test set info='digoal0123',crt_time=now() where id=1 returning *)
insert into test select 1, 'digoal0123', now() from w1
where pg_try_advisory_xact_lock(tra_id) and not exists (select 1 from upsert where id=1);
INSERT 0 0
test03=# select * from test;
id | info | crt_time
----+------------+---------------------------
2 | pu | 2017-04-24 15:28:20.37392
3 | hello | 2017-04-24 15:31:49.14291
1 | digoal0123 | 2017-04-24 15:31:38.0529
(3 rows)
with
w1 as(select ('x'||substr(md5('4'),1,16))::bit(64)::bigint as tra_id),
upsert as (update test set info='digoal0123',crt_time=now() where id=4 returning *)
insert into test select 4, 'digoal0123', now() from w1
where pg_try_advisory_xact_lock(tra_id) and not exists (select 1 from upsert where id=4);
INSERT 0 1
test03=# select * from test;
id | info | crt_time
----+------------+----------------------------
2 | pu | 2017-04-24 15:28:20.37392
3 | hello | 2017-04-24 15:31:49.14291
1 | digoal0123 | 2017-04-24 15:31:38.0529
4 | digoal0123 | 2017-04-24 15:38:39.801908
(4 rows)
When the same data item is updated, the session that is established first updates the data, whereas the session that is established later directly returns an error.
with w1 as(select ('x'||substr(md5('$id'),1,16))::bit(64)::bigint as tra_id),
upsert as (update test set info=$info,crt_time=$crt_time from w1 where pg_try_advisory_xact_lock(tra_id) and id=$id returning *)
insert into test select $id,$info,$crt_time from w1
where pg_try_advisory_xact_lock(tra_id) and not exists (select 1 from upsert where id=$id);
Next, replace the variables and perform a test.
with w1 as(select ('x'||substr(md5('1'),1,16))::bit(64)::bigint as tra_id),
upsert as (update test set info='test',crt_time=now() from w1 where pg_try_advisory_xact_lock(tra_id) and id=1 returning *)
insert into test select 1,'test',now() from w1
where pg_try_advisory_xact_lock(tra_id) and not exists (select 1 from upsert where id=1);
INSERT 0 0
test03=# select * from test;
id | info | crt_time
----+------------+----------------------------
2 | pu | 2017-04-24 15:28:20.37392
3 | hello | 2017-04-24 15:31:49.14291
4 | digoal0123 | 2017-04-24 15:42:50.912887
1 | test | 2017-04-24 15:44:44.245167
(4 rows)
PostgreSQL has supported Rule syntax for a long time. Create a rule with Rule syntax. If the rule exists, update it. Otherwise, insert it. However, when using the volatile function, do not directly use exists. Otherwise, it will be processed as an immutable function.
If it exists, do not insert it (ignore it). Otherwise, update it.
Implement idempotent writing so no problems occur while writing during resumable uploads.
postgres=# create table e(id int primary key, info text);
CREATE TABLE
Execute the following code to create a volatile function.
CREATE OR REPLACE FUNCTION public.ff(integer)
RETURNS boolean
LANGUAGE sql
STRICT
volatile
AS $function$
select true from e where id=$1 limit 1;
$function$;
Create a rule as shown below.
postgres=# create rule r1 as on insert to e where ff(NEW.id) do instead update e set info=NEW.info where id=NEW.id;
CREATE RULE
postgres=# insert into e values (1,'test'),(1,'test');
INSERT 0 0
postgres=# insert into e values (2,'test'),(2,'test');
INSERT 0 1
postgres=# insert into e values (3,'test'),(3,'test');
INSERT 0 1
postgres=# truncate e;
TRUNCATE TABLE
postgres=# select * from b;
id | info
----+------
1 | a
1 | b
(2 rows)
postgres=# insert into e select * from b;
INSERT 0 1
postgres=# \d+ b
Table "public.b"
Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
--------+---------+-----------+----------+---------+----------+--------------+-------------
id | integer | | | | plain | |
info | text | | | | extended | |
Implement performance stress testing as shown below.
vi test.sql
\set id random(1,1000000)
insert into e values (:id, md5(random()::text));
pgbench -M prepared -n -r -P 1 -f ./test.sql -c 64 -j 64 -T 1000
progress: 90.0 s, 132056.5 tps, lat 0.220 ms stddev 0.055
progress: 91.0 s, 131656.9 tps, lat 0.220 ms stddev 0.040
progress: 92.0 s, 134941.0 tps, lat 0.215 ms stddev 0.090
progress: 93.0 s, 134324.5 tps, lat 0.216 ms stddev 0.076
progress: 94.0 s, 136699.4 tps, lat 0.212 ms stddev 0.070
progress: 95.0 s, 139291.4 tps, lat 0.208 ms stddev 0.067
progress: 96.0 s, 136073.2 tps, lat 0.213 ms stddev 0.076
progress: 97.0 s, 135804.6 tps, lat 0.214 ms stddev 0.076
progress: 98.0 s, 146037.6 tps, lat 0.199 ms stddev 0.069
progress: 99.0 s, 129619.5 tps, lat 0.224 ms stddev 0.049
progress: 100.0 s, 129230.0 tps, lat 0.224 ms stddev 0.047
progress: 101.0 s, 131048.4 tps, lat 0.221 ms stddev 0.055
progress: 102.0 s, 128808.0 tps, lat 0.225 ms stddev 0.048
progress: 103.0 s, 128954.6 tps, lat 0.225 ms stddev 0.048
progress: 104.0 s, 131227.9 tps, lat 0.221 ms stddev 0.042
progress: 105.0 s, 129604.0 tps, lat 0.224 ms stddev 0.057
Always take the following precautions for using the Rule syntax.
1) The use of exists in the Rule syntax is only suitable for a single insert statement (use the volatile function to solve this problem).
postgres=# create table e(id int, info text);
CREATE TABLE
postgres=# create rule r1 as on insert to e where exists (select 1 from e t1 where t1.id=NEW.id limit 1) do instead nothing;
CREATE RULE
在一个语句中插入多条,如果多条中有重复,则在RULE中判断条件时仅判断一次(类似immutable函数)。
postgres=# insert into e values (1,'test'),(1,'test');
INSERT 0 2
postgres=# select * from e;
id | info
----+------
1 | test
1 | test
(2 rows)
2) Rule syntax does not support copy statements. Repeated copy statements also cause problems.
In any case, it is recommended that tables requiring upsert have a primary key.
Automatic Cleaning and Scheduling of PostgreSQL Rotate Tables - Constraints and Triggers
digoal - March 25, 2020
digoal - February 23, 2022
digoal - February 3, 2020
digoal - May 28, 2019
digoal - September 20, 2019
digoal - May 16, 2019
Alibaba Cloud PolarDB for PostgreSQL is an in-house relational database service 100% compatible with PostgreSQL and highly compatible with the Oracle syntax.
Learn MoreAn online MPP warehousing service based on the Greenplum Database open source program
Learn MoreAn on-demand database hosting service for PostgreSQL with automated monitoring, backup and disaster recovery capabilities
Learn MoreThis solution enables FinTech companies to run workloads on the cloud, bringing greater customer satisfaction with lower latency and higher scalability.
Learn MoreMore Posts by digoal