The UPSERT syntax works in the same manner as the combination of the INSERT and UPDATE syntaxes. If the specified row already exists, the UPDATE syntax is executed. If a row does not exist, the INSERT syntax is executed. When you execute an UPSERT statement, you must specify the column that is used as the primary key. The UPSERT syntax can be used to insert individual rows or multiple rows of data with timestamps in a batch. In this case, the UPSERT syntax is equivalent to the INSERT syntax.
Applicable engines and versions
The UPSERT syntax is applicable to all versions of LindormTable and LindormTSDB.
Comparison between the INSERT operation in rational database and the UPSERT operation in LindormTable
The UPSERT operation in LindormTable is different from the INSERT operation in rational databases.
INSERT operations in traditional rational databases
For a table with a specified primary key, if you perform two consecutive INSERT operations without specifying the ON DUPLICATE KEY extension to write data with the same primary key value, the second INSERT operation fails. In this case, you must explicitly execute the UPDATE statement to update data or specify the ON DUPLICATE KEY extension in the second INSERT operation.
Lindorm
LindormTable: If you perform two consecutive UPSERT operations to write data with the same primary key, the second UPSERT operation overwrites the row of data that is written by the first UPSERT operation without errors reported. In this case, LindormTable stores the rows written by the two operation as different versions. By default, when a SELECT statement is executed to query the row of data, the latest version of each column in the row is returned.
LindormTSDB: If you perform two consecutive UPSERT operations to write data with the same primary key, the second UPSERT operation overwrites the row of data that is written by the first UPSERT operation.
For the example on how to use the UPSERT syntax to insert rows with the same primary key, see Write rows with the same primary key.
Syntax
upsert_statement ::= { UPSERT | INSERT } [ hint_expression ]
INTO table_identifier columns_delaration
VALUES value_list ( ',' value_list)*
[ ON DUPLICATE KEY column_identifier =
value_literal | IGNORE ]
columns_delaration ::= '(' column_identifier ( ',' column_identifier)* ')'
value_list ::= '(' value_expression( ',' value_expression)* ')'
Parameters
HINT expression (hint_expression)
HINT expressions are supported only by LindormTable.
You can use the _l_ts_ hint in an UPSERT statement to specify a timestamp for the row that you want to update or insert. For more information about the _l_ts_ hint, see Parameters of hintOptions.
ON DUPLICATE KEY
The ON DUPLICATE KEY
clause is supported only by LindormTable.
The ON DUPLICATE KEY
clause is used to check whether the specified row exists and is similar to the CHECK AND PUT operation in HBase. The clause complies to the following rules:
You can use the
ON DUPLICATE KEY
clause followed by the UPDATE keyword to update the values in the specified column. If the specified column already exists, the UPDATE statement is executed to update the values in the column. If the specified column does not exist, no data is updated and no error is reported. For example, if you execute theON DUPLICATE KEY UPDATE c1 = 20
statement and the c1 column exists, the value in the c1 column is updated to 20. If the c1 column does not exist, no data is updated and no error is reported.You can also use the IGNORE keyword after the
ON DUPLICATE KEY
clause. In this case, if the specified column already exists, no data is updated and no error is reported. If the specified column does not exist, data in the specified column is updated.The ON DUPLICATE KEY clause is applicable only to tables for which the CONSISTENCY option is set to
strong
. For more information about the CONSISTENCY option, see CREATE TABLE. To change the value of CONSISTENCY for a table, see ALTER TABLE.
Examples
In the following examples, the sample table sensor is created by executing the following statement:
CREATE TABLE sensor (
device_id VARCHAR NOT NULL,
region VARCHAR NOT NULL,
time TIMESTAMP NOT NULL,
temperature DOUBLE,
humidity BIGINT,
PRIMARY KEY(device_id, region, time)
)WITH(VERSIONS=2);
Write data to a table
UPSERT INTO sensor(device_id, region, time, temperature, humidity) VALUES('F07A1260','north-cn','2021-04-22 15:33:00',12.1,45);
Verify the result
You can execute the SELECT * FROM sensor;
statement to check whether the data is written to the table.
Write data to specific columns
UPSERT INTO sensor(device_id, region, time,temperature) VALUES('F07A1260','north-cn','2021-04-22 15:33:10',13.2);
Verify the result
You can execute the SELECT * FROM sensor;
statement to check whether the data is written to the table.
Use ON DUPLICATE KEY IGNORE in the statement to write data to a table
Use the ON DUPLICATE KEY IGNORE
clause to write data to a table in LindormTable.
If the row that meets the conditions specified by device_id='F07A1260', region='north-cn', time='2021-04-22 15:33:10', temperature=13.2
exists, no data is written. If the column does not exist, data is written to the table.
UPSERT INTO sensor(device_id,region,time,temperature) VALUES('F07A1260','north-cn','2021-04-22 15:33:10',13.2) ON DUPLICATE KEY IGNORE;
Verify the result
You can execute the SELECT * FROM sensor;
statement to check whether the data is written to the table.
Use ON DUPLICATE KEY UPDATE in the statement to update data in a table
Use the ON DUPLICATE KEY UPDATE
clause to write data to a table in LindormTable.
If the row that meets the conditions specified by device_id='F07A1260', region='north-cn', time='2021-04-22 15:33:10', temperature=13.2
exists, the value of the temperature column is updated to 30. If the row does not exist, the value of the temperature column is not updated and no error is reported.
UPSERT INTO sensor(device_id,region,time,temperature) VALUES('F07A1260','north-cn','2021-04-22 15:33:10',13.2) ON DUPLICATE KEY UPDATE temperature = 30;
Verify the result
You can execute the SELECT * FROM sensor;
statement to check whether the data is written to the table.
Write data with timestamps
Write a row of data to the table sensor in LindormTable and set the timestamp of the row to 111232
.
UPSERT /*+ _l_ts_(111232) */ INTO sensor (device_id ,region ,time,temperature) VALUES('F07A1260','north-cn','2021-04-22 15:33:00',12.1);
Verify the result
You can execute the SELECT * FROM sensor;
statement to check whether the data is written to the table.
Write multiple rows in a batch
UPSERT INTO sensor (device_id ,region ,time,temperature) VALUES('F07A1260','north-cn','2021-04-22 15:33:20',10.6), ('F07A1261','south-cn','2021-04-22 15:33:00',18.1), ('F07A1261','south-cn','2021-04-22 15:33:10',19.7);
Verify the result
You can execute the SELECT * FROM sensor;
statement to check whether the data is written to the table.
Write rows with the same primary key
Use the UPSERT statement to consecutively write two rows of data with the same primary key in LindormTable. Then, use hints to query the data written to the table.
If you perform two consecutive UPSERT operations to write data with the same primary key in LindormTSDB, the second UPSERT operation overwrites the row of data that is written by the first UPSERT operation.
Write data for the first time.
UPSERT INTO sensor(device_id ,region ,time,temperature,humidity) VALUES('F07A1260','north-cn','2021-04-22 15:33:10',13.2,45);
Query the data written by the first operation.
SELECT * FROM sensor WHERE device_id='F07A1260' AND region='north-cn';
The following result is returned:
+-----------+----------+-------------------------------+-------------+----------+ | device_id | region | time | temperature | humidity | +-----------+----------+-------------------------------+-------------+----------+ | F07A1260 | north-cn | 2021-04-22 15:33:10 +0000 UTC | 13.2 | 45 | +-----------+----------+-------------------------------+-------------+----------+
Write data for the second time.
UPSERT INTO sensor(device_id ,region ,time,temperature,humidity) VALUES('F07A1260','north-cn','2021-04-22 15:33:10',16.7,52);
Query the data written by the second operation.
SELECT * FROM sensor WHERE device_id='F07A1260' AND region='north-cn';
The following result is returned:
+-----------+----------+-------------------------------+-------------+----------+ | device_id | region | time | temperature | humidity | +-----------+----------+-------------------------------+-------------+----------+ | F07A1260 | north-cn | 2021-04-22 15:33:10 +0000 UTC | 16.7 | 52 | +-----------+----------+-------------------------------+-------------+----------+
The values of the device_id, region, and time primary key columns in the two operations are the same. According to the results, the value of the temperature column written by the first operation is overwritten by the value written by the second operation.
Use the _l_versions_ hint to query all versions of written data.
SELECT /*+ _l_versions_(2) */ device_id, region, time,temperature,humidity FROM sensor WHERE device_id='F07A1260';
The following result is returned:
+-----------+----------+-------------------------------+-------------+----------+ | device_id | region | time | temperature | humidity | +-----------+----------+-------------------------------+-------------+----------+ | F07A1260 | north-cn | 2021-04-22 15:33:10 +0000 UTC | 16.7 | 52 | | F07A1260 | north-cn | 2021-04-22 15:33:10 +0000 UTC | 13.2 | 45 | +-----------+----------+-------------------------------+-------------+----------+
According to the results, the values written by the two operations are saved as two versions.