The UPDATE statement updates the values of one or more columns in a Transactional table or a Delta Table.
Prerequisites
You are granted the Select and Update permissions on the transactional table on which you want to execute the DELETE or UPDATE statement. For more information, see MaxCompute permissions.
Limits
The
DELETEstatement,UPDATEstatement, and a transactional table or Delta table on which the DELETE or UPDATE statement is executed have the following limits:NoteFor more information, see Parameters for Transaction Table and Delta Table.
Before you execute the
UPDATE,DELETE, orINSERT OVERWRITEstatement on important data in transactional tables, you must execute theSELECTandINSERTstatements to back up the data to other tables.You cannot execute the
UPDATEstatement to change values in the primary key column of a Delta table.
Precautions
When you use DELETE or UPDATE statement to delete or update data in tables or partitions of the tables, take note of the following items:
In specific scenarios, you may want to execute the DELETE or UPDATE statement for a small amount of data in a table and infrequently perform read and other operations in subsequent procedures. To reduce the storage space that is occupied by the table, we recommend that you merge the base files with all delta files after you execute the
DELETEorUPDATEstatement for the table several times. For more information, see ALTER TABLE COMPACT.In specific scenarios, you may want to delete or update more than 5% of the data in a table or a partition of the table at a low frequency and perform frequent read operations in subsequent procedures. We recommend that you execute the
INSERT OVERWRITEorINSERT INTOstatement in such scenarios. For more information, see Insert data into or overwrite data in a table or a static partition (INSERT INTO and INSERT OVERWRITE).For example, you want to perform delete or update operations for 10% of data 10 times each day. In this case, we recommend that you estimate the total cost and the consumption of the subsequent read performance if you execute the
DELETEorUPDATEstatement on the table. Then, compare the estimated result with that of executing theINSERT OVERWRITEorINSERT INTOstatement. This helps you choose an efficient method.Each time you execute the DELETE statement on a table, a delta file is automatically generated. As a result, the occupied storage space may not be reduced. If you want to execute the
DELETEstatement to delete data to reduce storage usage, you can merge the base files with all delta files. For more information, see ALTER TABLE COMPACT.MaxCompute executes multiple
DELETEandUPDATEstatements in jobs at a time. Each statement consumes resources and incurs fees. We recommend that you delete or update a batch of data at a time. For example, if you run a Python script to generate and submit a large number of row-level update jobs, and each statement is executed for only one row or a small number of rows of data, each statement incurs fees that correspond to the amount of input data scanned by the SQL statement and consumes the related computing resources. When multiple statements are accumulated, the costs are significantly increased and the system efficiency is reduced. Sample statements:We recommend that you execute the following statement:
UPDATE table1 SET col1= (SELECT value1 FROM table2 WHERE table1.id = table2.id AND table1.region = table2.region);We recommend that you do not execute the following statements:
UPDATE table1 SET col1=1 WHERE id='2021063001'AND region='beijing'; UPDATE table1 SET col1=2 WHERE id='2021063002'AND region='beijing';
Syntax
--Syntax 1
UPDATE <table_name> [[AS] alias] SET <col1_name> = <value1> [, <col2_name> = <value2> ...] [WHERE <where_condition>];
--Syntax 2
UPDATE <table_name> [[AS] alias] SET (<col1_name> [, <col2_name> ...]) = (<value1> [, <value2> ...]) [WHERE <where_condition>];
--Syntax 3
UPDATE <table_name> [[AS] alias]
SET <col1_name> = <value1> [, <col2_name> = <value2>, ...]
[FROM <additional_tables>]
[WHERE <where_condition>];
Parameters
-
table_name: Required. The name of the Transactional table or Delta Table on which to run the
UPDATEoperation. -
alias: Optional. The alias of the table.
-
col1_name, col2_name: Required. The name of the column to update. You must update at least one column.
-
value1, value2: Required. The new value for the column. You must update at least one column value.
-
where_condition: Optional. The WHERE clause. This clause filters data based on specified conditions. For more information, see SELECT syntax. If you do not include the WHERE clause, all data in the table is updated.
-
additional_tables: Optional. The FROM clause.
UPDATEsupports the FROM clause. The FROM clause provides a more convenient way to writeUPDATEstatements. The following examples compare the use of `UPDATE` with and without the FROM clause.Scenario
Sample code
Without FROM clause
UPDATE target SET v = (SELECT MIN(v) FROM src GROUP BY k WHERE target.k = src.key) WHERE target.k IN (SELECT k FROM src);With FROM clause
UPDATE target SET v = b.v FROM (SELECT k, MIN(v) AS v FROM src GROUP BY k) b WHERE target.k = b.k;The preceding code examples show that:
-
If you update a row in the target table using multiple rows from the source table, you must write an aggregate operation to ensure the uniqueness of the source data. This is because it is ambiguous which source row to use for the update. Writing this statement without the FROM clause is not concise. In contrast, the syntax that includes the FROM clause is simpler and easier to understand.
-
If you perform an associated update and want to update only the data at the intersection of the tables, you must add an extra WHERE condition if you omit the FROM clause. This method is less concise than using the FROM clause.
-
Examples
-
Example 1: Create a non-partitioned table named acid_update, import data into the table, and use the
UPDATEoperation to update column data for rows that meet specified conditions. Sample commands:--Create a Transactional table named acid_update. CREATE TABLE IF NOT EXISTS acid_update(id BIGINT) tblproperties ("transactional"="true"); --Insert data. INSERT OVERWRITE TABLE acid_update VALUES(1),(2),(3),(2); --View the inserted data. SELECT * FROM acid_update; --The following result is returned: +------------+ | id | +------------+ | 1 | | 2 | | 3 | | 2 | +------------+ --Update the value of the id column to 4 for all rows where id is 2. UPDATE acid_update SET id = 4 WHERE id = 2; --View the updated data. The value 2 is updated to 4. SELECT * FROM acid_update; --The following result is returned: +------------+ | id | +------------+ | 1 | | 3 | | 4 | | 4 | +------------+ -
Example 2: Create a partitioned table named acid_update, import data into the table, and use the
UPDATEoperation to update column data for rows that meet specified conditions. Sample commands:--Create a Transactional table named acid_update_pt. CREATE TABLE IF NOT EXISTS acid_update_pt(id BIGINT) PARTITIONED BY(ds STRING) tblproperties ("transactional"="true"); --Add a partition. ALTER TABLE acid_update_pt ADD IF NOT EXISTS PARTITION (ds= '2019'); --Insert data. INSERT OVERWRITE TABLE acid_update_pt PARTITION (ds='2019') VALUES(1),(2),(3); --View the inserted data. SELECT * FROM acid_update_pt WHERE ds = '2019'; --The following result is returned: +------------+------------+ | id | ds | +------------+------------+ | 1 | 2019 | | 2 | 2019 | | 3 | 2019 | +------------+------------+ --Update a column in the specified rows. Update the value of the id column to 4 for all rows where id is 2 in the partition ds='2019'. UPDATE acid_update_pt SET id = 4 WHERE ds = '2019' AND id = 2; --View the updated data. The value 2 is updated to 4. SELECT * FROM acid_update_pt WHERE ds = '2019'; --The following result is returned: +------------+------------+ | id | ds | +------------+------------+ | 4 | 2019 | | 1 | 2019 | | 3 | 2019 | +------------+------------+ -
Example 3: Create a target table named acid_update_t and an associated table named acid_update_s, and then update multiple column values at the same time. Sample commands:
--Create the target Transactional table acid_update_t and the associated table acid_update_s. CREATE TABLE IF NOT EXISTS acid_update_t(id INT,value1 INT,value2 INT) tblproperties ("transactional"="true"); CREATE TABLE IF NOT EXISTS acid_update_s(id INT,value1 INT,value2 INT); --Insert data. INSERT OVERWRITE TABLE acid_update_t VALUES(2,20,21),(3,30,31),(4,40,41); INSERT OVERWRITE TABLE acid_update_s VALUES(1,100,101),(2,200,201),(3,300,301); --Method 1: Update with constants. UPDATE acid_update_t SET (value1, value2) = (60,61); --Query the data in the target table after the update in Method 1. SELECT * FROM acid_update_t; --The following result is returned: +------------+------------+------------+ | id | value1 | value2 | +------------+------------+------------+ | 2 | 60 | 61 | | 3 | 60 | 61 | | 4 | 60 | 61 | +------------+------------+------------+ --Method 2: Perform an associated update. The rule is to left join the acid_update_t table with the acid_update_s table. UPDATE acid_update_t SET (value1, value2) = (SELECT value1, value2 FROM acid_update_s WHERE acid_update_t.id = acid_update_s.id); --Query the data in the target table after the update in Method 2. SELECT * FROM acid_update_t; --The following result is returned: +------------+------------+------------+ | id | value1 | value2 | +------------+------------+------------+ | 2 | 200 | 201 | | 3 | 300 | 301 | | 4 | NULL | NULL | +------------+------------+------------+ --Method 3 (update based on the result of Method 2): Perform an associated update. The rule is to add a filter condition to update only the intersection. UPDATE acid_update_t SET (value1, value2) = (SELECT value1, value2 FROM acid_update_s WHERE acid_update_t.id = acid_update_s.id) WHERE acid_update_t.id IN (SELECT id FROM acid_update_s); --Query the data in the target table after the update in Method 3. SELECT * FROM acid_update_t; --The following result is returned: +------------+------------+------------+ | id | value1 | value2 | +------------+------------+------------+ | 2 | 200 | 201 | | 3 | 300 | 301 | | 4 | NULL | NULL | +------------+------------+------------+ --Method 4 (update based on the result of Method 3): Perform an associated update using aggregated results. UPDATE acid_update_t SET (id, value1, value2) = (SELECT id, MAX(value1),MAX(value2) FROM acid_update_s WHERE acid_update_t.id = acid_update_s.id GROUP BY acid_update_s.id) WHERE acid_update_t.id IN (SELECT id FROM acid_update_s); --Query the data in the target table after the update in Method 4. SELECT * FROM acid_update_t; --The following result is returned: +------------+------------+------------+ | id | value1 | value2 | +------------+------------+------------+ | 2 | 200 | 201 | | 3 | 300 | 301 | | 4 | NULL | NULL | +------------+------------+------------+ -
Example 4: Perform a simple associated query that involves two tables. Sample commands:
--Create the target table acid_update_t and the associated table acid_update_s. CREATE TABLE IF NOT EXISTS acid_update_t(id BIGINT,value1 BIGINT,value2 BIGINT) tblproperties ("transactional"="true"); CREATE TABLE IF NOT EXISTS acid_update_s(id BIGINT,value1 BIGINT,value2 BIGINT); --Insert data. INSERT OVERWRITE TABLE acid_update_t VALUES(2,20,21),(3,30,31),(4,40,41); INSERT OVERWRITE TABLE acid_update_s VALUES(1,100,101),(2,200,201),(3,300,301); --Query the data in the acid_update_t table. SELECT * FROM acid_update_t; --The following result is returned: +------------+------------+------------+ | id | value1 | value2 | +------------+------------+------------+ | 2 | 20 | 21 | | 3 | 30 | 31 | | 4 | 40 | 41 | +------------+------------+------------+ --Query the data in the acid_update_s table. SELECT * FROM acid_update_s; --The following result is returned: +------------+------------+------------+ | id | value1 | value2 | +------------+------------+------------+ | 1 | 100 | 101 | | 2 | 200 | 201 | | 3 | 300 | 301 | +------------+------------+------------+ --Perform an associated update. Add a filter condition to the target table to update only the intersection. UPDATE acid_update_t SET value1 = b.value1, value2 = b.value2 FROM acid_update_s b WHERE acid_update_t.id = b.id; -- This is equivalent to the preceding statement. UPDATE acid_update_t a SET a.value1 = b.value1, a.value2 = b.value2 FROM acid_update_s b WHERE a.id = b.id; --View the updated data. 20 is updated to 200, 21 is updated to 201, 30 is updated to 300, and 31 is updated to 301. SELECT * FROM acid_update_t; --The following result is returned: +------------+------------+------------+ | id | value1 | value2 | +------------+------------+------------+ | 4 | 40 | 41 | | 2 | 200 | 201 | | 3 | 300 | 301 | +------------+------------+------------+ -
Example 5: Perform a complex associated query that involves multiple tables. Sample commands:
--Create the target table acid_update_t and the associated tables acid_update_s and acid_update_m. CREATE TABLE IF NOT EXISTS acid_update_t(id BIGINT,value1 BIGINT,value2 BIGINT) tblproperties ("transactional"="true"); CREATE TABLE IF NOT EXISTS acid_update_s(id BIGINT,value1 BIGINT,value2 BIGINT); CREATE TABLE IF NOT EXISTS acid_update_m(id BIGINT,value1 BIGINT,value2 BIGINT); --Insert data. INSERT OVERWRITE TABLE acid_update_t VALUES(2,20,21),(3,30,31),(4,40,41),(5,50,51); INSERT OVERWRITE TABLE acid_update_s VALUES (1,100,101),(2,200,201),(3,300,301),(4,400,401),(5,500,501); INSERT OVERWRITE TABLE acid_update_m VALUES(3,30,101),(4,400,201),(5,300,301); --Query the data in the acid_update_t table. SELECT * FROM acid_update_t; --The following result is returned: +------------+------------+------------+ | id | value1 | value2 | +------------+------------+------------+ | 2 | 20 | 21 | | 3 | 30 | 31 | | 4 | 40 | 41 | | 5 | 50 | 51 | +------------+------------+------------+ --Query the data in the acid_update_s table. SELECT * FROM acid_update_s; --The following result is returned: +------------+------------+------------+ | id | value1 | value2 | +------------+------------+------------+ | 1 | 100 | 101 | | 2 | 200 | 201 | | 3 | 300 | 301 | | 4 | 400 | 401 | | 5 | 500 | 501 | +------------+------------+------------+ --Query the data in the acid_update_m table. SELECT * FROM acid_update_m; --The following result is returned: +------------+------------+------------+ | id | value1 | value2 | +------------+------------+------------+ | 3 | 30 | 101 | | 4 | 400 | 201 | | 5 | 300 | 301 | +------------+------------+------------+ --Perform an associated update and filter both the source and target tables in the WHERE clause. UPDATE acid_update_t SET value1 = acid_update_s.value1, value2 = acid_update_s.value2 FROM acid_update_s WHERE acid_update_t.id = acid_update_s.id AND acid_update_s.id > 2 AND acid_update_t.value1 NOT IN (SELECT value1 FROM acid_update_m WHERE id = acid_update_t.id) AND acid_update_s.value1 NOT IN (SELECT value1 FROM acid_update_m WHERE id = acid_update_s.id); --View the updated data. In the acid_update_t table, only the data of the row where id is 5 meets the conditions. The value of value1 is updated to 500, and the value of value2 is updated to 501. SELECT * FROM acid_update_t; --The following result is returned: +------------+------------+------------+ | id | value1 | value2 | +------------+------------+------------+ | 5 | 500 | 501 | | 2 | 20 | 21 | | 3 | 30 | 31 | | 4 | 40 | 41 | +------------+------------+------------+ -
Example 6: Use the following command to create a Delta table named mf_dt, import data, and execute an
UPDATEoperation to delete rows that meet a specified condition:--Create the target Delta Table mf_dt. CREATE TABLE IF NOT EXISTS mf_dt (pk BIGINT NOT NULL PRIMARY KEY, val BIGINT NOT NULL) PARTITIONED BY(dd STRING, hh STRING) tblproperties ("transactional"="true"); --Insert data. INSERT OVERWRITE TABLE mf_dt PARTITION (dd='01', hh='02') VALUES (1, 1), (2, 2), (3, 3); --View the inserted data. SELECT * FROM mf_dt WHERE dd='01' AND hh='02'; --The following result is returned: +------------+------------+----+----+ | pk | val | dd | hh | +------------+------------+----+----+ | 1 | 1 | 01 | 02 | | 3 | 3 | 01 | 02 | | 2 | 2 | 01 | 02 | +------------+------------+----+----+ --Update a column in the specified rows. For all rows where pk is 3 in the partition dd='01' and hh='02', update the value of the val column to 30. --Method 1 UPDATE mf_dt SET val = 30 WHERE pk = 3 AND dd='01' AND hh='02'; --Method 2 UPDATE mf_dt SET val = delta.val FROM (SELECT pk, val FROM VALUES (3, 30) t (pk, val)) delta WHERE delta.pk = mf_dt.pk AND mf_dt.dd='01' AND mf_dt.hh='02'; --View the updated data. SELECT * FROM mf_dt WHERE dd='01' AND hh='02'; --The following result is returned. The value of the val column for the row where pk is 3 is updated to 30. +------------+------------+----+----+ | pk | val | dd | hh | +------------+------------+----+----+ | 1 | 1 | 01 | 02 | | 3 | 30 | 01 | 02 | | 2 | 2 | 01 | 02 | +------------+------------+----+----+
Related statements
-
DELETE: Deletes one or more rows that meet specified conditions from a Transactional table or Delta Table.
-
ALTER TABLE: Merges the files of a Transactional table.