×
Community Blog Updating Data in MaxCompute Non-transactional Tables

Updating Data in MaxCompute Non-transactional Tables

This article describes methods to update data in MaxCompute non-transactional tables through the insert overwrite statement.

By Long Qingyun

Background

It is very complex for most storage formats in big data to support random updates, because it needs to scan large files. Therefore, MaxCompute provides the latest feature: transactional tables to update and delete statements.

However, update and delete statements do not apply to frequent updates, data deletion, or real-time writing to target tables. Moreover, the statements are not supported for non-transactional tables. This article describes methods to update data through the insert overwrite statement.

Create Table and Insert Data

create table update_table(ID int,
 tranValue string,
 last_update_user string) PARTITIONED by(dt STRING ) LIFECYCLE 1;
INSERT INTO update_table PARTITION (dt="20210510") VALUES
(1, 'value_01', 'creation'),
(2, 'value_02', 'creation'),
(3, 'value_03', 'creation'),
(4, 'value_04', 'creation'),
(5, 'value_05', 'creation'),
(6, 'value_06', 'creation'),
(7, 'value_07', 'creation'),
(8, 'value_08', 'creation'),
(9, 'value_09', 'creation'),
(10, 'value_10','creation');

Update a Data Record

Update the string to value_011 if the id is 1.

--Update a data record
INSERT OVERWRITE TABLE update_table PARTITION( dt)
SELECT  id
        ,CASE    WHEN id=1 THEN "value_011" 
                 ELSE TranValue 
         END TranValue
        ,last_update_user
        ,dt
FROM    update_table
WHERE   dt = "20210510"
;

Update Multiple Data Records

Create an incremental table and insert data based on the update.

create table update_table_inc(ID int,
 TranValue string,
 last_update_user string) LIFECYCLE 1;
INSERT INTO update_table_inc VALUES 
(5, 'value_11', 'creation'),
(6, NULL, '20170410'),
(7, 'value22', '20170413');

If the id is 5 and 7, update the TranValue, because the TranValue is null and not updated when the id is 6

INSERT OVERWRITE TABLE update_table PARTITION( dt)
SELECT  a.id
        ,CASE    WHEN a.id=b.id  and b.TranValue is not null THEN b.TranValue 
                 ELSE a.TranValue 
         END TranValue
        ,CASE    WHEN a.id=b.id and b.TranValue is not null THEN b.last_update_user 
                 ELSE a.last_update_user 
         END last_update_user
         ,dt
FROM    update_table a
LEFT JOIN update_table_inc b
ON      a.id = b.id
WHERE   a.dt = "20210510"
;

Delete Data

--Delete data

INSERT OVERWRITE TABLE update_table PARTITION( dt)
SELECT *
       
FROM    update_table
WHERE   dt = "20210510" and id !=4
;
0 1 0
Share on

Alibaba Cloud MaxCompute

137 posts | 19 followers

You may also like

Comments

Alibaba Cloud MaxCompute

137 posts | 19 followers

Related Products