All Products
Search
Document Center

MaxCompute:DELETE

Last Updated:Aug 02, 2024

Deletes one or more rows of data that meets the specified conditions from a partitioned transactional table or a non-partitioned transactional table.

Prerequisites

You are granted the Select and Update permissions on the transactional table on which you want to execute the DELETE or UPDATE statement. For more information, see MaxCompute permissions.

Limits

The DELETE statement, UPDATE statement, and transactional table on which the DELETE or UPDATE statement is executed have the following limits:

  • You can execute the DELETE and UPDATE statements only on transactional tables. For more information about how to create a transactional table, see CREATE TABLE.

  • MaxCompute allows you to set a table to a transactional table only when you create the table. If you execute the ALTER TABLE statement to change an existing table to a transactional table, an error is returned.

    alter table not_txn_tbl set tblproperties("transactional"="true");
    -- The following error is returned: 
    FAILED: Catalog Service Failed, ErrorCode: 151, Error Message: Set transactional is not supported
  • When you create a clustered table or an external table, you cannot set it to a transactional table.

  • You cannot convert between transactional tables and MaxCompute internal tables, external tables, or clustered tables.

  • Transactional table files cannot be automatically merged. You must manually merge transactional table files. For more information, see the ALTER TABLE COMPACT section in UPDATE and DELETE.

  • The MERGE PARTITION operation is not supported.

  • Specific limits are imposed on access to transactional tables from jobs of other systems. For example, if your job is a Graph job, you cannot use the job to read data from or write data to a transactional table. If your job is a Spark job or a Platform for AI (PAI) job, you can use the job to only read data from a transactional table and you cannot use the job to write data to a transactional table.

  • Before you execute the UPDATE, DELETE, or INSERT OVERWRITE statement on important data in transactional tables, you must execute the SELECT and INSERT statements to back up the data to other tables.

Precautions

When you execute the DELETE or UPDATE statement to delete or update data in tables or partitions of the tables, take note of the following items:

  • In specific scenarios, you may want to execute the DELETE or UPDATE statement for a small amount of data in a table and infrequently perform read and other operations in subsequent procedures. To reduce the storage space that is occupied by the table, we recommend that you merge the base files with all delta files after you execute the DELETE or UPDATE statement for the table several times. For more information, see Merge files of a transactional table.

  • In specific scenarios, you may want to delete or update more than 5% of the data in a table or a partition of the table at a low frequency and perform frequent read operations in subsequent procedures. We recommend that you execute the INSERT OVERWRITE or INSERT INTO statement in such scenarios. For more information, see INSERT INTO|OVERWRITE.

    For example, you want to perform delete or update operations for 10% of data 10 times each day. In this case, we recommend that you estimate the total cost and the consumption of the subsequent read performance if you execute the DELETE or UPDATE statement on the table. Then, compare the estimated result with that of executing the INSERT OVERWRITE or INSERT INTO statement. This helps you choose an efficient method.

  • MaxCompute executes multiple DELETE and UPDATE statements in jobs at a time. Each statement consumes resources and incurs fees. We recommend that you delete or update a batch of data at a time. For example, if you run a Python script to generate and submit a large number of row-level update jobs, and each statement is executed for only one row or a small number of rows of data, each statement incurs fees that correspond to the amount of input data scanned by the SQL statement and consumes the related computing resources. When multiple statements are accumulated, the costs are significantly increased and the system efficiency is reduced. Examples:

    -- We recommend that you execute the following statement: 
    update table1 set col1= (select value1 from table2 where table1.id = table2.id and table1.region = table2.region);
    
    -- We recommend that you do not execute the following statements: 
    update table1 set col1=1 where id='2021063001'and region='beijing';                  
    update table1 set col1=2 where id='2021063002'and region='beijing';

Syntax

delete from <table_name> [where <where_condition>];

Parameters

Parameter

Required

Description

table_name

Yes

The name of the transactional table on which you want to execute the DELETE statement.

where_condition

No

A WHERE clause that is used to filter data based on conditions. For more information about the WHERE clause, see WHERE clause (where_condition). If you execute the UPDATE statement on a table without a WHERE clause, all data in the table is deleted.

Examples

  • Example 1: Create a non-partitioned transactional table named acid_delete and insert data into the table. Then, execute the DELETE statement to delete the rows that meet the specified conditions from the table. Sample statements:

    -- Create a non-partitioned transactional table named acid_delete. 
    create table if not exists acid_delete(id bigint) tblproperties ("transactional"="true"); 
    
    -- Insert data into the destination table and associated table. 
    insert overwrite table acid_delete values(1),(2),(3),(2); 
    
    -- Query the table to check whether data is inserted. 
    select * from acid_delete; 
    
    +------------+
    | id         |
    +------------+
    | 1          |
    | 2          |
    | 3          |
    | 2          |
    +------------+
    
    -- Delete the rows whose value of the id column is 2. If you execute the statement on the MaxCompute client (odpscmd), you must enter yes or no to confirm the deletion. 
    delete from acid_delete where id = 2; 
    
    -- Query the table to check whether the table contains only the rows whose values of the id column are 1 and 3. 
    select * from acid_delete; 
    
    +------------+
    | id         |
    +------------+
    | 1          |
    | 3          |
    +------------+
  • Example 2: Create a partitioned transactional table named acid_delete_pt and insert data into the table. Then, execute the DELETE statement to delete the rows that meet the specified conditions from the table. Sample statements:

    -- Create a partitioned transactional table named acid_delete_pt.  
    create table if not exists acid_delete_pt(id bigint) partitioned by(ds string) tblproperties ("transactional"="true");
    
    -- Add partitions to the table. 
    alter table acid_delete_pt add if not exists partition (ds= '2019');
    alter table acid_delete_pt add if not exists partition (ds= '2018');
    
    -- Insert data into the destination table and associated table. 
    insert overwrite table acid_delete_pt partition (ds='2019') values(1),(2),(3);
    insert overwrite table acid_delete_pt partition (ds='2018') values(1),(2),(3);
    
    -- Query the table to check whether data is inserted. 
    select * from acid_delete_pt;
    
    +------------+------------+
    | id         | ds         |
    +------------+------------+
    | 1          | 2018       |
    | 2          | 2018       |
    | 3          | 2018       |
    | 1          | 2019       |
    | 2          | 2019       |
    | 3          | 2019       |
    +------------+------------+
    
    -- Delete the rows whose values of the id and ds columns are 2 and 2019. If you execute the statement on the MaxCompute client (odpscmd), you must enter yes or no to confirm the deletion. 
    delete from acid_delete_pt where ds='2019' and id = 2;
    
    -- Query the table to check whether the rows whose values of the id and ds columns are 2 and 2019 are deleted. 
    select * from acid_delete_pt;
    
    +------------+------------+
    | id         | ds         |
    +------------+------------+
    | 1          | 2018       |
    | 2          | 2018       |
    | 3          | 2018       |
    | 1          | 2019       |
    | 3          | 2019       |
    +------------+------------+
  • Example 3: Create a destination table named acid_delete_t and an associated table named acid_delete_s. Then, delete the rows that meet the specified conditions from the destination table based on the associated table. Sample statements:

    -- Create a destination table named acid_delete_t and an associated table named acid_delete_s. 
    create table if not exists acid_delete_t(id int,value1 int,value2 int) tblproperties ("transactional"="true");
    create table if not exists acid_delete_s(id int,value1 int,value2 int);
    
    -- Insert data into the destination table and associated table. 
    insert overwrite table acid_delete_t values(2,20,21),(3,30,31),(4,40,41);
    insert overwrite table acid_delete_s values(1,100,101),(2,200,201),(3,300,301);
    
    -- Delete the rows in the acid_delete_t table whose value of the id column does not match that of the rows in the acid_delete_s table. If you want to execute the statement on the MaxCompute client (odpscmd), you must enter yes or no to confirm the deletion. 
    delete from acid_delete_t where not exists (select * from acid_delete_s where acid_delete_t.id=acid_delete_s.id);
    
    -- Query the acid_delete_t table to check whether the table contains only the rows whose values of the id column are 2 and 3. 
    select * from acid_delete_t;
    
    +------------+------------+------------+
    | id         | value1     | value2     |
    +------------+------------+------------+
    | 2          | 20         | 21         |
    | 3          | 30         | 31         |
    +------------+------------+------------+

Related statements

  • UPDATE: updates the values of one or more columns of the rows that meet the specified conditions in partitioned transactional tables or non-partitioned transactional tables.

  • ALTER TABLE: merges transactional table files.