All Products
Search
Document Center

MaxCompute:Insert or overwrite data (INSERT INTO | INSERT OVERWRITE)

Last Updated:Jan 19, 2026

This topic describes how to use the INSERT INTO and INSERT OVERWRITE statements to insert data into or overwrite data in a table or a static partition.

Prerequisites

Before you execute the INSERT INTO or INSERT OVERWRITE statement, ensure that you have the Update permission on the destination table and the Select permission on the metadata of the source table. For more information, see MaxCompute permissions.

Features

You can use the INSERT INTO or INSERT OVERWRITE statement to save the execution results of SELECT statements to a destination table. Differences between the two statements:

  • INSERT INTO: Appends data to a table or a static partition. You can specify values for partition key columns to insert data into a specific partition. For inserting small amounts of test data, use the VALUES clause. See VALUES.

  • INSERT OVERWRITE: Clears the existing data in a specified table or static partition and inserts new data.

    Note
    • MaxCompute INSERT syntax differs from MySQL or Oracle. You must include the TABLE keyword after INSERT OVERWRITE. For INSERT INTO, the TABLE keyword is optional.

    • Repeatedly executing INSERT OVERWRITE on the same partition may result in different file splitting logic, causing variations in file size when viewed with the DESC command. However, the total data size remains consistent and storage costs are unaffected.

For information about inserting data into dynamic partitions, see Insert or overwrite data into dynamic partitions (DYNAMIC PARTITION).

Limitations

  • General limits:

    • INSERT INTO: Cannot be used to insert data into clustered tables.

    • INSERT OVERWRITE: Does not support specifying target columns. To insert data into specific columns, use INSERT INTO. Example: INSERT INTO t(a) VALUES ('1'); inserts '1' into column 'a' and NULL (or default value) into other columns.

    • MaxCompute does not provide table locking during INSERT operations. Avoid executing multiple INSERT INTO or INSERT OVERWRITE statements on the same table simultaneously.

  • Limits for Delta tables:

    • INSERT OVERWRITE: Automatically deduplicates data based on primary keys, keeping only the first record. It overwrites the entire table or partition. Default deduplication ensures primary key uniqueness.

    • INSERT INTO: By default, inserts all data without deduplication, even if primary keys conflict. To enable deduplication, set odps.sql.insert.acidtable.deduplicate.enable to true.

Syntax

INSERT {INTO|OVERWRITE} TABLE <table_name> [PARTITION (<pt_spec>)] [(<col_name> [,<col_name> ...)]]
<select_statement>
FROM <from_statement>
[ZORDER BY <zcol_name> [, <zcol_name> ...]];

Parameters:

Parameter

Required

Description

table_name

Yes

The name of the destination table.

pt_spec

No

The destination partition. Must use constants (expressions are not supported). Format: (partition_col1 = value1, partition_col2 = value2, ...).

col_name

No

The target column name. Not supported with INSERT OVERWRITE.

select_statement

Yes

The SELECT clause querying the source data. See SELECT syntax.

Note
  • Mappings are based on column order, not column names.

  • For static partitions, do not include partition key columns in the select_statement.

from_statement

Yes

The FROM clause specifying the data source (e.g., source table name).

ZORDER BY <zcol_name> [, <zcol_name> ...]

No

Co-locates rows with similar data records to improve query filtering and compression. ZORDER BY x, y groups rows with similar x and y values.

ZORDER BY vs. SORT BY:

  • ZORDER BY: Supports local (default) and global modes. Local Z-Ordering sorts data within a single file. Global Z-Ordering (requires set odps.sql.default.zorder.type=global;) sorts data globally for better data skipping.

    ZORDER BY limits:

    • Sorts only one partition at a time.

    • Supports 2 to 4 fields.

    • Not supported for clustered tables.

    • Can be used with DISTRIBUTE BY but not with ORDER BY, CLUSTER BY, or SORT BY.

    Note

    Z-Ordering consumes more resources and time during insertion.

  • SORT BY: Sorts data within a single file. If SORT BY is not specified, local Z-Ordering applies by default.

Examples: Standard tables

  • Example 1: Append data to a non-partitioned table

    -- Create tables
    CREATE TABLE IF NOT EXISTS websites (id INT, name STRING, url STRING);
    CREATE TABLE IF NOT EXISTS apps (id INT, app_name STRING, url STRING);
    
    -- Append data to the apps table
    INSERT INTO apps (id, app_name, url) VALUES (1, 'Aliyun', 'https://www.aliyun.com');
    
    -- Copy data from apps to websites
    INSERT INTO websites (id, name, url) SELECT id, app_name, url FROM apps;
    
    -- Verify data
    SELECT * FROM websites;

    Result:

    +------------+------------+------------+
    | id         | name       | url        |
    +------------+------------+------------+
    | 1          | Aliyun     | https://www.aliyun.com |
    +------------+------------+------------+
  • Example 2: Append data to a partitioned table

    -- Create table
    CREATE TABLE IF NOT EXISTS sale_detail (
      shop_name     STRING,
      customer_id   STRING,
      total_price   DOUBLE
    ) PARTITIONED BY (sale_date STRING, region STRING);
    
    -- Add partition (optional, can be created automatically)
    ALTER TABLE sale_detail ADD PARTITION (sale_date='2013', region='china');
    
    -- Append data
    INSERT INTO sale_detail PARTITION (sale_date='2013', region='china') 
    VALUES ('s1','c1',100.1),('s2','c2',100.2),('s3','c3',100.3);
    
    -- Verify data
    SET odps.sql.allow.fullscan=true; 
    SELECT * FROM sale_detail;

    Result:

    +------------+-------------+-------------+------------+------------+
    | shop_name  | customer_id | total_price | sale_date  | region     |
    +------------+-------------+-------------+------------+------------+
    | s1         | c1          | 100.1       | 2013       | china      |
    | s2         | c2          | 100.2       | 2013       | china      |
    | s3         | c3          | 100.3       | 2013       | china      |
    +------------+-------------+-------------+------------+------------+
  • Example 3: Overwrite data

    -- Create table with same schema
    CREATE TABLE sale_detail_insert LIKE sale_detail;
    
    -- Add partition (optional)
    ALTER TABLE sale_detail_insert ADD PARTITION (sale_date='2013', region='china');
    
    -- Overwrite data. Partition columns (sale_date, region) are specified in PARTITION(), so do not include them in SELECT.
    SET odps.sql.allow.fullscan=true;
    INSERT OVERWRITE TABLE sale_detail_insert PARTITION (sale_date='2013', region='china')
      SELECT shop_name, customer_id, total_price 
      FROM sale_detail
      ZORDER BY customer_id, total_price;
    
    -- Verify data
    SELECT * FROM sale_detail_insert;

    Result:

    +------------+-------------+-------------+------------+------------+
    | shop_name  | customer_id | total_price | sale_date  | region     |
    +------------+-------------+-------------+------------+------------+
    | s1         | c1          | 100.1       | 2013       | china      |
    | s2         | c2          | 100.2       | 2013       | china      |
    | s3         | c3          | 100.3       | 2013       | china      |
    +------------+-------------+-------------+------------+------------+
  • Example 4: Overwrite data with adjusted column order

    Data mapping is based on column order, not names.

    SET odps.sql.allow.fullscan=true;
    INSERT OVERWRITE TABLE sale_detail_insert PARTITION (sale_date='2013', region='china')
        SELECT customer_id, shop_name, total_price FROM sale_detail;    
    
    SELECT * FROM sale_detail_insert;                  

    Result (Notice the swapped columns):

    +------------+-------------+-------------+------------+------------+
    | shop_name  | customer_id | total_price | sale_date  | region     |
    +------------+-------------+-------------+------------+------------+
    | c1         | s1          | 100.1       | 2013       | china      |
    | c2         | s2          | 100.2       | 2013       | china      |
    | c3         | s3          | 100.3       | 2013       | china      |
    +------------+-------------+-------------+------------+------------+
  • Example 5: Incorrect usage (Partition columns in SELECT)

    Do not include partition key columns in the SELECT clause when inserting into a static partition.

    -- This will fail because sale_date and region are partition keys
    INSERT OVERWRITE TABLE sale_detail_insert PARTITION (sale_date='2013', region='china')
       SELECT shop_name, customer_id, total_price, sale_date, region FROM sale_detail;
  • Example 6: Incorrect usage (Expressions in PARTITION)

    PARTITION() values must be constants.

    -- This will fail because datepart() is an expression
    INSERT OVERWRITE TABLE sale_detail_insert PARTITION (sale_date=datepart('2016-09-18 01:10:00', 'yyyy') , region='china')
       SELECT shop_name, customer_id, total_price FROM sale_detail;
  • Example 7: Global Z-Ordering

    -- Create source table and insert data
    CREATE TABLE mf_src (key STRING, value STRING);
    INSERT OVERWRITE TABLE mf_src
    SELECT a, b FROM VALUES ('1', '1'),('3', '3'),('2', '2') AS t(a, b);
    
    -- Create target table
    CREATE TABLE mf_zorder_src LIKE mf_src;
    
    -- Insert with global Z-Ordering
    SET odps.sql.default.zorder.type=global;
    INSERT OVERWRITE TABLE mf_zorder_src
    SELECT key, value FROM mf_src 
    ZORDER BY key, value;
    
    SELECT * FROM mf_zorder_src;

    Result:

    +-----+-------+
    | key | value |
    +-----+-------+
    | 1   | 1     |
    | 2   | 2     |
    | 3   | 3     |
    +-----+-------+
  • Example 8: Overwrite with Z-Ordering

    SET odps.sql.default.zorder.type=global;
    INSERT OVERWRITE TABLE target
    SELECT key, value FROM target 
    ZORDER BY key, value;

Examples: Delta tables

Operations on Delta tables, including automatic deduplication.

-- Create a Delta table
CREATE TABLE IF NOT EXISTS mf_dt (
  pk BIGINT NOT NULL PRIMARY KEY, 
  val BIGINT NOT NULL
) PARTITIONED BY (dd STRING, hh STRING) 
TBLPROPERTIES ("transactional"="true");
                  
-- INSERT OVERWRITE (Auto-deduplication: keeps one row for pk=1, 2, 3)
INSERT OVERWRITE TABLE mf_dt PARTITION (dd='01', hh='01') 
                 VALUES (1, 1), (2, 2), (3, 3);
            
SELECT * FROM mf_dt WHERE dd='01' AND hh='01';
-- Result:
+------------+------------+----+----+
| pk         | val        | dd | hh |
+------------+------------+----+----+
| 1          | 1          | 01 | 01 |
| 3          | 3          | 01 | 01 |
| 2          | 2          | 01 | 01 |
+------------+------------+----+----+

-- INSERT INTO (Appends data, no deduplication by default)
INSERT INTO TABLE mf_dt PARTITION(dd='01', hh='01') 
            VALUES (3, 30), (4, 4), (5, 5);
            
SELECT * FROM mf_dt WHERE dd='01' AND hh='01';
-- Result (Note: pk=3 is duplicated):
+------------+------------+----+----+
| pk         | val        | dd | hh |
+------------+------------+----+----+
| 1          | 1          | 01 | 01 |
| 3          | 30         | 01 | 01 |
| 4          | 4          | 01 | 01 |
| 5          | 5          | 01 | 01 |
| 2          | 2          | 01 | 01 |
| 3          | 3          | 01 | 01 |
+------------+------------+----+----+

-- INSERT OVERWRITE (Overwrites partition, removes duplicates)
INSERT OVERWRITE TABLE mf_dt PARTITION (dd='01', hh='01') 
                 VALUES (1, 1), (2, 2), (3, 3);

SELECT * FROM mf_dt WHERE dd='01' AND hh='01';
-- Result: 
+------------+------------+----+----+
| pk         | val        | dd | hh |
+------------+------------+----+----+
| 1          | 1          | 01 | 01 |
| 3          | 3          | 01 | 01 |
| 2          | 2          | 01 | 01 |
+------------+------------+----+----+

Best practices for Z-Ordering

Evaluate the trade-off between sorting costs (compute) and query benefits (storage/downstream compute). Z-Ordering is not a one-size-fits-all solution.

When to prefer Clustered Indexing (ORDER BY)

  • Prefix filtering: If filters always use a prefix (e.g., `a`, `a AND b`, `a AND b AND c`), `ORDER BY a, b, c` is more effective than `ZORDER BY` because `ORDER BY` prioritizes the first column.

  • Join optimization: For frequent join keys, hash clustering or range clustering works better. Clustered indexing helps the SQL engine optimize joins (e.g., sort-merge join), whereas Z-Ordering is often invisible to the engine for such optimizations.

  • Grouping/Sorting: If you frequently `GROUP BY` or `ORDER BY` specific fields, clustered indexing provides better performance.

Suggestions for using Z-Ordering

  • Select fields frequently used in filters, especially those used in joins for filtering.

  • Limit the number of ZORDER columns to 4. Too many columns degrade performance. For single-column sorting, use clustered indexing.

  • Avoid fields with very low cardinality (e.g., gender) or extremely high cardinality. Low cardinality offers little sorting benefit. High cardinality increases sorting memory costs.

  • Ensure the data volume is moderate. Small tables don't benefit much; very large tables incur high sorting latency.