This topic describes how to use the INSERT INTO and INSERT OVERWRITE statements to insert data into or overwrite data in a table or a static partition.
Supported platforms:
Prerequisites
Before you execute the INSERT INTO or INSERT OVERWRITE statement, ensure that you have the Update permission on the destination table and the Select permission on the metadata of the source table. For more information, see MaxCompute permissions.
Features
You can use the INSERT INTO or INSERT OVERWRITE statement to save the execution results of SELECT statements to a destination table. Differences between the two statements:
INSERT INTO: Appends data to a table or a static partition. You can specify values for partition key columns to insert data into a specific partition. For inserting small amounts of test data, use theVALUESclause. See VALUES.INSERT OVERWRITE: Clears the existing data in a specified table or static partition and inserts new data.NoteMaxCompute
INSERTsyntax differs from MySQL or Oracle. You must include theTABLEkeyword afterINSERT OVERWRITE. ForINSERT INTO, theTABLEkeyword is optional.Repeatedly executing
INSERT OVERWRITEon the same partition may result in different file splitting logic, causing variations in file size when viewed with theDESCcommand. However, the total data size remains consistent and storage costs are unaffected.
For information about inserting data into dynamic partitions, see Insert or overwrite data into dynamic partitions (DYNAMIC PARTITION).
Limitations
General limits:
INSERT INTO: Cannot be used to insert data into clustered tables.INSERT OVERWRITE: Does not support specifying target columns. To insert data into specific columns, useINSERT INTO. Example:INSERT INTO t(a) VALUES ('1');inserts '1' into column 'a' and NULL (or default value) into other columns.MaxCompute does not provide table locking during INSERT operations. Avoid executing multiple
INSERT INTOorINSERT OVERWRITEstatements on the same table simultaneously.
Limits for Delta tables:
INSERT OVERWRITE: Automatically deduplicates data based on primary keys, keeping only the first record. It overwrites the entire table or partition. Default deduplication ensures primary key uniqueness.INSERT INTO: By default, inserts all data without deduplication, even if primary keys conflict. To enable deduplication, setodps.sql.insert.acidtable.deduplicate.enabletotrue.
Syntax
INSERT {INTO|OVERWRITE} TABLE <table_name> [PARTITION (<pt_spec>)] [(<col_name> [,<col_name> ...)]]
<select_statement>
FROM <from_statement>
[ZORDER BY <zcol_name> [, <zcol_name> ...]];Parameters:
Parameter | Required | Description |
table_name | Yes | The name of the destination table. |
pt_spec | No | The destination partition. Must use constants (expressions are not supported). Format: |
col_name | No | The target column name. Not supported with |
select_statement | Yes | The Note
|
from_statement | Yes | The |
ZORDER BY <zcol_name> [, <zcol_name> ...] | No | Co-locates rows with similar data records to improve query filtering and compression. |
ZORDER BY vs. SORT BY:
ZORDER BY: Supports local (default) and global modes. Local Z-Ordering sorts data within a single file. Global Z-Ordering (requiresset odps.sql.default.zorder.type=global;) sorts data globally for better data skipping.ZORDER BYlimits:Sorts only one partition at a time.
Supports 2 to 4 fields.
Not supported for clustered tables.
Can be used with
DISTRIBUTE BYbut not withORDER BY,CLUSTER BY, orSORT BY.
NoteZ-Ordering consumes more resources and time during insertion.
SORT BY: Sorts data within a single file. IfSORT BYis not specified, local Z-Ordering applies by default.
Examples: Standard tables
Example 1: Append data to a non-partitioned table
-- Create tables CREATE TABLE IF NOT EXISTS websites (id INT, name STRING, url STRING); CREATE TABLE IF NOT EXISTS apps (id INT, app_name STRING, url STRING); -- Append data to the apps table INSERT INTO apps (id, app_name, url) VALUES (1, 'Aliyun', 'https://www.aliyun.com'); -- Copy data from apps to websites INSERT INTO websites (id, name, url) SELECT id, app_name, url FROM apps; -- Verify data SELECT * FROM websites;Result:
+------------+------------+------------+ | id | name | url | +------------+------------+------------+ | 1 | Aliyun | https://www.aliyun.com | +------------+------------+------------+Example 2: Append data to a partitioned table
-- Create table CREATE TABLE IF NOT EXISTS sale_detail ( shop_name STRING, customer_id STRING, total_price DOUBLE ) PARTITIONED BY (sale_date STRING, region STRING); -- Add partition (optional, can be created automatically) ALTER TABLE sale_detail ADD PARTITION (sale_date='2013', region='china'); -- Append data INSERT INTO sale_detail PARTITION (sale_date='2013', region='china') VALUES ('s1','c1',100.1),('s2','c2',100.2),('s3','c3',100.3); -- Verify data SET odps.sql.allow.fullscan=true; SELECT * FROM sale_detail;Result:
+------------+-------------+-------------+------------+------------+ | shop_name | customer_id | total_price | sale_date | region | +------------+-------------+-------------+------------+------------+ | s1 | c1 | 100.1 | 2013 | china | | s2 | c2 | 100.2 | 2013 | china | | s3 | c3 | 100.3 | 2013 | china | +------------+-------------+-------------+------------+------------+Example 3: Overwrite data
-- Create table with same schema CREATE TABLE sale_detail_insert LIKE sale_detail; -- Add partition (optional) ALTER TABLE sale_detail_insert ADD PARTITION (sale_date='2013', region='china'); -- Overwrite data. Partition columns (sale_date, region) are specified in PARTITION(), so do not include them in SELECT. SET odps.sql.allow.fullscan=true; INSERT OVERWRITE TABLE sale_detail_insert PARTITION (sale_date='2013', region='china') SELECT shop_name, customer_id, total_price FROM sale_detail ZORDER BY customer_id, total_price; -- Verify data SELECT * FROM sale_detail_insert;Result:
+------------+-------------+-------------+------------+------------+ | shop_name | customer_id | total_price | sale_date | region | +------------+-------------+-------------+------------+------------+ | s1 | c1 | 100.1 | 2013 | china | | s2 | c2 | 100.2 | 2013 | china | | s3 | c3 | 100.3 | 2013 | china | +------------+-------------+-------------+------------+------------+Example 4: Overwrite data with adjusted column order
Data mapping is based on column order, not names.
SET odps.sql.allow.fullscan=true; INSERT OVERWRITE TABLE sale_detail_insert PARTITION (sale_date='2013', region='china') SELECT customer_id, shop_name, total_price FROM sale_detail; SELECT * FROM sale_detail_insert;Result (Notice the swapped columns):
+------------+-------------+-------------+------------+------------+ | shop_name | customer_id | total_price | sale_date | region | +------------+-------------+-------------+------------+------------+ | c1 | s1 | 100.1 | 2013 | china | | c2 | s2 | 100.2 | 2013 | china | | c3 | s3 | 100.3 | 2013 | china | +------------+-------------+-------------+------------+------------+Example 5: Incorrect usage (Partition columns in SELECT)
Do not include partition key columns in the
SELECTclause when inserting into a static partition.-- This will fail because sale_date and region are partition keys INSERT OVERWRITE TABLE sale_detail_insert PARTITION (sale_date='2013', region='china') SELECT shop_name, customer_id, total_price, sale_date, region FROM sale_detail;Example 6: Incorrect usage (Expressions in PARTITION)
PARTITION()values must be constants.-- This will fail because datepart() is an expression INSERT OVERWRITE TABLE sale_detail_insert PARTITION (sale_date=datepart('2016-09-18 01:10:00', 'yyyy') , region='china') SELECT shop_name, customer_id, total_price FROM sale_detail;
Example 7: Global Z-Ordering
-- Create source table and insert data CREATE TABLE mf_src (key STRING, value STRING); INSERT OVERWRITE TABLE mf_src SELECT a, b FROM VALUES ('1', '1'),('3', '3'),('2', '2') AS t(a, b); -- Create target table CREATE TABLE mf_zorder_src LIKE mf_src; -- Insert with global Z-Ordering SET odps.sql.default.zorder.type=global; INSERT OVERWRITE TABLE mf_zorder_src SELECT key, value FROM mf_src ZORDER BY key, value; SELECT * FROM mf_zorder_src;Result:
+-----+-------+ | key | value | +-----+-------+ | 1 | 1 | | 2 | 2 | | 3 | 3 | +-----+-------+Example 8: Overwrite with Z-Ordering
SET odps.sql.default.zorder.type=global; INSERT OVERWRITE TABLE target SELECT key, value FROM target ZORDER BY key, value;
Examples: Delta tables
Operations on Delta tables, including automatic deduplication.
-- Create a Delta table
CREATE TABLE IF NOT EXISTS mf_dt (
pk BIGINT NOT NULL PRIMARY KEY,
val BIGINT NOT NULL
) PARTITIONED BY (dd STRING, hh STRING)
TBLPROPERTIES ("transactional"="true");
-- INSERT OVERWRITE (Auto-deduplication: keeps one row for pk=1, 2, 3)
INSERT OVERWRITE TABLE mf_dt PARTITION (dd='01', hh='01')
VALUES (1, 1), (2, 2), (3, 3);
SELECT * FROM mf_dt WHERE dd='01' AND hh='01';
-- Result:
+------------+------------+----+----+
| pk | val | dd | hh |
+------------+------------+----+----+
| 1 | 1 | 01 | 01 |
| 3 | 3 | 01 | 01 |
| 2 | 2 | 01 | 01 |
+------------+------------+----+----+
-- INSERT INTO (Appends data, no deduplication by default)
INSERT INTO TABLE mf_dt PARTITION(dd='01', hh='01')
VALUES (3, 30), (4, 4), (5, 5);
SELECT * FROM mf_dt WHERE dd='01' AND hh='01';
-- Result (Note: pk=3 is duplicated):
+------------+------------+----+----+
| pk | val | dd | hh |
+------------+------------+----+----+
| 1 | 1 | 01 | 01 |
| 3 | 30 | 01 | 01 |
| 4 | 4 | 01 | 01 |
| 5 | 5 | 01 | 01 |
| 2 | 2 | 01 | 01 |
| 3 | 3 | 01 | 01 |
+------------+------------+----+----+
-- INSERT OVERWRITE (Overwrites partition, removes duplicates)
INSERT OVERWRITE TABLE mf_dt PARTITION (dd='01', hh='01')
VALUES (1, 1), (2, 2), (3, 3);
SELECT * FROM mf_dt WHERE dd='01' AND hh='01';
-- Result:
+------------+------------+----+----+
| pk | val | dd | hh |
+------------+------------+----+----+
| 1 | 1 | 01 | 01 |
| 3 | 3 | 01 | 01 |
| 2 | 2 | 01 | 01 |
+------------+------------+----+----+Best practices for Z-Ordering
Evaluate the trade-off between sorting costs (compute) and query benefits (storage/downstream compute). Z-Ordering is not a one-size-fits-all solution.
When to prefer Clustered Indexing (ORDER BY)
Prefix filtering: If filters always use a prefix (e.g., `a`, `a AND b`, `a AND b AND c`), `ORDER BY a, b, c` is more effective than `ZORDER BY` because `ORDER BY` prioritizes the first column.
Join optimization: For frequent join keys, hash clustering or range clustering works better. Clustered indexing helps the SQL engine optimize joins (e.g., sort-merge join), whereas Z-Ordering is often invisible to the engine for such optimizations.
Grouping/Sorting: If you frequently `GROUP BY` or `ORDER BY` specific fields, clustered indexing provides better performance.
Suggestions for using Z-Ordering
Select fields frequently used in filters, especially those used in joins for filtering.
Limit the number of ZORDER columns to 4. Too many columns degrade performance. For single-column sorting, use clustered indexing.
Avoid fields with very low cardinality (e.g., gender) or extremely high cardinality. Low cardinality offers little sorting benefit. High cardinality increases sorting memory costs.
Ensure the data volume is moderate. Small tables don't benefit much; very large tables incur high sorting latency.