MaxCompute allows you to execute the INSERT INTO
or INSERT OVERWRITE
statement to insert or update data into a table or a static partition.
Prerequisites
Before you execute the INSERT INTO
or INSERT OVERWRITE
statement, make sure that you are granted the Alter permission on the destination
table and the Describe permission on the metadata of the source table. For more information,
see Permissions.
Description
INSERT INTO
or INSERT OVERWRITE
statement to save the execution results of the SELECT
statements to the destination table. Differences between the two statements:
INSERT INTO
: inserts data into a table or a static partition of a table. You can specify the values of partition key columns in this statement to insert data into a specified partition. If you want to insert a small amount of test data, you can use this statement together with VALUES.INSERT OVERWRITE
: clears a specified table and inserts data into the table or the static partitions of the table.Note- The
INSERT
syntax in MaxCompute is different from that in MySQL or Oracle. You must add theTABLE
keyword andtable_name
toINSERT OVERWRITE
. You do not need to add theTABLE
keyword toINSERT INTO
. - If you execute the
INSERT OVERWRITE
statement on the same partition several times, the size of the partition into which data is inserted may be different each time you execute theDESC
statement. This is because the logic to split files changes after you execute theSELECT
andINSERT OVERWRITE
statements in sequence for the same partition in a table. After you execute theINSERT OVERWRITE
statement, the total length of data remains the same. This does not affect the storage fees.
- The
For more information about how to insert data into a dynamic partition, see Insert or overwrite data into dynamic partitions (DYNAMIC PARTITION).
Limits
INSERT INTO
statement or the INSERT OVERWRITE
statement to insert or update data into a table or a static partition of a table,
take note of the following limits:
INSERT INTO
: This statement cannot be used to insert data into a clustered table.INSERT OVERWRITE
: This statement does not allow you to specify the columns into which you want to insert data. You can execute theINSERT INTO
statement to specify the columns. For example, if you executecreate table t(a string, b string); insert into t(a) values ('1');
, 1 is inserted into Column a, and NULL or the default value is inserted into Column b.- MaxCompute does not provide the locking mechanism for the tables for which INSERT
operations are being performed. We recommend that you do not execute the
INSERT INTO
statement or theINSERT OVERWRITE
statement for a table at the same time.
Syntax
insert {into|overwrite} table <table_name> [partition (<pt_spec>)] [(<col_name> [,<col_name> ...)]]
<select_statement>
from <from_statement>
[zorder by <zcol_name> [, <zcol_name> ...]];
- table_name: required. The name of the table into which you want to insert data.
- pt_spec: optional. The partition into which you want to insert data. Only constants are allowed.
Expressions, such as functions, are not allowed. The value of this parameter is in
the
(partition_col1 = partition_col_value1, partition_col2 = partition_col_value2, ...)
format. - col_name: optional. The name of the column in the table into which you want to insert data.
INSERT OVERWRITE
does not allow you to specify[(<col_name> [,<col_name> ...)]
. - select_statement: required. The
SELECT
clause that is used to query the data that you want to insert into the destination table from the source table. For more information aboutSELECT
clauses, see SELECT syntax.Note- The mappings between the source and destination tables are based on the column sequence
in
select_statement
, instead of the mappings between column names in the tables. - If the destination table has static partitions and you want to insert data into a
static partition, partition key columns cannot be included in
select_statement
.
- The mappings between the source and destination tables are based on the column sequence
in
- from_statement: required. The
FROM
clause. This clause indicates the data source. For example, you can specify the name of a source table in this clause. - zorder by <zcol_name> [, <zcol_name> ...]: optional. If you write data to a table or a partition, you can use this clause to
co-locate rows with similar data records based on the columns specified in select_statement. This improves filtering performance for queries and reduces storage costs. The
ORDER BY x, y
clause sorts data records based on the sequence of x coming before y. TheZORDER BY x, y
clause co-locates rows with similar x values and rows with similar y values. For column-based data filtering and sorting in an SQL SELECT statement, theORDER BY
clause filters and sorts data based on x, whereas theZORDER BY
clause filters and sorts data based on x or based on both x and y. This increases the column compression ratio.When you useZORDER BY
, take note of the following limits:- If the destination table is a clustered table, the
ZORDER BY
clause is not supported. ZORDER BY
can be used together withDISTRIBUTE BY
but cannot be used together withORDER BY
,CLUSTER BY
, orSORT BY
.
Note If you use theZORDER BY
clause to write data, more resources and time are consumed. - If the destination table is a clustered table, the
Examples
- Example 1: Execute the
INSERT INTO
statement to append data to a non-partitioned table namedwebsites
. Sample statements:-- Create a non-partitioned table named websites. create table if not exists websites (id int, name string, url string ); -- Create a non-partitioned table named apps. create table if not exists apps (id int, app_name string, url string ); -- Append data to the apps table. The abbreviated form of
INSERT INTO TABLE table_name
isINSERT INTO table_name
. insert into apps (id,app_name,url) values (1,'Aliyun','https://www.aliyun.com'); -- Copy data from the apps table and append the data to the websites table. insert into websites (id,name,url) select id,app_name,url from apps; -- Execute theSELECT
statement to view data in the websites table. select * from websites; -- The following result is returned: +------------+------------+------------+ | id | name | url | +------------+------------+------------+ | 1 | Aliyun | https://www.aliyun.com | +------------+------------+------------+ - Example 2: Execute the
INSERT INTO
statement to append data to a partitioned table namedsale_detail
. Sample statements:-- Create a partitioned table named sale_detail. create table if not exists sale_detail ( shop_name string, customer_id string, total_price double ) partitioned by (sale_date string, region string); -- Add partitions to the sale_detail table. alter table sale_detail add partition (sale_date='2013', region='china'); -- Insert data into the sale_detail table. The abbreviated form of
INSERT INTO TABLE table_name
isINSERT INTO table_name
. TheTABLE
keyword inINSERT OVERWRITE TABLE table_name
cannot be omitted. insert into sale_detail partition (sale_date='2013', region='china') values ('s1','c1',100.1),('s2','c2',100.2),('s3','c3',100.3); -- Enable a full table scan only for the current session. Execute theSELECT
statement to view data in the sale_detail table. set odps.sql.allow.fullscan=true; select * from sale_detail; -- The following result is returned: +------------+-------------+-------------+------------+------------+ | shop_name | customer_id | total_price | sale_date | region | +------------+-------------+-------------+------------+------------+ | s1 | c1 | 100.1 | 2013 | china | | s2 | c2 | 100.2 | 2013 | china | | s3 | c3 | 100.3 | 2013 | china | +------------+-------------+-------------+------------+------------+ - Example 3: Execute the
INSERT OVERWRITE
statement to update the data in thesale_detail_insert
table. Sample statement:-- Create the sale_detail_insert table that has the same schema as the sale_detail table. create table sale_detail_insert like sale_detail; -- Add partitions to the sale_detail_insert table. alter table sale_detail_insert add partition (sale_date='2013', region='china'); -- Extract data from the sale_detail table and insert the data into the sale_detail_insert table. Names of partition key columns in the sale_detail_insert table do not need to be declared and cannot be rearranged. -- If the sale_detail_insert table contains static partitions, the values of partition key columns are declared in PARTITION(). These values do not need to be included in select_statement. You need to search for column names based only on the sequence of common columns in the sale_detail_insert table and sequentially map the declared column values to the columns in the sale_detail_insert table. If the sale_detail_insert table contains dynamic partitions, the names of partition key columns must be included in
select_statement
. For more information, see Insert or overwrite data into dynamic partitions (DYNAMIC PARTITION). insert overwrite table sale_detail_insert partition (sale_date='2013', region='china') select shop_name, customer_id, total_price from sale_detail zorder by customer_id, total_price; -- Enable a full table scan only for the current session. Execute theSELECT
statement to view the data in the sale_detail_insert table. set odps.sql.allow.fullscan=true; select * from sale_detail_insert; -- The following result is returned: +------------+-------------+-------------+------------+------------+ | shop_name | customer_id | total_price | sale_date | region | +------------+-------------+-------------+------------+------------+ | s1 | c1 | 100.1 | 2013 | china | | s2 | c2 | 100.2 | 2013 | china | | s3 | c3 | 100.3 | 2013 | china | +------------+-------------+-------------+------------+------------+ - Example 4: Execute the
INSERT OVERWRITE
statement to update the data in thesale_detail_insert
table and adjust the sequence of columns inselect_statement
. The mappings between the source and destination tables are based on the sequence of columns inselect_statement
, instead of the mappings between column names in the two tables. Sample statement:
The following code shows a sample query result:insert overwrite table sale_detail_insert partition (sale_date='2013', region='china') select customer_id, shop_name, total_price from sale_detail; select * from sale_detail_insert;
+------------+-------------+-------------+------------+------------+ | shop_name | customer_id | total_price | sale_date | region | +------------+-------------+-------------+------------+------------+ | c1 | s1 | 100.1 | 2013 | china | | c2 | s2 | 100.2 | 2013 | china | | c3 | s3 | 100.3 | 2013 | china | +------------+-------------+-------------+------------+------------+
When you create the
sale_detail_insert
table, the column sequence is defined asshop_name string, customer_id string, and then total_price bigint
. However, data is inserted from thesale_detail
table into thesale_detail_insert
table based on the sequence ofcustomer_id, shop_name, and then total_price
. As a result, the data in thesale_detail.customer_id
column is inserted into thesale_detail_insert.shop_name
column, and the data in thesale_detail.shop_name
column is inserted into thesale_detail_insert.customer_id
column. - Example 5: If you insert data into a partition, the partition key columns of the partition
cannot be included in
select_statement
. After the following statement is executed, an error is returned. This is becausesale_date
andregion
are partition key columns. These columns cannot be included inselect_statement
if INSERT OVERWRITE or INSERT INTO statement is used to insert or update data into a static partition. Sample statement of incorrect usage:insert overwrite table sale_detail_insert partition (sale_date='2013', region='china') select shop_name, customer_id, total_price, sale_date, region from sale_detail;
- Example 6: pt_spec in
PARTITION()
must be constants instead of expressions. Sample statement of incorrect usage:insert overwrite table sale_detail_insert partition (sale_date=datepart('2016-09-18 01:10:00', 'yyyy') , region='china') select shop_name, customer_id, total_price from sale_detail;