Updates information about a table.
Change the owner of a table
MaxCompute allows you to change the owner of a table.
Syntax
alter table <table_name> changeowner to <new_owner>;
Parameters
table_name: required. The name of the table whose owner you want to change.
new_owner: required. The new owner of the table.
Examples
-- Change the owner of the test1 table to ALIYUN$xxx@aliyun.com. alter table test1 changeowner to 'ALIYUN$xxx@aliyun.com';
Modify the comment of a table
MaxCompute allows you to modify the comment of a table.
Syntax
alter table <table_name> set comment '<new_comment>';
Parameters
table_name: required. The name of the table whose comment you want to modify.
new_comment: required. The new comment of the table.
Examples
alter table sale_detail set comment 'new coments for table sale_detail';
You can execute the
DESC table_name
statement of MaxCompute to view the modification result of thecomment
in the table.
Change the value of LastModifiedTime
MaxCompute SQL allows you to execute the TOUCH
statement to change the value of LastModifiedTime
. You can change the value of LastModifiedTime
to the current time. After you execute this statement to change the value of LastModifiedTime
, MaxCompute determines that the table data has changed, and restarts the lifecycle of the table from the time that is specified by LastModifiedTime.
Syntax
alter table <table_name> touch;
Parameters
table_name: required. The name of the table whose LastModifiedTime you want to modify.
Examples
alter table sale_detail touch;
Modify the clustering attribute of a table
MaxCompute allows you to add or remove the clustering attribute to or from a table by executing the ALTER TABLE
statement.
Syntax
Syntax of the statement that is used to add the hash clustering attribute to a table:
alter table <table_name> [clustered by (<col_name> [, <col_name>, ...]) [sorted by (<col_name> [asc | desc] [, <col_name> [asc | desc] ...])] into <number_of_buckets> buckets];
Syntax of the statement that is used to remove the hash clustering attribute from a table:
alter table <table_name> not clustered;
If you do not specify the number of buckets in a range-clustered table, MaxCompute automatically determines the optimal number based on the amount of data. Syntax:
alter table <table_name> [range clustered by (<col_name> [, <col_name>, ...]) [sorted by (<col_name> [asc | desc] [, <col_name> [asc | desc] ...])] into <number_of_buckets> buckets];
Syntax of the statement that is used to remove the range clustering attribute from a table or partition:
alter table <table_name> not clustered; alter table <table_name> partition [<pt_spec>] not clustered;
NoteThe
ALTER TABLE
statement can modify the clustering attribute only for a partitioned table. The clustering attribute of a non-partitioned table cannot be modified after the table is created. TheALTER TABLE
statement is suitable for existing tables. After you specify the clustering attribute, new partitions are stored based on the clustering attribute that you specified.ALTER TABLE
takes effect only on the new partitions in a partitioned table. The new partitions include those generated by usingINSERT OVERWRITE
and are stored based on the new clustering attribute. The clustering attribute and storage method remain unchanged for the original partitions. After you specify the clustering attribute for a table, you can remove the clustering attribute and add a clustering attribute for the table again. You can specify different clustering columns, sort columns, and numbers of buckets for new partitions.ALTER TABLE
takes effect only on the new partitions. Therefore, this statement cannot be used to specify partitions.
Parameters
For more information about the parameters, see CREATE TABLE.
Examples
-- Create a partitioned table. create table if not exists sale_detail( shop_name STRING, customer_id STRING, total_price DOUBLE) partitioned by (sale_date STRING, region STRING); -- Modify the clustering attribute of the table. alter table sale_detail clustered by (customer_id) sorted by (customer_id) into 10 buckets;
For more information about the clustering attribute, see Hash clustering and Range clustering.
Rename a table
MaxCompute allows you to rename a table. After you rename a table, only the name of the table is changed. Data in the table is not changed.
Syntax
alter table <table_name> rename to <new_table_name>;
Parameters
table_name: required. The name of the table that you want to rename.
new_table_name: required. The new name of the table. If the name specified by the new_table_name parameter exists, an error is returned.
Examples
alter table sale_detail rename to sale_detail_rename;
Change the lifecycle of a table
MaxCompute allows you to change the lifecycle configured for an existing partitioned table or a non-partitioned table.
Syntax
alter table <table_name> set lifecycle <days>;
Parameters
table_name: required. The name of the table whose lifecycle you want to change.
days: required. The lifecycle after modification. The value must be a positive integer. Unit: days.
Examples
-- Change the lifecycle configured for the test_lifecycle table to 50 days. alter table test_lifecycle set lifecycle 50;
Disable or restore the lifecycle
MaxCompute allows you to disable or restore the lifecycle configured for a specified table or partition.
Syntax
alter table <table_name> partition [<pt_spec>] {enable|disable} lifecycle;
Parameters
table_name: required. The name of the table whose lifecycle you want to disable or restore.
pt_spec: optional. The partition of the table whose lifecycle you want to disable or restore. The value of this parameter is in the
partition_col1=col1_value1, partition_col2=col2_value1...
format. If a table has multi-level partitions, you must specify the values of all the partition key columns.enable: restores the lifecycle of a table or a specified partition of a table.
A table and its partitions can be reclaimed again based on the lifecycle. The lifecycle settings of the current table and its partitions are used by default.
Before you enable the lifecycle feature for a table, you can modify the lifecycle settings of the table and its partitions. This prevents data from being mistakenly reclaimed due to the use of previous settings.
disable: disables the lifecycle feature for a table or a specified partition.
Disabling the lifecycle of a table and its partitions takes precedence over restoring the lifecycle of a table and its partitions. If you specify
table disable lifecycle
,pt_spec enable lifecycle
becomes invalid.After you disable the lifecycle feature for a table, the lifecycle settings for the table, and the enable or disable flag for table partitions are retained.
After the lifecycle feature is disabled for a table, you can still change the lifecycle settings of the table and its partitions.
Examples
Example 1: Disable the lifecycle feature for the trans table.
alter table trans disable lifecycle;
Example 2: Disable the lifecycle feature for the dt='20141111' partition in the trans table.
alter table trans partition (dt='20141111') disable lifecycle;
Add partitions
MaxCompute allows you to add partitions to an existing partitioned table.
Limits
A MaxCompute table can have a maximum of 60,000 partitions.
To add the values of partition key columns to a table that has multi-level partitions, you must specify all the partitions.
This operation can add only the values of partition key columns. The names of partition key columns cannot be added.
Syntax
alter table <table_name> add [if not exists] partition <pt_spec> [partition <pt_spec> partition <pt_spec>...];
Parameters
table_name: required. The name of the partitioned table to which you want to add partitions.
if not exists: optional. If you do not configure the if not exists parameter and a partition with the same name already exists, this operation fails and an error is returned.
pt_spec: required. The partitions that you want to add. The value of this parameter is in the
(partition_col1 = partition_col_value1, partition_col2 = partition_col_value2, ...)
format. partition_col indicates the name of a partition key column, and partition_col_value indicates the value of a partition key column. The names of partition key columns are not case-sensitive, but their values are case-sensitive.
Examples
Example 1: Add a partition to the sale_detail table. The partition stores the sales records in the China (Hangzhou) region in December 2013.
alter table sale_detail add if not exists partition (sale_date='201312', region='hangzhou');
Example 2: Add two partitions to the sale_detail table. The partitions store the sales records in the China (Beijing) and China (Shanghai) regions in December 2013.
alter table sale_detail add if not exists partition (sale_date='201312', region='beijing') partition (sale_date='201312', region='shanghai');
Example 3: Add one or more partitions to the sale_detail table and specify only the partition key column sale_date for the partitions. An error is returned because you must specify the two partition key columns sale_date and region for the partitions.
alter table sale_detail add if not exists partition (sale_date='20111011');
Drop partitions
MaxCompute allows you to drop partitions from an existing partitioned table.
MaxCompute allows you to drop partitions based on the specified filter condition. If you want to drop multiple partitions that meet the specified filter condition at a time, you can use an expression to specify the filter condition, use the filter condition to match partitions, and drop the partitions at a time.
Limits
You can specify the information of only one partition key column in a PARTITION (<partition_filtercondition>) clause.
If you use an expression to specify PARTITION (<partition_filtercondition>), the function used in the expression must be a built-in scalar function.
Precautions
After you drop a partition from a table in your MaxCompute project, the volume of stored data in your MaxCompute project decreases.
You can specify a lifecycle for a partitioned table. This way, MaxCompute automatically reclaims partitions whose data is not updated within the time specified by the lifecycle. For more information about the lifecycle, see Lifecycle.
Syntax
The filter condition is not specified.
-- Drop one partition at a time. alter table <table_name> drop [if exists] partition <pt_spec>; -- Drop multiple partitions at a time. alter table <table_name> drop [if exists] partition <pt_spec>,partition <pt_spec>[,partition <pt_spec>....];
The filter condition is specified.
alter table <table_name> drop [if exists] partition <partition_filtercondition>;
Parameters
table_name: required. The name of the partitioned table from which you want to drop partitions.
if exists: optional. If you do not specify the if exists option and the partition that you want to delete does not exist, an error is returned.
pt_spec: required. The partitions that you want to delete. The value of this parameter is in the
(partition_col1 = partition_col_value1, partition_col2 = partition_col_value2, ...)
format. partition_col indicates the name of a partition key column, and partition_col_value indicates the value of a partition key column. The names of partition key columns are not case-sensitive, but their values are case-sensitive.partition_filtercondition: the filter condition. This parameter is required when you specify the filter condition. This parameter is not case-sensitive. The parameter is in the following format:
partition_filtercondition : partition (<partition_col> <relational_operators> <partition_col_value>) | partition (scalar(<partition_col>) <relational_operators> <partition_col_value>) | partition (<partition_filtercondition1> AND|OR <partition_filtercondition2>) | partition (NOT <partition_filtercondition>) | partition (<partition_filtercondition1>)[,partition (<partition_filtercondition2>), ...]
partition_col: the name of a partition key column.
relational_operators: the relational operator. For more information, see Operator.
partition_col_value: a value in the partition key column. The value may be a comparison value or regular expression. The data type of this value must be the same as the data type of the partition key column.
scalar(): the scalar function. The scalar function generates a scalar based on the input value, processes the values in the column specified by partition_col, and uses relational_operators to compare the processed values with partition_col_value.
The filter conditions that are used to drop partitions support the logical operators NOT, AND, and OR. You can use PARTITION (<NOT partition_filtercondition>) to obtain the complementary set of the filter conditions that you specified. You can use PARTITION (<partition_filtercondition1 ANDOR partition_filtercondition2>) to obtain the condition that is used to match the partitions you want to drop.
Multiple PARTITION (<partition_filtercondition>) clauses are supported. If these clauses are separated by commas (,), the logical relationship between the clauses is OR. The filter condition is obtained based on the OR logical relationship and used to match the partitions that you want to drop.
Examples
Clear the data from a partition with the filter condition not specified.
-- Drop a partition from the sale_detail table. The partition stores the sales record in the China (Hangzhou) region in December 2013. alter table sale_detail drop if exists partition(sale_date='201312',region='hangzhou'); -- Drop two partitions from the sale_detail table. The partitions store the sales records in the China (Hangzhou) and China (Shanghai) regions in December 2013. alter table sale_detail drop if exists partition(sale_date='201312',region='hangzhou'),partition(sale_date='201312',region='shanghai');
Clear the data from a partition with the filter condition specified.
-- Create a partitioned table named sale_detail. create table if not exists sale_detail( shop_name STRING, customer_id STRING, total_price DOUBLE) partitioned by (sale_date STRING); -- Add a partition to the table. alter table sale_detail add if not exists partition (sale_date= '201910') partition (sale_date= '201911') partition (sale_date= '201912') partition (sale_date= '202001') partition (sale_date= '202002') partition (sale_date= '202003') partition (sale_date= '202004') partition (sale_date= '202005') partition (sale_date= '202006') partition (sale_date= '202007'); -- Drop partitions from the table at a time. alter table sale_detail drop if exists partition(sale_date < '201911'); alter table sale_detail drop if exists partition(sale_date >= '202007'); alter table sale_detail drop if exists partition(sale_date LIKE '20191%'); alter table sale_detail drop if exists partition(sale_date IN ('202002','202004','202006')); alter table sale_detail drop if exists partition(sale_date BETWEEN '202001' AND '202007'); alter table sale_detail drop if exists partition(substr(sale_date, 1, 4) = '2020'); alter table sale_detail drop if exists partition(sale_date < '201912' OR sale_date >= '202006'); alter table sale_detail drop if exists partition(sale_date > '201912' AND sale_date <= '202004'); alter table sale_detail drop if exists partition(NOT sale_date > '202004'); -- Drop partitions by using multiple PARTITION (<partition_filtercondition>) clauses. The logical relationship between these clauses is OR. alter table sale_detail drop if exists partition(sale_date < '201911'), partition(sale_date >= '202007'); -- Add partitions in other formats. alter table sale_detail add IF NOT EXISTS partition (sale_date= '2019-10-05') partition (sale_date= '2019-10-06') partition (sale_date= '2019-10-07'); -- Use regular expressions to match the partitions that you want to drop and drop these partitions at a time. alter table sale_detail drop if exists partition(sale_date RLIKE '2019-\\d+-\\d+'); -- Create a table named region_sale_detail. The table contains multi-level partitions. create table if not exists region_sale_detail( shop_name STRING, customer_id STRING, total_price DOUBLE) partitioned by (sale_date STRING , region STRING ); -- Add partitions to the region_sale_detail table. alter table region_sale_detail add IF NOT EXISTS partition (sale_date= '201910',region = 'shanghai') partition (sale_date= '201911',region = 'shanghai') partition (sale_date= '201912',region = 'shanghai') partition (sale_date= '202001',region = 'shanghai') partition (sale_date= '202002',region = 'shanghai') partition (sale_date= '201910',region = 'beijing') partition (sale_date= '201911',region = 'beijing') partition (sale_date= '201912',region = 'beijing') partition (sale_date= '202001',region = 'beijing') partition (sale_date= '202002',region = 'beijing'); -- Execute the following statement to drop multi-level partitions at a time. The logical relationship between the two PARTITION (<partition_filtercondition>) clauses is OR. After the statement is executed, the partitions in which the value of the sale_date column is earlier than 201911 or the partitions in which the value of the region column is beijing are dropped. alter table region_sale_detail drop if exists partition(sale_date < '201911'),partition(region = 'beijing'); -- Execute the following statement to drop partitions in which the value of the sale_date column is earlier than 201911 and the value of the region column is beijing. alter table region_sale_detail drop if exists partition(sale_date < '201911', region = 'beijing');
When you drop multi-level partitions at a time, you cannot specify a filter condition that is based on multiple partition key columns in one
PARTITION partition_filtercondition
clause. Otherwise, the following error is returned:FAILED: ODPS-0130071:[1,82] Semantic analysis exception - invalid column reference region, partition expression must have one and only one column reference
.-- If you specify the information of multiple partition key columns in a PARTITION (<partition_filtercondition>) clause, an error is returned. Example of incorrect usage: alter table region_sale_detail drop if exists partition(sale_date < '201911' AND region = 'beijing');
Change LastModifiedTime of a partition
MaxCompute SQL allows you to execute the TOUCH
statement to change LastModifiedTime
of a partition in a partitioned table. This operation can change LastModifiedTime
of a partition to the current time. In this case, MaxCompute considers that data is updated, and a new lifecycle of the partition starts from the time specified by LastDataModifiedTime.
Limits
If a table contains multi-level partitions, you must specify all partition levels when you change LastModifiedTime of a partition in the table.
Syntax
alter table <table_name> touch partition (<pt_spec>);
Parameters
table_name: required. The name of the partitioned table whose LastModifiedTime you want to change. If the table does not exist, an error is returned.
pt_spec: required. The partition whose LastModifiedTime you want to change. The value of this parameter is in the
(partition_col1 = partition_col_value1, partition_col2 = partition_col_value2, ...)
format. partition_col indicates the name of a partition key column, and partition_col_value indicates the value of a partition key column. If the specified column name or column value does not exist, an error is returned.
Examples
-- Change LastModifiedTime of the partition in which the value of the sale_date column is 201312 and the value of the region column is shanghai in the sale_detail table. alter table sale_detail touch partition (sale_date='201312', region='shanghai');
Change the value of a partition key column
MaxCompute SQL allows you to execute the RENAME
statement to change the value of a partition key column.
Limits
The RENAME statement can change values of partition key columns but cannot change the column names.
If a table contains multi-level partitions, you must specify all partition levels when you change LastModifiedTime of a partition in the table.
Syntax
alter table <table_name> partition (<pt_spec>) rename to partition (<new_pt_spec>);
Parameters
table_name: required. The name of the table in which you want to change column values in a partition.
pt_spec: required. The partition in which you want to change column values. The value of this parameter is in the
(partition_col1 = partition_col_value1, partition_col2 = partition_col_value2, ...)
format. partition_col indicates the name of a partition key column, and partition_col_value indicates the value of a partition key column. If the specified column name or column value does not exist, an error is returned.new_pt_spec: required. The new partition information. The value of this parameter is in the
(partition_col1 = new_partition_col_value1, partition_col2 = new_partition_col_value2, ...)
format. partition_col indicates the name of a partition key column, and new_partition_col_value indicates the value of the new partition key column.
Examples
-- Change column values in a partition of the sale_detail table. alter table sale_detail partition (sale_date = '201312', region = 'hangzhou') rename to partition (sale_date = '201310', region = 'beijing');
Merge partitions
MaxCompute SQL allows you to execute the MERGE PARTITION
statement to merge multiple partitions of a table into one partition. This operation deletes the dimension information about the merged partitions and migrates the partition data to a specified partition.
Limits
Partitions of external tables cannot be merged. After partitions of a clustered table are merged into one partition, the partition does not have the clustering attribute.
You can merge a maximum of 4,000 partitions at a time.
Syntax
alter table <table_name> merge [if exists] partition (<predicate>) [, partition(<predicate2>) ...] overwrite partition (<fullpartitionSpec>) [purge];
Parameters
table_name: required. The name of the partitioned table whose partitions you want to merge.
if exists: optional. If you do not configure the if exists parameter and the partitions that you want to merge do not exist, an error is returned when you merge partitions. If if exists is specified but no partitions meet the
merge
condition, a new partition cannot be generated after you merge partitions. If you merge partitions and modify source data by using a statement at the same time, an error is returned even if you specify if exists. The statement may beINSERT
,RENAME
, orDROP
.predicate: required. The condition that is used to match the partitions that you want to merge.
fullpartitionSpec: required. The partition that is generated after you merge partitions.
purge: optional. If you specify this parameter, logs in the session directory are deleted. By default, logs that are generated in the last three days are deleted. For more information, see Purge.
Examples
Example 1: Merge the partitions that meet the specified condition into the destination partition.
-- View the partition that is generated in the partitioned table after you merge partitions. show partitions intpstringstringstring; ds=20181101/hh=00/mm=00 ds=20181101/hh=00/mm=10 ds=20181101/hh=10/mm=00 ds=20181101/hh=10/mm=10 -- Merge all partitions that meet the hh='00' condition into the ds='20181101', hh='00', mm='00' partition. alter table intpstringstringstring merge partition(hh='00') overwrite partition(ds='20181101', hh='00', mm='00'); -- View the partition that is generated after you merge partitions. show partitions intpstringstringstring; ds=20181101/hh=00/mm=00 ds=20181101/hh=10/mm=00 ds=20181101/hh=10/mm=10
Example 2: Merge specified partitions into the destination partition.
-- Merge specified partitions into the destination partition. alter table intpstringstringstring merge if exists partition(ds='20181101', hh='00', mm='00'), partition(ds='20181101', hh='10', mm='00'), partition(ds='20181101', hh='10', mm='10') overwrite partition(ds='20181101', hh='00', mm='00') purge; -- View the partitions in a partitioned table. show partitions intpstringstringstring; ds=20181101/hh=00/mm=00
Add columns or column comments
MaxCompute allows you to add columns or column comments to an existing non-partitioned table or partitioned table.
MaxCompute allows you to add columns of the STRUCT type, such as struct<x: string, y: bigint>
and map<string, struct<x: double, y: double>>
.
Syntax
alter table <table_name> add columns (<col_name1> <type1> comment ['<col_comment>'][, <col_name2> <type2> comment '<col_comment>'...]);
Parameters
table_name: required. The name of the table to which you want to add columns. You cannot specify the position of a new column in the table. By default, the new column is added as the last column.
col_name: required. The name of the column that you want to add to the table.
type: required. The data type of the column that you want to add to the table.
col_comment: optional. The comment of the column that you want to add to the table.
Examples
Example 1: Add two columns to the sale_detail table.
alter table sale_detail add columns (customer_name STRING, education BIGINT);
Example 2: Add two columns and their comments to the sale_detail table.
alter table sale_detail add columns (customer_name STRING comment 'Customer', education BIGINT comment 'Education' );
Example 3: Add a column of a complex data type to the sale_detail table.
alter table sale_detail add columns (region struct<province:string, area:string>);
Drop columns
MaxCompute allows you to drop one or more specified columns from an existing non-partitioned table or partitioned table.
If you drop a column from a table in one of the following scenarios, the read and write operations on the table are affected.
If your job is a MapReduce job and the version of MapReduce is V1.0, you cannot use Graph jobs to read data from or write data to the table whose column is dropped.
If your job is a Spark job that runs on the Cupid console and one of the following Spark versions is used, you can use your job to read data from the table. You cannot use your job to write data to the table.
Spark-2.3.0-odps0.34.0
Spark-3.1.1-odps0.34.0
If your job is a PAI job, you can use your job to read data from the table. You cannot use your job to write data to the table.
If your job is a Hologres job and the version of Hologres is earlier than Hologres V1.3, you cannot read data from or write data to the table whose column is dropped when Hologres references the table as an external table.
If you drop a column from a table, you cannot run the
CLONE TABLE
command on the table.
When you use MaxCompute Streaming Tunnel to write data to the table whose column is dropped, you cannot change the table schema.
Syntax
alter table <table_name> drop columns <col_name1>[, <col_name2>...];
Parameters
table_name: required. The name of the table from which you want to drop columns.
col_name: required. The name of the column that you want to drop.
Examples
-- Drop the customer_id column from the sale_detail table. Enter yes on the MaxCompute client to drop the column. alter table sale_detail drop columns customer_id; -- Drop the shop_name and customer_id columns from the sale_detail table. Enter yes on the MaxCompute client to drop the columns. alter table sale_detail drop columns shop_name, customer_id;
Change the data type of a column
MaxCompute allows you to change the data type of an existing column.
If you change the data type of a column in one of the following scenarios, the read and write operations on the table are affected.
If your job is a MapReduce job and the version of MapReduce is V1.0, you cannot use Graph jobs to read data from or write data to the table whose column is dropped.
If your job is a Spark job that runs on the Cupid console and one of the following Spark versions is used, you can use your job to read data from the table. You cannot use your job to write data to the table.
Spark-2.3.0-odps0.34.0
Spark-3.1.1-odps0.34.0
If your job is a PAI job, you can use your job to read data from the table. You cannot use your job to write data to the table.
If your job is a Hologres job and the version of Hologres is earlier than Hologres V1.3, you cannot read data from or write data to the table whose column is dropped when Hologres references the table as an external table.
If you change the data type of a column in a table, you cannot run the
CLONE TABLE
command on the table.
If you use MaxCompute Streaming Tunnel to write data to a table, you cannot change the table schema.
Syntax
alter table <table_name> change [column] <old_column_name> <new_column_name> <new_data_type>;
Parameters
table_name: required. The name of the table in which you want to change the data type of a column.
old_column_name: required. The name of the column whose data type you want to change.
new_column_name: required. The new name of the column whose data type you want to change. The value of new_column_name can be the same as the value of old_column_name. In this case, the name of the column is not changed. The name specified by the new_column_name parameter cannot be the same as the name of a column except the name specified by the old_column_name parameter.
new_data_type: required. The new data type of the column whose data type you want to change.
Examples
-- Change the data type of the id field in the mf_evol_t3 table from INT to BIGINT. alter table mf_evol_t3 change id id bigint; -- Change the data type of the id field in the mf_evol_t3 table from BIGINT to STRING. alter table mf_evol_t3 change column id id string;
Supported data type conversions
NoteThe following table describes the conversion between data types. Y indicates that the conversion between the data types is supported. N indicates that the conversion between the data types is not supported. - indicates that the conversion between the data types is not involved. Y() indicates that the conversion between the data types is supported only if the conditions in parentheses () are met.
Change the order of a column
MaxCompute allows you to change the order of a column in an existing non-partitioned table or partitioned table.
If you change the order of a column in a table, or add a column to a table and change the order of the new column in the table in one of the following scenarios, the read and write operations on the table are affected.
If your job is a MapReduce job and the version of MapReduce is V1.0, you cannot use Graph jobs to read data from or write data to the table whose column is dropped.
If your job is a Spark job that runs on the Cupid console and one of the following Spark versions is used, you can use your job to read data from the table. You cannot use your job to write data to the table.
Spark-2.3.0-odps0.34.0
Spark-3.1.1-odps0.34.0
If your job is a PAI job, you can use your job to read data from the table. You cannot use your job to write data to the table.
If your job is a Hologres job and the version of Hologres is earlier than Hologres V1.3, you cannot read data from or write data to the table whose column is dropped when Hologres references the table as an external table.
If you change the order of columns in a table, you cannot run the
CLONE TABLE
command on the table.
When you use MaxCompute Streaming Tunnel to write data to the table whose column is dropped, you cannot change the table schema.
Syntax
alter table <table_name> change <old_column_name> <new_column_name> <column_type> after <column_name>;
Parameters
table_name: required. The name of the table in which you want to change the column order.
old_column_name: required. The original name of the column whose order you want to change.
new_col_name: required. The new name of the column whose sequence you want to change. The value of new_col_name can be the same as the value of old_column_name. In this case, the name of the column is not changed. The name specified by the new_col_name parameter cannot be the same as the name of a column except the name specified by the old_column_name parameter.
column_type: required. The original data type of the column whose order you want to change. The value of this parameter cannot be changed.
column_name: required. The column after which you want to place the specified column.
Examples
-- Change the name of the customer_id column to customer and place the customer column after the total_price column in the sale_detail table. alter table sale_detail change customer_id customer string after total_price; -- Place the customer_id column after the total_price column in the sale_detail table. The name of the customer_id column is not changed. alter table sale_detail change customer_id customer_id string after total_price;
Change the name of a column
MaxCompute allows you to change the name of a column in an existing non-partitioned table or partitioned table.
Syntax
alter table <table_name> change column <old_col_name> rename to <new_col_name>;
Parameters
table_name: required. The name of the table in which you want to modify the name of a column.
old_col_name: required. The name of the column that you want to modify. The column specified by old_col_name must already exist in the table.
new_col_name: required. The new name of the column. The table does not contain a column named new_col_name.
Examples
-- Change the name of the customer_name column to customer in the sale_detail table. alter table sale_detail change column customer_name rename to customer;
Modify the comment of a column
MaxCompute allows you to modify the comment of a column in an existing non-partitioned table or partitioned table.
Syntax
alter table <table_name> change column <col_name> comment '<col_comment>';
Parameters
table_name: required. The name of the table in which you want to modify the column comment.
col_name: required. The name of the column whose comment you want to modify. The column specified by col_name must already exist in the table.
col_comment: required. The new comment of the column. The new comment must be a valid string that does not exceed 1,024 bytes in length. If the value of this parameter does not meet the requirements, an error is returned.
Examples
-- Modify the comment of the customer column in the sale_detail table. alter table sale_detail change column customer comment 'customer';
Change the name and comment of a column
MaxCompute allows you to change the name or comment of a column in a non-partitioned table or partitioned table.
Syntax
alter table <table_name> change column <old_col_name> <new_col_name> <column_type> comment '<col_comment>';
Parameters
table_name: required. The name of the table in which you want to change the name and comment of a column.
old_col_name: required. The name of the column whose name and comment you want to change. The column specified by
old_col_name
must already exist in the table.new_col_name: required. The new name of the column. The table does not contain a column named
new_col_name
.column_type: required. The data type of the column.
col_comment: optional. The new comment of the column. The comment can be 1,024 bytes in length at maximum.
Examples
-- Change the name of the customer_name column in the sale_detail table to customer_newname and the comment of the column to customer. alter table sale_detail change column customer_name customer_newname STRING comment 'customer';
Change the non-nullable property of a non-partition key column in a table
MaxCompute allows you to change the non-nullable property of a non-partition key column in a table. If the values of a non-partition key column in a table cannot be null, you can execute the statement to allow the column values to be null.
You can execute the DESC EXTENDED table_name;
statement to view the value of the Nullable
property. This property specifies whether the value of a non-partition key column can be null. If the value of the Nullable
property is true
, the column value can be null. If the value of the Nullable
property is false
, the column value must not be null.
Limits
After you change the value of the Nullable property to true, you cannot restore the property setting. Proceed with caution.
Syntax
alter table <table_name> change column <old_col_name> null;
Parameters
table_name: required. The name of the table in which you want to change the value of the Nullable property for a column.
old_col_name: required. The name of the non-partition key column whose Nullable property you want to change. The column specified by old_col_name must already exist in the table.
Examples
-- Create a partitioned table. The values in the id column must not be null. create table null_test(id int not null, name string) partitioned by (ds string); -- Allow the values in the id column to be null. alter table null_test change column id null;
Merge files of a transactional table
The base files and delta files of a transactional table occupy the physical storage. You cannot directly read such files. If you execute the UPDATE
or DELETE
statement on a transactional table, the data in the base files is not updated, but a delta file is generated for each operation. In this case, the more delete or update operations are performed, the more storage space the table occupies. The number of delta files increases. As a result, you are charged more for storage usage and subsequent queries.
If you execute the UPDATE
or DELETE
statement on a table or a partition of the table multiple times, a large number of delta files are generated. When the system reads data from the table, the system loads the delta files to identify the deleted and updated rows. A large number of delta files reduce the efficiency of reading data. In this case, you can merge the base files with the delta files to reduce storage usage and improve the read efficiency.
Syntax
alter table <table_name> [partition (<partition_key> = '<partition_value>' [, ...])] compact {minor|major};
Parameters
table_name: required. The name of the transactional table for which you want to merge the base files and delta files.
partition_key: optional. The name of a partition key column in the partitioned transactional table.
partition_value: optional. The value of a partition key column in the partitioned transactional table.
major|minor: One of them must be specified. Differences between minor compaction and major compaction:
minor
: merges each base file with all delta files that are generated based on the base file and deletes the delta files.major
: merges each base file with all delta files that are generated based on the base file, deletes the delta files, and merges small base files. If the size of a base file is less than 32 MB or a delta file is generated, the effect of merging files is equivalent to the effect of executing theINSERT OVERWRITE
statement. However, if the size of a base file is greater than or equal to 32 MB and no delta files are generated, the data of the table is not overwritten.
Examples
Example 1: Merge files of the acid_delete table. Sample statement:
alter table acid_delete compact minor;
The following result is returned:
Summary: Nothing found to merge, set odps.merge.cross.paths=true if cross path merge is permitted. OK
Example 2: Merge files of the acid_update_pt table. Sample statement:
alter table acid_update_pt partition (ds = '2019') compact major;
The following result is returned:
Summary: table name: acid_update_pt /ds=2019 instance count: 2 run time: 6 before merge, file count: 8 file size: 2613 file physical size: 7839 after merge, file count: 2 file size: 679 file physical size: 2037 OK
Merge small files
The distributed file system stores data in blocks. Files whose size is smaller than the block size are called small files. Each block is 64 MB in size. The distributed file system inevitably generates small files. For example, when you execute SQL statements in MaxCompute or use Tunnel to collect data, small files are generated. In this case, you can use this feature to merge small files to improve the computing performance.
Syntax
ALTER TABLE <tablename> [PARTITION(<partition_key>=<partition_value>)] MERGE SMALLFILES;
Parameters
table_name: required. The name of the table that corresponds to the files to be merged.
partition_key: optional. The name of a partition key column when the table is a partitioned table.
partition_value: optional. The value of a partition key column when the table is a partitioned table.
Examples
set odps.merge.cross.paths=true; set odps.merge.smallfile.filesize.threshold=128; set odps.merge.max.filenumber.per.instance = 2000; alter table tbcdm.dwd_tb_log_pv_di partition (ds='20151116') merge smallfiles;
Computing resources are consumed to merge small files. If you use a pay-as-you-go instance, fees are incurred. The billing rules are consistent with the pay-as-you-go billing rules for SQL statements. For more information, see Computing pricing.
For more information, see Merge small files.
Related statements
CREATE TABLE: creates a non-partitioned table, a partitioned table, an external table, or a clustered table.
TRUNCATE: clears data from a specified table.
DROP TABLE: drops a partitioned table or a non-partitioned table.
DESC TABLE/VIEW: views the information about MaxCompute internal tables, views, materialized views, external tables, clustered tables, or transactional tables.
SHOW: views SQL DDL statements that are used to create tables, or queries information of all tables and views in a project or information of all partitions in a table.