MaxCompute supports three storage tiers: standard storage, infrequent access (IA) storage, and long-term storage. By default, data is stored at the standard storage tier. Based on how frequently you need to access the data, you can specify the storage tier for a specific table or partition as either IA or long-term. This way, cold (infrequently accessed) and hot (frequently accessed) data are separately stored, reducing data storage costs.
Apply for the use of the feature
New projects use the storage tiers feature by default, including the China (Hong Kong), Japan (Tokyo), Indonesia (Jakarta), Malaysia (Kuala Lumpur), and Germany (Frankfurt) regions.
If you want to use the tiered storage feature for existing projects in additional regions, such as Singapore, click the link to apply for the use of the feature. For more information, see Apply for trial use of new features.
Storage tiers
Storage type | Description |
Standard storage | The default storage type, which is suitable for scenarios where data is frequently accessed, and requires frequent read and write operations. |
IA storage | IA and long-term storage are suitable for data that is infrequently accessed. IA and long-term storage can significantly reduce storage costs. Note
|
Long-term storage |
Tiered storage pricing
Storage type | Billing formula | Pricing | Description |
Standard storage | Daily storage fees = Average daily storage volume × Unit price | If the actual storage volume of a project is 1 GB or less than 1 GB, the data storage is free of charge. If the actual storage volume exceeds 1 GB, you are charged for the extra data volume based on the unit price USD 0.0006/GB/day. | Average daily storage volume = Sum of the collected storage volume/24 |
Infrequent access (IA) storage |
|
|
|
Long-term storage |
|
|
For more information about storage pricing, see Storage pricing.
You can log on to the Expenses and Costs console and choose Bills > Bill Details in the left-side navigation pane. On the Bill Details page, view the fees that are incurred for data storage and access at different storage tiers. For more information, see View billing details.
Precautions
PAI and Hologres cannot directly access data at the IA or long-term storage tier.
If the storage tier of a table or partition is set to IA or long-term storage, take note of the amount of data and the data access frequency. If data is accessed in large amounts or frequently, the fees incurred may be higher than the data at the standard storage tier.
If you access all data in a table or partition at the IA storage tier once in a month, the sum of the data access fees and storage fees are the same as the storage fees of the data at the standard storage tier.
If you access all data in a table or partition at the long-term storage tier once in six months, the sum of the data access fees and storage fees are the same as the storage fees of the data at the standard storage tier.
Specify a storage type
Storage type can be mutually converted. The conversion does not negatively affect data access performance.
Conversion process | Description |
Standard storage to IA storage | Supports both manual and automatic conversion, with no I/O access fees. The fields of the table or the partition, such as "last data update time" and "last data access time," are not updated. |
Standard storage to Long-term storage | |
IA storage to Long-term storage | Supports both manual and automatic conversion. Manual conversion incurs I/O access fees, whereas automatic conversion does not. The fields of the table or the partition, such as "last data update time" and "last data access time," are not updated. |
IA storage to Standard storage | Only supports manual conversion, which incurs I/O access fees. The fields of the table or the partition, such as "last data update time" and "last data access time," are updated. |
Long-term storage to IA storage | |
Long-term storage to Standard storage |
Manual configuration
For non-partitioned tables or partitions in partitioned tables, you can manually specify the IA or long-term storage tier. The setting takes effect immediately after manual configuration.
Syntax
ALTER TABLE <TABLE_NAME> [PARTITION(<PARTITION_SPEC>)]
SET <TBLPROPERTIES|PARTITIONPROPERTIES>("storagetier"="standard|lowfrequency|longterm");
Parameter description
TABLE_NAME: the name of a table. This parameter is required regardless of whether you want to specify a storage tier for a table or a partition.
PARTITION_SPEC: required when you want to specify a storage tier for a partition.
TBLPROPERTIES|PARTITIONPROPERTIES: the properties of a table or partition for which you want to specify a storage tier.
TBLPROPERTIES: the properties of a table.
PARTITIONPROPERTIES: the properties of a partition.
storagetier: required. The storage tier. Valid values:
standard: standard storage. You are charged only storage fees.
lowfrequency: IA storage. You are charged storage fees and the fees incurred for access to data at the IA storage tier.
longterm: long-term storage. You are charged storage fees and the fees incurred for access to data at the long-term storage tier.
For partitioned tables, storage tiers can be specified only at the partition level and cannot be specified at the table level.
Examples
Example 1: Specify IA storage as the storage tier for a non-partitioned table.
ALTER TABLE tablename SET TBLPROPERTIES("storagetier"="lowfrequency");
View the
StorageTier
field in table properties to check the current storage tier.-- View table properties. DESC extended tablename; -- The following result is returned: +-------------------------------------------------------------------+ | Owner: ALIYUN$mofan_****@test.aliyunid.com | | Project: mf_mc_**** | | TableComment: | +-------------------------------------------------------------------+ | CreateTime: 2021-11-18 15:14:00 | | LastDDLTime: 2023-09-11 14:34:55 | | LastModifiedTime: 2023-09-13 15:02:28 | | LastAccessTime: 2023-09-14 10:50:57 | +-------------------------------------------------------------------+ | InternalTable: YES | Size: 1923683131 | +-------------------------------------------------------------------+ | Native Columns: | +-------------------------------------------------------------------+ | Field| Type| Label |ExtendedLabel| Nullable| DefaultValue|Comment | +-------------------------------------------------------------------+ | empno | bigint | | | true | NULL | | | ename | string | | | true | NULL | | | job | string | | | true | NULL | | | mgr | bigint | | | true | NULL | | | hiredate | datetime | | | true | NULL | | | sal | bigint | | | true | NULL | | | comm | bigint | | | true | NULL | | | deptno | bigint | | | true | NULL | | +-------------------------------------------------------------------+ | Extended Info: | +-------------------------------------------------------------------+ | TableID: 8e0cc78c81ab4ad7af30bff7a8e**** | | IsArchived: false | | PhysicalSize: 5771049393 | | FileNum: 3 | | StoredAs: AliOrc | | CompressionStrategy: normal | | odps.timemachine.retention.days: 1 | | ColdStorageStatus: N/A | | encryption_enable: false | | StorageTier: lowfrequency | | StorageTierLastModifiedTime: 2023-09-11 14:34:55 | +-------------------------------------------------------------------+
Example 2: Specify IA storage as the storage tier for a partition in a partitioned table named bank_data_pt.
ALTER TABLE bank_data_pt PARTITION (credit='yes') SET PARTITIONPROPERTIES ("storagetier" = 'lowfrequency');
View the
StorageTier
field in partition properties to check the current storage tier.-- View partition properties. DESC extended bank_data_pt PARTITION(credit='yes'); -- The following result is returned: +------------------------------------------------------------------------------------+ | PartitionSize: 0 | +------------------------------------------------------------------------------------+ | CreateTime: 2024-05-10 10:28:16 | | LastDDLTime: 2024-05-10 10:31:01 | | LastModifiedTime: 2024-05-10 10:28:16 | +------------------------------------------------------------------------------------+ | IsExstore: false | | IsArchived: false | | PhysicalSize: 0 | | FileNum: 0 | | ColdStorageStatus: N/A | | StorageTier: LowFrequency | | StorageTierLastModifiedTime: 2024-05-10 10:31:01 | +------------------------------------------------------------------------------------+
Automatic configuration based on lifecycle rules
For projects or partitioned tables, you can define rules for the lifecycles of storage tiers. The system can trigger automatic conversion among storage tiers based on the lifecycle rules.
After you define storage tier lifecycle rules for a project, if all the non-partitioned tables or partitions in the project meet the rules, the system automatically performs storage tier conversion for the non-partitioned tables or partitions.
After you define storage tier lifecycle rules for a partitioned table, if all the partitions in the table meet the rules, the system automatically performs storage tier conversion for the partitions.
Usage notes
You cannot separately configure storage tier lifecycle rules for each partition or non-partitioned table.
The storage tier lifecycle rules for a partitioned table have a higher priority than those for a project to which the partitioned table belongs.
If the lifecycle of a table or partition meets the rules for both the long-term and IA storage tiers, the storage tier is preferentially converted into long-term storage.
If the lifecycle of a table or partition meets the rule for IA storage, the storage tier is converted into IA storage. Subsequently, if the lifecycle of the table or partition meets the rule for long-term storage, the storage tier is converted into long-term storage. For the conversion from IA to long-term storage, fees are incurred for access to data at the IA storage tier. For pricing details, see Tiered storage pricing in this topic.
The platform regularly scans the rules twice a day. Therefore, the conversion may not be immediately performed after the rules are met.
Syntax
Configure lifecycle rules at the project level
SETPROJECT odps.table.lifecycle.config=<lifecycle_config_json_string>;
You can also configure through the MaxCompute console:
Log on to the MaxCompute console. In the top navigation bar, select a region.
In the left-side navigation pane, choose Workspace > Projects, find the project you want to manage and click Manage in the Actions column.
In the Lifecycle Configuration section of the Parameter Configuration tab, configure the parameters for the Last Access Configuration Policy and the Last Modified Configuration Policy.
Last access configuration policy: corresponds to the
DaysAfterLastAccessGreaterThan
parameter.Last modified configuration policy: corresponds to the
DaysAfterLastModificationGreaterThan
parameter.
Configure lifecycle rules at the partitioned table level
Configure rules during table creation:
CREATE [EXTERNAL] TABLE [IF NOT EXISTS] <table_name> [PRIMARY KEY (<pk_col_name>, <pk_col_name2>),(<col_name> <data_type> [NOT NULL] [DEFAULT <default_value>] [comment <col_comment>], ...)] PARTITIONED BY (<col_name> <data_type> [comment <col_comment>], ...) tblproperties ('lifecycle_config' = '<lifecycle_config_json_string>') ;
Configure rules by modifying table settings:
ALTER TABLE <TABLE_NAME> SET TBLPROPERTIES ('lifecycle_config' = '<lifecycle_config_json_string>');
View the configurations of the tiered storage lifecycle for the partition table
SHOW CREATE TABLE <table_name>;
Parameter description
This section describes the main parameters. For more information about other parameters, see "Parameters" in Table operations.
lifecycle_config_json_string:
The following code shows the definition of a lifecycle rule at the project level:
{ "TierToLowFrequency": { "DaysAfterLastModificationGreaterThan": <days>, // xx days after the last modification time "DaysAfterLastAccessGreaterThan": <days>, // xx days after the last access time }, "TierToLongterm": { "DaysAfterLastModificationGreaterThan": <days>, "DaysAfterLastAccessGreaterThan": <days> } // Each condition is optional, and multiple conditions have the OR relationship. }
The following code shows the definition of a lifecycle rule at the partitioned table level:
{ \"TierToLowFrequency\": { \"DaysAfterLastModificationGreaterThan\": <days>, // xx days after the last modification time \"DaysAfterLastAccessGreaterThan\": <days>, // xx days after the last access time }, \"TierToLongterm\": { \"DaysAfterLastModificationGreaterThan\": <days>, \"DaysAfterLastAccessGreaterThan\": <days> } // Each condition is optional, and multiple conditions have the OR relationship. }
TierToLowFrequency: the IA storage tier.
TierToLongterm: the long-term storage tier.
DaysAfterLastModificationGreaterThan: Automatic conversion is triggered N days after the last modification time. N is specified by this parameter. This parameter corresponds to LastModifiedTime of the table or partition.
DaysAfterLastAccessGreaterThan: Automatic conversion is triggered N days after the last access time. N is specified by this parameter. If LastAccessTime is left empty for the table or partition, the following principles are applied:
For tables or partitions that you created before October 1, 2023,
2023.10.01 00:00:00
UTC+0 is considered as the last access time.For tables or partitions that you created on or after October 1, 2023, if no data is accessed, the table or partition creation time is considered to be the last access time.
Examples
Example 1: Configure storage tier lifecycle rules at the project level.
setproject odps.table.lifecycle.config={"TierToLongterm":{"DaysAfterLastAccessGreaterThan":180},"TierToLowFrequency":{"DaysAfterLastAccessGreaterThan":120}};
Example 2: Remove the configurations of storage tier lifecycle rules at the project level.
setproject odps.table.lifecycle.config=;
Example 3: Configure storage tier lifecycle rules at the partitioned table level.
-- Configure rules during partitioned table creation. CREATE TABLE lifecycle_part_t (key string) PARTITIONED BY (ds STRING) tblproperties ('lifecycle_config' = '{\"TierToLowFrequency\": {\"DaysAfterLastModificationGreaterThan\": 2,\"DaysAfterLastAccessGreaterThan\": 2},\"TierToLongterm\": {\"DaysAfterLastModificationGreaterThan\": 4,\"DaysAfterLastTierModificationGreaterThan\": 7}}') ; -- Configure rules by modifying table settings. ALTER TABLE lifecycle_part_t SET tblproperties ('lifecycle_config'='{\"TierToLowFrequency\": {\"DaysAfterLastModificationGreaterThan\": 90,\"DaysAfterLastAccessGreaterThan\": 30},\"TierToLongterm\": {\"DaysAfterLastModificationGreaterThan\": 180,\"DaysAfterLastTierModificationGreaterThan\": 7}}');
Example 4: Remove the configurations of the tiered storage lifecycle for the partition table.
ALTER TABLE lifecycle_part_t SET tblproperties ('lifecycle_config'='{}');
Configure access privileges for tiered storage data
Access incurs fees when retrieving data from the IA or cold storage tiers. You can leverage the GET_PARTITION_META function to manage permissions for data within the IA or cold storage tiers based on the Row-level access control feature, thereby effectively regulating data access.
GET_PARTITION_META
GET_PARTITION_META is a special function that can only be used with row-level permissions and is not applicable in standard SQL queries.
Syntax
struct GET_PARTITION_META(<tableName>, <pt_col1>, <pt_col2>, ..., <pt_col_n>);
Parameter description
Parameter | Description |
tableName | The name of the table, which must be a partitioned table. The value is the String type and can be in the format |
pt_col | Each parameter from pt_col1 to pt_col_n corresponds to a partition level of the partitioned table. Each parameter must be a column reference. |
Return value description
Returns a Struct (struct<storagetier:string>
) type value. The Struct contains one String field that describes the storage class of the corresponding partition.
Precautions
When adding row-level rules to a table, you need to consider the access behavior of other users. If other users have accessed the table, set explicit rules for them to prevent unexpected access denial. For more details, see Row-level access control.
MaxCompute tables can be accessed through SQL and external engines, such as Spark or Flink. However, the GET_PARTITION_META function is currently supported only by the MaxCompute SQL engine. Tables using the GET_PARTITION_META function in row-level access control are inaccessible by other engines.
To access data, you must have data access privileges in addition to row-level permissions, specifically the SELECT permission on the data.
Filter conditions in GET_PARTITION_META result in different partition pruning effects depending on the scenario:
The filter conditions involve only partition fields. For example, allowing access to standard storage with the first-level partition value being
2024
.GET_PARTITION_META('storage_table', pt1, pt2).storagetier == 'standard') AND pt1='2024'
When utilizing SQL to query
storage_table
, if the WHERE clause lacks partition conditions, the system automatically supports partition pruning to prevent a full table scan. This ensures that only the partitions meeting the requirements instorage_table
are accessed for read-only mode.The filter condition contains a non-partition field value and connects two filter conditions by using an AND operation. For example, allowing access to standard storage while concurrently applying an AND operation for non-partition values where
a > 100
.GET_PARTITION_META('storage_table', pt1, pt2).storagetier == 'standard') AND a > 100
When utilizing SQL to query
storage_table
, if the WHERE clause lacks partition conditions, partition pruning is supported. In this case, only the partitions designated asstandard
are accessed.The filter condition contains a non-partition field value and connects two filter conditions by using the OR operation. For example, allowing access to standard storage while concurrently applying an OR operation for non-partition values where
a>100
.get_partition_meta('storage_table', pt1, pt2).storagetier == 'standard') OR a > 100
When utilizing SQL to query
storage_table
, you may encounter one of two scenarios:If the WHERE clause lacks partition conditions, this results in a full table scan to retrieve data satisfying the condition
a>100
.If the WHERE clause includes partition conditions, only data within the relevant partition that meets the condition
standard
ora>100
is scanned.
Examples
Define the partitioned table storage_table
.
CREATE TABLE storage_table(a BIGINT, b BIGINT) PARTITIONED BY (pt1 STRING, pt2 STRING);
Example 1:
Grant the
policy01
permission to the default user to access standard storage data instorage_table
. Users cannot access IA or long-term storage data, and null values are returned.CREATE ROW ACCESS POLICY policy01 ON storage_table TO DEFAULT FILTER USING (get_partition_meta('storage_table', pt1, pt2).storagetier == 'standard');
Example 2:
Grant the
policy02
permission to useruser_x
to access IA and long-term storage data instorage_table
.CREATE ROW ACCESS POLICY policy02 ON storage_table TO USER (user_x) -- You can also grant permissions to a role by using the TO role rolename clause and then grant the role to the user. FILTER USING (get_partition_meta('storage_table', pt1, pt2).storagetier IN ('lowfrequency','longterm'));
Two possible scenarios:
When the
policy01
permission is configured for the project, all users (except user_x) are granted access to the standard storage data instorage_table
. However, they do not have access to retrieve IA or long-term storage data, resulting in null return values. User user_x has access to the IA and long-term storage data withinstorage_table
, but is denied access to the standard storage data, which also results in null values.When the
policy01
permission is not configured for the project, all users (except user_x) are denied access to any storage data instorage_table
, resulting in null return values. Only user user_x is granted access to the IA and long-term storage data, but is denied access to the standard storage data, also resulting in null values.
Example 3:
Grant the
policy03
permission to useruser_y
to access all data instorage_table
.CREATE ROW ACCESS POLICY policy03 ON storage_table TO USER (user_y) -- You can also grant permissions to a role by using the TO role rolename clause and then grant the role to the user. FILTER USING (true); -- The constant true indicates that all data in the table can be accessed
Two possible scenarios :
When the
policy01
permission is configured for the project, all users (except for user_y) are granted access to the standard storage data within thestorage_table
. However, they do not have access to retrieve IA or long-term storage data, resulting in null return values. Useruser_y
has access to all storage classes in thestorage_table
.When the
policy01
permission is not configured for the project, all users (except for user_y) are denied to access any storage data within thestorage_table
, resulting in null return values. Onlyuser_y
is granted access to all storage class data in thestorage_table
.