All Products
Search
Document Center

MaxCompute:Configure storage tiers for storage resources

Last Updated:Dec 23, 2024

MaxCompute supports three storage tiers: standard storage, infrequent access (IA) storage, and long-term storage. By default, data is stored at the standard storage tier. Based on how frequently you need to access the data, you can specify the storage tier for a specific table or partition as either IA or long-term. This way, cold (infrequently accessed) and hot (frequently accessed) data are separately stored, reducing data storage costs.

Apply for the use of the feature

  • New projects use the storage tiers feature by default, including the China (Hong Kong), Japan (Tokyo), Indonesia (Jakarta), Malaysia (Kuala Lumpur), and Germany (Frankfurt) regions.

  • If you want to use the tiered storage feature for existing projects in additional regions, such as Singapore, click the link to apply for the use of the feature. For more information, see Apply for trial use of new features.

Storage tiers

Storage type

Description

Standard storage

The default storage type, which is suitable for scenarios where data is frequently accessed, and requires frequent read and write operations.

IA storage

IA and long-term storage are suitable for data that is infrequently accessed. IA and long-term storage can significantly reduce storage costs.

Note
  • Platform for AI (PAI) and Hologres cannot directly access data at the IA or long-term storage tier.

  • If data is accessed in large amounts or frequently, the fees incurred may be higher than the data at the standard storage tier. The following content is for demonstration purposes:

    • If you access all data in a table or partition at the IA storage tier once in a month, the sum of the data access fees and storage fees is the same as the storage fees of the data at the standard storage tier.

    • If you access all data in a table or partition at the long-term storage tier once in six months, the sum of the data access fees and storage fees is the same as the storage fees of the data at the standard storage tier.

Long-term storage

Tiered storage pricing

Storage type

Billing formula

Pricing

Description

Standard storage

Daily storage fees = Average daily storage volume × Unit price

If the actual storage volume of a project is 1 GB or less than 1 GB, the data storage is free of charge. If the actual storage volume exceeds 1 GB, you are charged for the extra data volume based on the unit price USD 0.0006/GB/day.

Average daily storage volume = Sum of the collected storage volume/24

Infrequent access (IA) storage

  • Daily storage fees = Average daily storage volume × Unit price

  • Data access fees = Amount of accessed data × Unit price

  • Unit price for IA storage: USD 0.000412/GB/day.

  • Unit price for access to data of IA storage: USD 0.006191/GB.

  • Average daily storage volume = Sum of the collected storage volume/24

  • If the storage type of a table or partition is changed from IA to standard or long-term storage, you are charged data access fees for IA storage of the entire table or partition. No data access fees are charged when IA storage is automatically converted to long-term through the tiered storage lifecycle.

  • If the storage type of a table or partition is changed from long-term to standard or IA storage, you are charged data access fees for long-term storage of the entire table or partition.

Long-term storage

  • Daily storage fees = Average daily storage volume × Unit price

  • Data access fees = Amount of accessed data × Unit price

  • Unit price for long-term storage: USD 0.00017/GB/day.

  • Unit price for access to data of long-term storage: USD 0.080793/GB.

  • For more information about storage pricing, see Storage pricing.

  • You can log on to the Expenses and Costs console and choose Bills > Bill Details in the left-side navigation pane. On the Bill Details page, view the fees that are incurred for data storage and access at different storage tiers. For more information, see View billing details.

Precautions

  • PAI and Hologres cannot directly access data at the IA or long-term storage tier.

  • If the storage tier of a table or partition is set to IA or long-term storage, take note of the amount of data and the data access frequency. If data is accessed in large amounts or frequently, the fees incurred may be higher than the data at the standard storage tier.

    • If you access all data in a table or partition at the IA storage tier once in a month, the sum of the data access fees and storage fees are the same as the storage fees of the data at the standard storage tier.

    • If you access all data in a table or partition at the long-term storage tier once in six months, the sum of the data access fees and storage fees are the same as the storage fees of the data at the standard storage tier.

Specify a storage type

Storage type can be mutually converted. The conversion does not negatively affect data access performance.

image

Conversion process

Description

Standard storage to IA storage

Supports both manual and automatic conversion, with no I/O access fees. The fields of the table or the partition, such as "last data update time" and "last data access time," are not updated.

Standard storage to Long-term storage

IA storage to Long-term storage

Supports both manual and automatic conversion. Manual conversion incurs I/O access fees, whereas automatic conversion does not. The fields of the table or the partition, such as "last data update time" and "last data access time," are not updated.

IA storage to Standard storage

Only supports manual conversion, which incurs I/O access fees. The fields of the table or the partition, such as "last data update time" and "last data access time," are updated.

Long-term storage to IA storage

Long-term storage to Standard storage

Manual configuration

For non-partitioned tables or partitions in partitioned tables, you can manually specify the IA or long-term storage tier. The setting takes effect immediately after manual configuration.

Syntax

ALTER TABLE <TABLE_NAME> [PARTITION(<PARTITION_SPEC>)]
        SET <TBLPROPERTIES|PARTITIONPROPERTIES>("storagetier"="standard|lowfrequency|longterm");

Parameter description

  • TABLE_NAME: the name of a table. This parameter is required regardless of whether you want to specify a storage tier for a table or a partition.

  • PARTITION_SPEC: required when you want to specify a storage tier for a partition.

  • TBLPROPERTIES|PARTITIONPROPERTIES: the properties of a table or partition for which you want to specify a storage tier.

    • TBLPROPERTIES: the properties of a table.

    • PARTITIONPROPERTIES: the properties of a partition.

  • storagetier: required. The storage tier. Valid values:

    • standard: standard storage. You are charged only storage fees.

    • lowfrequency: IA storage. You are charged storage fees and the fees incurred for access to data at the IA storage tier.

    • longterm: long-term storage. You are charged storage fees and the fees incurred for access to data at the long-term storage tier.

Note

For partitioned tables, storage tiers can be specified only at the partition level and cannot be specified at the table level.

Examples

  • Example 1: Specify IA storage as the storage tier for a non-partitioned table.

    ALTER TABLE tablename
            SET TBLPROPERTIES("storagetier"="lowfrequency");

    View the StorageTier field in table properties to check the current storage tier.

    -- View table properties.
    DESC extended tablename;  
    
    -- The following result is returned:
    +-------------------------------------------------------------------+
    | Owner:                    ALIYUN$mofan_****@test.aliyunid.com      |
    | Project:                  mf_mc_****                                |
    | TableComment:                                                     |
    +-------------------------------------------------------------------+
    | CreateTime:               2021-11-18 15:14:00                     |
    | LastDDLTime:              2023-09-11 14:34:55                     |
    | LastModifiedTime:         2023-09-13 15:02:28                     |
    | LastAccessTime:           2023-09-14 10:50:57                     |
    +-------------------------------------------------------------------+
    | InternalTable: YES      | Size: 1923683131                        |
    +-------------------------------------------------------------------+
    | Native Columns:                                                   |
    +-------------------------------------------------------------------+
    | Field| Type| Label |ExtendedLabel| Nullable| DefaultValue|Comment |
    +-------------------------------------------------------------------+
    | empno    | bigint |       |               | true     | NULL  |    |
    | ename    | string |       |               | true     | NULL  |    |
    | job      | string |       |               | true     | NULL  |    |
    | mgr      | bigint |       |               | true     | NULL  |    |
    | hiredate | datetime |     |               | true     | NULL  |    |
    | sal      | bigint |       |               | true     | NULL  |    |
    | comm     | bigint |       |               | true     | NULL  |    |
    | deptno   | bigint |       |               | true     | NULL  |    |
    +-------------------------------------------------------------------+
    | Extended Info:                                                    |
    +-------------------------------------------------------------------+
    | TableID:                  8e0cc78c81ab4ad7af30bff7a8e****         |
    | IsArchived:               false                                   |
    | PhysicalSize:             5771049393                              |
    | FileNum:                  3                                       |
    | StoredAs:                 AliOrc                                  |
    | CompressionStrategy:      normal                                  |
    | odps.timemachine.retention.days: 1                                |
    | ColdStorageStatus:        N/A                                     |
    | encryption_enable:        false                                   |
    | StorageTier:              lowfrequency                            |
    | StorageTierLastModifiedTime:  2023-09-11 14:34:55                 |
    +-------------------------------------------------------------------+
  • Example 2: Specify IA storage as the storage tier for a partition in a partitioned table named bank_data_pt.

    ALTER TABLE bank_data_pt  PARTITION (credit='yes') SET PARTITIONPROPERTIES ("storagetier" = 'lowfrequency');

    View the StorageTier field in partition properties to check the current storage tier.

    -- View partition properties.
    DESC extended bank_data_pt PARTITION(credit='yes');  
    
    -- The following result is returned:
    +------------------------------------------------------------------------------------+
    | PartitionSize: 0                                                                   |
    +------------------------------------------------------------------------------------+
    | CreateTime:               2024-05-10 10:28:16                                      |
    | LastDDLTime:              2024-05-10 10:31:01                                      |
    | LastModifiedTime:         2024-05-10 10:28:16                                      |
    +------------------------------------------------------------------------------------+
    | IsExstore:                false                                                    |
    | IsArchived:               false                                                    |
    | PhysicalSize:             0                                                        |
    | FileNum:                  0                                                        |
    | ColdStorageStatus:        N/A                                                      |
    | StorageTier:              LowFrequency                                             |
    | StorageTierLastModifiedTime:  2024-05-10 10:31:01                                  |
    +------------------------------------------------------------------------------------+

Automatic configuration based on lifecycle rules

For projects or partitioned tables, you can define rules for the lifecycles of storage tiers. The system can trigger automatic conversion among storage tiers based on the lifecycle rules.

  • After you define storage tier lifecycle rules for a project, if all the non-partitioned tables or partitions in the project meet the rules, the system automatically performs storage tier conversion for the non-partitioned tables or partitions.

  • After you define storage tier lifecycle rules for a partitioned table, if all the partitions in the table meet the rules, the system automatically performs storage tier conversion for the partitions.

Usage notes

  • You cannot separately configure storage tier lifecycle rules for each partition or non-partitioned table.

  • The storage tier lifecycle rules for a partitioned table have a higher priority than those for a project to which the partitioned table belongs.

  • If the lifecycle of a table or partition meets the rules for both the long-term and IA storage tiers, the storage tier is preferentially converted into long-term storage.

  • If the lifecycle of a table or partition meets the rule for IA storage, the storage tier is converted into IA storage. Subsequently, if the lifecycle of the table or partition meets the rule for long-term storage, the storage tier is converted into long-term storage. For the conversion from IA to long-term storage, fees are incurred for access to data at the IA storage tier. For pricing details, see Tiered storage pricing in this topic.

  • The platform regularly scans the rules twice a day. Therefore, the conversion may not be immediately performed after the rules are met.

Syntax

  • Configure lifecycle rules at the project level

    SETPROJECT odps.table.lifecycle.config=<lifecycle_config_json_string>;

    You can also configure through the MaxCompute console:

    1. Log on to the MaxCompute console. In the top navigation bar, select a region.

    2. In the left-side navigation pane, choose Workspace > Projects, find the project you want to manage and click Manage in the Actions column.

    3. In the Lifecycle Configuration section of the Parameter Configuration tab, configure the parameters for the Last Access Configuration Policy and the Last Modified Configuration Policy.

      fc44a71b8f05b603d6716bb468198402

      • Last access configuration policy: corresponds to the DaysAfterLastAccessGreaterThan parameter.

      • Last modified configuration policy: corresponds to the DaysAfterLastModificationGreaterThan parameter.

  • Configure lifecycle rules at the partitioned table level

    • Configure rules during table creation:

      CREATE [EXTERNAL] TABLE [IF NOT EXISTS] <table_name>
       [PRIMARY KEY (<pk_col_name>, <pk_col_name2>),(<col_name> <data_type> [NOT NULL] [DEFAULT <default_value>] [comment <col_comment>], ...)]
       PARTITIONED BY (<col_name> <data_type> [comment <col_comment>], ...)
      tblproperties ('lifecycle_config' = '<lifecycle_config_json_string>')
      ;
    • Configure rules by modifying table settings:

      ALTER TABLE <TABLE_NAME> SET TBLPROPERTIES ('lifecycle_config' = '<lifecycle_config_json_string>');
  • View the configurations of the tiered storage lifecycle for the partition table

    SHOW CREATE TABLE <table_name>;

Parameter description

This section describes the main parameters. For more information about other parameters, see "Parameters" in Table operations.

lifecycle_config_json_string:

  • The following code shows the definition of a lifecycle rule at the project level:

    {
      "TierToLowFrequency": {
        "DaysAfterLastModificationGreaterThan": <days>, // xx days after the last modification time
        "DaysAfterLastAccessGreaterThan": <days>, // xx days after the last access time
      },
      "TierToLongterm": {
        "DaysAfterLastModificationGreaterThan": <days>,
        "DaysAfterLastAccessGreaterThan": <days>
      }
      // Each condition is optional, and multiple conditions have the OR relationship.
    }
  • The following code shows the definition of a lifecycle rule at the partitioned table level:

    {
      \"TierToLowFrequency\": {
        \"DaysAfterLastModificationGreaterThan\": <days>, // xx days after the last modification time
        \"DaysAfterLastAccessGreaterThan\": <days>, // xx days after the last access time
      },
      \"TierToLongterm\": {
        \"DaysAfterLastModificationGreaterThan\": <days>,
        \"DaysAfterLastAccessGreaterThan\": <days>
      }
      // Each condition is optional, and multiple conditions have the OR relationship.
    }
  • TierToLowFrequency: the IA storage tier.

  • TierToLongterm: the long-term storage tier.

  • DaysAfterLastModificationGreaterThan: Automatic conversion is triggered N days after the last modification time. N is specified by this parameter. This parameter corresponds to LastModifiedTime of the table or partition.

  • DaysAfterLastAccessGreaterThan: Automatic conversion is triggered N days after the last access time. N is specified by this parameter. If LastAccessTime is left empty for the table or partition, the following principles are applied:

    • For tables or partitions that you created before October 1, 2023, 2023.10.01 00:00:00 UTC+0 is considered as the last access time.

    • For tables or partitions that you created on or after October 1, 2023, if no data is accessed, the table or partition creation time is considered to be the last access time.

Examples

  • Example 1: Configure storage tier lifecycle rules at the project level.

    setproject odps.table.lifecycle.config={"TierToLongterm":{"DaysAfterLastAccessGreaterThan":180},"TierToLowFrequency":{"DaysAfterLastAccessGreaterThan":120}};
  • Example 2: Remove the configurations of storage tier lifecycle rules at the project level.

    setproject odps.table.lifecycle.config=;
  • Example 3: Configure storage tier lifecycle rules at the partitioned table level.

    -- Configure rules during partitioned table creation.
    CREATE TABLE lifecycle_part_t (key string) 
    PARTITIONED BY (ds  STRING)
    tblproperties ('lifecycle_config' = '{\"TierToLowFrequency\": {\"DaysAfterLastModificationGreaterThan\": 2,\"DaysAfterLastAccessGreaterThan\": 2},\"TierToLongterm\": {\"DaysAfterLastModificationGreaterThan\": 4,\"DaysAfterLastTierModificationGreaterThan\": 7}}')
    ;
    -- Configure rules by modifying table settings.
    ALTER TABLE lifecycle_part_t SET tblproperties ('lifecycle_config'='{\"TierToLowFrequency\": {\"DaysAfterLastModificationGreaterThan\": 90,\"DaysAfterLastAccessGreaterThan\": 30},\"TierToLongterm\": {\"DaysAfterLastModificationGreaterThan\": 180,\"DaysAfterLastTierModificationGreaterThan\": 7}}');
  • Example 4: Remove the configurations of the tiered storage lifecycle for the partition table.

    ALTER TABLE lifecycle_part_t SET tblproperties ('lifecycle_config'='{}');

Configure access privileges for tiered storage data

Access incurs fees when retrieving data from the IA or cold storage tiers. You can leverage the GET_PARTITION_META function to manage permissions for data within the IA or cold storage tiers based on the Row-level access control feature, thereby effectively regulating data access.

GET_PARTITION_META

Note

GET_PARTITION_META is a special function that can only be used with row-level permissions and is not applicable in standard SQL queries.

Syntax

struct GET_PARTITION_META(<tableName>, <pt_col1>, <pt_col2>, ..., <pt_col_n>);

Parameter description

Parameter

Description

tableName

The name of the table, which must be a partitioned table. The value is the String type and can be in the format project.table or project.schema.table.

pt_col

Each parameter from pt_col1 to pt_col_n corresponds to a partition level of the partitioned table. Each parameter must be a column reference.

Return value description

Returns a Struct (struct<storagetier:string>) type value. The Struct contains one String field that describes the storage class of the corresponding partition.

Precautions

  • When adding row-level rules to a table, you need to consider the access behavior of other users. If other users have accessed the table, set explicit rules for them to prevent unexpected access denial. For more details, see Row-level access control.

  • MaxCompute tables can be accessed through SQL and external engines, such as Spark or Flink. However, the GET_PARTITION_META function is currently supported only by the MaxCompute SQL engine. Tables using the GET_PARTITION_META function in row-level access control are inaccessible by other engines.

  • To access data, you must have data access privileges in addition to row-level permissions, specifically the SELECT permission on the data.

  • Filter conditions in GET_PARTITION_META result in different partition pruning effects depending on the scenario:

    • The filter conditions involve only partition fields. For example, allowing access to standard storage with the first-level partition value being 2024.

      GET_PARTITION_META('storage_table', pt1, pt2).storagetier == 'standard') AND pt1='2024'

      When utilizing SQL to query storage_table, if the WHERE clause lacks partition conditions, the system automatically supports partition pruning to prevent a full table scan. This ensures that only the partitions meeting the requirements in storage_table are accessed for read-only mode.

    • The filter condition contains a non-partition field value and connects two filter conditions by using an AND operation. For example, allowing access to standard storage while concurrently applying an AND operation for non-partition values where a > 100.

      GET_PARTITION_META('storage_table', pt1, pt2).storagetier == 'standard') AND a > 100

      When utilizing SQL to query storage_table, if the WHERE clause lacks partition conditions, partition pruning is supported. In this case, only the partitions designated as standard are accessed.

    • The filter condition contains a non-partition field value and connects two filter conditions by using the OR operation. For example, allowing access to standard storage while concurrently applying an OR operation for non-partition values where a>100.

      get_partition_meta('storage_table', pt1, pt2).storagetier == 'standard') OR a > 100

      When utilizing SQL to query storage_table, you may encounter one of two scenarios:

      • If the WHERE clause lacks partition conditions, this results in a full table scan to retrieve data satisfying the condition a>100.

      • If the WHERE clause includes partition conditions, only data within the relevant partition that meets the condition standard or a>100 is scanned.

Examples

Define the partitioned table storage_table.

CREATE TABLE storage_table(a BIGINT, b BIGINT) PARTITIONED BY (pt1 STRING, pt2 STRING);
  • Example 1:

    Grant the policy01 permission to the default user to access standard storage data in storage_table. Users cannot access IA or long-term storage data, and null values are returned.

    CREATE ROW ACCESS POLICY policy01 
    ON storage_table 
    TO DEFAULT 
    FILTER USING (get_partition_meta('storage_table', pt1, pt2).storagetier == 'standard');
  • Example 2:

    Grant the policy02 permission to user user_x to access IA and long-term storage data in storage_table.

    CREATE ROW ACCESS POLICY policy02 
    ON storage_table 
    TO USER (user_x)  -- You can also grant permissions to a role by using the TO role rolename clause and then grant the role to the user.
    FILTER USING (get_partition_meta('storage_table', pt1, pt2).storagetier IN ('lowfrequency','longterm'));

    Two possible scenarios:

    • When the policy01 permission is configured for the project, all users (except user_x) are granted access to the standard storage data in storage_table. However, they do not have access to retrieve IA or long-term storage data, resulting in null return values. User user_x has access to the IA and long-term storage data within storage_table, but is denied access to the standard storage data, which also results in null values.

    • When the policy01 permission is not configured for the project, all users (except user_x) are denied access to any storage data in storage_table, resulting in null return values. Only user user_x is granted access to the IA and long-term storage data, but is denied access to the standard storage data, also resulting in null values.

  • Example 3:

    Grant the policy03 permission to user user_y to access all data in storage_table.

    CREATE ROW ACCESS POLICY policy03 
    ON storage_table 
    TO USER (user_y)  -- You can also grant permissions to a role by using the TO role rolename clause and then grant the role to the user.
    FILTER USING (true);  -- The constant true indicates that all data in the table can be accessed

    Two possible scenarios :

    • When the policy01 permission is configured for the project, all users (except for user_y) are granted access to the standard storage data within the storage_table. However, they do not have access to retrieve IA or long-term storage data, resulting in null return values. User user_y has access to all storage classes in the storage_table.

    • When the policy01 permission is not configured for the project, all users (except for user_y) are denied to access any storage data within the storage_table, resulting in null return values. Only user_y is granted access to all storage class data in the storage_table.