The lifecycle of a MaxCompute table or a partition starts at the time when the data in the table or partition was last updated. If the data remains unchanged for a specific period of time, MaxCompute automatically reclaims the table or partition. The period of time for which the data remains unchanged is the lifecycle of the table or the partition. You can configure lifecycle rules to implement automatic data cleansing or data retention to reduce storage costs.
Lifecycle rules
The value of a lifecycle is a positive integer. Unit: days.
If the data in a non-partitioned table remains unchanged for the entire lifecycle of the table, MaxCompute automatically executes a statement that is similar to DROP TABLE to reclaim the table. The lifecycle of a non-partitioned table starts from the time that is specified by the LastModifiedTime parameter. This parameter specifies the time when the data in the non-partitioned table was last modified.
The partitions of a table can be reclaimed separately. MaxCompute automatically reclaims partitions whose data remains unchanged for the entire lifecycle. The lifecycle of a partition starts from the time that is specified by the LastModifiedTime parameter. This parameter specifies the time when the data in the partition was last modified. Unlike a non-partitioned table, a partitioned table is not dropped even if all of its partitions have been reclaimed.
Note: A lifecycle-based table scan is performed at a scheduled time each day to scan all the partitions of a table. A partition can be reclaimed only if the period after the time specified by the LastModifiedTime parameter exceeds the lifecycle.
For example, the lifecycle of a partitioned table is one day and the data in one of the table partitions was last modified at 15:00 on February 17, 2020. If MaxCompute scans this table before 15:00 of February 18, 2020, the table partition is not reclaimed because the period after the time specified by LastModifiedTime is less than the one-day lifecycle. If MaxCompute scans this table on February 19, 2020, the table partition is reclaimed because the period after the time specified by LastModifiedTime exceeds the one-day lifecycle.
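The eligibility rule in the example above can be sketched in Python. This is an illustrative model only, not MaxCompute's implementation: the function name is_reclaimable and its parameters are hypothetical, and the two scan times are assumed values chosen to match the example.

```python
from datetime import datetime, timedelta


def is_reclaimable(last_modified: datetime, lifecycle_days: int, scan_time: datetime) -> bool:
    """Return True if the time elapsed since last_modified exceeds the lifecycle.

    Hypothetical helper: models the documented rule, not MaxCompute internals.
    """
    if lifecycle_days <= 0:
        raise ValueError("lifecycle must be a positive integer number of days")
    return scan_time - last_modified > timedelta(days=lifecycle_days)


# Partition last modified at 15:00 on February 17, 2020; lifecycle of one day.
last_modified = datetime(2020, 2, 17, 15, 0)

# Assumed scan before 15:00 on February 18, 2020: less than one day has elapsed.
early_scan = is_reclaimable(last_modified, 1, datetime(2020, 2, 18, 10, 0))

# Assumed scan on February 19, 2020: more than one day has elapsed.
late_scan = is_reclaimable(last_modified, 1, datetime(2020, 2, 19, 10, 0))
```

The comparison is strict because the rule requires the elapsed period to exceed the lifecycle, not merely to reach it.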
The lifecycle feature allows MaxCompute to periodically reclaim a table or a partition. Whether MaxCompute can reclaim a table or a partition as soon as the period after the time specified by LastModifiedTime exceeds the lifecycle depends on the availability of the system. Therefore, a table or a partition may not be reclaimed immediately after its lifecycle elapses.
After a table is dropped, all properties of the table are dropped, including the lifecycle. If you create another table that has the same name as the dropped table, the lifecycle properties of the table that you create take effect.
You can specify a lifecycle for tables. You cannot specify a lifecycle for partitions. The lifecycle that you specify for a partitioned table applies to all partitions of the table. You can specify a lifecycle when you create a table.
If you do not specify a lifecycle for a table, the table will not be automatically reclaimed based on lifecycle rules.
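The table-level rules above can be summarized in a short Python sketch. It is a simplified model under stated assumptions, not how MaxCompute is implemented: the PartitionedTable class, the scan function, and the partition timestamps are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import Dict, List, Optional


@dataclass
class PartitionedTable:
    """Hypothetical model: one lifecycle per table, applied to every partition."""
    name: str
    lifecycle_days: Optional[int] = None  # None means no lifecycle was specified
    # Maps a partition spec to the time its data was last modified.
    partitions: Dict[str, datetime] = field(default_factory=dict)


def scan(table: PartitionedTable, scan_time: datetime) -> List[str]:
    """Remove expired partitions and return their specs; the table itself is kept."""
    if table.lifecycle_days is None:
        return []  # no lifecycle: nothing is reclaimed automatically
    limit = timedelta(days=table.lifecycle_days)
    expired = [spec for spec, modified in table.partitions.items()
               if scan_time - modified > limit]
    for spec in expired:
        del table.partitions[spec]
    return expired


# Assumed timestamps, for illustration only.
sales = PartitionedTable("sale_detail", lifecycle_days=1, partitions={
    "sale_date=2013/region=china": datetime(2020, 2, 17, 15, 0),
    "sale_date=2014/region=china": datetime(2020, 2, 18, 23, 0),
})
reclaimed = scan(sales, datetime(2020, 2, 19, 1, 0))

# A table without a lifecycle keeps all of its partitions.
archive = PartitionedTable("archive", partitions={"ds=2013": datetime(2013, 1, 1)})
archive_reclaimed = scan(archive, datetime(2020, 2, 19, 1, 0))
```

In this model only the older partition is removed, and the sales table object still exists afterward, which mirrors the rule that a partitioned table is not dropped when its partitions are reclaimed.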
A table or partition is reclaimed based on lifecycle rules by the Alibaba Cloud service maxcompute.aliyuncs.com. You can view the operation record about the reclamation in the ActionTrail console. For more information, see Get started with the event query feature. Examples:
The table bettergithubanalytics.test_lifecycle is automatically reclaimed based on lifecycle rules by MaxCompute. The following figures show the operation record about the reclamation.
The partition sale_date=2013/region=china of the table bettergithubanalytics.sale_detail is automatically reclaimed based on lifecycle rules by MaxCompute. The following figures show the operation record about the reclamation.
You can also use the Data Map service of DataWorks to view the operation record of reclaiming a table or partition. For more information, see Overview. The account that is used to reclaim the table or partition is the system account of MaxCompute, odps_user@aliyun.com. The name of the Alibaba Cloud service is maxcompute.aliyuncs.com.
References
For more information about table operations, such as specifying a lifecycle, modifying the lifecycle rules of a table, and changing the value of the LastModifiedTime parameter for a table, see Table operations.
For more information about lifecycle-related operations, such as specifying a lifecycle for a table, and disabling or restoring a lifecycle, see Lifecycle management operations.