AnalyticDB for PostgreSQL provides the tiered storage feature, which helps you reduce storage costs by moving infrequently accessed data from hot tables on local disks to Object Storage Service (OSS) as cold tables. This topic describes the limitations and usage of tiered storage.
In this topic, tables stored on local disks are referred to as hot tables, and tables stored in remote OSS are referred to as cold tables.
Requirements
AnalyticDB for PostgreSQL V6.0 instances running minor version 6.3.11.1 or later.
AnalyticDB for PostgreSQL V7.0 instances running minor version 7.0.3.0 or later.
You can view the minor version on the Basic Information page of an instance in the AnalyticDB for PostgreSQL console. If your instance does not meet the required versions, update the minor version of the instance.
Limitations
Serverless mode is not supported.
The limitations of the tiered storage feature vary slightly between AnalyticDB for PostgreSQL V6.0 and AnalyticDB for PostgreSQL V7.0 instances. Review the limitations that apply to your instance version.
v6.0
You can convert a non-partitioned table to a cold table.
You can convert an entire partitioned table to a cold table, or convert specific subpartitions to cold partitions.
You cannot convert subpartitions of an indexed partitioned table to cold partitions, nor can you create an index on a partitioned table that already contains cold partitions.
When a hot table is converted to a cold table, associated objects such as primary keys, indexes, sequences, rules, and comments are automatically deleted and cannot be recovered.
Data in cold tables or cold partitions is read-only. You cannot perform write, delete, or update operations. Data Definition Language (DDL) operations, such as ALTER COLUMN and DROP COLUMN, are also not supported. You can, however, drop a cold table with the DROP TABLE statement.
AnalyticDB for PostgreSQL does not support directly converting cold tables or cold partitions to their hot equivalents. You can run the CREATE TABLE AS SELECT statement to create a new hot table and migrate the data from the cold table to the new hot table, as shown in the sketch below.
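A minimal sketch of this back-conversion pattern, assuming a hypothetical cold table named my_cold_table:
-- Rebuild the cold table's data as a new hot table on local disks.
CREATE TABLE my_cold_table_hot AS SELECT * FROM my_cold_table;
-- In Greenplum-based instances, you can append DISTRIBUTED BY (<column>) to control distribution.
DROP TABLE my_cold_table;  -- DROP TABLE is supported on cold tables, as noted above.
ALTER TABLE my_cold_table_hot RENAME TO my_cold_table;
Any primary keys, indexes, sequences, rules, and comments that were deleted during the original conversion must be re-created manually on the new hot table.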
v7.0
You can convert a non-partitioned table to a cold table.
You can convert an entire partitioned table to a cold table, or convert specific subpartitions to cold partitions.
SELECT and INSERT operations are allowed on cold tables and cold partitions. DELETE and UPDATE operations are not supported. Some DDL operations on cold tables, such as ALTER COLUMN and DROP COLUMN, are in invitation-only beta. To use this feature, submit a ticket.
You can convert subpartitions of a partitioned table with regular indexes to cold subpartitions, but not those with a primary key or a unique index.
When a hot table is converted to a cold table, associated objects such as primary keys, indexes, sequences, rules, and comments are automatically deleted and cannot be recovered.
AnalyticDB for PostgreSQL does not support directly converting cold tables or cold partitions to their hot equivalents. You can run the CREATE TABLE AS SELECT statement to create a new hot table and migrate the data from the cold table to the new hot table.
Billing
After you convert a hot table to a cold table, the data is stored in OSS, which incurs storage fees. The billing rules are as follows:
Cold storage is billed on a pay-as-you-go basis.
Cold storage usage is collected every 5 minutes and billed hourly.
The price is the same as the price for OSS standard storage. For more information, see OSS Pricing.
For example, in the Chinese mainland region, the OSS price is USD 0.017 per GB per month. Based on a 30-day month, the hourly price is USD 0.017 ÷ 30 ÷ 24 ≈ USD 0.0000236111 per GB. The price on your bill is final.
You can view the billing details for cold storage in the console.
Usage
The process of converting a hot table to a cold table involves creating a temporary table, writing data to it, and uploading the data to OSS. These operations consume local and network I/O resources, which may affect query performance on your instance. Be aware of the potential impact on your business.
After a hot table is converted to a cold table, the local disk space that it occupied is released.
For AnalyticDB for PostgreSQL V6.0 instances, the conversion is scheduled at a specified time. The scheduling and queuing process may take some time. For AnalyticDB for PostgreSQL V7.0 instances, the conversion starts immediately when you execute the statement. The total time required depends on the instance specifications, the number of tables being converted simultaneously, and the data volume. For more information, see Performance.
The usage of the tiered storage feature differs between AnalyticDB for PostgreSQL V6.0 and AnalyticDB for PostgreSQL V7.0 instances. Use the syntax that applies to your instance version.
v6.0
Convert a non-partitioned table
Syntax
ALTER TABLE <tableName> SET ttl interval '<scheduling_interval>' move to storage_cold;
Examples
Create a non-partitioned table named tiered_storage_heap and write data to it.
CREATE TABLE tiered_storage_heap (a int, b int);
INSERT INTO tiered_storage_heap SELECT random() * 1000, 1 FROM generate_series(1,1000);
Example 1: Move the non-partitioned table tiered_storage_heap to cold storage after three days (3days). For example, if you run the ALTER TABLE statement at 09:00:00 on July 17, 2023, the entire table is moved to cold storage three days later at 09:00:00 on July 20, 2023.
ALTER TABLE tiered_storage_heap SET ttl interval '3days' move to storage_cold;
Example 2: Move the non-partitioned table tiered_storage_heap to cold storage at a specific time (16:53:58 on July 28, 2023).
ALTER TABLE tiered_storage_heap SET ttl '2023-07-28 16:53:58'::Timestamp move to storage_cold;
Example 3: Immediately convert a table by setting the conversion time to a past date.
Set the conversion time to three days ago (-3days). Run the following command to immediately move the table to cold storage.
ALTER TABLE tiered_storage_heap SET ttl interval '-3days' move to storage_cold;
Specify a past date and time. If the current time is 16:53:58 on July 17, 2023, you can run the following command to immediately move the table to cold storage.
ALTER TABLE tiered_storage_heap SET ttl '2022-07-16 16:53:58'::Timestamp move to storage_cold;
Convert subpartitions of a partitioned table
Syntax
ALTER TABLE <child_partition_name> SET ttl interval '<scheduling_interval>' move to storage_cold;
To view the subpartition names of a partitioned table, run \d+ in psql.
Example
Create a partitioned table named tiered_storage_partition_hdfs.
CREATE TABLE tiered_storage_partition_hdfs(a int, b int) DISTRIBUTED BY (a) PARTITION BY range(a) (start(1) end(20) every(10));
This example creates two subpartitions: tiered_storage_partition_hdfs_1_prt_1 and tiered_storage_partition_hdfs_1_prt_2.
Insert data into the subpartition tiered_storage_partition_hdfs_1_prt_1.
INSERT INTO tiered_storage_partition_hdfs_1_prt_1 VALUES(1, 1), (2, 2), (3, 3), (4, 4);
Move the subpartition tiered_storage_partition_hdfs_1_prt_1 to cold storage after three days. For example, if you run the ALTER TABLE statement at 09:00:00 on July 17, 2023, the tiered_storage_partition_hdfs_1_prt_1 subpartition is moved to cold storage three days later at 09:00:00 on July 20, 2023. Other subpartitions are not affected.
ALTER TABLE tiered_storage_partition_hdfs_1_prt_1 SET ttl interval '3days' move to storage_cold;
Query the storage status of a table
You can use either of the following methods to query the storage status of a table. The query returns cold for a cold table and hot for a hot table.
Method 1:
SELECT pg_tiered_storage_relation_status('<table_name>'::regclass::oid::bigint);
Method 2:
SELECT pg_tiered_storage_relation_status(<table_oid>::bigint);
Find the table OID by running SELECT oid FROM pg_class WHERE relname='<table_name>';
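For example, a hedged check of the tiered_storage_heap table created earlier (the OID value below is hypothetical):
-- Method 2, step by step: look up the OID, then query the status.
SELECT oid FROM pg_class WHERE relname = 'tiered_storage_heap';
-- Suppose the query returns 16384 (a hypothetical value):
SELECT pg_tiered_storage_relation_status(16384::bigint);
-- Method 1 does the same in one step through a regclass cast:
SELECT pg_tiered_storage_relation_status('tiered_storage_heap'::regclass::oid::bigint);
-- Returns cold once the conversion has completed, hot otherwise.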
v7.0
Convert a non-partitioned table
Syntax
SELECT pg_tiered_storage_move_table_to_storage_cold('<schema_name>', '<table_name>');
Examples
Create a non-partitioned table named tiered_storage_heap_oss in the public schema and insert data into it.
CREATE TABLE tiered_storage_heap_oss (a int, b int) DISTRIBUTED BY (a);
INSERT INTO tiered_storage_heap_oss SELECT random() * 1000, 1 FROM generate_series(1,100);
Example 1: Immediately move the entire table to cold storage.
Run the following statement to immediately move the entire non-partitioned table to cold storage.
SELECT pg_tiered_storage_move_table_to_storage_cold('public', 'tiered_storage_heap_oss');
Example 2: Use pg_cron to schedule moving an entire table to cold storage.
Assume that you want to use the etl_user account to move the non-partitioned table tiered_storage_heap_oss in the etl database to cold storage at 01:00 the next day. Connect to the postgres database and run the following statement:
SELECT cron.schedule('etl_table_transfer_to_cold', '0 1 * * *', 'SELECT pg_tiered_storage_move_table_to_storage_cold(''public'', ''tiered_storage_heap_oss'');', 'etl', 'etl_user');
After 01:00 the next day, once you confirm that the move is successful, you can run the following statement to delete the scheduled job:
SELECT cron.unschedule(<job_id>);
Note: The job ID is automatically generated when the job is created. You can view the ID in the jobid field of the cron.job table.
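For example, a hedged lookup of the job ID by the job name assigned above (cron.job is the pg_cron metadata table; the returned ID of 42 is hypothetical):
-- Find the scheduled job's ID by its name.
SELECT jobid, jobname, schedule FROM cron.job WHERE jobname = 'etl_table_transfer_to_cold';
-- Remove the job by using the returned ID, for example 42:
SELECT cron.unschedule(42);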
Convert subpartitions of a partitioned table
Syntax
SELECT pg_tiered_storage_move_table_to_storage_cold('<schema_name>', '<child_partition_name>');
To view the subpartition names of a partitioned table, run \d+ in psql.
Examples
Example 1: Immediately move a subpartition to cold storage.
Create a partitioned table named tiered_storage_partition_oss in the public schema.
CREATE TABLE tiered_storage_partition_oss(a int, b int) DISTRIBUTED BY (a) PARTITION BY range(a) (start(1) end(20) every(10));
Note: This example creates two subpartitions: tiered_storage_partition_oss_1_prt_1 and tiered_storage_partition_oss_1_prt_2.
Insert data into the subpartition tiered_storage_partition_oss_1_prt_1.
INSERT INTO tiered_storage_partition_oss_1_prt_1 VALUES(1, 1), (2, 2), (3, 3), (4, 4);
Immediately move the subpartition to cold storage.
SELECT pg_tiered_storage_move_table_to_storage_cold('public', 'tiered_storage_partition_oss_1_prt_1');
Example 2: Use pg_cron to schedule the conversion of daily subpartitions.
Create a daily partitioned table named daily_log_details in the etl database.
CREATE TABLE daily_log_details (id INT, log_message text, created_date character varying(64))
PARTITION BY LIST (created_date)
(
    PARTITION p20230601 VALUES ('20230601'),
    PARTITION p20230602 VALUES ('20230602'),
    PARTITION p20230603 VALUES ('20230603'),
    PARTITION p20230604 VALUES ('20230604'),
    PARTITION p20230605 VALUES ('20230605'),
    PARTITION p20230606 VALUES ('20230606'),
    PARTITION p20230607 VALUES ('20230607'),
    PARTITION p20230608 VALUES ('20230608'),
    PARTITION p20230609 VALUES ('20230609'),
    PARTITION p20230610 VALUES ('20230610'),
    PARTITION p20230611 VALUES ('20230611'),
    DEFAULT PARTITION others
);
Set up a job that converts subpartitions older than 10 days to cold storage. The job runs at 03:00 and is executed by etl_user. Follow these steps:
Create a cleanup function in the etl database.
CREATE OR REPLACE FUNCTION pg_tiered_storage_move_partition_daily_table_to_cold_storage(schemaname text, tablename text)
RETURNS void AS $$
DECLARE
    fetch_overdue_partition_sql text;
    cold_storage_sql text;
    target record;
BEGIN
    fetch_overdue_partition_sql := 'WITH targetpartitions AS
        (SELECT * FROM pg_partitions
         WHERE tablename = $1 AND schemaname = $2
           AND partitionlevel = 1 AND partitionisdefault = FALSE)
        SELECT partitiontablename FROM targetpartitions
        WHERE to_date(substring(targetpartitions.partitionname FROM 2), ''YYYYMMDD'')
            <= current_date - INTERVAL ''10 days''';
    -- Fetch overdue partitions
    FOR target IN EXECUTE fetch_overdue_partition_sql USING tablename, schemaname LOOP
        cold_storage_sql := 'SELECT pg_tiered_storage_move_table_to_storage_cold($1::text, $2::text)';
        RAISE NOTICE 'sql %', cold_storage_sql;
        EXECUTE cold_storage_sql USING schemaname, target.partitiontablename;
    END LOOP;
END;
$$ LANGUAGE plpgsql;
Connect to the postgres database and run the following statement:
SELECT cron.schedule('etl_daily_transfer_to_cold', '0 3 * * *', 'SELECT pg_tiered_storage_move_partition_daily_table_to_cold_storage(''public'', ''daily_log_details'');', 'etl', 'etl_user');
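Before relying on the schedule, you can test the function with a one-off manual run in the etl database (a hedged sketch using the function defined above):
-- Convert all daily_log_details subpartitions older than 10 days right now.
SELECT pg_tiered_storage_move_partition_daily_table_to_cold_storage('public', 'daily_log_details');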
Example 3: Use pg_cron to schedule the conversion of monthly subpartitions.
Create a monthly partitioned table named month_log_details in the etl database.
CREATE TABLE month_log_details (id INT, log_message text, created_date character varying(64))
PARTITION BY LIST (created_date)
(
    PARTITION p202306 VALUES ('202306'),
    PARTITION p202307 VALUES ('202307'),
    PARTITION p202308 VALUES ('202308'),
    PARTITION p202309 VALUES ('202309'),
    PARTITION p202310 VALUES ('202310'),
    DEFAULT PARTITION others
);
Set up a job that converts subpartitions older than 3 months to cold storage. The job runs at 05:00 on the first day of each month and is executed by etl_user. Follow these steps:
Create a cleanup function in the etl database.
CREATE OR REPLACE FUNCTION pg_tiered_storage_move_partition_table_to_cold_storage(schemaname text, tablename text)
RETURNS void AS $$
DECLARE
    fetch_overdue_partition_sql text;
    cold_storage_sql text;
    target record;
BEGIN
    fetch_overdue_partition_sql := 'WITH targetpartitions AS
        (SELECT * FROM pg_partitions
         WHERE tablename = $1 AND schemaname = $2
           AND partitionlevel = 1 AND partitionisdefault = FALSE)
        SELECT partitiontablename FROM targetpartitions
        WHERE to_date(substring(targetpartitions.partitionname FROM 2), ''YYYYMM'')
            <= current_date - INTERVAL ''3 months''';
    -- Fetch overdue partitions
    FOR target IN EXECUTE fetch_overdue_partition_sql USING tablename, schemaname LOOP
        cold_storage_sql := 'SELECT pg_tiered_storage_move_table_to_storage_cold($1::text, $2::text)';
        RAISE NOTICE 'sql %', cold_storage_sql;
        EXECUTE cold_storage_sql USING schemaname, target.partitiontablename;
    END LOOP;
END;
$$ LANGUAGE plpgsql;
Connect to the postgres database and run the following statement:
SELECT cron.schedule('etl_month_transfer_to_cold', '0 5 1 * *', 'SELECT pg_tiered_storage_move_partition_table_to_cold_storage(''public'', ''month_log_details'');', 'etl', 'etl_user');
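After the job has run, the following hedged sketch verifies which subpartitions were moved. It queries the Greenplum pg_partitions view (the same view that the cleanup function uses) together with the pg_tiered_storage_table_status function that is described in the next section:
-- List each subpartition of month_log_details with its storage status (hot or cold).
SELECT partitiontablename,
       pg_tiered_storage_table_status(schemaname, partitiontablename) AS storage_status
FROM pg_partitions
WHERE schemaname = 'public' AND tablename = 'month_log_details';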
Query the storage status of a table
Run the following statement to query the storage status of a table. The query returns cold for a cold table and hot for a hot table.
SELECT pg_tiered_storage_table_status('<schema_name>', '<table_name>|<child_partition_name>');
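For example, hedged checks of the objects created earlier in this topic; each call returns cold or hot as described above:
-- Check the non-partitioned table.
SELECT pg_tiered_storage_table_status('public', 'tiered_storage_heap_oss');
-- Check a single subpartition.
SELECT pg_tiered_storage_table_status('public', 'tiered_storage_partition_oss_1_prt_1');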
View hot and cold storage
Log on to the AnalyticDB for PostgreSQL console. On the Basic Information page, view the Hot Storage and Cold Storage on the Instance Status card.
Backup and restoration
AnalyticDB for PostgreSQL tiered storage supports backup and restoration. Data restoration is subject to the following rules:
v6.0
When a full backup is available, you can restore data to any point in time. The storage status (hot or cold) of the data at the time of restoration matches its status at the time of backup.
v7.0
When a full backup is available, you can restore data to a specified point in time. The storage status (hot or cold) of the data at the time of restoration matches its status at the time of backup. AnalyticDB for PostgreSQL V7.0 has the following limitations:
If no data is written to the table after it is converted to a cold table, you can restore the table to any point in time.
If data is written to the table after it is converted to a cold table, you can restore the table to any backup point created before the conversion. If you restore to a point in time after the conversion, you can only restore to the time of the last write.
To support backup and restoration, the OSS space used by a dropped table is not immediately released. Instead, it is retained for a period that is specified by the Data Backup Retention Period setting. You are charged for the OSS space during this retention period.
For more information about backup and restoration, see Backup and restoration.
Scaling
When you scale in an AnalyticDB for PostgreSQL V6.0 instance, cold tables are moved to local temporary tables for data redistribution. After the scale-in is complete, the data is re-uploaded to OSS, and the local temporary tables are cleared. To ensure a successful scale-in, the total remaining disk space on all nodes after the operation must be greater than the total space occupied by the cold tables. During the scale-in process, data is downloaded from OSS. The time required for the scale-in is limited by the OSS download bandwidth. Therefore, you should carefully evaluate the scale-in duration.
For AnalyticDB for PostgreSQL V7.0 instances, scaling does not affect the data in cold storage. Data redistribution or transferring data back to local storage is not required. You do not need to consider the disk space occupied by cold storage.
Performance
Tests were performed on a 4-node instance and an 8-node instance, both with 2-core 8 GB nodes, using the following statements.
CREATE TABLE t333 (a int, b int);
INSERT INTO t333 SELECT random() * 1000000, random() * 1000000 FROM generate_series(1,3000000000);
v6.0 instances executed: ALTER TABLE t333 SET ttl interval '-3days' move to storage_cold;
v7.0 instances executed: SELECT pg_tiered_storage_move_table_to_storage_cold('public', 't333');
The following table shows the test results for single-table conversion time in seconds.
Hot table size (GB) | 4-node v6.0 (seconds) | 4-node v7.0 (seconds) | 8-node v6.0 (seconds) | 8-node v7.0 (seconds)
1 | 10 | 5 | 5 | 2.8
10 | 96 | 48 | 42 | 25.2
100 | 848 | 490 | 333 | 243