You can use the cold data archiving feature in a PolarDB for MySQL cluster to archive data that is infrequently accessed and rarely updated to Object Storage Service (OSS). This can significantly reduce data storage costs. This topic describes the cold data archiving feature.
How it works
The following figure shows how cold data is archived in the CSV or ORC format. The hot data of a PolarDB cluster is stored in PolarStore. Cluster nodes access the data in PolarStore by using PolarFileSystem. To reduce data storage costs and separate hot and cold data, you can transfer cold data from PolarStore to OSS.
For common tables, you can manually archive cold data to OSS in the CSV or ORC format. PolarDB nodes can access cold data in OSS over the internal network.
For partitioned tables, you can manually archive cold data to OSS in the CSV or ORC format. You can also create data lifecycle management (DLM) policies on the primary node to automatically archive cold data. You can use the cold data archiving feature to archive specific data from a partitioned table to OSS as an OSS foreign table, or directly transfer specific partitions of a partitioned table to OSS.
The archived data is stored in the CSV or ORC format across multiple files in OSS, and the corresponding data in PolarStore is automatically deleted. This reduces storage space and costs.
The cold data archiving feature for partitioned tables is in canary release. To use the feature, go to Quota Center, find the quota name corresponding to the
polardb_mysql_hybrid_partition
quota ID, and then click Apply in the Actions column.After cold data is archived, the archived table in OSS is read-only, and the query performance becomes low. If you use serial queries, the query performance of an archived table without an index is about one-fifth to one-tenth of that on the InnoDB storage engine. We recommend that you use test data to check whether the query performance meets your requirement before you archive data.
Cold data archived in the CSV or ORC format cannot be used for transactions.
Cold data archived in OSS is not backed up when you perform backup operations. Therefore, you cannot restore archived cold data from the backups.
Billing
Cold data is charged based on the space occupied in OSS. The following billing rules are used:
For regions within the Chinese mainland: USD 0.0000325 per GB-hour.
For regions outside the Chinese mainland: USD 0.0000455 per GB-hour.
You can purchase a PolarDB storage plan to offset the storage costs of cold data in OSS. For more information about PolarDB storage plans, see Storage plans.
The following table describes the offset rules of PolarDB storage plans.
Region | Offset ratio | OSS storage usage offset by using a 1-GB storage plan |
Regions in the Chinese mainland | 0.045 | 1/0.045=22.22 GB |
Hong Kong (China) and regions outside China | 0.045 | 1/0.045=22.22 GB |
For example, you purchase a storage plan with a capacity of 100 GB, and 50 GB is left after it offsets the data storage usage. The cold data of the cluster consumes 50 GB of OSS storage space in a region inside the Chinese mainland. In this case, the storage plan automatically uses 2.25 GB (50/2.25) to offset the backup storage usage and the storage plan has 47.75 GB of capacity left.
If the remaining capacity of the storage plan is insufficient to deduct the OSS storage space consumed by the code data, you are charged for additional storage space on a pay-as-you-go basis.
For information about how to view the storage usage of cold data in OSS, see View information about the cold data archived in OSS.
For information about billing, see Consumption by bill.
Supported versions
To archive data in the CSV or ORC format, a cluster of Enterprise Edition must meet the following requirements:
If the cluster is of Cluster Edition, it must run one of the following database engine versions:
PolarDB for MySQL 8.0.1 whose revision version is 8.0.1.1.31 or later
PolarDB for MySQL 8.0.2 whose revision version is 8.0.2.2.9 or later
If the cluster is of Multi-master Cluster (Database/Table) Edition, it must run PolarDB for MySQL 8.0.1.0.13 or later.
Comparison between different file formats
When you archive cold data, you can select an appropriate file format based on your requirements.
File format | Archiving speed | Reading speed | Open source | Storage consumption | How to modify archived data |
CSV | Faster than the ORC format. Note Only single-threaded archiving is supported. | Faster than the ORC format on row store nodes. Note Single-threaded reading and multi-threaded reading are supported. | Yes | Consistent with the storage space occupied by tables that use the InnoDB engine and do not have indexes. | Before you modify data, you must import the data from OSS tables to PolarStore. |
ORC | Slow. Note Only single-threaded archiving is supported. | Suitable for analytical processing (AP) queries on a separate column store node. Note Only single-threaded reading is supported. | Yes | For the same data volume, ORC files occupy 45% of the storage space occupied by CSV files. |
Usage
Archive cold data
To archive cold data, log on to the PolarDB console, enable cold data archiving, connect to the cluster, and then perform archiving operations.
For common tables, you can manually archive cold data to OSS in the CSV or ORC format.
For partitioned tables, you can manually archive cold data to OSS in the CSV or ORC format. You can also create DLM policies on the primary node to automatically archive cold data.
Query archived cold data
After you archive cold data, you can use one of the following methods to query the archived cold data:
Common tables: Use the method that you use to query hot data. You do not need to change the access mode.
Partitioned tables: For information about how to query cold data, see Query data in a hybrid partitioned table.
Archived cold data in a single table is distributed across multiple files. To optimize queries for cold data, you can perform parallel queries. For more information, see Perform multi-file queries for a single OSS foreign table.
Modify archived cold data
If you want to modify the cold data archived in OSS, you can execute the
ALTER ENGINE
statement to import data from OSS to PolarStore. After the data is imported to PolarStore, the cold data in OSS is deleted. After you modify the data, you can re-archive the modified table to OSS. For more information, see Import data from OSS tables to PolarStore.Delete archived cold data
To delete archived cold data, execute the
DROP TABLE
statement to delete the archived table, and then execute theCALL dbms_oss.delete_table_file('database_name', 'table_name');
statement to delete the archived cold data.
Contact us
If you have any questions about the cold data archiving feature, join the DingTalk group 24490017825. The experts in the group can answer your questions. You can also use the chatbot assistant for PolarDB for MySQL in the group that provides 24/7 support.