All Products
Search
Document Center

PolarDB:Cold data archiving

Last Updated:Dec 11, 2024

You can use the cold data archiving feature provided by PolarDB for MySQL to archive data that is infrequently accessed and rarely updated to Object Storage Service (OSS). This can significantly reduce data storage costs.

How it works

PolarDB for MySQL allows you to archive data in either CSV or ORC format. The following figure shows how data is archived in these formats:

image

Data can be manually or automatically archived. The archived data is stored in CSV or ORC format across multiple files in OSS, and the corresponding data in PolarDB storage space is automatically deleted. This reduces storage space and costs. PolarDB nodes can access cold data in OSS over the internal network. For more information, see Manually archive cold data and Automatically archive cold data.

Note

The cold data archiving feature for partitioned tables is in canary release. To use the feature, go to Quota Center, find the quota name corresponding to the Quota ID polardb_mysql_hybrid_partition, and click Apply in the Actions column.

Comparison of file formats

When you archive cold data, you can select an appropriate file format based on your requirements.

Item

CSV

ORC

Open source

Yes

Yes

Archiving method

Manual Archiving:

Archiving speed

Faster than the ORC format.

Note

Only single-threaded archiving is supported.

Slow

Note

Only single-threaded archiving is supported.

Read speed

  • Slow. If you use serial queries, the query performance of an archived table without an index is about one-fifth to one-tenth of that on the InnoDB storage engine.

  • Faster than the ORC format on row store nodes.

Note

Single-threaded reading and multi-threaded reading are supported.

  • Slow. If you use serial queries, the query performance of an archived table without an index is about one-fifth to one-tenth of that on the InnoDB storage engine.

  • Suitable for analytical processing (AP) queries on a separate column store node.

Note

Only single-threaded reading is supported.

Transactions

Not supported

Not supported

Index capability

Not supported

Not supported

How to modify archived data

Archived tables in OSS are read-only. Before you modify data, you must import the data from OSS tables to PolarDB storage space.

Storage consumption

Consistent with the storage space occupied by tables that use the InnoDB engine and do not have indexes.

For the same data volume, ORC files occupy 45% of the storage space occupied by CSV files.

Backup and recovery

Not supported.

Note

Cold data archived in OSS is not backed up when you perform backup operations. Therefore, you cannot restore archived cold data from the backups.

Impact

Note
  • Evaluate the limits of archiving methods and table types before you perform archiving operations to avoid impacts on your business.

  • Archived data is stored in the default OSS bucket, not in an OSS bucket that you specify. You can view the archived data list only in the PolarDB console.

Supported versions

To archive data in the CSV or ORC format, a cluster of Enterprise Edition must meet the following requirements:

  • If the cluster is of Cluster Edition, it must run one of the following database engine versions:

    • PolarDB for MySQL 8.0.1 whose revision version is 8.0.1.1.31 or later.

    • PolarDB for MySQL 8.0.2 whose revision version is 8.0.2.2.9 or later.

  • If the cluster is of Multi-master Cluster (Database/Table) Edition, it must run PolarDB for MySQL 8.0.1.0.13 or later.

Usage

  • Query archived cold data

    • For common tables: Use the method that you use to query hot data. You do not need to change the access mode.

    • For partitioned tables:

      • After table partitions are archived to an OSS foreign table, you need to change the access mode to query archived data in a specified archived table.

      • After partitioned tables are archived, the archived table becomes a hybrid partitioned table. For more information about how to query data, see Query data in a hybrid partitioned table.

    Note

    Archived cold data in a single table is distributed across multiple files. To optimize queries for cold data, you can perform parallel queries. For more information, see Perform multi-file queries for a single OSS foreign table.

  • Modify archived cold data

    If you want to modify the cold data archived in OSS, you can execute the ALTER statement to import data from OSS to PolarDB storage space. After the data is imported, the cold data in OSS is deleted. After you modify the data, you can re-archive the modified table to OSS. For more information, see Import data from OSS tables to PolarStore.

  • Delete archived cold data

    Use the following methods to delete archived cold data based on the types of table and how data is archived. For more information, see Delete the corresponding file in OSS.

    • For common tables and OSS foreign tables

      Use the DROP TABLE statement to delete archived tables. Then run the CALL dbms_oss.delete_table_file('database_name', 'table_name'); command to delete associated cold data.

    • For archived partitioned tables

      Run the CALL dbms_oss.delete_table_file('database_name', 'table_name'); command to delete archived cold data.

Billing

Cold data is charged based on the space occupied in OSS. The following billing rules are used. You can purchase a PolarDB storage plan to offset the storage costs of cold data in OSS. For more information, see Cold data archiving.

Regions in the Chinese mainland

China (Hong Kong) and regions outside China

USD 0.0000325 per GB-hour

USD 0.0000455 per GB-hour

Example: You enable and perform cold data archiving for a cluster deployed in the Chinese mainland. The amount of cold data archived in OSS is 100 GB. The hourly cold data storage fee is 100 GB × USD 0.0000325 per GB-hour = USD 0.00325 per hour.

Note

For information about how to view the amount of archived cold data, see View information about the cold data archived in OSS.

Contact us

If you have any questions about the cold data archiving feature, join the DingTalk group 24490017825. The experts in the group can answer your questions. You can also use the chatbot assistant for PolarDB for MySQL in the group that provides 24/7 support.