Cold data tiered storage

Updated at: 2024-12-27 09:56

This topic describes the benefits of implementing tiered storage for hot and cold data and how this feature works in PolarDB for PostgreSQL. Implementing tiered storage by dumping data that is infrequently accessed or updated to OSS can result in significant storage savings.

Benefits

After you enable the cold data tiered storage feature, the unit storage cost is about 90% lower than that of PL1 ESSDs. For more information, see Billing rules.
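
As a rough illustration of the figure above, the following Python sketch estimates a monthly storage bill when part of a dataset moves to cold storage. The function name, the price of 1.0 per GB-month, and the 80% cold fraction are hypothetical values chosen for the arithmetic. Only the "about 90% lower" ratio comes from this topic, and actual charges depend on the billing rules.

```python
def estimated_monthly_cost(size_gb, essd_price_per_gb, cold_fraction, cold_discount=0.90):
    """Estimate a monthly storage cost when a fraction of the data is cold.

    cold_discount=0.90 mirrors the "about 90% lower" unit cost stated above.
    All other inputs are hypothetical and supplied by the caller.
    """
    if not 0.0 <= cold_fraction <= 1.0:
        raise ValueError("cold_fraction must be between 0 and 1")
    hot_gb = size_gb * (1.0 - cold_fraction)
    cold_gb = size_gb * cold_fraction
    # Unit price of cold storage, derived from the ESSD unit price.
    cold_price_per_gb = essd_price_per_gb * (1.0 - cold_discount)
    return hot_gb * essd_price_per_gb + cold_gb * cold_price_per_gb


# Hypothetical dataset: 1000 GB at 1.0 price unit per GB-month.
before = estimated_monthly_cost(1000, 1.0, cold_fraction=0.0)
after = estimated_monthly_cost(1000, 1.0, cold_fraction=0.8)
print(f"all hot: {before:.2f}, 80% cold: {after:.2f}")
```

With these made-up inputs, the estimate drops from 1000 to roughly 280 price units, because only the cold 80% of the data benefits from the lower unit cost.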

The cold data tiered storage feature provided by PolarDB for PostgreSQL is easy to use, highly flexible, highly performant, secure, reliable, and applicable to a wide range of scenarios.

  • Easy to use

    • SQL transparency: You can perform SQL operations on data stored in OSS, such as joins on tables and CRUD operations, without any rewriting.

    • Index transparency: You can configure archiving policies for indexes and materialized views with no need to know how they are archived.

  • Highly flexible

    • Several tiered storage policies are available to archive data by tables (covering indexes and materialized views), by partitions, or by specific LOB columns. You can choose to combine two or more of these policies based on your business requirements.

  • Highly performant

    • The tiered storage feature uses a three-layer cache architecture to improve query performance. The architecture consists of an in-UDF logical object cache, a page-level shared cache, and a persistent file cache. Because most reads and writes are served from these caches, fewer requests reach OSS, which limits the effect of OSS latency on read/write performance.

  • Widely applicable

    • The feature can be used to archive a wide variety of data, such as general, spatio-temporal, and time series data. For example, you can use this feature to archive data such as spatio-temporal trajectories and high-precision maps, significantly reducing storage costs.

  • Secure and reliable

    • Cold data in OSS can also be backed up and restored, which reduces backup costs and ensures high availability.

Note
  • The feature is supported by PolarDB for PostgreSQL 14.10.21.0 or later.

  • Access latency is higher for data in cold storage. We recommend that you do not frequently update or write to cold data, and that you choose the data to store in OSS accordingly.

Supported regions

  • China: China (Hangzhou), China (Shanghai), China (Shenzhen), China (Guangzhou), China (Beijing), China (Zhangjiakou), China (Ulanqab), and China (Hong Kong)

  • Asia Pacific: Singapore, Indonesia (Jakarta), and Malaysia (Kuala Lumpur)

How tiered storage works

Tiered storage reduces costs by keeping cold data in OSS, a cost-effective storage service. PolarDB for PostgreSQL combines Elastic Block Storage (EBS) with OSS to tier hot and cold data automatically based on usage patterns. SQL CRUD operations remain transparent, and a multi-level caching system minimizes performance degradation. The tiered storage architecture is as follows:

[Figure: tiered storage architecture]
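
The multi-level caching idea can be sketched in Python as a read path that checks each cache layer in turn and goes to the object store only on a full miss. This is a conceptual model, not the PolarDB implementation: the class names, the LRU eviction policy, the layer capacities, and the dict standing in for OSS are all assumptions made for illustration. Only the three layer names come from this topic.

```python
from collections import OrderedDict


class LRULayer:
    """One cache layer with least-recently-used eviction (an assumed policy)."""

    def __init__(self, name, capacity):
        self.name = name
        self.capacity = capacity
        self._items = OrderedDict()

    def get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict the least recently used entry


class TieredReader:
    """Reads through the cache layers and falls back to the object store."""

    def __init__(self, object_store, layers):
        self.object_store = object_store  # plain dict standing in for OSS
        self.layers = layers              # ordered from fastest to slowest
        self.oss_reads = 0

    def read(self, key):
        for depth, layer in enumerate(self.layers):
            value = layer.get(key)
            if value is not None:
                # Copy the value into the faster layers above the one that hit.
                for upper in self.layers[:depth]:
                    upper.put(key, value)
                return value
        # Full miss: fetch from the object store and fill every layer.
        self.oss_reads += 1
        value = self.object_store[key]
        for layer in self.layers:
            layer.put(key, value)
        return value


oss = {f"page-{i}": f"data-{i}".encode() for i in range(10)}
reader = TieredReader(oss, [
    LRULayer("in-UDF logical object cache", capacity=2),
    LRULayer("page-level shared cache", capacity=4),
    LRULayer("persistent file cache", capacity=8),
])
for key in ["page-1", "page-2", "page-3", "page-1", "page-2", "page-3"]:
    reader.read(key)
print("reads that reached the object store:", reader.oss_reads)
```

In this run, the smallest layer cannot hold all three pages, but the second layer can, so each distinct page is fetched from the stand-in object store only once.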

Cold storage modes

Cold storage lets you dump tables, indexes, or materialized views to OSS. This minimizes disk usage and significantly reduces storage costs. After data is moved to cold storage, SQL statements that perform CRUD operations continue to work without modification.

PolarDB supports the following cold storage modes:

  • Store data within an entire table in OSS and retain indexes in cloud disks. This reduces storage costs while keeping access performance high.

  • Store the columns of the LOB type and secondary columns in OSS.

  • Store expired partitions in OSS and retain hot partitions in cloud disks. This is a typical tiered storage mode.

[Figure: cold storage modes]
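
The partition-based mode can be pictured as a simple age rule: partitions whose data is older than a hot window go to OSS, and the rest stay on cloud disks. The Python sketch below shows only that decision logic. The function name, the partition names, the 60-day window, and the "oss" and "cloud_disk" labels are hypothetical, and the sketch does not issue any PolarDB command.

```python
from datetime import date, timedelta


def plan_partition_tiering(partitions, today, hot_window_days):
    """Decide where each partition should live.

    partitions maps a partition name to the last date that partition covers.
    A partition counts as expired when that date falls before the hot window.
    """
    cutoff = today - timedelta(days=hot_window_days)
    plan = {}
    for name, last_covered_day in partitions.items():
        plan[name] = "oss" if last_covered_day < cutoff else "cloud_disk"
    return plan


# Hypothetical monthly partitions of an orders table.
partitions = {
    "orders_2024_09": date(2024, 9, 30),
    "orders_2024_10": date(2024, 10, 31),
    "orders_2024_11": date(2024, 11, 30),
    "orders_2024_12": date(2024, 12, 31),
}
plan = plan_partition_tiering(partitions, today=date(2024, 12, 27), hot_window_days=60)
for name, target in plan.items():
    print(name, "->", target)
```

A date-based rule is only one way to define "expired". The same structure works with any predicate that separates hot partitions from cold ones.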

Scenarios

The access latency of OSS is hundreds of times higher than that of cloud disks, so access performance drops once data is dumped to OSS. However, some workloads still require fairly high performance when querying or updating cold data. To meet this requirement, PolarDB for PostgreSQL supports the following two types of storage tiering:

  • Dump expired partitions into OSS while retaining hot partitions in cloud disks. This minimizes the impact on the query performance and can reduce storage costs. For more information, see Cold storage of partitioned tables.

  • Provide a materialized cache in cloud disks for frequently accessed and updated data, and store the entire dataset in OSS. The lifecycle of data in the materialized cache is determined by its access frequency. This can achieve excellent performance and reduce storage costs. For more information, see Materialized cache for cold data.
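
To make the materialized cache idea concrete, the Python sketch below keeps a small local copy of the most frequently read items while the full dataset stays in a stand-in cold store. It is a toy model under stated assumptions: the class name, the capacity, the replacement rule (replace the least-read cached item once a newcomer has been read more often), and the dicts standing in for OSS and cloud disks are all hypothetical. It covers reads only and does not describe how PolarDB manages the cache.

```python
from collections import Counter


class FrequencyMaterializedCache:
    """Keeps the most frequently read keys in a small local copy."""

    def __init__(self, cold_store, capacity):
        self.cold_store = cold_store  # dict standing in for OSS (holds all data)
        self.capacity = capacity
        self.materialized = {}        # dict standing in for the cloud-disk copy
        self.reads = Counter()        # read count per key
        self.cold_reads = 0

    def read(self, key):
        self.reads[key] += 1
        if key in self.materialized:
            return self.materialized[key]
        self.cold_reads += 1
        value = self.cold_store[key]
        self._maybe_materialize(key, value)
        return value

    def _maybe_materialize(self, key, value):
        if len(self.materialized) < self.capacity:
            self.materialized[key] = value
            return
        # Find the cached key with the fewest reads so far.
        coldest = min(self.materialized, key=lambda k: self.reads[k])
        # Replace it only when the new key has been read strictly more often.
        if self.reads[key] > self.reads[coldest]:
            del self.materialized[coldest]
            self.materialized[key] = value


cold_store = {"a": b"row-a", "b": b"row-b", "c": b"row-c"}
cache = FrequencyMaterializedCache(cold_store, capacity=2)
for key in ["a", "a", "a", "b", "c", "c"]:
    cache.read(key)
print("materialized keys:", sorted(cache.materialized))
print("reads served from the cold store:", cache.cold_reads)
```

In this run, "b" is cached first but is replaced by "c" once "c" has been read more often, which is the sense in which access frequency determines how long an item stays materialized.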
