Before we introduce general ESSDs, let's talk about storage. A database architecture is commonly described as four layers, from bottom to top: storage, engine, service, and network.
The storage layer handles the persistent storage and retrieval of data and forms the foundation of the entire database system. Common storage types at this layer include local disks, which are the hard disks of a physical machine, and cloud disks, which are built on distributed storage architectures.
For local disk storage, data storage and computing resources are located on the same physical server node. Since data reading and writing do not need to be transmitted over a network, I/O latency is usually low and random read/write performance is high.
However, the storage capacity of a local disk is limited by the hardware configuration of a single server. Once a server fails, the security and availability of the data may be compromised, and it is difficult to scale out the storage capacity of the local disk independently. In contrast, cloud disks are based on distributed storage architectures, with data storage resources separated from computing resources. Users can adjust their computing and storage resources independently based on their needs, which improves resource utilization and flexibility. However, cloud disks serve computing nodes over a network, which adds some I/O latency compared with local disks.
Moreover, the performance of cloud disks often depends on their capacity: in most cases, improving performance requires increasing disk capacity. In addition, conventional cloud disks do not support scale-in.
Through our research and analysis of customer business scenarios, we have observed that in many industries, such as e-commerce, new retail, and gaming, business traffic fluctuates with noticeable peaks and troughs rather than remaining constant. When selecting a database, customers therefore care most about three things: performance that is not constrained by capacity, performance that can be raised during peak times, and low latency and high durability at minimal cost. ApsaraDB RDS was the first to launch general enterprise SSDs (ESSDs), meeting user requirements for low costs, low latency, and high durability.
A general ESSD is first of all a type of cloud disk. The key distinction between general ESSDs and the current mainstream cloud disks is that ESSDs deeply integrate the technological innovations of the PaaS and IaaS layers. This integration transforms the conventional data storage structure by dividing it into a three-layer structure: buffer, data, and cold storage layers. Hot data is stored in buffer disks (High Performance Disk), warm data is stored in data disks (ESSD), and archived data is stored in cold storage disks (OSS).
In this storage structure design, buffer disks serve as a scalable part for achieving I/O acceleration. Thanks to the unique kernel capabilities of AliSQL, handling I/O bursts during database read and write operations becomes straightforward. The structure also supports disk capacities ranging from 10 GB to 64 TB, cloud disk scale-in, and automatic capacity scale-out in seconds. Meanwhile, archiving and storing infrequently accessed cold data in Object Storage Service (OSS) helps reduce usage costs. General ESSDs decouple I/O performance from cloud disk capacity, allowing the data layer to achieve maximum elasticity in both I/O performance and cloud disk capacity.
As illustrated in the diagram above, the storage architecture of a general ESSD comprises three layers from left to right: the buffer layer, data layer, and cold storage layer. Each storage layer serves a specific function. Selecting the appropriate storage medium for each layer can maximize its value.
• Buffer layer: handles requirements for high IOPS and ultra-low I/O latency, typically millions of IOPS and microsecond-level I/O latency. Alibaba Cloud buffer disks (High Performance Disk) are used in combination with the buffer technology of the RDS database engine to implement I/O acceleration during database queries, achieving higher query performance.
• Data layer: handles requirements for low I/O latency and data reliability, typically tens of thousands of IOPS and millisecond-level I/O latency. ESSDs are used to ensure high reliability and security of data. Leveraging the innovative capabilities of the infrastructure and AliSQL kernel, along with the upgraded management and control architecture of the RDS database, I/O performance is decoupled from cloud disk capacity. This enables the database to handle I/O bursts during read and write operations and gives it maximum elasticity in both I/O performance and cloud disk capacity.
• Cold storage layer: handles requirements for data persistence and low storage cost. Generally, data stored in the cold storage layer is not frequently accessed but requires ultra-high reliability and persistence. Alibaba Cloud OSS is used in combination with the archiving technology of the RDS database engine to implement the database archiving feature. It also supports switching table files between the data layer and the cold storage layer by executing ALTER TABLE, providing a lower-cost storage option for infrequently accessed data tables.
General ESSD | Buffer layer | Data layer | Cold storage layer |
Storage medium | High Performance Disk | ESSD | OSS |
Performance: maximum IOPS per disk | 1,000,000 | 50,000 | - |
Performance: maximum throughput per disk (MB/s) | 4,000 | 350 | - |
Performance: average one-way latency (µs) | 35 | 200 | - |
Performance: network bandwidth (MB/s) | - | - | 100 |
Data: reliability | Non-persistent | 99.9999999% (nine 9s) | 99.9999999999% (twelve 9s) |
Data: capacity (GB) | 64-8192 | 10-65536 | Pay-as-you-go |
Table 1: Comparison of storage media at three layers of an RDS general ESSD
Before introducing the technology in detail, we have made a comparison of features between the three storage layers of a general ESSD, as shown in the table below.
General ESSD | Buffer layer | Data layer | Cold storage layer |
Storage medium | High Performance Disk | ESSD | OSS |
Core trait | High performance | Ultimate elasticity | Low costs |
Key features | I/O acceleration | I/O burst; cloud disk scale-in | Data archiving |
Core implementation | 1. Buffer pool extension (BPE) 2. Storing temporary tables/files | 1. Decoupling the I/O performance and capacity of a cloud disk 2. Cloud disk scale-in with data copy | 1. Direct access from the database kernel to OSS 2. Access to OSS with JuiceFS |
Table 2: Comparison of the features between the three storage layers of an RDS general ESSD
I/O acceleration is a feature that improves the performance of database queries. This feature is mainly seen in the buffer layer of general ESSDs, leveraging the high I/O performance of buffer disks to improve the overall performance of database queries. Compared with ESSDs, buffer disks have higher IOPS and bandwidth limits, as well as lower I/O latency.
The buffer layer implements I/O acceleration in two ways:
• BPE: Use buffer disks to extend the buffer pool (BP), with the aim of improving the buffer hit ratio and reducing accesses to ESSDs at the data layer, thereby accelerating database queries.
• Storing temporary tables/files: Use buffer disks to store temporary tables/files, with the aim of speeding up access to temporary tables/files, thereby improving the performance of database queries.
As a result, the I/O acceleration feature is best suited to business scenarios where the read load is high and temporary tables/files are frequently used. Currently, the ApsaraDB RDS for MySQL engine supports both approaches, while the ApsaraDB RDS for PostgreSQL engine implements I/O acceleration only by storing temporary tables/files.
The BP is a component of the database engine memory area. It temporarily stores frequently used data and index pages, thereby reducing disk accesses and improving the performance and efficiency of database operations. As a result, a larger BP leads to a higher buffer hit ratio, fewer database accesses to disks, and better overall query performance and efficiency.
However, the BP size is constrained by memory resources, and database systems often face significant pressure on memory usage. To expand the buffer without requiring additional memory resources and to fully leverage the high I/O performance of buffer disks, the ApsaraDB RDS for MySQL engine implements the BPE feature. This extension allows the BP to be extended from memory onto buffer disks, further improving the buffer hit ratio and overall database query performance.
Figure 1: How the BPE works
According to the working principle of the BPE shown in the figure above, the process of reading a data page is as follows:
1. The client sends a request to read a data page.
2. The request arrives at the BP in memory to locate the specified data page:
• If the data page is found in the BP, a result is returned to the client, and the request to query and read ends.
• If the data page is not found in the BP, the system proceeds with Step 3.
3. The request arrives at the BPE to locate the specified data page:
• If the data page is found in the BPE, the data page is returned to the BP, a result is returned to the client, and the request to query and read ends.
• If the data page is not found in the BPE, the system proceeds with Step 4.
4. The request arrives at data table files in the ESSD to locate the specified data page. When the data page is found, it is returned to the BP, and then a result is returned to the client.
5. The request to query and read ends.
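The lookup order in the steps above can be sketched in a few lines of Python. This is an illustrative model only, not the AliSQL implementation: the `LruCache` class, the `read_page` function, and the dictionary standing in for ESSD table files are all hypothetical names. The source does not say how pages enter the BPE, so the sketch assumes that pages evicted from the BP are demoted to it.

```python
from collections import OrderedDict


class LruCache:
    """Tiny LRU cache standing in for one cache tier (BP or BPE)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.pages = OrderedDict()

    def get(self, page_id):
        if page_id not in self.pages:
            return None
        self.pages.move_to_end(page_id)  # mark as most recently used
        return self.pages[page_id]

    def put(self, page_id, page):
        """Insert a page and return the evicted (page_id, page) pair, if any."""
        self.pages[page_id] = page
        self.pages.move_to_end(page_id)
        if len(self.pages) > self.capacity:
            return self.pages.popitem(last=False)  # drop least recently used
        return None


def read_page(page_id, bp, bpe, essd):
    """Return (page, tier), where tier names the layer that served the read."""
    # Step 2: look in the in-memory buffer pool first.
    page = bp.get(page_id)
    if page is not None:
        return page, "BP"

    # Step 3: on a BP miss, look in the buffer pool extension.
    page = bpe.get(page_id)
    tier = "BPE"
    if page is None:
        # Step 4: on a BPE miss, read the table file on the ESSD.
        page = essd[page_id]
        tier = "ESSD"

    # In both miss cases the page is returned to the BP before replying.
    evicted = bp.put(page_id, page)
    if evicted is not None:
        bpe.put(*evicted)  # assumption: pages evicted from the BP move to the BPE
    return page, tier


essd = {i: f"page-{i}" for i in range(4)}  # stand-in for data table files
bp, bpe = LruCache(capacity=1), LruCache(capacity=2)
for pid in (0, 0, 1, 0):
    print(pid, read_page(pid, bp, bpe, essd)[1])
```

With a one-page BP, the fourth read in the usage loop misses the BP but is served from the BPE instead of the ESSD, which is the effect the BPE is meant to achieve.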
For a database engine, both the BP in memory and temporary tables/files on disks affect the performance of database queries. Temporary tables/files are used in current sessions or queries to store intermediate results generated during the data query process or results that exceed the memory limit. Generally, they do not need to be persistently stored in data files and are deleted after use.
Since temporary tables/files do not need to be persisted for business purposes, RDS general ESSDs, unlike current ESSDs, have changed the storage location of these tables/files. Previously, temporary tables/files for the RDS database were stored in ESSDs. However, with the launch of general ESSDs, these tables/files can now be stored in buffer disks. This change improves the efficiency of database access to temporary tables/files, thereby speeding up database queries.
Taking the 8-core 16 GB specification as an example, the test results for the I/O acceleration feature are as follows:
• read_only QPS is improved by 80%.
• write_only QPS is improved by 33%.
• read_write QPS is improved by 103%.
The data layer of general ESSDs uses ESSD as the storage medium. With the innovative capabilities of ESSDs, the I/O performance and capacity of a cloud disk are decoupled. By leveraging the management and control architecture benefits and data copy capabilities of the RDS database, cloud disks can be scaled in, thus achieving ultimate elasticity in both I/O performance and cloud disk capacity for the database.
High I/O load and fluctuating I/O load are two common business scenarios. Previously, the RDS database used PL1, PL2, and PL3 ESSDs, whose I/O performance and capacity were tightly bound: the IOPS and bandwidth limits of these ESSDs were constrained by their storage capacity. Because of this limitation, scaling out the cloud disk was the only way to obtain higher I/O performance during I/O peaks.
With technological innovations in the database, RDS general ESSDs decouple the I/O performance of cloud disks from their storage capacity, providing the I/O burst feature and supporting dynamic adjustment of the maximum disk I/O based on actual business usage. When the I/O load is high, an I/O burst is automatically triggered to increase the maximum I/O. Once the I/O load decreases, the maximum I/O is automatically restored. This gives the database maximum elasticity in I/O performance without paying for I/O capacity that sits idle.
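The burst behavior can be pictured with a small model. The sketch below is a simplification under stated assumptions, not the RDS control logic: the function names, the idea of a fixed burst ceiling, and all numbers are hypothetical. It shows only the rule described above: the limit rises when demand exceeds the baseline and falls back when demand drops.

```python
def effective_iops_limit(demand_iops, baseline_iops, burst_ceiling_iops,
                         burst_enabled=True):
    """Return the IOPS limit applied for one sampling interval."""
    if burst_enabled and demand_iops > baseline_iops:
        # Burst: raise the limit to meet demand, capped at the ceiling.
        return min(demand_iops, burst_ceiling_iops)
    # No burst needed (or burst disabled): the baseline limit applies.
    return baseline_iops


def iops_usage_percent(demand_iops, baseline_iops, burst_ceiling_iops,
                       burst_enabled=True):
    """Served IOPS as a percentage of the baseline limit."""
    limit = effective_iops_limit(demand_iops, baseline_iops,
                                 burst_ceiling_iops, burst_enabled)
    served = min(demand_iops, limit)
    return 100.0 * served / baseline_iops


# Hypothetical workload: a peak in the middle of otherwise light traffic.
for demand in (4_000, 15_000, 4_000):
    print(demand, iops_usage_percent(demand, 10_000, 50_000))
```

Because usage is measured against the baseline, a burst shows up as usage above 100%, while the same demand with burst disabled is capped at 100%.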
As shown in the following example, when the I/O burst feature is enabled, the IOPS usage of the RDS instance can exceed 100% during business I/O peaks.
Figure 2: I/O burst test for RDS general ESSDs
Currently, the I/O burst feature of RDS general ESSDs supports three RDS engines: MySQL, PostgreSQL, and SQL Server.
Conventional cloud disks do not support scale-in. RDS general ESSDs achieve cloud disk scale-in by combining the RDS database management and control architecture, the AliSQL kernel, and the data copy feature. Together with the ability of cloud disks to scale out in seconds, this lets customers freely adjust their cloud disk capacity based on their business needs, achieving ultimate elasticity in cloud disk capacity.
The cold storage layer of general ESSDs uses OSS as the storage medium and archives table-level data in OSS. After enabling the data archiving feature, users can switch table files between ESSD and OSS by executing ALTER TABLE. Archived tables in OSS can still be queried normally.
The storage cost of OSS is much lower than that of ESSD, so the data archiving feature significantly reduces the storage cost for tables that users do not frequently access.
The data archiving feature is suitable for business scenarios with tables that are not frequently accessed or modified. The cold storage layer achieves data archiving through direct access to OSS from the database kernel or access to OSS with JuiceFS.
With direct access from the database kernel, table data is archived to OSS through PUT, and the archived data is accessed through GET. The ApsaraDB RDS for MySQL engine mainly adopts this method. It is worth noting that, to ensure the compatibility of archived tables, the AliSQL kernel uses the storage format of the InnoDB engine. Therefore, the BPE can still be applied to archived tables to speed up their queries.
After enabling the data archiving feature, users can upload a normal table to OSS and convert it into an archived table by executing ALTER TABLE. They can also convert an archived table back into a normal table by executing ALTER TABLE. Archived tables currently support only read operations. Users can access data in archived tables by executing a SELECT statement. The MySQL kernel will perform the following operations on a table to be archived:
• Split the .ibd file of the table to be archived into file blocks of a size specified by oss_block_size (2 MB by default).
• Use the SDK of OSS to upload the file blocks to OSS.
• Keep space header files in ESSDs to speed up instance startup and table file scanning.
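The splitting step in the list above can be sketched as follows. This is a minimal illustration, not the kernel's code: the helper names and the object key layout are hypothetical, and the sketch assumes that 2 MB means 2 × 1024 × 1024 bytes. It plans the blocks only; the actual upload through the OSS SDK is left out.

```python
OSS_BLOCK_SIZE = 2 * 1024 * 1024  # default oss_block_size, assumed to be 2 MiB


def split_into_blocks(data, block_size=OSS_BLOCK_SIZE):
    """Yield (block_index, block_bytes) pairs that cover data in order."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    for index, offset in enumerate(range(0, len(data), block_size)):
        yield index, data[offset:offset + block_size]


def plan_upload(table_name, data, block_size=OSS_BLOCK_SIZE):
    """Return (object_key, size) pairs; the key layout here is hypothetical."""
    return [
        (f"{table_name}.ibd/{index:08d}", len(block))
        for index, block in split_into_blocks(data, block_size)
    ]


# A 5 MiB stand-in for the contents of an .ibd file.
ibd_bytes = bytes(5 * 1024 * 1024)
for key, size in plan_upload("orders", ibd_bytes):
    print(key, size)
```

Only the last block may be shorter than the block size, so a table file of N bytes produces ceil(N / block_size) objects.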
Figure 3: Data archiving in ApsaraDB RDS for MySQL
Access to OSS with JuiceFS is the method mainly used by the ApsaraDB RDS for PostgreSQL engine for data archiving.
When the data archiving feature is enabled, in addition to the data directory, an archive directory named /cold-jfs is automatically created in the ApsaraDB RDS for PostgreSQL instance, and the corresponding tablespace rds_oss is created. Users can transfer the table to be archived to the rds_oss tablespace by executing an ALTER TABLE statement. Data in the rds_oss tablespace is uploaded to OSS through the JuiceFS file system, ensuring that it does not occupy the storage space of ESSDs. All tables in the rds_oss tablespace are archived tables, and users can query data in those tables with normal query statements.
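As a rough illustration of the ALTER TABLE step, the helper below builds the statement using standard PostgreSQL syntax (`ALTER TABLE ... SET TABLESPACE ...`). The function names are hypothetical, and the source does not show the exact statement used on RDS. Using `pg_default` as the target when moving a table back is also an assumption, so check the product documentation before relying on it.

```python
def quote_ident(name):
    """Quote a PostgreSQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def move_table_statement(table, tablespace):
    """Build an ALTER TABLE statement that moves a table to a tablespace."""
    return (f"ALTER TABLE {quote_ident(table)} "
            f"SET TABLESPACE {quote_ident(tablespace)};")


def archive_statement(table):
    """Move a table into the rds_oss tablespace named in the text."""
    return move_table_statement(table, "rds_oss")


def restore_statement(table, tablespace="pg_default"):
    """Move a table back; pg_default as the target is an assumption."""
    return move_table_statement(table, tablespace)


print(archive_statement("orders"))
print(restore_statement("orders"))
```

Quoted identifiers are case-sensitive in PostgreSQL, so the helper expects the table name exactly as it is stored in the catalog.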
Figure 4: Data archiving in ApsaraDB RDS for PostgreSQL
The following table describes the read-only performance test results for archived tables in ApsaraDB RDS for MySQL.
The QPS for archived tables is about 14% of that for normal tables (5,861.82 versus 42,741.99).
Table type | Average TPS | Average QPS |
Archived table | 366.36 | 5861.82 |
Normal table | 2671.37 | 42741.99 |
General ESSDs provide the following capabilities at three storage layers: I/O acceleration, I/O burst, and data archiving. These capabilities enable the management of hot, warm, and cold data without requiring extensive technological expertise. By enhancing storage management efficiency, these capabilities improve key business performance while achieving maximum cost-effectiveness. Compared with ESSDs, general ESSDs reduce storage costs by more than 30%, making them ideal for those seeking cost-effectiveness.