By Daoke
As one of the optimization solutions for large tables, partition tables are becoming increasingly popular. PolarDB for MySQL's partition tables are fully compatible with MySQL's syntax and features. PolarDB for MySQL has been optimized and enhanced in terms of features and performance. For more information, refer to the Partition Table Overview of PolarDB for MySQL.
The partition table divides a logical large table into multiple physical small tables based on partitioning strategies. These small tables are called partitions. When managing a large amount of data, a single table becomes a bottleneck. Partitioning allows data to be divided into smaller shards for independent management, ensuring high performance for large tables. Partitioning strategies include RANGE, LIST, and HASH, where you can specify a partition key and strategy to divide data based on the partition key field's value. INTERVAL partitioning is a type of RANGE partitioning and an extension of the RANGE partitioning feature. The difference is that with RANGE partitioning, if inserted data exceeds an existing partition's range, it cannot be inserted and an error is returned. However, with INTERVAL RANGE partitioning, the PolarDB database automatically creates a new partition and continues inserting the data.
The INTERVAL partition table can automatically create new partitions. Similar to a RANGE partition table, the INTERVAL partition appends new partitions in backward sequence since all partitions are sorted by partition range. The rules for creating new partitions are determined by the maximum range of the previous partition and the partition range specified by the user. The partition range for INTERVAL is defined using the new INTERVAL syntax, as shown in the following figure.
INTERVAL(type, expr)
This syntax contains two parameters: type and expr. The type parameter can be used to specify the time type, and the expr parameter can be used to specify the specific partition range.
For example, when you create an INTERVAL partition table, the specified partition range is one day.
CREATE TABLE t(a int, b datetime)
PARTITION BY RANGE COLUMNS(b) INTERVAL(DAY, 1) (
PARTITION p20211101 VALUES LESS THAN('2021-11-01 00:00:00')
);
The range of each partition created automatically by this INTERVAL partition is also one day. For more information, see Create an INTERVAL RANGE Partition.
This should distinguish the type of the original table.
There are two scenarios. One is that the original table is a non-partition table, and the other is that the original table is a partition table but not a RANGE partition table. In both cases, the switch needs to be completed through a syntax, such as ALTER TABLE table_name partition_options. This switch usually requires locking the table, and the locking duration of the table depends on the amount of data in the table. To ensure business continuity, PolarDB also provides the syntax for a quick switch from a non-partition table to an INTERVAL partition table. For more information, see Switch a Common Table to a Range Partitioned Table. The example provided in the document is a switch from a common table to a RANGE partition table. To switch a common table to an INTERVAL partition table, you only need to add the INTERVAL syntax. Example:
CREATE TABLE t (
`id` int ,
`gmt_create` datetime);
insert into t values(1, NOW()), (2, NOW());
alter table t partition by range columns(gmt_create) interval(DAY, 1) (
partition p20230101 values less than ('2023-01-01'),
partition p20230102 values less than ('2023-01-02')
) WITHOUT VALIDATION;
Make sure that all data in a common table is within the range defined by the first partition of the partition table. Otherwise, data that does not comply with the rules may not be queried after the table is switched to a partition table.
You can switch a RANGE partition table to an INTERVAL partition table by using the ALTER TABLE table_name SET INTERVAL(type, expr) syntax. You do not need to lock tables in this switch process. For more information, see Switch an Interval Range Partitioned Table.
The INTERVAL partition table automatically generates partitions with specific naming rules. The prefix of the partition name must be _p, followed by the upper boundary of the partition range and the type of the partition key. For example, if the partition key is of type DATETIME and the partition range is VALUES LESS THAN ('2023-01-01'), the partition name will be _p20230101000000. Similarly, if the partition key is of type DATE and the partition range is VALUES LESS THAN ('2023-01-01'), the partition name will be _p20230101.
You can also create new partitions for the INTERVAL partition table using the ADD PARTITION syntax. To avoid conflicts between the names of automatically generated partitions and user-defined partitions, user-defined partition names should either not have the _p prefix or follow the naming rules of automatically generated partitions. In order to help users understand the naming rules for automatically generated partitions, if a user-defined partition name does not conform to the rules, the system will provide the correct name as a prompt.
The feature of database partition tables allows you to use partitions to improve the access performance of large tables and flexibly manage partitioned data. However, in many scenarios, partitions need to be maintained regularly. For example, you need to create new partitions and delete old partitions regularly. These operations usually need to be performed by DBA in the early hours of the morning. INTERVAL partitioning combined with scheduled tasks can realize the automated management of partitions.
This solution is suitable for scenarios where the partition key is the time column. Assume you have a business table named orders, which uses the order time as the partition key. You need a partition every day to save the orders of the day.
CREATE TABLE orders(
id int,
ordertime datetime
)
PARTITION BY RANGE COLUMNS(ordertime) INTERVAL(DAY, 1)
(
PARTITION _p20230101000000 VALUES LESS THAN('2023-01-01'),
PARTITION _p20230102000000 VALUES LESS THAN('2023-01-02')
PARTITION _p20230103000000 VALUES LESS THAN('2023-01-03')
PARTITION _p20230104000000 VALUES LESS THAN('2023-01-04')
);
You can use INTERVAL partitioning to automatically add a new partition for each day. If the user order information only needs to be stored for 3 days, you can create an event to delete the old partition 3 days ago regularly, which can save storage space without affecting the performance of other partitions. The following example shows how to create a scheduled task:
DELIMITER ||
CREATE EVENT IF NOT EXISTS drop_partition ON SCHEDULE
EVERY 1 DAY STARTS '2023-01-04 02:00:00'
ON COMPLETION PRESERVE
DO
BEGIN
set @pname = concat('alter table orders drop partition _p', date_format(date_add(curdate(), INTERVAL -3 DAY), '%Y%m%d000000'));
prepare stmt_drop_partition from @pname;
execute stmt_drop_partition;
deallocate prepare stmt_drop_partition;
END ||
DELIMITER ;
For more information, see Automated Management of Partitions.
INTERVAL partitioning combined with scheduled tasks can implement a function similar to data lifecycle management.
Best Practices for Upgrading PolarDB for MySQL 5.7/MySQL 5.7 to PolarDB for MySQL 8.0
7 posts | 0 followers
FollowApsaraDB - April 26, 2023
ApsaraDB - August 9, 2024
ApsaraDB - April 20, 2023
ApsaraDB - January 15, 2024
ApsaraDB - October 20, 2023
ApsaraDB - March 3, 2020
7 posts | 0 followers
FollowAlibaba Cloud PolarDB for MySQL is a cloud-native relational database service 100% compatible with MySQL.
Learn MoreAlibaba Cloud PolarDB for PostgreSQL is an in-house relational database service 100% compatible with PostgreSQL and highly compatible with the Oracle syntax.
Learn MoreAlibaba Cloud PolarDB for Xscale (PolarDB-X) is a cloud-native high-performance distributed database service independently developed by Alibaba Cloud.
Learn MoreAnalyticDB for MySQL is a real-time data warehousing service that can process petabytes of data with high concurrency and low latency.
Learn MoreMore Posts by Morningking