This topic describes how to archive cold data in the comma-separated values (CSV) or Optimized Row Columnar (ORC) format and import data back from Object Storage Service (OSS) tables to PolarStore.
Prerequisites
To archive cold data in the CSV format for a cluster, make sure that the cluster meets the following engine version requirements. For information about how to query the engine version of a cluster, see Query the engine version.
If the cluster is of Cluster Edition, it must run one of the following database engine versions:
PolarDB for MySQL 8.0.1 whose revision version is 8.0.1.1.32 or later
PolarDB for MySQL 8.0.2 whose revision version is 8.0.2.2.10 or later
NoteManual archiving of cold data is not recorded in binary logs if the cluster runs one of the following database engine versions:
PolarDB for MySQL 8.0.1 whose revision version is 8.0.1.1.33 or later
PolarDB for MySQL 8.0.2 whose revision version is 8.0.2.2.11.1 or later
If the cluster is of Multi-master Cluster (Database/Table) Edition, it must run PolarDB for MySQL 8.0.1.0.13 or later.
To archive cold data in the ORC format for a cluster, make sure that the cluster runs PolarDB for MySQL 8.0.2 whose revision version is 8.0.2.2.16.2 or later. For information about how to query the engine version of a cluster, see Query the engine version.
The cold data archiving feature is enabled. For more information, see Enable cold data archiving.
The cluster is connected. For more information, see Connected to a cluster.
Usage notes
Archive common tables
Cold data archiving is the process of preserving cold data within tables. These tables that store the cold data are referred to as archived tables (read-only) and utilize the OSS engine. The data files corresponding to these archived tables are stored in OSS. After cold data is archived, the PolarFileSystem space occupied by the original table is released.
Syntax
Archive cold data in the CSV format:
ALTER TABLE table_name ENGINE = CSV CONNECTION = 'default_oss_server';
If the cluster is of PolarDB for MySQL 8.0.1.1.33 or later, PolarDB for MySQL 8.0.2.2.13 or later, or Multi-master Cluster (Database/Table) Edition PolarDB for MySQL 8.0.1.1.15 or later, you can also execute the following statement to manually archive cold data:
ALTER TABLE table_name ENGINE = CSV STORAGE OSS;
Archive cold data in the ORC format:
ALTER TABLE table_name ENGINE = ORC STORAGE OSS;
If OSS contains a file that has the same name as the file to be archived, the system returns an error message indicating that the file already exists in OSS. Example:
Target file for archived table exists on oss.
You can add the FORCE keyword to the preceding statement to overwrite the existing file in OSS. The following example shows how to add the FORCE keyword to the statement that archives cold data in the CSV format:
ALTER TABLE table_name ENGINE = CSV CONNECTION = 'default_oss_server' , FORCE;
Parameters
Parameter | Description |
table_name | The name of the table to be archived to OSS. |
Precautions
Only data in the tables that use the InnoDB storage engine and X-Engine can be archived.
You cannot modify the table by executing DDL or DML statements during cold data archiving.
Data files cannot be archived to custom OSS servers.
To archive cold data in a table that uses the InnoDB storage engine, make sure that the table has a primary key.
After cold data is archived, the table in OSS becomes read-only, and the query speed for this table is slow. Before you perform data archiving, we recommend that you use test data to evaluate whether the query performance on the archived table meets your business requirements.
Examples
Archive data in the t
table to OSS in the CSV or ORC format.
Create a table named
t
in theoss_test
database. The table uses the InnoDB storage engine.Insert data into the
t
table.Execute the
ALTER
statement to archive cold data.Archive data in the CSV format:
ALTER TABLE t ENGINE = CSV CONNECTION = 'default_oss_server';
If the cluster is of PolarDB for MySQL 8.0.1.1.33 or later, PolarDB for MySQL 8.0.2.2.13 or later, or Multi-master Cluster (Database/Table) Edition PolarDB for MySQL 8.0.1.1.15 or later, execute the following statement:
ALTER TABLE t ENGINE = CSV STORAGE OSS;
Archive data in the ORC format:
ALTER TABLE t ENGINE = ORC STORAGE OSS;
After the data is archived, you can log on to the PolarDB console to view the archived databases and tables in OSS, or execute SQL statements to view data in the archived tables.
To view the archived databases and tables in OSS, log on to the PolarDB console. Find the cluster that you want to manage and choose to view information about the archived databases and tables in OSS.
To view data in the archived tables, you can execute SQL statements in the same way as you execute SQL statements before the archiving operations. Example:
SELECT * FROM t;
CREATE TABLE t(a int, b int, c int, primary key(a)) ENGINE = INNODB;
INSERT INTO t VALUES (1,2,3);
Archive table partitions to an OSS foreign table
The cluster must run PolarDB for MySQL 8.0.2 whose revision version is 8.0.2.2.25 or later.
Set the
partition_level_mdl_enabled
cluster parameter to ON in the PolarDB console to enable the partition-level metadata lock (MDL) feature. For more information about how to configure cluster parameters, see Configure cluster and node parameters.Set the
loose_use_oss_meta
cluster parameter to ON in the PolarDB console to enable theuse_oss_meta
feature. For more information about how to configure cluster parameters, see Configure cluster and node parameters.
Syntax
CALL dbms_dlm.archive_partition2table('source_db', 'source_tb', 'source_part', 'archive_db', 'archive_table', 'archive_table_file_filter');
Parameters
Parameter | Description |
source_db | The name of the source database. |
source_tb | The name of the source table. |
source_part | The partitions of the source table that you want to archive. Separate multiple partitions with commas (,). |
archive_db | The name of the destination database. |
archive_table | The name of the destination table. |
archive_table_file_filter | Specifies whether to create a file filter for the destination table. |
Precautions
If the destination OSS table does not exist, the system automatically creates the OSS table and creates an
OSS file filter
on the columns specified in thearchive_table_file_filter
parameter. The file filter accelerates query performance. The primary key and partition key are automatically added to thearchive_table_file_filter
parameter to accelerate queries.If the destination OSS table exists, the system checks whether the column names and types in the source and destination tables are the same. If the column names and types are the same, the specified partitions are archived to the destination table. Otherwise, the archiving operation fails. You can use DDL statements to ensure the definition consistency between the two tables. For more information, see Perform DDL operations on cold data. If an OSS file filter created on the destination table has a different definition from the
call dbms_dlm.archive_partition2table
table, the OSS file filter on the destination table takes precedence.If the destination OSS table does not exist but a file that has the same name as the destination OSS table exists on OSS, an error message that is similar to the following message is returned when you perform the archiving operation: Sample message:
mysql> CALL dbms_dlm.archive_partition2table('test', 'sales', 'p0', 'test', 'sales_history', 'id'); ERROR 8181 (HY000): [Data Lifecycle Management] errmsg: Target file for archived table exists on oss, please remove it first, or use flag 'FORCE' to overwrite on existing files.
If you no longer require the remnant file, you can use a storage procedure to delete the file from OSS and then perform data archiving.
-- Delete OSS data. mysql> CALL dbms_oss.delete_table_file('test', 'sales_history'); Query OK, 0 rows affected (0.76 sec) -- Perform data archiving. mysql> CALL dbms_dlm.archive_partition2table('test', 'sales', 'p0', 'test', 'sales_history', 'id'); Query OK, 0 rows affected (4.24 sec)
The destination OSS tables can archive cold data only in the CSV format.
The archived partitioned table must contain at least one InnoDB partition.
The archived partition data are combined into one table. If you want to restore the data to the partitioned table, execute the
INSERT SELECT
statement.You cannot archive individual level-2 partitions. However, you can archive all level-2 partitions under a level-1 partition.
The following table describes the partition types that can be archived to OSS foreign tables.
Level-1 partitions
Level-2 partitions
Archivable to OSS foreign tables
HASH
All types
No
LIST
All types
Yes
RANGE
All types
Yes
KEY
All types
Yes
All types
Examples
Create an InnoDB partitioned table and insert data into the table.
DROP TABLE IF EXISTS `sales`; -- Create a partitioned table. CREATE TABLE `sales` ( `id` int DEFAULT NULL, `name` varchar(20) DEFAULT NULL, `order_time` datetime NOT NULL, primary key (order_time) ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 PARTITION BY RANGE COLUMNS(order_time) INTERVAL(month, 1) (PARTITION p0 VALUES LESS THAN ('2022-01-01') ENGINE = InnoDB, PARTITION p1 VALUES LESS THAN ('2023-01-01 00:00:00') ENGINE = InnoDB, PARTITION p2 VALUES LESS THAN ('2024-01-01 00:00:00') ENGINE = InnoDB); DROP PROCEDURE IF EXISTS proc_batch_insert; delimiter $$ CREATE PROCEDURE proc_batch_insert(IN begin INT, IN end INT, IN name VARCHAR(20)) BEGIN SET @insert_stmt = concat('INSERT INTO ', name, ' VALUES(? , ?, ?);'); PREPARE stmt from @insert_stmt; WHILE begin <= end DO SET @ID1 = begin; SET @NAME = CONCAT(begin+begin*281313, '@stiven'); SET @TIME = from_days(begin + 738368); EXECUTE stmt using @ID1, @NAME, @TIME; SET begin = begin + 1; END WHILE; END; $$ delimiter ; CALL proc_batch_insert(1, 1000, 'sales');
Archive the p0 partition of the partitioned table to an OSS table.
To view the schema of the sales table, execute the following statement:
-- View the status of the InnoDB table. mysql> SHOW CREATE TABLE sales;
Sample result:
*************************** 1. row *************************** Table: sales Create Table: CREATE TABLE `sales` ( `id` int(11) DEFAULT NULL, `name` varchar(20) DEFAULT NULL, `order_time` datetime NOT NULL, PRIMARY KEY (`order_time`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci /*!50500 PARTITION BY RANGE COLUMNS(order_time) */ /*!99990 800020200 INTERVAL(MONTH, 1) */ /*!50500 (PARTITION p0 VALUES LESS THAN ('2022-01-01') ENGINE = InnoDB, PARTITION p1 VALUES LESS THAN ('2023-01-01 00:00:00') ENGINE = InnoDB, PARTITION p2 VALUES LESS THAN ('2024-01-01 00:00:00') ENGINE = InnoDB, PARTITION _p20240201000000 VALUES LESS THAN ('2024-02-01 00:00:00') ENGINE = InnoDB, PARTITION _p20240301000000 VALUES LESS THAN ('2024-03-01 00:00:00') ENGINE = InnoDB, PARTITION _p20240401000000 VALUES LESS THAN ('2024-04-01 00:00:00') ENGINE = InnoDB, PARTITION _p20240501000000 VALUES LESS THAN ('2024-05-01 00:00:00') ENGINE = InnoDB) */ 1 row in set (0.03 sec)
Execute the following statement to archive the P0 partition to an OSS table named sales_history:
-- Archive the p0 partition to the sales_history OSS table and create an OSS file filter on the id column of the table. mysql> CALL dbms_dlm.archive_partition2table('test', 'sales', 'p0', 'test', 'sales_history', 'id'); Query OK, 0 rows affected (1.86 sec)
To view the schema of the sales_history table, execute the following statement:
SHOW CREATE TABLE sales_history;
Sample result:
*************************** 1. row ***************************; Table: sales_history Create Table: CREATE TABLE `sales_history` ( `id` int(11) DEFAULT NULL, `name` varchar(20) DEFAULT NULL, `order_time` datetime DEFAULT NULL, PRIMARY KEY (`order_time`) ) /*!99990 800020213 STORAGE OSS */ ENGINE=CSV DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci /*!99990 800020204 NULL_MARKER='NULL' */ /*!99990 800020223 OSS META=1 */ /*!99990 800020224 OSS_FILE_FILTER='id,order_time' */ 1 row in set (0.00 sec)
Query data on the OSS table.
NoteYou can enable the OSS file filter feature to accelerate queries. For more information, see Use the OSS file filter feature to accelerate cold data queries.
mysql> explain SELECT * FROM sales_history WHERE id = 100; +----+-------------+---------------+------------+------+---------------+------+---------+------+------+----------+-------------------------------------------------------------------------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+---------------+------------+------+---------------+------+---------+------+------+----------+-------------------------------------------------------------------------------+ | 1 | SIMPLE | sales_history | NULL | ALL | NULL | NULL | NULL | NULL | 152 | 10.00 | Using where; With pushed engine condition (`test`.`sales_history`.`id` = 100) | +----+-------------+---------------+------------+------+---------------+------+---------+------+------+----------+-------------------------------------------------------------------------------+ 1 row in set, 1 warning (0.00 sec) mysql> SELECT * FROM sales_history WHERE id = 100; +------+-----------------+---------------------+ | id | name | order_time | +------+-----------------+---------------------+ | 100 | 28131400@stiven | 2021-11-09 00:00:00 | +------+-----------------+---------------------+ 1 row in set (0.24 sec)
Archive partitioned tables
The partitioned table archiving feature is in canary release. To use this feature, go to Quota Center. Find the quota name that corresponds to the
polardb_mysql_hybrid_partition
quota ID. Then, clickApply
in the Actions column.Your cluster must run PolarDB for MySQL 8.0.2 whose revision version is 8.0.2.2.14 or later.
Partitioned table archiving is the process of archiving data in partitions. The archived table is a hybrid partitioned table. The data files of the archived partitions are stored in OSS. After cold data is archived, the PolarStore space occupied by the original partitioned table is released.
Syntax
Archive cold data in the CSV format:
ALTER TABLE table_name CHANGE PARTITION part_name ENGINE = CSV;
Archive cold data in the ORC format:
ALTER TABLE table_name CHANGE PARTITION part_name ENGINE = ORC;
If OSS contains a file that has the same name as the file to be archived, the system returns an error message indicating that the file already exists in OSS. Example:
Target file for archived table exists on oss.
You can add the FORCE keyword to the preceding statement to overwrite the existing file in OSS. The following example shows how to add the FORCE keyword to the statement that archives cold data in the CSV format:
ALTER TABLE table_name CHANGE PARTITION part_name ENGINE = CSV FORCE;
Parameters
Parameter | Description |
table_name | The name of the table to be archived to OSS. |
part_name | The name of the partition to be archived to OSS. |
Precautions
The cold data archiving feature is supported only for partitioned tables that use the InnoDB storage engine.
When you archive data in a partitioned table, make sure that at least one partition remains on the InnoDB storage engine. This means that you cannot archive data in the last partition on the InnoDB storage engine.
The archived table is a hybrid partitioned table. You cannot execute DDL statements on a hybrid partitioned table. For more information, see Create a hybrid partitioned table.
You cannot modify the data of archived partitions.
Cold data archiving is not supported for subpartitions in a partitioned table.
You cannot archive cold data in the default partitions of a list default hash partitioned table.
You cannot archive cold data in a hash or key partitioned table.
You cannot archive cold data for the entire partitioned table.
If OSS contains a file that has the same name as the file in the partition to be archived, the execution fails and an error message that is similar to the following message is returned:
Target file for archived table exists on oss.
Execute the following statement to overwrite the existing file in OSS: Before you execute the statement, you must replace
table_name
andpart_name
based on your environment.ALTER TABLE table_name CHANGE PARTITION part_name ENGINE = CSV/ORC FORCE;
Examples
Archive data in the p1
and p2
partitions of the t
table to OSS in the CSV format.
Create a table named
t
in the database. The table uses the InnoDB storage engine.CREATE TABLE t(a int, b int, c int, primary key(a)) PARTITION BY RANGE(a) (PARTITION p1 values less than(100), PARTITION p2 values less than(200), PARTITION p3 values less than MAXVALUE );
Insert data into the
t
table.INSERT INTO t VALUES(1,1,1); INSERT INTO t VALUES(10,10,10); INSERT INTO t VALUES(100,100,100); INSERT INTO t VALUES(150,150,150); INSERT INTO t VALUES(200,200,200); INSERT INTO t VALUES(1000,1000,1000);
Execute the following statements to archive data in the
p1
andp2
partitions to OSS:Archive data in the CSV format:
ALTER TABLE t CHANGE PARTITION p1 ENGINE = csv; ALTER TABLE t CHANGE PARTITION p2 ENGINE = csv;
Archive data in the ORC format:
ALTER TABLE t CHANGE PARTITION p1 ENGINE = ORC; ALTER TABLE t CHANGE PARTITION p2 ENGINE = ORC;
After the data is archived, you can log on to the PolarDB console to view the archived databases and tables in OSS, or execute SQL statements to view data in the hybrid partitioned table.
To view the archived databases and tables in OSS, log on to the PolarDB console. Find the cluster that you want to manage and choose to view information about the archived databases and tables in OSS.
For information about how to view data in a hybrid partitioned table, see Query data in a hybrid partitioned table.
Import data from OSS tables to PolarStore
Import data from a common table that is archived in OSS
Before you modify the cold data archived in OSS tables, you can execute the ALTER ENGINE
statement to import data from OSS to PolarStore. Then, you can modify the data in the imported table. After data is imported to PolarStore, cold data in OSS is deleted. After the data is modified, you can archive the modified table to OSS again.
Syntax
ALTER TABLE table_name ENGINE[=]engine_name;
Parameters
Parameter | Description |
table_name | The name of the OSS table to be imported. |
engine_name | The engine of the imported table. |
Precautions
OSS tables are in the read-only state and cannot be modified by executing the INSERT
, UPDATE
, or DELETE
statement. To modify archived cold data, convert OSS tables into tables that can be read and written, such as InnoDB tables. When you modify an OSS table in the read-only state, the following error message is reported:
1036 - Table 't1' is read only
Examples
Import data from the OSS table named t
to PolarStore in the oss_test
database.
ALTER TABLE `oss_test`.`t` ENGINE = InnoDB;
Modify the data in the InnoDB table named t
. After the data is modified, archive the t
table in the InnoDB engine to OSS again. Examples:
ALTER TABLE t ENGINE = CSV CONNECTION = 'default_oss_server';
Or
ALTER TABLE t ENGINE = CSV STORAGE OSS;
Import data from a hybrid partitioned table that is archived in OSS
Before you modify the data in a hybrid partitioned table that is archived in OSS tables, you can execute the ALTER statement to import data from OSS to PolarStore. After data is imported to PolarStore, cold data in OSS is deleted.
Syntax
ALTER TABLE table_name REORGANIZE PARTITION part_name INTO (partition_definition);
Parameters
Parameter | Description |
table_name | The name of the OSS table to be imported. |
part_name | The name of the partition to be imported. |
partition_definition | The partition definition. The value must be the same as the |
Examples
Import data from the p1
partition of the partitioned table t
that is archived in OSS to PolarStore in the database.
ALTER TABLE t REORGANIZE PARTITION p1 INTO(PARTITION p1 values less than(100));
Delete the corresponding file in OSS
You can delete the corresponding file in OSS only if your PolarDB for MySQL cluster runs one of the following versions:
PolarDB for MySQL 8.0.1 whose revision version is 8.0.1.1.42 or later
PolarDB for MySQL 8.0.2 whose revision version is 8.0.2.2.23 or later
If the revision version of your cluster does not meet the preceding requirements, you cannot delete the corresponding files from OSS. We recommend that you update the revision version of your cluster. For more information, see Minor version update.
After you delete a table from OSS or import a table from OSS to PolarStore, the corresponding file in OSS is retained. If you confirm that the data is no longer required, you can use the following statement to delete the corresponding file from OSS:
CALL dbms_oss.delete_table_file('database_name', 'table_name');
The OSS file is asynchronously deleted. Therefore, you must wait until all nodes in the cluster no longer depend on the OSS file before you delete the file. Take note that a delay occurs during heavy traffic. If the preceding statement fails and the "OSS files are still in use
" error message appears, wait for a period of time before you re-execute the statement.