This topic describes how to separate hot data and cold data in a ClickHouse cluster of Alibaba Cloud E-MapReduce (EMR) by using Object Storage Service (OSS). The operations described in this topic let the cluster automatically maintain hot data and cold data and make full use of computing and storage resources to reduce costs, without affecting the read and write performance of the cluster.
Prerequisites
A ClickHouse cluster of EMR V5.7.0 or later is created in the EMR console. For more information, see Create a ClickHouse cluster.
Limits
You can perform the operations that are described in this topic only in a ClickHouse cluster of EMR V5.7.0 or later.
Procedure
Step 1: Add an OSS disk in the EMR console
- Go to the Configure tab on the ClickHouse service page of the cluster.
    - Log on to the EMR on ECS console.
    - In the top navigation bar, select the region where your cluster resides and select a resource group based on your business requirements.
    - On the EMR on ECS page, find the cluster that you want to manage and click Services in the Actions column.
    - On the Services tab, click Configure in the ClickHouse section.
- On the Configure tab, click the server-metrika tab.
- Change the value of the storage_configuration parameter.
- Save the configuration.
    - Click Save in the upper-right corner of the Service Configuration section.
    - In the Confirm Changes dialog box, configure Description, turn on the Auto-update Configuration switch, and then click OK.
- Deploy the client configuration.
    - On the Configure tab of the ClickHouse service page, click Deploy Client Configuration.
    - In the Cluster Activities dialog box, configure Description and click OK.
    - In the Confirm message, click OK.
Step 2: Check the configuration
- Log on to the ClickHouse cluster in SSH mode. For more information, see Log on to a cluster.
- Run the following command to start the ClickHouse client:
clickhouse-client -h core-1-1 -m
Note In the sample command, core-1-1 indicates the name of the core node that you log on to. If you have multiple core nodes, you can log on to one of the nodes.
- Execute the following statement to view disk information:
select * from system.disks;
The following output is returned:
┌─name─────┬─path────────────────────────────────┬───────────free_space─┬──────────total_space─┬─keep_free_space─┬─type──┐
│ default  │ /var/lib/clickhouse/                │          83868921856 │          84014424064 │               0 │ local │
│ disk1    │ /mnt/disk1/clickhouse/              │          83858436096 │          84003938304 │        10485760 │ local │
│ disk2    │ /mnt/disk2/clickhouse/              │          83928215552 │          84003938304 │        10485760 │ local │
│ disk3    │ /mnt/disk3/clickhouse/              │          83928301568 │          84003938304 │        10485760 │ local │
│ disk4    │ /mnt/disk4/clickhouse/              │          83928301568 │          84003938304 │        10485760 │ local │
│ disk_oss │ /var/lib/clickhouse/disks/disk_oss/ │ 18446744073709551615 │ 18446744073709551615 │               0 │ oss   │
└──────────┴─────────────────────────────────────┴──────────────────────┴──────────────────────┴─────────────────┴───────┘
- Execute the following statement to view storage policies:
select * from system.storage_policies;
The following output is returned:
┌─policy_name─┬─volume_name─┬─volume_priority─┬─disks─────────────────────────────┬─volume_type─┬─max_data_part_size─┬─move_factor─┬─prefer_not_to_merge─┐
│ default     │ single      │               1 │ ['disk1','disk2','disk3','disk4'] │ JBOD        │                  0 │           0 │                   0 │
│ oss_ttl     │ local       │               1 │ ['disk1','disk2','disk3','disk4'] │ JBOD        │                  0 │         0.2 │                   0 │
│ oss_ttl     │ remote      │               2 │ ['disk_oss']                      │ JBOD        │                  0 │         0.2 │                   0 │
└─────────────┴─────────────┴─────────────────┴───────────────────────────────────┴─────────────┴────────────────────┴─────────────┴─────────────────────┘
If the preceding output is returned, the OSS disk is added.
Step 3: Separate hot data and cold data
Reconstruct an existing table
- Execute the following statement on the ClickHouse client to view the current storage policy:
SELECT storage_policy FROM system.tables WHERE database='<yourDatabaseName>' AND name='<yourTableName>';
In the preceding statement, <yourDatabaseName> specifies the database name and <yourTableName> specifies the table name.
If the table still uses a storage policy that contains only local disks, such as the default policy shown in the following configuration, you must perform the next step to add a volume:
<default>
    <volumes>
        <single>
            <disk>disk1</disk>
            <disk>disk2</disk>
            <disk>disk3</disk>
            <disk>disk4</disk>
        </single>
    </volumes>
</default>
- Add a volume. On the Configure tab of the ClickHouse service page in the EMR console, add a volume to the volumes element of the storage policy. Sample code:
<default>
    <volumes>
        <single>
            <disk>disk1</disk>
            <disk>disk2</disk>
            <disk>disk3</disk>
            <disk>disk4</disk>
        </single>
        <!-- The following volume named remote is added. -->
        <remote>
            <disk>disk_oss</disk>
        </remote>
    </volumes>
    <!-- If you want to add multiple volumes, specify move_factor. -->
    <move_factor>0.2</move_factor>
</default>
- Execute the following statement to change the TTL:
ALTER TABLE <yourDatabaseName>.<yourTableName> MODIFY TTL toStartOfMinute(addMinutes(t, 5)) TO VOLUME 'remote';
- Execute the following statement to view the distribution of each data part:
select partition, name, path from system.parts where database='<yourDatabaseName>' and table='<yourTableName>' and active=1;
The following output is returned:
┌─partition───────────┬─name─────────────────────┬─path───────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ 2022-01-11 19:55:00 │ 1641902100_1_90_3_193    │ /var/lib/clickhouse/disks/disk_oss/store/fc5/fc50a391-4c16-406b-a396-6e1104873f68/1641902100_1_90_3_193/    │
│ 2022-01-11 19:55:00 │ 1641902100_91_96_1_193   │ /var/lib/clickhouse/disks/disk_oss/store/fc5/fc50a391-4c16-406b-a396-6e1104873f68/1641902100_91_96_1_193/   │
│ 2022-01-11 20:00:00 │ 1641902400_97_124_2_193  │ /mnt/disk3/clickhouse/store/fc5/fc50a391-4c16-406b-a396-6e1104873f68/1641902400_97_124_2_193/               │
│ 2022-01-11 20:00:00 │ 1641902400_125_152_2_193 │ /mnt/disk2/clickhouse/store/fc5/fc50a391-4c16-406b-a396-6e1104873f68/1641902400_125_152_2_193/              │
│ 2022-01-11 20:00:00 │ 1641902400_153_180_2_193 │ /mnt/disk4/clickhouse/store/fc5/fc50a391-4c16-406b-a396-6e1104873f68/1641902400_153_180_2_193/              │
│ 2022-01-11 20:00:00 │ 1641902400_181_186_1_193 │ /mnt/disk3/clickhouse/store/fc5/fc50a391-4c16-406b-a396-6e1104873f68/1641902400_181_186_1_193/              │
│ 2022-01-11 20:00:00 │ 1641902400_187_192_1_193 │ /mnt/disk4/clickhouse/store/fc5/fc50a391-4c16-406b-a396-6e1104873f68/1641902400_187_192_1_193/              │
└─────────────────────┴──────────────────────────┴──────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
7 rows in set. Elapsed: 0.002 sec.
Note If the preceding output is returned, hot data and cold data are separated based on the TTL. Hot data is stored on local disks, and cold data is stored in OSS. /var/lib/clickhouse/disks/disk_oss is the default value of the metadata_path parameter for the OSS disk, and /mnt/disk{1..4}/clickhouse is the local disk path. If some parts have not been moved to the OSS disk yet, you can trigger the move manually, as shown in the sketch after this procedure.
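Data parts that already existed before you modified the TTL are moved by a background task, so the move may not happen immediately. If you want to trigger the move manually, the following is a minimal sketch; the partition value and the part name are taken from the preceding sample output and only illustrate the syntax.
-- Re-evaluate the new TTL expression for the existing parts of the table.
ALTER TABLE <yourDatabaseName>.<yourTableName> MATERIALIZE TTL;
-- Alternatively, move a specific partition or a specific part explicitly.
ALTER TABLE <yourDatabaseName>.<yourTableName> MOVE PARTITION '2022-01-11 20:00:00' TO VOLUME 'remote';
ALTER TABLE <yourDatabaseName>.<yourTableName> MOVE PART '1641902400_181_186_1_193' TO DISK 'disk_oss';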
Create a table
- Syntax
CREATE TABLE <yourDatabaseName>.<yourTableName> [ON CLUSTER cluster_emr]
(
    column1 Type1,
    column2 Type2,
    ...
)
Engine = MergeTree() -- or Replicated*MergeTree()
PARTITION BY <yourPartitionKey>
ORDER BY <yourPartitionKey>
TTL <yourTtlKey> TO VOLUME 'remote'
SETTINGS storage_policy='oss_ttl';
Note In the preceding syntax, <yourPartitionKey> specifies the partition key of the table in the ClickHouse cluster, and <yourTtlKey> specifies the TTL expression.
- Example
CREATE TABLE test.test
(
    `id` UInt32,
    `t` DateTime
)
ENGINE = MergeTree()
PARTITION BY toStartOfFiveMinute(t)
ORDER BY id
TTL toStartOfMinute(addMinutes(t, 5)) TO VOLUME 'remote'
SETTINGS storage_policy='oss_ttl';
Note In this example, the table named test.test stores only data of the previous 5 minutes in local disks. Data is moved to the volume named remote in OSS 5 minutes after the data is generated.
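To check that the TTL-based movement works as expected, you can insert a few rows into the example table and watch where the parts are stored. The following is a minimal sketch that assumes the test.test table from the preceding example; the disk_name column of the system.parts table shows the disk on which each part currently resides.
-- Insert sample rows. The column order is (id, t).
INSERT INTO test.test VALUES (1, now()), (2, now());
-- Immediately after the insert, the parts are expected to reside on local disks.
-- About 5 minutes later, the same query is expected to show disk_oss for these parts.
SELECT partition, name, disk_name, path
FROM system.parts
WHERE database = 'test' AND table = 'test' AND active = 1;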
Other parameters
- server-config
merge_tree.allow_remote_fs_zero_copy_replication: Set the value to true. This way, engines of the Replicated*MergeTree type replicate only the metadata that points to the data on the OSS disk, and multiple replicas of the same shard in the ClickHouse cluster share one copy of the data in OSS.
- server-users
- profile.${your-profile-name}.s3_min_upload_part_size: If the amount of data in the write buffer is greater than the value of this parameter, data is written to OSS.
- profile.${your-profile-name}.s3_max_single_part_upload_size: If the amount of data in the write buffer is greater than the value of this parameter, multipart upload is used. For more information about multipart upload, see the OSS documentation.
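To check the values that are currently in effect, you can query the ClickHouse system tables from the client. The following is a minimal sketch; it assumes that your ClickHouse version exposes these settings under the names that are used in the queries.
-- Confirm that zero-copy replication of remote data is enabled (server-config).
SELECT name, value FROM system.merge_tree_settings
WHERE name = 'allow_remote_fs_zero_copy_replication';
-- Check the upload thresholds that apply to the current profile (server-users).
SELECT name, value FROM system.settings
WHERE name IN ('s3_min_upload_part_size', 's3_max_single_part_upload_size');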