If you want to migrate data from an ApsaraDB for ClickHouse Community-compatible Edition cluster to another ApsaraDB for ClickHouse cluster of the same edition, you can use the cluster migration feature in the ApsaraDB for ClickHouse console. This feature supports full data migration and incremental data migration to ensure the integrity of your data.
Prerequisites
Both the source and destination clusters must meet the following requirements:
The clusters are ApsaraDB for ClickHouse Community-compatible Edition clusters.
The clusters are in the Running state.
Database accounts and their passwords are created for the clusters.
Tiered storage of hot data and cold data is configured consistently for both clusters: if the feature is enabled for the source cluster, it must also be enabled for the destination cluster, and if it is disabled for the source cluster, it must also be disabled for the destination cluster.
The clusters are deployed in the same region and use the same virtual private cloud (VPC). The IP address of the source cluster is added to the whitelist of the destination cluster, and the IP address of the destination cluster is added to the whitelist of the source cluster. If the clusters cannot connect to each other, resolve the network issue first. For more information, see the What do I do if a connection fails to be established between the destination cluster and the data source? section of the FAQ topic.
Note: You can execute the SELECT * FROM system.clusters; statement to query the IP addresses of an ApsaraDB for ClickHouse cluster, as shown in the system.clusters query sketch after this list. For more information about how to configure a whitelist, see Configure a whitelist.
The destination cluster must meet the following additional requirements:
The version of the destination cluster is later than or the same as the version of the source cluster.
The available storage space of the destination cluster is greater than or equal to 1.2 times the occupied storage space of the source cluster. You can estimate the occupied storage space of the source cluster by using the storage query sketch after this list.
Each local table in the source cluster corresponds to a unique distributed table.
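For the whitelist configuration, the following query is a minimal sketch of how to look up the node IP addresses of a cluster so that you can add them to the whitelist of the other cluster. Run it on both clusters; the column names come from the standard system.clusters table.

-- Query the node IP addresses of the current cluster.
-- Add the returned host_address values to the whitelist of the other cluster.
SELECT cluster, shard_num, host_name, host_address, port
FROM system.clusters;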
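To check the 1.2 times storage requirement, the following query is a rough sketch that estimates the occupied storage space of the source cluster by summing the on-disk size of all active data parts. Treat the result as an approximation.

-- Estimate the occupied storage space on the current node.
SELECT formatReadableSize(sum(bytes_on_disk)) AS occupied_space
FROM system.parts
WHERE active;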
Usage notes
During the migration, the destination cluster stops merging data parts, but the source cluster does not.
Migration content:
You can migrate the following objects from the source cluster: the cluster, databases, tables, data dictionaries, materialized views, user permissions, and cluster configurations.
You cannot migrate Kafka or RabbitMQ tables.
Important: To ensure that Kafka and RabbitMQ data is not sharded, you must delete the Kafka and RabbitMQ tables in the source cluster and then create the corresponding tables in the destination cluster, or use different consumer groups. A sample statement for deleting such a table is provided after this list.
You can migrate only the schemas of non-MergeTree tables such as external tables and log tables.
Note: If the source cluster contains non-MergeTree tables, these tables in the destination cluster have only a table schema and no business data after the migration. You can use the remote function to migrate the business data, as shown in the remote sketch after this list. For more information, see the Use the remote function to migrate data section of the Migrate data from a self-managed ClickHouse cluster to an ApsaraDB for ClickHouse cluster topic.
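As a hedged example of how you might remove a Kafka or RabbitMQ table from the source cluster before the migration, the following statement deletes such a table on every node. The database name, table name, and the cluster name default are placeholders; replace them with your own values.

-- Delete the Kafka table engine table on every node of the source cluster.
-- db_name, kafka_queue, and the cluster name default are placeholders.
DROP TABLE db_name.kafka_queue ON CLUSTER default;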
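The following statement is a minimal sketch of how the remote function can be used to copy the business data of a non-MergeTree table after its schema has been migrated. The endpoint, port, database name, table name, and account credentials are placeholders for your own values.

-- Run on the destination cluster after the table schema already exists there.
-- The endpoint, database, table, username, and password are placeholders.
INSERT INTO db_name.table_name
SELECT * FROM remote('source-endpoint:9000', 'db_name', 'table_name', 'user', 'password');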
Impacts of data migration on clusters
During data migration, you can read data from and write data to the tables in the source cluster. However, you cannot perform DDL operations on the source cluster, such as adding, deleting, and modifying the metadata in the databases and tables.
To ensure that the migration task succeeds, the source cluster automatically suspends data write operations when the migration progress reaches 99% and the time falls within the predefined time window for data write suspension.
When all data is migrated or the predefined time window for data write suspension ends, the source cluster automatically resumes data writing.
Procedure
Step 1: Create a migration task
Log on to the ApsaraDB for ClickHouse console.
On the Clusters page, click the Clusters of Community-compatible Edition tab and click the ID of the cluster that you want to manage.
In the left-side navigation pane, click Migrate Instance.
On the page that appears, click Create Migration Task.
Configure the source and destination clusters.
Configure the parameters in the Source Instance Information section and the Destination Instance Information section and click Test Connectivity and Proceed.
Note: After the connection test succeeds, proceed to the Migration Content step. If the connection test fails, configure the source and destination clusters again as prompted.
Confirm the migration content.
On the page that appears, read the information about the data migration content and click Next: Pre-detect and Start Synchronization.
The system performs prechecks on the migration configuration and starts the migration task in the background after the prechecks pass.
The system performs the following prechecks on the source and destination clusters: Cluster Status Detection, Storage Space Detection, and Local Table and Distributed Table Detection.
If the prechecks pass, perform the following operations:
Read the information about the impacts of data migration on clusters.
Configure the Time of Stopping Data Writing parameter.
Note: To ensure that the data migration succeeds, we recommend that you specify a time window of 30 minutes or longer.
A migration task must end within five days after the task is created and started. Therefore, the end time of the Time of Stopping Data Writing parameter must be no later than five days after the current date. To reduce the impact of data migration on your business, we recommend that you configure a time range during off-peak hours.
Click Completed.
Note: After you click Completed, the task is created and started.
If the prechecks fail, follow the on-screen instructions to resolve the issue and then configure the migration task parameters again. The precheck items are described as follows:
Cluster Status Detection: Before you migrate data, make sure that no management operations, such as scale-out, upgrade, or downgrade operations, are being performed on the source cluster or the destination cluster. If such operations are in progress, the system cannot start a migration task.
Storage Space Detection: Before a migration task is started, the system checks the storage space of the source cluster and the destination cluster. Make sure that the storage space of the destination cluster is greater than or equal to 1.2 times the storage space of the source cluster.
Local Table and Distributed Table Detection: If no distributed table is created for a local table of the source cluster, or multiple distributed tables are created for the same local table, the precheck fails. You must delete the redundant distributed tables or create a unique distributed table, as shown in the sketch that follows.
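If this precheck fails, the following statements are a rough sketch of how to fix it. The database name, table names, and the cluster name default are placeholders, and the sharding key rand() is only an example.

-- Create a single distributed table over the local table.
CREATE TABLE db_name.distributed_table ON CLUSTER default
AS db_name.local_table
ENGINE = Distributed('default', 'db_name', 'local_table', rand());

-- Delete a redundant distributed table if more than one exists for the same local table.
DROP TABLE db_name.redundant_distributed_table ON CLUSTER default;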
Step 2: View the migration task
On the Clusters page, click the Clusters of Community-compatible Edition tab and click the ID of the cluster that you want to manage.
In the left-side navigation pane, click Migrate Instance.
On the page that appears, view the following information about the migration task: Migration Status, Migration Progress, and Data Write-Stop Window.
Note: When the migration progress reaches 99% and the migration state is Migrating, data write suspension for the source cluster is triggered. The following rules apply to data write suspension:
If the time when data write suspension is triggered falls within the predefined time window, the source cluster suspends data write operations.
If the time when data write suspension is triggered does not fall within the predefined time window and is no later than five days after the date when the task was created and started, you can modify the time window to continue the migration task.
If the time when data write suspension is triggered does not fall within the predefined time window and is later than five days after the date when the task was created and started, the migration task fails. You must cancel the migration task, clear the migrated data in the destination cluster, and create a new migration task to migrate the data.
(Optional) Step 3: Cancel the migration task
On the Clusters page, click the Clusters of Community-compatible Edition tab and click the ID of the cluster that you want to manage.
In the left-side navigation pane, click Migrate Instance.
Click Cancel Migration in the Actions column of the migration task that you want to manage.
In the Cancel Migration message, click OK.
Note: After the migration task is canceled, the task state is not updated immediately. We recommend that you refresh the page periodically to view the latest task state.
After the task is canceled, the value of the Migration Status parameter for the task changes to Completed.
Before you restart a migration task, you must clear the migrated data in the destination cluster to avoid data duplication.
(Optional) Step 4: Modify the data write-stop time window
On the Clusters page, click the Clusters of Community-compatible Edition tab and click the ID of the cluster that you want to manage.
In the left-side navigation pane, click Migrate Instance.
Click Modify Data Write-Stop Time Window in the Actions column of the migration task that you want to manage.
In the Modify Data Write-Stop Time Window dialog box, configure the Time of Stopping Data Writing parameter.
Note: The rules for setting the Time of Stopping Data Writing parameter are the same as those that apply when you create a migration task.
Click OK.
References
For more information about how to migrate data from a self-managed ClickHouse cluster to an ApsaraDB for ClickHouse cluster, see Migrate data from a self-managed ClickHouse cluster to an ApsaraDB for ClickHouse cluster.