
ApsaraDB for ClickHouse:Migrate data from a self-managed ClickHouse cluster to ApsaraDB for ClickHouse Enterprise Edition

Last Updated: Feb 10, 2026

This topic describes how to migrate data from a self-managed ClickHouse cluster to an ApsaraDB for ClickHouse Enterprise Edition cluster using the console or manual migration.

Prerequisites

  • Self-managed cluster: You have created a database account and password. The account must have read permissions on databases and tables and permission to execute SYSTEM commands. To migrate external tables that contain account passwords, the account must also have the displaySecretsInShowAndSelect permission.

  • Destination cluster: You have created a database account and password with full permissions.

  • Network connectivity

    • If the self-managed cluster and the destination cluster are in the same VPC, add the IP addresses of all nodes in the destination cluster and the IPv4 CIDR block of its vSwitch to the whitelist of the self-managed cluster.

      • For information about how to configure a whitelist in ApsaraDB for ClickHouse, see Configure a whitelist.

      • For information about how to configure a whitelist in your self-managed cluster, see the documentation of your product.

      • You can run the SELECT * FROM system.clusters WHERE internal_replication = 1; command to view the IP addresses of all nodes in the ApsaraDB for ClickHouse cluster.

    • If the self-managed cluster and the destination cluster are in different VPCs, or if the self-managed cluster is hosted in an on-premises data center or another cloud provider, resolve network connectivity issues first. For more information, see How do I establish network connectivity between the destination cluster and the data source?.

      Note

      In this case, IP mapping may be required to avoid CIDR block conflicts between different VPCs. If you perform IP mapping, add the mapped IP addresses to the whitelists of both clusters.
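The source-side account described in the prerequisites can be set up with statements along the following lines. This is a sketch: the account name migrator is a placeholder, and the exact grant list should follow your own security policy.

```sql
-- Hypothetical read-only migration account on the self-managed cluster.
-- "migrator" and the grants shown are illustrative, not a fixed recipe.
CREATE USER migrator IDENTIFIED WITH sha256_password BY '<password>';
GRANT SELECT ON *.* TO migrator;                         -- read databases and tables
GRANT SYSTEM ON *.* TO migrator;                         -- execute SYSTEM commands
GRANT displaySecretsInShowAndSelect ON *.* TO migrator;  -- only if external tables hold passwords
```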

Migration validation

Before you start migrating data, create a staging environment to validate business compatibility, performance, and whether the migration can complete successfully. After successful validation, proceed with the migration in the production environment. This step is critical because it helps you identify and resolve potential issues early, ensuring a smooth migration and avoiding unnecessary impact on your production environment.

  1. Create a migration task to migrate data.

  2. Perform performance bottleneck analysis to determine whether the migration can complete.

  3. You can validate compatibility in one of the following ways:

    1. Manual validation. For more information, see Compatibility analysis and resolution.

    2. Console-based validation. For more information, see (Optional) Check SQL compatibility.

Select a solution

| Migration solution | Advantages | Disadvantages | Scenarios |
|---|---|---|---|
| Console-based migration | Provides a visual interface and does not require manual metadata migration. | Supports only full and incremental migration of entire clusters. You cannot migrate specific databases, tables, or historical data subsets. | Full cluster migration. |
| Manual migration | Gives you full control over which databases and tables to migrate. | Requires complex operations and manual metadata migration. | Migrate specific databases or tables; cold data on a single node exceeds 1 TB; hot data on a single node exceeds 10 TB; full cluster migration that does not meet the requirements for console-based migration. |

Procedure

Console-based migration

Notes

During migration

  • Merge operations are paused for tables being migrated in the destination cluster, but not in the self-managed cluster.

    Note

    If the migration takes too long, metadata accumulates excessively in the destination cluster. We recommend that migration tasks run for no more than five days. Tasks that exceed this duration are automatically canceled.

  • The destination cluster uses the cluster name default. If your self-managed cluster uses a different cluster name, the cluster definition in distributed tables is automatically converted to default.

Supported migration content

Note

Some table engines are converted during migration. For details about engine conversions after migration, see the following tables.

  • Database schema: The following table lists supported database engines and their conversion rules.

    | Engine name | Engine conversion description |
    |---|---|
    | Atomic | Replaced with a Replicated database |
    | Replicated | No change |
    | Ordinary | Replaced with a Replicated database |

  • Table schema: The following table lists the table engines that support migration.

    | Engine name | Engine conversion description |
    |---|---|
    | MaterializedView, View, GenerateRandom, Buffer, URL, Null, Merge, SharedMergeTree, SharedVersionedCollapsingMergeTree, SharedSummingMergeTree, SharedReplacingMergeTree, SharedAggregatingMergeTree, SharedCollapsingMergeTree, SharedGraphiteMergeTree | No change |
    | MergeTree, ReplicatedMergeTree | Replaced with SharedMergeTree |
    | VersionedCollapsingMergeTree, ReplicatedVersionedCollapsingMergeTree | Replaced with SharedVersionedCollapsingMergeTree |
    | SummingMergeTree, ReplicatedSummingMergeTree | Replaced with SharedSummingMergeTree |
    | ReplacingMergeTree, ReplicatedReplacingMergeTree | Replaced with SharedReplacingMergeTree |
    | AggregatingMergeTree, ReplicatedAggregatingMergeTree | Replaced with SharedAggregatingMergeTree |
    | CollapsingMergeTree, ReplicatedCollapsingMergeTree | Replaced with SharedCollapsingMergeTree |
    | GraphiteMergeTree, ReplicatedGraphiteMergeTree | Replaced with SharedGraphiteMergeTree |

  • Data: Incrementally migrates data from MergeTree-family tables.

Important
  • The database and table schemas listed above can be migrated normally. For other schemas, handle them manually based on warnings or error messages during migration.

  • If your data does not meet the preceding conditions, choose manual migration.

Cluster impact

  • Self-managed cluster

    • CPU and memory usage increase while reading data from the self-managed cluster.

    • DDL operations are not allowed.

  • Destination cluster

    • CPU and memory usage increase while writing data to the destination cluster.

    • DDL operations are not allowed on tables being migrated. Tables not involved in migration are unaffected.

    • Merge operations are paused for tables being migrated. Tables not involved in migration continue merging.

    • After migration ends, the cluster performs frequent merge operations for some time, increasing I/O usage and potentially increasing latency for business requests. We recommend that you calculate the merge duration after migration and plan accordingly to mitigate potential latency impacts.

Step 1: Check the self-managed cluster and enable system tables

Before data migration, modify the config.xml file based on whether system.part_log and system.query_log are enabled in the self-managed cluster to support incremental migration.

If system.part_log and system.query_log are not enabled

If you have not enabled system.part_log and system.query_log, add the following configurations to the config.xml file.

system.part_log
<part_log>
    <database>system</database>
    <table>part_log</table>
    <partition_by>event_date</partition_by>
    <order_by>event_time</order_by>
    <ttl>event_date + INTERVAL 15 DAY DELETE</ttl>
    <flush_interval_milliseconds>7500</flush_interval_milliseconds>
</part_log>
system.query_log
<query_log>
    <database>system</database>
    <table>query_log</table>
    <partition_by>event_date</partition_by>
    <order_by>event_time</order_by>
    <ttl>event_date + INTERVAL 15 DAY DELETE</ttl>
    <flush_interval_milliseconds>7500</flush_interval_milliseconds>
</query_log>

If system.part_log and system.query_log are enabled

  1. Compare the configurations of system.part_log and system.query_log in the config.xml file with the following content. If there are any inconsistencies, modify the configurations to match. Otherwise, the migration may fail or proceed slowly.

    system.part_log
    <part_log>
        <database>system</database>
        <table>part_log</table>
        <partition_by>event_date</partition_by>
        <order_by>event_time</order_by>
        <ttl>event_date + INTERVAL 15 DAY DELETE</ttl>
        <flush_interval_milliseconds>7500</flush_interval_milliseconds>
    </part_log>
    system.query_log
    <query_log>
        <database>system</database>
        <table>query_log</table>
        <partition_by>event_date</partition_by>
        <order_by>event_time</order_by>
        <ttl>event_date + INTERVAL 15 DAY DELETE</ttl>
        <flush_interval_milliseconds>7500</flush_interval_milliseconds>
    </query_log>
  2. After you modify the configuration, run the DROP TABLE system.part_log and DROP TABLE system.query_log statements. The system.part_log and system.query_log tables are automatically recreated when you insert data into a business table.
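    The statements in this step can be run as shown below; both tables are recreated automatically the next time data is inserted into a business table.

    ```sql
    -- Drop the old log tables so they are recreated with the new settings.
    DROP TABLE system.part_log;
    DROP TABLE system.query_log;
    ```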

Step 2: Configure the destination cluster to match the self-managed cluster version

To ensure the destination cluster behaves as similarly as possible to the self-managed cluster, connect to the destination cluster and set the compatibility parameter to match the version of the self-managed cluster.

Important

Setting compatibility to a lower version disables some new features, such as ParallelReplica.

Example:

SELECT currentProfiles(); -- Get the profiles used by the current user
SELECT
    profile_name,
    setting_name,
    value
FROM system.settings_profile_elements
WHERE (setting_name = 'compatibility') AND (profile_name = 'xxxx'); -- Query the compatibility setting
ALTER PROFILE xxxx SETTINGS compatibility = '23.8'; -- Modify the profile

Step 3: Create a migration task

  1. Log on to the ApsaraDB for ClickHouse console. On the Clusters page, select Enterprise Edition Clusters and click the ID of the destination cluster.

  2. In the navigation pane on the left, choose Data Migration and Synchronization > Migration from ClickHouse.

  3. Click Create Migration Task.

  4. Select source and destination instances.

    Configure the following items:

    • Task Name: The task name can contain only letters and digits. Names are case-insensitive and must be unique. Example: MigrationTask1229.

    • Source Cluster Name: Run SELECT * FROM system.clusters; on the self-managed instance to get its cluster name. Example: default.

    • VPC IP Address: The IP address and port of each shard in the cluster, separated by commas, in the format IP:PORT,IP:PORT,.... Example: 192.168.0.5:9000,192.168.0.6:9000.

      Run the following SQL statement to get the IP addresses and ports of the self-managed cluster:

      SELECT shard_num, replica_num, host_address AS ip, port FROM system.clusters WHERE cluster = '<cluster_name>' AND replica_num = 1;

      Parameters:

      • cluster_name: The name of the self-managed (source) cluster.

      • replica_num = 1 selects the first replica. You can also select other replicas or manually pick one replica per shard.

      Important

      • Do not use the VPC domain name or SLB address of ClickHouse.

      • If IP and port mapping is applied, configure the mapped IP address and port based on your network connectivity setup.

    • Database Account: The database account of the self-managed cluster. Example: test.

    • Database Password: The password of the database account for the self-managed cluster. Example: test******.

    • Source Instance Kernel Version: Click Get Version. Example: 22.8.5.29.

  5. Based on the source instance version, perform the corresponding operation:

    • If the source instance version is 22.10 or later, click Next.

    • If the source instance version is earlier than 22.10, enter the Destination Instance Information as prompted and click Next.

    • If version retrieval fails, check for errors such as incorrect source instance information or network connectivity issues. Fix the issues and click Get Version again.

    Note

    Due to parameter incompatibilities between older community editions and Enterprise Edition, if the source instance version is earlier than 22.10, you must use push-based synchronization from source to destination. In this case, you must map the destination IP to the self-managed network. If the self-managed network and the Enterprise Edition instance are in the same VPC or connected via VPC peering, you can use the original IP addresses directly.

  6. Check connectivity and configuration.

    1. Click Start Check.

      Click to view check items.

      • Connectivity verification: Full network connectivity between the self-managed and destination instances, with mutual access between all nodes.

      • Account permission verification: The source account and password are correct and can connect to the source instance.

      • Source instance system table check: The self-managed instance must have the following three system tables: system.query_log, system.parts, and system.part_log.

      • Configuration check: The self-managed and destination instances must have the same time zone, and the compatibility setting of the destination instance must match the source version.
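      As a sketch, you can pre-verify the configuration check by running the following queries on both clusters and comparing the results:

      ```sql
      SELECT timezone();  -- server time zone, must match on both clusters
      SELECT value FROM system.settings WHERE name = 'compatibility';  -- must match the source version
      ```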

    2. During the check, click the icon in the upper-right corner to view real-time progress.

    3. After the check completes, proceed based on the results.

      Select a result level and check item, then click the corresponding button to view details. Results are interpreted as follows:

      • Success: If all checks pass, click Next to continue.

      • Warning: These are non-blocking issues. Confirm whether they affect your business or migration. You can ignore the warning or fix the issue and click Start Check again to recheck.

      • Error: These are blocking issues. Fix the error and click Start Check again to recheck.

        For error messages and solutions, see FAQ.

  7. Check database and table schemas.

    1. Click Start Check.

    2. During the check, click the icon in the upper-right corner to view real-time progress.

    3. After the check completes, proceed based on the results.

      For result interpretation, see Step 5.

  8. Migrate database and table schemas.

    1. Click Start Migration.

    2. During migration, click the icon in the upper-right corner to view real-time progress.

    3. After the migration step completes, proceed based on the results.

      For result interpretation, see Step 5.

  9. (Optional) Check SQL compatibility.

    SQL compatibility checking replays SQL statements from the self-managed instance on the destination instance to verify syntax compatibility across kernel versions. Decide whether to perform this step based on your needs.

    • To skip, click Skip.

    • To perform the check, select a request replay time and click Start Check. After a successful check, click Next. For failed checks, see Step 5 for resolution.

      Important
      • This check validates syntax compatibility only and does not require data. To test with data, migrate partial data in the next step first.

      • False positives may occur due to client version mismatches with the destination instance. Validate SQL statements manually if needed.

  10. Start synchronization.

    1. Click Start Sync.

    2. During synchronization, click the icon in the upper-right corner to view real-time progress.

      You can control the migration process using the following operations: Stop, Restart, and Cancel Migration. Click to view operation meanings and impacts.

      • Stop

        • Definition: Immediately stops data migration but continues migrating remaining table schemas.

        • Impact: Some data might not be migrated completely. Before restarting migration, clean up migrated data in the destination cluster to avoid duplicates.

        • Scenarios: Stop the migration task after full data migration, or test after migrating partial data without stopping writes to the self-managed cluster.

      • Restart

        • Definition: Retries the current step after resolving errors encountered during schema or data migration.

        • Impact: None.

        • Scenario: Resume migration from the breakpoint after fixing an error.

      • Cancel Migration

        • Definition: Forcefully cancels the task and skips subsequent steps.

          Important

          After cancellation, the migration task is locked and cannot be modified. Use the Previous, Next, or Refresh buttons to view step results.

        • Impact: The migration is terminated. The destination instance may have incomplete schemas or configurations and cannot be used for business. Before restarting migration, clean up migrated data in the destination cluster to avoid duplicates.

        • Scenario: End migration quickly to resume writes to the self-managed cluster.
    3. When the process reaches Migrate Data, switch to the Migrate Data tab and click the corresponding button to view Migration Progress and the estimated remaining time.

      Click to learn how to evaluate whether migration can complete.

      Migration success depends on the relationship between migration speed and write speed of the self-managed cluster.

      • Test migration speeds are as follows:

        | Average part size | Source instance type | Source disk type | Destination instance type | Destination storage medium | Number of cluster nodes | Migration speed per node | Overall migration speed |
        |---|---|---|---|---|---|---|---|
        | 402.54 MB | 8C32G | PL1 | 16CCU | OSS | 16 | 47 MB/s | 752.34 MB/s |
        | 402.54 MB | 80C384G | PL3 | 48CCU | ESSD_L2 | 8 | 197.74 MB/s | 1581.95 MB/s |

      • Compare write speeds of the destination and self-managed clusters:

        Migration speed depends on factors such as part size (in tests, average part sizes between 100 MB and 10 GB maintained high speeds), instance type, disk type, and data characteristics. Therefore, test data is for reference only. Determine the actual write speed of the destination cluster by monitoring its disk throughput. For more information, see View cluster monitoring information.

        • If the destination cluster write speed is lower than the self-managed cluster write speed, migration has a high failure rate. Cancel the migration task and use manual migration.

        • If the destination cluster write speed is higher than the self-managed cluster write speed, proceed. To improve success rate, ensure the migration duration calculated as data volume / (migration speed - self-managed write speed) is five days or less.
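      For example, the duration formula above can be evaluated directly in SQL. The figures here are hypothetical, not benchmark results:

      ```sql
      -- Hypothetical: 10 TiB to migrate, 750 MB/s migration speed,
      -- 200 MB/s of ongoing writes on the self-managed cluster.
      SELECT (10 * 1024 * 1024) / (750 - 200) / 86400 AS estimated_days;
      -- about 0.22 days, well under the five-day limit
      ```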

      Important

      Monitor the Migration Progress closely. Based on the estimated remaining time, stop writes to the self-managed cluster and handle Kafka and RabbitMQ engine tables.

      Click to learn how to estimate when to stop writes to the self-managed cluster.

      To ensure data completeness after migration, stop business writes and delete Kafka and RabbitMQ tables before traffic switchover. Perform the following steps:

      1. Log on to the self-managed cluster and run the following statement to query tables to handle:

        SELECT * FROM system.tables WHERE engine IN ('RabbitMQ', 'Kafka');
      2. View the CREATE TABLE statement for the target table:

        SHOW CREATE TABLE <target_table_name>;
      3. Connect to the destination cluster and run the CREATE TABLE statement obtained in the previous step.

      4. Log on to the self-managed cluster and delete migrated Kafka and RabbitMQ engine tables.

        Important

        When deleting Kafka tables, also delete materialized views that reference them. Otherwise, materialized view migration fails, causing the entire migration to fail.

    4. When Migration Progress reaches 100% and you confirm the source instance has stopped writes, click Stop to end the migration and proceed.


    5. After synchronization completes, click Completed.

      Important

      After the "Start synchronization" step completes, the migration task is locked and cannot be modified. Use the Previous, Next, or Refresh buttons to view step results.

Step 4: Migrate business data for non-MergeTree tables

During migration, non-MergeTree tables support only schema migration (such as MySQL tables) or no migration at all (such as Log tables). After migration, some tables in the destination cluster may have schemas but no data. Complete data migration manually as follows:

  1. Log on to the self-managed cluster and run the following statement to identify non-MergeTree tables requiring data migration:

    SELECT
        `database` AS database_name,
        `name` AS table_name,
        `engine`
    FROM `system`.`tables`
    WHERE (`engine` NOT LIKE '%MergeTree%') AND (`engine` != 'Distributed') AND (`engine` != 'MaterializedView') AND (`engine` NOT IN ('Kafka', 'RabbitMQ')) AND (`database` NOT IN ('system', 'INFORMATION_SCHEMA', 'information_schema')) AND (`database` NOT IN (
        SELECT `name`
        FROM `system`.`databases`
        WHERE `engine` IN ('MySQL', 'MaterializedMySQL', 'MaterializeMySQL', 'Lazy', 'PostgreSQL', 'MaterializedPostgreSQL', 'SQLite')
    ))
  2. Log on to the destination cluster and use the remote function to migrate data.
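    A pull using the remote function might look as follows. The address, database, table, and credentials are placeholders for illustration.

    ```sql
    -- Run on the destination cluster; all identifiers are placeholders.
    INSERT INTO db.log_table
    SELECT * FROM remote('192.168.0.5:9000', 'db', 'log_table', 'test', '<password>');
    ```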

Manual migration

(Figure: migrating data from a self-managed ClickHouse cluster to ApsaraDB for ClickHouse Enterprise Edition)

Note

In ApsaraDB for ClickHouse Enterprise Edition, regardless of whether your source table is sharded or replicated, simply create the corresponding destination table (omit the Engine parameter because the system automatically uses the SharedMergeTree table engine). The ApsaraDB for ClickHouse Enterprise Edition cluster handles vertical and horizontal scaling automatically. You do not need to manage replication or sharding implementation.
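As an example of the note above, a destination table might be created as follows. The table and column names are illustrative; the ENGINE clause is omitted on purpose so that the cluster applies SharedMergeTree automatically.

```sql
-- Illustrative destination table; no ENGINE clause is specified.
CREATE TABLE db.events
(
    event_time DateTime,
    user_id    UInt64,
    payload    String
)
ORDER BY (user_id, event_time);
```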

Overview

The procedure for migrating data from a self-managed ClickHouse cluster to an ApsaraDB for ClickHouse Enterprise Edition cluster is as follows.

  1. Add a read-only user to the source cluster.

  2. Copy the source table schema to the destination cluster.

  3. If the source cluster supports external network access, pull data from the source cluster to the destination cluster. If not, push data from the source cluster to the destination cluster.

  4. (Optional) Remove the IP address of the source cluster from the destination cluster whitelist.

  5. Remove the read-only user from the source cluster.

Procedure

  1. Perform the following operations on the source cluster, which holds the source table data:

    1. Add a read-only user for table db.table.

      CREATE USER exporter
      IDENTIFIED WITH SHA256_PASSWORD BY 'password-here'
      SETTINGS readonly = 1;
      GRANT SELECT ON db.table TO exporter;
    2. Copy the source table schema.

      SELECT create_table_query
      FROM system.tables
      WHERE database = 'db' and table = 'table'
  2. Perform the following operations on the destination cluster.

    1. Create a database.

      CREATE DATABASE db
    2. Create the destination table using the CREATE TABLE statement from the source table.

      Note

      When running the CREATE TABLE statement, change ENGINE to SharedMergeTree but do not include any parameters because the ApsaraDB for ClickHouse Enterprise Edition cluster always replicates tables and provides correct parameters. Retain the ORDER BY, PRIMARY KEY, PARTITION BY, SAMPLE BY, TTL, and SETTINGS clauses to ensure the table is created correctly in the destination ApsaraDB for ClickHouse Enterprise Edition cluster.

      CREATE TABLE db.table ...
    3. Use the remote function to pull or push data.

      Note

      If the source ClickHouse server is not accessible from external networks, push data instead of pulling it, because the remote function works for both SELECT and INSERT operations.

      • In the destination cluster, use the remote function to pull data from the source table in the source cluster.

        INSERT INTO db.table SELECT * FROM
        remote('source-hostname:9000', db, table, 'exporter', 'password-here')
      • In the source cluster, use the remote function to push data to the destination cluster.

        Note

        To allow the remote function to connect to your ApsaraDB for ClickHouse Enterprise Edition cluster, add the source cluster IP address to the destination cluster whitelist. For more information, see Configure a whitelist.

        INSERT INTO FUNCTION
        remote('target-hostname:9000', 'db.table',
        'default', 'PASS') SELECT * FROM db.table

FAQ

Error messages and solutions during connectivity and configuration checks

| Error message | Meaning | Solution |
|---|---|---|
| Tcp connectivity check failed for '{host}:{port}':{error}. | The network connection to the self-managed cluster timed out. | Troubleshoot network issues based on the error message. |
| No such cluster: {cluster}, please run 'SELECT DISTINCT(cluster) FROM system.clusters;' to check | The cluster configured during migration task creation does not exist in the self-managed cluster. | Query the self-managed cluster for available clusters using the suggested SQL statement and update the migration task configuration. |
| not exists | One or more of the following system tables are missing in the self-managed cluster: system.query_log, system.parts, system.part_log. | Create the missing system tables in the self-managed cluster. |
| Timezone mismatch with source, which may cause time data anomalies. | The time zones of the self-managed and destination clusters do not match. | Modify the time zone so the two clusters match. |
| Compatibility mismatch with source version, which may cause incompatibility. | The compatibility setting of the destination cluster does not match the version of the self-managed cluster. | Adjust the compatibility setting of the destination cluster to match the self-managed cluster version. Note that setting compatibility to a lower version disables some new features, such as ParallelReplica. |

Error messages and solutions during schema checks

| Error message | Meaning | Solution |
|---|---|---|
| ERROR: Not consistent across nodes. | Database tables are inconsistent across nodes in the self-managed cluster. | Check and resolve the inconsistencies across nodes in the self-managed cluster. |
| ERROR: Cannot get secrets (shown as [HIDDEN]), please set display_secrets_in_show_and_select=1 (restart required). | Table passwords are hidden. | Set display_secrets_in_show_and_select=1 and restart. This requires the account to have the displaySecretsInShowAndSelect permission. |
| ERROR: Unsupported engine. | The database engine in the self-managed cluster does not support migration. | Consider switching to a database engine supported by the destination cluster. |
| WARN: Unsupported engine, it will be automatically replaced with a Replicated database to bypass migration exceptions. | The database engine in the self-managed cluster does not support migration. | No action required. The system automatically replaces it with a Replicated database to bypass migration exceptions. |
| WARN: Unsupported engine, please replace the data synchronization capability with DTS, or create a same-name database to bypass migration exceptions. | The database engine in the self-managed cluster does not support migration. | Use Data Transmission Service (DTS) for synchronization, or create a same-name database to bypass migration exceptions. |
| WARN: Unsupported engine, it will be automatically ignored during migration. | The database engine in the self-managed cluster does not support migration. | No action required. The engine is automatically ignored during migration. |
| WARN: It's not recommended to use the Distributed engine as it will cause scaling issues in enterprise instances. Please drop this table and query the underlying MergeTree table directly. | ApsaraDB for ClickHouse Enterprise Edition does not recommend using distributed tables. | Delete the distributed table in the self-managed cluster and query the underlying MergeTree table directly after migration. |
| WARN: Please confirm referenced IP addresses are accessible. | External engines are referenced, and access to the referenced addresses is not guaranteed. | Confirm that the referenced IP addresses are accessible from the destination instance. If not, establish network connectivity and add the IP addresses to the whitelist. |
| WARN: Only structure, does not support data migration. | For some engines, only the table schema is migrated; data migration is not supported. | Migrate data manually using the remote function or other methods. |
| WARN: Unsupported engine, please create a same-name MergeTree table manually to bypass migration exceptions. | Data migration is not supported for some engines. | Create a same-name MergeTree table in the destination cluster and migrate the data manually. |
| WARN: Ignored engine, please create table manually. | Data migration is not supported for some engines. | See Step 4 of the console-based migration procedure. |
| ERROR: Table has data in destination cluster. | The corresponding table in the destination instance must be empty during schema checks. | Delete the data from the corresponding table in the destination instance. |
| ERROR: Unsupported function origin. | Only user-defined functions (function.origin = "SQLUserDefined") are supported for migration. | Create the corresponding function manually in the destination instance. |

Other

For other issues during migration and their solutions, see FAQ.