ApsaraDB for ClickHouse Enterprise Edition is a cloud service built on open source ClickHouse. Its architecture and features differ from those of open source ClickHouse. For background information about ClickHouse Cloud, visit the official ClickHouse website. This topic describes the architecture and compatibility of ApsaraDB for ClickHouse Enterprise Edition.
Architecture
ApsaraDB for ClickHouse Enterprise Edition significantly reduces the operational overhead and cost of running ClickHouse at scale. You do not need to size a cluster in advance, configure replicas for high availability, or shard data manually. The service automatically scales up your servers when your workload increases and scales them down when your workload decreases.
This architecture provides the following benefits:
Computing and storage are separated, so each can be scaled automatically and independently. You do not need to over-provision either storage or computing resources, as you would with a static instance configuration.
Tiered storage and multi-level caching based on object storage are used to provide virtually limitless scaling and high cost efficiency. You do not need to determine the size of your storage partition in advance or worry about high storage costs.
By default, high availability is enabled. Replication is transparently managed. You can focus on building your applications or analyzing your data.
By default, automatic scaling for variable continuous workloads is enabled. You do not need to determine your server size in advance because this feature can scale up your servers when your workload increases, and scale down your servers when your workload decreases.
Advanced scaling controls provide the ability to set an auto-scaling maximum for additional cost control or an auto-scaling minimum to reserve computing resources for applications with specialized performance requirements.
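The auto-scaling minimum and maximum can be pictured as a clamp on whatever capacity the service would otherwise choose. The following Python sketch is an illustration only and is not the service's scaling algorithm. The function name, the capacity unit, and the numbers are hypothetical.

```python
def clamp_capacity(desired: int, minimum: int, maximum: int) -> int:
    """Return the capacity that would be provisioned given user-set bounds.

    All values use a hypothetical capacity unit; this is not an
    ApsaraDB for ClickHouse API.
    """
    if minimum > maximum:
        raise ValueError("minimum must not exceed maximum")
    # Never drop below the reserved minimum, never exceed the cost cap.
    return max(minimum, min(desired, maximum))


print(clamp_capacity(desired=2, minimum=8, maximum=32))   # 8: minimum reserves resources
print(clamp_capacity(desired=64, minimum=8, maximum=32))  # 32: maximum caps cost
print(clamp_capacity(desired=16, minimum=8, maximum=32))  # 16: within bounds, unchanged
```

A higher minimum trades cost for reserved computing resources, and a lower maximum trades peak performance for cost control.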
Compatibility
ApsaraDB for ClickHouse Enterprise Edition provides a set of critical features that are available in open source ClickHouse. The following section describes some features enabled in ApsaraDB for ClickHouse Enterprise Edition:
DDL syntax
In most cases, the DDL syntax of ApsaraDB for ClickHouse Enterprise Edition matches the syntax that is available in self-managed ClickHouse. However, the following exceptions exist:
ApsaraDB for ClickHouse Enterprise Edition does not support the CREATE AS SELECT statement. We recommend that you use the CREATE ... EMPTY ... AS SELECT statement and insert data into the table created by using the statement. For more information, see Getting Data Into ClickHouse - Part 1.
ApsaraDB for ClickHouse Enterprise Edition does not support some experimental syntax, such as the ALTER TABLE … MODIFY QUERY statement.
For security purposes, ApsaraDB for ClickHouse Enterprise Edition disables some default features, such as the addressToLine SQL function.
ApsaraDB for ClickHouse Enterprise Edition does not support the ON CLUSTER parameter.
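The recommended replacement for CREATE AS SELECT is a two-step pattern: create an empty table whose structure comes from the query, then load the data with a separate INSERT. The following Python sketch only builds the two SQL strings. The table names events and events_copy, the column event_id, and the helper name are hypothetical. The clause order is assumed to follow open source ClickHouse, so verify it against the documentation for your version.

```python
from typing import List


def empty_then_insert(target: str, order_by: str, select_sql: str) -> List[str]:
    """Build the two statements that stand in for CREATE ... AS SELECT.

    Hypothetical helper for illustration; it does no quoting or validation.
    """
    # Step 1: EMPTY creates the table with the SELECT's column structure
    # but copies no rows.
    create = (
        f"CREATE TABLE {target} ENGINE = MergeTree "
        f"ORDER BY {order_by} EMPTY AS {select_sql}"
    )
    # Step 2: load the data in a separate statement.
    insert = f"INSERT INTO {target} {select_sql}"
    return [create, insert]


for statement in empty_then_insert("events_copy", "event_id", "SELECT * FROM events"):
    print(statement + ";")
```

Neither statement uses ON CLUSTER, which this edition does not support.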
Database and table engines
ApsaraDB for ClickHouse Enterprise Edition provides high-availability services by default. The following table engines are supported:
SharedMergeTree (default, when none is specified)
SharedSummingMergeTree
SharedAggregatingMergeTree
SharedReplacingMergeTree
SharedCollapsingMergeTree
SharedVersionedCollapsingMergeTree
MergeTree (converted to SharedMergeTree)
SummingMergeTree (converted to SharedSummingMergeTree)
AggregatingMergeTree (converted to SharedAggregatingMergeTree)
ReplacingMergeTree (converted to SharedReplacingMergeTree)
CollapsingMergeTree (converted to SharedCollapsingMergeTree)
VersionedCollapsingMergeTree (converted to SharedVersionedCollapsingMergeTree)
URL
View
MaterializedView
GenerateRandom
Null
Buffer
Memory
Deltalake
Hudi
MySQL
MongoDB
NATS
PostgreSQL
Kafka
S3
Note: ApsaraDB for ClickHouse Enterprise Edition simplifies table creation. Therefore, you do not need to use a distributed table engine.
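The engine conversions listed above follow a simple rule: each MergeTree-family engine maps to the same name with a Shared prefix, and SharedMergeTree is used when no engine is specified. The following Python sketch restates that rule from the list. The function name is hypothetical, and the function does not check whether an engine is supported.

```python
from typing import Optional

# MergeTree-family engines that the list above marks as converted.
CONVERTED_ENGINES = {
    "MergeTree",
    "SummingMergeTree",
    "AggregatingMergeTree",
    "ReplacingMergeTree",
    "CollapsingMergeTree",
    "VersionedCollapsingMergeTree",
}


def effective_engine(requested: Optional[str]) -> str:
    """Return the engine a table ends up with, per the list above."""
    if requested is None:
        return "SharedMergeTree"  # default when none is specified
    if requested in CONVERTED_ENGINES:
        return "Shared" + requested
    return requested  # other engines (URL, Kafka, S3, ...) are not converted


print(effective_engine(None))                  # SharedMergeTree
print(effective_engine("ReplacingMergeTree"))  # SharedReplacingMergeTree
print(effective_engine("Kafka"))               # Kafka
```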
Dictionaries
ApsaraDB for ClickHouse Enterprise Edition can load dictionaries from PostgreSQL, MySQL, remote and local ClickHouse servers, Redis, MongoDB, and HTTP sources. Dictionaries speed up lookups in ClickHouse.
Federated queries
ApsaraDB for ClickHouse Enterprise Edition supports federated ClickHouse queries for cross-cluster communication in the cloud, and for communication with external self-managed ClickHouse clusters. The following integration engines are supported:
Deltalake
Hudi
MySQL
MongoDB
NATS
PostgreSQL
OSS
ApsaraDB for ClickHouse Enterprise Edition does not support federated queries with some external database and table engines, such as SQLite, Open Database Connectivity (ODBC), Java Database Connectivity (JDBC), Redis, RabbitMQ, Hadoop Distributed File System (HDFS), and Hive.
Experimental features
Experimental features are used to test new features or implement potential improvements in ClickHouse, such as new SQL syntax, query optimization, and performance optimization. In the development environment, you can enable experimental features. By default, experimental features are disabled in the production environment of ApsaraDB for ClickHouse Enterprise Edition to ensure the stability of the production environment. If you want to enable an experimental feature in your production environment, contact Alibaba Cloud technical support. This ensures that the stability of your production environment is not affected after the feature is enabled.
Default operational settings and considerations
The following section describes the default settings of ApsaraDB for ClickHouse Enterprise Edition clusters. In most cases, default values are used for these settings to ensure the normal operation of the service. You can also change the default values in special cases.
Limits settings
max_parts_in_total: 10,000
The max_parts_in_total parameter specifies the maximum number of data parts in a table. Its default value for MergeTree tables has been lowered from 100,000 to 10,000 because a large number of data parts may slow down service startup in the cloud. A large number of data parts usually indicates that the partition key is excessively granular, which typically happens by accident and should be avoided. The lower default allows these cases to be detected earlier.
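A quick estimate shows how partition key granularity relates to the 10,000 limit. Every non-empty partition holds at least one data part, so the number of partitions is a lower bound on the number of parts. The following Python sketch compares a daily and an hourly partition key. The three-year date range is hypothetical, and the sketch assumes that every partition contains data.

```python
from datetime import date

MAX_PARTS_IN_TOTAL = 10_000  # default in Enterprise Edition, per this topic


def min_part_count(start: date, end: date, partitions_per_day: int) -> int:
    """Lower bound on parts: each non-empty partition holds at least one part."""
    days = (end - start).days + 1  # inclusive date range
    return days * partitions_per_day


start, end = date(2022, 1, 1), date(2024, 12, 31)  # hypothetical retention window
daily = min_part_count(start, end, partitions_per_day=1)
hourly = min_part_count(start, end, partitions_per_day=24)

print(daily, daily > MAX_PARTS_IN_TOTAL)    # 1096 False
print(hourly, hourly > MAX_PARTS_IN_TOTAL)  # 26304 True
```

Under these assumptions, the hourly key exceeds the limit on partition count alone, before accounting for partitions that hold multiple unmerged parts.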
System settings
ApsaraDB for ClickHouse Enterprise Edition is tuned for variable workloads. Therefore, most system settings cannot be configured. Most users do not need to tune system settings. If you have a question about advanced system tuning, contact Alibaba Cloud technical support.