This topic defines the terms used in ApsaraDB for ClickHouse.
Common terms
Region
A region is the geographical location of the servers that run your ApsaraDB for ClickHouse cluster. You must specify a region when you purchase ApsaraDB for ClickHouse. The region cannot be changed after the purchase.
Zone
A zone is a physical area that has an independent power supply and network facilities. Zones in the same region are connected to each other over the internal network. The network latency between clusters in the same zone is lower than the network latency between clusters in different zones.
Database
A database is the object that has the highest level in an ApsaraDB for ClickHouse cluster. A database can contain objects such as tables, columns, views, functions, and data types.
Community-compatible Edition
ApsaraDB for ClickHouse cluster
An ApsaraDB for ClickHouse cluster is a distributed database that runs on multiple physical ClickHouse servers. Depending on its specifications, a cluster can have one or more shards, and each shard can have one or more replicas.
An ApsaraDB for ClickHouse cluster can contain multiple logical database objects.
Edition
ApsaraDB for ClickHouse provides the following cluster editions:
Double-replica:
A shard has two replicas. If one replica in the shard fails, the other replica in the shard takes over the services from the failed replica.
In a Double-replica Edition cluster, each piece of data is replicated to two different replicas. This ensures that the data on both replicas remains consistent.
When you create tables in a Double-replica Edition cluster, make sure that the tables use Replicated table engines from the MergeTree family. If the tables use non-Replicated table engines, the data on the tables is not replicated across replicas. This can lead to data inconsistency.
Single-replica: A shard has only one replica. If the replica fails, the cluster becomes unavailable, and services resume only after the replica is fully restored.
A Double-replica Edition cluster uses twice the number of resources and costs twice as much as a Single-replica Edition cluster.
A Single-replica Edition cluster uses highly reliable disks to prevent data losses.
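The Replicated engine requirement for Double-replica Edition tables can be checked before a table is created. The following is a minimal sketch, assuming that an engine counts as replicated when its name starts with `Replicated` and ends with `MergeTree` (for example, `ReplicatedMergeTree`). The helper `is_replicated_engine` is hypothetical and is not part of any ApsaraDB for ClickHouse or ClickHouse API.

```python
# Hypothetical helper for illustration; not an ApsaraDB or ClickHouse API.
def is_replicated_engine(engine: str) -> bool:
    """Return True if the engine name looks like a Replicated*MergeTree engine."""
    name = engine.strip()
    return name.startswith("Replicated") and name.endswith("MergeTree")


# Engine names that a set of planned tables would use.
engines = ["ReplicatedMergeTree", "MergeTree", "ReplicatedReplacingMergeTree", "Log"]

# Tables on these engines would not be replicated across the two replicas.
unsafe = [e for e in engines if not is_replicated_engine(e)]
print(unsafe)  # ['MergeTree', 'Log']
```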
Shard
When a large amount of data must be processed, the storage and computing resources of a single server can become a performance bottleneck. To avoid this bottleneck, ApsaraDB for ClickHouse distributes the data across multiple servers. Each server stores and processes only a portion of the data. Each such server is called a shard.
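The idea of spreading data across shards can be sketched as follows. This is an illustration only: it assumes a hash of a row key modulo the number of shards, which is one common way to pick a shard. The actual placement in a cluster depends on how the tables are configured. The function `shard_for` and the sample keys are hypothetical.

```python
import zlib


def shard_for(key: str, shard_count: int) -> int:
    """Map a row key to a shard index with a deterministic hash (assumed scheme)."""
    return zlib.crc32(key.encode("utf-8")) % shard_count


shard_count = 3
keys = ["user-1", "user-2", "user-3", "user-4", "user-5", "user-6"]

# Each shard ends up holding only a portion of the keys.
shards = {i: [] for i in range(shard_count)}
for key in keys:
    shards[shard_for(key, shard_count)].append(key)

for index, stored in shards.items():
    print(index, stored)
```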
Replica
To ensure data security and high availability when an error occurs, ApsaraDB for ClickHouse provides replicas. Data on a server can be replicated to two or more servers.
Table
A table is the basic structure in which data is stored. A table consists of rows and columns. Each column represents a field, and each row represents a record.
ApsaraDB for ClickHouse tables are classified into local tables and distributed tables based on data distribution.
Table type | Description | Comparison |
Local table | When you write data to a local table, the data is written only to the server on which the local table is stored and is not distributed to other servers. |
Distributed table | A distributed table is a collection of local tables. It presents the local tables as a single table that external users can query and analyze. When data is written to a distributed table, the data is automatically distributed to the local tables that make up the distributed table. When a distributed table is queried, each local table is queried, and the results from all local tables are aggregated and returned. |
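The relationship between a distributed table and its local tables can be sketched in a few lines. This is a toy in-memory model, not how ApsaraDB for ClickHouse is implemented: the class `DistributedTable`, its methods, and the hash-based routing are assumptions made for illustration.

```python
import zlib


class DistributedTable:
    """Toy model: one logical table backed by several local tables."""

    def __init__(self, shard_count: int) -> None:
        # One local table (a plain list of rows) per server.
        self.local_tables = [[] for _ in range(shard_count)]

    def insert(self, key: str, value: int) -> None:
        # A write to the distributed table is routed to one local table.
        index = zlib.crc32(key.encode("utf-8")) % len(self.local_tables)
        self.local_tables[index].append((key, value))

    def total(self) -> int:
        # A query runs on every local table; partial results are then combined.
        partial_sums = [sum(v for _, v in table) for table in self.local_tables]
        return sum(partial_sums)


table = DistributedTable(shard_count=2)
for key, value in [("a", 10), ("b", 20), ("c", 30)]:
    table.insert(key, value)

print(table.total())  # 60
```

Writing directly to one of the lists in `local_tables` would correspond to writing to a local table: the row stays on that server only.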
ApsaraDB for ClickHouse tables are classified into non-replicated tables and replicated tables based on the storage engine.
Table type | Description | Comparison |
Non-replicated table | A non-replicated table is stored on a single server and cannot be replicated to other servers. A non-replicated table has only one replica. |
Replicated table | Data in a replicated table is automatically replicated to multiple servers. A replicated table can have multiple replicas. |
Data part
A data part is a data fragment stored on disk and is the basic unit of data storage in an ApsaraDB for ClickHouse table. Each time you write data to a table, a new data part is generated. Each data part is self-contained: it includes all columns and indexes for its segment of the data and keeps the data in order. Data parts can be merged and compressed efficiently, which is essential for high-performance data query and processing in ApsaraDB for ClickHouse.
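The life cycle described above, in which each write produces a new ordered part and parts are later merged, can be illustrated with a small sketch. It models a part as a sorted Python list and a merge as a sorted merge of those lists. The functions `write_part` and `merge_parts` are hypothetical names, and the sketch ignores columns, indexes, and compression.

```python
import heapq


def write_part(rows: list[int]) -> list[int]:
    """Each write creates a new part whose rows are kept in order."""
    return sorted(rows)


def merge_parts(parts: list[list[int]]) -> list[int]:
    """Merge already-sorted parts into one larger sorted part."""
    return list(heapq.merge(*parts))


# Two separate writes produce two separate parts.
parts = [write_part([3, 1]), write_part([5, 2, 4])]

# A later merge combines them into a single ordered part.
merged = merge_parts(parts)
print(merged)  # [1, 2, 3, 4, 5]
```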
Enterprise Edition
ApsaraDB for ClickHouse cluster
An ApsaraDB for ClickHouse cluster consists of a specific amount of computing and storage resources. It is a platform as a service (PaaS) offering that provides data storage and analysis capabilities based on the ClickHouse engine.
Worker node
Worker nodes are the replica nodes in an ApsaraDB for ClickHouse cluster that perform the computational tasks of the engine.
CCU
A ClickHouse Compute Unit (CCU) is the billing unit of computing resources in an ApsaraDB for ClickHouse cluster. One CCU is equal to one virtual CPU (vCPU) and 4 GiB of memory. Computing resources are billed in CCUs per minute.
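The CCU definition translates into simple arithmetic. The following sketch converts a CCU count into vCPUs and memory and computes the number of CCU-minutes consumed over a period. The function names are hypothetical, and no prices are included because this topic does not state any.

```python
GIB_PER_CCU = 4   # One CCU includes 4 GiB of memory.
VCPU_PER_CCU = 1  # One CCU includes one vCPU.


def ccu_resources(ccus: int) -> tuple[int, int]:
    """Return (vCPUs, memory in GiB) that correspond to a CCU count."""
    return ccus * VCPU_PER_CCU, ccus * GIB_PER_CCU


def ccu_minutes(ccus: int, minutes: int) -> int:
    """Return the billable quantity in CCU-minutes for a constant CCU count."""
    return ccus * minutes


print(ccu_resources(8))    # (8, 32)
print(ccu_minutes(8, 90))  # 720
```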
Auto scaling of computing resources
The system automatically adjusts the number of CCUs based on the memory usage.
Auto scaling range
You can specify the maximum and minimum numbers of CCUs. The system automatically scales resources within the range that you specify.
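The effect of an auto scaling range can be sketched as a clamp: whatever CCU count the system would otherwise choose is kept within the minimum and maximum that you specify. How the system derives the desired count from memory usage is not described here, so the `desired` argument below is simply an assumed input, and `clamp_ccus` is a hypothetical name.

```python
def clamp_ccus(desired: int, min_ccus: int, max_ccus: int) -> int:
    """Keep a desired CCU count within the configured auto scaling range."""
    if min_ccus > max_ccus:
        raise ValueError("min_ccus must not exceed max_ccus")
    return max(min_ccus, min(desired, max_ccus))


# Assumed range for illustration: at least 8 CCUs, at most 32 CCUs.
print(clamp_ccus(4, 8, 32))   # 8
print(clamp_ccus(16, 8, 32))  # 16
print(clamp_ccus(64, 8, 32))  # 32
```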
Storage resources
Storage resources refer to the shared storage used by Enterprise Edition clusters. Storage resources are billed based on the pay-as-you-go billing method.