E-MapReduce:ClickHouse

Last Updated:Dec 12, 2024

Alibaba Cloud E-MapReduce (EMR) provides a managed ClickHouse service based on open source ClickHouse, an online analytical processing (OLAP) engine. EMR ClickHouse supports all features of open source ClickHouse and adds the following capabilities built on Alibaba Cloud: quick cluster deployment, cluster management, scaling, and monitoring and alerting. EMR ClickHouse also provides better read and write performance than open source ClickHouse and integrates efficiently with other EMR components.

Features

Feature

Description

Column-oriented storage

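For analytical queries, column-oriented storage provides better performance than row-oriented storage because a query reads only the columns that it references. Values in the same column share a data type and are often similar, so column-oriented storage also achieves a high compression ratio and saves storage space.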

Massively parallel processing (MPP) architecture

Each node accesses only its own memory and disks and processes its data independently, while the nodes communicate with each other in parallel. The MPP architecture provides excellent query performance and high scalability.

Vectorized engine

Data is processed in column vectors, each of which is a part of a column. Vectorized execution together with column-oriented storage improves CPU utilization.

Support for SQL

ClickHouse supports a declarative query language based on SQL. In many cases, the query language conforms to the American National Standards Institute (ANSI) SQL standard. ClickHouse supports the GROUP BY, ORDER BY, FROM, JOIN, and IN clauses, as well as non-correlated subqueries.

Real-time data update

ClickHouse allows you to define a primary key in a table. In tables that use an engine of the MergeTree type, data is incrementally sorted by primary key as it is stored. This way, you can efficiently query data based on primary keys.

ClickHouse supports near real-time data insertion, metric aggregation, and index creation.

Support for indexes

When data is sorted by primary key, specific values or ranges of values can be retrieved within tens of milliseconds.
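To illustrate why sorting by primary key makes value and range lookups fast, the following is a minimal Python sketch. It is not ClickHouse's actual implementation. It assumes that keys are stored in sorted order and that a sparse index keeps only the first key of every fixed-size block of rows. The names `build_sparse_index` and `range_lookup` and the block size of 4 are illustrative assumptions.

```python
from bisect import bisect_left, bisect_right

BLOCK_SIZE = 4  # illustrative only; a real engine uses far larger blocks


def build_sparse_index(sorted_keys, block_size=BLOCK_SIZE):
    """Keep only the first key of every block of rows."""
    return [sorted_keys[i] for i in range(0, len(sorted_keys), block_size)]


def range_lookup(sorted_keys, index, low, high, block_size=BLOCK_SIZE):
    """Return (matching keys in [low, high], number of rows scanned)."""
    # The last block whose first key is < low may still contain `low`,
    # so the scan starts there (or at block 0 if no such block exists).
    first_block = max(bisect_left(index, low) - 1, 0)
    # Blocks whose first key is > high cannot contain a match.
    last_block = bisect_right(index, high)  # exclusive upper bound
    start = first_block * block_size
    stop = min(last_block * block_size, len(sorted_keys))
    scanned = sorted_keys[start:stop]
    matches = [k for k in scanned if low <= k <= high]
    return matches, len(scanned)


if __name__ == "__main__":
    keys = list(range(0, 100, 2))      # 50 sorted keys
    index = build_sparse_index(keys)   # one index entry per 4 rows
    matches, scanned = range_lookup(keys, index, 21, 30)
    print(matches, f"scanned {scanned} of {len(keys)} rows")
```

The index holds one entry per block, so it stays small enough to search with binary search, and only the blocks that can contain the requested range are scanned instead of the whole table.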

Note

Inapplicable scenarios:

  • Complete transactions are not supported.

  • Data cannot be modified or deleted at high frequency or with low latency.

  • Data can be modified or deleted only in batches.
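Because data can be changed only in batches, a client can collect pending changes and submit them together. The following Python sketch shows one possible way to do this, under stated assumptions: it buffers integer key values and builds a single `ALTER TABLE ... DELETE WHERE` statement, which is the mutation syntax of open source ClickHouse, for each batch. The class `DeleteBatcher`, the table name `events`, the column name `id`, and the batch size are hypothetical. Sending the statement to a cluster is intentionally left out.

```python
class DeleteBatcher:
    """Buffer integer keys and emit one DELETE statement per batch."""

    def __init__(self, table, key_column, batch_size=1000):
        self.table = table            # hypothetical table name
        self.key_column = key_column  # hypothetical key column name
        self.batch_size = batch_size
        self._pending = []

    def add(self, key):
        """Queue a key. Return a statement if the batch is full, else None."""
        # Only integers are accepted so values can be inlined safely.
        if not isinstance(key, int) or isinstance(key, bool):
            raise TypeError("this sketch only supports integer keys")
        self._pending.append(key)
        if len(self._pending) >= self.batch_size:
            return self.flush()
        return None

    def flush(self):
        """Build one statement for all queued keys, then clear the queue."""
        if not self._pending:
            return None
        # Deduplicate and sort so the generated statement is deterministic.
        keys = ", ".join(str(k) for k in sorted(set(self._pending)))
        self._pending.clear()
        return (
            f"ALTER TABLE {self.table} "
            f"DELETE WHERE {self.key_column} IN ({keys})"
        )


if __name__ == "__main__":
    batcher = DeleteBatcher("events", "id", batch_size=3)
    for key in (5, 2, 5):
        statement = batcher.add(key)
        if statement is not None:
            print(statement)
```

Collecting changes on the client side keeps the number of modification requests low, which matches the batch-only constraint described in the note.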