All Products
Search
Document Center

ApsaraDB for ClickHouse:CREATE TABLE

Last Updated:Jun 13, 2024

This topic describes how to create a table in ApsaraDB for ClickHouse.

Create a local table

Syntax:

CREATE TABLE [IF NOT EXISTS] [db.]local_table_name ON CLUSTER cluster
(
    name1 [type1] [DEFAULT|MATERIALIZED|ALIAS expr1],
    name2 [type2] [DEFAULT|MATERIALIZED|ALIAS expr2],
    ...
    INDEX index_name1 expr1 TYPE type1(...) GRANULARITY value1,
    INDEX index_name2 expr2 TYPE type2(...) GRANULARITY value2
) ENGINE = engine_name()
[PARTITION BY expr]
ORDER BY expr
[PRIMARY KEY expr]
[SAMPLE BY expr]
[SETTINGS name=value, ...];

The following table describes the parameters in the preceding syntax.

Parameter

Description

db

The name of the database. The default value is the name of the current database. In this example, default is used as the database name.

local_table_name

The name of the local table.

ON CLUSTER cluster

Specifies that a local table is created on each node. Set the value to ON CLUSTER default.

name1,name2

The name of the column.

type1,type2

The data type of the column.

Note

For more information about the data types that are supported by ApsaraDB for ClickHouse, see Data types.

ENGINE = engine_name()

The type of the table engine.

When you create a table in a cluster of Double-replica Edition, you must use a table engine that is prefixed by Replicated and supports data replication among the engines in the MergeTree family. If you do not use the table engine that is prefixed by Replicated, data cannot be replicated between replicas and inconsistent data query results are returned. When you use this engine to create a table, use one of the following methods to configure the parameters:

  • ReplicatedMergeTree('/clickhouse/tables/{database}/{table}/{shard}', '{replica}'). The settings are fixed and do not need to be modified.

  • ReplicatedMergeTree(). This is equivalent to ReplicatedMergeTree('/clickhouse/tables/{database}/{table}/{shard}', '{replica}').

Note

For more information about the table engine types that are supported by ApsaraDB for ClickHouse, see Table engines.

ORDER BY expr

The sort key. This parameter is required. The value can be a tuple of a set of columns or an expression.

[DEFAULT|MATERIALIZED|ALIAS expr]

The default expression.

  • DEFAULT: default expression. If the field is left empty, the default value is automatically generated and populated.

  • MATERIALIZED: materialized expression.

  • ALIAS: alias expression.

GRANULARITY

The index granularity.

[PARTITION BY expr]

The partition key. In most cases, data is partitioned by date. You can specify another field or field expression as the partition key.

[PRIMARY KEY expr]

The primary key. By default, the primary key is the same as the sort key. In most cases, you do not need to use the PRIMARY KEY clause to specify the primary key.

[SAMPLE BY expr]

The sampling expression. If you want to use the sampling expression, include the sampling expression in the primary key.

[SETTINGS name=value, ...]

Additional parameters that affect the performance of creating the table.

Note

For more information about the parameters that can be configured in SETTINGS, see SETTINGS configuration items.

Note

Only table engines in the MergeTree family support the following parameters: ORDER BY, GRANULARITY, PARTITION BY, PRIMARY KEY, SAMPLE BY, and [SETTINGS name=value, ...]. For more information about more parameters, see CREATE TABLE.

Example:

CREATE TABLE local_table ON CLUSTER default
(
    Year UInt16,
    Quarter UInt8,
    Month UInt8,
    DayofMonth UInt8,
    DayOfWeek UInt8,
    FlightDate Date,
    FlightNum String,
    Div5WheelsOff String,
    Div5TailNum String
)ENGINE = MergeTree()
 PARTITION BY toYYYYMM(FlightDate)
 PRIMARY KEY (intHash32(FlightDate))
 ORDER BY (intHash32(FlightDate),FlightNum)
 SAMPLE BY intHash32(FlightDate)
SETTINGS index_granularity= 8192;

Create a distributed table

A distributed table is a collection of local tables. The distributed table abstracts the local tables into a unified table and supports data writes and queries. When data is written to a distributed table, the data is automatically distributed to each local table in the collection. When a distributed table is queried, local tables in the collection are separately queried and the final results are rolled up and returned. Before you create a distributed table, create a local table.

Syntax:

CREATE TABLE [db.]distributed_table_name ON CLUSTER default
 AS db.local_table_name ENGINE = Distributed(cluster, db, local_table_name [, sharding_key])

The following table describes the parameters in the preceding syntax.

Parameter

Description

db

The name of the database. The default value is the name of the current database. In this example, default is used as the database name.

distributed_table_name

The name of the distributed table.

ON CLUSTER cluster

Specifies that a table is created on each node. Set the value to ON CLUSTER default.

local_table_name

The name of the local table that you created.

sharding_key

The sharding expression. The sharding expression determines the shard to which you want to write data.

The value of the sharding_key parameter can be an expression, such as the rand() function, or a column, such as the user_id column of the INTEGER type.

Example:

CREATE TABLE distributed_table ON CLUSTER default
 AS default.local_table 
ENGINE = Distributed(default, default, local_table, rand());

Create a table by copying the schema of an existing table

You can create a table by copying the schema of an existing table so that the table has the same schema as the source table.

Syntax:

CREATE TABLE [IF NOT EXISTS] [db.]table_name2 ON CLUSTER cluster AS [db.]table_name1 [ENGINE = engine_name];

The following table describes the parameters in the preceding syntax.

Parameter

Description

db

The name of the database. The default value is the name of the current database. In this example, default is used as the database name.

table_name1

The name of the source table from which the schema is copied. In this example, the local table local_table that has been created is used.

table_name2

The name of the table that you want to create.

ON CLUSTER cluster

Specifies that a table is created on each node. Set the value to ON CLUSTER default.

[ENGINE = engine_name]

The type of the table engine. If you do not specify a table engine when you create a table, the table engine of the source table is used by default.

Note

For more information about the table engine types that are supported by ApsaraDB for ClickHouse, see Table engines.

Example:

create table t2 ON CLUSTER default as default.local_table;

Create a table that has the same schema as the query result set of the SELECT clause

You can use a specified table engine to create a table that has the same schema as the query result set of the SELECT clause. The query result set of the SELECT clause is populated to the table.

Syntax:

CREATE TABLE [IF NOT EXISTS] [db.]s_table_name ON CLUSTER cluster ENGINE = engine_name() AS SELECT ...

The following table describes the parameters in the preceding syntax.

Parameter

Description

db

The name of the database. The default value is the name of the current database. In this example, default is used as the database name.

s_table_name

The name of the table that you want to create by specifying the SELECT clause in the CREATE TABLE statement.

ON CLUSTER cluster

Specifies that a table is created on each node. Set the value to ON CLUSTER default.

ENGINE = engine_name()

The type of the table engine.

Note

For more information about the table engine types that are supported by ApsaraDB for ClickHouse, see Table engines.

SELECT ...

The SELECT clause.

Example:

create table t3 ON CLUSTER default ENGINE =MergeTree() order by Year as select * from default.local_table;

References