本文介绍如何使用DMS(Data Management Service)在云数据库ClickHouse中创建表。
前提条件
已完成快速入门的如下步骤。
操作步骤
在数据管理DMS控制台的SQL Console页面中,输入
CREATE TABLE
语句创建本地表。说明本地表(Local tables)是实际存储数据的表,每个ClickHouse节点上都有其自己的本地表,这些表用于管理和查询存储在该节点上的数据。本地表适合于作为集群中每个节点的数据分片。
语法:
CREATE TABLE [IF NOT EXISTS] [db.]local_table_name ON CLUSTER cluster ( name1 [type1] [DEFAULT|MATERIALIZED|ALIAS expr1], name2 [type2] [DEFAULT|MATERIALIZED|ALIAS expr2], ... INDEX index_name1 expr1 TYPE type1(...) GRANULARITY value1, INDEX index_name2 expr2 TYPE type2(...) GRANULARITY value2 ) ENGINE = engine_name() [PARTITION BY expr] [ORDER BY expr] [PRIMARY KEY expr] [SAMPLE BY expr] [SETTINGS name=value, ...];
参数说明:
参数
说明
db
指定数据库的名称,默认为当前选择的数据库。
local_table_name
本地表名。
cluster
指定集群的名称,固定为default。
name1,name2
列名。
type1,type2
列类型。
engine_name
表引擎类型,具体请参见表引擎。
说明单副本版为MergeTree,双副本版为ReplicatedMergeTree。
PARTITION BY
指定分区键。
ORDER BY
指定排序键。
PRIMARY KEY
指定主键。
SAMPLE BY
抽样表达式。如果要使用抽样表达式,主键中必须包含这个表达式。
SETTINGS
影响性能的其他参数。
说明更多参数说明,请参见Create Table。
示例:
CREATE TABLE clickhouse_demo.ontime_local ON CLUSTER default ( `Year` UInt16, `Quarter` UInt8, `Month` UInt8, `DayofMonth` UInt8, `DayOfWeek` UInt8, `FlightDate` Date, `Reporting_Airline` String, `DOT_ID_Reporting_Airline` Int32, `IATA_CODE_Reporting_Airline` String, `Tail_Number` String, `Flight_Number_Reporting_Airline` String, `OriginAirportID` Int32, `OriginAirportSeqID` Int32, `OriginCityMarketID` Int32, `Origin` FixedString(5), `OriginCityName` String, `OriginState` FixedString(2), `OriginStateFips` String, `OriginStateName` String, `OriginWac` Int32, `DestAirportID` Int32, `DestAirportSeqID` Int32, `DestCityMarketID` Int32, `Dest` FixedString(5), `DestCityName` String, `DestState` FixedString(2), `DestStateFips` String, `DestStateName` String, `DestWac` Int32, `CRSDepTime` Int32, `DepTime` Int32, `DepDelay` Int32, `DepDelayMinutes` Int32, `DepDel15` Int32, `DepartureDelayGroups` String, `DepTimeBlk` String, `TaxiOut` Int32, `WheelsOff` Int32, `WheelsOn` Int32, `TaxiIn` Int32, `CRSArrTime` Int32, `ArrTime` Int32, `ArrDelay` Int32, `ArrDelayMinutes` Int32, `ArrDel15` Int32, `ArrivalDelayGroups` Int32, `ArrTimeBlk` String, `Cancelled` UInt8, `CancellationCode` FixedString(1), `Diverted` UInt8, `CRSElapsedTime` Int32, `ActualElapsedTime` Int32, `AirTime` Nullable(Int32), `Flights` Int32, `Distance` Int32, `DistanceGroup` UInt8, `CarrierDelay` Int32, `WeatherDelay` Int32, `NASDelay` Int32, `SecurityDelay` Int32, `LateAircraftDelay` Int32, `FirstDepTime` String, `TotalAddGTime` String, `LongestAddGTime` String, `DivAirportLandings` String, `DivReachedDest` String, `DivActualElapsedTime` String, `DivArrDelay` String, `DivDistance` String, `Div1Airport` String, `Div1AirportID` Int32, `Div1AirportSeqID` Int32, `Div1WheelsOn` String, `Div1TotalGTime` String, `Div1LongestGTime` String, `Div1WheelsOff` String, `Div1TailNum` String, `Div2Airport` String, `Div2AirportID` Int32, `Div2AirportSeqID` Int32, `Div2WheelsOn` String, `Div2TotalGTime` String, `Div2LongestGTime` String, `Div2WheelsOff` String, `Div2TailNum` String, `Div3Airport` String, `Div3AirportID` Int32, `Div3AirportSeqID` Int32, `Div3WheelsOn` String, `Div3TotalGTime` String, `Div3LongestGTime` String, `Div3WheelsOff` String, `Div3TailNum` String, `Div4Airport` String, `Div4AirportID` Int32, `Div4AirportSeqID` Int32, `Div4WheelsOn` String, `Div4TotalGTime` String, `Div4LongestGTime` String, `Div4WheelsOff` String, `Div4TailNum` String, `Div5Airport` String, `Div5AirportID` Int32, `Div5AirportSeqID` Int32, `Div5WheelsOn` String, `Div5TotalGTime` String, `Div5LongestGTime` String, `Div5WheelsOff` String, `Div5TailNum` String )ENGINE = ReplicatedMergeTree() PARTITION BY toYYYYMM(FlightDate) PRIMARY KEY (intHash32(FlightDate)) ORDER BY (intHash32(FlightDate)) SAMPLE BY intHash32(FlightDate) SETTINGS index_granularity= 8192 ;
创建分布式表,方便数据的写入和查询。
说明分布式表(Distributed tables)是虚拟的表,不直接存储任何数据,而是作为一个查询层,将查询分发到一个或多个节点上的本地表,并汇总返回结果。分布式表适用于在ClickHouse集群环境中实现数据分布查询和处理。
语法:
CREATE TABLE [db.]d_table_name ON CLUSTER cluster AS db.local_table_name ENGINE = Distributed(cluster, db, local_table_name [, sharding_key])
参数说明:
参数
说明
db
指定数据库的名称,默认为当前选择的数据库。
d_table_name
分布式表名。
cluster
指定集群的名称,固定为default。
local_table_name
已创建的本地表名。
sharding_key
分片表达式。
说明更多参数说明,请参见Create Table。
示例:
CREATE TABLE ontime_local_distributed ON CLUSTER default AS clickhouse_demo.ontime_local ENGINE = Distributed(default, clickhouse_demo, ontime_local, rand());