This topic provides an example to show how to use clickhouse-client to import data to ApsaraDB for ClickHouse. In this example, the On Time dataset is imported to the distributed table ontime_local_distributed in the clickhouse_demo database.
Prerequisites
The following steps that are listed in Getting Started are complete:
- Note
You need to add the IP address of the server where clickhouse-client is installed to the whitelist of the ApsaraDB for ClickHouse cluster.
clickhouse-client that corresponds to the version of the ApsaraDB for ClickHouse cluster is installed. For more information, see clickhouse-client.
Procedure
Click On Time Data to download the On Time dataset.
Decompress the On Time dataset package that you downloaded.
unzip ontime-data(1).zip
Connect to the ApsaraDB for ClickHouse cluster and import data to ApsaraDB for ClickHouse.
Run the following command in the directory in which clickhouse-client is installed:
./clickhouse-client --host=<host> --port=<port> --user=<user> --password=<password> --query="INSERT INTO <ClickHouse_table> FORMAT CSVWithNames" < ontime-data.csv
The following table describes the parameters.
Parameter
Description
host
The public endpoint or virtual private cloud (VPC) endpoint. You can view the public endpoint or VPC endpoint on the Cluster Information page.
If the server where clickhouse-client is installed is deployed in the same VPC as the ApsaraDB for ClickHouse cluster, use the VPC endpoint. If the server and the cluster are deployed in different VPCs, use the public endpoint.
port
The TCP port number. You can view the TCP port number on the Cluster Information page.
user
The database account. You can create a database account in the ApsaraDB for ClickHouse console.
password
The password of the database account.
ClickHouse_table
The table in ApsaraDB for ClickHouse to which you want to import data.
The following sample statement is provided:
./clickhouse-client --host=cc-bp16qwvp7hy8i****.public.clickhouse.ads.aliyuncs.com --port=3306 --user=test --password=123456Aa --query="INSERT INTO clickhouse_demo.ontime_local_distributed FORMAT CSVWithNames" < ontime-data.csv
Query data to check whether the data is imported.
SELECT OriginCityName, count(*) AS flights FROM ontime_local_distributed GROUP BY OriginCityName ORDER BY flights DESC LIMIT 10;
The following query result is returned:
OriginCityName │ flights ──────────────────────│──────── Chicago, IL │ 24114 Atlanta, GA │ 22001 Dallas/Fort Worth, TX │ 17340 Los Angeles, CA │ 14494 Denver, CO │ 14170 New York, NY │ 14075 Washington, DC │ 11985 Houston, TX │ 11483 San Francisco, CA │ 11259 St. Louis, MO │ 10721