All Products
Search
Document Center

ApsaraDB for ClickHouse:Use an external table to access OSS data

Last Updated:Dec 30, 2025

This topic describes how to query data in Alibaba Cloud Object Storage Service (OSS) using ApsaraDB for ClickHouse. This topic also describes how to write data to OSS or export data from ClickHouse to OSS.

Prerequisites

  • OSS requirements:

  • Cluster requirements:

    You have created a database account. For more information, see Account Management.

Preparations

  1. Create an OSS bucket. For more information, see Create buckets in the console.

    Important

    Make sure that the OSS bucket is in the same region as your ApsaraDB for ClickHouse cluster.

    This topic uses a bucket named ck-test-oss as an example.

  2. Upload a file.

    This topic uses the test.csv file as an example. Before you start, upload this file to OSS. Files that are uploaded to OSS are called objects. For more information, see Upload objects using the console.

    If you already have data in OSS, ensure that the data is in a file format that ApsaraDB for ClickHouse can access. All formats are supported except for Protobuf and CapnProto. For more information, see File formats supported by ClickHouse.

Query OSS data

Method 1: Use an OSS external table

Enterprise Edition or Community-compatible Edition v22.8 or later

  1. Log on to the database.

    1. Log on to the ApsaraDB for ClickHouse console.

    2. On the Clusters page, select the Clusters of Community-compatible Edition tab or the Enterprise Edition Clusters tab, and then click the ID of your cluster.

    3. On the Cluster Information page, click Log On to Database in the upper-right corner.

    4. In the Log On to Instance dialog box, enter the database account and password, and click Log On.

  2. Create an OSS external table.

    An OSS external table is a special table type in ApsaraDB for ClickHouse that lets you directly query files stored in OSS without importing data into local storage. This feature separates storage from computing.

    The following is the syntax for creating a table.

    CREATE TABLE <table_name> [ON cluster default]
    (
    'col_name1' col_type1,
    'col_name2' col_type2,
    ...
    )
    ENGINE = OSS('https://<bucket-name>.<oss-endpoint>/<file-name>', '<access-key-id>', '<access-key-secret>', '<file-format-name>');

    Parameters:

    Parameter name

    Description

    Example

    table_name

    The table name.

    oss_test_tb

    col_name1,col_name2

    The column name.

    user_name

    col_type1,col_type2

    The column type.

    Important

    The structure of the OSS external table must match the OSS data.

    String

    bucket-name

    The name of the OSS bucket.

    ck-test-oss

    oss-endpoint

    This endpoint provides private network access to an ECS instance in a VPC. For more information, see Regions and Endpoints.

    Important

    Make sure that the OSS bucket and your ApsaraDB for ClickHouse cluster are in the same region.

    oss-cn-hangzhou-internal.aliyuncs.com

    file-name

    The name of the object in the OSS bucket.

    test.csv

    access-key-id

    The AccessKey ID of the RAM user that is used to access OSS data.

    yourAccessKeyID

    access-key-secret

    The AccessKey secret of the RAM user that is used to access OSS data.

    Use an existing AccessKey or create a new one. For more information, see Create an AccessKey.

    Important

    To reduce security threats, the AccessKey secret is shown only when you create it. You cannot retrieve it later. Make sure to store it securely.

    yourAccessKeySecret

    file-format-name

    File format

    Important

    This parameter must be specified exactly as listed in File formats supported by ClickHouse. The names are case-sensitive.

    CSV

    Example:

    CREATE TABLE oss_test_tb ON cluster default
    (
    id UInt8,
    user_name String,
    age UInt16,
    city String,
    access_url String
    )
    ENGINE = OSS('http://ck-test-oss.oss-cn-hangzhou-internal.aliyuncs.com/test.csv', 'LTAI****************', 'yourAccessKeySecret','CSV')
  3. Query the OSS data.

    SELECT * FROM oss_test_tb;

    The result is as follows.

    +--------------+---------------------+---------------+----------------+----------------------+
    | id           | user_name           | age           | city           | access_url           |
    +--------------+---------------------+---------------+----------------+----------------------+
    | 1           | tick                | 32            | shanghai       | http://example.com   |
    +--------------+---------------------+---------------+----------------+----------------------+
    Returned rows: [1], Time elapsed: [183ms]

Community-compatible Edition v21.8 or earlier

  1. Log on to the database.

    1. Log on to the ApsaraDB for ClickHouse console.

    2. On the Clusters page, click the Clusters of Community-compatible Edition tab, and then click the ID of your cluster.

    3. On the Cluster Information page, click Log On to Database in the upper-right corner.

    4. In the Log On to Instance dialog box, enter the database account and password, and click Log On.

  2. Create an OSS external table.

    The following is the syntax for creating a table.

    CREATE TABLE <table_name> [ON cluster default]
    (
    'col_name1' col_type1,
    'col_name2' col_type2,
    ...
    )
    ENGINE = OSS('<oss-endpoint>', '<access-key-id>', '<access-key-secret>', '<oss-file-path>', '<file-format-name>');

    Parameters:

    Parameter Name

    Description

    Example

    table_name

    The table name.

    oss_test_tb

    col_name1,col_name2

    The column name.

    user_name

    col_type1,col_type2

    The column type.

    Important

    The structure of the OSS external table must match the OSS data.

    String

    oss-endpoint

    The Endpoint used to access an ECS instance over the VPC private network. For more information, see Regions and Endpoints.

    Important

    Make sure that the OSS bucket and your ApsaraDB for ClickHouse cluster are in the same region.

    oss-cn-hangzhou-internal.aliyuncs.com

    access-key-id

    The AccessKey ID of the RAM user that is used to access OSS data.

    yourAccessKeyID

    access-key-secret

    The AccessKey secret of the RAM user that is used to access OSS data.

    Use an existing AccessKey or create a new one. For more information, see Create an AccessKey.

    Important

    To reduce security threats, the AccessKey secret is shown only when you create it. You cannot retrieve it later. Make sure to store it securely.

    yourAccessKeySecret

    oss-file-path

    The storage path of the object. The format is usually oss://<bucket-name>/<path-to-file>.

    Note

    The oss-file-path parameter supports fuzzy matching with wildcard characters. For more information, see Use wildcard characters for fuzzy matching of OSS storage paths.

    oss://ck-test-oss/test.csv

    file-format-name

    File format

    Important

    This parameter must be specified exactly as listed in File formats supported by ClickHouse. The names are case-sensitive.

    CSV

    Example:

    CREATE TABLE oss_test_tb ON cluster default
    (
    id UInt8,
    user_name String,
    age UInt16,
    city String,
    access_url String
    )
    ENGINE = OSS('oss-cn-hangzhou-internal.aliyuncs.com', 'LTAI****************', 'yourAccessKeySecret', 'oss://ck-test-oss/test.csv', 'CSV');
  3. Query the OSS data.

    SELECT * FROM oss_test_tb;

    The result is as follows.

    +--------------+---------------------+---------------+----------------+----------------------+
    | id           | user_name           | age           | city           | access_url           |
    +--------------+---------------------+---------------+----------------+----------------------+
    | 1           | tick                | 32            | shanghai       | http://example.com   |
    +--------------+---------------------+---------------+----------------+----------------------+
    Returned rows: [1], Time elapsed: [183ms]

Method 2: Use a table function

Enterprise Edition or Community-compatible Edition v22.8 or later

  1. Log on to the database.

    1. Log on to the ApsaraDB for ClickHouse console.

    2. On the Clusters page, select the Clusters of Community-compatible Edition tab or the Enterprise Edition Clusters tab, and then click the ID of your cluster.

    3. On the Cluster Information page, click Log On to Database in the upper-right corner.

    4. In the Log On to Instance dialog box, enter the database account and password, and click Log On.

  2. Use a table function to query data.

    Syntax:

    SELECT * FROM oss('https://<bucket-name>.<oss-endpoint>/<file-name>','<access-key-id>', '<access-key-secret>', '<file-format-name>', '<col_name> <col_type>(,...)');

    The parameters are the same as those for creating an OSS external table. For more information, see Parameters.

    Example:

    SELECT * FROM oss('http://testBucketName.oss-cn-shanghai-internal.aliyuncs.com/test.csv', 'LTAI****************', 'yourAccessKeySecret', 'CSV', 'id UInt8, user_name String, age UInt16, city String, access_url String')

    The result is as follows.

    +--------------+---------------------+---------------+----------------+----------------------+
    | id           | user_name           | age           | city           | access_url           |
    +--------------+---------------------+---------------+----------------+----------------------+
    | 1           | tick                | 32            | shanghai       | http://example.com   |
    +--------------+---------------------+---------------+----------------+----------------------+
    Returned rows: [1], Time elapsed: [183ms]

Community-compatible Edition v21.8 or earlier

  1. Log on to the database.

    1. Log on to the ApsaraDB for ClickHouse console.

    2. On the Clusters page, click the Clusters of Community-compatible Edition tab, and then click the ID of your cluster.

    3. On the Cluster Information page, click Log On to Database in the upper-right corner.

    4. In the Log On to Instance dialog box, enter the database account and password, and click Log On.

  2. Use a table function to query data.

    Syntax:

    SELECT * FROM oss('<oss-endpoint>', '<access-key-id>', '<access-key-secret>', '<oss-file-path>', '<file-format-name>', '<col_name> <col_type>(,...)');

    The parameters are the same as those for creating an OSS external table. For more information, see Parameters.

    Example:

    SELECT * FROM oss('oss-cn-hangzhou-internal.aliyuncs.com', 'LTAI****************', 'yourAccessKeySecret', 'oss://ck-test-oss/test.csv', 'CSV', 'id UInt8, user_name String, age UInt16, city String, access_url String');

    The result is as follows.

    +--------------+---------------------+---------------+----------------+----------------------+
    | id           | user_name           | age           | city           | access_url           |
    +--------------+---------------------+---------------+----------------+----------------------+
    | 1           | tick                | 32            | shanghai       | http://example.com   |
    +--------------+---------------------+---------------+----------------+----------------------+
    Returned rows: [1], Time elapsed: [183ms]

Export ClickHouse data or write data to OSS

Usage notes

Enterprise Edition or Community-compatible Edition v22.8 or later

When you export data from ClickHouse to OSS or write data directly to OSS, you must set either `s3_truncate_on_insert=1` or `s3_create_new_file_on_insert=1`. If you do not set these parameters, or if you set them to 0, the operation fails if an object already exists at the destination path.

  • s3_truncate_on_insert=1: ApsaraDB for ClickHouse checks if an object exists at the destination path before exporting or writing data.

    • If the file already exists, ClickHouse will overwrite the file with new data.

    • This is suitable for full overwrite scenarios. Use this with caution, because this operation deletes the original data.

  • s3_create_new_file_on_insert=1: ClickHouse checks if an object exists at the destination path before exporting or writing data.

    • If a file with the specified name already exists, ClickHouse creates a new file in the same folder and writes the data to it. The new file is named using the following format: object file name + an incremental number that starts from 0 + file format name.

      For example, if the destination object specified when you create the OSS external table is test.csv, the new object is named test0.csv after the first operation. After the second operation, the new object is named test1.csv.

    • This method is suitable for incremental export or write scenarios and avoids overwriting existing data.

Community-compatible Edition v21.8 or earlier

When you export data from ClickHouse to OSS or write data directly to OSS, only one OSS object is used. The exported or written data overwrites the content of the original object. Use this with caution.

Step 1: Log on to the database

  1. Log on to the ApsaraDB for ClickHouse console.

  2. On the Clusters page, select the Clusters of Community-compatible Edition tab or the Enterprise Edition Clusters tab, and then click the ID of your cluster.

  3. On the Cluster Information page, click Log On to Database in the upper-right corner.

  4. In the Log On to Instance dialog box, enter the database account and password, and click Log On.

Step 2: (Optional) Prepare the data to export

If you are exporting data from ClickHouse to OSS and do not have data to export, follow these steps to prepare the data.

If you already have data to export, you can skip this step.

  1. Create a table based on the cluster edition.

    The following example creates a table on a single-replica cluster of the Community-compatible Edition. To create a table on a dual-replica cluster of the Enterprise Edition or Community-compatible Edition, see Create a table.

     CREATE TABLE test_tb_local ON cluster default
    (
    id UInt8,
    user_name String,
    age UInt16,
    city String,
    access_url String
    )
    ENGINE = MergeTree()
    ORDER BY id;
  2. Write sample data.

    INSERT INTO test_tb_local VALUES (13, 'tick', 25, 'shanghai', 'http://example.com');

  3. (Optional) View the data.

    You can run the following statement to verify that the data was written successfully.

    SELECT * FROM test_tb_local;

Step 3: Create an OSS external table

Enterprise Edition or Community-compatible Edition v22.8 or later

The following is the syntax for creating a table.

CREATE TABLE <table_name> [ON cluster default]
(
'col_name1' col_type1,
'col_name2' col_type2,
...
)
ENGINE = OSS('https://<bucket-name>.<oss-endpoint>/<file-name>', '<access-key-id>', '<access-key-secret>', '<file-format-name>');

Parameters:

Parameter name

Description

Example

table_name

The table name.

oss_test_tb

col_name1,col_name2

The column name.

user_name

col_type1,col_type2

The column type.

Important

The structure of the OSS external table must match the OSS data.

String

bucket-name

The name of the OSS bucket.

ck-test-oss

oss-endpoint

This endpoint provides private network access to an ECS instance in a VPC. For more information, see Regions and Endpoints.

Important

Make sure that the OSS bucket and your ApsaraDB for ClickHouse cluster are in the same region.

oss-cn-hangzhou-internal.aliyuncs.com

file-name

The name of the object in the OSS bucket.

test.csv

access-key-id

The AccessKey ID of the RAM user that is used to access OSS data.

yourAccessKeyID

access-key-secret

The AccessKey secret of the RAM user that is used to access OSS data.

Use an existing AccessKey or create a new one. For more information, see Create an AccessKey.

Important

To reduce security threats, the AccessKey secret is shown only when you create it. You cannot retrieve it later. Make sure to store it securely.

yourAccessKeySecret

file-format-name

File format

Important

This parameter must be specified exactly as listed in File formats supported by ClickHouse. The names are case-sensitive.

CSV

Example:

CREATE TABLE oss_test_tb ON cluster default
(
id UInt8,
user_name String,
age UInt16,
city String,
access_url String
)
ENGINE = OSS('http://ck-test-oss.oss-cn-hangzhou-internal.aliyuncs.com/test.csv', 'LTAI****************', 'yourAccessKeySecret','CSV')

Community-compatible Edition v21.8 or earlier

The following is the syntax for creating a table.

CREATE TABLE <table_name> [ON cluster default]
(
'col_name1' col_type1,
'col_name2' col_type2,
...
)
ENGINE = OSS('<oss-endpoint>', '<access-key-id>', '<access-key-secret>', '<oss-file-path>', '<file-format-name>');

Parameters:

Parameter Name

Description

Example

table_name

The table name.

oss_test_tb

col_name1,col_name2

The column name.

user_name

col_type1,col_type2

The column type.

Important

The structure of the OSS external table must match the OSS data.

String

oss-endpoint

The Endpoint used to access an ECS instance over the VPC private network. For more information, see Regions and Endpoints.

Important

Make sure that the OSS bucket and your ApsaraDB for ClickHouse cluster are in the same region.

oss-cn-hangzhou-internal.aliyuncs.com

access-key-id

The AccessKey ID of the RAM user that is used to access OSS data.

yourAccessKeyID

access-key-secret

The AccessKey secret of the RAM user that is used to access OSS data.

Use an existing AccessKey or create a new one. For more information, see Create an AccessKey.

Important

To reduce security threats, the AccessKey secret is shown only when you create it. You cannot retrieve it later. Make sure to store it securely.

yourAccessKeySecret

oss-file-path

The storage path of the object. The format is usually oss://<bucket-name>/<path-to-file>.

Note

The oss-file-path parameter supports fuzzy matching with wildcard characters. For more information, see Use wildcard characters for fuzzy matching of OSS storage paths.

oss://ck-test-oss/test.csv

file-format-name

File format

Important

This parameter must be specified exactly as listed in File formats supported by ClickHouse. The names are case-sensitive.

CSV

Example:

CREATE TABLE oss_test_tb ON cluster default
(
id UInt8,
user_name String,
age UInt16,
city String,
access_url String
)
ENGINE = OSS('oss-cn-hangzhou-internal.aliyuncs.com', 'LTAI****************', 'yourAccessKeySecret', 'oss://ck-test-oss/test.csv', 'CSV');

Step 4: Export or write data to OSS

To export or write data to OSS, use an OSS external table and an INSERT INTO statement. For more information about the syntax, see INSERT INTO. The following examples provide the statements for different scenarios.

Export ClickHouse data

Enterprise Edition or Community-compatible Edition v22.8 or later

  • Full overwrite export (export data to OSS and overwrite existing data).

    INSERT INTO oss_test_tb SETTINGS s3_truncate_on_insert = 1 SELECT * FROM test_tb_local;
  • Incremental export (export data to OSS and do not overwrite existing data).

    INSERT INTO oss_test_tb SETTINGS s3_create_new_file_on_insert = 1 SELECT * FROM test_tb_local;

    After this operation is complete, a new object is created in the OSS bucket. The new object is named as follows: destination object name + a number starting from 0 and incrementing by 1 + object format name.

    For example, if the destination object specified when you create the OSS external table is test.csv, the new object is named test0.csv. If you run the operation again, the new object is named test1.csv. You can view the new object in the OSS console.

Community-compatible Edition v21.8 or earlier

Important

Exporting ClickHouse data to OSS overwrites existing data. Use this with caution.

INSERT INTO oss_test_tb  SELECT * FROM test_tb_local;

Write data

Enterprise Edition or Community-compatible Edition v22.8 or later

  • Write new data to OSS and overwrite existing data.

    INSERT INTO oss_test_tb SETTINGS s3_truncate_on_insert = 1 VALUES (14, 'tick', 25, 'shanghai', 'http://example.com');
  • Write new data to OSS and do not overwrite existing data.

    INSERT INTO oss_test_tb SETTINGS s3_create_new_file_on_insert = 1 VALUES (14, 'tick', 25, 'shanghai', 'http://example.com');

    After this operation is complete, a new object is created in the OSS bucket. The new object is named as follows: destination object name + a number starting from 0 and incrementing by 1 + object format name.

    For example, if the destination object specified when you create the OSS external table is test.csv, the new object is named test0.csv. If you run the operation again, the new object is named test1.csv. You can view the new object in the OSS console.

Community-compatible Edition v21.8 or earlier

Important

Writing data to OSS from ClickHouse overwrites existing data. Use this with caution.

INSERT INTO oss_test_tb SETTINGS s3_truncate_on_insert = 1 VALUES (14, 'tick', 25, 'shanghai', 'http://example.com');

More operations: Import OSS data to ClickHouse

If you use OSS to stage and migrate source data to ClickHouse, you must also import the data from OSS into the destination ClickHouse cluster. For more information, see Import data from OSS.