In Hologres V3.0 and later, you can create an external table in an external schema that is sourced from DLF 2.0. You can create only external tables in the Apache Paimon format. This topic describes the usage and limits of the CREATE EXTERNAL TABLE statement.
Syntax
Table creation syntax
The following code shows the syntax of the CREATE EXTERNAL TABLE statement:
CREATE EXTERNAL TABLE [IF NOT EXISTS] <ext_db_name>.<ext_schema_name>.<ext_table_name>
(
[{
<col_name> <col_type> [<col_constraints> [, ...]] |
<table_constraints> [, ...]
}]
)
[LOGICAL PARTITION BY LIST(<col_name> [, ...])]
[
WITH
(
"<property>" = '<value>' [, ...]
)
];
Parameters in the WITH clause
Parameter | Description | Example |
table_format | The table format of the external table that you want to create. Only Apache Paimon tables are supported. |
|
file_format | The file format of the external table that you want to create. Only the ORC and Parquet formats are supported. |
|
bucket | Data is read from and written to Apache Paimon tables by bucket. After you configure this parameter, all data in a non-partitioned table or data in each partition of a partitioned table is distributed to different buckets. This allows concurrent data reads from or concurrent data writes to Apache Paimon tables in a job, which improves processing efficiency. For more information, see Data Distribution. |
|
bucket-key | The key based on which data is distributed to buckets. |
|
changelog-producer | To allow downstream consumption in streaming mode, complete changelogs for INSERT, DELETE, and UPDATE operations on Apache Paimon tables are required. A changelog is similar to a binary log in a database. This parameter specifies the method in which changelogs are generated for Apache Paimon tables. For more information, see Changelog Producer. |
|
Other operations
Query the table creation statement
SELECT * FROM hologres.hg_dump_script_external('<ext_db_name>.<ext_schema_name>.<ext_table_name>');
Query the column and partition properties of a table
SELECT * FROM hologres.hg_external_columns('<ext_db_name>', '<ext_schema_name>', '<ext_table_name>');
Query all tables in an external schema
SELECT * FROM hologres.hg_external_tables ('<ext_db_name>', '<ext_schema_name>');
Refresh the metadata of an external table
REFRESH CACHE FOR External TABLE <ext_db_name>.<ext_schema_name>.<ext_table_name> WITH( cache_level = 'metadata');
Examples
Create a partitioned table in DLF 2.0
In this example, an Apache Paimon append-only table is created.
CREATE EXTERNAL TABLE ext_db_dlf.ext_schema_dlf.ext_par_table_dlf(
id TEXT,
created_at BIGINT,
type TEXT,
actor_id TEXT,
actor_login TEXT,
repo_id TEXT,
repo_name TEXT,
org TEXT,
org_login TEXT
)
LOGICAL PARTITION BY LIST(created_at)
WITH (
"file_format" = 'orc',
"bucket"=6,
"bucket-key"='id'
);
Create a non-partitioned table in DLF 2.0
In this example, an Apache Paimon primary key table is created.
CREATE EXTERNAL TABLE openlake_win.github_events.gh_event_ods(
id TEXT,
created_at BIGINT,
type TEXT,
actor_id TEXT,
actor_login TEXT,
repo_id TEXT,
repo_name TEXT,
org TEXT,
org_login TEXT,
PRIMARY KEY(id)
) WITH (
"changelog-producer"='input',
"bucket"=6,
"bucket-key"='id'
);