All Products
Search
Document Center

AnalyticDB:CREATE EXTERNAL TABLE

Last Updated:Nov 08, 2024

AnalyticDB for MySQL allows you to execute the CREATE EXTERNAL TABLE statement to create external tables based on a variety of services, such as Object Storage Service (OSS), ApsaraDB RDS for MySQL, ApsaraDB for MongoDB, Tablestore, and MaxCompute.

Prerequisites

  • An AnalyticDB for MySQL Data Lakehouse Edition cluster is created.

  • The minor version of the AnalyticDB for MySQL cluster is 3.1.8.0 or later.

    Note

    To query the minor version of an AnalyticDB for MySQL Data Lakehouse Edition cluster, execute the SELECT adb_version(); statement. To update the minor version of a cluster, contact technical support.

  • An external database is created. For more information, see CREATE EXTERNAL DATABASE.

Usage notes

When you create external tables across Alibaba Cloud accounts, only OSS external tables can be created.

OSS external tables

Important
  • The OSS bucket in which the OSS external table is created must reside in the same region as the AnalyticDB for MySQL cluster.

  • You can create Hudi external tables only for AnalyticDB for MySQL clusters of V3.1.9.2 or later.

    To query the minor version of an AnalyticDB for MySQL Data Lakehouse Edition cluster, execute the SELECT adb_version(); statement. To update the minor version of a cluster, contact technical support.

  • To query data of a partitioned OSS external table that you created, you must execute the MSCK REPAIR TABLE statement to synchronize the partitions of the external table.

  • To create an OSS external table across Alibaba Cloud accounts, you must first create a cross-account OSS external database with the required parameters configured. For more information, see CREATE EXTERNAL DATABASE.

Syntax

CREATE EXTERNAL TABLE [IF NOT EXISTS] table_name
(column_name column_type[, ...])
[PARTITIONED BY (column_name column_type[, ...])]
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS {TEXTFILE|ORC|PARQUET|JSON|RCFIL|HUDI}
LOCATION 'OSS_LOCATION';
[TBLPROPERTIES (
 'type' = 'cow|mor'
 'auto.create.location' = 'true|false')]

Parameters

Parameter

Required

Description

table_name (column_name column_type[, ...])

Yes

The name and schema of the external table.

The table and column names must comply with the naming conventions. For more information, see the "Naming limits" section of the Limits topic.

PARTITIONED BY (column_name column_type[, ...])

No

The partition key column. To create a partitioned external table, you must specify this parameter. To create a multi-level partitioned table, you can specify multiple partition key columns.

ROW FORMAT DELIMITED FIELDS TERMINATED BY ','

Yes

The column delimiter. You can specify a character that is the same as the delimiter in the file. In this example, a comma (,) is used.

Important

You can configure this parameter only for STORED AS TEXTFILE and STORED AS JSON.

STORED AS {TEXTFILE|ORC|PARQUET|JSON|RCFILE|HUDI}

Yes

The storage format of the file.

For TXT and CSV files, set this parameter to STORED AS TEXTFILE. PARQUET files support the STRUCT data type and nested data types.

Important

Only AnalyticDB for MySQL clusters of V3.1.8.0 or later support PARQUET files of the STRUCT data type.

LOCATION

Yes

The path of the OSS object or directory.

When you specify an OSS directory, we recommend that you comply with the following rules. Otherwise, a query error or data exception may occur.

  • The OSS directory must end with a forward slash (/).

  • All objects in the directory must be stored in the same format.

  • All objects in the directory must use the same quantity, order, and data types of fields.

When you create a partitioned external table, you must set the LOCATION parameter to the upper-level directory of partitions. For example, the path of an OSS object is oss://testBucketname/testfolder/p1=2023-06-13/data.csv. In this case, you must specify LOCATION 'oss://testBucketname/testfolder/' to create a partitioned external table that has the p1 partition key column.

Important
  • When you create a Hudi external table, make sure that the Hudi metadata file named .hoodies exists in the OSS path.

  • Assume that you set the auto.create.location parameter to true. If the OSS path or directory specified by the LOCATION parameter does not exist when you create a partitioned external table, an OSS path or directory is automatically created.

type

No

The type of the Hudi external table. Valid values:

  • COW (default): suitable for scenarios that have high requirements on read efficiency.

  • MOR: suitable for scenarios that have high requirements on write efficiency.

Important

You must specify this parameter only for STORED AS HUDI.

auto.create.location

No

Specifies whether to have an OSS path or directory automatically created. Valid values:

  • true:

  • false (default)

Important

This parameter takes effect only when you create a partitioned external table.

Examples

Example 1: Create a non-partitioned external table

  • Create a non-partitioned external table with the file storage format set to TEXTFILE.

    CREATE EXTERNAL TABLE IF NOT EXISTS adb_external_demo.osstest1
    (id int,
    name string,
    age int,
    city string)
    ROW FORMAT DELIMITED FIELDS TERMINATED BY  ','
    STORED AS TEXTFILE
    LOCATION  'oss://testBucketName/osstest/p1=hangzhou/p2=2023-06-13/data.csv';
  • Create a non-partitioned external table with the file storage format set to HUDI.

    CREATE EXTERNAL TABLE IF NOT EXISTS adb_external_demo.osstest2
    (id int,
    name string,
    age int,
    city string)
    STORED AS HUDI
    LOCATION  'oss://testBucketName/osstest/test'
    TBLPROPERTIES ('type' = 'cow');
    Important

    When you create a Hudi external table, the _hoodie_commit_time, _hoodie_commit_seqno, _hoodie_record_key, _hoodie_partition_path, and _hoodie_file_name columns are automatically created.

  • Create a non-partitioned external table with the file storage format set to PARQUET.

    CREATE EXTERNAL TABLE IF NOT EXISTS adb_external_demo.osstest3
    (
    A STRUCT < var1:string, var2:int >
    ) 
    STORED AS PARQUET 
    LOCATION 'oss://testBucketName/osstest/Parquet';

Example 2: Create a partitioned external table

CREATE EXTERNAL TABLE IF NOT EXISTS adb_external_demo.osstest4
(id int,
name string,
age int,
city string)
PARTITIONED BY (p2 string)
ROW FORMAT DELIMITED FIELDS TERMINATED BY  ','
STORED AS TEXTFILE
LOCATION  'oss://testBucketName/osstest/p1=hangzhou/';

Example 3: Create a multi-level partitioned external table

CREATE EXTERNAL TABLE IF NOT EXISTS adb_external_demo.osstest5
(id int,
name string,
age int,
city string)
PARTITIONED BY (p1 string,p2 string)
ROW FORMAT DELIMITED FIELDS TERMINATED BY  ','
STORED AS TEXTFILE
LOCATION  'oss://testBucketName/osstest/';

ApsaraDB RDS for MySQL external tables

Important
  • When you create an ApsaraDB RDS for MySQL external table, you must enable Elastic Network Interface (ENI) on the Cluster Information page of the AnalyticDB for MySQL console.

  • The ApsaraDB RDS for MySQL instance must reside in the same virtual private cloud (VPC) as the AnalyticDB for MySQL cluster.

Syntax

CREATE EXTERNAL TABLE [IF NOT EXISTS] table_name
(column_name column_type[, ...])
ENGINE='MYSQL'
TABLE_PROPERTIES='{  
	"url":"mysql_vpc_address",  
	"tablename":"mysql_table_name",  
	"username":"mysql_user_name",  
	"password":"mysql_user_password"
	[,"charset":"{gbk|utf8|utf8mb4}"]
  }';

Parameters

Parameter

Required

Description

table_name (column_name column_type[, ...])

Yes

The name and schema of the external table.

The table and column names must comply with the naming conventions. For more information, see the "Naming limits" section of the Limits topic.

ENGINE='MYSQL'

Yes

The storage engine of the external table. To read and write ApsaraDB RDS for MySQL data, set the storage engine to MYSQL.

TABLE_PROPERTIES

Yes

The properties of the external table.

url

Yes

The VPC endpoint and port number of the ApsaraDB RDS for MySQL instance and the name of the database in the instance. For information about how to obtain the VPC endpoint of an ApsaraDB RDS for MySQL instance, see View and change the internal and public endpoints and port numbers of an ApsaraDB RDS for MySQL instance.

tablename

Yes

The name of the ApsaraDB RDS for MySQL external table.

username

Yes

The name of the database account of the ApsaraDB RDS for MySQL instance.

password

Yes

The password of the database account of the ApsaraDB RDS for MySQL instance.

charset

No

The character set that is used by the database in the ApsaraDB RDS for MySQL instance. Valid values:

  • gbk

  • utf8 (default)

  • utf8mb4

Example

CREATE EXTERNAL TABLE IF NOT EXISTS adb_external_demo.mysqltest (
	id int,
	name varchar(1023),
	age int
 ) ENGINE = 'MYSQL'
 TABLE_PROPERTIES = '{
   "url":"jdbc:mysql://rm-bp1gx6h1tyd04****.mysql.rds.aliyuncs.com:3306/test_adb",
   "tablename":"person",
   "username":"testUserName",
   "password":"testUserPassword",
   "charset":"utf8"
}';

ApsaraDB for MongoDB external tables

Important
  • When you create an ApsaraDB for MongoDB external table, you must enable ENI on the Cluster Information page of the AnalyticDB for MySQL console.

  • The ApsaraDB for MongoDB instance must reside in the same VPC as the AnalyticDB for MySQL cluster.

Syntax

CREATE EXTERNAL TABLE [IF NOT EXISTS] table_name
(column_name column_type[, ...])
ENGINE='MONGODB'
TABLE_PROPERTIES = '{
	"mapped_name":"table",
  "location":"location",
  "username":"user",
  "password":"password",
}';

Parameters

Parameter

Required

Description

table_name (column_name column_type[, ...])

Yes

The name and schema of the external table.

The table and column names must comply with the naming conventions. For more information, see the "Naming limits" section of the Limits topic.

ENGINE='MYSQL'

Yes

The storage engine of the external table. To read and write ApsaraDB for MongoDB data, set the storage engine to MONGODB.

TABLE_PROPERTIES

Yes

The properties of the external table.

mapped_name

Yes

The name of the ApsaraDB for MongoDB collection.

location

Yes

The VPC endpoint of the ApsaraDB for MongoDB instance. For information about how to obtain the VPC endpoint, see Overview of instance connections.

username

Yes

The name of the database account of the ApsaraDB for MongoDB instance. For information about how to create a database account, see Manage the permissions of MongoDB database users.

Note

ApsaraDB for MongoDB verifies the specified database account name and password. You must use the database account name that is included in the VPC endpoint of the ApsaraDB for MongoDB instance. If you have any questions, contact technical support.

password

Yes

The password of the database account of the ApsaraDB for MongoDB instance.

Example

CREATE EXTERNAL TABLE adb_external_demo.person (
  id int,
  name string,
  age int
) ENGINE = 'MONGODB' TABLE_PROPERTIES ='{
"mapped_name":"person",
"location":"mongodb://testuser:****@dds-bp113d414bca8****.mongodb.rds.aliyuncs.com:3717,dds-bp113d414bca8****.mongodb.rds.aliyuncs.com:3717/test_mongodb",
"username":"testuser",
"password":"password",
}';

Tablestore external tables

Important

The VPC to which the Tablestore instance is bound must be the same as the VPC in which the AnalyticDB for MySQL cluster resides.

Syntax

CREATE EXTERNAL TABLE [IF NOT EXISTS] table_name
(column_name column_type[, ...])
ENGINE='OTS'
TABLE_PROPERTIES = '{
	"mapped_name":"table_name",
	"location":"tablestore_vpc_address"
}';

Parameters

Parameter

Required

Description

table_name (column_name column_type[, ...])

Yes

The name and schema of the external table. The table and column names must comply with the naming conventions. For more information, see the "Naming limits" section of the Limits topic.

ENGINE='OTS'

Yes

The storage engine of the external table. To read and write Tablestore data, set the storage engine to OTS.

mapped_name

Yes

The name of the table in the Tablestore instance. To view the name of the table, log on to the Tablestore console and go to the Instance Management page.

location

Yes

The VPC endpoint of the Tablestore instance. To view the VPC endpoint of the instance, log on to the Tablestore console and go to the Instance Management page.

Example

CREATE EXTERNAL TABLE IF NOT EXISTS adb_external_demo.otstest (
	id int,
	name string,
	age int
) ENGINE = 'OTS' 
TABLE_PROPERTIES = '{
	"mapped_name":"person",
	"location":"https://w0****la.cn-hangzhou.vpc.tablestore.aliyuncs.com"
}';

MaxCompute external tables

Important
  • The MaxCompute project must reside in the same region as the AnalyticDB for MySQL cluster.

  • For information about how to batch create MaxCompute external tables, see IMPORT FOREIGN SCHEMA.

Syntax

CREATE EXTERNAL TABLE [IF NOT EXISTS] table_name
(column_name column_type[, ...])
ENGINE='ODPS'
TABLE_PROPERTIES='{
	"endpoint":"endpoint",
	"accessid":"accesskey_id",
	"accesskey":"accesskey_secret",
	["partition_column":"partition_column"],
	"project_name":"project_name",
	"table_name":"table_name"
}'; 

Parameters

Parameter

Required

Description

table_name (column_name column_type[, ...])

Yes

The name and schema of the external table. The table schema must contain the partition key column.

table_name specifies the name of the table and column_name specifies the name of the column. The table and column names must comply with the naming conventions. For more information, see the "Naming limits" section of the Limits topic.

column_type specifies the data type of the column. Basic data types and complex data types of MaxCompute such as ARRAY, MAP, and STRUCT are supported.

Note

Only AnalyticDB for MySQL clusters of V3.2.1.0 or later support complex data types of MaxCompute. For more information, see the "Complex data types" section of the MaxCompute V2.0 data type edition topic.

To query the minor version of an AnalyticDB for MySQL Data Lakehouse Edition cluster, execute the SELECT adb_version(); statement. To update the minor version of a cluster, contact technical support.

ENGINE='ODPS'

Yes

The storage engine of the external table. To read and write MaxCompute data, set the storage engine to ODPS.

endpoint

Yes

The endpoint of the MaxCompute project.

Note

You can access MaxCompute only by using VPC endpoints. For more information, see Endpoints.

accessid

Yes

The AccessKey ID of an Alibaba Cloud account or a Resource Access Management (RAM) user that is used to access MaxCompute.

For information about how to obtain an AccessKey ID and an AccessKey secret, see Accounts and permissions.

accesskey

Yes

The AccessKey secret of an Alibaba Cloud account or a RAM user that is used to access MaxCompute.

For information about how to obtain an AccessKey ID and an AccessKey secret, see Accounts and permissions.

partition_column

No

The partition key column. To create a partitioned external table, you must specify this parameter.

project_name

Yes

The name of the MaxCompute project.

table_name

Yes

The name of the MaxCompute table.

Example

CREATE EXTERNAL TABLE IF NOT EXISTS adb_external_demo.mctest (
	id int,
	name varchar(1023),
	age int,
	dt string
) ENGINE='ODPS'
TABLE_PROPERTIES='{
	"accessid":"LTAILd4****",
	"endpoint":"http://service.cn-hangzhou.maxcompute.aliyun.com/api",
	"accesskey":"4A5Q7ZVzcYnWMQPysX****",
	"partition_column":"dt",
	"project_name":"test_adb",
	"table_name":"person"
}';

References