All Products
Search
Document Center

AnalyticDB:Export data to OSS

Last Updated:Jul 07, 2023

This topic describes how to use external tables and the INSERT INTO statement to export data from AnalyticDB for MySQL Data Warehouse Edition (V3.0) to Object Storage Service (OSS). Only CSV and Parquet files can be exported from AnalyticDB for MySQL to OSS.

Prerequisites

  • A directory is created in OSS by performing the following operations to store AnalyticDB for MySQL data:

    1. Activate OSS.

      Note

      Make sure that the involved OSS bucket resides in the same region as the AnalyticDB for MySQL Data Warehouse Edition (V3.0) cluster.

    2. Create a bucket in the OSS console. For more information, see Create buckets.

    3. Create a directory in the bucket. For more information, see Create directories.

      For example, you can create a directory named adb_data/ in OSS to store data exported from AnalyticDB for MySQL Data Warehouse Edition (V3.0).

      新建目录
  • An AnalyticDB for MySQL Data Warehouse Edition (V3.0) cluster, an account, and a database are created. A whitelist is configured. For more information, see Create a cluster.

Note

If the AnalyticDB for MySQL Data Warehouse Edition (V3.0) cluster is in elastic mode, you must log on to the AnalyticDB for MySQL console and view the cluster information, and then turn on Elastic Network Interface (ENI) in the Network Information section.启用ENI网络

Procedure

In this example, data is exported from the source_table table in the adb_demo database of the AnalyticDB for MySQL Data Warehouse Edition (V3.0) cluster to the adb_data directory in OSS.

  1. Connect to the AnalyticDB for MySQL cluster and the database. For more information, see Connect to an AnalyticDB for MySQL cluster.

  2. Create an external table in the adb_demo database. For more information, see the "Create an OSS external table for a non-partitioned object" section of the Use external tables to import data to Data Warehouse Edition topic.

  3. Select statements that can be executed to write data based on the type of the external table, and write the source data to the external table created in Step 2. For more information about the syntax supported for different types of external tables, see the "Syntax used to write data to an external table that is not partitioned" and "Syntax used to write data to an external table that is partitioned" sections of this topic.

  4. After the write task in Step 3 is completed, log on to the OSS console and view the data exported to the directory in OSS. Alternatively, connect to the AnalyticDB for MySQL cluster to query the data exported to the external table.

Syntax used to write data to an external table that is not partitioned

When you execute INSERT INTO or INSERT OVERWRITE SELECT to import data, the statement is synchronously executed by default. If hundreds of gigabytes of data is imported, the connection between the client and the AnalyticDB for MySQL server may be closed, and the data import task fails. In this case, we recommend that you asynchronously execute the INSERT OVERWRITE SELECT statement to import data by adding SUBMIT JOB to the beginning of the statement. Example: SUBMIT JOB INSERT OVERWRITE adb_table SELECT * FROM oss_table;. For more information about how to submit asynchronous tasks, see Asynchronously submit an import task.

Important

By default, the objects imported to OSS do not contain column names. If you want these objects to have column names, add the following hint to the beginning of the data import statement: /*+sql_output_oss_file_head_enable=true*/.

  • INSERT SELECT FROM

    • Feature: If your data is stored in a different table, you can execute the INSERT SELECT FROM statement to copy the data to an external table. You can write data from a source table to an external table, and then export the data to OSS. Each time the data is written, a new OSS object is generated.

      Important

      The columns that you want to export must all be written to the external table. You cannot specify specific columns to be written.

      Syntax:

      INSERT INTO table_name 
      SELECT select_statement FROM from_statement;       

      Example:

      insert into oss_table select col1, col2, col3 from source_table;
  • INSERT OVERWRITE SELECT

    • Feature: You can execute the INSERT OVERWRITE SELECT statement to batch insert data to a table. If data already exists in an external table, all objects in the original external table path are deleted before a new OSS object is generated.

      Important

      The columns that you want to export must all be written to the external table. You cannot specify specific columns to be written.

      Syntax:

      INSERT OVERWRITE table_name 
      select_statement;

      Example:

      INSERT OVERWRITE oss_table 
      SELECT col1, col2, col3 FROM source_table;
  • REPLACE SELECT FROM

    • Feature: OSS external tables do not support primary keys. The write performance of the REPLACE SELECT FROM statement is consistent with that of the INSERT SELECT FROM statement. Data is appended to another table. If data already exists in the destination table, the existing data remains unchanged, and the new data is saved to the new OSS object.

  • Export data in the CSV format to a single OSS object (Parquet format unsupported)

    • Feature: You can specify a single OSS object by using hints and export data to this object. When the overwrite keyword is included, the object that has the same name in the directory defined in the TABLE_PROPERTIES parameter of the external table is overwritten. Other objects in the directory are not affected.

      Important

      The columns that you want to export must all be written to the external table. You cannot specify specific columns to be written.

      Version description:

      • AnalyticDB for MySQL versions earlier than 3.1.2 do not support this feature. Object names are automatically generated by the system, and multiple objects are exported. The system dynamically determines the number of the objects based on how fast concurrent tasks are executed.

      • AnalyticDB for MySQL version 3.1.2 and later allow you to export data from an AnalyticDB for MySQL external table in the CSV format to a single OSS object by using hints. You can specify a name for the object. If data of the external table is exported without hints, multiple objects are exported as in AnalyticDB for MySQL versions earlier than 3.1.2.

      Syntax:

      /*+output_filename=adb.txt*/INSERT [OVERWRITE] table_name 
      SELECT select_statement FROM from_statement;

      Example:

      /*+output_filename=adb.txt*/INSERT [OVERWRITE] oss_table 
      SELECT * FROM source_table;

Syntax used to write data to an external table that is partitioned

When you execute INSERT INTO or INSERT OVERWRITE SELECT to import data, the statement is synchronously executed by default. If hundreds of gigabytes of data is imported, the connection between the client and the AnalyticDB for MySQL server may be closed, and the data import task fails. In this case, we recommend that you asynchronously execute the INSERT OVERWRITE SELECT statement to import data by adding SUBMIT JOB to the beginning of the statement. Example: SUBMIT JOB INSERT OVERWRITE adb_table SELECT * FROM oss_table;. For more information about how to submit asynchronous tasks, see Asynchronously submit an import task.

Important

By default, the objects imported to OSS do not contain column names. If you want these objects to have column names, add the following hint to the beginning of the data import statement: /*+sql_output_oss_file_head_enable=true*/.

When you write data from partitioned tables to objects, the objects do not contain the data of partition key columns. Data of partition key columns is displayed in OSS directories.

Important

Data of a partitioned external table cannot be exported to a single OSS object.

For example, two partition key columns and three common columns are defined for a partitioned table. The partition key column in the hash partition is named pcol1, and its value is 1. The partition key column in the list partition is named pcol2, and its value is a. Data of the partitioned table is exported to the adb_data/ path in OSS. If you write data from the partition whose pcol1 is set to 1 and pcol2 is set to a, the relative path of the object is adb_data/pcol1=1/pcol2=a/. The CSV or Parquet object does not contain the values of the pcol1 and pcol2 columns. Only the values of the three common columns are included.

  • INSERT INTO PARTITION SELECT FROM

Feature: You can execute the INSERT INTO PARTITION SELECT FROM statement to batch insert data to an external table that is partitioned. When you write data, you can specify all partition columns and their values in the PARTITION field. You can also specify partition values only of high-level partitions. The partition values of low-level partitions are dynamically generated. Alternatively, you may not specify the PARTITION field. In this case, partition values of all levels are dynamically generated.

New data is appended to the corresponding partition. Each time the data is written, a new OSS object is generated.

The columns that you want to export must all be written to the external table. You cannot specify specific columns to be written.

Fully static partition

Syntax:

INSERT INTO table_name PARTITION(par1=val1,par2=val2,...) 
select_statement;

Example:

INSERT INTO oss_table_par PARTITION(par1=val1,par2=val2) 
select col1, col2, col3 FROM source_table;

Static-dynamic partition

Syntax:

INSERT INTO table_name PARTITION(par1=val1,par2,...) 
select_statement;

Example:

INSERT INTO oss_table_par PARTITION(par1=val1,par2) 
SELECT col1, col2, col3, par2col FROM source_table;

Fully dynamic partition

Syntax:

INSERT INTO table_name 
select_statement;

Example:

INSERT INTO oss_table_par
SELECT col1, col2, col3, par1col, par2col FROM source_table;
  • INSERT OVERWRITE PARTITION SELECT

Feature: The INSERT OVERWRITE PARTITION SELECT statement is used in the same manner as the INSERT INTO PARTITION SELECT statement. However, when you execute the INSERT OVERWRITE PARTITION SELECT statement, existing objects in the desired partitions are cleared. Objects in partitions that have no new data written are not cleared.

Syntax:

INSERT OVERWRITE table_name PARTITION(par1=val1,par2=val2,...)[IF NOT EXISTS] 
select_statement;

Example:

INSERT OVERWRITE oss_table_par PARTITION(par1=val1,par2=val2) IF NOT EXISTS 
SELECT col1, col2, col3 FROM source_table;
  • REPLACE INTO PARTITION SELECT FROM

Feature: External tables do not support primary keys. The write performance of the REPLACE INTO PARTITION SELECT FROM statement is consistent with that of the INSERT INTO PARTITION SELECT FROM statement.

Unsupported statements

AnalyticDB for MySQL does not allow you to insert data to external tables by row. INSERT INTO VALUES and REPLACE INTO VALUES are not supported.