All Products
Search
Document Center

AnalyticDB:Use OSS foreign tables to import and export data

Last Updated:Aug 26, 2024

AnalyticDB for PostgreSQL allows you to import and export data between Object Storage Service (OSS) and an AnalyticDB for PostgreSQL table by using OSS foreign tables.

Supported object formats

You can use OSS foreign tables to import data from and export data to only the following data objects:

  • Uncompressed CSV and TEXT objects.

  • GZIP-compressed CSV and TEXT objects.

  • ORC binary objects.

    For information about data type mappings between ORC and AnalyticDB for PostgreSQL, see the "Data type mappings between ORC and AnalyticDB for PostgreSQL" section of the Data type mappings for OSS foreign tables topic.

Create an OSS server, a user mapping to the OSS server, and an OSS foreign table

Before you use OSS foreign tables, you must create an OSS server, a user mapping to the OSS server, and an OSS foreign table.

  • For information about how to create an OSS server, see the "Create an OSS server" section of the Use OSS foreign tables for data lake analysis topic.

  • For information about how to create a user mapping to the OSS server, see the "Create a user mapping to the OSS server" section of the Use OSS foreign tables for data lake analysis topic.

  • For information about how to create an OSS foreign table, see the "Create an OSS foreign table" section of the Use OSS foreign tables for data lake analysis topic.

Import data from OSS to an AnalyticDB for PostgreSQL table

  1. Upload a sample file named example.csv to OSS. For more information, see the "Preparations" section of the Use OSS foreign tables for data lake analysis topic and Upload objects.

    Note
    • To eliminate the encoding process and improve efficiency, we recommend that you use the same data encoding format for data objects and databases. The default database encoding format is UTF-8.

    • All compute nodes of AnalyticDB for PostgreSQL use a polling mechanism to read data objects of OSS in parallel.

    • AnalyticDB for PostgreSQL can read multiple CSV and TEXT objects in parallel. By default, four objects can be read in parallel. To maximize the read efficiency, we recommend that you set the number of objects that can be read in parallel to an integer multiple of the number of compute node cores. The number of compute node cores is the product of the number of compute nodes and the number of cores per compute node. For information about how to split objects, see the "Split large objects" section of the Best practices for performing operations on OSS foreign tables topic.

  2. Connect to an AnalyticDB for PostgreSQL database. For more information, see Client connection.

  3. Create an OSS server.

    CREATE SERVER oss_serv
        FOREIGN DATA WRAPPER oss_fdw
        OPTIONS (
            endpoint 'oss-cn-********.aliyuncs.com',
            bucket 'adb-pg'
      );
  4. Create a user to access the OSS server.

    CREATE USER MAPPING FOR PUBLIC
        SERVER oss_serv
        OPTIONS (
            id 'LTAI5t7Ge***************',
            key 'FikziJd2La*******************'
        );
  5. Create an OSS foreign table named ossexample.

    CREATE FOREIGN TABLE ossexample (
        date text,
        time text,
        open float,
        high float,
        low float,
        volume int
    ) SERVER oss_serv OPTIONS (dir 'oss_adb/', format 'csv');
  6. Execute an INSERT or CREATE TABLE AS statement to import OSS data.

    • INSERT

      1. To execute an INSERT statement to import data, create an AnalyticDB for PostgreSQL table named adbexample that has the same schema as the foreign table.

        CREATE TABLE adbexample (
            date text,
            time text,
            open float,
            high float,
            low float,
            volume int
        ) WITH (APPENDONLY=TRUE, ORIENTATION=COLUMN, COMPRESSTYPE=ZSTD, COMPRESSLEVEL=5);
      2. Execute an INSERT statement to import data of the foreign table ossexample to the AnalyticDB for PostgreSQL table adbexample.

        INSERT INTO adbexample SELECT * FROM ossexample;
    • CREATE TABLE AS

      Execute a CREATE TABLE AS statement to create an AnalyticDB for PostgreSQL table named adbexample and import data from the foreign table ossexample.

      CREATE TABLE adbexample AS SELECT * FROM ossexample DISTRIBUTED BY (volume);

Export data from an AnalyticDB for PostgreSQL table to OSS

  1. Create an OSS foreign table that can be used to export data to a CSV object stored in the tt_csv directory.

    CREATE FOREIGN TABLE foreign_x (i int, j int)
    SERVER oss_serv
    OPTIONS (format 'csv', dir 'tt_csv/');
  2. Execute an INSERT INTO statement to export data from an AnalyticDB for PostgreSQL table to OSS.

    INSERT INTO foreign_x SELECT * FROM local_x;

Naming conventions for exported objects

When data is exported, multiple compute nodes write data to the same directory in parallel. The objects exported by using OSS foreign tables are named in the following format:

{tablename | prefix } _{timestamp}_{random_key}_{seg}{segmentid}_{fileno}.{ext}[.gz]

The following table describes the parameters.

Parameter

Description

{tablename | prefix }

The prefix of the object name. The presentation of the prefix varies based on whether the prefix or dir parameter is used when you create an OSS foreign table.

  • If the prefix parameter is used, the object name is prefixed with the specified parameter value.

  • If the dir parameter is used, the object name is prefixed with the name of the OSS foreign table.

{timestamp}

The timestamp when data is exported. Format: YYYYMMDDHH24MISS.

{random_key}

The random key value.

{seg}{segmentid}

The combination of seg string and the ID of the compute node from which the object is exported. For example, seg1 indicates that the object is exported from compute node 1.

{fileno}

The number of the object segment. The value of this parameter starts from 0.

{ext}

The format of the exported object. Valid values:

  • .csv

  • .txt

  • .orc

[.gz]

Indicates that the exported object is compressed by using GZIP.

Examples:

  • Export data to a GZIP-compressed CSV object and use the dir parameter to specify the object directory.

    CREATE FOREIGN TABLE fdw_t_out_1(a int)
    SERVER oss_serv
    OPTIONS (format 'csv', filetype 'gzip', dir 'test/');

    The object is named in the following format:

    fdw_t_out_1_20200805110207_1718599661_seg-1_0.csv.gz
  • Export data to an ORC object and use the prefix parameter to specify the object path prefix.

    CREATE FOREIGN TABLE fdw_t_out_2(a int)
    SERVER oss_serv
    OPTIONS (format 'orc', prefix 'test/my_orc_test');

    The object is named in the following format:

    my_orc_test_20200924153043_1737154096_seg0_0.orc

References