Hologres: Export data to a data lake

Last Updated: Nov 18, 2024

This topic describes how to use Data Lake Formation (DLF) to write data in a Hologres internal table back to Object Storage Service (OSS) by executing an SQL statement and then query the written data by using an external engine.

Prerequisites

DLF is activated and the environment configuration is complete. Make sure that a Hologres foreign table can be used to read data from OSS. For more information, see the Procedures section of the "Use DLF to read data from and write data to OSS" topic.

Export data to a data lake

If data in a Hologres internal table is updated and you want to process the updated data by using an external engine such as E-MapReduce (EMR), you must first write the data back to OSS. To do so, execute an INSERT statement against a foreign table, which writes the data directly to OSS.

Note

You can export only data in the following formats: ORC, Parquet, CSV, SequenceFile, Hudi, and Paimon.

  1. Write data back to OSS.

    Execute the following SQL statement to write the data in a Hologres internal table back to OSS:

    INSERT INTO <foreign_table_name> (<col_name>, ...) SELECT <col_name>, ... FROM <holo_table_name>;

    The following table describes the parameters in the SQL statement:

    Parameter             Description
    foreign_table_name    The name of the foreign table.
    holo_table_name       The name of the Hologres internal table.
    col_name              The name of a column.

  2. Query the data that is written back to OSS.

    After the data is written back to OSS, you can execute the following SQL statement in the Hive or Spark engine of EMR to query the written data:

    SELECT * FROM <foreign_table_name> WHERE <col_name> = <value>;

    If the query succeeds and returns the expected rows, Hologres has written the data back to OSS and EMR can read the data.
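As an illustration of the two steps above, the following sketch fills in the placeholders with hypothetical names. It assumes an internal table named orders and a foreign table named oss_orders_ext that is already mapped to an OSS path through DLF. The column names are also assumptions. It further assumes that the same table name is visible to Hive or Spark in EMR through the DLF metadata. None of these names come from the product documentation.

```sql
-- Hypothetical names used for illustration only:
--   orders          Hologres internal table
--   oss_orders_ext  foreign table already mapped to OSS through DLF
--   order_id, customer_id, amount  example columns

-- Step 1 (run in Hologres): write the selected columns back to OSS.
-- The SELECT list must match the target column list in number, order,
-- and compatible data types.
INSERT INTO oss_orders_ext (order_id, customer_id, amount)
SELECT order_id, customer_id, amount
FROM orders;

-- Step 2 (run in Hive or Spark on EMR): read back one exported row
-- to confirm that the external engine can see the data.
SELECT * FROM oss_orders_ext WHERE order_id = 1001;
```

Listing the columns explicitly in both the INSERT and the SELECT keeps the mapping between the internal table and the foreign table unambiguous, which helps if the two tables do not define their columns in the same order.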