All Products
Search
Document Center

PolarDB:Export local tables to OSS in serial

Last Updated:Nov 29, 2024

You can use the OSSOUTFILE feature in PolarDB to export local data files as CSV files and store them in Object Storage Service (OSS).

Prerequisites

Your PolarDB cluster runs one of the following database engine versions:

  • PolarDB for MySQL 8.0.2 whose revision version is 8.0.2.2.8 or later.

  • PolarDB for MySQL 8.0.1 whose revision version is 8.0.1.1.30 or later.

For information about how to obtain the database engine version of a cluster, see Query the engine version.

Usage notes

  • You cannot use Data Management (DMS) to connect to a PolarDB cluster and export data files to OSS. Otherwise, the following error message is returned:

    Can not issue executeUpdate() or executeLargeUpdate() for SELECTs

    You can connect to a PolarDB cluster by using a client or the command line and export data files to OSS. For more information, see Connect to a cluster.

  • If you want to back up incremental data to OSS, you can create and configure a backup schedule. For more information, see Create a backup schedule and Configure a backup schedule.

  • We recommend that you perform the export tasks on a read-only node. You can add the /*FORCE_SALVE*/ hint to the SQL statement or directly connect to a read-only node for data export.

Parameters

Parameter

Description

loose_oss_outfile_buffer_size

The memory size allowed for each OSS outfile thread. Valid values: 102400 to 536870912. Default value: 134217728. Unit: bytes. The larger the memory size, the faster the export process.

loose_max_oss_outfile_threads

The number of OSS outfile threads that can be started on a PolarDB cluster at a time. Valid values: 1 to 100. Default value: 1.

You can execute the SHOW STATUS LIKE "Oss_outfile_threads_running"; statement to view the number of OSS outfile threads that are running in the current cluster.

Note

This parameter does not take effect for PolarDB for MySQL 8.0.1.1.38 or later.

Note

When you use the OSSOUTFILE feature to export CSV-formatted data files to OSS, the maximum memory occupied by the cluster is obtained by using the following formula:

loose_max_oss_outfile_threads * loose_oss_outfile_buffer_size

When you use the OSSOUTFILE feature, we recommend that the total memory size for the feature is less than 5% of the memory capacity of the current node. Otherwise, other services on the current node may be affected.

Usage

Syntax

SELECT * FROM table_name INTO OSSOUTFILE 'outfile_path' COLUMNS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"' NULL_MARKER BY 'NULL' LINES TERMINATED BY '\n';
Note
  • COLUMNS TERMINATED BY: the delimiter between fields.

  • OPTIONALLY ENCLOSED BY: the character that encloses fields. After OPTIONALLY is added, the enclosing character is added to string-type fields. If OPTIONALLY is not added, the enclosing character is added to all fields.

  • NULL_MARKER BY: the character indicating that the NULL value is exported as text. The output value of NULL can be defined in the following methods:

    • NULL_MARKER: directly defines the output value of NULL. NULL_MARKER can be defined as any value. This method has the highest priority.

    • ESCAPED BY: Only a single character can be used. The output value of NULL is the character followed by N. For example, ESCAPED BY '\' means that the output value of NULL is \N. This method has the second priority.

    • NULL: If neither of the preceding two methods is defined, the output value of NULL is the string of NULL by default. This method has the lowest priority. NULL does not have ENCLOSED identifiers.

  • LINES TERMINATED BY: the delimiter between rows.

For more information, see MySQL documentation.

Parameters

Parameter

Description

outfile_path

The file directory on OSS. It contains the following parts:

  • The OSS server information, which identifies the current OSS bucket and the basic path. For more information, see Use OSS foreign tables to access OSS data.

  • The path of the current task (optional).

  • The name of the exported file.

The preceding three parts are separated with forward slashes (/). A single path of the current task can contain multiple forward slashes (/), indicating that a multi-level path is used. After a file is uploaded to OSS, the complete path is: the OSS server path + the path of the current task. The name of the exported file must be included in outfile_path.

table_name

The name of the table from which you want to export data to OSS.

Examples

  1. Create an OSS server.

    CREATE SERVER outserver FOREIGN DATA WRAPPER oss OPTIONS ( EXTRA_SERVER_INFO '{"oss_endpoint": "oss-cn-zhangjiakou-internal.aliyuncs.com", "oss_bucket": "polardb", "oss_access_key_id": "*******", "oss_access_key_secret": "********", "oss_prefix":"B_outfile"}');
    Note

    Set oss_endpoint to an internal endpoint in the oss-{xxx}-internal.aliyuncs.com format.

  2. Export the local data file to OSS. In the following sample code, the file that is exported to OSS is oss://polardb/B_outfile/tpch/1t/parallel-lineitem.TXT, the bucket name is polardb, and the path of the file is B_outfile/tpch/1t/parallel-lineitem.TXT.

    SELECT * FROM lineitem INTO OSSOUTFILE 'outserver/tpch/1t/parallel-lineitem.TXT' COLUMNS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"' NULL_MARKER BY 'NULL' LINES TERMINATED BY '\n';

    If the cluster version is PolarDB for MySQL 8.0.1.1.38 or later, you can execute the following statement to view the total memory size and the total number of threads that are occupied by the current export task in real time:

    • View the total memory size occupied by the export task in real time. Unit: bytes.

      SHOW STATUS LIKE "%Oss_outfile_memory_used%";
    • View the total number of threads occupied by the export task in real time.

      SHOW STATUS LIKE "%Oss_outfile_threads_running%";
    Note

    When you connect to a cluster endpoint to execute an export task on a node, you can query information about memory resources used by the export statement only when you execute the SHOW STATUS statement on the same node.

Issues

If the ERROR 1086 (HY000): File 'xxxx' already exists error message is returned, a file with the same name already exists in the current OSS bucket.

You can resolve this issue in the following two ways:

  • Manually delete the existing file on OSS and execute the SELECT INTO OSSOUTFILE statement again.

  • Use the OSS_REPLACE_EXISTED_FILE() hint to forcibly overwrite the existing file. Example:

    SELECT /*+ OSS_REPLACE_EXISTED_FILE() */ * FROM lineitem INTO OSSOUTFILE 'outserver/tpch/1t/parallel-lineitem.TXT' COLUMNS TERMINATED BY '|@|' OPTIONALLY ENCLOSED BY '"' NULL_MARKER BY "" LINES TERMINATED BY '\n';
    Note
    • If you use the preceding hint, the existing file in OSS is forcibly overwritten. If you disable the OSS versioning feature, the overwritten file cannot be retrieved. Proceed with caution. For more information about how to enable the OSS versioning feature, see Overview.

    • To use hints, your PolarDB cluster must use one of the following versions:

      • A PolarDB for MySQL 8.0.2 cluster whose revision version is 8.0.2.2.9 or later.

      • A PolarDB for MySQL 8.0.1 cluster whose revision version is 8.0.1.1.31 or later.