All Products
Search
Document Center

Hologres:Export data to MaxCompute

Last Updated:Aug 22, 2024

This topic describes how to export data to MaxCompute by executing SQL statements in Hologres.

Prerequisites

  • MaxCompute is activated, and a development tool is connected to MaxCompute.

  • A Hologres instance is created, and a development tool is connected to the instance. For more information, see Connect to HoloWeb and run queries.

Limits

  • Only Hologres V0.9 and later allow you to export data to MaxCompute by executing SQL statements. If the version of your Hologres instance is earlier than V0.9, you must manually upgrade your Hologres instance in the Hologres console or join a Hologres DingTalk group to apply for an instance upgrade. For more information about how to manually upgrade a Hologres instance, see the Manual upgrade section of the "Instance upgrades" topic. For more information about how to join a Hologres DingTalk group, see Obtain online support for Hologres.

  • The data types in Hologres are mapped to those in MaxCompute. In versions earlier than Hologres V1.3, Hologres does not support complex data types such as DATE, ARRAY, MAP, and STRUCT. Hologres V1.3 and later allow you to write data of ARRAY and DATE types to MaxCompute. For more information about data type mappings between Hologres and MaxCompute, see Data types.

  • Hologres V1.1 and later allow you to export data to MaxCompute. If the version of your Hologres instance is earlier than V1.1, you must add the following statement before the INSERT statement:

    -- This statement is required in Hologres V0.9 and V0.10.
    set hg_experimental_enable_write_maxcompute = on;
  • You cannot export data to a transactional table in MaxCompute by executing SQL statements in Hologres.

Precautions

When you export data to MaxCompute by executing SQL statements in Hologres, take note of the following items:

  • Hologres allows you to export data to MaxCompute across regions. However, data export performance is better if your Hologres instance and the destination MaxCompute project reside in the same region because the network is more stable.

  • Hologres supports a maximum of one level of partitions but allows you to export data to the subpartitions of MaxCompute tables. To export data to the subpartitions of a MaxCompute table, you must map the partition key values of the MaxCompute table to the fields in the source Hologres table. In addition, Hologres allows you to export data from a partitioned Hologres table to a non-partitioned MaxCompute table.

  • The number of MaxCompute write servers is limited. To obtain better performance, we recommend that you do not export data to MaxCompute during peak hours such as early mornings.

  • If a field of the TIMESTAMPTZ type exists, the valid values of the field range from 1677-09-21 00:00:00 to 2262-04-12 00:00:00.

Procedure

  1. Create an internal table in Hologres to export data to MaxCompute. In this example, an internal table named holo_source is created.

  2. Create a MaxCompute table to receive data. For example, you can create a table named mc_sink.

  3. Create a foreign table in Hologres to map the data of the Hologres internal table to the MaxCompute table. In this example, a foreign table named mapping_foreign_table is created.

  4. Execute the following SQL statements as required in Hologres to export data to MaxCompute:

    Note

    Hologres V2.1.17 and later support the Serverless Computing feature. The Serverless Computing feature is suitable for scenarios in which you want to import a large amount of data offline, run large-scale extract, transform, and load (ETL) jobs, or query a large amount of data from foreign tables. You can use the Serverless Computing feature to perform the preceding operations based on additional serverless computing resources. This can eliminate the need to reserve additional computing resources for the instances. This improves instance stability and reduces the occurrences of out of memory (OOM) errors. You are charged only for the additional serverless computing resources used by tasks. For more information about the Serverless Computing feature, see Overview of Serverless Computing. For more information about how to use the Serverless Computing feature, see User guide on Serverless Computing.

    -- Optional. We recommend that you use the Serverless Computing feature to import a large amount of data offline and run extract, transform, and load (ETL) jobs.
    SET hg_computing_resource = 'serverless';
    
    -- Export the data of specified fields.
    INSERT INTO mapping_foreign_table
    SELECT x,x,x FROM holo_soruce;-- Replace x,x,x with the name of the field that you want to export.
    
    -- Export the data of all fields.
    INSERT INTO mapping_foreign_table
    SELECT * FROM holo_soruce;
    
    -- Reset the configurations. This ensures that serverless computing resources are not used for subsequent SQL statements. 
    RESET hg_computing_resource;

The following sections describe how to synchronize data between non-partitioned tables and between partitioned tables.

Export data from a non-partitioned Hologres table to a non-partitioned MaxCompute table

  1. Create an internal table in Hologres as the data source.

    You can create a Hologres internal table to export data to MaxCompute. The following sample code shows how to create a Hologres internal table:

    BEGIN;
    CREATE  TABLE "public"."bank" (
     "age" int8,
     "job" text,
     "marital" text,
     "education" text,
     "card" text,
     "housing" text,
     "loan" text,
     "contact" text,
     "month" text,
     "day_of_week" text,
     "duration" text,
     "campaign" int8,
     "pdays" float8,
     "previous" float8,
     "poutcome" text,
     "emp_var_rate" float8,
     "cons_price_idx" float8,
     "cons_conf_idx" float8,
     "euribor3m" float8,
     "nr_employed" float8,
     "y" int8
    );
    COMMIT;
  2. Create a MaxCompute table to receive data.

    The following sample code shows how to create a MaxCompute table to receive data. The fields and data types in the MaxCompute table must correspond with those in the Hologres internal table. For more information about the syntax, see Table operations.

    CREATE TABLE IF NOT EXISTS mc_bank
    (
     age             BIGINT COMMENT 'Age',
     job             STRING COMMENT 'Job type',
     marital         STRING COMMENT 'Marital status',
     education       STRING COMMENT 'Education level',
     card            STRING COMMENT 'Credit card available or not',
     housing         STRING COMMENT 'Mortgage',
     loan            STRING COMMENT 'Loan',
     contact         STRING COMMENT 'Contact information',
     month           STRING COMMENT 'Month',
     day_of_week     STRING COMMENT 'Day of the week',
     duration        STRING COMMENT 'Duration',
     campaign        BIGINT COMMENT 'Number of contacts during the campaign',
     pdays           DOUBLE COMMENT 'Time elapsed since the last contact',
     previous        DOUBLE COMMENT 'Number of contacts with the customer',
     poutcome        STRING COMMENT 'Result of the previous marketing campaign',
     emp_var_rate    DOUBLE COMMENT 'Employment change rate',
     cons_price_idx  DOUBLE COMMENT 'Consumer price index',
     cons_conf_idx   DOUBLE COMMENT 'Consumer confidence index',
     euribor3m       DOUBLE COMMENT 'Euro deposit rate',
     nr_employed     DOUBLE COMMENT 'Number of employees',
     y               BIGINT COMMENT 'Time deposit available or not'
    );
  3. Create a foreign table in Hologres to map the data of the Hologres internal table to the MaxCompute table.

    You can create a foreign table in Hologres and use this foreign table to map data to the MaxCompute table. Alternatively, you can execute the Import Foreign Table statement to create a corresponding foreign table in Hologres for the MaxCompute table. The following sample code shows how to create a foreign table in Hologres:

    BEGIN;
    CREATE FOREIGN TABLE "public"."mapping_bank" (
     "age" int8,
     "job" text,
     "marital" text,
     "education" text,
     "card" text,
     "housing" text,
     "loan" text,
     "contact" text,
     "month" text,
     "day_of_week" text,
     "duration" text,
     "campaign" int8,
     "pdays" float8,
     "previous" float8,
     "poutcome" text,
     "emp_var_rate" float8,
     "cons_price_idx" float8,
     "cons_conf_idx" float8,
     "euribor3m" float8,
     "nr_employed" float8,
     "y" int8
    )
    SERVER odps_server
    OPTIONS (project_name 'xxx',table_name 'mc_bank');-- Replace the project_name parameter with the name of the destination MaxCompute project and the table_name parameter with the name of the destination MaxCompute table.
    COMMIT;

    The following table describes the parameters in the preceding syntax.

    Parameter

    Description

    project_name

    The name of the destination MaxCompute project.

    table_name

    The name of the destination MaxCompute table.

  4. Export data from Hologres to MaxCompute.

    Execute the following statements to export data from Hologres to the destination MaxCompute table. You can export the data of specified fields or of all fields. You must specify fields in the same sequence as they are in the Hologres internal table.

    -- Optional. We recommend that you use the Serverless Computing feature to import a large amount of data offline and run ETL jobs.
    SET hg_computing_resource = 'serverless';
    
    -- Export the data of specified fields.
    INSERT INTO mapping_bank 
    SELECT age,job FROM bank;
    
    -- Export the data of all fields.
    INSERT INTO mapping_bank 
    SELECT * FROM bank;
    
    -- Reset the configurations. This ensures that serverless computing resources are not used for subsequent SQL statements. 
    RESET hg_computing_resource;                      

Export data from a partitioned Hologres table to a partitioned MaxCompute table

  1. Create an internal table in Hologres as the data source.

    You can create a Hologres internal table to export data to MaxCompute. The following sample code shows how to create a Hologres internal table:

    BEGIN;
    CREATE TABLE "public"."par_bank" (
     "age" int8,
     "job" text,
     "marital" text,
     "education" text,
     "default" text,
     "housing" text,
     "loan" text,
     "contact" text,
     "month" text,
     "day_of_week" text,
     "duration" text,
     "campaign" int8,
     "pdays" float8,
     "previous" float8,
     "poutcome" text,
     "emp_var_rate" float8,
     "cons_price_idx" float8,
     "cons_conf_idx" float8,
     "euribor3m" float8,
     "nr_employed" float8,
     "y" int8,
     "ds" text
    )
    PARTITION BY list (ds);
    COMMIT;
    
    -- Child tables are required.
    CREATE TABLE "public"."par_bank_20190830" PARTITION OF "public"."par_bank" FOR VALUES IN ('20190830');
    
    CREATE TABLE "public"."par_bank_20190901" PARTITION OF "public"."par_bank" FOR VALUES IN ('20190901');
  2. Create a MaxCompute table to receive data.

    The following sample code shows how to create a MaxCompute table to receive data. The MaxCompute table can contain one or two levels of partitions. The fields and data types in the MaxCompute table must correspond with those in the Hologres internal table. For more information about the syntax, see Table operations.

    -- Case 1: Create a MaxCompute table that contains only one level of partitions.
    CREATE TABLE IF NOT EXISTS mc_par_bank
    (
        age            BIGINT COMMENT 'Age',
        job            STRING COMMENT 'Job type',
        marital        STRING COMMENT 'Marital status',
        education      STRING COMMENT 'Education level',
        default        STRING COMMENT 'Credit card available or not',
        housing        STRING COMMENT 'Mortgage',
        loan           STRING COMMENT 'Loan',
        contact        STRING COMMENT 'Contact information',
        month          STRING COMMENT 'Month',
        day_of_week    STRING COMMENT 'Day of the week',
        duration       STRING COMMENT 'Duration',
        campaign       BIGINT COMMENT 'Number of contacts during the campaign',
        pdays          DOUBLE COMMENT 'Time elapsed since the last contact',
        previous       DOUBLE COMMENT 'Number of contacts with the customer',
        poutcome       STRING COMMENT 'Result of the previous marketing campaign',
        emp_var_rate   DOUBLE COMMENT 'Employment change rate',
        cons_price_idx DOUBLE COMMENT 'Consumer price index',
        cons_conf_idx  DOUBLE COMMENT 'Consumer confidence index',
        euribor3m      DOUBLE COMMENT 'Euro deposit rate',
        nr_employed    DOUBLE COMMENT 'Number of employees',
        y              BIGINT COMMENT 'Time deposit available or not'
    ) 
    PARTITIONED BY
    (
        ds             STRING
    );
    -- Create the required partitions if they do not exist.
    alter table mc_par_bank add if not exists partition (ds='20190830');
    alter table mc_par_bank add if not exists partition (ds='20190901');
    
    
    -- Case 2: Create a MaxCompute table that contains two levels of partitions.
    CREATE TABLE IF NOT EXISTS mc_par_bank_2
    (
        age            BIGINT COMMENT 'Age',
        job            STRING COMMENT 'Job type',
        marital        STRING COMMENT 'Marital status',
        education      STRING COMMENT 'Education level',
        default        STRING COMMENT 'Credit card available or not',
        housing        STRING COMMENT 'Mortgage',
        loan           STRING COMMENT 'Loan',
        contact        STRING COMMENT 'Contact information',
        month          STRING COMMENT 'Month',
        day_of_week    STRING COMMENT 'Day of the week',
        duration       STRING COMMENT 'Duration',
        campaign       BIGINT COMMENT 'Number of contacts during the campaign',
        pdays          DOUBLE COMMENT 'Time elapsed since the last contact',
        previous       DOUBLE COMMENT 'Number of contacts with the customer',
        poutcome       STRING COMMENT 'Result of the previous marketing campaign',
        emp_var_rate   DOUBLE COMMENT 'Employment change rate',
        cons_price_idx DOUBLE COMMENT 'Consumer price index',
        cons_conf_idx  DOUBLE COMMENT 'Consumer confidence index',
        euribor3m      DOUBLE COMMENT 'Euro deposit rate',
        nr_employed    DOUBLE COMMENT 'Number of employees'
    ) 
    PARTITIONED BY
    (
        y              BIGINT, 
        ds             STRING
    );
    
    alter table mc_par_bank_2 add if not exists partition (y='1',ds='20190830');
    alter table mc_par_bank_2 add if not exists partition (y='1',ds='20190901');
  3. Create a foreign table in Hologres to map the data of the Hologres internal table to the MaxCompute table.

    You can create a foreign table in Hologres and use this foreign table to map the data of the Hologres internal table to the MaxCompute table. The following sample code shows how to create a foreign table in Hologres:

    -- Create a foreign table to map the data of the Hologres internal table to the MaxCompute table that contains only one level of partitions.
    BEGIN;
    CREATE FOREIGN TABLE "public"."mapping_par_bank" (
     "age" int8,
     "job" text,
     "marital" text,
     "education" text,
     "default" text,
     "housing" text,
     "loan" text,
     "contact" text,
     "month" text,
     "day_of_week" text,
     "duration" text,
     "campaign" int8,
     "pdays" float8,
     "previous" float8,
     "poutcome" text,
     "emp_var_rate" float8,
     "cons_price_idx" float8,
     "cons_conf_idx" float8,
     "euribor3m" float8,
     "nr_employed" float8,
     "y" int8,
     "ds" text
    )
    SERVER odps_server
    OPTIONS (project_name 'xxx',table_name 'mc_par_bank');
    COMMIT;
    
    
    -- Create a foreign table to map the data of the Hologres internal table to the MaxCompute table that contains two levels of partitions.
    BEGIN;
    CREATE FOREIGN TABLE "public"."mapping_par_bank_2" (
     "age" int8,
     "job" text,
     "marital" text,
     "education" text,
     "default" text,
     "housing" text,
     "loan" text,
     "contact" text,
     "month" text,
     "day_of_week" text,
     "duration" text,
     "campaign" int8,
     "pdays" float8,
     "previous" float8,
     "poutcome" text,
     "emp_var_rate" float8,
     "cons_price_idx" float8,
     "cons_conf_idx" float8,
     "euribor3m" float8,
     "nr_employed" float8,
     "y" int8,
     "ds" text
    )
    SERVER odps_server
    OPTIONS (project_name 'xxx',table_name 'mc_par_bank_2');// Replace the project_name parameter with the name of the destination MaxCompute project and the table_name parameter with the name of the destination MaxCompute table.
    COMMIT;
                            

    The following table describes the parameters in the preceding syntax.

    Parameter

    Description

    project_name

    The name of the destination MaxCompute project.

    table_name

    The name of the destination MaxCompute table.

  4. Export data from Hologres to MaxCompute.

    Execute the following statements to export data from Hologres to the destination MaxCompute table. You can export the data of specified fields or of all fields. You must specify fields in the same sequence as they are in the Hologres internal table.

    -- Optional. We recommend that you use the Serverless Computing feature to import a large amount of data offline and run ETL jobs.
    SET hg_computing_resource = 'serverless';
    
    -- Export data to the MaxCompute table that contains a single level of partitions.
    -- Example 1: Export data from the parent partitioned table in Hologres by using a WHERE clause.
    INSERT INTO mapping_par_bank SELECT * FROM "public"."par_bank" WHERE ds='20190830';
    
    -- Example 2: Export data from the corresponding child table in Hologres.
    insert into mapping_par_bank select * from "public"."par_bank_20190901";
    
    
    -- Export data to the MaxCompute table that contains two levels of partitions.
    -- Example 1: Export data from the parent partitioned table in Hologres by using a WHERE clause.
    INSERT INTO mapping_par_bank_2 SELECT * FROM "public"."par_bank" WHERE y='1' and ds='20190830';
    
    -- Example 2: Export data from a child partitioned table in Hologres.
    INSERT INTO mapping_par_bank_2 SELECT * FROM "public"."par_bank_20190901" where y='1';
    
    -- Reset the configurations. This ensures that serverless computing resources are not used for subsequent SQL statements. 
    RESET hg_computing_resource;

FAQ

  • Problem description

    The error message internal error: odps_writer_xxx,UploadId: xxx, Block 0 close odps writer failed: RequestId=xxx, ErrorCode=FlowExceeded, ErrorMessage= GlobalSlot Quota is returned when I export data to a MaxCompute table.

  • Cause

    When you use SQL statements in Hologres to export data to MaxCompute, the Tunnel SDK service of MaxCompute is used to ensure high-performance data export. For more information, see Tunnel commands. However, when you use the Tunnel SDK service of MaxCompute, the number of concurrent write requests and the data size are limited. For more information, see Limits. During peak hours of the Tunnel SDK service of MaxCompute, if the write task triggers write limits, the preceding error message is returned.

  • Solution

    • Export data to MaxCompute again.

    • If the error persists after you export data to MaxCompute again, execute the following statement to set a rate of concurrent data writes to MaxCompute:

      -- Configure a rate of concurrent data writes to MaxCompute. We recommend that you set the hg_experimental_write_maxcompute_dop parameter to a value from 0 to the shard count.
      set hg_experimental_write_maxcompute_dop =<count>;