In Hologres V2.0 and later, you can use the INSERT OVERWRITE statement in a stored procedure to write full data in a table or partition at a time. This topic describes how to use the INSERT OVERWRITE statement in a stored procedure to import data to Hologres. If the version of your Hologres instance is earlier than V2.0, upgrade your instance. For more information, see Instance upgrades. If you do not want to upgrade your instance, you can import data by using a temporary table.
Description
In Hologres V3.0 and later, the capability of the hg_insert_overwrite stored procedure is enhanced. You can execute the INSERT OVERWRITE statement to import data to partitioned parent tables.
In Hologres V2.0.15 and later, you can execute the
set hg_experimental_hg_insert_overwrite_enable_view=on;
statement to configure the Grand Unified Configuration (GUC) parameter. This allows you to import data to a table on which a view depends. You cannot import data to a table on which a materialized view depends.In Hologres V3.0 and later, you can import data to a table on which a view depends without configuring the GUC parameter. You cannot import data to a table on which a materialized view depends.
In versions earlier than V2.0.11, you must manually clear temporary tables if the import fails. In Hologres V2.0.11 and later, the system automatically clears temporary tables if the import fails.
Limits
If you want to import only specific fields, the fields that you want to import must be in the same sequence as and correspond to the fields in the source table.
The hg_insert_overwrite operation requires the table owner permission to create a temporary table. Therefore, only the superuser and the table owner have the permissions to perform the hg_insert_overwrite operation.
Partition key columns of destination tables can be of the INT, TEXT, or VARCHAR type.
In Hologres V3.0 and later, you cannot use the hg_insert_overwrite stored procedure in transactions. Otherwise, an error message is reported.
NoteIn versions earlier than Hologres V3.0, the hg_insert_overwrite stored procedure can be successfully executed in transactions in most cases.
Default behavior changes
In Hologres V3.0 and later, the hg_insert_overwrite stored procedure has the following changes:
In versions earlier than Hologres V3.0, if only two input parameters target_table and sql are available and the destination table is a partitioned parent table, an error message is reported. In Hologres V3.0 and later, if the partitioned child tables returned by select_query already exist in this situation, data writes are successful. If the partitioned child tables returned by select_query do not exist in this situation, an error message is reported.
In Hologres V3.0 and later, if the hg_insert_overwrite stored procedure is canceled during execution, you must execute the following SQL statement to clear temporary tables. In versions earlier than V3.0, you do not need to clear temporary tables in this situation.
--- Delete temporary tables that were created by the system before the time specified by before_time. CALL hg_clean_insert_overwrite_tmp_tables(before_time::timestamptz);
Use a stored procedure to implement the INSERT OVERWRITE feature
Syntax
-- Syntax of the hg_insert_overwrite stored procedure in versions earlier than Hologres V3.0
CALL hg_insert_overwrite('<target_table>' regclass, ['<partition_value>' text], '<sql>' text);
-- Syntax of the hg_insert_overwrite stored procedure in Hologres V3.0 and later
CALL hg_insert_overwrite('<target_table>' regclass, ['<partition_value>' array], '<sql>' text, ['<auto_create_partition>' bool]);
Parameters
In Hologres V3.0 and later, the data type of the partition_value parameter in the hg_insert_overwrite stored procedure is changed to ARRAY. This allows data to be written to the partitioned parent table and multiple partitioned child tables. You can still set the partition_value parameter to a value of the TEXT type. In this case, you can write data only to a partitioned child table.
Parameter | Description |
target_table | The name of the Hologres internal table to which you want to import data. The table must already exist. |
partition_value | The value of the partition key column in the partitioned table.
|
sql | The standard SELECT statement that is used to query data in a MaxCompute or Hologres table. You must make sure that the value of the partition field specified in the SELECT statement is the same as the value of the
|
auto_create_partition | Specifies whether to automatically create partitions. Only Hologres V3.0 and later support this parameter. Valid values:
|
In Hologres V3.0 and later, if you set the target_table
parameter to a partitioned parent table in the INSERT OVERWRITE statement, the system performs different operations based on different parameter settings.
The following table describes the operations that the system performs based on different parameter settings when the
partition_value
parameter is not specified.auto_create_partition
valueDescription
TRUE
The system overwrites data in all partitions that are returned by the SELECT statement in the table specified by
target_table
. The SELECT statement is specified bysql
. If a partition does not exist, the system automatically creates the partition.In the table specified by
target_table
, the system does not overwrite data in the partitions that are not included in the returned result of the SELECT statement. The SELECT statement is specified bysql
.
FALSE
If all partitions that are returned by the SELECT statement exist in the table specified by
target_table
, the following rules apply. The SELECT statement is specified bysql
.The system overwrites data in all partitions that are returned by the SELECT statement in the table specified by
target_table
.In the table specified by
target_table
, the system does not overwrite data in the partitions that are not included in the returned result of the SELECT statement.
If a partition that is returned by the SELECT statement does not exist in the table specified by
target_table
, the system returns an error message and does not overwrite data in any partition. The SELECT statement is specified bysql
.
The following table describes the operations that the system performs based on different parameter settings when the
partition_value
parameter is specified.auto_create_partition
valueDescription
TRUE
For partitions that are specified by
partition_value
in the table specified bytarget_table
, the following rules apply:For partitions that do not exist, the system automatically creates the partitions.
For partitions that are returned by the SELECT statement, the system overwrites data in the partitions. The SELECT statement is specified by
sql
.For partitions that are not included in the returned result of the SELECT statement, the system clears the partitions. The SELECT statement is specified by
sql
.
For partitions that are not specified by
partition_value
in the table specified bytarget_table
, the following rules apply:If the execution result of the SELECT statement contains the partitions, the system performs no operation. The SELECT statement is specified by
sql
.If the execution result of the SELECT statement does not contain the partitions, the system performs no operation. The SELECT statement is specified by
sql
.
FALSE
For partitions that are specified by
partition_value
in the table specified bytarget_table
, the following rules apply:If a partition does not exist, the system returns an error message and does not overwrite data in any partition.
For partitions that are returned by the SELECT statement, the system overwrites data in the partitions. The SELECT statement is specified by
sql
.For partitions that are not included in the returned result of the SELECT statement, the system clears the partitions. The SELECT statement is specified by
sql
.
For partitions that are not specified by
partition_value
in the table specified bytarget_table
, the following rules apply:If the execution result of the SELECT statement contains the partitions, the system performs no operation. The SELECT statement is specified by
sql
.If the execution result of the SELECT statement does not contain the partitions, the system performs no operation. The SELECT statement is specified by
sql
.
Examples
Example 1: Use a stored procedure to import data from a Hologres internal table to a Hologres non-partitioned table
-- Create a table named tablea as the destination table.
BEGIN;
CREATE TABLE public.tablea (
cid integer NOT NULL,
cname text,
code integer
,PRIMARY KEY (cid)
);
CALL set_table_property('public.tablea', 'orientation', 'column');
CALL set_table_property('public.tablea', 'storage_format', 'orc');
CALL set_table_property('public.tablea', 'bitmap_columns', 'cname');
CALL set_table_property('public.tablea', 'dictionary_encoding_columns', 'cname:auto');
CALL set_table_property('public.tablea', 'distribution_key', 'cid');
CALL set_table_property('public.tablea', 'time_to_live_in_seconds', '3153600000');
COMMIT;
-- Create a table named tableb as the source table.
CREATE TABLE public.tableb (
cid integer NOT NULL,
cname text,
code integer
,PRIMARY KEY (cid)
);
INSERT INTO public.tableb VALUES(1,'aaa',10001),(2,'bbb','10002');
-- Call the hg_insert_overwrite stored procedure to import data from tableb to tablea.
CALL hg_insert_overwrite('public.tablea' , 'SELECT * FROM public.tableb');
Example 2: Use a stored procedure to import data from a Hologres internal table to a Hologres partitioned table
-- Create a table named tableA as the destination table.
BEGIN;
CREATE TABLE public.tableA(
a text ,
b int,
c timestamp,
d text,
ds text,
PRIMARY key(ds,b)
)
PARTITION BY LIST(ds);
CALL set_table_property('public.tableA', 'orientation', 'column');
CREATE TABLE public.holo_child_1 PARTITION OF public.tableA FOR VALUES IN('20201215');
CREATE TABLE public.holo_child_2 PARTITION OF public.tableA FOR VALUES IN('20201216');
CREATE TABLE public.holo_child_3 PARTITION OF public.tableA FOR VALUES IN('20201217');
COMMIT;
-- Create a table named tableB as the source table.
BEGIN;
CREATE TABLE public.tableB(
a text ,
b int,
c timestamp,
d text,
ds text,
PRIMARY key(ds,b)
)
PARTITION BY LIST(ds);
CALL set_table_property('public.tableB', 'orientation', 'column');
CREATE TABLE public.holo_child_3a PARTITION OF public.tableB FOR VALUES IN('20201215');
CREATE TABLE public.holo_child_3b PARTITION OF public.tableB FOR VALUES IN('20201216');
CREATE TABLE public.holo_child_3c PARTITION OF public.tableB FOR VALUES IN('20201217');
COMMIT;
INSERT INTO public.holo_child_3a VALUES('a',1,'2034-10-19','a','20201215');
INSERT INTO public.holo_child_3b VALUES('b',2,'2034-10-20','b','20201216');
INSERT INTO public.holo_child_3c VALUES('c',3,'2034-10-21','c','20201217');
-- Call the hg_insert_overwrite stored procedure to import data from tableB to tableA.
CALL hg_insert_overwrite('public.tableA' , '20201215',$$SELECT * FROM public.tableB WHERE ds='20201215'$$);
Example 3: Use a stored procedure to import data from a MaxCompute non-partitioned table to Hologres
-- Create a non-partition table in MaxCompute. In this example, the customer table in the MaxCompute public dataset named public_data is used. The following sample code shows the data definition language (DDL) statement that is used to create the customer table:
CREATE TABLE IF NOT EXISTS public_data.customer(
c_customer_sk BIGINT,
c_customer_id STRING,
c_current_cdemo_sk BIGINT,
c_current_hdemo_sk BIGINT,
c_current_addr_sk BIGINT,
c_first_shipto_date_sk BIGINT,
c_first_sales_date_sk BIGINT,
c_salutation STRING,
c_first_name STRING,
c_last_name STRING,
c_preferred_cust_flag STRING,
c_birth_day BIGINT,
c_birth_month BIGINT,
c_birth_year BIGINT,
c_birth_country STRING,
c_login STRING,
c_email_address STRING,
c_last_review_date STRING,
useless STRING);
-- Create a foreign table in Hologres for mapping to the source table in MaxCompute.
CREATE FOREIGN TABLE customer (
"c_customer_sk" int8,
"c_customer_id" text,
"c_current_cdemo_sk" int8,
"c_current_hdemo_sk" int8,
"c_current_addr_sk" int8,
"c_first_shipto_date_sk" int8,
"c_first_sales_date_sk" int8,
"c_salutation" text,
"c_first_name" text,
"c_last_name" text,
"c_preferred_cust_flag" text,
"c_birth_day" int8,
"c_birth_month" int8,
"c_birth_year" int8,
"c_birth_country" text,
"c_login" text,
"c_email_address" text,
"c_last_review_date" text,
"useless" text
)
SERVER odps_server
OPTIONS (project_name 'public_data', table_name 'customer');
-- Create an internal table in Hologres to receive data from the MaxCompute source table. In this example, a column-oriented table is created.
BEGIN;
CREATE TABLE public.holo_customer (
"c_customer_sk" int8,
"c_customer_id" text,
"c_current_cdemo_sk" int8,
"c_current_hdemo_sk" int8,
"c_current_addr_sk" int8,
"c_first_shipto_date_sk" int8,
"c_first_sales_date_sk" int8,
"c_salutation" text,
"c_first_name" text,
"c_last_name" text,
"c_preferred_cust_flag" text,
"c_birth_day" int8,
"c_birth_month" int8,
"c_birth_year" int8,
"c_birth_country" text,
"c_login" text,
"c_email_address" text,
"c_last_review_date" text,
"useless" text
);
COMMIT;
-- Import data to Hologres.
IMPORT FOREIGN SCHEMA <project_name> LIMIT TO
(customer) FROM server odps_server INTO PUBLIC options(if_table_exist 'update');-- Update the metadata in the foreign table.
SELECT pg_sleep(30);-- Wait for a period of time before you import data to Hologres. Otherwise, the import operation may fail due to the latency that is caused by the update of metadata.
CALL hg_insert_overwrite('holo_customer', 'SELECT * FROM customer where c_birth_year > 1980');
-- Query the data of the MaxCompute source table in Hologres.
SELECT * FROM holo_customer limit 10;
Example 4: Use a stored procedure to import data from a MaxCompute partitioned table to Hologres
-- Create a partitioned table in MaxCompute.
DROP TABLE IF EXISTS odps_sale_detail;
CREATE TABLE IF NOT EXISTS odps_sale_detail
(
shop_name STRING
,customer_id STRING
,total_price DOUBLE
)
PARTITIONED BY
(
sale_date STRING
)
;
-- Add the 20210815 partition to the partitioned table.
ALTER TABLE odps_sale_detail ADD IF NOT EXISTS PARTITION(sale_date='20210815')
;
-- Insert data into the 20210815 partition.
INSERT OVERWRITE TABLE odps_sale_detail PARTITION(sale_date='20210815') VALUES
('s1','c1',100.1),
('s2','c2',100.2),
('s3','c3',100.3)
;
-- Create a foreign table in Hologres for mapping to the table in MaxCompute.
DROP FOREIGN TABLE IF EXISTS odps_sale_detail;
-- Create a foreign table.
IMPORT FOREIGN SCHEMA <maxcompute_project> LIMIT TO
(
odps_sale_detail
)
FROM SERVER odps_server INTO public
OPTIONS(if_table_exist 'error',if_unsupported_type 'error');
-- Create an internal table in Hologres to receive data from the MaxCompute table.
DROP TABLE IF EXISTS holo_sale_detail;
-- Create an internal partitioned table in Hologres.
BEGIN ;
CREATE TABLE IF NOT EXISTS holo_sale_detail
(
shop_name TEXT
,customer_id TEXT
,total_price FLOAT8
,sale_date TEXT
)
PARTITION BY LIST(sale_date);
COMMIT;
-- Import data to Hologres.
CALL hg_insert_overwrite('holo_sale_detail', '20210815', $$SELECT * FROM public.odps_sale_detail WHERE sale_date='20210815'$$);
-- Query the data of the MaxCompute table in Hologres.
SELECT * FROM holo_sale_detail;
Example 5: Use a stored procedure to import data from a MaxCompute partitioned table to a Hologres partitioned parent table
-- Create a partitioned table in MaxCompute.
DROP TABLE IF EXISTS odps_sale_detail;
CREATE TABLE IF NOT EXISTS odps_sale_detail
(
shop_name STRING
,customer_id STRING
,total_price DOUBLE
)
PARTITIONED BY
(
sale_date STRING
)
;
-- Add the 20210815 and 20210816 partitions to the partitioned table.
ALTER TABLE odps_sale_detail ADD IF NOT EXISTS PARTITION(sale_date='20210815')
;
ALTER TABLE odps_sale_detail ADD IF NOT EXISTS PARTITION(sale_date='20210816')
;
-- Insert data into the partitions.
INSERT OVERWRITE TABLE odps_sale_detail PARTITION(sale_date='20210815') VALUES
('s1','c1',100.1),
('s2','c2',100.2),
('s3','c3',100.3)
;
INSERT OVERWRITE TABLE odps_sale_detail PARTITION(sale_date='20210816') VALUES
('s1','c1',100.1),
('s2','c2',100.2),
('s3','c3',100.3)
;
-- Create a foreign table in Hologres for mapping to the table in MaxCompute.
DROP FOREIGN TABLE IF EXISTS odps_sale_detail;
-- Create a foreign table.
IMPORT FOREIGN SCHEMA <maxcompute_project> LIMIT TO
(
odps_sale_detail
)
FROM SERVER odps_server INTO public
OPTIONS(if_table_exist 'error',if_unsupported_type 'error');
-- Create an internal table in Hologres to receive data from the MaxCompute table.
DROP TABLE IF EXISTS holo_sale_detail;
-- Create an internal partitioned table in Hologres.
BEGIN ;
CREATE TABLE IF NOT EXISTS holo_sale_detail
(
shop_name TEXT
,customer_id TEXT
,total_price FLOAT8
,sale_date TEXT
)
PARTITION BY LIST(sale_date);
COMMIT;
-- Import data to Hologres. No partition is specified, and the auto_create_partition parameter is set to TRUE. In this case, the system automatically creates two partitions and imports data to the partitions.
CALL hg_insert_overwrite ('holo_sale_detail', $$SELECT * FROM public.odps_sale_detail$$, TRUE);
-- Query data in Hologres.
SELECT * FROM holo_sale_detail;
Parameter description:
maxcompute_project: the name of the project to which the MaxCompute partitioned table belongs.
Use a temporary table to implement the INSERT OVERWRITE feature
Syntax
You can execute the following SQL statements to import data by using a temporary table.
BEGIN ;
-- Drop the existing temporary table.
DROP TABLE IF EXISTS <table_new>;
-- Create a temporary table.
SET hg_experimental_enable_create_table_like_properties=on;
CALL HG_CREATE_TABLE_LIKE ('<table_new>', 'select * from <table>');
COMMIT ;
-- Import data to the temporary table.
INSERT INTO <table_new> [( <column> [, ...] )]
VALUES ( {<expression>} [, ...] )
[, ...] | <query>}
ANALYZE <table_new>;
BEGIN ;
-- Drop the original table.
DROP TABLE IF EXISTS <table>;
-- Rename the temporary table.
ALTER TABLE <table_new> RENAME TO <table>;
COMMIT ;
Parameters
Parameter | Description |
table_new | The name of the temporary table that you want to create. You can specify the table name in the |
table | The name of the original table. You can specify the table name in the |
DDL statements used to create a temporary table | You can use one of the following methods to create a temporary table:
|
Examples
Example 1: Import data from MaxCompute to a non-partitioned table in Hologres
You can use the method described in this topic if you need to import data from a result table of offline processing in MaxCompute to an online service table in Hologres and overwrite all the data of the online service table. In this example, data is imported from the MaxCompute table named odps_region_10g to the region table in Hologres, and the imported data overwrites all the data of the region table in Hologres.
BEGIN ;
-- Drop the existing temporary table.
DROP TABLE IF EXISTS public.region_new;
-- Create a temporary table.
SET hg_experimental_enable_create_table_like_properties=on;
CALL HG_CREATE_TABLE_LIKE ('public.region_new', 'select * from public.region');
COMMIT ;
-- Insert data into the temporary table.
INSERT INTO public.region_new
SELECT *
FROM public.odps_region_10g;
ANALYZE public.region_new;
BEGIN ;
-- Drop the original table.
DROP TABLE IF EXISTS public.region;
-- Rename the temporary table.
ALTER TABLE IF EXISTS public.region_new RENAME TO region;
COMMIT ;
Example 2: Import data from MaxCompute to a partitioned table in Hologres
You can use the method described in this topic if you need to import the daily updated data from a partitioned table in MaxCompute to a partitioned table in Hologres and overwrite all the data of the specified partition in the partitioned Hologres table. This way, you can use offline data to correct real-time data. In this example, data is imported from the MaxCompute table named odps_lineitem_10g to the lineitem table in Hologres, and the imported data overwrites all the data of the specified partition in the lineitem table in Hologres. Both tables are partitioned by the day based on the ds field.
BEGIN ;
-- Drop the existing temporary table.
DROP TABLE IF EXISTS public.lineitem_new_20210101;
-- Create a temporary table.
SET hg_experimental_enable_create_table_like_properties=on;
CALL HG_CREATE_TABLE_LIKE ('public.lineitem_new_20210101', 'select * from public.lineitem');
COMMIT ;
-- Insert data into the temporary table.
INSERT INTO public.lineitem_new_20210101
SELECT *
FROM public.odps_lineitem_10g
WHERE DS = '20210101'
ANALYZE public.lineitem_new_20210101;
BEGIN ;
-- Drop the original partition.
DROP TABLE IF EXISTS public.lineitem_20210101;
-- Rename the temporary table.
ALTER TABLE public.lineitem_new_20210101 RENAME TO lineitem_20210101;
-- Attach the temporary table to the partitioned Hologres table.
ALTER TABLE public.lineitem ATTACH PARTITION lineitem_20210101 FOR VALUES IN ('20210101');
COMMIT ;
Example 3: Import data from Hologres to a non-partitioned table in MaxCompute
You can create a temporary table if you need to import data from Hologres to a non-partitioned table in MaxCompute. After data is imported, you need to rename the temporary table as the name of a regular table. In this example, data is imported from the Hologres table named holotable into the MaxCompute table named mc_holotable, and the imported data overwrites all the data of the MaxCompute table named mc_holotable.
-- Create a temporary table of the destination table in MaxCompute.
CREATE TABLE if not exists mc_holotable_temp(
age int,
job string,
name string
);
-- Create the mapping to the temporary table in Hologres.
CREATE FOREIGN TABLE "public"."mapping_holotable_temp" (
"age" int,
"job" text,
"name" text
)
SERVER odps_server
OPTIONS (project_name 'DLF_test',table_name 'mc_holotable_temp');
-- Update the original table in Hologres.
UPDATE holotable SET "job" = 'president' WHERE "name" = 'Lily';
-- Write the updated data to the mapping table for the temporary table.
INSERT INTO mapping_holotable_temp SELECT * FROM holotable;
-- Drop the original destination table in MaxCompute.
DROP TABLE IF EXISTS mc_holotable;
-- Rename the temporary table as the name of the destination table.
ALTER TABLE mc_holotable_temp RENAME TO mc_holotable;
You can import some of the fields or all of the fields:
Export part of the fields.
INSERT INTO mapping_holotable_temp SELECT x,x,x FROM holotable; -- Replace x,x,x with the names of the fields whose data you want to export.
Export all the fields.
INSERT INTO mapping_holotable_temp SELECT * FROM holotable;