Hologres allows you to create foreign tables to accelerate queries on MaxCompute data. This way, you can directly access and analyze data stored in MaxCompute by using foreign tables in Hologres. This improves query efficiency and simplifies data processing.
Prerequisites
A Hologres instance is purchased. For more information, see Purchase a Hologres instance.
MaxCompute is activated. A MaxCompute project is created. For more information about how to activate MaxCompute, see Activate MaxCompute.
You are granted the permissions to access the MaxCompute project and tables. For more information, see Manage user permissions by using commands.
Usage notes
When you use a foreign table in Hologres to accelerate queries on MaxCompute data, take note of the following items:
You must make sure that the foreign table that is created in Hologres is in the same region as the MaxCompute table that the foreign table is sourced from. Otherwise, the stability of data queries cannot be ensured because of potential network errors.
You can accelerate queries on data only in internal tables in MaxCompute but not external tables or views in MaxCompute.
Partition fields in MaxCompute tables are mapped to regular fields in Hologres foreign tables. You can use values of partition fields as filter conditions to query data.
The amount of data to be scanned in a query cannot exceed 200 GB, and the number of partitions to be queried at a time cannot exceed 512. However, if you import MaxCompute data to Hologres internal tables before you query the data, the amount of data to be scanned and the number of partitions to be queried are not limited.
You cannot query data of the MAP or STRUCT type from MaxCompute tables.
Range-clustered tables in MaxCompute that use columns of the DATETIME, TIMESTAMP, or DECIMAL type as clustering key columns are not supported.
You can only query MaxCompute data that uses the standard storage but not MaxCompute data that uses the infrequent access (IA) storage or long-term storage.
Data type mappings
Data types that are supported by MaxCompute and those supported by Hologres have a one-to-one mapping relationship. You can create tables based on the data type mappings. For more information, see the Data type mappings between MaxCompute and Hologres section of the "Data types" topic.
Description
You can use one of the following methods to accelerate queries on MaxCompute data:
Method 1: Use the CREATE FOREIGN TABLE statement: This method is suitable for scenarios in which you need to accelerate queries on a small number of MaxCompute tables or only a specific number of columns and the schemas of MaxCompute tables are not frequently changed.
Method 2: Use the IMPORT FOREIGN SCHEMA statement: This method is suitable for scenarios in which you want to map all tables in a database or schema of an external data source to a schema in Hologres.
Method 3: Use the Auto Load feature: This method is suitable for scenarios in which you need to accelerate queries on a large number of MaxCompute tables or the schemas of MaxCompute tables are frequently changed. In this case, you can enable the Auto Load feature to automatically create foreign tables for specific or all MaxCompute tables based on your business requirements.
Method 1: Use the CREATE FOREIGN TABLE
statement to accelerate queries on MaxCompute data
Hologres allows you to use the CREATE FOREIGN TABLE
statement to create MaxCompute foreign tables. In this statement, you can customize table names, columns, and comments. This section describes how to use the CREATE FOREIGN TABLE
statement in Hologres to create foreign tables to accelerate queries on MaxCompute non-partitioned tables and partitioned tables.
You can also create foreign tables in the HoloWeb console. For more information, see Create a foreign table mapped to MaxCompute.
Example 1: Query data from a MaxCompute non-partitioned table
Create a non-partitioned table in MaxCompute.
Create a non-partitioned table in MaxCompute and import data to the table. In this example, the
customer
table in the MaxCompute public dataset namedBIGDATA_PUBLIC_DATASET.tpcds_10t
is used. The following sample code shows the DDL statement that is used to create the table and the SELECT statement that is used to query data from the table:-- The DDL statement that is used to create the table. CREATE TABLE IF NOT EXISTS public_data.customer( c_customer_sk BIGINT, c_customer_id STRING, c_current_cdemo_sk BIGINT, c_current_hdemo_sk BIGINT, c_current_addr_sk BIGINT, c_first_shipto_date_sk BIGINT, c_first_sales_date_sk BIGINT, c_salutation STRING, c_first_name STRING, c_last_name STRING, c_preferred_cust_flag STRING, c_birth_day BIGINT, c_birth_month BIGINT, c_birth_year BIGINT, c_birth_country STRING, c_login STRING, c_email_address STRING, c_last_review_date_sk STRING); -- The SELECT statement that is used to query the table. SELECT * FROM BIGDATA_PUBLIC_DATASET.tpcds_10t.customer;
The following figure shows part of the data in the customer table.
Create a foreign table in Hologres.
Create a foreign table in Hologres to map the table in MaxCompute. Sample statements:
SET hg_enable_convert_type_for_foreign_table = true; CREATE FOREIGN TABLE customer ( "c_customer_sk" int8, "c_customer_id" text, "c_current_cdemo_sk" int8, "c_current_hdemo_sk" int8, "c_current_addr_sk" int8, "c_first_shipto_date_sk" int8, "c_first_sales_date_sk" int8, "c_salutation" text, "c_first_name" text, "c_last_name" text, "c_preferred_cust_flag" text, "c_birth_day" int8, "c_birth_month" int8, "c_birth_year" int8, "c_birth_country" text, "c_login" text, "c_email_address" text, "c_last_review_date_sk" text) SERVER odps_server OPTIONS (project_name 'BIGDATA_PUBLIC_DATASET.tpcds_10t', table_name 'customer');
The following table describes the parameters in the preceding statements.
Parameter
Description
SERVER
The server on which you want to create the foreign table.
You can use the odps_server server that is created at the underlying layer of Hologres. For more information, see postgres_fdw.
project_name
If the MaxCompute project uses the three-layer model, set the project_name parameter to a combination of the MaxCompute project name and schema name in the
odps_project_name#odps_schema_name
format.
If the MaxCompute project uses the two-layer model, set the project_name parameter to the MaxCompute project name.
For more information about the three-layer model, see Schema-related operations.
table_name
The name of the MaxCompute table that you want to query.
Accelerate queries on the MaxCompute table by using the foreign table in Hologres.
After you create a foreign table, you can directly query data from the MaxCompute table by using the foreign table. Sample statement:
SELECT * FROM customer LIMIT 10;
Example 2: Query data from a MaxCompute partitioned table
Create a partitioned table in MaxCompute.
Create a partitioned table in MaxCompute and import data to the table. In this example, the
ods_enterprise_share_trade_h
table in the MaxCompute public dataset namedBIGDATA_PUBLIC_DATASET.finance
is used. The following sample code shows the DDL statement that is used to create the table and the SELECT statement that is used to query data from the table:-- The DDL statement that is used to create the table. CREATE TABLE IF NOT EXISTS public_data.ods_enterprise_share_trade_h( code STRING COMMENT 'Code' name STRING COMMENT 'Name', industry STRING COMMENT 'Industry', area STRING COMMENT 'Country/region', pe STRING COMMENT 'Price-earnings ratio', outstanding STRING COMMENT 'Outstanding', totals STRING COMMENT 'Total shares outstanding (ten thousand)', totalassets STRING COMMENT 'Total assets (ten thousand)', liquidassets STRING COMMENT 'Liquid assets', fixedassets STRING COMMENT 'Fixed assets', reserved STRING COMMENT 'Reserved', reservedpershare STRING COMMENT 'Reserved per share', eps STRING COMMENT 'earnings per share', bvps STRING COMMENT 'book value per share', pb STRING COMMENT 'Price-to-book ratio', timetomarket STRING COMMENT 'Time to market', undp STRING COMMENT 'Undistributed profits', perundp STRING COMMENT 'Undistributed profits per share', rev STRING COMMENT 'Year-over-year revenue (%)', profit STRING COMMENT 'Year-over-year profit (%)', gpr STRING COMMENT 'Gross profit margin (%)', npr STRING COMMENT 'Net profit ratio (%)', holders_num STRING COMMENT 'Holders') PARTITIONED BY (ds STRING) STORED AS ALIORC TBLPROPERTIES ('comment'='date when data is imported'); -- Query data from a specific partition in the MaxCompute partitioned table. SELECT * FROM BIGDATA_PUBLIC_DATASET.finance.ods_enterprise_share_trade_h WHERE ds = '20170113';
The following figure shows some data in the table.
Create a foreign table in Hologres.
Create a foreign table in Hologres to map the table in MaxCompute. Sample statement:
CREATE FOREIGN TABLE public.foreign_ods_enterprise_share_trade_h ( "code" text, "name" text, "industry" text, "area" text, "pe" text, "outstanding" text, "totals" text, "totalassets" text, "liquidassets" text, "fixedassets" text, "reserved" text, "reservedpershare" text, "eps" text, "bvps" text, "pb" text, "timetomarket" text, "undp" text, "perundp" text, "rev" text, "profit" text, "gpr" text, "npr" text, "holders_num" text, "ds" text ) SERVER odps_server OPTIONS (project_name 'BIGDATA_PUBLIC_DATASET#finance', table_name 'ods_enterprise_share_trade_h'); comment on foreign table public.foreign_ods_enterprise_share_trade_h is 'Stock transaction history'; comment on column public.foreign_ods_enterprise_share_trade_h."code" is 'Stock symbol'; comment on column public.foreign_ods_enterprise_share_trade_h."name" is 'Name'; comment on column public.foreign_ods_enterprise_share_trade_h."industry" is 'Industry'; comment on column public.foreign_ods_enterprise_share_trade_h."area" is 'Country/region'; comment on column public.foreign_ods_enterprise_share_trade_h."pe" is 'Price-earnings ratio'; comment on column public.foreign_ods_enterprise_share_trade_h."outstanding" is 'Outstanding'; comment on column public.foreign_ods_enterprise_share_trade_h."totals" is 'Total shares outstanding (ten thousand)'; comment on column public.foreign_ods_enterprise_share_trade_h."totalassets" is 'Total assets (ten thousand)'; comment on column public.foreign_ods_enterprise_share_trade_h."liquidassets" is 'Liquid assets'; comment on column public.foreign_ods_enterprise_share_trade_h."fixedassets" is 'Fixed assets'; comment on column public.foreign_ods_enterprise_share_trade_h."reserved" is 'Reserved'; comment on column public.foreign_ods_enterprise_share_trade_h."reservedpershare" is 'Reserved per share'; comment on column public.foreign_ods_enterprise_share_trade_h."eps" is 'Earnings per share'; comment on column public.foreign_ods_enterprise_share_trade_h."bvps" is 'Book value per share'; comment on column public.foreign_ods_enterprise_share_trade_h."pb" is 'Price-to-book ratio'; comment on column public.foreign_ods_enterprise_share_trade_h."timetomarket" is 'Time to market'; comment on column public.foreign_ods_enterprise_share_trade_h."undp" is 'Undistributed profits'; comment on column public.foreign_ods_enterprise_share_trade_h."perundp" is 'Undistributed profits per share'; comment on column public.foreign_ods_enterprise_share_trade_h."rev" is 'Year-over-year revenue (%)'; comment on column public.foreign_ods_enterprise_share_trade_h."profit" is 'Year-over-year profit (%)'; comment on column public.foreign_ods_enterprise_share_trade_h."gpr" is 'Gross profit margin (%)'; comment on column public.foreign_ods_enterprise_share_trade_h."npr" is 'Net profit ratio (%)'; comment on column public.foreign_ods_enterprise_share_trade_h."holders_num" is 'Holders';
Query data from the MaxCompute partitioned table.
To query the first 10 data records, execute the following statement:
SELECT * FROM foreign_ods_enterprise_share_trade_h limit 10;
To query data from a specific partition, execute the following statement:
SELECT * FROM foreign_ods_enterprise_share_trade_h WHERE ds = '20170113';
Method 2: Use the IMPORT FOREIGN SCHEMA
statement to accelerate queries on MaxCompute data
Hologres allows you to use the IMPORT FOREIGN SCHEMA
statement to create multiple MaxCompute foreign tables at a time. For more information, see IMPORT FOREIGN SCHEMA.
Method 3: Use the Auto Load feature to accelerate queries on MaxCompute data
If the number of foreign tables to be created is large or schemas of MaxCompute tables are frequently changed, you can enable the Auto Load feature to automatically create foreign tables for queried MaxCompute tables based on your business requirements or for all MaxCompute tables. Schema changes on MaxCompute tables include operations such as deleting columns, changing the order of columns, and changing the data types of columns. You do not need to manually change the schemas of foreign tables. This improves the query efficient. For more information, see Auto Load.