MaxCompute facilitates efficient querying of data in Hologres by using external tables, eliminating the need for data import. This topic explains how to create a Hologres external table using the CREATE EXTERNAL TABLE statement, specifying a Hologres data source, Security Token Service (STS) authentication or dual-signature mode, a Hologres source table, and a JDBC driver.
Background information
Hologres is a real-time interactive analytics data warehouse compatible with the PostgreSQL protocol, seamlessly integrated with MaxCompute.
Create a Hologres external table in MaxCompute to query Hologres data source using the PostgreSQL JDBC driver and STS authentication information. This approach avoids redundant data storage and enables fast query results without data import or export.
Prerequisites
Ensure the following requirements are met before creating a Hologres external table:
-
A Hologres database and source table have been established.
For details on creating a Hologres database, see Create a Database.
For details on creating a Hologres table, see CREATE TABLE.
The prepared Hologres instance information is as follows:
-
Hologres database name: mc_test
.
-
The schema for the Hologres database is public
.
-
The classic network connection address for the Hologres database is hgprecn-cn-oew210ut****-cn-hangzhou-internal.hologres.aliyuncs.com:80
.
-
Hologres table name: holo
. The following image displays the table data: data:image/s3,"s3://crabby-images/88818/88818815adddf53257c97dba5cd5c4a09d3f1d77" alt="Table Data"
-
The MaxCompute project intended for the Hologres external table is in place.
For details on creating a MaxCompute project, see Create a MaxCompute Project.
-
The MaxCompute client is installed and configured.
For details on installing the MaxCompute client, see Install and Configure the MaxCompute Client.
Limits
Be aware of the following constraints when using Hologres external tables:
-
MaxCompute does not support updating or deleting data on Hologres external tables.
-
Partitioned tables in Hologres do not correspond to partitioned tables in MaxCompute. Hologres external tables cannot be partitioned.
-
Concurrent writes to a Hologres external table by multiple processes may occasionally result in data duplication.
-
In MaxCompute, when creating an external table for Hologres, the DECIMAL data type defaults to a precision of 18 decimal places and is immutable, only allowing creation as decimal(38,18)
. If fewer decimal places are needed, consider using the String data type for the external table and apply the cast
function to convert the data type as necessary.
-
Complex data types such as ARRAY, MAP, and STRUCT are not supported in Hologres external tables created in MaxCompute.
-
Data types like JSON, JSONB, and MONEY supported in Hologres are not supported in MaxCompute. Columns matching these types in the Hologres source table cannot be configured when creating a Hologres external table in MaxCompute.
-
Hologres external tables do not support the clustering attribute.
Precautions
Consider the following points when using Hologres external tables:
-
If the IP address whitelist feature is enabled for Hologres, create a Hologres external table in dual-signature mode. STS mode creation will be blocked by the IP address whitelist feature.
-
Parent and child table names in a Hologres database are specified in Hologres external tables. SQL statements can be executed on these tables, but only parent table data can be read.
-
When writing data to a Hologres external table, the INSERT ON CONFLICT (UPSERT) feature of Hologres is not available. It is essential to ensure that the primary key of the data being written does not duplicate the primary key in the Hologres source table if the latter contains a primary key.
-
Table and field names are not case-sensitive when creating or querying foreign tables or fields, and forced case conversions are not supported.
Create a hologres external table
To create an external table, specify a storage handler in the CREATE TABLE statement and configure STS authentication (or enable dual-signature mode) and the JDBC connection address to access the Hologres data source.
Syntax
-
Create a Hologres external table in STS mode. For sample code, see Example of Creating an External Table in STS Mode.
CREATE EXTERNAL TABLE [IF NOT EXISTS] <table_name>(
<col1_name> <data_type>,
<col2_name> <data_type>,
......
)
stored BY '<com.aliyun.odps.jdbc.JdbcStorageHandler>'
WITH serdeproperties (
'odps.properties.rolearn'='<ram_arn>')
location '<jdbc:postgresql://<endpoint>:<port>/<database>?ApplicationName=MaxCompute&[currentSchema=<schema>&][useSSL={true|false}&]table=<holo_table_name>/>'
tblproperties (
'mcfed.mapreduce.jdbc.driver.class'='org.postgresql.Driver',
'odps.federation.jdbc.target.db.type'='holo',
'odps.federation.jdbc.colmapping'='<col1:column1,col2:column2,col3:column3,...>'
);
-
Create a Hologres external table in dual-signature mode. For sample code, see Example of Creating an External Table in Dual-Signature Mode.
Dual-signature mode allows the same RAM user to access the Hologres external table created in MaxCompute and the Hologres source table, provided the RAM user has permissions on the tables, thus eliminating the need for manual authorization. An IP address whitelist can also be configured for access.
SET odps.sql.common.table.planner.ext.hive.bridge=true;
CREATE EXTERNAL TABLE [IF NOT EXISTS] <table_name>(
<col1_name> <data_type>,
<col2_name> <data_type>,
......
)
STORED BY '<com.aliyun.odps.jdbc.JdbcStorageHandler>'
LOCATION '<jdbc:postgresql://<endpoint>:<port>/<database>?ApplicationName=MaxCompute&[currentSchema=<schema>&][useSSL={true|false}&]table=<holo_table_name>/>'
tblproperties (
'mcfed.mapreduce.jdbc.driver.class'='org.postgresql.Driver',
'odps.federation.jdbc.target.db.type'='holo',
['odps.federation.jdbc.colmapping'='<col1:column1,col2:column2,col3:column3,...>']
);
Parameter description
Parameter | Required | Description |
Parameter | Required | Description |
odps.sql.common.table.planner.ext.hive.bridge | Yes | Create a Hologres external table in dual-signature mode. The dual-signature mode is enabled when the value of this parameter is set to true . This indicates that the dual-signature mode is enabled and a Hologres external table is created in dual-signature mode. |
No | This parameter is not required when you create a Hologres external table in STS mode. |
IF NOT EXISTS | No | If you do not specify the IF NOT EXISTS option and a table with the same name exists, an error is returned. If you specify IF NOT EXISTS, the operation is successful regardless of whether a table with the same name exists. The metadata of the existing table remains unchanged even if the structure of the existing table is different from that of the table to be created. |
table_name | Yes | The name of the Hologres external table that you want to create in MaxCompute. |
col_name | Yes | The names of columns in the Hologres external table that you want to create in MaxCompute. |
data_type | Yes | The data types of columns in the Hologres external table. |
STORED BY | Yes | A storage handler, which defines the method that you want to use to access the Hologres external table. The value of this parameter is fixed as com.aliyun.odps.jdbc.JdbcStorageHandler , which indicates that the JdbcStorageHandler connection method is used. |
ram_arn | Yes | This parameter is required when you create a Hologres external table in STS mode. The Alibaba Cloud Resource Name (ARN) of the specified RAM role. The ARN is used as the STS authentication information. You can obtain the ARN in the RAM access control page by clicking the name of the target RAM role and then obtaining the ARN in the Basic Information section. |
No | This parameter is not required when you create a Hologres external table in dual-signature mode. |
LOCATION | Yes | The JDBC URL of the Hologres instance. The URL contains the following parameters: endpoint: Required. The classic network domain name of the Hologres instance. For more information about how to obtain the domain name, see Instance configuration. port: Required. The port number of the Hologres instance. For more information about how to obtain the port number, see Instance configuration. data:image/s3,"s3://crabby-images/8443e/8443e3e03341ba5e3c1db30e8be7b355c6bad2da" alt="获取endpoint和端口"
database: Required. The name of the Hologres database that you want to access. For more information about the Hologres database, see CREATE DATABASE. ApplicationName: Required. The default value is MaxCompute and no modification is required. schema: Optional. If the name of the source table is unique in the Hologres database or the source table is a table in the default schema, you do not need to specify this parameter. For more information about the schema, see CREATE SCHEMA. holo_table_name: Required. The name of the Hologres source table. For more information about the Hologres source table, see CREATE TABLE. data:image/s3,"s3://crabby-images/844da/844dab8c308a9c1d13789f0c195f9c9ae4e9bec5" alt="查看表"
|
tblproperties | Yes | mcfed.mapreduce.jdbc.driver.class: Required. The JDBC driver that is used to connect to the Hologres database. The value of this parameter is fixed as org.postgresql.Driver . odps.federation.jdbc.target.db.type: Required. The type of the Hologres database that you want to access. The value of this parameter is fixed as holo . odps.federation.jdbc.colmapping: Optional. If you want to map some columns of the Hologres source table to the Hologres external table, you must configure this parameter. This parameter specifies the mappings between the fields of the Hologres source table and the fields of the Hologres external table. If you do not configure this parameter, the fields of the source table are mapped to the columns with the same names in the Hologres external table. If you configure this parameter but specify only the mappings between some columns of the MaxCompute external table, the fields of the source table are mapped to the columns with the same names in the Hologres external table. An error is returned if the names or data types of the other columns do not match. If you configure this parameter and the field names in Hologres contain uppercase letters, you must enclose the field names in Hologres in double quotation marks ("" ). The format is MaxCompute field 1 : "Hologres field 1"[ ,MaxCompute field 2 : "Hologres field 2" ,...] . Note The fields of the Hologres source table are c bool, map_B string, a bigint . The fields of the MaxCompute external table are a bigint, x string, c bool . The content of the colmapping parameter is 'x: "map_B"' . In this case, the fields can be successfully mapped and the data in Hologres can be queried.
mcfed.mapreduce.jdbc.input.query: Optional. Read the data of the Hologres source table. The columns and column names of the external table must be consistent with those of the Hologres source table that is directly queried. If aliases are used, the columns and column names of the external table must be consistent with the aliases. The format of the select_sentence parameter is SELECT xxx FROM <holo_database_name>.<holo_schema_name>.<holo_table_name> .
|
Create a hologres external table (STS mode)
Follow these steps to create a Hologres external table in STS mode:
Create a RAM role
Create a RAM role and obtain its ARN to specify STS authentication information when creating an external table.
-
Log on to the RAM Access Control console to create a RAM role.
data:image/s3,"s3://crabby-images/26912/26912652f2649bfffe7d0d4c30092da5db1e363b" alt="Create RAM Role"
In the Create Role panel, select Alibaba Cloud Account or IdP for Select Trusted Entity based on your business requirements.
-
Alibaba Cloud Account:
RAM users in an Alibaba Cloud account can access cloud resources by assuming RAM roles. For more information, see Create a RAM Role for an Alibaba Cloud Account.
-
IdP:
Role-based single sign-on (SSO) allows you to log on to the Alibaba Cloud Management Console, meeting the unified user logon authentication requirements of enterprises. For more information, see Create a RAM Role for an IdP.
-
Modify the trust policy configuration.
-
On the Roles page, click the name of the created RAM role.
-
Click the Trust Policy tab.
-
On the Trust Policy tab, click Edit Trust Policy.
-
Modify the trust policy based on the content provided.
The trust policy configuration varies depending on the trusted entity type you selected.
-
Alibaba Cloud Account:
{
"Statement": [
{
"Action": "sts:AssumeRole",
"Effect": "Allow",
"Principal": {
"RAM": [
"acs:ram::<UID>:root"
]
}
},
{
"Action": "sts:AssumeRole",
"Effect": "Allow",
"Principal": {
"Service": [
"<UID>@odps.aliyuncs.com"
]
}
}
],
"Version": "1"
}
-
IdP:
{
"Statement": [
{
"Action": "sts:AssumeRole",
"Condition": {
"StringEquals": {
"saml:recipient": "https://signin.aliyun.com/saml-role/sso"
}
},
"Effect": "Allow",
"Principal": {
"Federated": [
"acs:ram::<UID>:saml-provider/IDP"
]
}
},
{
"Action": "sts:AssumeRole",
"Effect": "Allow",
"Principal": {
"Service": [
"<UID>@odps.aliyuncs.com"
]
}
}
],
"Version": "1"
}
-
Click OK.
Add the RAM role to a hologres instance and grant permissions
Before the RAM role can use the Hologres instance, it must be granted the necessary development permissions. By default, the RAM role does not have permission to view or manage instances in the Hologres console. You must grant these permissions using your Alibaba Cloud account. After adding the RAM role to a Hologres instance, you can grant permissions using one of the following methods:
-
Grant the necessary permissions to the RAM role via the Hologres console.
-
Log on to the Hologres console.
-
In the left-side navigation pane, click Instance List, then select the Hologres instance to which you want to grant permissions.
-
On the instance details page, click Account Management.
-
On the User Management page, click Add User to add the RAM role to the Hologres instance.
data:image/s3,"s3://crabby-images/e9a36/e9a369d551c86e2af45d9d211267bf10e8cd8c49" alt="Add User"
-
On the DB Authorization tab, grant the instance development permissions to the RAM role.
Note
If the database permission policy is set to Expert Mode, change the role type to SuperUser on the User Management page. In this case, database authorization is not required.
data:image/s3,"s3://crabby-images/5b89c/5b89c96b8cbd689f9e724eb365e3e36298fcd58c" alt="DB Authorization"
-
Grant the required permissions to the RAM role using an SQL statement.
An SQL statement can be used to grant the necessary permissions. For more information, see Hologres Permission Model.
-
By default, a RAM user assuming a RAM role does not have permissions to access the Hologres console. Grant the AliyunRAMReadOnlyAccess permission to the RAM user on the access control page using your Alibaba Cloud account. Otherwise, the RAM user cannot perform operations in the Hologres console. For more information, see Grant Permissions to a RAM User.
data:image/s3,"s3://crabby-images/615cf/615cf524e6441dbb8ff38cadddc36924a49ee3bd" alt="fir"
data:image/s3,"s3://crabby-images/63aa8/63aa87ad3a0f8caeda6fe6b67baf2ddf70113869" alt="ppp4"
Example
After completing the above steps, log on to the MaxCompute client and create a Hologres external table using the prepared data information. For more information, see Create a Hologres External Table.
-
Install and log on to the MaxCompute local client, and navigate to the target MaxCompute project.
For details on navigating to the target project, see Project Space Operations.
-
Use the following command to create a Hologres external table:
Here is an example SQL command:
CREATE EXTERNAL TABLE IF NOT EXISTS my_table_holo_jdbc
(
id bigint,
name string
)
stored BY 'com.aliyun.odps.jdbc.JdbcStorageHandler'
WITH serdeproperties (
'odps.properties.rolearn'='acs:ram::139699392458****:role/aliyunodpsholorole')
location 'jdbc:postgresql://hgprecn-cn-oew210ut****-cn-hangzhou-internal.hologres.aliyuncs.com:80/mc_test?ApplicationName=MaxCompute¤tSchema=public&useSSL=false&table=holo/'
tblproperties (
'mcfed.mapreduce.jdbc.driver.class'='org.postgresql.Driver',
'odps.federation.jdbc.target.db.type'='holo',
'odps.federation.jdbc.colmapping'='id:id,name:name'
);
-
Execute the following command to query the Hologres source table using the created Hologres external table:
Here is an example SQL code:
SET odps.sql.split.hive.bridge=true;
SET odps.sql.hive.compatible=true;
SELECT * FROM my_table_holo_jdbc limit 10;
The query returns the following result:
+
| id | name |
+
| 1 | kate |
| 2 | mary |
| 3 | bob |
| 4 | tom |
| 5 | lulu |
| 6 | mark |
| 7 | haward |
| 8 | lilei |
| 9 | hanmeimei |
| 10 | lily |
+
-
Optional:Exchange data and perform joint analysis between the Hologres external table and Hologres.
For instance, you can use the Hologres external table to write processed data from MaxCompute to Hologres for accelerated analysis and online services. Here is an example command:
SET odps.sql.split.hive.bridge=true;
SET odps.sql.hive.compatible=true;
INSERT INTO my_table_holo_jdbc VALUES (12,'alice');
SELECT * FROM my_table_holo_jdbc;
The query returns the following result:
+
| id | name |
+
| 12 | alice |
| 1 | kate |
| 2 | mary |
| 3 | bob |
| 4 | tom |
| 5 | lulu |
| 6 | mark |
| 7 | haward |
| 8 | lilei |
| 9 | hanmeimei |
| 10 | lily |
| 11 | lucy |
+
Dimension tables that are frequently updated are stored in Hologres databases, meeting real-time data update requirements. MaxCompute accesses these tables through external tables to perform association analysis with fact tables in MaxCompute. Here is an example command:
SET odps.sql.split.hive.bridge=true;
SET odps.sql.hive.compatible=true;
CREATE TABLE holo_test AS SELECT * FROM my_table_holo_jdbc;
SELECT * FROM my_table_holo_jdbc t1 INNER JOIN holo_test t2 ON t1.id=t2.id;
The query returns the following result:
+
| id | name | id2 | name2 |
+
| 1 | kate | 1 | kate |
| 2 | mary | 2 | mary |
| 3 | bob | 3 | bob |
| 4 | tom | 4 | tom |
| 5 | lulu | 5 | lulu |
| 6 | mark | 6 | mark |
| 7 | harward | 7 | harward |
| 8 | lilei | 8 | lilei |
| 9 | hanmeimei | 9 | hanmeimei |
| 10 | lily | 10 | lily |
| 11 | lucy | 11 | lucy |
| 12 | alice | 12 | alice |
+
Create a hologres external table (dual-signature mode)
Dual-signature mode is an authentication protocol developed by MaxCompute and Hologres. It authenticates access to Hologres data sources using account logon information from MaxCompute. If the account names used for MaxCompute and Hologres match, authentication succeeds, allowing direct access to external tables without additional authentication configuration.
Note
-
Ensure the Hologres account has the same name as MaxCompute and possesses Read and Write permissions on the corresponding Hologres tables.
-
Only Hologres V1.3 and later versions support creating Hologres external tables in MaxCompute using dual-signature mode. This mode allows reading data from a Hologres external table but not writing to it.
Example
Log on to the MaxCompute client and create a Hologres external table in dual-signature mode directly. For more information, see Create a Hologres External Table.
Here is an example SQL command:
CREATE EXTERNAL TABLE IF NOT EXISTS holo_mc_external_dbl
(
id int,
name string,
ds string
)
STORED BY 'com.aliyun.odps.jdbc.JdbcStorageHandler'
location 'jdbc:postgresql://hgprecn-cn-zvp2o6aq****-cn-beijing-internal.hologres.aliyuncs.com:80/mc_test?ApplicationName=MaxCompute¤tSchema=public&preferQueryMode=simple&useSSL=false&table=mf_holo_mc_up/'
TBLPROPERTIES (
'mcfed.mapreduce.jdbc.driver.class'='org.postgresql.Driver',
'odps.federation.jdbc.target.db.type'='holo',
'odps.federation.jdbc.colmapping'='id:id,name:name,ds:ds'
);
SET odps.sql.common.table.planner.ext.hive.bridge=true;
SELECT * FROM holo_mc_external_dbl;
Enable the direct read feature for hologres external tables
Background information
MaxCompute typically reads data from Hologres external tables using JDBC mode. The latest version of MaxCompute supports direct reading from Hologres storage layers, offering the following advantages:
-
Significantly reduces latency and improves data query speed.
-
Reduces the number of connections to Hologres FE, with most queries requiring only one connection.
Limits
The following limitations apply when enabling direct read mode for Hologres. If these conditions are not met, JDBC mode is used:
-
The Hologres instance must be version V1.3.34 or later.
Direct read is not supported for versions earlier than V1.3.34.
-
Due to network connectivity, only MaxCompute projects in the same region as the Hologres instances can access them.
Cross-region access is not supported. Attempting to access a Hologres instance from a different region will result in the following error: FAILED: ODPS-0010000:System internal error - fuxi job failed, caused by: Pangu request failed with error code 3
.
-
Hologres cold storage tables are not supported by the direct read feature.
-
Row-oriented tables in Hologres are not compatible with the direct read feature.
-
If a Hologres database is configured with primary and secondary instances, only the primary instance's URL can be used for MaxCompute connections. The secondary instance's URL is not supported.
-
-
The following data types are unsupported in both JDBC and MaxCompute direct read modes:
-
Complex types such as ARRAY, MAP, and STRUCT
-
Types like JSON and MONEY
-
The following types are supported in JDBC mode but not in direct read mode:
-
The following types are supported in direct read mode but not in JDBC mode:
-
If the TIMESTAMP type in MaxCompute is mapped to the TIMESTAMP WITH TIME ZONE type in Hologres during direct read, time discrepancies may occur:
-
Should the TIMESTAMP WITH TIME ZONE column in Hologres reflect a time earlier than 1900-12-31 15:54:15
, the time retrieved from MaxCompute will be 5 minutes and 44 seconds ahead of the actual time.
-
If the TIMESTAMP WITH TIME ZONE column in Hologres records a time between 1900-12-31 15:54:16
and 1969-12-31 23:59:58
, the time retrieved from MaxCompute will be 1 second greater than the actual time.
-
If the TIMESTAMP WITH TIME ZONE column in Hologres has a timestamp later than 1969-12-31 23:59:59
, the time retrieved from MaxCompute will match the actual time in Hologres.
Note
Time zone differences and errors may also occur when mapping TIMESTAMP types between MaxCompute and Hologres in direct read mode:
-
If MaxCompute's time zone is UTC+8, a TIMESTAMP WITH TIME ZONE of 2000-01-01 00:00:00
in Hologres will appear as 2000-01-01 08:00:00
in MaxCompute.
-
If MaxCompute's time zone is UTC+8, a TIMESTAMP WITH TIME ZONE of 1969-01-01 00:00:00
in Hologres will appear as 1969-01-01 08:00:01
in MaxCompute.
-
The table below describes the data type mappings between Hologres and MaxCompute:
Hologres data type | MaxCompute data type | Description |
Hologres data type | MaxCompute data type | Description |
TEXT | | None. |
SMALLINT | SMALLINT | None. |
| INT | None. |
| BIGINT | None. |
| FLOAT | None. |
| DOUBLE | None. |
| BOOLEAN | None. |
TIMESTAMP | TIMESTAMP | Data of the TIMESTAMP type is displayed in the unit of microsecond. Time error exists due to different time zones. |
TIMESTAMP WITH TIME ZONE | TIMESTAMP | The precision and scale are automatically converted between MaxCompute and Hologres at the underlying layer. MaxCompute provides data of the TIMESTAMP type without a time zone specified. |
NUMERIC | DECIMAL | If the precision of the DECIMAL data type in MaxCompute is not specified, the default precision is (38,18) . When you create a table by executing the IMPORT FOREIGN SCHEMA statement, the system automatically converts the precision. |
CHAR(n) | CHAR(n) | The CHAR(n) in MaxCompute is a fixed-length character type, where n is the length. The maximum value is 255. If the length is insufficient, spaces are used for padding. |
VARCHAR(n) | VARCHAR(n) | The VARCHAR(n) in MaxCompute is a variable-length character type, where n is the length. The value range is 1 to 65535. |
DATE | DATE | None. |
-
In foreign server mode, the three-layer model must be enabled for your MaxCompute project.
Procedure
When querying data from a Hologres external table in MaxCompute, prepend the following command to the SQL statement:
SET odps.table.api.enable.holo.table=true;
Verify direct read
Check the logs in Logview to determine if direct read mode is active. For guidance on using Logview, see Use Logview 2.0 to View Job Running Information.
In Logview, on the Summary tab, locate the external holo tables
field to check the attributes. The attributes are formatted as follows:
<project_name>.<table_name>:<access mode>[<(rollback reason)>]
The table below describes the parameters:
Parameter | Description |
project_name | The name of the project. |
table_name | The name of the table. |
Access mode | The mode that is used to access external tables. Valid values: Optimized : indicates that the direct read mode is used. The following figure shows an example in Logview. data:image/s3,"s3://crabby-images/0be64/0be64ed6e5f7992002e09fa05628cebbb49aed1d" alt="直读模式"
Fallback : indicates that the JDBC mode is used. The following figure shows an example in Logview. data:image/s3,"s3://crabby-images/cf107/cf107845590fb0f835fc50ef80c89bfe0d01f005" alt="回退为JDBC模式"
|
Rollback reason | If the access mode is Fallback , the reason why the JDBC mode is used is displayed. The valid values and solutions are as follows: Column type map error Column name ${ColumnName} : The data types of the columns in the MaxCompute table do not match or are not compatible with those of the columns in the Hologres table. You must change the data types of the external table based on the data type mappings. Then, the direct read mode can be used.
Holo connection error : The Hologres instance is not connected. This may be because the permission information is incorrect or the status of the Hologres instance is abnormal. Check whether the current user has the permissions to access the Hologres database or check whether the Hologres instance is in the serviceable state.
Odps table is partition table : The MaxCompute external table is a partitioned table. This is not supported.
Select hg_version error , Hologres version check error , or Fetch hg_version data error : The version of the Hologres instance is incorrect. You must upgrade the version of the Hologres instance to V1.3.34 or later. For more information about how to upgrade the version of the instance, see Upgrade an instance.
|