A Gateway is an Elastic Compute Service (ECS) server located in the same intranet as the E-MapReduce (EMR) cluster. It can be used for load balancing and security isolation. To create a Gateway node for a cluster, go to Console Page > Configuration Management > Overview > Create Gateway.
Note: The HAProxy service is installed on the Gateway node by default, but it is not started automatically.
Configuring the Gateway proxy for a common cluster is relatively simple. You only need to set up HAProxy on the Gateway as a reverse proxy for port 9090 of the Presto Coordinator, which runs on the header node of the EMR cluster. The configuration steps are as follows:
Log on to the Gateway node through SSH and modify the HAProxy configuration file /etc/haproxy/haproxy.cfg. Add the following:
#---------------------------------------------------------------------
# Global settings
#---------------------------------------------------------------------
global
......
## Configure the proxy to map port 9090 of the Gateway
## to port 9090 of emr-header-1.cluster-xxxx
listen prestojdbc :9090
    mode tcp
    option tcplog
    balance source
    server presto-coordinator-1 emr-header-1.cluster-xxxx:9090
Save and exit. Run the following command to restart the HAProxy service:
$> service haproxy restart
Configure the following security group rule for the Gateway node:
Direction | Configuration rule | Description |
Internet inbound | Custom TCP, open port 9090 | Proxied by HAProxy to the Coordinator port on the header node |
Now, you can delete the public IP of the Header node on the ECS console, and access the Presto service through Gateway on your client.
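As a rough illustration of the client side for a common cluster, the sketch below connects through the Gateway on port 9090 without SSL or Kerberos. The class name GatewayPrestoClient, its helper methods, and the `select 1` query are assumptions made for this example; the user `hadoop` and the `hive/default` catalog and schema are borrowed from the JDBC example later in this article. It also assumes that the Presto JDBC driver is on the classpath.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.Properties;

// Hypothetical helper class, not part of EMR or Presto.
final class GatewayPrestoClient {

    // Builds a Presto JDBC URL of the form jdbc:presto://host:port/catalog/schema.
    static String buildUrl(String gatewayHost, int port, String catalog, String schema) {
        return "jdbc:presto://" + gatewayHost + ":" + port + "/" + catalog + "/" + schema;
    }

    // A common cluster only needs the user name; no SSL or Kerberos properties.
    static Properties buildProperties(String user) {
        Properties properties = new Properties();
        properties.setProperty("user", user);
        return properties;
    }

    public static void main(String[] args) throws SQLException {
        if (args.length < 1) {
            System.err.println("usage: GatewayPrestoClient <gateway-public-ip>");
            return;
        }
        // Port 9090 is the Gateway port that HAProxy forwards to the Coordinator.
        String url = buildUrl(args[0], 9090, "hive", "default");
        // try-with-resources closes the ResultSet, Statement and Connection in reverse order.
        try (Connection connection = DriverManager.getConnection(url, buildProperties("hadoop"));
             Statement statement = connection.createStatement();
             ResultSet rs = statement.executeQuery("select 1")) {
            while (rs.next()) {
                System.out.println(rs.getInt(1));
            }
        }
    }
}
```

Pass the public IP of the Gateway as the only argument. If the query returns a row, the reverse proxy and the security group rule are working.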
The Presto service in a high-security EMR cluster uses Kerberos for authentication. The Kerberos KDC service runs on emr-header-1 on port 88 and supports both TCP and UDP. To use the Gateway to access the Presto service in a high-security cluster, you must proxy both the Presto Coordinator service port and the Kerberos KDC port.
In addition, the Presto Coordinator in EMR uses by default a keystore whose CN is emr-header-1, which can only be used within the intranet. Therefore, you must regenerate the keystore with the CN set to emr-header-1.cluster-xxx.
Create a keystore with CN configured as emr-header-1.cluster-xxx for the server:
[root@emr-header-1 presto-conf]# keytool -genkey -dname "CN=emr-header-1.cluster-xxx,OU=Alibaba,O=Alibaba,L=HZ, ST=zhejiang, C=CN" -alias server -keyalg RSA -keystore keystore -keypass 81ba14ce6084 -storepass 81ba14ce6084 -validity 36500
Warning:
The JKS keystore uses a proprietary format. It is recommended to migrate to PKCS12 which is an industry standard format using "keytool -importkeystore -srckeystore keystore -destkeystore keystore -deststoretype pkcs12".
Export the certificate:
[root@emr-header-1 presto-conf]# keytool -export -alias server -file server.cer -keystore keystore -storepass 81ba14ce6084
The certificate stored in the file <server.cer>
Warning:
The JKS keystore uses a proprietary format. It is recommended to migrate to PKCS12 which is an industry standard format using "keytool -importkeystore -srckeystore keystore -destkeystore keystore -deststoretype pkcs12".
Create a keystore for the client:
[root@emr-header-1 presto-conf]# keytool -genkey -dname "CN=myhost,OU=Alibaba,O=Alibaba,L=HZ, ST=zhejiang, C=CN" -alias client -keyalg RSA -keystore client.keystore -keypass 123456 -storepass 123456 -validity 36500
Warning:
The JKS keystore uses a proprietary format. It is recommended to migrate to PKCS12, which is an industry standard format, using "keytool -importkeystore -srckeystore client.keystore -destkeystore client.keystore -deststoretype pkcs12".
Import the certificate to the keystore of the client:
[root@emr-header-1 presto-conf]# keytool -import -alias server -keystore client.keystore -file server.cer -storepass 123456
Owner: CN=emr-header-2.cluster-xxx, OU=Alibaba, O=Alibaba, L=HZ, ST=zhejiang, C=CN
Issuer: CN=emr-header-2.cluster-xxx, OU=Alibaba, O=Alibaba, L=HZ, ST=zhejiang, C=CN
Serial number: 4247108
Valid from: Thu Mar 01 09:11:31 CST 2018 until: Sat Feb 05 09:11:31 CST 2118
Certificate fingerprints:
MD5: 75:2A:AA:40:01:5B:3F:86:8F:9A:DB:B1:85:BD:44:8A
SHA1: C7:25:B9:AD:5F:FE:FC:05:8E:A0:24:4A:1C:AA:6A:8D:6C:39:28:16
SHA256: DB:86:69:65:73:D5:C6:E2:98:7C:4A:3B:31:EF:70:80:F0:3C:3B:0C:14:94:37:9F:9C:22:47:EA:7E:1E:DE:8C
Signature algorithm name: SHA256withRSA
Subject Public Key Algorithm: 2048-bit RSA key
Version: 3
Extensions:
#1: ObjectId: 2.5.29.14 Criticality=false
SubjectKeyIdentifier [
KeyIdentifier [
0000: 45 1D A9 C7 D5 4E BB CF BD CE B4 5E E2 16 FB 2F  E....N.....^.../
0010: E9 5D 4A B6                                        .]J.
]
]
Trust this certificate? [no]: yes
Certificate was added to keystore
Warning:
The JKS keystore uses a proprietary format. It is recommended to migrate to PKCS12, which is an industry standard format, using "keytool -importkeystore -srckeystore client.keystore -destkeystore client.keystore -deststoretype pkcs12".
Copy the generated file to the client:
$> scp root@xxx.xxx.xxx.xxx:/etc/ecm/presto-conf/client.keystore ./
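Before using the copied client.keystore as a truststore, it can be worth checking that the server certificate is really stored under the alias `server` and that its subject carries the CN you expect. The following is a minimal sketch that uses only the standard java.security.KeyStore API. The class name TrustStoreCheck and the method subjectOf are hypothetical, and the sketch assumes the keystore is in JKS format, as the keytool warnings above indicate.

```java
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.KeyStore;
import java.security.cert.Certificate;
import java.security.cert.X509Certificate;

// Hypothetical helper class for inspecting the client truststore.
final class TrustStoreCheck {

    // Returns the subject DN of the X.509 certificate stored under the alias,
    // or null if the alias is absent or does not hold an X.509 certificate.
    static String subjectOf(Path keystorePath, char[] password, String alias) throws Exception {
        KeyStore keyStore = KeyStore.getInstance("JKS");
        try (InputStream in = Files.newInputStream(keystorePath)) {
            keyStore.load(in, password);
        }
        Certificate certificate = keyStore.getCertificate(alias);
        if (!(certificate instanceof X509Certificate)) {
            return null;
        }
        return ((X509Certificate) certificate).getSubjectX500Principal().getName();
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 2) {
            System.err.println("usage: TrustStoreCheck <keystore-path> <store-password>");
            return;
        }
        String subject = subjectOf(Paths.get(args[0]), args[1].toCharArray(), "server");
        System.out.println(subject == null ? "alias 'server' not found" : subject);
    }
}
```

The CN in the printed subject should match the host name that the client later uses in the JDBC URL. Otherwise, host name verification of the SSL connection is likely to fail.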
Add the client user principal:
[root@emr-header-1 presto-conf]# sh /usr/lib/has-current/bin/hadmin-local.sh /etc/ecm/has-conf -k /etc/ecm/has-conf/admin.keytab
[INFO] conf_dir=/etc/ecm/has-conf
Debug is true storeKey true useTicketCache false useKeyTab true doNotPrompt true ticketCache is null is Initiator true KeyTab is /etc/ecm/has-conf/admin.keytab refreshKrb5Config is true principal is kadmin/EMR.xxx.COM@EMR.xxx.COM tryFirstPass is false useFirstPass is false storePass is false clearPass is false
Refreshing Kerberos configuration
principal is kadmin/EMR.xxx.COM@EMR.xxx.COM
Will use keytab
Commit Succeeded
Login successful for user: kadmin/EMR.xxx.COM@EMR.xxx.COM
enter "cmd" to see legal commands.
HadminLocalTool.local: addprinc -pw 123456 clientuser
Success to add principal: clientuser
HadminLocalTool.local: ktadd -k /root/clientuser.keytab clientuser
Principal export to keytab file: /root/clientuser.keytab successful .
HadminLocalTool.local: exit
Copy the generated keytab file and the krb5.conf file to the client:
$> scp root@xxx.xxx.xxx.xxx:/root/clientuser.keytab ./
$> scp root@xxx.xxx.xxx.xxx:/etc/krb5.conf ./
Make the following two changes in the krb5.conf file copied to the client:
[libdefaults]
    kdc_realm = EMR.xxx.COM
    default_realm = EMR.xxx.COM
    # Change to 1 so that the client uses TCP to communicate with the KDC (HAProxy does not support UDP)
    udp_preference_limit = 1
    kdc_tcp_port = 88
    kdc_udp_port = 88
    dns_lookup_kdc = false
[realms]
    EMR.xxx.COM = {
        # Set to the public IP of the Gateway
        kdc = xxx.xxx.xxx.xxx:88
    }
Modify the hosts file of the client host, and add the following:
# gateway ip
xxx.xxx.xxx.xxx emr-header-1.cluster-xxx
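The hosts entry matters because the client must reach the Gateway under the same name that appears in the CN of the server certificate. A small check along the following lines can confirm that the name resolves to the Gateway address on the client machine. The class HostMappingCheck and the method resolvesTo are illustrative names, not part of any EMR tooling.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper class for verifying the hosts-file mapping on the client.
final class HostMappingCheck {

    // Returns true if any address that hostName resolves to equals expectedIp.
    static boolean resolvesTo(String hostName, String expectedIp) throws UnknownHostException {
        InetAddress expected = InetAddress.getByName(expectedIp);
        for (InetAddress address : InetAddress.getAllByName(hostName)) {
            if (address.equals(expected)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) throws UnknownHostException {
        if (args.length < 2) {
            System.err.println("usage: HostMappingCheck <host-name> <gateway-public-ip>");
            return;
        }
        System.out.println(resolvesTo(args[0], args[1])
                ? "OK: " + args[0] + " resolves to " + args[1]
                : "Mismatch: check the hosts file entry for " + args[0]);
    }
}
```

Run it with the header host name (for example emr-header-1.cluster-xxx) and the public IP of the Gateway as arguments.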
Log on to the Gateway node through SSH and modify /etc/haproxy/haproxy.cfg. Add the following:
#---------------------------------------------------------------------
# Global settings
#---------------------------------------------------------------------
global
......
listen prestojdbc :7778
    mode tcp
    option tcplog
    balance source
    server presto-coordinator-1 emr-header-1.cluster-xxx:7778
listen kdc :88
    mode tcp
    option tcplog
    balance source
    server emr-kdc emr-header-1:88
Save and exit. Run the following command to restart the HAProxy service:
$> service haproxy restart
Configure the following security group rules for the Gateway node:
Direction | Configuration rule | Description |
Internet inbound | Custom UDP, open port 88 | Proxied by HAProxy to the KDC on the header node |
Internet inbound | Custom TCP, open port 88 | Proxied by HAProxy to the KDC on the header node |
Internet inbound | Custom TCP, open port 7778 | Proxied by HAProxy to the Coordinator port on the header node |
Now, you can delete the public IP of the Header node on the ECS console, and access the Presto service through Gateway on your client.
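If the JDBC connection fails later, it helps to know first whether the two proxied ports are reachable at all. The sketch below opens plain TCP connections to ports 7778 and 88 of the Gateway with a timeout. It checks reachability only and does not speak the Presto or Kerberos protocols. The class name GatewayPortCheck and the 3-second timeout are assumptions made for this example.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper class for testing TCP reachability of the Gateway ports.
final class GatewayPortCheck {

    // Returns true if a TCP connection to host:port succeeds within timeoutMillis.
    static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Refused, timed out, or unresolvable: treat all of these as unreachable.
            return false;
        }
    }

    public static void main(String[] args) {
        if (args.length < 1) {
            System.err.println("usage: GatewayPortCheck <gateway-public-ip>");
            return;
        }
        // 7778: Presto Coordinator (HTTPS), 88: Kerberos KDC over TCP.
        int[] ports = {7778, 88};
        for (int port : ports) {
            System.out.println("port " + port + ": "
                    + (canConnect(args[0], port, 3000) ? "reachable" : "NOT reachable"));
        }
    }
}
```

A port reported as not reachable usually points to a missing security group rule or to HAProxy not having been restarted on the Gateway.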
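The following Java code uses the Presto JDBC driver to access the Presto service in the high-security cluster through the Gateway: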
// Required imports: java.sql.Connection, java.sql.DriverManager, java.sql.ResultSet,
// java.sql.SQLException, java.sql.Statement, java.util.Properties
try {
    Class.forName("com.facebook.presto.jdbc.PrestoDriver");
} catch (ClassNotFoundException e) {
    LOG.error("Failed to load presto jdbc driver.", e);
    System.exit(-1);
}
Connection connection = null;
Statement statement = null;
try {
    String url = "jdbc:presto://emr-header-1.cluster-59824:7778/hive/default";
    Properties properties = new Properties();
    properties.setProperty("user", "hadoop");
    // HTTPS-related configuration
    properties.setProperty("SSL", "true");
    properties.setProperty("SSLTrustStorePath", "resources/59824/client.keystore");
    properties.setProperty("SSLTrustStorePassword", "123456");
    // Kerberos-related configuration
    properties.setProperty("KerberosRemoteServiceName", "presto");
    properties.setProperty("KerberosPrincipal", "clientuser@EMR.59824.COM");
    properties.setProperty("KerberosConfigPath", "resources/59824/krb5.conf");
    properties.setProperty("KerberosKeytabPath", "resources/59824/clientuser.keytab");
    // Create a Connection object
    connection = DriverManager.getConnection(url, properties);
    // Create a Statement object
    statement = connection.createStatement();
    // Execute the query
    ResultSet rs = statement.executeQuery("select * from table1");
    // Print every column of every row
    int columnNum = rs.getMetaData().getColumnCount();
    int rowIndex = 0;
    while (rs.next()) {
        rowIndex++;
        for (int i = 1; i <= columnNum; i++) {
            System.out.println("Row " + rowIndex + ", Column " + i + ": " + rs.getString(i));
        }
    }
} catch (SQLException e) {
    LOG.error("Exception thrown.", e);
} finally {
    // Close the Statement object
    if (statement != null) {
        try {
            statement.close();
        } catch (Throwable t) {
            // Ignore errors on close
        }
    }
    // Close the Connection
    if (connection != null) {
        try {
            connection.close();
        } catch (Throwable t) {
            // Ignore errors on close
        }
    }
}
This article describes how to use the HAProxy reverse proxy to access the Presto service through the Gateway node. The method can also be extended to other components, such as Impala.