Set up EMR YARN authentication using Active Directory with Apache Knox

A guide to configuring integration between Alibaba Cloud EMR and Active Directory.

This blog was written by M Fakhri Darmawan, Solution Architect at Alibaba Cloud Indonesia.

Alibaba Cloud Elastic MapReduce (EMR) is a big data processing solution that runs on the Alibaba Cloud platform. EMR is built on Alibaba Cloud ECS instances and is based on open-source Apache Hadoop and Apache Spark. EMR lets you use Hadoop and Spark ecosystem components, such as Apache Hive, Apache Kafka, Flink, Druid, and TensorFlow, to analyze and process data. You can use EMR to process data stored in different Alibaba Cloud data storage services, such as Object Storage Service (OSS), Log Service (SLS), and Relational Database Service (RDS). HDFS is the base storage layer in Alibaba Cloud EMR when you use a Hadoop-based cluster.
HDFS security is crucial because HDFS is used for both data storage and data processing. One important aspect of data security is authentication. To simplify user management, Alibaba Cloud EMR can be integrated with Active Directory through Apache Knox (Knox). This guide shows how to configure the integration between Alibaba Cloud EMR and Active Directory.

Prerequisite: Active Directory and Service User

This is the Active Directory used in this configuration. I created a Hadoop OU (organizational unit) containing a service user, hdfsadmin, and a common user, hdfsuser.

Step 1: Open EMR Console
Log in to your Alibaba Cloud console and search for E-MapReduce.

Step 2: Open EMR Cluster
Open the EMR cluster details. Select the region where your cluster is deployed, then select cluster management.

On the cluster management page, select your cluster.

Step 3: Configure Knox
On your cluster page, select cluster service.

Scroll down the service list and select Knox.

On the Knox service page, switch to the configure tab.

Select cluster topo to configure the topology file.

Scroll down the list to find the xml-direct-to-file-content configuration.

Edit the configuration and delete the existing ShiroProvider block, which is marked with comments in the listing below:

<?xml version="1.0" encoding="utf-8"?>
<topology>
  <gateway>
    <!-- Delete from this line
    <provider>
      <role>authentication</role>
      <name>ShiroProvider</name>
      <enabled>true</enabled>
      <param>        
        <name>sessionTimeout</name>
        <value>30</value>
      </param>
      <param>
        <name>main.ldapRealm</name>
        <value>org.apache.hadoop.gateway.shirorealm.KnoxLdapRealm</value>
      </param>
      <param>
        <name>main.ldapContextFactory</name>
        <value>org.apache.hadoop.gateway.shirorealm.KnoxLdapContextFactory</value>
      </param>
      <param>
        <name>main.ldapRealm.contextFactory</name>
        <value>$ldapContextFactory</value>
      </param>
      <param>
        <name>main.ldapRealm.userDnTemplate</name>
        <value>uid={0},ou=people,o=emr</value>
      </param>
      <param>
        <name>main.ldapRealm.contextFactory.url</name>
        <value>ldap://{{hostname_ldap}}:10389</value>
      </param>
      <param>
        <name>main.ldapRealm.contextFactory.authenticationMechanism</name>
        <value>simple</value>
      </param>
      <param>
        <name>urls./**</name>
        <value>authcBasic</value>
      </param>
    </provider>
    until this line -->
    <provider>
      <role>identity-assertion</role>
      <name>Default</name>
      <enabled>true</enabled>
    </provider>
    <provider>
      <role>hostmap</role>
      <name>static</name>
      <enabled>true</enabled>
      <param>
        <name>knox.{{clusterId_region}}.emr.aliyuncs.com</name>
        <value>{{hostname_master_main}}</value>
      </param>
    </provider>
    <provider>
      <role>ha</role>
      <name>HaProvider</name>
      <enabled>{{ha_enable}}</enabled>
      <param>
        <name>HDFSUI</name>
        <value>maxFailoverAttempts=3;failoverSleep=1000;enabled=true</value>
      </param>
      <param>
        <name>WEBHDFS</name>
        <value>maxFailoverAttempts=3;failoverSleep=1000;enabled=true</value>
      </param>
      <param>
        <name>YARNUI</name>
        <value>maxFailoverAttempts=3;failoverSleep=1000;enabled=true</value>
      </param>
      <param>
        <name>GANGLIAUI</name>
        <value>maxFailoverAttempts=3;failoverSleep=1000;enabled=true</value>
      </param>
      <param>
        <name>SPARKHISTORYUI</name>
        <value>maxFailoverAttempts=3;failoverSleep=1000;enabled=true</value>
      </param>
      <param>
        <name>JOBHISTORYUI</name>
        <value>maxFailoverAttempts=3;failoverSleep=1000;enabled=true</value>
      </param>
      <param>
        <name>NODEUI</name>
        <value>maxFailoverAttempts=3;failoverSleep=1000;enabled=true</value>
      </param>
      <param>
        <name>HBASEUI</name>
        <value>maxFailoverAttempts=3;failoverSleep=1000;enabled=true</value>
      </param>
      <param>
        <name>IMPALA-CATALOGD-UI</name>
        <value>maxFailoverAttempts=3;failoverSleep=1000;enabled=true</value>
      </param>
      <param>
        <name>IMPALA-STATESTORED-UI</name>
        <value>maxFailoverAttempts=3;failoverSleep=1000;enabled=true</value>
      </param>
      <param>
        <name>KUDUUI</name>
        <value>maxFailoverAttempts=3;failoverSleep=1000;enabled=true</value>
      </param>
    </provider>
  </gateway>
  <service>
    <role>NODEUI</role>
    <url>http://emr-header-1:8042</url>
    <url>http://emr-header-2:8042</url>
  </service>
  <service>
    <role>STORMUI</role>
    <url>http://emr-header-1:9999</url>
    <url>http://emr-header-2:9999</url>
  </service>
  <service>
    <role>OOZIEUI</role>
    <url>http://emr-header-1:11000/oozie</url>
    <url>http://emr-header-2:11000/oozie</url>
  </service>
</topology>

Replace it with the new ShiroProvider block shown below. You need to set the following parameters to your own values:
- main.ldapRealm.contextFactory.url
- main.ldapRealm.contextFactory.systemUsername
- main.ldapRealm.contextFactory.systemPassword
- main.ldapRealm.searchBase

<?xml version="1.0" encoding="utf-8"?>

<topology>
  <gateway>
    <provider>
      <role>authentication</role>
      <name>ShiroProvider</name>
      <enabled>true</enabled>
      <param>
          <name>sessionTimeout</name>
          <value>30</value>
      </param>
      <param>
          <name>main.ldapRealm</name>
          <value>org.apache.hadoop.gateway.shirorealm.KnoxLdapRealm</value> 
      </param>
      <param>
          <name>main.ldapContextFactory</name>
          <value>org.apache.hadoop.gateway.shirorealm.KnoxLdapContextFactory</value>
      </param>

      <!-- main.ldapRealm.contextFactory needs to be placed before main.ldapRealm.contextFactory* entries  -->
      <param>
          <name>main.ldapRealm.contextFactory</name>
          <value>$ldapContextFactory</value>
      </param>

      <!-- AD url -->
      <param>
          <name>main.ldapRealm.contextFactory.url</name>
          <!-- replace this IP address with your AD hostname or IP address -->
          <value>ldap://127.0.0.1:389</value> 
      </param>

      <!-- system user -->
      <param>
          <name>main.ldapRealm.contextFactory.systemUsername</name>
          <!-- replace this DN with the DN of your system user for the integration -->
          <value>CN=systemuser,OU=hadoop,DC=ad,DC=ondemand</value>
      </param>

      <!-- system user password -->
      <param>
          <name>main.ldapRealm.contextFactory.systemPassword</name>
          <!-- replace this value with your system user password -->
          <value>userpassword</value>
      </param>

      <param>
          <name>main.ldapRealm.contextFactory.authenticationMechanism</name>
          <value>simple</value>
      </param>
      <param>
          <name>urls./**</name>
          <value>authcBasic</value> 
      </param>

      <!-- search base: the AD container holding the users allowed to log in -->
      <param>
          <name>main.ldapRealm.searchBase</name>
          <!-- replace this value with the DN of your OU -->
          <value>OU=hadoop,DC=ad,DC=ondemand</value>
      </param>
      <param>
          <name>main.ldapRealm.userObjectClass</name>
          <value>person</value>
      </param>
      <param>
          <name>main.ldapRealm.userSearchAttributeName</name>
          <value>sAMAccountName</value>
      </param>
    </provider>
    <provider>
      <role>identity-assertion</role>
      <name>Default</name>
      <enabled>true</enabled>
    </provider>
    <provider>
      <role>hostmap</role>
      <name>static</name>
      <enabled>true</enabled>
      <param>
        <name>knox.{{clusterId_region}}.emr.aliyuncs.com</name>
        <value>{{hostname_master_main}}</value>
      </param>
    </provider>
    <provider>
      <role>ha</role>
      <name>HaProvider</name>
      <enabled>{{ha_enable}}</enabled>
      <param>
        <name>HDFSUI</name>
        <value>maxFailoverAttempts=3;failoverSleep=1000;enabled=true</value>
      </param>
      <param>
        <name>WEBHDFS</name>
        <value>maxFailoverAttempts=3;failoverSleep=1000;enabled=true</value>
      </param>
      <param>
        <name>YARNUI</name>
        <value>maxFailoverAttempts=3;failoverSleep=1000;enabled=true</value>
      </param>
      <param>
        <name>GANGLIAUI</name>
        <value>maxFailoverAttempts=3;failoverSleep=1000;enabled=true</value>
      </param>
      <param>
        <name>SPARKHISTORYUI</name>
        <value>maxFailoverAttempts=3;failoverSleep=1000;enabled=true</value>
      </param>
      <param>
        <name>JOBHISTORYUI</name>
        <value>maxFailoverAttempts=3;failoverSleep=1000;enabled=true</value>
      </param>
      <param>
        <name>NODEUI</name>
        <value>maxFailoverAttempts=3;failoverSleep=1000;enabled=true</value>
      </param>
      <param>
        <name>HBASEUI</name>
        <value>maxFailoverAttempts=3;failoverSleep=1000;enabled=true</value>
      </param>
      <param>
        <name>IMPALA-CATALOGD-UI</name>
        <value>maxFailoverAttempts=3;failoverSleep=1000;enabled=true</value>
      </param>
      <param>
        <name>IMPALA-STATESTORED-UI</name>
        <value>maxFailoverAttempts=3;failoverSleep=1000;enabled=true</value>
      </param>
      <param>
        <name>KUDUUI</name>
        <value>maxFailoverAttempts=3;failoverSleep=1000;enabled=true</value>
      </param>
    </provider>
  </gateway>
  <service>
    <role>NODEUI</role>
    <url>http://emr-header-1:8042</url>
    <url>http://emr-header-2:8042</url>
  </service>
  <service>
    <role>STORMUI</role>
    <url>http://emr-header-1:9999</url>
    <url>http://emr-header-2:9999</url>
  </service>
  <service>
    <role>OOZIEUI</role>
    <url>http://emr-header-1:11000/oozie</url>
    <url>http://emr-header-2:11000/oozie</url>
  </service>
</topology>
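
The topology above binds to Active Directory over plain ldap:// on port 389 with the simple mechanism, so credentials travel unencrypted between Knox and the domain controller. If your domain controller also serves LDAPS, the URL parameter could be adapted along the lines of the sketch below. This is not part of the original setup: the hostname ad.example.com is a placeholder, 636 is the conventional LDAPS port, and the sketch assumes the JVM running Knox already trusts the certificate presented by the domain controller.

```xml
<!-- Hypothetical LDAPS variant of the AD url parameter; ad.example.com is a placeholder -->
<param>
    <name>main.ldapRealm.contextFactory.url</name>
    <value>ldaps://ad.example.com:636</value>
</param>
```

The other parameters in the ShiroProvider block would stay as they are.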

Step 4: Logon Test
Open your YARN UI and try to log on with a user from Active Directory.

You are now logged in successfully.
