In the new E-MapReduce (EMR) console, Hue is no longer available when you create a DataLake cluster of EMR V5.8.0 or a later minor version, or of EMR V3.42.0 or a later minor version. This topic describes how to build and install Hue as the root user in an EMR DataLake cluster and how to access the Hue web UI.
Prerequisites
A DataLake cluster is created. For more information, see Create a cluster.
Limits
You must turn on Assign Public Network IP for the master node group when you create the DataLake cluster.
Procedure
Log on to the master node of the DataLake cluster. For more information, see Log on to a cluster.
Download the hue-release package from the Git repository. Upload the package to the master node of the DataLake cluster and run the following commands to decompress the package:
cd $hue_dir
tar zxf hue-release-4.10.0.tar.gz
$hue_dir specifies the directory to which the package is uploaded. In this example, $hue_dir is set to /tmp/.
Run the following commands to install dependencies:
sudo yum -y install ant asciidoc cyrus-sasl-devel cyrus-sasl-gssapi cyrus-sasl-plain gcc gcc-c++ krb5-devel libffi-devel libxml2-devel libxslt-devel make mysql-devel openldap-devel python3-devel sqlite-devel gmp-devel rsync
sudo yum -y install nodejs npm
sudo yum -y install git
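Before you continue, it can help to confirm that the build tools are on the PATH. The following is a minimal sketch and not part of the official procedure. The function name missing_tools is hypothetical, and the list of binary names is an assumption about what the packages above provide.

```shell
# Print every command from the argument list that is not found on PATH.
missing_tools() {
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      printf '%s\n' "$tool"
    fi
  done
}

# Binary names assumed to be provided by the packages installed above.
missing=$(missing_tools gcc g++ make ant node npm git rsync)
if [ -n "$missing" ]; then
  # $missing is left unquoted on purpose so the names print on one line.
  echo "Missing build tools:" $missing
else
  echo "All build tools found."
fi
```

If a tool is reported as missing, rerun the corresponding yum command before you start the build.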
Create a database and modify Hue-related configurations to connect to MySQL.
Run the following command to log on to MySQL Shell:
mysql -u root -pEMRroot1234
Note: The username that is used to log on to MySQL Shell is root, and the password is EMRroot1234.
Run the following commands to create a database named hue and an account named hue, and grant all permissions on the database to the hue account:
CREATE DATABASE IF NOT EXISTS hue DEFAULT CHARSET utf8 COLLATE utf8_general_ci;
CREATE USER 'hue'@'localhost' IDENTIFIED BY '******';
GRANT ALL on hue.* to 'hue'@'localhost' IDENTIFIED BY '******';
FLUSH PRIVILEGES;
Note: In the preceding commands, ****** specifies the password of the hue account. You can change the password based on your business requirements.
Go to the $hue_dir/hue-release-4.10.0/desktop/conf directory, change pseudo-distributed.ini.tmpl to pseudo-distributed.ini, and then modify the configurations in the pseudo-distributed.ini file based on your business requirements. In this example, you need to modify the configurations under [desktop] and [[database]].
[desktop]
gunicorn_work_class=sync
[[database]]
# Database engine is typically one of:
# postgresql_psycopg2, mysql, sqlite3 or oracle.
#
# Note that for sqlite3, 'name', below is a path to the filename. For other backends, it is the database name
# Note for Oracle, options={"threaded":true} must be set in order to avoid crashes.
# Note for Oracle, you can use the Oracle Service Name by setting "host=" and "port=" and then "name=<host>:<port>/<service_name>".
# Note for MariaDB use the 'mysql' engine.
engine=mysql
host=localhost
port=3306
user=hue
password=******
# conn_max_age option to make database connection persistent value in seconds
# https://docs.djangoproject.com/en/1.11/ref/databases/#persistent-connections
## conn_max_age=0
# Execute this script to produce the database password. This will be used when 'password' is not set.
## password_script=/path/script
name=hue
## options={}
# Database schema, to be used only when public schema is revoked in postgres
## schema=public
Parameter: Description
gunicorn_work_class: Set the value to sync.
engine: The database engine. In this example, set the value to mysql.
host: The hostname that is used to access the database. In MySQL, the default value is localhost.
port: The port number that is used for communication with the database. In MySQL, the default value is 3306.
user: Set this parameter to the name of the account that you created. In this example, the account name is hue.
password: Set this parameter to the password for the account that you created. In this example, the password is ******.
name: Set this parameter to the name of the database that you created. In this example, the database name is hue.
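To double-check the edited file, you can print the values that are actually active under [[database]]. The following is a minimal sketch under stated assumptions: the helper name db_setting is hypothetical, the file path assumes that $hue_dir is /tmp as in this example, and the parsing only covers simple key=value lines. It does not check gunicorn_work_class, which is under [desktop].

```shell
# Print the value of KEY from the [[database]] subsection of an INI FILE.
# Lines that start with # are treated as comments and ignored.
db_setting() {
  awk -v key="$1" '
    /^[ \t]*\[\[database\]\]/ { in_db = 1; next }  # entering [[database]]
    /^[ \t]*\[/               { in_db = 0 }        # any other header leaves it
    in_db && !/^[ \t]*#/ {
      pos = index($0, "=")
      if (pos == 0) next
      k = substr($0, 1, pos - 1)
      v = substr($0, pos + 1)
      gsub(/^[ \t]+|[ \t]+$/, "", k)
      gsub(/^[ \t]+|[ \t]+$/, "", v)
      if (k == key) { print v; exit }
    }
  ' "$2"
}

# Path used in this example; adjust it if $hue_dir is not /tmp.
conf=/tmp/hue-release-4.10.0/desktop/conf/pseudo-distributed.ini
if [ -f "$conf" ]; then
  for k in engine host port user name; do
    printf '%s=%s\n' "$k" "$(db_setting "$k" "$conf")"
  done
else
  echo "File not found: $conf"
fi
```

An empty value after the equal sign means that the key is still commented out or missing in the [[database]] subsection.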
Run the following commands to configure environment variables:
export PYTHON_VER=python3.6
export SKIP_PYTHONDEV_CHECK=true
Note: In this topic, Hue is built by using Python 3.6 that is deployed on the master node. You can specify the Python version based on your business requirements.
Download dependencies and install Hue. If the node can access GitHub in a stable manner, you can select automatic download and installation. Otherwise, you must manually download the related software packages and install Hue.
Automatic download and installation
Go to the root directory of Hue and run the following commands to install Hue. During the automatic download and installation process, the node accesses GitHub and downloads the required dependencies.
Hue is installed in the /opt/apps/ directory.
rm -rf $hue_dir/hue-release-4.10.0/desktop/core/ext-py/
rm -rf /opt/apps/hue
cd $hue_dir/hue-release-4.10.0/
PREFIX=/opt/apps make install
Note: If the node cannot access GitHub in a stable manner, Hue may fail to be installed. In this case, we recommend that you manually download all required software packages and install Hue.
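A quick probe can help you choose between the two installation methods. The following is a minimal sketch, assuming that curl is installed on the node. The function name install_mode and the 10-second timeout are illustrative choices, and a single successful request does not guarantee stable access for the whole build.

```shell
# Print "automatic" if the probe command succeeds, otherwise "manual".
install_mode() {
  if "$@" >/dev/null 2>&1; then
    echo automatic
  else
    echo manual
  fi
}

# Probe GitHub with a HEAD request and a 10-second timeout (assumed threshold).
install_mode curl -sSfI --max-time 10 https://github.com
```

The function takes the probe as its arguments, so you can substitute a different check, such as a request to a mirror, without changing the function.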
Manual download and installation
Modify the last two lines in the $hue_dir/hue-release-4.10.0/desktop/core/requirements.txt file to cancel the automatic download of required dependencies in GitHub. The following sample code provides an example of the content before and after modification.
# Before modification
# git+https://github.com/gethue/django-babel.git
# git+https://github.com/gethue/django-mako.git
# After modification
django-babel
django-mako
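The same change can be scripted instead of made in an editor. The following is a minimal sketch, not the documented method: the function name use_plain_package_names is hypothetical, the path assumes that $hue_dir is /tmp, and the patterns accept the two GitHub lines with or without a leading # because the exact form in your copy of the file may differ. A backup of the original file is kept with the .bak suffix.

```shell
# Replace the two git+https GitHub entries with plain package names.
# The original file is saved next to it with a .bak suffix.
use_plain_package_names() {
  sed -i.bak \
    -e 's|^#\{0,1\}[ ]*git+https://github\.com/gethue/django-babel\.git.*$|django-babel|' \
    -e 's|^#\{0,1\}[ ]*git+https://github\.com/gethue/django-mako\.git.*$|django-mako|' \
    "$1"
}

# Path used in this example; adjust it if $hue_dir is not /tmp.
req=/tmp/hue-release-4.10.0/desktop/core/requirements.txt
if [ -f "$req" ]; then
  use_plain_package_names "$req"
  tail -n 2 "$req"
else
  echo "File not found: $req"
fi
```

After the script runs, the last two lines that are printed should match the content shown after modification.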
Go to the root directory $hue_dir/hue-release-4.10.0/ of Hue and run the following commands to install Hue. Hue is installed in the /opt/apps/ directory.
rm -rf desktop/core/ext-py/
rm -rf /opt/apps/hue
PREFIX=/opt/apps make install
Download the django-mako and django-babel packages from the Git repository. Upload the packages to the node where the Hue package resides and run the following commands to decompress the packages:
unzip django-babel-master.zip
unzip django-mako-master.zip
In this example, the packages are uploaded to the root directory /tmp/ of the master node. The decompressed packages are stored in the /tmp/django-babel-master and /tmp/django-mako-master directories.
Separately run the following commands in the django-babel and django-mako root directories to install django-mako and django-babel:
source /opt/apps/hue/build/env/bin/activate
pip install -e .
Run the following commands to start and use Hue:
source /opt/apps/hue/build/env/bin/activate
sudo useradd hue
supervisor
You can run the following commands to create a superuser and use the superuser to access the web UI of Hue:
source /opt/apps/hue/build/env/bin/activate
# Trigger an interactive command line. You must enter the username and password of the superuser.
hue createsuperuser
Enter an address in the http://<Public IP address of the master node>:8000 format in the address bar of your browser and press Enter to access the web UI of Hue.
Note: In this example, only the basic settings of Hue are configured. To use other features of Hue, view the configurations of Hue, modify the /opt/apps/hue/desktop/conf/pseudo-distributed.ini configuration file, and then restart Hue.
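If the page does not load in the browser, you can first check from the master node whether Hue responds on port 8000. The following is a minimal sketch, assuming that curl is installed. The helper name hue_url and the 5-second timeout are illustrative, and the probe targets localhost rather than the public IP address.

```shell
# Build the Hue web UI address for a given host (port defaults to 8000).
hue_url() {
  printf 'http://%s:%s\n' "$1" "${2:-8000}"
}

# Probe the local Hue process; 000 means that no HTTP response was received.
code=$(curl -s -o /dev/null --max-time 5 -w '%{http_code}' "$(hue_url localhost)") || true
echo "HTTP status from $(hue_url localhost): ${code:-000}"
```

A status of 000 suggests that Hue is not running or is not listening on port 8000. Any HTTP status code indicates that the process is reachable on the node itself.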
References
Official O&M documentation of Hue: ADMINISTRATOR.