Services such as YARN and Hive contain a large number of configuration items. You can use the Custom Software Configuration feature provided by E-MapReduce (EMR) to modify existing configurations or add configuration items when you create a cluster.
Limits
You can use the Custom Software Configuration feature only when you create a cluster.
Procedure
Log on to the EMR console. In the left-side navigation pane, click EMR on ECS.
In the top navigation bar, select the region in which you want to create a cluster and select a resource group based on your business requirements.
On the EMR on ECS page, click Create Cluster.
In the Advanced Settings section of the Software Configuration step, click Show More and turn on Custom Software Configuration.
You can provide a configuration file in the JSON format to override the default parameter values or add parameters for the services in a cluster when you create the cluster. The following sample shows the content of such a configuration file:
[
  {
    "ApplicationName": "YARN",
    "ConfigFileName": "yarn-site.xml",
    "ConfigItemKey": "yarn.nodemanager.resource.cpu-vcores",
    "ConfigItemValue": "8"
  },
  {
    "ApplicationName": "YARN",
    "ConfigFileName": "yarn-site.xml",
    "ConfigItemKey": "aaa",
    "ConfigItemValue": "bbb"
  }
]
The following table describes the parameters in the preceding configuration file.
Parameter
Description
ApplicationName
The name of the service. You must specify the service name in all uppercase letters. Example: YARN.
ConfigFileName
The name of the configuration file. The name must match the name of the file that is actually passed.
Note: To ensure that the configuration file is correctly applied to the desired cluster, check whether the file name requires an extension when you pass the configuration file.
A file name extension is required for a configuration file that is applied to a DataLake cluster, a Dataflow cluster, an online analytical processing (OLAP) cluster, a DataServing cluster, or a custom cluster. Example: yarn-site.xml.
A file name extension is not required for a configuration file that is applied to a Hadoop cluster. Example: yarn-site.
ConfigItemKey
The name of a configuration item.
ConfigItemValue
The value of the configuration item.
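The four parameters above can also be assembled programmatically instead of being written by hand. The following Python code is a minimal sketch, not an official EMR tool: the function name build_config_items, the cluster_type labels, and the choice to convert every value to a string are assumptions made for illustration. It uppercases the service name and drops the file name extension for a Hadoop cluster, following the note above.

```python
import json
import os

# Assumption: "Hadoop" is only an illustrative label for the cluster type that,
# per the note above, expects the file name without an extension.
NO_EXTENSION_CLUSTER_TYPES = {"Hadoop"}


def build_config_items(application_name, config_file_name, items, cluster_type="DataLake"):
    """Return a list of configuration entries in the documented JSON layout."""
    if cluster_type in NO_EXTENSION_CLUSTER_TYPES:
        # Assumption: removing the part after the last dot gives the expected name,
        # for example yarn-site.xml becomes yarn-site.
        config_file_name = os.path.splitext(config_file_name)[0]
    return [
        {
            "ApplicationName": application_name.upper(),  # service name in uppercase
            "ConfigFileName": config_file_name,
            "ConfigItemKey": key,
            "ConfigItemValue": str(value),  # the sample passes values as strings
        }
        for key, value in items.items()
    ]


entries = build_config_items(
    "yarn",
    "yarn-site.xml",
    {"yarn.nodemanager.resource.cpu-vcores": 8},
)
# Print the JSON text that could be pasted into the console.
print(json.dumps(entries, indent=2))
```

Passing cluster_type="Hadoop" in this sketch would produce yarn-site as the value of ConfigFileName.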
The following table lists the configuration files of each service.
Service
Configuration file
YARN
core-site.xml
log4j.properties
hdfs-site.xml
mapred-site.xml
yarn-site.xml
httpfs-site.xml
capacity-scheduler.xml
hadoop-env.sh
httpfs-env.sh
mapred-env.sh
yarn-env.sh
Hive
hive-env.sh
hive-site.xml
hive-exec-log4j.properties
hive-log4j.properties
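Because the Custom Software Configuration feature is available only during cluster creation, it can help to check the JSON content before you paste it into the console. The following Python code is a rough sketch of such a check under stated assumptions: the function name find_problems is hypothetical, KNOWN_FILES is only a partial subset of the table above, and the file names are assumed to include extensions (that is, a cluster type other than Hadoop).

```python
import json

REQUIRED_KEYS = {"ApplicationName", "ConfigFileName", "ConfigItemKey", "ConfigItemValue"}

# Assumption: a partial, illustrative subset of the table above. Extend it as needed.
KNOWN_FILES = {
    "YARN": {"yarn-site.xml", "core-site.xml", "hdfs-site.xml", "mapred-site.xml"},
    "HIVE": {"hive-site.xml", "hive-env.sh"},
}


def find_problems(config_json):
    """Return a list of human-readable problems found in the JSON text."""
    problems = []
    for index, entry in enumerate(json.loads(config_json)):
        missing = REQUIRED_KEYS - set(entry)
        if missing:
            problems.append(f"entry {index}: missing {sorted(missing)}")
            continue
        app = entry["ApplicationName"]
        if app != app.upper():
            problems.append(f"entry {index}: ApplicationName {app!r} is not uppercase")
        # Only check file names for services that appear in the subset above.
        known = KNOWN_FILES.get(app.upper())
        if known is not None and entry["ConfigFileName"] not in known:
            problems.append(f"entry {index}: unexpected file {entry['ConfigFileName']!r}")
    return problems


sample = json.dumps([
    {
        "ApplicationName": "YARN",
        "ConfigFileName": "yarn-site.xml",
        "ConfigItemKey": "yarn.nodemanager.resource.cpu-vcores",
        "ConfigItemValue": "8",
    }
])
print(find_problems(sample))
```

An empty list means that the sketch found nothing to flag. It does not guarantee that EMR accepts the configuration.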
After you customize software configurations, you can continue to create the cluster. For more information, see Create a cluster.
References
After a cluster is created, you can adjust the settings of configuration items on the Configure tab of the desired service page. For more information, see Manage configuration items.