E-MapReduce: Manage configuration items

Last Updated: Aug 12, 2024

In the E-MapReduce (EMR) console, you can view, modify, and add configuration items for services such as Hadoop Distributed File System (HDFS), YARN, and Spark. This topic describes how to perform these operations.

Prerequisites

An EMR cluster is created. For more information, see Create a cluster.

View configuration items

By default, the Configure tab for a service displays default cluster-level configuration items. You can also select Node Group Configuration or Independent Node Configuration from the Default Cluster Configuration drop-down list. You can modify specific configuration items at the node and node group levels. For more information, see Modifiable configuration items.

Note
  • If a configuration item has been modified at the node group or node level, or its value at that level differs from the cluster-level default, the Configure tab displays the node group or node level settings even when Default Cluster Configuration is selected.

  • You must select a specific node group or node to view the value of a configuration item at the node group or node level.

  • A configuration item can be set at the node, node group, or cluster level, and a value set at a more specific level overrides the value set at a broader level. Values take effect in the following priority: node > node group > cluster.

On the Configure tab with Default Cluster Configuration selected, the configuration items at the node group or node level can only be viewed. To modify a configuration item at the node group or node level, perform the following operations: Select the desired file subtab on the Configure tab, select Node Group Configuration or Independent Node Configuration, modify the configuration item, and then save the configurations.
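The priority rule in the note above (node > node group > cluster) can be illustrated with a minimal Python sketch. This is only an illustration of the lookup order, not how EMR implements it. The function name resolve_effective_value and the sample values are hypothetical; only the configuration item name comes from this topic.

```python
from typing import Mapping, Optional


def resolve_effective_value(
    key: str,
    cluster: Mapping[str, str],
    node_group: Mapping[str, str],
    node: Mapping[str, str],
) -> Optional[str]:
    """Return the value that takes effect for key: node > node group > cluster."""
    # Check the most specific level first, then fall back to broader levels.
    for level in (node, node_group, cluster):
        if key in level:
            return level[key]
    # The item is not set at any level.
    return None


# Hypothetical values, for illustration only.
cluster_conf = {"yarn.nodemanager.resource.memory-mb": "16384"}
group_conf = {"yarn.nodemanager.resource.memory-mb": "32768"}
node_conf = {}

# No node-level value exists, so the node group value is used.
print(resolve_effective_value(
    "yarn.nodemanager.resource.memory-mb", cluster_conf, group_conf, node_conf
))
```

If a node-level value were present in node_conf, it would be returned instead of the node group value.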

Modify configuration items

  1. Go to the Configure tab of a service.

    1. Log on to the EMR console. In the left-side navigation pane, click EMR on ECS.

    2. In the top navigation bar, select the region in which your cluster resides and select a resource group based on your business requirements.

    3. On the EMR on ECS page, find the desired cluster and click Services in the Actions column.

    4. On the Services tab, find the desired service and click Configure.

  2. Modify configuration items.

    1. In the search box, enter the name of the configuration item that you want to modify and click the search icon.

    2. Change the value of the configuration item.

  3. Save the configurations.

    1. On the Configure tab, click Save.

    2. In the Save dialog box that appears, configure the Execution Reason parameter and click Save.

      Note

      In the Save dialog box, the Save and Deliver Configuration switch is turned on by default. After you save the configurations, the configurations are also sent to the client. You can make the configurations take effect in manual mode. If you turn off Save and Deliver Configuration in the Save dialog box, you can make the configurations take effect in prompt mode.

  4. Make the configurations take effect.

    To allow the modifications to take effect, perform the following operations based on the type of the configuration item you modified:

    Manual mode

    • Client-side configurations

      1. Click Deploy Client Configuration next to Save in the lower part of the page.

      2. In the dialog box that appears, configure the Execution Reason parameter and click OK.

      3. In the Confirm dialog box, click OK.

    • Server-side configurations

      1. Choose More > Restart in the upper-right corner of the Configure tab.

      2. In the dialog box that appears, configure the Execution Reason parameter and click OK.

      3. In the Confirm dialog box, click OK.

    Prompt mode

    Note

    This mode is available only for EMR V5.12.1, EMR V3.46.1, or a minor version later than EMR V5.12.1 or EMR V3.46.1.

    • Client-side configurations

      1. After you save the configurations, the To Be Delivered prompt appears.

      2. Click the To Be Delivered prompt.

      3. In the Configurations to Be Delivered message, click Deliver.

        Note

        For the YARN service, if the configuration items that you modified include queue-related configuration items, saving and delivering the modifications is not enough. To make the modifications take effect, you must also click the prompt that appears or click Deploy on the Edit Resource Queue tab of the YARN service page.

    • Server-side configurations

      1. After you save the configurations, the Not Effective Yet prompt appears.

      2. Click the Not Effective Yet prompt.

      3. In the Configurations to Take Effect dialog box, perform the operations that match how the configurations take effect.

        • Configurations That Require Custom Operations

          In the Actions column of each component, click the listed operations to make the configurations take effect.

        • Configurations That Require Restart

          1. Click Restart in the Actions column of a component. Alternatively, select multiple components and click Batch Restart.

          2. In the dialog box that appears, configure the Execution Reason parameter and click OK.
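The version requirement in the prompt mode note above can be expressed as a small check. The following Python sketch rests on two assumptions: version strings contain a three-part number such as 5.12.1, and only the 5.x and 3.x version lines named in the note qualify. The function names are hypothetical and are not part of any EMR API.

```python
import re

# Minimum versions per major version line, taken from the prompt mode note.
PROMPT_MODE_MIN = {5: (5, 12, 1), 3: (3, 46, 1)}


def parse_version(text: str) -> tuple:
    """Extract (major, minor, patch) from a string such as 'EMR V5.12.1'."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.groups())


def supports_prompt_mode(version_text: str) -> bool:
    """Return True if the version meets the minimum for its major line."""
    version = parse_version(version_text)
    minimum = PROMPT_MODE_MIN.get(version[0])
    # Assumption: major lines not named in the note are treated as unsupported.
    return minimum is not None and version >= minimum


print(supports_prompt_mode("EMR V5.17.1"))
print(supports_prompt_mode("EMR V3.45.0"))
```

Comparing versions as integer tuples avoids the string comparison pitfall where "5.9.0" would sort after "5.12.1".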

Add configuration items

  1. Go to the Configure tab of a service.

    1. Log on to the EMR console. In the left-side navigation pane, click EMR on ECS.

    2. In the top navigation bar, select the region in which your cluster resides and select a resource group based on your business requirements.

    3. On the EMR on ECS page, find the desired cluster and click Services in the Actions column.

    4. On the Services tab, find the desired service and click Configure.

  2. Add configuration items.

    1. Click the tab on which you want to add configuration items.

    2. Click Add Configuration Item.

    3. Add configuration items based on your business requirements.

      You can add multiple configuration items at the same time.

      Configuration item    Description
      Key                   The name of the configuration item.
      Value                 The value of the configuration item.
      Description           The description of the configuration item.
      Actions               You can remove configuration items.

    4. Click OK.

    5. In the dialog box that appears, configure the Execution Reason parameter and click Save.

  3. Make the configurations take effect.

    To allow the added configuration items to take effect, perform the following operations based on the type of the configuration item you added:

    Manual mode

    • Client-side configurations

      1. Click Deploy Client Configuration next to Save in the lower part of the page.

      2. In the dialog box that appears, configure the Execution Reason parameter and click OK.

      3. In the Confirm dialog box, click OK.

    • Server-side configurations

      1. Choose More > Restart in the upper-right corner of the Configure tab.

      2. In the dialog box that appears, configure the Execution Reason parameter and click OK.

      3. In the Confirm dialog box, click OK.

    Prompt mode

    Note

    This mode is available only for EMR V5.12.1, EMR V3.46.1, or a minor version later than EMR V5.12.1 or EMR V3.46.1.

    • Client-side configurations

      1. After you save the configurations, the To Be Delivered prompt appears.

      2. Click the To Be Delivered prompt.

      3. In the Configurations to Be Delivered message, click Deliver.

        Note

        For the YARN service, if the configuration items that you added include queue-related configuration items, saving and delivering the configurations is not enough. To make them take effect, you must also click the prompt that appears or click Deploy on the Edit Resource Queue tab of the YARN service page.

    • Server-side configurations

      1. After you save the configurations, the Not Effective Yet prompt appears.

      2. Click the Not Effective Yet prompt.

      3. In the Configurations to Take Effect dialog box, perform the operations that match how the configurations take effect.

        • Configurations That Require Custom Operations

          In the Actions column of each component, click the listed operations to make the configurations take effect.

        • Configurations That Require Restart

          1. Click Restart in the Actions column of a component. Alternatively, select multiple components and click Batch Restart.

          2. In the dialog box that appears, configure the Execution Reason parameter and click OK.

Modifiable configuration items

The following table lists the configuration items that can be modified at the node level, the node group level, or both, in clusters of EMR V5.17.1.

Note

Kerberos-related configuration items are available only if Kerberos authentication is enabled.

Service name   File name                          Configuration item

Hadoop-Common  core-site.xml                      fs.oss.tmp.data.dirs
                                                  hadoop.tmp.dir

HDFS           hdfs-env.sh                        hadoop_datanode_heapsize
                                                  hadoop_secondarynamenode_opts
                                                  hadoop_namenode_heapsize
               hdfs-site.xml                      dfs.datanode.data.dir
                                                  dfs.datanode.failed.volumes.tolerated
                                                  dfs.datanode.du.reserved
                                                  dfs.datanode.balance.max.concurrent.moves

OSS-HDFS       None                               None

Hive           hive-env.sh                        hive_metastore_heapsize
                                                  hive_server2_heapsize

Spark 2        hiveserver2-site.xml               hive.server2.authentication.kerberos.principal
               spark-env.sh                       spark_history_daemon_memory
                                                  spark_thrift_daemon_memory
               spark-thriftserver.conf            spark.yarn.historyServer.address
                                                  spark.hadoop.hive.server2.thrift.bind.host
                                                  spark.yarn.principal
               spark-defaults.conf                spark.yarn.historyServer.address
                                                  spark.history.kerberos.principal

Spark 3        hiveserver2-site.xml               hive.server2.authentication.kerberos.principal
               spark-env.sh                       spark_history_daemon_memory
                                                  spark_thrift_daemon_memory
               spark-thriftserver.conf            spark.yarn.historyServer.address
                                                  spark.hadoop.hive.server2.thrift.bind.host
                                                  spark.kerberos.principal
               spark-defaults.conf                spark.yarn.historyServer.address
                                                  spark.history.kerberos.principal

Tez            None                               None

Trino          iceberg.properties                 hive.hdfs.trino.principal
                                                  hive.metastore.client.principal
               delta.properties                   hive.hdfs.trino.principal
                                                  hive.metastore.client.principal
               config.properties                  coordinator
                                                  node-scheduler.include-coordinator
                                                  query.max-memory
                                                  query.max-total-memory
                                                  query.max-memory-per-node
                                                  http-server.authentication.type
                                                  http-server.authentication.krb5.user-mapping.pattern
                                                  http-server.authentication.krb5.service-name
                                                  http-server.authentication.krb5.keytab
                                                  http.authentication.krb5.config
                                                  http-server.https.enabled
                                                  http-server.https.port
                                                  http-server.https.keystore.key
                                                  http-server.https.keystore.path
                                                  event-listener.config-files
               Note: event-listener.config-files specifies the path where the configuration file of an event listener is stored. This configuration item is available only if you turn on EmrEventListener.
               jvm.config                         jvm parameter
               hudi.properties                    hive.hdfs.trino.principal
                                                  hive.metastore.client.principal
               password-authenticator.properties  ldap.url
                                                  ldap.user-bind-pattern
               hive.properties                    hive.hdfs.trino.principal
                                                  hive.metastore.client.principal

Delta Lake     None                               None

Hudi           None                               None

Iceberg        None                               None

JindoData      storage.yaml                       jindofsx.storage.cache-mode
                                                  storage.watermark.high.ratio
                                                  storage.watermark.low.ratio
                                                  storage.handler.threads
               Note:
                 • JindoData applies to clusters of EMR V5.14.0 or a later minor version and clusters of EMR V3.48.0 or a later minor version.
                 • JindoData is unavailable for clusters of EMR V5.15.0 or a later minor version and clusters of EMR V3.49.0 or a later minor version. You can use JindoCache for data caching and DLF-Auth for authentication.

Flume          flume-conf.properties              agent_name
                                                  flume-conf.properties

Kyuubi         kyuubi-env.sh                      kyuubi_java_opts

YARN           yarn-site.xml                      yarn.nodemanager.resource.memory-mb
                                                  yarn.nodemanager.local-dirs
                                                  yarn.nodemanager.log-dirs
                                                  yarn.nodemanager.resource.cpu-vcores
                                                  yarn.nodemanager.address
                                                  yarn.nodemanager.node-labels.provider.configured-node-partition
               yarn-env.sh                        YARN_RESOURCEMANAGER_HEAPSIZE
                                                  YARN_TIMELINESERVER_HEAPSIZE
                                                  YARN_PROXYSERVER_HEAPSIZE
                                                  YARN_NODEMANAGER_HEAPSIZE
                                                  YARN_RESOURCEMANAGER_HEAPSIZE_MIN
                                                  YARN_TIMELINESERVER_HEAPSIZE_MIN
                                                  YARN_PROXYSERVER_HEAPSIZE_MIN
                                                  YARN_NODEMANAGER_HEAPSIZE_MIN
               mapred-env.sh                      HADOOP_JOB_HISTORYSERVER_HEAPSIZE
               mapred-site.xml                    mapreduce.cluster.local.dir

Impala         None                               None

OpenLDAP       None                               None

Ranger         None                               None

Ranger-Plugin  None                               None

DLF-Auth       None                               None

Presto         iceberg.properties                 hive.hdfs.presto.principal
                                                  hive.metastore.client.principal
               delta.properties                   hive.hdfs.presto.principal
                                                  hive.metastore.client.principal
               hive.properties                    hive.hdfs.presto.principal
                                                  hive.metastore.client.principal
               config.properties                  coordinator
                                                  node-scheduler.include-coordinator
                                                  query.max-memory-per-node
                                                  query.max-total-memory-per-node
                                                  http-server.authentication.type
                                                  http.authentication.krb5.principal-hostname
                                                  http.server.authentication.krb5.service-name
                                                  http.server.authentication.krb5.keytab
                                                  http.authentication.krb5.config
                                                  http-server.https.enabled
                                                  http-server.https.port
                                                  http-server.https.keystore.key
                                                  http-server.https.keystore.path
               jvm.config                         jvm parameter
               hudi.properties                    hive.hdfs.presto.principal
                                                  hive.metastore.client.principal
               password-authenticator.properties  ldap.url
                                                  ldap.user-bind-pattern

StarRocks2     fe.conf                            JAVA_OPTS
                                                  meta_dir
               be.conf                            storage_root_path
                                                  JAVA_OPTS

StarRocks3     fe.conf                            JAVA_OPTS
                                                  meta_dir
               be.conf                            storage_root_path
                                                  JAVA_OPTS

Doris          fe.conf                            JAVA_OPTS
                                                  JAVA_OPTS_FOR_JDK_9
                                                  meta_dir
               be.conf                            storage_root_path

ClickHouse     server-config                      interserver_http_host
               server-metrika                     macros.shard
                                                  macros.replica

ZooKeeper      None                               None

Sqoop          None                               None

Knox           None                               None

Celeborn       celeborn-env.sh                    CELEBORN_WORKER_MEMORY
                                                  CELEBORN_WORKER_OFFHEAP_MEMORY
                                                  CELEBORN_MASTER_MEMORY
               celeborn-defaults.conf             celeborn.worker.storage.dirs
                                                  celeborn.worker.flusher.threads

Flink          flink-conf.yaml                    security.kerberos.login.principal
                                                  security.kerberos.login.keytab

HBase          hbase-env.sh                       hbase_master_opts
                                                  hbase_thrift_opts
                                                  hbase_rest_opts
                                                  hbase_regionserver_opts
               hbase-site.xml                     hbase.regionserver.handler.count
                                                  hbase.regionserver.global.memstore.size
                                                  hbase.regionserver.global.memstore.lowerLimit
                                                  hbase.regionserver.thread.compaction.throttle
                                                  hbase.regionserver.thread.compaction.large
                                                  hbase.regionserver.thread.compaction.small

HBase-HDFS     hdfs-env.sh                        hadoop_secondarynamenode_opts
                                                  hadoop_namenode_heapsize
                                                  hadoop_datanode_heapsize
               hdfs-site.xml                      dfs.datanode.data.dir
                                                  dfs.datanode.failed.volumes.tolerated
                                                  dfs.datanode.du.reserved
                                                  dfs.datanode.balance.max.concurrent.moves

JindoCache     None                               None

Kafka          server.properties                  broker.id
                                                  num.network.threads
                                                  num.io.threads
                                                  kafka.heap.opts
                                                  log.dirs
                                                  kafka.public-access.ip
                                                  listeners
                                                  advertised.listeners
               Note: kafka.public-access.ip specifies the public IP address of a Kafka broker. You can use this configuration item to configure listeners that have public IP addresses.
               kafka-internal-config              broker_id
                                                  user_params
                                                  is_local_disk_instance

Kudu           master.gflags                      fs_data_dirs
                                                  fs_wal_dir
                                                  fs_metadata_dir
                                                  log_dir
               tserver.gflags                     fs_data_dirs
                                                  fs_wal_dir
                                                  fs_metadata_dir
                                                  log_dir

Paimon         None                               None

Phoenix        None                               None