E-MapReduce: Overview

Last Updated: Jan 16, 2024

YARN is the distributed resource management system of Hadoop and one of its core components. It manages the resources in Hadoop clusters, and schedules and monitors the jobs that run in the clusters.

Components

Component

Description

ResourceManager

Manages and schedules cluster resources and allocates resources for various types of jobs that are running on YARN.

For a non-high availability (HA) Hadoop cluster, ResourceManager is deployed on the master node of the cluster. For an HA Hadoop cluster, ResourceManagers are deployed on multiple master nodes of the cluster.

NodeManager

Manages and monitors node resources and runs jobs on nodes.

NodeManagers are deployed on core or task nodes of a Hadoop cluster.

MapReduce History Server (MRHistoryServer)

Parses the metrics of a MapReduce job, displays the job status, and periodically deletes expired aggregated logs.

TimelineServer

Collects the metrics of a job and displays the job status.

Note

This component only monitors the resource usage of individual jobs. A fault in this component does not cause the development, submission, or running of data jobs to fail.

WebAppProxyServer

Redirects requests to the URL of a job, which reduces the probability of web-based attacks.

ApplicationMaster

Handles the coordination work for an application.

For example, ApplicationMaster requests resources from ResourceManager, assigns them to the tasks of its application, and communicates with NodeManagers to run and monitor those tasks.
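In an HA cluster, only one of the ResourceManagers is active at a time. The following Python sketch shows one way to find out which one, based on the `haState` field that the ResourceManager REST endpoint `/ws/v1/cluster/info` returns. The host names and the port 8088 (the usual ResourceManager web port) are assumptions for illustration, not values taken from an actual EMR cluster.

```python
import json
from typing import Optional
from urllib.request import urlopen


def fetch_cluster_info(base_url: str, timeout: float = 5.0) -> dict:
    """Fetch the parsed /ws/v1/cluster/info response of one ResourceManager.

    base_url is an assumption for illustration, for example
    "http://master-1-1:8088" (hypothetical host name).
    """
    with urlopen(f"{base_url}/ws/v1/cluster/info", timeout=timeout) as resp:
        return json.load(resp)


def find_active_rm(infos: dict) -> Optional[str]:
    """Return the name of the ResourceManager whose haState is ACTIVE.

    infos maps a name of your choice to the parsed cluster info
    response of that ResourceManager. Returns None if none is active.
    """
    for name, info in infos.items():
        if info.get("clusterInfo", {}).get("haState") == "ACTIVE":
            return name
    return None


# Sample responses, reduced to the field that matters here.
sample = {
    "master-1-1": {"clusterInfo": {"haState": "STANDBY"}},
    "master-1-2": {"clusterInfo": {"haState": "ACTIVE"}},
}
print(find_active_rm(sample))
```

To use the sketch against a running cluster, build the `infos` dictionary by calling `fetch_cluster_info` once for each master node.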

Benefits

YARN in a Hadoop cluster has the following benefits:
  • By default, YARN is deployed in HA mode in an HA Hadoop cluster.
  • Operations and maintenance (O&M) are convenient.

    You can add NodeManagers, decommission NodeManagers, and perform a rolling restart on NodeManagers in the E-MapReduce (EMR) console.

  • Monitoring and alerting are supported.

    YARN can monitor various metrics and report alerts based on alert rules.

  • Graceful decommission of NodeManagers is supported.

    If graceful decommission is enabled, YARN waits for a specific period of time for all running tasks on the node to complete before it decommissions the NodeManager.
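The waiting rule behind graceful decommission can be sketched as a small decision function. This is a simplified model, not YARN's implementation: the `NodeState` class, its fields, and the assumption that a node is released when either its tasks finish or the waiting period expires are all illustrative.

```python
from dataclasses import dataclass


@dataclass
class NodeState:
    """Hypothetical snapshot of a NodeManager that is being drained."""
    running_containers: int  # tasks still running on the node
    seconds_waiting: float   # time elapsed since decommission was requested


def can_decommission(node: NodeState, timeout_seconds: float) -> bool:
    """Return True once the node may be taken out of service.

    Assumed rule: the node is released as soon as it is idle, or when
    the waiting period expires, whichever happens first.
    """
    if node.running_containers == 0:
        return True
    return node.seconds_waiting >= timeout_seconds


# A busy node inside the waiting period is kept in service.
print(can_decommission(NodeState(running_containers=3, seconds_waiting=120), 3600))
# An idle node is released immediately.
print(can_decommission(NodeState(running_containers=0, seconds_waiting=120), 3600))
```

The first call prints `False` and the second prints `True`. A longer waiting period gives long-running tasks more time to finish, at the cost of a slower scale-in.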