YARN is a distributed resource management system and a core component of Hadoop. It manages the resources of a Hadoop cluster, and schedules and monitors the jobs that run in the cluster.
Components
Component | Description |
ResourceManager | Manages and schedules cluster resources and allocates resources for various types of jobs that are running on YARN. For a non-high availability (HA) Hadoop cluster, ResourceManager is deployed on the master node of the cluster. For an HA Hadoop cluster, ResourceManagers are deployed on multiple master nodes of the cluster. |
NodeManager | Manages and monitors node resources and runs jobs on nodes. NodeManagers are deployed on core or task nodes of a Hadoop cluster. |
MapReduce History Server (MRHistoryServer) | Parses the metrics of a MapReduce job, displays the job status, and periodically deletes expired aggregation logs. |
TimelineServer | Collects the metrics of a job and displays the job status. Note: This component is used only to monitor the resource usage of individual jobs. It does not affect the development, submission, or running of data jobs. |
WebAppProxyServer | Redirects requests to the URL of a job, which reduces the probability of web-based attacks. |
ApplicationMaster | Handles transactions related to applications. For example, ApplicationMaster schedules resources obtained from ResourceManager and communicates with NodeManagers to monitor and manage resources. |
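
To illustrate how ResourceManager tracks the NodeManagers described above, the following sketch summarizes a response shaped like the one returned by the ResourceManager REST endpoint `/ws/v1/cluster/nodes`. The host names, the port, and the sample payload are made up for illustration, and the sketch parses a local string instead of calling a live cluster. Port 8088 is the stock Hadoop default for the ResourceManager web interface and may differ in your cluster.

```python
import json
from collections import Counter

# Sample payload shaped like the ResourceManager REST response for
# GET http://<rm-host>:8088/ws/v1/cluster/nodes (all values are made up).
SAMPLE_RESPONSE = """
{"nodes": {"node": [
  {"id": "core-1:8041", "nodeHostName": "core-1", "state": "RUNNING", "numContainers": 4},
  {"id": "core-2:8041", "nodeHostName": "core-2", "state": "RUNNING", "numContainers": 0},
  {"id": "task-1:8041", "nodeHostName": "task-1", "state": "DECOMMISSIONING", "numContainers": 2}
]}}
"""


def summarize_nodes(payload: str) -> dict:
    """Count NodeManagers per state and sum their running containers."""
    body = json.loads(payload)
    # Tolerate a missing or null "nodes" key so an empty cluster does not raise.
    nodes = (body.get("nodes") or {}).get("node", [])
    states = Counter(node["state"] for node in nodes)
    containers = sum(node.get("numContainers", 0) for node in nodes)
    return {"states": dict(states), "containers": containers}


summary = summarize_nodes(SAMPLE_RESPONSE)
print(summary)
```

In a real cluster, the payload would come from an HTTP request to an active ResourceManager. The summary logic stays the same.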
Benefits
- By default, YARN is deployed in HA mode in an HA Hadoop cluster.
- O&M is convenient.
You can add NodeManagers, decommission NodeManagers, and perform a rolling restart on NodeManagers in the E-MapReduce (EMR) console.
- Monitoring and alerting are supported.
YARN can monitor various metrics and report alerts based on alert rules.
- Graceful decommission of NodeManagers is supported.
If graceful decommission is enabled, YARN waits up to a specific period of time for all running tasks on a node to complete before it decommissions the NodeManager.
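
The graceful decommission behavior can be modeled with a short sketch. This is an illustrative model only, not YARN code: the `Node` class, the function names, and the tick-based simulation are hypothetical. It also assumes that the node is decommissioned anyway once the timeout period elapses, which matches stock Hadoop behavior.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical stand-in for a NodeManager and its running task count."""
    name: str
    running_tasks: int


def decommission_action(node: Node, elapsed_s: int, timeout_s: int) -> str:
    """Decide what to do with a node that is marked for graceful decommission."""
    if node.running_tasks == 0:
        return "decommission"        # nothing left to wait for
    if elapsed_s >= timeout_s:
        return "force-decommission"  # assumption: the timeout ends the wait
    return "wait"                    # tasks still running, timeout not reached


def simulate(node: Node, timeout_s: int, tick_s: int = 10, done_per_tick: int = 1):
    """Advance time in fixed ticks until the node leaves the 'wait' state."""
    elapsed = 0
    while True:
        action = decommission_action(node, elapsed, timeout_s)
        if action != "wait":
            return action, elapsed
        elapsed += tick_s
        # Pretend a fixed number of tasks finish during each tick.
        node.running_tasks = max(0, node.running_tasks - done_per_tick)


print(simulate(Node("task-1", running_tasks=3), timeout_s=60))
print(simulate(Node("task-2", running_tasks=10), timeout_s=30))
```

In this model, the first node is decommissioned as soon as its three tasks finish, and the second node reaches the timeout while tasks are still running.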