E-MapReduce: Overview

Last Updated: Sep 02, 2024

E-MapReduce (EMR) Doctor is an intelligent O&M system developed by the Alibaba Cloud EMR team for open source big data clusters. EMR Doctor provides health diagnostics and daily cluster reports so that you can track the health status of your clusters. If a cluster is in an abnormal state, EMR Doctor provides suggestions for O&M and resource optimization. You can use EMR Doctor on the Monitoring and Diagnostics tab of the cluster details page.

If you are an O&M engineer responsible for an EMR cluster, you typically need the following information:

  • The overall cluster stability, which can be evaluated based on factors such as the status of key services in the cluster and exception handling for the services. The key services include YARN, Hadoop Distributed File System (HDFS), Hive, and Spark.

  • The overall efficiency of the cluster, such as the load on the cluster and its effective memory and CPU utilization.

  • The service level agreement (SLA) that must be maintained for cluster users. You must make sure that sufficient resources are allocated to key tasks and that the key tasks complete as expected.
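
The memory and CPU figures mentioned in the list above can be roughly approximated from YARN's own counters. The following is a minimal sketch under stated assumptions, not a description of how EMR Doctor computes its metrics. It assumes a dictionary shaped like the clusterMetrics object returned by the YARN ResourceManager REST API (/ws/v1/cluster/metrics). The function name yarn_allocation_ratios and the sample values are hypothetical. Note that allocation is not the same as effective utilization, because containers may not use everything they reserve.

```python
def yarn_allocation_ratios(cluster_metrics: dict) -> dict:
    """Return the allocated/total ratio for memory and vCores.

    cluster_metrics is assumed to use the field names of the YARN
    ResourceManager clusterMetrics object (allocatedMB, totalMB,
    allocatedVirtualCores, totalVirtualCores).
    """
    total_mb = cluster_metrics["totalMB"]
    total_vcores = cluster_metrics["totalVirtualCores"]
    return {
        # Guard against an empty cluster (no registered NodeManagers).
        "memory": cluster_metrics["allocatedMB"] / total_mb if total_mb else 0.0,
        "vcores": (
            cluster_metrics["allocatedVirtualCores"] / total_vcores
            if total_vcores
            else 0.0
        ),
    }


# Hypothetical sample values, for illustration only.
sample = {
    "allocatedMB": 49152,
    "totalMB": 65536,
    "allocatedVirtualCores": 12,
    "totalVirtualCores": 32,
}
ratios = yarn_allocation_ratios(sample)
print(f"memory allocated: {ratios['memory']:.1%}")
print(f"vCores allocated: {ratios['vcores']:.1%}")
```

A persistently high allocation ratio combined with low actual usage inside the containers is the kind of gap that resource optimization suggestions are meant to surface.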

EMR Doctor is a manager for open source big data clusters that provides the following capabilities:

  • Monitors the health status of the cluster in real time and provides suggestions on the usage of key services. This helps reduce cluster O&M costs and continuously improve cluster stability.

  • Provides information about the usage and allocation of cluster resources and allocates suitable hardware resources to improve the cluster resource utilization.

  • Helps optimize the service components in the cluster and the tasks that run on it, and provides optimization suggestions to keep the overall data and computing pipelines efficient and stable.

EMR Doctor provides the following features:

  • In-depth health diagnostics. This feature is used to analyze the health status of clusters and nodes. You can identify issues based on the diagnostic result and troubleshoot the issues based on suggestions. For more information, see Initiate health diagnostics.

  • Daily cluster report. This feature is used to generate a score for the cluster based on the health status of the cluster and provide intelligent optimization suggestions. For more information, see View daily cluster reports and analysis results in the reports.

  • Information integration and intelligent diagnostics. EMR Doctor integrates information from across the cluster for analysis and uses intelligent algorithms to diagnose problems. This reduces the heavy, repetitive work involved in operating big data clusters.