Open source components that are deployed in an E-MapReduce (EMR) cluster generate a large number of logs while they run. You can use the log management feature together with Log Service to query these logs in the EMR console.
Prerequisites
An EMR cluster is created in the EMR console. For more information, see Create a cluster.
Log Service is activated. For more information, see Getting Started.
Limits
This topic applies only to Dataflow, OLAP, and DataServing clusters, DataLake clusters in the new data lake scenario, and Hadoop clusters in the original data lake scenario.
You can query logs of the following services that are deployed in an EMR cluster: Hadoop Distributed File System (HDFS), YARN, YARN application, Hive, Spark, JindoData, Tez, Flink, HBase, ZooKeeper, Kafka, Presto, Kudu, Impala, Flume, StarRocks, ClickHouse, Kyuubi, RSS, and hosts.
Precautions
Log Service charges you for the storage that your log data occupies and for the index traffic that the log data generates. For more information, see Billable items.
If you are using a RAM user, use your Alibaba Cloud account to log on to the Resource Access Management (RAM) console and attach the AliyunLogFullAccess and AliyunRAMFullAccess policies to the RAM user. For more information, see Grant permissions to the RAM user.
Query real-time logs
Go to the Logs tab.
Log on to the EMR console. In the left-side navigation pane, click EMR on ECS.
In the top navigation bar, select the region where your cluster resides and select a resource group based on your business requirements.
On the EMR on ECS page, find the desired cluster and click the name of the cluster in the Cluster ID/Name column.
On the page that appears, click the Logs tab.
Configure the log shipping scope.
On the Logs tab, click Set Log Shipping Scope.
In the Set Log Shipping Scope dialog box, specify a project to store logs.
You can select Select Existing Project or Create Project.
Important: After you specify a project, you cannot change the project.
Select the services whose logs you want to ship and click OK.
View the logs of a service.
Select the service whose logs you want to view from the Please select EMR service drop-down list.
Analyze the logs.
In a real-time log query, you can specify a time range and a query statement. For example, you can analyze the distribution of a specified field within a specific time range. You can also specify filter conditions to narrow the results to the log entries that you want to view.
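The following is a minimal sketch of how such a query statement and time range could be composed. It assumes the Log Service convention of a search statement, optionally followed by a vertical bar and an SQL-style analytic statement. The helper names build_query and last_minutes and the field names level and hostname are hypothetical. Check the indexed fields of your own Logstore before you reuse them. The sketch only builds strings and timestamps and does not call Log Service.

```python
from datetime import datetime, timedelta, timezone


def build_query(filters=None, group_by=None):
    """Compose a 'search | analytic' query statement (hypothetical helper).

    filters:  mapping of field name -> value, combined with 'and'
    group_by: field whose value distribution is counted
    """
    # '*' matches all logs when no filter condition is given.
    if filters:
        search = " and ".join(f"{key}: {value}" for key, value in filters.items())
    else:
        search = "*"

    # Without a field to analyze, the search statement alone is the query.
    if group_by is None:
        return search

    # Count the logs per distinct value of the field, most frequent first.
    analytic = (
        f"SELECT {group_by}, count(*) AS cnt "
        f"GROUP BY {group_by} ORDER BY cnt DESC"
    )
    return f"{search} | {analytic}"


def last_minutes(minutes, now=None):
    """Return (start, end) Unix timestamps in seconds for the last N minutes."""
    now = now or datetime.now(timezone.utc)
    start = now - timedelta(minutes=minutes)
    return int(start.timestamp()), int(now.timestamp())


# Assumed field names: 'level' and 'hostname' are placeholders.
print(build_query({"level": "ERROR"}, group_by="hostname"))
# prints: level: ERROR | SELECT hostname, count(*) AS cnt GROUP BY hostname ORDER BY cnt DESC
```

You can paste the resulting statement into the query box on the Logs tab and select the matching time range there. Values that contain spaces or special characters would need quoting, which this sketch does not handle.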
Disable log shipping
If you disable log shipping, the project is not automatically deleted. After you disable log shipping, log on to the Log Service console and delete the project that is automatically created to prevent unexpected charges. For more information, see Manage a project.
If you no longer require log data, perform the following steps to disable log shipping:
Go to the Logs tab.
Log on to the EMR console. In the left-side navigation pane, click EMR on ECS.
In the top navigation bar, select the region where your cluster resides and select a resource group based on your business requirements.
On the EMR on ECS page, find the desired cluster and click the name of the cluster in the Cluster ID/Name column.
On the page that appears, click the Logs tab.
On the Logs tab, click Disable Log Shipping.
In the Disable Log Shipping message, click OK.
Manage Log Service projects
On the Logs tab, click Go to Log Service Console to go to the Log Service console. In the Log Service console, you can specify a storage period for log data and modify Logstore-related configurations.
FAQ
Q: Log Service is activated, and the required permissions are granted. However, the following message appears: "The resource is unavailable. Please open service log collection on EMR console". What do I do?
A: Log shipping is not configured. On the Logs tab, click Set Log Shipping Scope and complete the log shipping configuration.