A DataWorks task instance that runs on the E-MapReduce (EMR) compute engine contains multiple EMR jobs, which are run in sequence. You can use the engine O&M feature provided by DataWorks to view the details of each EMR job, find jobs that failed to run, and remove the failed jobs. This prevents failed jobs from affecting the DataWorks task instances to which the jobs belong and the descendant instances of those task instances.
Limits
DataWorks allows you to perform O&M only on EMR jobs. You must submit a ticket to upgrade your EMR execution package to obtain O&M data.
Engine Maintenance is displayed in the left-side navigation pane of the Operation Center page only after you register an EMR cluster to your DataWorks workspace.
If you have purchased an exclusive resource group for scheduling, you must submit a ticket to upgrade the configurations of the resource group. If you do not upgrade the configurations, the values of specific fields are displayed as hyphens (-) on the engine O&M page.
Precautions
Tasks of specific EMR services can reuse YARN applications when the tasks are run. After a YARN application is reused, the same job ID (application ID) is displayed on the engine O&M page for such a task when the task is run in different DataWorks services.
For example, the kyuubi.engine.share.level parameter, which specifies the share level of an EMR Kyuubi engine, is set to USER by default. This indicates that each user uses an engine. All jobs that are initiated by the same user on the engine share the same application ID. When you run an EMR Kyuubi task in DataWorks DataStudio, an application ID is generated for the task. If you proceed to analyze the task in DataAnalysis, no new application ID is generated for the task on the engine O&M page. The application ID that is generated when you run the task in DataStudio is reused. The usage of YARN applications varies based on the EMR service type.
The engine O&M page displays only the application ID that is generated when you run an EMR job in DataWorks for the first time.
After the DataWorks task instance to which an EMR job belongs is successfully run or fails to be run, the corresponding YARN application may still be in the RUNNING state. For example, the kyuubi.session.engine.idle.timeout parameter, which specifies the timeout period for an idle session, determines whether a YARN application is retained for a period of time. If the kyuubi.session.engine.idle.timeout parameter is set to PT30M, the corresponding YARN application is retained for 30 minutes after an EMR Kyuubi job finishes running. You can go to the EMR on ECS page to view the parameter settings of Kyuubi.
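The PT30M value in the preceding example is written in ISO 8601 duration notation. The following Python sketch shows one way to convert such a value into a retention window so that you can estimate how long a YARN application may stay in the RUNNING state after a job finishes. It is an illustration only: the function names parse_idle_timeout and retained_until are hypothetical, the parser handles only the hour, minute, and second parts of a duration, and it is not the parser that Kyuubi itself uses.

```python
import re
from datetime import datetime, timedelta

# Matches durations such as PT30M, PT1H, or PT1H30M15S.
# Assumption: only the H/M/S components are needed for this illustration.
_DURATION = re.compile(r"PT(?:(\d+)H)?(?:(\d+)M)?(?:(\d+)S)?")


def parse_idle_timeout(value: str) -> timedelta:
    """Convert a duration string such as 'PT30M' into a timedelta."""
    match = _DURATION.fullmatch(value)
    if match is None or not any(match.groups()):
        raise ValueError(f"unsupported duration: {value!r}")
    hours, minutes, seconds = (int(part) if part else 0 for part in match.groups())
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


def retained_until(job_end: datetime, timeout: str) -> datetime:
    """Estimate when an idle YARN application would stop being retained."""
    return job_end + parse_idle_timeout(timeout)
```

For example, retained_until(datetime(2024, 1, 1, 12, 0), "PT30M") returns 12:30 on the same day, which matches the 30-minute retention described above.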
Prerequisites
An EMR cluster is registered to your DataWorks workspace, and relevant EMR tasks are run in DataWorks.
For more information about how to register an EMR cluster, see Register an EMR cluster to DataWorks.
For more information about how to run an EMR task, see Usage notes for development of EMR nodes in DataWorks.
Go to the EMR engine O&M page
Log on to the DataWorks console. In the top navigation bar, select the desired region. In the left-side navigation pane, choose . On the page that appears, select the desired workspace from the drop-down list and click Go to Operation Center.
In the left-side navigation pane of the Operation Center page, choose
.
View EMR jobs
On the EMR engine O&M page, you can view the list of the EMR jobs that are created in all DataWorks workspaces in the current region. You can also view the details of the jobs and perform O&M operations on the jobs based on your business requirements.
Search for an EMR job in Area 1.
You can specify different conditions, such as job ID and job type, in the upper part of the EMR engine O&M page to search for an EMR job.
Note: By default, the EMR engine O&M page displays the data of the previous three days.
If you want to search for EMR jobs by DataWorks instance ID, you can enter only the IDs of instances that are run in Operation Center in the DataWorks Instance ID field. If you search for EMR jobs by job ID or DataWorks instance ID, you can query only instances to which the EMR jobs belong in the previous seven days.
Perform O&M operations on EMR jobs in Area 2.
In this section, you can view the details of a selected job and perform O&M operations on the job based on your business requirements.
Feature
Description
View job details
You can view the basic information about an EMR job, including the job ID, job status, running duration, job source, and the DataWorks task instance to which the job belongs.
Job status:
NEW: The EMR job is newly created.
NEW_SAVING: The EMR job is being saved.
SUBMITTED: The EMR job is submitted for running.
ACCEPTED: The submitted request for running the job is approved by the scheduling system.
RUNNING: The EMR job is running.
Note: If an EMR job is in the RUNNING state for an extended period of time, you can manually terminate the DataWorks task instance to which the EMR job belongs. This prevents the EMR job from occupying resources and affecting the descendant instances.
FINISHED: The EMR job finishes running.
SUCCESSED: The EMR job is successfully run.
FAILED: The EMR job fails to be run. If an EMR job is in the FAILED state, you must identify and troubleshoot issues at the earliest opportunity. This can prevent the EMR job from affecting the running of the DataWorks task instance to which the EMR job belongs and its descendant task instances. You can click the job ID or the ID of the DataWorks instance to which the job belongs to go to the details page and troubleshoot issues.
KILLED: The EMR job is terminated by the user who runs the job or the administrator.
DataWorks instance ID:
Different EMR jobs may belong to the same DataWorks task instance. If EMR jobs start to run at different points in time, the EMR jobs are considered to belong to different DataWorks task instances. To determine whether EMR jobs belong to the same DataWorks task instance, you can view the ID in the Node Instance ID column for each EMR job.
Note: No instance ID is generated for tasks that are triggered to run in specific DataWorks services, such as Data Quality, DataStudio, and DataAnalysis. In this case, the system displays hyphens (-) in the Node Instance ID column for the corresponding jobs.
EMR job type: You can view only EMR jobs of the MapReduce and Spark types.
Sorting by running time: You can sort jobs by start time or end time in ascending or descending order. This way, you can clearly view the running sequence, running durations, and status of EMR jobs.
Job source: You can view the DataWorks service in which an EMR job is run. You can go to the corresponding service page to view the details of the instance to which the job belongs by clicking a button in the Actions column.
Queue usage (%): the percentage of queue resources allocated by the cluster resource manager YARN when you run the current job.
Perform operations on instances to which EMR jobs belong
Terminate a DataWorks task instance.
If an EMR job is in the RUNNING state for an extended period of time, you can manually terminate the job. An EMR job may be in the RUNNING state for an extended period of time because of an internal error, and the job cannot be automatically terminated. To prevent the job from occupying resources and affecting the running of other jobs, you must manually terminate the job and troubleshoot issues at your earliest opportunity.
Terminate a single job: Find the job that you want to terminate and click Terminate Running in the Actions column.
Terminate multiple jobs at a time: Select the jobs that you want to terminate and click Stop DataWorks Node Instances in the lower-left corner of the EMR engine O&M page. This terminates the DataWorks task instances to which the selected jobs belong in a single operation.
Important: Only the workspace administrator, users to which the O&M role is assigned, and task owners can terminate task instances.
If multiple EMR jobs belong to the same DataWorks task instance and you terminate one of the EMR jobs, the DataWorks task instance enters the FAILED state.
Only DataWorks task instances that are in the running state can be terminated.
After you terminate a running EMR job in a DataWorks task instance, the DataWorks task instance enters the FAILED state. In this case, descendant instances of the DataWorks task instance are blocked. Exercise caution when you terminate a running EMR job.
Go to service pages to view instance details
If you want to view the details of an instance to which an EMR job belongs, you can find the EMR job on the EMR engine O&M page and click a button that corresponds to a service, such as DataStudio, in the Actions column to go to the service page in which the instance is triggered to run.
Note:
DataAnalysis: Only file owners can go to the DataAnalysis page to view SQL query files.
DataStudio: For an instance that is triggered to run in DataStudio, all developers in the current workspace can view the instance on the DataStudio page. Only the user who triggers the instance to run can view the running history of the instance.
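The job statuses and the Node Instance ID column described in this topic can be combined to decide which DataWorks task instances need attention. The following Python sketch groups a hand-written list of jobs by instance ID and reports the instances that own at least one FAILED or KILLED job. It is a hedged illustration: the EmrJob record, the function names, and the sample IDs are hypothetical, and this topic does not describe an API that returns this data, so the list is assumed to be copied from the engine O&M page.

```python
from collections import defaultdict
from dataclasses import dataclass

# State names are taken from the status list in this topic.
NEEDS_ATTENTION = {"FAILED", "KILLED"}


@dataclass(frozen=True)
class EmrJob:
    job_id: str            # hypothetical application ID
    state: str             # one of the job statuses listed in this topic
    node_instance_id: str  # "-" when no instance ID was generated


def group_by_instance(jobs):
    """Group jobs by Node Instance ID, skipping jobs shown with a hyphen."""
    groups = defaultdict(list)
    for job in jobs:
        if job.node_instance_id != "-":
            groups[job.node_instance_id].append(job)
    return dict(groups)


def instances_to_review(jobs):
    """Return the instance IDs that own at least one FAILED or KILLED job."""
    return sorted(
        instance_id
        for instance_id, members in group_by_instance(jobs).items()
        if any(job.state in NEEDS_ATTENTION for job in members)
    )


# Sample data for illustration only.
sample_jobs = [
    EmrJob("application_0001", "FINISHED", "1001"),
    EmrJob("application_0002", "FAILED", "1001"),
    EmrJob("application_0003", "RUNNING", "1002"),
    EmrJob("application_0004", "FAILED", "-"),
]
print(instances_to_review(sample_jobs))
```

The job without an instance ID is skipped because, as noted above, jobs triggered from services such as DataStudio or DataAnalysis show a hyphen in the Node Instance ID column and must be traced through the corresponding service page instead.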