The Operation and Maintenance feature of MaxCompute allows you to view historical jobs and jobs that are running, view job details, and analyze the resource load of a job when the job is running. This helps you manage jobs.
Feature description
The Operation and Maintenance feature of MaxCompute allows you to view and manage historical jobs and jobs that are running in your project.
Data developers can use this feature to view job details, identify job exceptions, and troubleshoot job issues at the earliest opportunity. For example, data developers can terminate one or more jobs in which exceptions occur to handle job issues.
Administrators can use this feature to view the resource load at a specific point in time and allocate and manage system resources in an efficient manner based on the quota group to which the resource belongs. This helps improve job execution efficiency and performance.
You can configure filter conditions to filter jobs on the Jobs page of the MaxCompute console. This helps you query the details of a job and analyze a job. You can perform the following operations on the Jobs page.
Operations
Filter jobs
You can configure filter parameters to query the details of jobs. The following table describes the filter parameters.
Sort jobs
The job filtering results are sorted by job completion time in descending order, with unfinished jobs appearing at the top. Basic single-column sorting and advanced multi-column sorting are supported.
Basic single-column sorting: Sort the column with a sort button in the list in ascending or descending order.
Advanced multi-column sorting: Click the Advanced Sorting button in the upper-right corner of the list, add columns by clicking Add Sort, and specify the sort order such as ascending and descending for each column. Click OK to apply the multi-column sorting.
Note: When advanced sorting conditions are applied, basic single-column sorting is unavailable. To use basic single-column sorting again, click the Advanced Sorting button in the upper-right corner of the list, click Reset, and then click OK.
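The two sorting modes described above can be sketched in a few lines of Python. This is only an illustration of the ordering rules, not the console's implementation. The `Job` record and its field names (`instance_id`, `project`, `end_time`, `cpu_pct`) are hypothetical, and an `end_time` of `None` is assumed to mean that the job has not finished.

```python
from datetime import datetime
from typing import List, NamedTuple, Optional, Tuple


class Job(NamedTuple):
    # Hypothetical record; the field names are assumptions for this sketch.
    instance_id: str
    project: str
    end_time: Optional[datetime]  # None means the job has not finished
    cpu_pct: float


def default_order(jobs: List[Job]) -> List[Job]:
    """Unfinished jobs first, then finished jobs by completion time, newest first."""
    unfinished = [j for j in jobs if j.end_time is None]
    finished = sorted(
        (j for j in jobs if j.end_time is not None),
        key=lambda j: j.end_time,
        reverse=True,
    )
    return unfinished + finished


def advanced_sort(jobs: List[Job], criteria: List[Tuple[str, bool]]) -> List[Job]:
    """Multi-column sort. criteria is a list of (field, ascending) pairs,
    highest priority first. Fields must hold comparable, non-None values."""
    result = list(jobs)
    # Python's sort is stable, so sorting by the lowest-priority column first
    # and the highest-priority column last yields a multi-column ordering.
    for field, ascending in reversed(criteria):
        result.sort(key=lambda j: getattr(j, field), reverse=not ascending)
    return result
```

For example, `advanced_sort(jobs, [("project", True), ("cpu_pct", False)])` groups jobs by project in ascending order and, within each project, lists the jobs with the highest `cpu_pct` first.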
View job details
To view the details of a job, perform the following steps: In the job list, find the desired job and click LogView in the Actions column to go to the LogView page. On the page that appears, view the status, details, and results of the job.
Terminate jobs
You can terminate one or more jobs that are in the Running state at a time.
Jobs Insight
You can perform insight operations on an individual job to view the job summary, resource consumption, and the resource allocation of the quota at a specific point in time, and to trigger intelligent diagnostics for the job.
Note: Intelligent diagnostics are available exclusively for SQL jobs.
Jobs with a runtime less than 2 minutes or jobs of types other than SQL, MapReduce, Spark, and Mars do not have job-level resource consumption data.
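The availability rules in this note can be expressed as two small checks. The following Python sketch is illustrative only: the job type strings and the function names are assumptions, and a runtime of exactly 2 minutes is treated as long enough because the note excludes only runtimes of less than 2 minutes.

```python
# Job types that report job-level resource consumption, per the note above.
# The exact string values are assumptions for this sketch.
RESOURCE_TRACKED_TYPES = {"SQL", "MapReduce", "Spark", "Mars"}
MIN_RUNTIME_SECONDS = 2 * 60


def has_job_level_resource_data(job_type: str, runtime_seconds: float) -> bool:
    """True if the job is of a tracked type and ran for at least 2 minutes."""
    return job_type in RESOURCE_TRACKED_TYPES and runtime_seconds >= MIN_RUNTIME_SECONDS


def supports_intelligent_diagnostics(job_type: str) -> bool:
    """Intelligent diagnostics are available only for SQL jobs."""
    return job_type == "SQL"
```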
View the chart that displays job statistics
A stacked column chart displays the number of jobs by time and job state based on the query results. This helps you view the overall status of your jobs.
Jobs
Query results are obtained based on filter conditions and provide job information for you to manage jobs.
The following job information cannot be collected:
Snapshot information of some jobs. The system collects the snapshot information at an interval of 3 minutes. As a result, the system does not collect the snapshots of jobs that are started within 3 minutes before collection.
Information about specific MaxCompute jobs that are created based on the PAI service, especially the jobs that are created by using RAM users.
Information about jobs in the projects of the MaxCompute developer edition. The MaxCompute developer edition will be phased out.
Data is processed at specific intervals. Therefore, some jobs appear in the Running state in the query results even though they are shown as complete on the LogView page. In most cases, this issue occurs when a job runs for a very short period of time. If this issue occurs, use the job state on the LogView page.
Examples of O&M scenarios
View the details about a specific job
Scenario
You want to view the details of a specific MaxCompute job or a job that is scheduled by a DataWorks node on an hourly basis.
Procedure
Log on to the MaxCompute console. In the left-side navigation pane, click Jobs.
Specify the Time Range parameter based on your business requirements.
Click Search.
Select ExtNodeId or Instance ID from the drop-down list below the query results and enter the value of ExtNodeId or Instance ID for your job.
Click the icon to filter the jobs.
In the query results, you can find the desired instance and click LogView in the Actions column to view the details of the job on the LogView page. For more information about LogView, see Use LogView V2.0 to view job information.
View the details about a job in a specific time range
Scenario
You want to view the jobs from the last day in the Project_1 and Project_2 projects, identify failed jobs, and troubleshoot errors.
Procedure
Log on to the MaxCompute console. In the left-side navigation pane, click Jobs.
On the Jobs page, set the Time Range parameter to 1d, or set a custom time range from the current time of the previous day to the current time of the current day.
Select Project_1 and Project_2 from the Choose Project drop-down list.
In the query results, you can find the desired instance and click LogView in the Actions column to view the details of the job on the LogView page. For more information about LogView, see Use LogView V2.0 to view job information.
View the resources occupied by a job with a subscription quota at a specific point in time
Scenario
A large number of resources in the quota group named Subscription Default Quota are occupied. As a result, multiple jobs are waiting for the resources of the quota group. You can use the following method to view the jobs that use the quota.
Procedure
Log on to the MaxCompute console. In the left-side navigation pane, click Jobs.
Set the Time Range parameter to 1h. Alternatively, specify a custom start time and set the end time to the current time. The end time is the time when you observe the job.
Set the Select Quota parameter to Subscription Default Quota.
Click Search.
In the query results, view the CPU Utilization Percentage Snapshot and Memory Usage Percentage Snapshot parameters of the jobs whose Snapshot Status is Running. Check whether a job that has large values for these parameters meets your business requirements. Based on other job information, determine whether the job runs as expected or needs to be terminated.
Note: In the query results, you can find the desired instance and click LogView in the Actions column to view the details of the job on the LogView page. For more information about LogView, see Use LogView V2.0 to view job information.
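If you copy the query results out of the console for offline review, the same check can be sketched in Python. This is a minimal illustration, not a console or SDK feature: the dictionary keys (`instance_id`, `snapshot_status`, `cpu_pct`, `mem_pct`) are hypothetical names for the columns described above.

```python
from typing import Dict, List


def heaviest_running_jobs(rows: List[Dict], top_n: int = 3) -> List[Dict]:
    """Return the running jobs with the largest CPU and memory snapshot values.

    Each row is assumed to carry the hypothetical keys 'snapshot_status',
    'cpu_pct', and 'mem_pct'.
    """
    running = [r for r in rows if r["snapshot_status"] == "Running"]
    # Rank by CPU percentage first and use memory percentage as the tie-breaker.
    running.sort(key=lambda r: (r["cpu_pct"], r["mem_pct"]), reverse=True)
    return running[:top_n]
```

The jobs returned first are the candidates to review against your business requirements before you decide whether to terminate them.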
View details of an MCQA job
Scenario
You want to view the status and details of MCQA jobs from the last day.
Procedure
Log on to the MaxCompute console. In the left-side navigation pane, click Jobs.
Set the Time Range parameter to 1d and select SQLRT (Query Acceleration) from the Job Type drop-down list.
Click Search.
In the query results, you can find the desired instance and click LogView in the Actions column to view the details of the job on the LogView page. For more information about LogView, see Use LogView V2.0 to view job information.
Note: For MCQA jobs, multiple SQL statements may be executed in the same session. One session corresponds to one instance ID. You can click an instance ID to view the status of all SQL statements in a session on the LogView page. Take note of the following items when you query a job of this type on the Operation and Maintenance page:
An active session indicates that some SQL statements are still being executed. If a session remains active, the job is in the Running state.
If a session expires or is closed, the job is in the Canceled state.
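The mapping between session state and the job state shown on the page can be summarized as follows. This Python sketch only restates the two rules above. `SessionState` and `mcqa_job_state` are hypothetical names, not part of any MaxCompute SDK.

```python
from enum import Enum


class SessionState(Enum):
    # Hypothetical session states used only for this sketch.
    ACTIVE = "active"
    EXPIRED = "expired"
    CLOSED = "closed"


def mcqa_job_state(session: SessionState) -> str:
    """Map an MCQA session state to the job state displayed for its instance."""
    if session is SessionState.ACTIVE:
        return "Running"
    # An expired or closed session is displayed as a canceled job.
    return "Canceled"
```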
View the resource consumption of a job and the resource allocation of computing quotas at a specific point in time
Scenario
If a job does not complete for a long period of time and you cannot locate the cause on the LogView page, or if a completed job ran at a low speed, you can analyze the job to check whether the issue is caused by insufficient resources.
Procedure
Log on to the MaxCompute console. In the left-side navigation pane, click Jobs.
On the Jobs page, specify the Time Range and Select Quota parameters and click Search to filter MaxCompute jobs.
In the obtained results, find the desired job and click Insight in the Actions column to go to the Job Insights page.
On the CU Usage tab, you can view the resource consumption in the lifecycle of the job.
Based on the resource consumption chart, you can view the trend of the number of compute units (CUs) that the job uses and the number of CUs that the job waits to use within a specific period of time, and the trend of the CU metrics at the quota group level within a specific period of time. If the number of CUs used by the job is small, but the number of CUs used by the jobs in the quota group is large or even continuously reaches the upper limit, the resources in the quota group are insufficient. In this case, other jobs preempt computing resources from the current job.
You can click a time point in the horizontal axis of the resource consumption trend chart to view the resource allocation in the quota group at the point in time. You can view the number of jobs that are using CUs and the number of jobs that wait to use CUs and view the statistics on the priorities of existing jobs. You can click the legend that corresponds to the desired priority to go to the job list and view the details of the jobs. This way, you can identify the jobs that preempt computing resources from the current job. You can adjust job priorities or manage computing resources to optimize job execution based on your business requirements. For more information, see Job priority or Manage quotas in the new MaxCompute console.
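The reasoning in this procedure (the job waits for CUs while the quota group stays at or near its upper limit) can be sketched as a simple check over sampled chart values. This is an illustrative heuristic under stated assumptions, not how the console computes anything: the `CuSample` fields and the `saturation` and `min_fraction` thresholds are hypothetical and should be tuned to your own workload.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CuSample:
    # One point read from the resource consumption chart (hypothetical fields).
    job_used: float     # CUs used by the job
    job_waiting: float  # CUs that the job waits to use
    quota_used: float   # CUs used by all jobs in the quota group
    quota_limit: float  # upper limit of the quota group


def quota_is_bottleneck(
    samples: List[CuSample],
    saturation: float = 0.95,   # assumed threshold for "reaches the upper limit"
    min_fraction: float = 0.5,  # assumed share of samples that must be starved
) -> bool:
    """True if the job waited for CUs while the quota group was saturated
    in at least min_fraction of the samples."""
    if not samples:
        return False
    starved = [
        s for s in samples
        if s.job_waiting > 0 and s.quota_used >= saturation * s.quota_limit
    ]
    return len(starved) / len(samples) >= min_fraction
```

A `True` result suggests checking which jobs hold the CUs at those points in time, as described in the preceding step.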
What to do next
If a job occupies an excessive number of resources and affects the execution of other jobs, take one of the following actions:
If the job does not meet your business requirements, you can terminate the job.
If the job meets your business requirements, the resource settings of the quota group are inappropriate. In this case, you must optimize the resource configuration plan. For more information, see Computing cost optimization.
References
You can run commands to view the details and status of a job and terminate a job. For more information, see Instance operations.