When you use MaxCompute, you can monitor subscription resources and the consumption of pay-as-you-go jobs to track the status of your resources. This helps you upgrade resources or reschedule jobs in time. You can also configure alert rules. If the status of resources meets an alert rule, CloudMonitor automatically sends you an alert notification.
Monitoring and alerting scheme
The monitoring and alerting feature of MaxCompute allows you to perform the following operations:
Use CloudMonitor to configure metrics. This way, you can monitor subscription resources and real-time job consumption.
Log on to the MaxCompute console and view the number of alerts for each metric in the Alert and Risk Warnings section of the Overview page.
View monitoring charts on a dashboard to follow changes in metrics in real time. For more information, see Configure a dashboard.
Configure custom alert rules and add alert contacts. If the value of a metric reaches or exceeds the specified alert threshold, CloudMonitor automatically sends alert notifications to the specified contacts. Alert notifications can be sent by phone call, text message, email, or DingTalk chatbot. For more information, see Configure alert rules.
Use the MaxCompute client to monitor resource consumption of an SQL statement. For more information about how to monitor the resource consumption of SQL statements, see Configure an upper limit for the resources that are consumed by an SQL statement.
Metrics
The following table describes the metrics.
Metric type | Metric category | Metric | Description |
MaxCompute_Subscription | level1 | PrePayQuotaCpuUsagePercentageOnLevel1 | The CPU utilization of level-1 quotas (sum of the CPU utilization of reserved CUs and elastically reserved CUs). Unit: %. The system collects data once every minute. |
MaxCompute_Subscription | level1 | PrePayQuotaCpuUsageValueOnLevel1 | The total number of used CPU cores of level-1 quotas. Unit: core. The system collects data once every minute. |
MaxCompute_Subscription | level1 | PrePayQuotaMemUsagePercentageOnLevel1 | The memory usage of level-1 quotas (sum of the memory usage of reserved CUs and elastically reserved CUs). Unit: %. The system collects data once every minute. |
MaxCompute_Subscription | level1 | PrePayQuotaMemUsageValueOnLevel1 | The total size of used memory resources of level-1 quotas. Unit: MB. The system collects data once every minute. |
MaxCompute_Subscription | level2 | PrePayQuotaCpuUsagePercentageOnLevel2 | The CPU utilization of level-2 quotas (sum of the CPU utilization of reserved CUs specified by minCU and elastically reserved CUs). Unit: %. The system collects data once every minute. |
MaxCompute_Subscription | level2 | PrePayQuotaCpuUsageValueOnLevel2 | The total number of used CPU cores of level-2 quotas. Unit: core. The system collects data once every minute. |
MaxCompute_Subscription | level2 | PrePayQuotaMemUsagePercentageOnLevel2 | The memory usage of level-2 quotas (sum of the memory usage of reserved CUs specified by minCU and elastically reserved CUs). Unit: %. The system collects data once every minute. |
MaxCompute_Subscription | level2 | PrePayQuotaMemUsageValueOnLevel2 | The total size of used memory resources of level-2 quotas. Unit: MB. The system collects data once every minute. |
MaxCompute_Subscription | level2 | PrePayQuotaJobWaitingNumberOnLevel2 | The number of waiting jobs that use level-2 quotas. The system collects data once every minute. |
MaxCompute_PayAsYouGo | N/A | PayAsYouGo jobs' daily consumption in USD | The metric that is used to monitor the daily fees of SQL and MapReduce jobs in a project. You can specify the maximum daily consumption amount (USD). If the daily consumption amount of a project reaches or exceeds the value of this parameter, an alert is triggered. |
MaxCompute_PayAsYouGo | N/A | PayAsYouGo jobs' monthly consumption in USD | The metric that is used to monitor the monthly fees of SQL and MapReduce jobs in a project. You can specify the maximum monthly consumption amount (USD). If the monthly consumption amount of a project reaches or exceeds the value of this parameter, an alert is triggered. |
You can configure a dashboard or alert rules for a metric. For more information, see Configure a dashboard or Configure alert rules.
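The "reaches or exceeds" condition of the pay-as-you-go consumption metrics can be illustrated with a minimal Python sketch. It is not CloudMonitor code: the function name `projects_over_limit`, the project names, and the amounts are hypothetical and only show how the comparison behaves at the boundary.

```python
from typing import Dict, List


def projects_over_limit(consumption_usd: Dict[str, float], limit_usd: float) -> List[str]:
    """Return the projects whose consumption reaches or exceeds the limit.

    Hypothetical helper: it mirrors the alert condition described in the
    metrics table, not an actual CloudMonitor API.
    """
    if limit_usd < 0:
        raise ValueError("limit_usd must not be negative")
    # ">=" because an alert is triggered when the amount reaches OR exceeds the limit.
    return sorted(
        project for project, amount in consumption_usd.items() if amount >= limit_usd
    )


# Made-up daily consumption per project, in USD.
daily = {"project_a": 42.5, "project_b": 100.0, "project_c": 99.99}
print(projects_over_limit(daily, limit_usd=100.0))  # prints ['project_b']
```

A project that lands exactly on the limit is reported, while one just below it is not.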
Configure a dashboard
Log on to the CloudMonitor console.
In the left-side navigation pane, go to the Custom Dashboards page.
On the Custom Dashboards page, click Create Dashboard. On the page that appears, click Add Chart in the upper-right corner.
On the New Board page, select a chart type and metric.
Operation | Parameter | Description |
Select a chart type | Line, Area, Table, Heat Map, or Pie Chart | Select a chart type based on your business requirements. The following chart types are provided on a dashboard: Line, Area, Table, Heat Map, and Pie Chart. |
Select metrics | Product Name | Select a metric type of MaxCompute from the Cloud Services drop-down list on the Dashboards tab. For more information about the metric types of MaxCompute, see Metrics. |
Select metrics | Metric Name | Select a metric from the Monitoring indicators drop-down list. For more information about the metrics of MaxCompute, see Metrics. |
Select metrics | Resource | Select the regions and projects that you want to monitor from the Resource drop-down list. |
After the configuration is complete, click OK. Then, you can view the charts of the metrics in your custom dashboard.
For more information about how to add a monitoring chart, see Manage the monitoring charts of a custom dashboard.
Configure alert rules
You can configure alert rules for each metric listed in Metrics.
You can configure an alert rule that triggers alerts for a metric of a resource group. This way, an alert is triggered if the CPU utilization or memory usage of the CUs in a MaxCompute_Subscription quota group exceeds the specified threshold. For example, assume that the resource group that you want to monitor has 150 CUs. The full CPU utilization of one CU is 100%, so the maximum CPU utilization of all CUs is 15000%. You can configure a rule that triggers an alert if the CPU utilization of the CUs exceeds 12000%, which is 80% of the maximum. If you receive this alert, the resource group is about to be fully utilized, and jobs may be queued if you continue to submit them. You can then upgrade the resource group or reschedule jobs based on your business requirements. To configure alert rules in this scenario, perform the following steps:
Log on to the CloudMonitor console.
In the left-side navigation pane, go to the Alert Rules page.
On the Alert Rules page, click Create Alert Rule.
In the Create Alert Rule panel, configure alert rule parameters for this scenario. For more information about the parameters, see Create an alert rule. For more information about how to configure an alert contact, see Create an alert contact or alert contact group.
The following table describes the key parameters that you must configure for an alert rule in the preceding scenario.
Parameter | Description |
Product | Select MaxCompute_Subscription from the Product drop-down list. |
Resource Range | Select Instances. |
Associated Resources | Region: Select the region in which the MaxCompute project resides from the Region drop-down list in the upper-left corner of the pane that appears. QuotaGroup: Select the quota group that you want to monitor from the quota group list. For more information about quota groups, see Quota management (new). |
Add Rule | Rule Name: Enter the name of the alert rule. Metric Type: Select Single Metric. Metric: Select Subscription quota cpu usage from the drop-down list. You can also select QuotaGroupJobWaitQueue from the Rule Description drop-down list. If the CPU utilization is high and a large number of jobs are queued for N consecutive periods, manual intervention may be required. |
Click Confirm.
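The threshold arithmetic in this scenario, and the idea of a condition that holds for N consecutive periods, can be sketched in Python as follows. This is an illustrative sketch and not the CloudMonitor implementation: the function names, the sample values, and the waiting-job threshold of 3 are assumptions made for the example. Only the figures 150 CUs, 15000%, and 12000% come from the scenario above.

```python
from typing import Sequence


def max_cpu_utilization_percent(cu_count: int) -> int:
    """Full CPU utilization of a quota group, where each CU contributes 100%."""
    if cu_count <= 0:
        raise ValueError("cu_count must be positive")
    return cu_count * 100


def alert_threshold_percent(cu_count: int, share_percent: int) -> float:
    """Threshold equal to share_percent of the full CPU utilization."""
    if not 0 < share_percent <= 100:
        raise ValueError("share_percent must be in the range (0, 100]")
    return max_cpu_utilization_percent(cu_count) * share_percent / 100


def breached_for_consecutive_periods(
    samples: Sequence[float], threshold: float, periods: int
) -> bool:
    """True if each of the most recent `periods` samples exceeds the threshold."""
    if periods <= 0:
        raise ValueError("periods must be positive")
    recent = samples[-periods:]
    # Fewer samples than required periods cannot count as a sustained breach.
    return len(recent) == periods and all(value > threshold for value in recent)


# Scenario from the text: 150 CUs, alert at 80% of the 15000% maximum.
threshold = alert_threshold_percent(150, 80)
print(max_cpu_utilization_percent(150), threshold)  # prints 15000 12000.0

# Made-up per-minute samples of CPU utilization (%) and waiting jobs.
cpu_samples = [11000, 12500, 13000, 14000]
waiting_jobs = [0, 4, 6, 9]
needs_attention = breached_for_consecutive_periods(
    cpu_samples, threshold, periods=3
) and breached_for_consecutive_periods(waiting_jobs, threshold=3, periods=3)
print(needs_attention)  # prints True
```

Combining the two checks reflects the note above: high CPU utilization alone may be acceptable, but high utilization together with a growing queue over several periods suggests that manual intervention is needed.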
References
For more information about the limits on the consumption of pay-as-you-go computing tasks and related alerts, see Consumption control.