Cluster CPU utilization exceeds the threshold

Updated at: 2023-04-19 07:25

Description

This alert is triggered when the CPU utilization of an OBServer in an OceanBase cluster exceeds the threshold. Here, the CPU utilization of the OBServer means the CPU utilization of the entire server, as measured at the OS level.

Alert rule

For more information about how to add an alert rule, see Add an alert rule.

  • Alert item: CPU utilization on an OBServer

  • Metric type: Single metric

  • Metric: cpu_util

  • Default threshold: 90%

  • Duration (consecutive cycles for triggering an alert): 15

  • Detection cycle: 1 minute

  • Alert level: Warning
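
The rule above can be read as "raise a Warning once cpu_util has been above 90 for 15 consecutive 1-minute detection cycles". The following is a minimal sketch of that reading, not the alert engine's actual implementation. The names AlertRule and first_trigger_index are hypothetical, and the sketch assumes one sample per detection cycle and a strict "greater than" comparison, neither of which is stated in this topic.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class AlertRule:
    """Hypothetical container for the default rule values listed above."""
    metric: str = "cpu_util"
    threshold: float = 90.0        # percent
    consecutive_cycles: int = 15   # cycles the metric must stay above the threshold
    level: str = "Warning"


def first_trigger_index(samples: Iterable[float], rule: AlertRule) -> Optional[int]:
    """Return the index of the sample at which the alert would first fire.

    Assumes one sample per detection cycle. Returns None if the metric never
    stays above the threshold for enough consecutive cycles.
    """
    streak = 0
    for index, value in enumerate(samples):
        if value > rule.threshold:      # assumption: "exceeds" is a strict comparison
            streak += 1
            if streak >= rule.consecutive_cycles:
                return index
        else:
            streak = 0                  # any cycle at or below the threshold resets the count
    return None
```

Under these assumptions, a short spike does not raise the alert: a single cycle at or below the threshold resets the count, so the alert fires only after 15 uninterrupted minutes of high utilization.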

Impact on the system

  • An overloaded CPU slows down the handling of read and write requests and may even cause requests to time out, which degrades the service performance of the system.

Possible causes

  • The application queries a large amount of data or generates hotspot data.

  • The resource plan of the cluster cannot keep up with business requirements or with the load generated by hotspot data.

Solutions

  1. Check whether the load is normal for the application.

    1. Log on to the ApsaraDB for OceanBase console. Click the target cluster name on the Clusters list to go to the Cluster Workspace page.

    2. Click Tenant Management, and then click the target tenant to go to the Tenant Workspace page.

    3. On the Performance Monitoring page of the tenant workspace, view the CPU Utilization curve and check whether the CPU utilization at the alert time increased abruptly compared with the CPU utilization over the past one to seven days.

      • If yes, the CPU load was abnormal.

      • Otherwise, the high CPU load was caused by normal access traffic. In this case, consider scaling out the tenant.

        [Figure: High CPU utilization on an OBServer - 1]

  2. If the high CPU utilization was caused by queries on a large amount of data or by hotspot traffic, perform the following steps based on the actual scenario.

    • Large SQL queries were executed. In this case, click the TopSQL tab on the Diagnosis page, and then check whether SQL queries with high CPU utilization exist.

      • If yes, optimize the SQL queries that caused high CPU utilization.

      • Otherwise, high CPU utilization was not caused by large SQL queries.

        [Figure: High CPU utilization on an OBServer - 2]

    • Slow SQL queries were executed. In this case, click the SlowSQL tab on the Diagnosis page, and then check the diagnosis result for slow SQL queries. Optimize the slow SQL queries that you find.

    • To limit the impact of problematic SQL statements, enable throttling for them on the Diagnosis page.
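
The check in step 1, which compares the CPU utilization at the alert time with the past one to seven days, can also be sketched in code for values exported from the monitoring page. This is only an illustration under stated assumptions: the function name is_abrupt_increase and the min_jump and sigmas cutoffs are hypothetical, and the console does not define a numeric rule for what counts as "abrupt".

```python
from statistics import mean, pstdev
from typing import Sequence


def is_abrupt_increase(
    alert_value: float,
    baseline: Sequence[float],
    min_jump: float = 20.0,   # hypothetical: minimum jump in percentage points
    sigmas: float = 3.0,      # hypothetical: multiples of the baseline spread
) -> bool:
    """Return True if alert_value stands well above the baseline samples.

    baseline holds CPU utilization values (percent) observed at comparable
    times over the past one to seven days.
    """
    if not baseline:
        raise ValueError("baseline must contain at least one sample")
    base_mean = mean(baseline)
    base_spread = pstdev(baseline)  # population standard deviation; 0 for one sample
    # The jump must clear both an absolute floor and the normal variation.
    return alert_value - base_mean > max(min_jump, sigmas * base_spread)
```

For example, an alert-time value of 93 against a baseline that hovers around 40 would be reported as an abrupt increase, which points to an abnormal load. The same value against a baseline that hovers around 90 would not, which matches the case where normal access traffic is already high and scaling out the tenant is the suggested response.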
