ApsaraDB for OceanBase: Exception handling

Last Updated: Oct 22, 2024

The exception handling feature lists the abnormal events that occurred in the database cluster within the last 3 days, including events that are still ongoing. You can use this feature to quickly check the health status of the cluster and, when an abnormal event occurs, run root cause analysis to pinpoint the cause of the problem.

View the list of abnormal events for all instances

  1. Log on to the ApsaraDB for OceanBase console.

  2. In the left-side navigation pane, choose Autonomy Service > Exception Handling.

  3. In the Abnormal Events section, view the list of abnormal events for all instances.

    By default, the system displays all abnormal events from the last 3 days, including those that are still ongoing and those that have recovered. The following types of abnormal events are currently supported: Node CPU Exception, Tenant CPU Exception, Tenant SQL Queue Wait Time Exception, Data Disk I/O Usage Exception, Tenant Active Session Count Exception, and Tenant Disk I/O Time Exception.

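
The default view described above can be pictured with a minimal Python sketch. It is not the console's implementation: the `AbnormalEvent` record, the `recent_events` helper, and the rule that an event belongs to the window when its occurrence time falls inside it are all assumptions made for illustration. Only the six event type names come from the list above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional

# Event type names as listed in the console documentation.
SUPPORTED_TYPES = {
    "Node CPU Exception",
    "Tenant CPU Exception",
    "Tenant SQL Queue Wait Time Exception",
    "Data Disk I/O Usage Exception",
    "Tenant Active Session Count Exception",
    "Tenant Disk I/O Time Exception",
}


@dataclass
class AbnormalEvent:
    """Hypothetical record for one abnormal event."""
    instance: str
    event_type: str
    occurred_at: datetime
    recovered_at: Optional[datetime] = None  # None means the event is still ongoing

    @property
    def status(self) -> str:
        return "Ongoing" if self.recovered_at is None else "Recovered"


def recent_events(events: List[AbnormalEvent], now: datetime,
                  window: timedelta = timedelta(days=3)) -> List[AbnormalEvent]:
    """Keep supported events whose occurrence time is within the window (assumed rule)."""
    cutoff = now - window
    return [
        e for e in events
        if e.event_type in SUPPORTED_TYPES and e.occurred_at >= cutoff
    ]
```

Calling `recent_events(events, datetime.now())` on a list of such records returns both ongoing and recovered events from the last 3 days, which mirrors what the Abnormal Events section shows by default.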

View abnormal events for a single instance

  1. In the Abnormal Events section, click Root Cause Analysis in the Operation column of the target instance.

    The system redirects you to the Exception Handling page of the diagnostics center.

  2. In the Abnormal Events section, view abnormal events of the target instance, including Object, Exception Type, Abnormal Performance, Current Status, Occurrence Time, Recovery Time, Duration, and Operation.

  3. Click Root Cause Analysis in the Operation column of an abnormal event to view the root cause analysis and optimization suggestions for that event.

    • If the cause of the abnormal event is in the analysis graph, the system highlights the cause in red and provides optimization suggestions.

      Note

      In the analysis graph, each node represents an analysis rule. During root cause analysis, the system traverses the graph to find the root cause node. The root cause node is highlighted in red. A green node indicates that its rule did not hit the root cause.

      The following is an example:

      Upon detecting Tenant Queue Waiting Becomes Longer within the specified time period, the system reports that the CPU usage is too high. In the Suspicious Cause section, you can click the box highlighted in red to view the corresponding root cause analysis.

      In the SQL Summary Information section, the system displays SQL Summary Time Period, Total Executions, Total Number of Error Executions, Maximum Elapsed Time, CPU Time, and Plan Generation Time by default. To view more information, click Manage Columns.

      In the Possible Root Cause SQL section, you can view the SQL statements that may have caused the problem and click View SQL Details in the Actions column to view the details of a statement.


    • If the cause of the abnormal event is not in the analysis graph, the system provides optimization suggestions in the Solution section.

      The following is an example:

      Upon detecting Tenant CPU Exception, the system still displays the analysis graph and provides optimization suggestions in the Solution section.

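
The note on the analysis graph says that each node is a rule and that the system traverses the graph to find the root cause node. The following Python sketch shows one way such a traversal could work. It is an illustration only, not the product's algorithm: the `RuleNode` type, the metric names, the thresholds, and the convention that a matching rule with no matching child rule is the root cause are all assumptions. The documentation defines only the red and green colors, so the intermediate `"matched"` label is also hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

Metrics = Dict[str, float]


@dataclass
class RuleNode:
    """Hypothetical graph node: one analysis rule plus its more specific child rules."""
    name: str
    check: Callable[[Metrics], bool]  # True when the rule matches the metrics
    children: List["RuleNode"] = field(default_factory=list)


def find_root_causes(event: RuleNode, metrics: Metrics) -> Tuple[Dict[str, str], List[str]]:
    """Depth-first walk below the detected event; returns (colors, root cause names)."""
    colors: Dict[str, str] = {}
    causes: List[str] = []

    def visit(node: RuleNode) -> bool:
        if not node.check(metrics):
            colors[node.name] = "green"  # rule did not hit; its subtree is skipped
            return False
        child_hits = [visit(child) for child in node.children]
        if any(child_hits):
            colors[node.name] = "matched"  # on the path to a deeper cause (assumed label)
        else:
            colors[node.name] = "red"  # deepest matching rule is treated as the root cause
            causes.append(node.name)
        return True

    for rule in event.children:
        visit(rule)
    return colors, causes


# Hypothetical graph for a queue wait event; thresholds are made up for the example.
graph = RuleNode(
    "Tenant SQL Queue Wait Time Exception",
    check=lambda m: True,
    children=[
        RuleNode(
            "CPU usage too high",
            check=lambda m: m["cpu_percent"] > 90,
            children=[RuleNode("Single SQL dominates CPU",
                               check=lambda m: m["top_sql_cpu_share"] > 0.5)],
        ),
        RuleNode("Disk I/O usage too high", check=lambda m: m["io_percent"] > 90),
    ],
)
```

If `find_root_causes` returns an empty list of causes, no rule in the graph explains the event. That corresponds to the second case above, where the graph is still displayed and the suggestions appear in the Solution section instead.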