You can use the recovery notification feature to learn when a monitored object recovers. If you enable the feature, Simple Log Service sends a recovery notification when the monitored object recovers.
For example, an alert monitoring rule is created to monitor the CPU metrics of each host. If the CPU utilization of a host exceeds 95%, an alert is triggered. Then, if the CPU utilization decreases to 95% or less, a recovery notification is sent. The following figure shows the configurations. For more information about the parameters in an alert monitoring rule, see Create an alert monitoring rule for logs.
Query Statistics: Specify the following query statement, which calculates the CPU utilization of each host:
* | select promql_query_range('cpu_util') from metrics limit 1000
Group Evaluation: Select Auto Tag.
This value specifies that the query and analysis result of time series data is automatically grouped.
Trigger Condition: Select data matches the expression, enter value > 95, and then select Severity: High.
If the value field in the query and analysis result is greater than 95, an alert of high severity is triggered.
Add Annotation: Specify annotations, such as the title and description of an alert. You can reference field variables such as ${host} in an annotation. For more information, see Labels and annotations.
Recovery Notifications: Turn on Recovery Notifications.
Recovery notifications are special alert notifications that Simple Log Service sends in the same format as normal alert notifications. In a recovery notification, the Alert Status field displays Resolved. In a normal alert notification, the Alert Status field displays Firing. If the recovery notification feature is enabled in an alert monitoring rule, a recovery notification is sent when an alert was triggered in the previous check and the trigger condition is not met in the current check.
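The recovery logic described above can be sketched in a few lines of Python. This is only an illustration of the rule "alert in the previous check, condition not met in the current check", not the implementation that Simple Log Service uses. The function name evaluate_check, the host names, and the sample values are hypothetical, and the sketch assumes that a Firing notification is produced on every check in which the condition is met.

```python
from typing import Dict, List, Set, Tuple

# Threshold taken from the trigger condition "value > 95" in the example rule.
THRESHOLD = 95.0


def evaluate_check(
    previous_firing: Set[str],
    samples: Dict[str, float],
) -> Tuple[Set[str], List[Tuple[str, str]]]:
    """Evaluate one check (hypothetical helper, for illustration only).

    previous_firing: hosts whose alert was triggered in the previous check.
    samples: CPU utilization per host in the current check.
    Returns the hosts that are firing now and the (host, status) notifications.
    Simplification: hosts absent from `samples` produce no notification.
    """
    now_firing: Set[str] = set()
    notifications: List[Tuple[str, str]] = []
    for host, value in sorted(samples.items()):
        if value > THRESHOLD:
            # Trigger condition is met: normal alert notification.
            now_firing.add(host)
            notifications.append((host, "Firing"))
        elif host in previous_firing:
            # Alert fired in the previous check and the condition is no
            # longer met: recovery notification.
            notifications.append((host, "Resolved"))
    return now_firing, notifications


if __name__ == "__main__":
    firing, notes = evaluate_check(set(), {"host-a": 97.0, "host-b": 40.0})
    print(notes)  # [('host-a', 'Firing')]
    firing, notes = evaluate_check(firing, {"host-a": 95.0, "host-b": 40.0})
    print(notes)  # [('host-a', 'Resolved')]
```

In the second check, host-a reports exactly 95, which does not satisfy value > 95, so its status changes to Resolved. host-b never triggered an alert, so it produces no notification in either check.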