This topic provides answers to some frequently asked questions about the Alert Management sub-service of Application Real-Time Monitoring Service (ARMS).
Questions
What are the differences in alert rules between the old and new versions of the alerting feature in Managed Service for Prometheus?
Alibaba Cloud has verified the efficacy of the alert rule template that is provided by the new version of the alerting feature in Managed Service for Prometheus. The alert rules provided in the old version are open source and have not been verified by Alibaba Cloud.
In addition, the new version of the alerting feature provides the Alert Management sub-service, which is not available in the old version. After you create alert rules in ARMS and alerts are generated, the alert events are sent to the Alert Management sub-service. You can subscribe to notifications for specific alerts in Alert Management based on your business requirements.
Alert Management provides the following benefits:
You can configure alert rules in a more efficient manner. You need to configure only trigger rules for alerts when you create alert rules. Then, you can attach notification policies to alerts to send alert notifications based on your requirements.
You can configure notification policies that are more fine-grained than alert rules. For example, you can subscribe to alert notifications based on namespaces of Container Service for Kubernetes (ACK).
You can configure one notification policy and bind it to multiple alert rules, so you do not need to set the notification method separately for each alert rule.
We recommend that you configure the rules in the notification policy to dispatch alerts to user groups.
Scenarios:
Infrastructure O&M: The user group needs to be subscribed to alerts about the resource usage of the production cluster and alerts about ACK components.
You can configure the following dispatch rules:
Rule 1:
alertname == CPU utilization of nodes higher than 80% & clusterName == Production cluster
Rule 2:
alertname == ApiServer Failures & clusterName == Production cluster
Payment service O&M: The user group needs to be subscribed to alerts from the pay and pay-pre namespaces in the production cluster.
You can configure the following dispatch rule:
namespace Regex match pay.* & clustername == Production cluster
Notifications for alerts at the P1 level: The user group needs to be subscribed to alerts whose severity is critical.
You can configure the following dispatch rule:
severity == critical & clustername == Production cluster
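The payment-service rule relies on a regular expression, so it helps to check which namespaces pay.* actually selects. The following Python sketch assumes that the pattern must match the whole label value, as Alertmanager-style matchers do. How Alert Management anchors the pattern is not stated in this topic, and the namespaces other than pay and pay-pre are hypothetical.

```python
import re

# Pattern taken from the payment-service dispatch rule above.
PATTERN = re.compile(r"pay.*")


def namespace_matches(namespace: str) -> bool:
    """Return True if the whole namespace value matches the pattern.

    Assumption: the regular expression is anchored to the full label value.
    """
    return PATTERN.fullmatch(namespace) is not None


# "payment" and "prepay" are hypothetical namespaces used for illustration.
for ns in ["pay", "pay-pre", "payment", "prepay"]:
    print(ns, namespace_matches(ns))
```

Under that assumption, pay.* also selects a namespace such as payment. If only the two intended namespaces should match, a stricter pattern such as pay(-pre)? may be a better fit.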
I have configured a new notification policy. Why am I still receiving alert notifications based on earlier notification policies?
The earlier notification policy may still match the alert. To check, perform the following steps:
Check and note down the value of the Notification Policy field in the alert notification that is sent to you based on the earlier notification policy. For more information, see View historical alerts.
Find the notification policy in the ARMS console, and check whether the dispatch rules in the notification policy match the alert whose notification you received. For more information, see Create and manage a notification policy.
If the dispatch rules match the alert, you will still receive alert notifications based on the earlier notification policy.
I have configured a notification policy. Why am I receiving notifications about alerts that I do not want to view?
Check and note down the value of the Notification Policy field in the alert notification that you received. For more information, see View historical alerts.
Find the notification policy in the ARMS console, and check whether the dispatch rules in the notification policy match the alert whose notification you received. For more information, see Create and manage a notification policy.
If the dispatch rules match the alert, you will receive alert notifications based on the notification policy.
Why does the dispatch rule _aliyun_arms_alert_rule_id appear in a notification policy?
If you specify a notification policy when you create an alert rule, the dispatch rule _aliyun_arms_alert_rule_id == {{Alert rule ID}} is added to the specified notification policy.
Why do I receive alert notifications after I choose not to specify a notification policy for an alert rule?
Alerts are sent to the Alert Management sub-service regardless of whether you specify a notification policy for an alert rule. If an alert matches the dispatch rules of an existing notification policy, you receive an alert notification based on that policy.
Do notification policies have the same priority?
Yes. If an alert meets the dispatch rules of multiple notification policies, multiple notifications are sent.
What is the logical relation between dispatch rules?
The logical relation between different dispatch rules of the same notification policy is OR. A notification is sent if one rule is matched. The logical relation between different conditions of a dispatch rule is AND. A rule is matched only if all conditions are met.
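The OR/AND logic described above can be sketched in Python as follows. This is an illustration only, not the Alert Management implementation: the data shapes, the operator names, and the label names alertname and clustername are assumptions made for the example.

```python
import re
from typing import Callable, Dict, List, Tuple

# Illustrative shapes: a condition is (label, operator, value),
# a rule is a list of conditions, and a policy is a list of rules.
Condition = Tuple[str, str, str]
Rule = List[Condition]

OPERATORS: Dict[str, Callable[[str, str], bool]] = {
    "==": lambda actual, expected: actual == expected,
    # Assumption: a regex condition must match the whole label value.
    "regex": lambda actual, pattern: re.fullmatch(pattern, actual) is not None,
}


def rule_matches(rule: Rule, labels: Dict[str, str]) -> bool:
    """AND: a rule matches only if every condition is met."""
    return all(
        label in labels and OPERATORS[op](labels[label], value)
        for label, op, value in rule
    )


def policy_matches(rules: List[Rule], labels: Dict[str, str]) -> bool:
    """OR: a policy matches if at least one of its rules matches."""
    return any(rule_matches(rule, labels) for rule in rules)


# Two rules modeled on the infrastructure O&M scenario.
infra_policy: List[Rule] = [
    [("alertname", "==", "CPU utilization of nodes higher than 80%"),
     ("clustername", "==", "Production cluster")],
    [("alertname", "==", "ApiServer Failures"),
     ("clustername", "==", "Production cluster")],
]

alert = {"alertname": "ApiServer Failures", "clustername": "Production cluster"}
print(policy_matches(infra_policy, alert))
```

In this sketch the alert satisfies both conditions of the second rule, so the policy matches even though the first rule does not. The same alert from a different cluster would fail every rule, and no notification would be sent for this policy.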
Do I need to specify a notification policy when I create an alert rule?
When you create an alert rule, you can specify a notification policy to dispatch alerts. For example, you can specify a notification policy to send Alert A to Contact B. If you want to sort, mute, group, or process alerts, we recommend that you do not specify a notification policy when you create an alert rule. After you create an alert rule, you can go to the ARMS console to create a custom notification policy based on your business requirements. For more information, see Create and manage a notification policy.
Why are false alerts generated?
The following alerts are false alerts:
The CPU utilization of nodes is higher than 8,000%.
The status of pods is abnormal.
The pods time out during startup.
These alerts are caused by invalid configurations in the alert rule template provided by the old version of the alerting feature. The template has been updated in the new version. If you created alert rules by using the template provided by the old version, you must manually migrate these rules to the new template.
You can perform the following steps to migrate your existing alert rules to the new template:
Delete the alert rules that are created by using the template of the old version.
Create alert rules by using the new template.
The operations to delete and create alert rules vary with the monitoring service:
For more information about how to manage an alert rule in Application Monitoring, see Create and manage an alert rule in Application Monitoring.
For more information about how to manage an alert rule in Browser Monitoring, see Create and manage a Browser Monitoring alert rule.
For more information about how to manage an alert rule in Managed Service for Prometheus, see Create an alert rule.
What is the relationship between Alert Management and Alertmanager? Can I send alerts from Managed Service for Prometheus to a self-managed Alertmanager instance?
In open source Prometheus, you can send generated alerts to Alertmanager, and you need to manually configure Alertmanager to process alerts, such as dispatching alerts and sending alert notifications. When Managed Service for Prometheus is used, the Alert Management sub-service is equivalent to a multi-tenant Alertmanager instance that is hosted by Alibaba Cloud. Alerts are automatically sent to Alert Management for processing. The Alert Management sub-service supports the main features of the open source Alertmanager.
You cannot configure Managed Service for Prometheus to send alerts to a self-managed Alertmanager instance. Alert Management allows you to use a webhook to send alert notifications in the format used by Alertmanager. For more information, see Format of alert notifications sent by using webhooks.
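As a rough illustration of consuming such a webhook, the following Python sketch parses a payload shaped like the open source Alertmanager webhook body (a top-level alerts list whose items carry status and labels). The exact fields that Alert Management sends are an assumption here and should be confirmed against the referenced format topic. The sample payload is made up for the example.

```python
import json
from typing import List


def summarize_alerts(body: str) -> List[str]:
    """Return one summary line per alert in an Alertmanager-style payload.

    Assumption: the body follows the open source Alertmanager webhook
    layout. Missing fields fall back to placeholders instead of failing.
    """
    payload = json.loads(body)
    lines = []
    for alert in payload.get("alerts", []):
        labels = alert.get("labels", {})
        lines.append(
            f"[{alert.get('status', 'unknown')}] "
            f"{labels.get('alertname', '<no alertname>')} "
            f"severity={labels.get('severity', 'n/a')}"
        )
    return lines


# Hypothetical sample payload for illustration.
sample = json.dumps({
    "status": "firing",
    "commonLabels": {"clustername": "Production cluster"},
    "alerts": [
        {
            "status": "firing",
            "labels": {"alertname": "ApiServer Failures", "severity": "critical"},
            "annotations": {"message": "API server error rate is high"},
        }
    ],
})

for line in summarize_alerts(sample):
    print(line)
```

A receiver built this way can forward the summary lines to any downstream system, including tooling that already understands Alertmanager-style payloads.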
Why do some alert notifications contain the New event message that is not preconfigured?
Some alert notifications contain the New event message that is not preconfigured.
The Alert Management sub-service groups alert events by labels, and sends an alert notification for each event group. When a new event is generated and added to an existing event group, Alert Management sends a new alert notification with the New event message.
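The grouping behavior can be pictured with the following Python sketch. It is a simplified model, not the actual Alert Management logic: the grouping labels alertname and clustername, the pod label, and the notification wording are all hypothetical, and timing controls such as wait intervals are ignored.

```python
from typing import Dict, List, Tuple

# Hypothetical labels used to form event groups.
GROUP_BY = ("alertname", "clustername")


def group_key(labels: Dict[str, str]) -> Tuple[str, ...]:
    """Build the group key from the grouping labels of an event."""
    return tuple(labels.get(name, "") for name in GROUP_BY)


def process(events: List[Dict[str, str]]) -> List[str]:
    """Return the notifications produced while events arrive in order."""
    groups: Dict[Tuple[str, ...], List[Dict[str, str]]] = {}
    notifications: List[str] = []
    for labels in events:
        key = group_key(labels)
        if key in groups:
            # The event joins an existing group, so a follow-up
            # notification marked as a new event is sent.
            groups[key].append(labels)
            notifications.append(
                f"New event: {key[0]} now has {len(groups[key])} events"
            )
        else:
            # First event with this key: a new group is created.
            groups[key] = [labels]
            notifications.append(f"Alert: {key[0]}")
    return notifications


events = [
    {"alertname": "Pod abnormal", "clustername": "Production cluster", "pod": "pay-1"},
    {"alertname": "Pod abnormal", "clustername": "Production cluster", "pod": "pay-2"},
]
for message in process(events):
    print(message)
```

In this model the second event shares its grouping labels with the first, so it joins the existing group and triggers a notification marked as a new event rather than starting a separate alert.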
How do I modify the content of a DingTalk alert card?
The alert card consists of two parts. The alert content is configured by using a notification template in the specified notification policy (Icon 1), and other information is configured by using a chatbot.
Configure a notification template in a notification policy
Log on to the ARMS console.
In the left-side navigation pane, choose . On the Notification Policy page, find the notification policy and click Edit in the Actions column.
In the panel that appears, click the Notification Objects tab, and then modify the Notification Content parameter on the DingTalk/Lark/WeCom tab.
Note: By default, the Go template syntax is used to render the notification template. For more information about the syntax, see Configure a notification template and a webhook template.
Configure other information in the alert card
In the left-side navigation pane, choose .
Click the DingTalk/Lark/WeCom tab, find the notification policy and then click Edit in the Actions column.
In the panel that appears, you can configure the alert card style based on your business requirements.