If the resource usage of IoT Platform reaches the threshold that is specified in an alert rule, an alert is triggered, and Alibaba Cloud sends an alert notification to the specified alert group.
Notifications for threshold-triggered alerts
If an alert is triggered based on a threshold-triggered alert rule, an alert notification email whose content is similar to the following information is sent to the specified alert contacts.
Fields of an alert notification email
Field | Description |
IoT Platform instance | The information about the instance for which the alert is triggered. The information contains the ProductKey (productKey), instance ID (instanceId), and region ID (regionId). |
Metric | The code of the metric that you selected when you configured the Rule Description parameter. In this example, MessageCountForwardedThroughRuleEngine_MNS indicates the number of messages that the rules engine forwards to Message Service (MNS). If the number of messages exceeds the specified threshold value within the specified period of time, an alert is triggered. For more information about metric codes, see the following table: Metric codes. |
Alert time | The time when the alert was triggered. |
Count | The total number of messages, the number of forwarded messages, or the number of connected devices that are counted for the specified metric. |
Duration | The period of time for which the alert exists. |
Rule details | The details of the alert rule that you configured in the CloudMonitor console. |
Metric codes
IoT Platform: basic metrics
Code | Description |
MessageCountForwardedThroughRuleEngine_DATAHUB | The number of messages that are forwarded by the rules engine. The value equals the number of times that the rules engine forwards data to DataHub. |
MessageCountForwardedThroughRuleEngine_FC | The number of messages that are forwarded by the rules engine. The value equals the number of times that the rules engine forwards data to Function Compute. |
MessageCountForwardedThroughRuleEngine_KAFKA | The number of messages that are forwarded by the rules engine. The value equals the number of times that the rules engine forwards data to ApsaraMQ for Kafka. |
MessageCountForwardedThroughRuleEngine_LINDORM | The number of messages that are forwarded by the rules engine. The value equals the number of times that the rules engine forwards data to Lindorm. |
MessageCountForwardedThroughRuleEngine_MNS | The number of messages that are forwarded by the rules engine. The value equals the number of times that the rules engine forwards data to Message Service (MNS). |
MessageCountForwardedThroughRuleEngine_MQ | The number of messages that are forwarded by the rules engine. The value equals the number of times that the rules engine forwards data to ApsaraMQ for RocketMQ. |
MessageCountForwardedThroughRuleEngine_OTS | The number of messages that are forwarded by the rules engine. The value equals the number of times that the rules engine forwards data to Tablestore (OTS). |
MessageCountForwardedThroughRuleEngine_RDS | The number of messages that are forwarded by the rules engine. The value equals the number of times that the rules engine forwards data to ApsaraDB RDS. |
MessageCountForwardedThroughRuleEngine_REPUBLISH | The number of messages that are forwarded by the rules engine. The value equals the number of times that the rules engine forwards data from the current topic to other topics. |
MessageCountForwardedThroughRuleEngine_TSDB | The number of messages that are forwarded by the rules engine. The value equals the number of times that the rules engine forwards data to Lindorm. |
MessageCountSentFromIoT_HTTP_2 | The number of messages that are sent from IoT Platform over HTTP/2. |
MessageCountSentFromIoT_MQTT | The number of messages that are sent from IoT Platform over Message Queuing Telemetry Transport (MQTT). |
MessageCountSentToIoT_CoAP | The number of messages that are sent to IoT Platform over Constrained Application Protocol (CoAP). |
MessageCountSentToIoT_HTTP | The number of messages that are sent to IoT Platform over HTTP. |
MessageCountSentToIoT_HTTP/2 | The number of messages that are sent to IoT Platform over HTTP/2. |
MessageCountSentToIoT_MQTT | The number of messages that are sent to IoT Platform over MQTT. |
OnlineDevicesCount_MQTT | The number of devices that are connected to IoT Platform over MQTT in real time. |
OnlineDevicesCount_HTTP/2 | The number of online devices that are connected to IoT Platform over HTTP/2 in real time. |
OnlineDeviceInstanceWatermark | The percentage of online devices that are connected to an IoT Platform instance. The value is calculated by using the following formula: (Number of online devices that are connected to the instance / Number of online devices that can be connected to the instance) × 100%. |
DeviceEventReportError | The number of event submission failures. |
DevicePropertyReportError | The number of property submission failures. |
DevicePropertySettingError | The number of property configuration failures. |
DeviceServiceCallError | The number of service call failures. |
DeviceCount_Product | The number of devices that are created for the product. |
MessageCountPerMinute | The number of upstream and downstream messages that the instance sends and receives per minute. |
RuleEngineTransmitCountPerMinute | The number of messages that the rules engine forwards for the instance per minute. |
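The rules engine metric codes in the preceding table share the prefix MessageCountForwardedThroughRuleEngine_ followed by a suffix that identifies the destination. As a minimal sketch, the following Python helper maps such a code to the destination given in the table, for example to make an alert email easier to read. The names forwarding_destination, RULE_ENGINE_PREFIX, and DESTINATIONS are illustrative assumptions and are not part of any IoT Platform or CloudMonitor SDK.

```python
from typing import Optional

# Prefix shared by the rules engine forwarding metrics in the table.
RULE_ENGINE_PREFIX = "MessageCountForwardedThroughRuleEngine_"

# Suffix-to-destination mapping, copied from the metric descriptions.
DESTINATIONS = {
    "DATAHUB": "DataHub",
    "FC": "Function Compute",
    "KAFKA": "ApsaraMQ for Kafka",
    "LINDORM": "Lindorm",
    "MNS": "Message Service (MNS)",
    "MQ": "ApsaraMQ for RocketMQ",
    "OTS": "Tablestore (OTS)",
    "RDS": "ApsaraDB RDS",
    "REPUBLISH": "other topics",
    "TSDB": "Lindorm",
}


def forwarding_destination(metric_code: str) -> Optional[str]:
    """Return the destination for a rules engine metric code, or None."""
    if not metric_code.startswith(RULE_ENGINE_PREFIX):
        return None  # Not a rules engine forwarding metric.
    suffix = metric_code[len(RULE_ENGINE_PREFIX):]
    return DESTINATIONS.get(suffix)


print(forwarding_destination("MessageCountForwardedThroughRuleEngine_MNS"))
```

Codes outside this family, such as OnlineDevicesCount_MQTT, return None in this sketch.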
IoT Platform: Enterprise Edition instances
Code | Description |
DeviceNum_instance | The percentage of devices that are created in IoT Platform. The value is calculated by using the following formula: (Number of devices that are created in IoT Platform / Number of devices that can be created for the instance) × 100%. |
LinkAnalyticsCU | The data processing unit. |
LinkAnalyticsStorage | The offline storage space. |
MessageWatermarkTps_instance | The transactions per second (TPS) of upstream and downstream messages. The value of the metric is a percentage. The value is calculated by using the following formula: (Current TPS of upstream and downstream messages / TPS of upstream and downstream messages that you specified when you purchased the Enterprise Edition instance) × 100%. |
OtaCommercialUpgradeCount | The number of valid device updates. |
RuleEngineWatermarkTps_instance | The message forwarding TPS. The value of the metric is a percentage. The value is calculated by using the following formula: (Current TPS at which the rules engine forwards messages / TPS at which the rules engine can forward messages for the instance) × 100%. |
HotStorageReadIops | The read IOPS of the time series storage. |
HotStorageWriteIops | The write IOPS of the time series storage. |
HotStorageCapacity | The time series storage space. |
message_elastic_tps_instance | The number of elastic upstream and downstream messages. |
message_elastic_transmit_instance | The number of elastic messages that are forwarded. |
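The percentage metrics in the preceding table (DeviceNum_instance, MessageWatermarkTps_instance, and RuleEngineWatermarkTps_instance) all follow the same pattern: a current value divided by the instance specification, multiplied by 100%. The following Python sketch shows that arithmetic. The function name watermark_percent and the sample numbers are hypothetical and are used only for illustration.

```python
def watermark_percent(current: float, capacity: float) -> float:
    """Return current / capacity expressed as a percentage."""
    if capacity <= 0:
        raise ValueError("capacity must be a positive number")
    return current / capacity * 100.0


# Hypothetical values: 750 messages per second on an instance
# that was purchased with a specification of 1,000 TPS.
print(watermark_percent(750, 1000))  # 75.0
```

With these sample values, an alert rule whose threshold is 80% would not be triggered yet.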
IoT Platform: AMQP consumer groups
Code | Description |
AMQP_Msg_Accumulate | The number of accumulated messages. |
AMQP_Msg_Consume_rate | The message consumption rate. |
Notifications for event-triggered alerts
If an event-triggered alert rule is triggered, an alert notification email whose content is similar to the following information is sent to the specified alert contacts.
Fields of an alert notification email
Field | Description |
Event Name | The code of the event for which the alert is triggered. In this example, the code Device_Connect_QPM_Limit indicates that the number of connection requests that can be sent per minute by a device has reached the upper limit. For more information about event codes, see the following table: Event codes. |
Resource | The resource for which the alert is triggered. |
Level | All events trigger WARN-level alerts. |
Time | The time when the event occurred. |
Event Status | All events are in the Fail state. This state indicates that subsequent requests fail because the number of connection requests that are sent per minute or the number of messages that are sent per second has reached the upper limit. |
Details | The information about the resource that triggers the alert. The information is in the JSON format. The information contains the region ID (regionId), instance ID (instanceId), ProductKey (productKey), and DeviceName (deviceName). The productKey and deviceName parameters are included in the notification only if the number of connection requests that can be sent per minute, the number of messages that can be sent per second, or the number of messages that can be received per second by a device reaches the upper limit. |
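Because the Details field is in the JSON format, it can be parsed programmatically. The following Python sketch assumes that the field is a flat JSON object whose keys are the parameter names listed in the table (regionId, instanceId, productKey, and deviceName). The exact layout of a real notification may differ. The names AlertDetails and parse_details and all sample values are hypothetical.

```python
import json
from typing import NamedTuple, Optional


class AlertDetails(NamedTuple):
    region_id: str
    instance_id: str
    product_key: Optional[str]  # Present only for device-level limits.
    device_name: Optional[str]  # Present only for device-level limits.


def parse_details(raw: str) -> AlertDetails:
    """Parse the Details JSON string, assuming a flat object."""
    data = json.loads(raw)
    return AlertDetails(
        region_id=data["regionId"],
        instance_id=data["instanceId"],
        # These two keys are optional, so use .get() instead of indexing.
        product_key=data.get("productKey"),
        device_name=data.get("deviceName"),
    )


# Made-up sample payload for illustration only.
sample = (
    '{"regionId": "cn-shanghai", "instanceId": "iot-example", '
    '"productKey": "a1ExamplePk", "deviceName": "device01"}'
)
print(parse_details(sample))
```

For an account-level event, productKey and deviceName are absent, and the sketch returns None for both fields.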
Event codes
Code | Description |
Device_Connect_QPM_Limit | The number of connection requests that can be sent per minute by a device reaches the upper limit. |
Device_Uplink_QPS_Limit | The number of messages that can be sent per second by a device reaches the upper limit. |
Device_Downlink_QPS_Limit | The number of messages that can be received per second by a device reaches the upper limit. |
Account_Connect_QPS_Limit | The number of connection requests that the current account can send to IoT Platform per second reaches the upper limit. |
Account_Uplink_QPS_Limit | The number of requests that the current account sends per second reaches the upper limit. |
Account_Downlink_QPS_Limit | The number of requests that the current account sends to devices per second reaches the upper limit. |
Account_RuleEngine_DataForward_QPS_Limit | The number of requests that the current account sends to the rules engine per second reaches the upper limit. |
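The event codes in the preceding table fall into two groups: limits on a single device and limits on the current account. According to the description of the Details field, productKey and deviceName accompany only the device limits. The following Python sketch groups the codes accordingly. The function name event_scope and the labels "device" and "account" are illustrative assumptions, not official terms.

```python
# Event codes copied from the table, grouped by what the limit applies to.
DEVICE_LEVEL_EVENTS = {
    "Device_Connect_QPM_Limit",
    "Device_Uplink_QPS_Limit",
    "Device_Downlink_QPS_Limit",
}
ACCOUNT_LEVEL_EVENTS = {
    "Account_Connect_QPS_Limit",
    "Account_Uplink_QPS_Limit",
    "Account_Downlink_QPS_Limit",
    "Account_RuleEngine_DataForward_QPS_Limit",
}


def event_scope(event_code: str) -> str:
    """Return "device" or "account" for a known event code."""
    if event_code in DEVICE_LEVEL_EVENTS:
        return "device"
    if event_code in ACCOUNT_LEVEL_EVENTS:
        return "account"
    raise ValueError(f"unknown event code: {event_code}")


print(event_scope("Device_Connect_QPM_Limit"))  # device
```

A handler could use the returned scope to decide whether to look for productKey and deviceName in the Details field.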