ApsaraMQ for Kafka provides the real-time diagnostics feature to regularly check for risks on instances. This feature allows you to view issues that are detected, provides fix suggestions, and notifies relevant contacts of the detected risks.
How the feature works
Alert notifications
Alert notifications are sent only for risks that require immediate attention and adversely affect your instance.
If no alert contact is specified, alert notifications are sent to the owner of the Alibaba Cloud account to which the instance belongs.
If an alert contact is specified, alert notifications are sent to that contact. The contact receives a notification only if it is sent within the time range that you specified for receiving alert notifications. For more information, see Manage alert contacts.
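The routing rules above can be read as a small decision function. The following is only an illustrative sketch, not the service's implementation: the names `AlertContact`, `resolve_recipient`, and `account_owner` are hypothetical, and the sketch assumes that the receiving time range does not cross midnight and that a notification sent outside the range is simply not delivered.

```python
from dataclasses import dataclass
from datetime import time
from typing import Optional


@dataclass
class AlertContact:
    """Hypothetical model of an alert contact and its receiving time range."""
    name: str
    window_start: time
    window_end: time


def resolve_recipient(
    contact: Optional[AlertContact],
    sent_at: time,
    account_owner: str = "account-owner",
) -> Optional[str]:
    """Return who receives the notification, or None if nobody does."""
    # No alert contact specified: fall back to the Alibaba Cloud account owner.
    if contact is None:
        return account_owner
    # Contact specified: deliver only inside the configured time range
    # (assumption: the range does not wrap past midnight).
    if contact.window_start <= sent_at <= contact.window_end:
        return contact.name
    return None


ops = AlertContact("ops", time(9, 0), time(18, 0))
print(resolve_recipient(None, time(3, 0)))
print(resolve_recipient(ops, time(10, 30)))
print(resolve_recipient(ops, time(22, 0)))
```

In this sketch, a notification sent at 22:00 to a contact whose range is 09:00 to 18:00 is not received, which matches the time-range condition described above.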
Check items
If risks are detected on an instance, refer to the suggestions provided in the ApsaraMQ for Kafka console to fix the risks.
Resource type | Metric name |
CPU and memory | CPU Utilization (%) |
CPU and memory | Memory Usage (%) |
TCP connection | TCP Connections |
TCP connection | Public TCP Connections |
Disk | Disk Skew |
Disk | Disk Usage |
Disk | Disk Load Rate |
Topic | Message Production Duration |
Topic | Producer Message Disk Storage Duration |
Topic | Producer Message Queuing Duration |
Topic | Producer Message Throttling Duration |
Topic | Producer Message Format Conversion Duration |
Topic | Topic Quota |
Topic | Topic Format Conversion |
Topic | Synchronous Transmission |
Topic | Fragmented Transmission |
Topic | Single-partition Topic |
Topic | Topic Skew |
Topic | Production Traffic |
Topic | Partition Quota |
Topic | Partition Assignment Strategy |
Group | Consumer Message Receiving Duration |
Group | Consumer Message Queuing Duration |
Group | Consumer Message Disk Read Duration |
Group | Consumer Message Format Conversion Duration |
Group | Group Quota |
Group | Consumption Traffic |
Group | Subscriptions |
Group | Minor Version Upgrade for Servers |
Group | Consumer Offset Committing Frequency |
Group | Groups with Rebalancing Triggered |
Group | Consumer Proactively Leave Queue |
Group | Go Clients with Sarama Library |
Procedure
Log on to the ApsaraMQ for Kafka console. In the Resource Distribution section of the Overview page, select the region where the ApsaraMQ for Kafka instance that you want to manage resides.
On the Instances page, click the name of the instance that you want to manage.
On the Instance Details page, click the Instance Risks tab.
On the Instance Risks tab, view the risks that are detected on the instance.
Parameter | Description | Example |
Risk Type | The risk description. | Group with Long Consumption Time |
Metric Level | The risk level. Valid values: Repair Required, Important, and General. | Important |
Risk Status | The risk status, which you can use to check the health status of the instance. Valid values: To Be Fixed and Fixed. | To Be Fixed |
Time of Last Alert | The most recent time when the risk was detected. | 2022-03-31 |
Actions | The actions that can be performed on the risk: Details, Modify Alert Status, and Delete. | None |
The Actions column supports the following operations:
Details: View the risk details and fix suggestions. In the Actions column of the risk, click Details.
Modify Alert Status: After the risk is fixed, you can set the status of the risk to Fixed. If the risk is not fixed, you can ignore the risk for the next 30 days. In the Actions column of the risk, click Modify Alert Status.
Note: After a risk is fixed, the system does not send alert notifications for the same risk for seven days. After the seven-day period elapses, the system can send alert notifications for the same risk again.
Delete: After the risk is fixed and the status of the risk is changed to Fixed, you can delete the risk. In the Actions column of the risk, click Delete.
Note: After you set the risk status to Fixed, alerts can still be generated again, for example because dirty data is not cleared in real time. To avoid this, we recommend that you wait seven days before you delete the risk.
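The timing rules for Modify Alert Status can be sketched as a simple check. This is a hedged illustration under stated assumptions, not the service's actual logic: the function `may_alert` and both constants are hypothetical names, the sketch treats "ignore the risk for the next 30 days" as suppressing notifications for that period, and how the system behaves at the exact boundary of each period is not specified in this topic.

```python
from datetime import datetime, timedelta
from typing import Optional

# Periods taken from the text; the constant names are hypothetical.
FIXED_QUIET_PERIOD = timedelta(days=7)
IGNORE_PERIOD = timedelta(days=30)


def may_alert(
    now: datetime,
    fixed_at: Optional[datetime] = None,
    ignored_at: Optional[datetime] = None,
) -> bool:
    """Return True if a notification for the same risk could be sent at `now`."""
    # Risk marked Fixed: no notifications for the same risk for seven days.
    if fixed_at is not None and now - fixed_at < FIXED_QUIET_PERIOD:
        return False
    # Risk ignored: assumed to silence notifications for 30 days.
    if ignored_at is not None and now - ignored_at < IGNORE_PERIOD:
        return False
    return True


marked = datetime(2022, 3, 31)
print(may_alert(marked + timedelta(days=3), fixed_at=marked))
print(may_alert(marked + timedelta(days=8), fixed_at=marked))
```

The same seven-day figure underlies the recommendation to wait before deleting a risk: within that period, a risk that was marked Fixed can still resurface.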
References
For answers to frequently asked questions about instances, see FAQ.