By monitoring the capacity and performance metrics of a Cloud Parallel File Storage (CPFS) for Lingjun file system, you can view its capacity usage, read/write throughput, and read/write IOPS. You can also configure alert rules for important metrics so that you are notified of exceptions and can handle them promptly. This topic describes the metrics that CPFS for Lingjun supports and the alert rule settings that you can configure for them.
Background information
CloudMonitor is a service that monitors Internet applications and Alibaba Cloud resources. You can use CloudMonitor to monitor the metrics of Alibaba Cloud resources and configure alert rules for specific metrics. This way, you can monitor the usage of your Alibaba Cloud resources and the status of your applications. You can also handle alerts at the earliest opportunity to ensure the availability of your applications. For more information, see What is CloudMonitor?
Retention policy of monitoring data
Monitoring data is retained for 90 days. After the retention period expires, the monitoring data is automatically cleared. The retention period starts when data is generated.
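The retention rule above is simple date arithmetic. The following minimal sketch shows it, assuming that a data point is cleared once exactly 90 days have passed since it was generated. The boundary behavior and the function names `expires_at` and `is_retained` are illustrative assumptions, not part of CloudMonitor.

```python
from datetime import datetime, timedelta, timezone

# Retention period stated in this topic.
RETENTION = timedelta(days=90)


def expires_at(generated_at: datetime) -> datetime:
    """Return the moment a data point leaves the retention window."""
    return generated_at + RETENTION


def is_retained(generated_at: datetime, now: datetime) -> bool:
    """Assumption: data is treated as cleared once the 90 days have fully elapsed."""
    return now < expires_at(generated_at)


generated = datetime(2024, 1, 1, tzinfo=timezone.utc)
print(expires_at(generated).isoformat())
print(is_retained(generated, datetime(2024, 2, 1, tzinfo=timezone.utc)))
```

If you need data older than 90 days, export it before the date that `expires_at` returns for the oldest data point you care about.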
Metrics
CloudMonitor monitors the capacity and performance of CPFS for Lingjun file systems.
Capacity monitoring
Type | Metric | Metric name | Unit | Description |
File system | CPFS Capacity | Total Storage Space | Bytes | The total storage space of a file system within a specific period of time. |
File system | CPFS Capacity Used | Data Volume | Bytes | The amount of data that is actually used by a file system within a specific period of time. |
File system | CPFS Inode Limit | Maximum Number of Files | Count | The maximum number of files that can be used by a file system within a specific period of time. |
File system | CPFS Inode Alloc | Number of Allocated Files | Count | The number of files that are allocated by a file system within a specific period of time. |
File system | CPFS Inode Used | Number of Used Files | Count | The number of files that are used by a file system within a specific period of time. |
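The capacity metrics are reported as absolute values, so a usage ratio has to be derived from them. The following minimal sketch computes storage and inode usage percentages from one sample of the metrics in the preceding table. The dictionary keys reuse the metric labels from the table, the sample values are made up for illustration, and `usage_percent` is a hypothetical helper rather than a CloudMonitor function.

```python
def usage_percent(used: float, limit: float) -> float:
    """Return used/limit as a percentage."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    return 100.0 * used / limit


# Hypothetical sample: 25 TiB used of 100 TiB, 150,000 of 1,000,000 files used.
sample = {
    "CPFS Capacity": 100 * 1024**4,      # Bytes
    "CPFS Capacity Used": 25 * 1024**4,  # Bytes
    "CPFS Inode Limit": 1_000_000,       # Count
    "CPFS Inode Used": 150_000,          # Count
}

capacity_pct = usage_percent(sample["CPFS Capacity Used"], sample["CPFS Capacity"])
inode_pct = usage_percent(sample["CPFS Inode Used"], sample["CPFS Inode Limit"])

print(f"capacity used: {capacity_pct:.1f}%")
print(f"inodes used: {inode_pct:.1f}%")
```

Tracking both ratios is useful because a file system that stores many small files can run out of inodes long before it runs out of storage space.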
Performance monitoring
Type | Metric | Metric name | Unit | Description |
File system | ThruputRead | Read throughput | Bytes/s | The average read throughput per second of a file system within a specific period of time. |
File system | ThruputWrite | Write throughput | Bytes/s | The average write throughput per second of a file system within a specific period of time. |
File system | IopsRead | Read IOPS | Count/s | The average read IOPS of a file system over a specific period of time. |
File system | IopsWrite | Write IOPS | Count/s | The average write IOPS of a file system over a specific period of time. |
Dataflow | ThroughputImport | Import throughput | Bytes/s | The average import throughput per second of a dataflow within a specific period of time. |
Dataflow | ThroughputExport | Export throughput | Bytes/s | The average export throughput per second of a dataflow within a specific period of time. |
Dataflow | QPSImportMeta | Metadata QPS for Data Flow Import | Count/s | The average number of requests that are sent by a data import task for metadata per second within a specific period of time. |
Dataflow | QPSExportMeta | Metadata QPS for Data Flow Export | Count/s | The average number of requests that are sent by a data export task for metadata per second within a specific period of time. |
Dataflow | IOPSImport | Import IOPS | Count/s | The average IOPS of a data import task over a specific period of time. |
Dataflow | IOPSEXport | Export IOPS | Count/s | The average IOPS of a data export task over a specific period of time. |
Dataflow | LatencyImport | Import latency | Microseconds | The average latency of a data import task over a specific period of time. |
Dataflow | LatencyExport | Export latency | Microseconds | The average latency of a data export task over a specific period of time. |
Alert rules
You can configure alert rules for various metrics in the CloudMonitor console. If the metric value of a resource meets the alert condition, CloudMonitor automatically sends notifications to the specified recipients. The following table describes the alert severity, notification method, and alert condition that you can configure for alert rules.
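Alert severity | Notification method | Alert condition |
Critical | Phone call, text message, email, and DingTalk chatbot | The average value of the metric reaches the specified threshold for N consecutive cycles. You can configure the value of N based on the alert severity. |
Warning | Text message, email, and DingTalk chatbot | The average value of the metric reaches the specified threshold for N consecutive cycles. You can configure the value of N based on the alert severity. |
Info | Email and DingTalk chatbot | The average value of the metric reaches the specified threshold for N consecutive cycles. You can configure the value of N based on the alert severity. |
Note: The alert condition varies based on the type of the metric that is used.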
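The alert condition described above can be sketched as a check over the most recent cycle averages. This is a minimal illustration of the "threshold reached for consecutive cycles" logic, not CloudMonitor's implementation. It assumes that "reaches" means greater than or equal to the threshold, and `should_alert` is a hypothetical function name.

```python
from typing import Sequence


def should_alert(cycle_averages: Sequence[float], threshold: float, n: int) -> bool:
    """Return True if the last n cycle averages all reach the threshold.

    Assumption: "reaches" is modeled as >=. A real alert rule may use a
    different comparison, depending on the metric type.
    """
    if n <= 0:
        raise ValueError("n must be a positive integer")
    if len(cycle_averages) < n:
        return False  # not enough cycles observed yet
    return all(value >= threshold for value in cycle_averages[-n:])


# Hypothetical capacity usage percentages, one average per cycle.
history = [72.0, 81.5, 83.0, 85.2]
print(should_alert(history, threshold=80.0, n=3))
print(should_alert(history, threshold=80.0, n=4))
```

Requiring several consecutive cycles filters out short spikes: with the sample history, three cycles at or above 80 trigger the alert, but a rule with four cycles does not, because the first average is below the threshold.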