Data monitoring

Updated at: 2024-12-24 07:47

You can monitor the capacity and performance metrics of a Cloud Parallel File Storage (CPFS) for Lingjun file system to view its capacity usage, read/write throughput, and read/write IOPS. You can also configure alert rules for important metrics so that you are notified of exceptions and can handle them promptly. This topic describes the metrics that CPFS for Lingjun supports and the alert rule settings that you can configure for them.

Background

Cloud Monitor is a service that monitors Internet applications and Alibaba Cloud resources. You can use Cloud Monitor to monitor the metrics of Alibaba Cloud resources and configure alert rules for specific metrics. This way, you can monitor the usage of your Alibaba Cloud resources and the status of your applications. You can also handle alerts at the earliest opportunity to ensure the availability of your applications. For more information, see What is Cloud Monitor?

Retention policy of monitoring data

Monitoring data is retained for 90 days. After the retention period expires, the monitoring data is automatically cleared. The retention period starts when data is generated.
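
The retention rule can be expressed as simple date arithmetic. The following is a minimal sketch: the function names are hypothetical, and treating a data point as expired at exactly 90 days is an assumption, because the boundary behavior is not stated here.

```python
from datetime import datetime, timedelta, timezone

# Retention period stated in this topic.
RETENTION = timedelta(days=90)


def expires_at(generated_at: datetime) -> datetime:
    """Return the time at which a data point leaves the retention window.

    The window starts when the data point is generated.
    """
    return generated_at + RETENTION


def is_retained(generated_at: datetime, now: datetime) -> bool:
    """Return True while the data point is still inside the window.

    Assumption: the data point counts as expired at exactly 90 days.
    """
    return now < expires_at(generated_at)


generated = datetime(2024, 9, 25, tzinfo=timezone.utc)
print(expires_at(generated).date())
```

A data point generated on 2024-09-25 would therefore be eligible for clearing from 2024-12-24 onward.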

Metrics

Cloud Monitor monitors the capacity and performance of CPFS for Lingjun file systems. Cloud Monitor also monitors the performance of clients on a compute node.

Capacity monitoring

| Type | Metric | Metric name | Unit | Description |
|---|---|---|---|---|
| File system | CPFS Capacity | Total Storage Space | Bytes | The total storage space of a file system within a specific period of time. |
| File system | CPFS Capacity Used | Data Volume | Bytes | The amount of data that is actually used by a file system within a specific period of time. |
| File system | CPFS Inode Limit | Maximum Number of Files | Count | The maximum number of files that can be used by a file system within a specific period of time. |
| File system | CPFS Inode Alloc | Number of Allocated Files | Count | The number of files that are allocated by a file system within a specific period of time. |
| File system | CPFS Inode Used | Number of Used Files | Count | The number of files that are used by a file system within a specific period of time. |
| Fileset | BMCPFSFsetCapacityLimit | Allocated Capacity | Bytes | The maximum storage space that can be used by a fileset to write data. If the size of written data reaches the upper limit, the fileset cannot write data. |
| Fileset | BMCPFSFsetCapacityUsed | Used Capacity | Bytes | The storage space that is actually used by a fileset. |
| Fileset | BMCPFSFsetInodeLimit | Number of Files Allocated by Fileset | Count | The maximum number of files that can be used by a fileset to write data. If the number of used files reaches the upper limit, the fileset cannot write data. |
| Fileset | BMCPFSFsetInodeUsed | Number of Files Used by Fileset | Count | The number of files that are actually used by a fileset. |
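
The capacity metrics are most useful in pairs: a used value divided by its limit gives a usage ratio that shows how close a file system or fileset is to the point where writes fail. The following minimal sketch computes that ratio. The helper name and the sample values are illustrative assumptions, not data returned by Cloud Monitor.

```python
def usage_percent(used: float, limit: float) -> float:
    """Return used / limit as a percentage.

    Returns 0.0 when the limit is zero or negative, so that an unset
    limit does not cause a division error.
    """
    if limit <= 0:
        return 0.0
    return 100.0 * used / limit


# Illustrative sample values keyed by the metric names in the table above.
sample = {
    "CPFS Capacity": 100 * 1024**4,       # 100 TiB, in bytes
    "CPFS Capacity Used": 25 * 1024**4,   # 25 TiB, in bytes
    "CPFS Inode Limit": 1_000_000,
    "CPFS Inode Used": 400_000,
}

capacity_pct = usage_percent(sample["CPFS Capacity Used"], sample["CPFS Capacity"])
inode_pct = usage_percent(sample["CPFS Inode Used"], sample["CPFS Inode Limit"])
print(f"capacity: {capacity_pct:.1f}%  inodes: {inode_pct:.1f}%")
```

The same helper applies to filesets, for example with BMCPFSFsetCapacityUsed and BMCPFSFsetCapacityLimit, or BMCPFSFsetInodeUsed and BMCPFSFsetInodeLimit.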

Performance monitoring

| Type | Metric | Metric name | Unit | Description |
|---|---|---|---|---|
| File system | ThruputRead | Read throughput | Bytes/s | The average read throughput per second of a file system within a specific period of time. |
| File system | ThruputWrite | Write throughput | Bytes/s | The average write throughput per second of a file system within a specific period of time. |
| File system | IopsRead | Read IOPS | Count/s | The average read IOPS of a file system over a specific period of time. |
| File system | IopsWrite | Write IOPS | Count/s | The average write IOPS of a file system over a specific period of time. |
| Dataflow | ThroughputImport | Import throughput | Bytes/s | The average import throughput per second of a dataflow within a specific period of time. |
| Dataflow | ThroughputExport | Export throughput | Bytes/s | The average export throughput per second of a dataflow within a specific period of time. |
| Dataflow | QPSImportMeta | Metadata QPS for Data Flow Import | Count/s | The average number of metadata requests that a data import task sends per second within a specific period of time. |
| Dataflow | QPSExportMeta | Metadata QPS for Data Flow Export | Count/s | The average number of metadata requests that a data export task sends per second within a specific period of time. |
| Dataflow | IOPSImport | Import IOPS | Count/s | The average IOPS of a data import task over a specific period of time. |
| Dataflow | IOPSEXport | Export IOPS | Count/s | The average IOPS of a data export task over a specific period of time. |
| Dataflow | LatencyImport | Import latency | Microseconds | The average latency of a data import task over a specific period of time. |
| Dataflow | LatencyExport | Export latency | Microseconds | The average latency of a data export task over a specific period of time. |
| Client | ClientReadIops | Client Read IOPS | Count/s | The average read IOPS of a client within a specific period of time. |
| Client | ClientWriteIops | Client Write IOPS | Count/s | The average write IOPS of a client within a specific period of time. |
| Client | ClientReadLatency | Client Average Read Latency | us (microseconds) | The average latency of read operations of a client within a specific period of time. |
| Client | ClientWriteLatency | Client Average Write Latency | us (microseconds) | The average latency of write operations of a client within a specific period of time. |
| Client | ClientReadThroughput | Client Read Throughput | Bytes/s | The average read throughput per second of a client within a specific period of time. |
| Client | ClientWriteThroughput | Client Write Throughput | Bytes/s | The average write throughput per second of a client within a specific period of time. |
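
The performance metrics are reported in base units (Bytes/s, Count/s, and microseconds), which are often easier to read after conversion. The following minimal sketch shows the conversions and derives an average I/O size from a throughput and IOPS pair. The function names are hypothetical, and the derived average I/O size is a convenience calculation rather than a metric that Cloud Monitor reports.

```python
def bytes_per_s_to_mib_per_s(value: float) -> float:
    """Convert a throughput value from Bytes/s to MiB/s."""
    return value / (1024 * 1024)


def us_to_ms(value: float) -> float:
    """Convert a latency value from microseconds to milliseconds."""
    return value / 1000.0


def avg_io_size_bytes(throughput_bytes_per_s: float, iops: float) -> float:
    """Return the average bytes per I/O operation (throughput / IOPS).

    Returns 0.0 when there are no operations in the period.
    """
    if iops <= 0:
        return 0.0
    return throughput_bytes_per_s / iops


# Example: values as they might be read for ThruputRead and IopsRead.
print(bytes_per_s_to_mib_per_s(1_048_576))   # 1.0
print(us_to_ms(1500))                        # 1.5
print(avg_io_size_bytes(4_096_000, 1000))    # 4096.0
```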

Note
  • Elastic File Client (EFC) is a client developed by the CPFS team. EFC is installed on a compute node to connect to the CPFS for Lingjun file system.

  • You can log on to the Cloud Monitor console or call the Cloud Monitor API to view the performance monitoring data of the client. For more information, see the "Use the Cloud Monitor console" or "Use the Cloud Monitor API" section of the "View the performance monitoring data of a CPFS file system" topic.

  • If you use the CPFS for Lingjun file system in the EFC or single-tenant Lingjun resources, hostname is the host name of a node.

  • If you use the CPFS for Lingjun file system in general computing resources or Lingjun resources, hostname is the pod ID of a task.
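
When you query client metrics through the Cloud Monitor API, the request identifies a metric, a time range, and the dimensions to filter by, such as hostname. The following sketch only assembles the request parameters and makes no network call. It assumes the parameter names of the Cloud Monitor DescribeMetricList operation (Namespace, MetricName, Dimensions, Period, StartTime, EndTime). The function name, the namespace value, and the 60-second default period are placeholders that you must confirm in the Cloud Monitor API reference.

```python
import json


def build_metric_query(namespace: str, metric_name: str, dimensions: dict,
                       start_ms: int, end_ms: int, period_s: int = 60) -> dict:
    """Assemble request parameters for a metric query.

    Assumption: parameter names follow the DescribeMetricList operation.
    Times are UNIX timestamps in milliseconds. Dimensions is sent as a
    JSON-encoded list of key-value objects.
    """
    if end_ms <= start_ms:
        raise ValueError("end_ms must be later than start_ms")
    return {
        "Namespace": namespace,
        "MetricName": metric_name,
        "Dimensions": json.dumps([dimensions]),
        "Period": str(period_s),
        "StartTime": str(start_ms),
        "EndTime": str(end_ms),
    }


query = build_metric_query(
    namespace="<namespace>",             # placeholder, not a real value
    metric_name="ClientReadIops",        # metric from the table above
    dimensions={"hostname": "node-1"},   # node host name or pod ID
    start_ms=1_735_000_000_000,
    end_ms=1_735_003_600_000,
)
print(query["Dimensions"])
```

The hostname value is the host name of a node or the pod ID of a task, depending on the resource type described in the preceding notes.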

Alert rules

You can configure alert rules for various metrics in the Cloud Monitor console. If the metric value of a resource meets the alert condition, Cloud Monitor automatically sends notifications to the specified recipients. The following table describes the alert severity, notification method, and alert condition that you can configure for alert rules.

| Alert severity | Notification method | Alert condition |
|---|---|---|
| Critical | Phone call, text message, email, and DingTalk chatbot | The average value of the metric reaches the specified threshold for N consecutive cycles. You can configure the value of N based on the alert severity. |
| Warning | Text message, email, and DingTalk chatbot | Same condition as Critical. |
| Info | Email and DingTalk chatbot | Same condition as Critical. |

Note

The alert condition varies based on the type of the metric that is used.
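
The alert condition can be read as a check over the most recent N cycles. The following minimal sketch illustrates that logic and the notification methods for each severity. It is not how Cloud Monitor is implemented. The names are hypothetical, and interpreting "reaches the threshold" as greater than or equal to is an assumption, because the comparison depends on the metric type.

```python
from typing import Sequence

# Notification methods for each alert severity, as listed in this topic.
NOTIFICATION_METHODS = {
    "Critical": ["Phone call", "Text message", "Email", "DingTalk chatbot"],
    "Warning": ["Text message", "Email", "DingTalk chatbot"],
    "Info": ["Email", "DingTalk chatbot"],
}


def should_alert(cycle_averages: Sequence[float], threshold: float, n: int) -> bool:
    """Return True if the last n cycle averages all reach the threshold.

    Assumption: "reaches" means greater than or equal to the threshold.
    Returns False when fewer than n cycles are available.
    """
    if n <= 0 or len(cycle_averages) < n:
        return False
    return all(value >= threshold for value in cycle_averages[-n:])


# Capacity usage (%) averaged per cycle, oldest first (illustrative values).
history = [70.0, 85.0, 90.0, 95.0]
if should_alert(history, threshold=80.0, n=3):
    print("notify via:", ", ".join(NOTIFICATION_METHODS["Critical"]))
```

With the sample history, the last three cycles are all at or above 80, so the condition is met. A single cycle below the threshold inside the window resets the streak.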
