This topic describes the metrics that are used in the Application Monitoring sub-service of Application Real-Time Monitoring (ARMS). You can use these metrics to create custom Grafana dashboards.
Business metrics
Common dimensions
Dimension name | Dimension key |
Service name | service |
Service PID | pid |
Server IP address | serverIp |
Interface | rpc |
Metrics
The following table describes the metrics that are available for all access types. When you query a metric, replace $callType with a specific access type. For more information about access types, see the Service access types and available dimensions section.
For example, to query the number of HTTP requests, change arms_$callType_requests_count to arms_http_requests_count.
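The placeholder substitution described above can be sketched in a few lines of Python. This is only an illustration: the helper name expand_metric is hypothetical and not part of ARMS. It uses plain string replacement because the placeholder is immediately followed by an underscore.

```python
PLACEHOLDER = "$callType"


def expand_metric(template: str, call_type: str) -> str:
    """Replace the $callType placeholder with a concrete access type."""
    if PLACEHOLDER not in template:
        raise ValueError(f"{template!r} does not contain {PLACEHOLDER}")
    # str.replace is used instead of string.Template because Template would
    # read "$callType_requests_count" as one long identifier.
    return template.replace(PLACEHOLDER, call_type)


if __name__ == "__main__":
    print(expand_metric("arms_$callType_requests_count", "http"))
    # arms_http_requests_count
```

The same helper works for any access type listed in the Service access types and available dimensions section, for example dubbo or mysql.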
Metric description | Metric | Measurement | Collection interval (seconds) | Unit | Dimension |
Number of requests | arms_$callType_requests_count | Gauge | 15 | None | Different service access types have different dimensions. For more information, see Service access types and available dimensions. |
Number of failed requests | arms_$callType_requests_error_count | Gauge | 15 | None | |
Request duration | arms_$callType_requests_seconds | Gauge | 15 | Seconds | |
Number of slow requests | arms_$callType_requests_slow_count | Gauge | 15 | None | |
Quantile of request duration | arms_$callType_requests_latency_seconds | Summary | 15 | Seconds | This metric is used only when the service access type is HTTP and quantile statistics is enabled. For more information, see Advanced settings. Quantile values:
|
All of the preceding metrics except arms_$callType_requests_latency_seconds are of the Gauge type. For an ARMS Gauge metric, the value at each data point is the total accumulated within the collection interval, which differs from the Gauge metrics generated by open source application frameworks. For example, to calculate the average queries per second (QPS) over one minute, the Prometheus Query Language (PromQL) statement for an ARMS Gauge metric is sum_over_time(arms_$callType_requests_count[1m])/60, and that for a metric in an open source application framework is rate(http_server_requests_count[1m]).
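To make the arithmetic concrete, the following Python sketch reproduces what sum_over_time(...)/60 computes for an ARMS Gauge metric. The sample values and the helper names average_qps and qps_query are assumptions made for illustration only.

```python
from typing import Sequence


def average_qps(interval_totals: Sequence[float], window_seconds: float) -> float:
    """Average QPS when each sample is the request total of one interval."""
    return sum(interval_totals) / window_seconds


def qps_query(call_type: str, window: str = "1m", window_seconds: int = 60) -> str:
    """Build the PromQL statement from the text for a given access type."""
    return f"sum_over_time(arms_{call_type}_requests_count[{window}])/{window_seconds}"


if __name__ == "__main__":
    # Hypothetical: four 15-second samples that cover one minute.
    samples = [30, 45, 15, 30]
    print(average_qps(samples, 60))  # 2.0
    print(qps_query("http"))
    # sum_over_time(arms_http_requests_count[1m])/60
```

Each sample already holds the number of requests in its 15-second interval, so the samples are summed rather than passed to rate().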
Aggregation metrics
Separate business metrics are provided for each call type. As a result, querying an application that involves multiple call types, such as HTTP and Dubbo, requires lengthy PromQL statements.
Business metrics cover all dimensions of an application, which is unnecessary in some scenarios and may degrade the performance of metric queries.
To solve the preceding problems, ARMS provides aggregation metrics.
Categories
Aggregation metrics are classified into the following categories:
General-purpose
Monitors the number of requests, number of errors, number of slow requests, and average request duration of all call types.
Database
Monitors the number of requests, number of errors, number of slow requests, and average request duration for database calls.
SQL
Monitors the number of requests, number of errors, number of slow requests, and average request duration for database calls, including SQL database calls.
Exception
Monitors the number of requests and the average request duration for exceptions of all call types.
Status code
Monitors the number of requests with different HTTP status codes.
Quantile
Monitors the quantiles of request duration for all call types.
Aggregation metrics other than quantile-specific metrics are further classified into basic aggregation metrics, which are named in the xxx_raw format, and dimensionality reduction metrics, which are named in the xxx_ign_x_y format. In the name of a dimensionality reduction metric, x and y are the dimensions that are aggregated away and therefore excluded from the metric.
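The naming convention can be read mechanically. The following Python sketch splits a metric name into its excluded dimensions. The helper name excluded_dimensions is hypothetical. Note that the tokens in a metric name are lowercase (for example, destid), whereas the dimension key is written destId.

```python
from typing import List


def excluded_dimensions(metric: str) -> List[str]:
    """Return the dimension tokens listed after "_ign_" in a metric name.

    A basic aggregation metric (xxx_raw) excludes nothing and yields [].
    """
    if metric.endswith("_raw"):
        return []
    _base, separator, suffix = metric.partition("_ign_")
    if not separator or not suffix:
        raise ValueError(f"{metric!r} is neither xxx_raw nor xxx_ign_x_y")
    return suffix.split("_")


if __name__ == "__main__":
    print(excluded_dimensions("arms_app_requests_count_ign_destid_endpoint_rpc"))
    # ['destid', 'endpoint', 'rpc']
    print(excluded_dimensions("arms_db_requests_count_raw"))
    # []
```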
Measurement and data collection interval
Unless otherwise specified, all aggregation metrics are of the Gauge type, and data is collected every 15 seconds.
Common dimensions
The following dimensions are applicable to all aggregation metrics.
Dimension | Description |
pid | The application PID. |
service | The application name. |
serverIp | The IP address of the server. |
source | The source of the metric. The following sources are supported:
|
Metrics
Category | Metric description | Metric | Unit | Dimension |
General-purpose | Number of requests | arms_app_requests_count_raw | None |
|
arms_app_requests_count_ign_destid_endpoint_rpc | None | destId, endpoint, and rpc are excluded. | ||
arms_app_requests_count_ign_destid_endpoint_ppid_prpc | None | destId, endpoint, ppid, and prpc are excluded. | ||
arms_app_requests_count_ign_destid_endpoint_ppid_prpc_rpc | None | destId, endpoint, ppid, prpc, and rpc are excluded. | ||
arms_app_requests_count_ign_parent_ppid_prpc_rpc | None | parent, ppid, prpc, and rpc are excluded. | ||
arms_app_requests_count_ign_endpoint_parent_ppid_prpc_rpc | None | endpoint, parent, ppid, prpc, and rpc are excluded. | ||
Number of request errors | arms_app_requests_error_count_raw | None |
| |
arms_app_requests_error_count_ign_destid_endpoint_rpc | None | destId, endpoint, and rpc are excluded. | ||
arms_app_requests_error_count_ign_destid_endpoint_ppid_prpc | None | destId, endpoint, ppid, and prpc are excluded. | ||
arms_app_requests_error_count_ign_destid_endpoint_ppid_prpc_rpc | None | destId, endpoint, ppid, prpc, and rpc are excluded. | ||
arms_app_requests_error_count_ign_parent_ppid_prpc_rpc | None | parent, ppid, prpc, and rpc are excluded. | ||
arms_app_requests_error_count_ign_endpoint_parent_ppid_prpc_rpc | None | endpoint, parent, ppid, prpc, and rpc are excluded. | ||
Number of slow requests | arms_app_requests_slow_count_raw | None |
| |
arms_app_requests_slow_count_ign_destid_endpoint_rpc | None | destId, endpoint, and rpc are excluded. | ||
arms_app_requests_slow_count_ign_destid_endpoint_ppid_prpc | None | destId, endpoint, ppid, and prpc are excluded. | ||
arms_app_requests_slow_count_ign_destid_endpoint_ppid_prpc_rpc | None | destId, endpoint, ppid, prpc, and rpc are excluded. | ||
arms_app_requests_slow_count_ign_parent_ppid_prpc_rpc | None | parent, ppid, prpc, and rpc are excluded. | ||
arms_app_requests_slow_count_ign_endpoint_parent_ppid_prpc_rpc | None | endpoint, parent, ppid, prpc, and rpc are excluded. | ||
Request duration | arms_app_requests_seconds_raw | Seconds |
| |
arms_app_requests_seconds_ign_destid_endpoint_rpc | Seconds | destId, endpoint, and rpc are excluded. | ||
arms_app_requests_seconds_ign_destid_endpoint_ppid_prpc | Seconds | destId, endpoint, ppid, and prpc are excluded. | ||
arms_app_requests_seconds_ign_destid_endpoint_ppid_prpc_rpc | Seconds | destId, endpoint, ppid, prpc, and rpc are excluded. | ||
arms_app_requests_seconds_ign_parent_ppid_prpc_rpc | Seconds | parent, ppid, prpc, and rpc are excluded. | ||
arms_app_requests_seconds_ign_endpoint_parent_ppid_prpc_rpc | Seconds | endpoint, parent, ppid, prpc, and rpc are excluded. | ||
Database | Number of database requests | arms_db_requests_count_raw | None |
|
arms_db_requests_count_ign_rpc | None | The interface dimension is excluded. | ||
Number of database request errors | arms_db_requests_error_count_raw | None |
| |
arms_db_requests_error_count_ign_rpc | None | The interface dimension is excluded. | ||
Number of slow database requests | arms_db_requests_slow_count_raw | None |
| |
arms_db_requests_slow_count_ign_rpc | None | The interface dimension is excluded. | ||
Database request duration | arms_db_requests_seconds_raw | Seconds |
| |
arms_db_requests_seconds_ign_rpc | Seconds | The interface dimension is excluded. | ||
SQL | Number of SQL requests | arms_sql_requests_count_raw | None |
| |
arms_sql_requests_count_ign_rpc | None | The interface dimension is excluded. | ||
Number of SQL request errors | arms_sql_requests_error_count_raw | None |
| |
arms_sql_requests_error_count_ign_rpc | None | The interface dimension is excluded. | ||
Number of slow SQL requests | arms_sql_requests_slow_count_raw | None |
| |
arms_sql_requests_slow_count_ign_rpc | None | The interface dimension is excluded. | ||
SQL request duration | arms_sql_requests_seconds_raw | Seconds |
| |
arms_sql_requests_seconds_ign_rpc | Seconds | The interface dimension is excluded. | ||
Exception | Number of requests with exceptions | arms_exception_requests_count_raw | None |
|
arms_exception_requests_count_ign_rpc | None | The interface dimension is excluded. | ||
Duration of requests with exceptions | arms_exception_requests_seconds_raw | Seconds |
| |
arms_exception_requests_seconds_ign_rpc | Seconds | The interface dimension is excluded. | ||
Status code | Number of requests by status code | arms_requests_by_status_count_raw | None |
|
arms_requests_by_status_count_ign_rpc | None | The interface dimension is excluded. | ||
Quantile | Quantile of request duration (Note: This metric is supported by the ARMS agent V4.x and later.) | arms_uni_requests_latency_seconds |
|
Examples
How do I select a metric if I want to use PromQL to monitor and query the number of requests of all interfaces for an application?
First, narrow down the metrics to general-purpose metrics because only general-purpose metrics can be used to monitor the number of interface requests.
Second, select metrics that include the interface dimension but not other dimensions, such as upstream interface, upstream application, or remote address.
In summary, the optimal metric is arms_app_requests_count_ign_destid_endpoint_ppid_prpc, which retains the interface dimension (rpc) and excludes destId, endpoint, ppid, and prpc.
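The selection steps above can be sketched as a small Python routine: keep only the candidates that retain the required dimension, then prefer the one that excludes the most other dimensions. The candidate names and excluded dimensions are copied from the Metrics table. The helper names pick_metric and per_interface_query and the service name my-app are hypothetical.

```python
from typing import Dict, FrozenSet, Set

# Excluded dimensions of each general-purpose request-count metric.
CANDIDATES: Dict[str, FrozenSet[str]] = {
    "arms_app_requests_count_raw": frozenset(),
    "arms_app_requests_count_ign_destid_endpoint_rpc":
        frozenset({"destId", "endpoint", "rpc"}),
    "arms_app_requests_count_ign_destid_endpoint_ppid_prpc":
        frozenset({"destId", "endpoint", "ppid", "prpc"}),
    "arms_app_requests_count_ign_destid_endpoint_ppid_prpc_rpc":
        frozenset({"destId", "endpoint", "ppid", "prpc", "rpc"}),
    "arms_app_requests_count_ign_parent_ppid_prpc_rpc":
        frozenset({"parent", "ppid", "prpc", "rpc"}),
    "arms_app_requests_count_ign_endpoint_parent_ppid_prpc_rpc":
        frozenset({"endpoint", "parent", "ppid", "prpc", "rpc"}),
}


def pick_metric(required: Set[str], candidates: Dict[str, FrozenSet[str]]) -> str:
    """Pick the candidate that keeps every required dimension and excludes
    as many other dimensions as possible."""
    usable = {name: excl for name, excl in candidates.items() if not (required & excl)}
    if not usable:
        raise ValueError("no candidate keeps all required dimensions")
    return max(usable, key=lambda name: len(usable[name]))


def per_interface_query(metric: str, service: str, window: str = "1m") -> str:
    """Build a PromQL statement that totals requests per interface (rpc)."""
    return f'sum by (rpc) (sum_over_time({metric}{{service="{service}"}}[{window}]))'


if __name__ == "__main__":
    metric = pick_metric({"rpc"}, CANDIDATES)
    print(metric)
    # arms_app_requests_count_ign_destid_endpoint_ppid_prpc
    print(per_interface_query(metric, "my-app"))
```

Preferring the metric with the most excluded dimensions is a design choice in this sketch. It reflects the point made earlier that monitoring unneeded dimensions adds query cost.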
JVM metrics
Common dimensions
Dimension name | Dimension key |
Service name | service |
Service PID | pid |
Server IP address | serverIp |
Metrics
Metric description | Metric | Measurement | Collection interval (seconds) | Unit | Dimension |
Cumulative GC occurrences | arms_jvm_gc_total | Counter | 15 | None | GC generation:
|
Cumulative GC duration | arms_jvm_gc_seconds_total | Counter | 15 | Seconds | |
Occurrences of GC between two collection intervals | arms_jvm_gc_delta | Gauge | 15 | None | |
Duration of GC between two collection intervals | arms_jvm_gc_seconds_delta | Gauge | 15 | Seconds | |
Number of JVM threads | arms_jvm_threads_count | Gauge | 15 | None | Thread status:
|
Initial size of JVM memory area | arms_jvm_mem_init_bytes | Gauge | 15 | Bytes | Area:
ID space:
|
Maximum size of JVM memory area | arms_jvm_mem_max_bytes | Gauge | 15 | Bytes | |
Used size of JVM memory area | arms_jvm_mem_used_bytes | Gauge | 15 | Bytes | |
Committed size of JVM memory area | arms_jvm_mem_committed_bytes | Gauge | 15 | Bytes | |
Usage ratio of JVM memory area | arms_jvm_mem_usage_ratio | Gauge | 15 | Ratio (0 to 1) | |
Loaded JVM classes | arms_class_load_loaded | Counter | 15 | None | None |
Unloaded JVM classes | arms_class_load_un_loaded | Counter | 15 | None | None |
JVM buffer pool size | arms_jvm_buffer_pool_total_bytes | Gauge | 15 | Bytes | ID space:
|
Used size of JVM buffer pool | arms_jvm_buffer_pool_used_bytes | Gauge | 15 | Bytes | |
Number of JVM buffer pools | arms_jvm_buffer_pool_count | Gauge | 15 | None | |
Number of opened file descriptors | arms_file_desc_open_count | Gauge | 15 | None | None |
File descriptor opening ratio (Number of opened file descriptors/Maximum number allowed) | arms_file_desc_open_ratio | Gauge | 15 | Ratio (0 to 1) | None |
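The table lists GC metrics both as cumulative Counters (arms_jvm_gc_total) and as per-interval Gauges (arms_jvm_gc_delta). The following Python sketch shows how the two views relate. It assumes that the counter is not reset between samples, for example by a process restart. The sample values and the helper name deltas_from_counter are hypothetical.

```python
from typing import List, Sequence


def deltas_from_counter(cumulative: Sequence[float]) -> List[float]:
    """Return the per-interval increments of a cumulative counter series."""
    return [later - earlier for earlier, later in zip(cumulative, cumulative[1:])]


if __name__ == "__main__":
    # Hypothetical arms_jvm_gc_total samples taken 15 seconds apart.
    gc_total = [100, 102, 102, 105]
    print(deltas_from_counter(gc_total))  # [2, 0, 3]
```

In PromQL, the increase() function is commonly used to obtain increments from a Counter. For a delta Gauge, the per-interval values can be summed directly.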
System metrics
Common dimensions
Dimension name | Dimension key |
Service name | service |
Service PID | pid |
Server IP address | serverIp |
Metrics
Metric description | Metric | Measurement | Collection interval (seconds) | Unit |
Idle CPU percentage | arms_system_cpu_idle | Gauge | 15 | Percentage |
I/O wait CPU percentage | arms_system_cpu_io_wait | Gauge | 15 | Percentage |
System CPU percentage | arms_system_cpu_system | Gauge | 15 | Percentage |
User CPU percentage | arms_system_cpu_user | Gauge | 15 | Percentage |
System load (1 minute) | arms_system_load | Gauge | 15 | None |
Idle disk size | arms_system_disk_free_bytes | Gauge | 15 | Bytes |
Total disk size | arms_system_disk_total_bytes | Gauge | 15 | Bytes |
Disk usage | arms_system_disk_used_ratio | Gauge | 15 | Ratio (0 to 1) |
Memory buffer size | arms_system_mem_buffers_bytes | Gauge | 15 | Bytes |
Memory cache size | arms_system_mem_cached_bytes | Gauge | 15 | Bytes |
Idle memory size | arms_system_mem_free_bytes | Gauge | 15 | Bytes |
Idle memory swap size | arms_system_mem_swap_free_bytes | Gauge | 15 | Bytes |
Memory swap size | arms_system_mem_swap_total_bytes | Gauge | 15 | Bytes |
Memory size | arms_system_mem_total_bytes | Gauge | 15 | Bytes |
Used memory size | arms_system_mem_used_bytes | Gauge | 15 | Bytes |
Inbound network traffic | arms_system_net_in_bytes | Gauge | 15 | Bytes |
Outbound network traffic | arms_system_net_out_bytes | Gauge | 15 | Bytes |
Number of network ingress errors | arms_system_net_in_err | Gauge | 15 | None |
Number of network egress errors | arms_system_net_out_err | Gauge | 15 | None |
Thread pool and connection pool metrics
Common dimensions
Dimension name | Dimension key |
Service name | service |
Service PID | pid |
Server IP address | serverIp |
Thread pool name (ARMS agent earlier than V4.1.x) | name |
Thread pool type (ARMS agent earlier than V4.1.x) | type |
Metrics
ARMS agent V4.1.x and later
Thread pool metrics
Metric description | Metric | Measurement | Collection interval (seconds) | Dimension |
Number of core threads | arms_thread_pool_core_pool_size | Gauge | 15 |
|
Maximum number of threads | arms_thread_pool_max_pool_size | Gauge | 15 |
|
Number of active threads | arms_thread_pool_active_thread_count | Gauge | 15 |
|
Current number of threads | arms_thread_pool_current_thread_count | Gauge | 15 |
|
Maximum historical number of threads | arms_thread_pool_max_thread_count | Gauge | 15 |
|
Number of scheduled tasks | arms_thread_pool_scheduled_task_count | Counter | 15 |
|
Number of completed tasks | arms_thread_pool_completed_task_count | Counter | 15 |
|
Number of rejected tasks | arms_thread_pool_rejected_task_count | Counter | 15 |
|
Task queue size | arms_thread_pool_queue_size | Gauge | 15 |
|
Connection pool metrics
Metric description | Metric | Measurement | Collection interval (seconds) | Dimension |
Number of connections | arms_connection_pool_connection_count | Gauge | 15 |
|
Minimum number of idle connections | arms_connection_pool_connection_min_idle_count | Gauge | 15 |
|
Maximum number of idle connections | arms_connection_pool_connection_max_idle_count | Gauge | 15 |
|
Maximum number of connections | arms_connection_pool_connection_max_count | Gauge | 15 |
|
Number of blocked connection requests | arms_connection_pool_pending_request_count | Counter | 15 |
|
ARMS agent earlier than V4.1.x
Metric description | Metric | Measurement | Collection interval (seconds) | Dimension |
Number of core threads | arms_threadpool_core_size | Gauge | 15 | None |
Maximum number of threads | arms_threadpool_max_size | Gauge | 15 | None |
Number of active threads | arms_threadpool_active_size | Gauge | 15 | None |
Thread pool queue size | arms_threadpool_queue_size | Gauge | 15 | None |
Current size of the thread pool | arms_threadpool_current_size | Gauge | 15 | None |
Number of tasks in different states in the thread pool | arms_threadpool_task_total | Gauge | 15 | The status of the task. Valid values:
|
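Because thread pool metric names differ between agent versions, a dashboard generator may need to choose the name based on the agent version. The following Python sketch shows one way to do this. The pairing of old and new metrics is inferred from the metric descriptions in the preceding tables, and the version string format and helper names are assumptions.

```python
from typing import Dict, Tuple

# (metric for ARMS agent V4.1.x and later, metric for earlier agents)
THREAD_POOL_METRICS: Dict[str, Tuple[str, str]] = {
    "core_threads": ("arms_thread_pool_core_pool_size", "arms_threadpool_core_size"),
    "active_threads": ("arms_thread_pool_active_thread_count", "arms_threadpool_active_size"),
    "current_threads": ("arms_thread_pool_current_thread_count", "arms_threadpool_current_size"),
    "queue_size": ("arms_thread_pool_queue_size", "arms_threadpool_queue_size"),
}


def parse_version(version: str) -> Tuple[int, int]:
    """Parse the major and minor numbers from a string such as "4.1.0"."""
    parts = version.lstrip("vV").split(".")
    return int(parts[0]), int(parts[1])


def thread_pool_metric(kind: str, agent_version: str) -> str:
    """Return the metric name that matches the given agent version."""
    new_name, old_name = THREAD_POOL_METRICS[kind]
    return new_name if parse_version(agent_version) >= (4, 1) else old_name


if __name__ == "__main__":
    print(thread_pool_metric("core_threads", "4.1.0"))
    # arms_thread_pool_core_pool_size
    print(thread_pool_metric("queue_size", "3.2.8"))
    # arms_threadpool_queue_size
```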
Scheduled task metrics
The following metrics are available only for scheduled tasks.
Common dimensions
Dimension name | Dimension key |
Service name | service |
Service PID | pid |
Server IP address | serverIp |
Task ID | rpc |
Metrics
Metric description | Metric | Measurement | Collection interval (seconds) | Unit |
Scheduling delay | arms_$callType_delay_milliseconds | Gauge | 15 | Milliseconds |
Service access types and available dimensions
Clients
Access types
http_client
dubbo_client
hsf_client
dsf_client
notify_client
grpc_client
thrift_client
sofa_client
mq_client
kafka_client
Dimensions
parent: the name of the upstream service
ppid: the PID of the upstream service
destId: the extension information of the request peer
endpoint: the endpoint of the request peer
excepType: the ID of the exception
excepInfo: the ID encoding rule of the exception
excepName: the name of the exception
stackTraceId: the ID of the exception stack
Databases
Access types
mysql
oracle
mariadb
postgresql
ppas
sqlserver
mongodb
dmdb
Dimensions
parent: the name of the upstream service
ppid: the PID of the upstream service
destId: the name of the database
endpoint: the endpoint of the database
excepType: the ID of the exception
excepInfo: the ID encoding rule of the exception
excepName: the name of the exception
stackTraceId: the ID of the exception stack
sqlId: the ID of the SQL statement
Servers
Access types
http
dubbo
hsf
dsf
user_method
mq
kafka
grpc
thrift
sofa
Dimensions
prpc: the upstream interface
parent: the name of the upstream service
ppid: the PID of the upstream service
endpoint: the endpoint of the service
excepType: the ID of the exception
excepInfo: the ID encoding rule of the exception
excepName: the name of the exception
stackTraceId: the ID of the exception stack
Scheduled tasks
Access types
xxl_job
spring_scheduled
quartz
elasticjob
jdk_timer
schedulerx
Dimensions
prpc: the upstream interface
parent: the name of the upstream service
ppid: the PID of the upstream service
excepType: the ID of the exception
excepInfo: the ID encoding rule of the exception
excepName: the name of the exception
stackTraceId: the ID of the exception stack
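The lists in this section can be turned into a lookup that returns the dimensions available for a given access type. The following Python sketch is a minimal illustration. The access types and dimensions are copied from the lists above, whereas the category keys (client, database, server, scheduled_task) and the helper names are hypothetical.

```python
from typing import Dict, List

EXCEPTION_DIMS = ["excepType", "excepInfo", "excepName", "stackTraceId"]

ACCESS_TYPES: Dict[str, List[str]] = {
    "client": ["http_client", "dubbo_client", "hsf_client", "dsf_client",
               "notify_client", "grpc_client", "thrift_client", "sofa_client",
               "mq_client", "kafka_client"],
    "database": ["mysql", "oracle", "mariadb", "postgresql", "ppas",
                 "sqlserver", "mongodb", "dmdb"],
    "server": ["http", "dubbo", "hsf", "dsf", "user_method", "mq", "kafka",
               "grpc", "thrift", "sofa"],
    "scheduled_task": ["xxl_job", "spring_scheduled", "quartz", "elasticjob",
                       "jdk_timer", "schedulerx"],
}

DIMENSIONS: Dict[str, List[str]] = {
    "client": ["parent", "ppid", "destId", "endpoint"] + EXCEPTION_DIMS,
    "database": ["parent", "ppid", "destId", "endpoint"] + EXCEPTION_DIMS + ["sqlId"],
    "server": ["prpc", "parent", "ppid", "endpoint"] + EXCEPTION_DIMS,
    "scheduled_task": ["prpc", "parent", "ppid"] + EXCEPTION_DIMS,
}


def category_of(call_type: str) -> str:
    """Return the category that lists the given access type."""
    for category, types in ACCESS_TYPES.items():
        if call_type in types:
            return category
    raise ValueError(f"unknown access type: {call_type!r}")


def available_dimensions(call_type: str) -> List[str]:
    """Return the dimensions listed for the category of an access type."""
    return DIMENSIONS[category_of(call_type)]


if __name__ == "__main__":
    print(category_of("mysql"))  # database
    print(available_dimensions("quartz"))
```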