ApsaraDB RDS for PostgreSQL provides a wide range of Enhanced Monitoring metrics, including operating system (OS) metrics and database metrics. This topic describes how to view the Enhanced Monitoring metrics of an ApsaraDB RDS for PostgreSQL instance in the ApsaraDB RDS console.
Procedure
Go to the Instances page. In the top navigation bar, select the region in which the RDS instance resides. Then, find the RDS instance and click the ID of the instance.
In the left-side navigation pane, click Monitoring and Alerts.
On the Enhanced Monitoring tab, click Manage Metrics. On the OS Metrics tab and the Database Metrics tab of the Manage Metrics dialog box, select the metrics that you want to view. For more information, see References.
Note: A maximum of 30 metrics can be displayed on the Enhanced Monitoring tab.
You can apply the selected metrics to all RDS instances in the region of the current RDS instance. This includes the existing RDS instances and all new RDS instances that you create at a later time.
If the current RDS instance uses cloud disks, you can apply the selected metrics to all RDS instances that use cloud disks in the region of the current RDS instance.
If the current RDS instance uses local disks, you can apply the selected metrics to all RDS instances that use local disks in the region of the current RDS instance.
Click Update Metrics. On the Enhanced Monitoring tab, view the metrics that you selected.
On the Enhanced Monitoring tab, specify query criteria based on which you want to query monitoring data.
No. | Feature | Description |
1 | Time range | You can query monitoring data over a preset time range or a custom time range. The preset time range can be 30 minutes, 1 hour, 2 hours, 6 hours, 1 day, 7 days, or 30 days. The custom time range is specified by a start time and an end time in the following format: YYYY-MM-DD hh:mm:ss - YYYY-MM-DD hh:mm:ss. |
2 | Aggregation method | You can specify the method that ApsaraDB RDS uses to aggregate monitoring data. The following aggregation methods are supported: Average, Maximum, and Minimum. |
3 | Layout | You can adjust the layout in which charts are displayed. The following layouts are supported: one column, two columns, three columns, and four columns. |
4 | Time granularity | You can specify the time granularity of the x-axis in each chart that is displayed. The time granularity varies based on the time range that you specify. Time range less than or equal to 1 hour: 5 seconds. Greater than 1 hour and less than or equal to 2 hours: 10 seconds. Greater than 2 hours and less than or equal to 6 hours: 30 seconds. Greater than 6 hours and less than or equal to 12 hours: 1 minute. Greater than 12 hours and less than or equal to 1 day: 2 minutes. Greater than 1 day and less than or equal to 5 days: 10 minutes. Greater than 5 days and less than or equal to 15 days: 30 minutes. Greater than 15 days and less than or equal to 30 days: 1 hour. |
5 | Pointer link | You can turn on Pointer Link. When you move the pointer over a specific point in time on the x-axis of a chart, all charts on the Enhanced Monitoring tab display the monitoring data that is collected at that point in time. |
6 | Refresh | You can manually refresh the Enhanced Monitoring tab to update monitoring data. |
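The custom time range format and the relationship between the time range and the time granularity can be sketched in Python as follows. This is only an illustration of the rules listed above, not the console's implementation. The names `parse_time_range` and `granularity_for` are hypothetical helpers introduced for this example.

```python
from datetime import datetime, timedelta

# Format of each endpoint in "YYYY-MM-DD hh:mm:ss - YYYY-MM-DD hh:mm:ss".
TIME_FORMAT = "%Y-%m-%d %H:%M:%S"

# (upper bound of the time range, time granularity) pairs, copied from
# the rules listed in the Time granularity row.
GRANULARITY_STEPS = [
    (timedelta(hours=1), timedelta(seconds=5)),
    (timedelta(hours=2), timedelta(seconds=10)),
    (timedelta(hours=6), timedelta(seconds=30)),
    (timedelta(hours=12), timedelta(minutes=1)),
    (timedelta(days=1), timedelta(minutes=2)),
    (timedelta(days=5), timedelta(minutes=10)),
    (timedelta(days=15), timedelta(minutes=30)),
    (timedelta(days=30), timedelta(hours=1)),
]


def parse_time_range(text):
    """Parse a custom time range string into a (start, end) pair (hypothetical helper)."""
    start_text, end_text = text.split(" - ")
    start = datetime.strptime(start_text.strip(), TIME_FORMAT)
    end = datetime.strptime(end_text.strip(), TIME_FORMAT)
    if end <= start:
        raise ValueError("the end time must be later than the start time")
    return start, end


def granularity_for(time_range):
    """Return the time granularity that corresponds to a time range (hypothetical helper)."""
    if time_range <= timedelta(0):
        raise ValueError("the time range must be positive")
    for upper_bound, granularity in GRANULARITY_STEPS:
        if time_range <= upper_bound:
            return granularity
    raise ValueError("time ranges longer than 30 days are not covered by the rules")


if __name__ == "__main__":
    start, end = parse_time_range("2024-01-01 00:00:00 - 2024-01-01 03:00:00")
    print(granularity_for(end - start))
```

For example, a custom time range of 3 hours falls into the "greater than 2 hours and less than or equal to 6 hours" rule, so the sketch returns a granularity of 30 seconds.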
References
The following tables describe the operating system metrics and database metrics that are supported. In the tables, ticks (✔️) indicate that a metric is supported, and crosses (❌) indicate that a metric is not supported.
OS Metrics
Category | Metric name | Description | Unit | RDS instance that uses local disks | RDS instance that uses cloud disks |
Network traffic |
| The throughput of inbound traffic of the server. | MB/s | ❌ | ✔️ |
| The throughput of outbound traffic of the server. | MB/s | ❌ | ✔️ | |
CPU utilization |
| The system CPU utilization. The value of this metric is calculated based on the following formula: System CPU utilization = CPU resources consumed to run kernel code/Total CPU resources. | % | ✔️ | ✔️ |
| The user CPU utilization. The value of this metric is calculated based on the following formula: User CPU utilization = CPU resources consumed to run code in user mode/Total CPU resources. | % | ✔️ | ✔️ | |
| The CPU utilization for the server. The value of this metric is calculated based on the following formula: CPU utilization for the server = CPU resources consumed to both run kernel code and run code in user mode/Total CPU resources | % | ✔️ | ✔️ | |
CPU consumption by process |
| The CPU utilization of the backend process. The value is relative to a single CPU: if the process consumes one CPU, the value is 100%, and if it consumes two CPUs, the value is 200%. | % | ✔️ | ✔️ |
| The CPU utilization of the bgwriter process. The value is relative to a single CPU: if the process consumes one CPU, the value is 100%, and if it consumes two CPUs, the value is 200%. | % | ✔️ | ✔️ | |
| The CPU utilization of the checkpoint process. The value is relative to a single CPU: if the process consumes one CPU, the value is 100%, and if it consumes two CPUs, the value is 200%. | % | ✔️ | ✔️ | |
| The CPU utilization of the logger process. The value is relative to a single CPU: if the process consumes one CPU, the value is 100%, and if it consumes two CPUs, the value is 200%. | % | ✔️ | ✔️ | |
| The CPU utilization of the pgstat process. The value is relative to a single CPU: if the process consumes one CPU, the value is 100%, and if it consumes two CPUs, the value is 200%. | % | ✔️ | ✔️ | |
| The CPU utilization of the walwriter process. The value is relative to a single CPU: if the process consumes one CPU, the value is 100%, and if it consumes two CPUs, the value is 200%. | % | ✔️ | ✔️ | |
| The CPU utilization of the autovacuum process. The value is relative to a single CPU: if the process consumes one CPU, the value is 100%, and if it consumes two CPUs, the value is 200%. | % | ✔️ | ✔️ | |
| The CPU utilization of the walsender process. The value is relative to a single CPU: if the process consumes one CPU, the value is 100%, and if it consumes two CPUs, the value is 200%. | % | ✔️ | ✔️ | |
| The CPU utilization of the postmaster process. The value is relative to a single CPU: if the process consumes one CPU, the value is 100%, and if it consumes two CPUs, the value is 200%. | % | ✔️ | ✔️ | |
Memory details |
| The memory size of the instance type. | MB | ✔️ | ✔️ |
| The amount of the memory that is used. | MB | ✔️ | ✔️ | |
| The amount of the memory that is used as page cache. | MB | ✔️ | ✔️ | |
| The amount of the shared memory that is used. | MB | ✔️ | ✔️ | |
| The amount of the RSS memory that is used. | MB | ✔️ | ✔️ | |
| The amount of the huge-page memory that is used. For this metric, the size of a huge page is 2 MB. | MB | ✔️ | ✔️ | |
Memory used by process |
| The amount of the memory that is used by the backend process. | MB | ✔️ | ✔️ |
| The amount of the memory that is used by the bgwriter process. | MB | ✔️ | ✔️ | |
| The amount of the memory that is used by the checkpoint process. | MB | ✔️ | ✔️ | |
| The amount of the memory that is used by the logger process. | MB | ✔️ | ✔️ | |
| The amount of the memory that is used by the pgstat process. | MB | ✔️ | ✔️ | |
| The amount of the memory that is used by the walwriter process. | MB | ✔️ | ✔️ | |
| The amount of the memory that is used by the autovacuum process. | MB | ✔️ | ✔️ | |
| The amount of the memory that is used by the walsender process. | MB | ✔️ | ✔️ | |
| The amount of the memory that is used by the postmaster process. | MB | ✔️ | ✔️ | |
Memory usage |
| Memory usage | % | ✔️ | ✔️ |
IOPS |
| The disk read and write IOPS of the server. | Counts/s | ❌ | ✔️ |
| The disk read IOPS of the server. | Counts/s | ❌ | ✔️ | |
| The disk write IOPS of the server. | Counts/s | ❌ | ✔️ | |
| The IOPS of the local data disk. | Counts/s | ✔️ | ❌ | |
| The IOPS of the local log disk. | Counts/s | ✔️ | ❌ | |
I/O throughput |
| The disk read and write throughput of the server. | MB/s | ❌ | ✔️ |
| The disk read throughput of the server. | MB/s | ❌ | ✔️ | |
| The disk write throughput of the server. | MB/s | ❌ | ✔️ | |
| The read and write throughput of the local data disk. | MB/s | ✔️ | ❌ | |
| The read and write throughput of the local log disk. | MB/s | ✔️ | ❌ | |
Disk usage |
| The disk usage of the server. | % | ❌ | ✔️ |
Disk space |
| The used disk space of the server. | MB | ❌ | ✔️ |
| The total disk space of the server. | MB | ❌ | ✔️ | |
| The size of log files. This includes audit log files, error log files, and slow SQL log files. | MB | ✔️ | ✔️ | |
| The size of WAL files. | MB | ✔️ | ✔️ | |
| The size of data files. This excludes log files and WAL files. | MB | ✔️ | ✔️ |
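The CPU utilization formulas in the preceding table can be illustrated with a short Python sketch. It assumes that you already have CPU time counters that are measured over the same interval. The function names and parameters are hypothetical and are not part of any ApsaraDB RDS API.

```python
def cpu_utilization(kernel_time, user_time, total_time):
    """Return (system %, user %, server %) as described by the formulas in the table.

    All three arguments are CPU time amounts measured over the same
    interval and in the same unit (hypothetical inputs).
    """
    if total_time <= 0:
        raise ValueError("total_time must be positive")
    system_pct = 100.0 * kernel_time / total_time
    user_pct = 100.0 * user_time / total_time
    # Server CPU utilization covers both kernel code and user-mode code.
    server_pct = system_pct + user_pct
    return system_pct, user_pct, server_pct


def process_cpu_percent(cpu_seconds_used, wall_seconds):
    """Return per-process CPU utilization on the scale used by the
    'CPU consumption by process' metrics: one fully used CPU is 100%,
    two fully used CPUs are 200%.
    """
    if wall_seconds <= 0:
        raise ValueError("wall_seconds must be positive")
    return 100.0 * cpu_seconds_used / wall_seconds


if __name__ == "__main__":
    print(cpu_utilization(kernel_time=10, user_time=30, total_time=100))
    print(process_cpu_percent(cpu_seconds_used=2.0, wall_seconds=1.0))
```

Because the per-process metrics are relative to a single CPU, a process on a multi-core instance can report a value greater than 100%.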
Database Metrics
For more information about database metrics, see the PostgreSQL documentation.
Category | Metric name | Description | Unit | RDS instance that uses local disks | RDS instance that uses cloud disks |
Connections |
| The number of connections in the active state. | Counts | ✔️ | ✔️ |
| The number of connections in the waiting state. | Counts | ✔️ | ✔️ | |
| The number of connections in the idle state. | Counts | ✔️ | ✔️ | |
| The total number of connections. | Counts | ✔️ | ✔️ | |
| The maximum number of connections that are allowed. | Counts | ✔️ | ✔️ | |
SQL |
| The number of rows that are returned per second. | Tuples/s | ✔️ | ✔️ |
| The number of rows that are read per second. | Tuples/s | ✔️ | ✔️ | |
| The number of rows that are inserted per second. | Tuples/s | ✔️ | ✔️ | |
| The number of rows that are deleted per second. | Tuples/s | ✔️ | ✔️ | |
| The number of rows that are updated per second. | Tuples/s | ✔️ | ✔️ | |
Slow SQL statements |
| The number of SQL statements that have been running for 1 second. | Counts | ✔️ | ✔️ |
| The number of SQL statements that have been running for 3 seconds. | Counts | ✔️ | ✔️ | |
| The number of SQL statements that have been running for 5 seconds. | Counts | ✔️ | ✔️ | |
Long transactions |
| The number of transactions that have been running for 1 second. | Counts | ✔️ | ✔️ |
| The number of transactions that have been running for 3 seconds. | Counts | ✔️ | ✔️ | |
| The number of transactions that have been idle for 1 second. | Counts | ✔️ | ✔️ | |
| The number of transactions that have been idle for 3 seconds. | Counts | ✔️ | ✔️ | |
| The number of transactions that have been idle for 5 seconds. | Counts | ✔️ | ✔️ | |
| The number of two-phase transactions that have been running for 1 second. | Counts | ✔️ | ✔️ | |
| The number of two-phase transactions that have been running for 3 seconds. | Counts | ✔️ | ✔️ | |
| The number of two-phase transactions that have been running for 5 seconds. | Counts | ✔️ | ✔️ | |
Temporary files |
| The number of temporary files that are generated per second. | Counts/s | ✔️ | ✔️ |
Temporary file size |
| The size of temporary files that are generated per second. | Bytes/s | ✔️ | ✔️ |
Maximum transaction ID |
| The maximum transaction ID on the RDS instance. | xids | ✔️ | ✔️ |
Synchronization latency to read-only instances |
| The latency at which the attached read-only RDS instances replay logs. | s | ✔️ | ✔️ |
| The latency at which the attached read-only RDS instances write data. | s | ✔️ | ✔️ | |
| The latency at which the attached read-only RDS instances flush data. | s | ✔️ | ✔️ | |
Database memory distribution |
| The memory size of the instance type. | MB | ✔️ | ✔️ |
| The amount of the shared_buffer memory that is used. Note The level 1 cache memory remains unchanged after up to 25% of cache memory is used. | MB | ✔️ | ✔️ | |
| The amount of the resident set size (RSS) memory that is used. Note The memory allocated to the PostgreSQL process by calling the malloc function varies based on the number of established connections and running SQL statements. The PostgreSQL process and the page cache indicated by
| MB | ✔️ | ✔️ | |
| The amount of the free memory. Note The free memory will gradually be used up. PostgreSQL allocates the free memory to db.mem_size.cache as much as possible. This helps make full use of the instance memory. | MB | ✔️ | ✔️ | |
| The amount of the memory that is used as page cache. Note The level-2 cache and the page cache indicated by
| MB | ✔️ | ✔️ | |
Available database memory |
| The amount of the available database memory. Note You can calculate the available memory by using the following formula: | MB | ✔️ | ✔️ |
Database memory availability ratio |
| The availability ratio of the database memory. Note
| % | ✔️ | ✔️ |
Shared buffer hit ratio |
| The proportion of requests for which the requested content is hit in the shared buffers. | % | ✔️ | ✔️ |
Shared buffer hits |
| The number of requests for which the requested content is hit in the shared buffers per second. | Blocks/s | ✔️ | ✔️ |
I/O |
| The number of operations that are performed by the backend process per second to read data from the disks to the buffers. | Counts/s | ✔️ | ✔️ |
| The number of operations that are performed by the backend process per second to write data from the buffers to the disks. | Counts/s | ✔️ | ✔️ | |
| The number of operations that are performed by the checkpoint process per second to write data from the buffers to the disks. | Counts/s | ✔️ | ✔️ | |
| The number of operations that are performed by the bgwriter process per second to write data from the buffers to the disks. | Counts/s | ✔️ | ✔️ | |
| The number of times that the backend process calls the fsync() function on the disks per second. | Counts/s | ✔️ | ✔️ | |
Checkpoint quantity |
| The number of checkpoint processes that are scheduled by the database engine per second. | Counts/s | ✔️ | ✔️ |
| The number of checkpoint processes that are requested by the user per second. | Counts/s | ✔️ | ✔️ | |
TPS |
| The number of write transactions that are committed per second. | Counts/s | ✔️ | ✔️ |
| The number of write transactions that are rolled back per second. | Counts/s | ✔️ | ✔️ | |
Transaction statuses |
| The number of transactions in the active state. | Counts | ✔️ | ✔️ |
| The number of transactions in the waiting state. | Counts | ✔️ | ✔️ | |
| The number of transactions in the idle state. We recommend that you check and process these transactions at the earliest opportunity. | Counts | ✔️ | ✔️ | |
Swell point in time |
| The execution duration of the longest transaction. | s | ✔️ | ✔️ |
ReplicationSlot latency |
| The maximum latency that is allowed for the replication slot to replicate WAL records. The WAL records that follow the replication start position must be retained. If the replication start position indicates a WAL record that has a relatively high log sequence number (LSN), WAL records may pile up. In this case, we recommend that you make sure these WAL records are processed at the earliest opportunity. | MB | ✔️ | ✔️ |
Checkpoint write duration |
| The amount of time that the checkpoint process spends per second in running the fsync() function on the disks. | ms/s | ✔️ | ✔️ |
| The amount of time that the checkpoint process spends per second in writing data from the buffers to the disks. | ms/s | ✔️ | ✔️ | |
PgBouncer connections |
| The number of active connections on the client. Note You can view connection pool-related metrics on the Enhanced Monitoring tab only after you enable the connection pooling feature. For more information, see Enable or disable the connection pooling feature. | Counts | ❌ | ✔️ |
| The number of waiting connections on the client. | Counts | ❌ | ✔️ | |
| The number of active connections on the server. | Counts | ❌ | ✔️ | |
| The number of idle connections on the server. | Counts | ❌ | ✔️ | |
| The total number of connections in a connection pool. | Counts | ❌ | ✔️ | |
| The number of connection pools. | Counts | ❌ | ✔️ |
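Two of the preceding metrics, the shared buffer hit ratio and the amount of WAL that a replication slot retains, can be approximated from standard PostgreSQL statistics. The following Python sketch assumes that you read the `blks_hit` and `blks_read` counters from the `pg_stat_database` view and LSN values such as those returned by `pg_current_wal_lsn()` and the `restart_lsn` column of `pg_replication_slots`. How the console computes these metrics is not stated in this topic, so treat the functions as hypothetical helpers.

```python
def shared_buffer_hit_ratio(blks_hit, blks_read):
    """Return the percentage of block requests served from shared buffers.

    blks_hit and blks_read are assumed to come from pg_stat_database.
    Returns None if no blocks were requested.
    """
    total = blks_hit + blks_read
    if total == 0:
        return None
    return 100.0 * blks_hit / total


def parse_lsn(lsn):
    """Convert an LSN such as '16/B374D848' into a byte position.

    An LSN is printed as two hexadecimal numbers separated by a slash:
    the high 32 bits and the low 32 bits of a 64-bit position.
    """
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def retained_wal_mb(current_lsn, restart_lsn):
    """Return the amount of WAL, in MB, between two LSNs.

    This mirrors what pg_wal_lsn_diff() returns in bytes, converted to MB.
    """
    return (parse_lsn(current_lsn) - parse_lsn(restart_lsn)) / (1024 * 1024)


if __name__ == "__main__":
    print(shared_buffer_hit_ratio(blks_hit=990, blks_read=10))
    print(retained_wal_mb("0/2000000", "0/1000000"))
```

A retained WAL amount that keeps growing suggests that the consumer of the replication slot is falling behind, which is the situation that the ReplicationSlot latency description warns about.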
Related operations
Operation | Description |
| Queries the performance data of an instance. |
| Queries the list of available Enhanced Monitoring metrics. |
| Modifies displayed Enhanced Monitoring metrics. |
| Queries enabled Enhanced Monitoring metrics. |