This topic describes the common query commands and metrics that are related to memory.
Linux memory
The value of the MemTotal metric is less than the installed RAM capacity because the BIOS and the kernel initialization phase of the Linux boot process consume memory. You can obtain the value of the MemTotal metric by running the free command.
dmesg | grep Memory
Memory: 131604168K/134217136K available (14346K kernel code, 9546K rwdata, 9084K rodata, 2660K init, 7556K bss, 2612708K reserved, 0K cma-reserved)
You can use the following formula to calculate Linux memory usage based on the data returned by memory query commands such as free:
total = used + free + buff/cache // Total memory = Used memory + Free memory + Cache memory
The used memory includes the memory consumed by the kernel and all processes.
kernel used = Slab + VmallocUsed + PageTables + KernelStack + HardwareCorrupted + Bounce + X
In this formula, X denotes kernel memory that is allocated directly (for example, through alloc_pages) and has no dedicated counter in /proc/meminfo.
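To make the two formulas concrete, the following minimal Go sketch reads /proc/meminfo and recomputes both values. The field selection follows the formulas above; note that the untracked remainder X cannot be read from /proc/meminfo, and that newer versions of free also count SReclaimable toward buff/cache.
// meminfo.go: a minimal sketch that parses /proc/meminfo (values in kB)
// and recomputes the formulas above.
package main

import (
    "bufio"
    "fmt"
    "os"
    "strconv"
    "strings"
)

// readMeminfo returns the /proc/meminfo counters in kB, keyed by field name.
func readMeminfo() (map[string]uint64, error) {
    f, err := os.Open("/proc/meminfo")
    if err != nil {
        return nil, err
    }
    defer f.Close()

    m := make(map[string]uint64)
    sc := bufio.NewScanner(f)
    for sc.Scan() {
        // Lines look like "MemTotal:       131604168 kB".
        fields := strings.Fields(sc.Text())
        if len(fields) < 2 {
            continue
        }
        if v, err := strconv.ParseUint(fields[1], 10, 64); err == nil {
            m[strings.TrimSuffix(fields[0], ":")] = v
        }
    }
    return m, sc.Err()
}

func main() {
    m, err := readMeminfo()
    if err != nil {
        panic(err)
    }
    // total = used + free + buff/cache, so used = total - free - buffers - cache.
    used := m["MemTotal"] - m["MemFree"] - m["Buffers"] - m["Cached"]
    // Kernel-used portion, excluding the untracked remainder X.
    kernel := m["Slab"] + m["VmallocUsed"] + m["PageTables"] +
        m["KernelStack"] + m["HardwareCorrupted"] + m["Bounce"]
    fmt.Printf("used: %d kB (kernel part, excluding X: %d kB)\n", used, kernel)
}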
Process memory
The memory consumed by processes includes the following parts:
The physical memory to which the virtual address space is mapped.
The page cache that is generated for disk reads and writes.
Physical memory to which the virtual address space is mapped
Physical memory: the memory provided by the installed hardware (the RAM capacity).
Virtual memory: the address space that the operating system provides for running programs. Programs run in two modes: user mode and kernel mode.
User mode: the non-privileged mode in which user programs run.
User space consists of the following items:
Stack: the function stack used for function calls
Memory mapping segment (MMap): the area for memory mapping
Heap: the dynamically allocated memory
BSS: the space where uninitialized static variables reside
Data: the space where initialized static variables reside
Text: the space where binary executable codes reside
Programs running in user mode use mmap to map virtual addresses to physical memory.
Kernel mode: the privileged mode in which programs access the kernel data of the operating system.
Kernel space consists of the following items:
Direct mapping space: linearly maps virtual addresses to physical memory.
Vmalloc: the dynamic mapping space of the kernel. Vmalloc maps contiguous virtual addresses to non-contiguous physical memory.
Persistent kernel mapping space: maps virtual addresses to the high memory area of physical memory.
Fixed mapping space: meets specific mapping requirements.
The physical memory to which virtual addresses are mapped is divided into shared memory and exclusive memory. As shown in the following figure, Memory 1 and Memory 3 are exclusively occupied by Process A, Memory 2 is exclusively occupied by Process B, and Memory 4 is shared by Process A and Process B.
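You can observe this address space layout for a running process in /proc/<pid>/maps, where the kernel labels the well-known regions. The following minimal Go sketch prints the [heap] and [stack] regions and the shared-library mmap regions of the current process:
// maps.go: prints the kernel-labeled [heap] and [stack] regions and the
// shared-library mappings of the current process from /proc/self/maps.
package main

import (
    "bufio"
    "fmt"
    "os"
    "strings"
)

func main() {
    f, err := os.Open("/proc/self/maps")
    if err != nil {
        panic(err)
    }
    defer f.Close()

    sc := bufio.NewScanner(f)
    for sc.Scan() {
        line := sc.Text()
        // [heap] and [stack] are anonymous regions; lines ending in a
        // .so path are file-backed mmap regions.
        if strings.Contains(line, "[heap]") ||
            strings.Contains(line, "[stack]") ||
            strings.Contains(line, ".so") {
            fmt.Println(line)
        }
    }
}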
Page cache
Processes can access file data in two ways: by directly mapping files into memory by using mmap, or by using system calls related to buffered I/O, which write data through the page cache. In both cases, the page cache occupies some memory.
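The following minimal Go sketch contrasts the two paths under an assumed temporary file path (/tmp/demo.txt): a buffered write that goes through the page cache, and an mmap mapping that is backed by the same page-cache pages.
// pagecache.go: contrasts buffered I/O (read/write syscalls populate the
// page cache) with mmap (the file's page-cache pages are mapped into the
// process address space).
package main

import (
    "fmt"
    "os"
    "syscall"
)

func main() {
    // Buffered write: data lands in the page cache first and is written
    // back to disk later, so the page cache grows.
    if err := os.WriteFile("/tmp/demo.txt", []byte("hello page cache"), 0644); err != nil {
        panic(err)
    }

    // mmap: map the same file; the mapping is backed by the same
    // page-cache pages and is counted as file_rss once touched.
    f, err := os.Open("/tmp/demo.txt")
    if err != nil {
        panic(err)
    }
    defer f.Close()

    data, err := syscall.Mmap(int(f.Fd()), 0, 16, syscall.PROT_READ, syscall.MAP_SHARED)
    if err != nil {
        panic(err)
    }
    defer syscall.Munmap(data)
    fmt.Printf("first bytes via mmap: %q\n", string(data[:5]))
}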
Process memory metrics
Memory metrics of single processes
Process resources are stored in the following locations:
Anonymous (anonymous pages): memory, such as the heap and stack space used by programs, that has no corresponding file on disk.
File-backed (file pages): resources that are stored in disk files, such as code sections and font files.
The following metrics are related to single-process memory:
anon_rss (RSan): the exclusive memory of all resource types.
file_rss (RSfd): all memory occupied by file-backed resources.
shmem_rss (RSsh): the shared memory of anonymous resources.
The following table describes the commands that are used to query these metrics.
| Command | Metric | Description | Formula |
| --- | --- | --- | --- |
| top | VIRT | The virtual address space. | None |
| top | RES | The physical memory mapped into RSS. | anon_rss + file_rss + shmem_rss |
| top | SHR | The shared memory. | file_rss + shmem_rss |
| top | %MEM | The memory usage. | RES / MemTotal |
| ps | VSZ | The virtual address space. | None |
| ps | RSS | The physical memory mapped into RSS. | anon_rss + file_rss + shmem_rss |
| ps | %MEM | The memory usage. | RSS / MemTotal |
| smem | USS | The exclusive memory. | anon_rss |
| smem | PSS | The memory allocated in proportion to sharing. | anon_rss + file_rss/m + shmem_rss/n |
| smem | RSS | The physical memory mapped into RSS. | anon_rss + file_rss + shmem_rss |
In the PSS formula, m and n indicate the numbers of processes that share the corresponding file-backed and shared memory pages.
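On kernels that split the RSS counters (Linux 4.5 and later expose RssAnon, RssFile, and RssShmem in /proc/<pid>/status), you can read the anon_rss, file_rss, and shmem_rss values directly. A minimal Go sketch:
// procrss.go: reads the per-process RSS breakdown from /proc/<pid>/status.
// VmRSS = RssAnon + RssFile + RssShmem, matching the table above.
package main

import (
    "bufio"
    "fmt"
    "os"
    "strings"
)

func main() {
    pid := "self" // default: inspect the current process
    if len(os.Args) > 1 {
        pid = os.Args[1]
    }
    f, err := os.Open("/proc/" + pid + "/status")
    if err != nil {
        panic(err)
    }
    defer f.Close()

    sc := bufio.NewScanner(f)
    for sc.Scan() {
        line := sc.Text()
        // RssAnon, RssFile, and RssShmem correspond to anon_rss,
        // file_rss, and shmem_rss.
        if strings.HasPrefix(line, "VmRSS") || strings.HasPrefix(line, "RssAnon") ||
            strings.HasPrefix(line, "RssFile") || strings.HasPrefix(line, "RssShmem") {
            fmt.Println(line)
        }
    }
}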
Memory working set size (WSS) is a more reasonable way to evaluate the memory that is required to keep a process running. However, WSS cannot be accurately calculated because of the way Linux page reclaim works.
Memory metrics of cgroups
Control groups (cgroups) are used to limit, account for, and isolate the resource usage of a group of Linux processes. For more information, see the Red Hat Linux 6 documentation.
Cgroups are hierarchically managed. Each hierarchy is attached to one or more subsystems and contains a set of files. The files contain the metrics of the subsystems. For example, the memory control group (memcg) file contains memory metrics.
The memcg file contains the following metrics:
cgroup.event_control # The interface for eventfd notifications.
memory.usage_in_bytes # View the used memory.
memory.limit_in_bytes # Configure or view the current memory limit.
memory.failcnt # View the number of times that the memory usage reaches the limit.
memory.max_usage_in_bytes # View the historical maximum memory usage.
memory.soft_limit_in_bytes # Configure or view the current soft limit of the memory.
memory.stat # View the memory usage of the current cgroup.
memory.use_hierarchy # Specify or check whether the memory usage of child cgroups is included in the memory usage of the current cgroup.
memory.force_empty # Reclaim as much memory as possible from the current cgroup.
memory.pressure_level # Configure notification events for memory pressure. This metric is used with cgroup.event_control.
memory.swappiness # Configure or view the current swappiness value.
memory.move_charge_at_immigrate # Specify whether the memory occupied by a process is moved when the process is moved to another cgroup.
memory.oom_control # Configure or view the OOM control settings.
memory.numa_stat # View the NUMA-related memory usage.
We recommend that you pay attention to the following metrics:
memory.limit_in_bytes: You can use this metric to configure or view the memory limit of the current cgroup. This metric is similar to the memory limit settings of Kubernetes and Docker.
memory.usage_in_bytes: You can use this metric to view the total memory used by all processes in the current cgroup. The value is approximately equal to the sum of the rss and cache values in the memory.stat file.
memory.stat: You can use this metric to view the detailed memory usage of the current cgroup. The fields are described in the following table, and a minimal read sketch follows the table.
| Field in the memory.stat file | Description |
| --- | --- |
| cache | The size of the page cache. |
| rss | The sum of the anon_rss memory of all processes in the cgroup. |
| mapped_file | The sum of the file_rss and shmem_rss memory of all processes in the cgroup. |
| active_anon | The anonymous pages and swap cache on the active Least Recently Used (LRU) list, including tmpfs (shmem). Unit: bytes. |
| inactive_anon | The anonymous pages and swap cache on the inactive LRU list, including tmpfs (shmem). Unit: bytes. |
| active_file | The file-backed memory on the active LRU list. Unit: bytes. |
| inactive_file | The file-backed memory on the inactive LRU list. Unit: bytes. |
| unevictable | The memory that cannot be reclaimed. Unit: bytes. |
Metrics prefixed with total_ apply to the current cgroup and all of its child cgroups. For example, the total_rss metric indicates the sum of the rss value of the current cgroup and the rss values of all of its child cgroups.
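The following minimal Go sketch reads the three recommended files for one cgroup. The cgroup v1 mount point /sys/fs/cgroup/memory is an assumption; adjust the path to the cgroup that you want to inspect.
// memcg.go: reads the recommended memcg metrics for one cgroup.
package main

import (
    "fmt"
    "os"
    "path/filepath"
)

func main() {
    cg := "/sys/fs/cgroup/memory" // cgroup v1 memory controller root (assumed)
    for _, name := range []string{
        "memory.limit_in_bytes",
        "memory.usage_in_bytes",
        "memory.stat",
    } {
        b, err := os.ReadFile(filepath.Join(cg, name))
        if err != nil {
            panic(err)
        }
        fmt.Printf("== %s ==\n%s", name, b)
    }
}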
Summary
The following table lists the differences between single-process metrics and cgroup metrics.
| Metric | Single process | cgroup (memcg) |
| --- | --- | --- |
| RSS | anon_rss + file_rss + shmem_rss | anon_rss |
| mapped_file | None | file_rss + shmem_rss |
| cache | None | page cache |
The rss metric of a cgroup contains only anon_rss, which is similar to the USS metric of a single process. Therefore, the rss value of a cgroup plus its mapped_file value equals the RSS value of a single process.
For a single process, page cache data must be calculated separately. The memory statistics in the memcg files of a cgroup already include page cache data.
Memory statistics in Docker and Kubernetes
Memory statistics in Docker and Kubernetes are similar to Linux memcg statistics, except that the definitions of memory usage are different.
docker stats command
The following figure provides a sample response.
For more information about how to run the docker stats command, see the Docker documentation. Docker calculates the MEM USAGE column by subtracting the page cache from the cgroup memory usage, as shown in the following code:
// calculateMemUsageUnixNoCache calculates the memory usage of a container
// by subtracting the page cache from the total cgroup memory usage.
func calculateMemUsageUnixNoCache(mem types.MemoryStats) float64 {
    return float64(mem.Usage - mem.Stats["cache"])
}
LIMIT is similar to the memory.limit_in_bytes metric of cgroups.
MEM USAGE is similar to the memory.usage_in_bytes - memory.stat[total_cache] value of cgroups.
kubectl top pod command
The kubectl top command uses metrics-server or Heapster to obtain the working_set value from cAdvisor. This value indicates the amount of memory used by a pod, excluding pause containers. The following code shows how metrics-server obtains the memory of a pod. For more information, see the Kubernetes documentation.
func decodeMemory(target *resource.Quantity, memStats *stats.MemoryStats) error {
    if memStats == nil || memStats.WorkingSetBytes == nil {
        return fmt.Errorf("missing memory usage metric")
    }

    *target = *uint64Quantity(*memStats.WorkingSetBytes, 0)
    target.Format = resource.BinarySI

    return nil
}
The following code shows how cAdvisor calculates the working_set value. For more information, see the cAdvisor documentation.
func setMemoryStats(s *cgroups.Stats, ret *info.ContainerStats) {
    ret.Memory.Usage = s.MemoryStats.Usage.Usage
    ret.Memory.MaxUsage = s.MemoryStats.Usage.MaxUsage
    ret.Memory.Failcnt = s.MemoryStats.Usage.Failcnt

    if s.MemoryStats.UseHierarchy {
        ret.Memory.Cache = s.MemoryStats.Stats["total_cache"]
        ret.Memory.RSS = s.MemoryStats.Stats["total_rss"]
        ret.Memory.Swap = s.MemoryStats.Stats["total_swap"]
        ret.Memory.MappedFile = s.MemoryStats.Stats["total_mapped_file"]
    } else {
        ret.Memory.Cache = s.MemoryStats.Stats["cache"]
        ret.Memory.RSS = s.MemoryStats.Stats["rss"]
        ret.Memory.Swap = s.MemoryStats.Stats["swap"]
        ret.Memory.MappedFile = s.MemoryStats.Stats["mapped_file"]
    }

    if v, ok := s.MemoryStats.Stats["pgfault"]; ok {
        ret.Memory.ContainerData.Pgfault = v
        ret.Memory.HierarchicalData.Pgfault = v
    }
    if v, ok := s.MemoryStats.Stats["pgmajfault"]; ok {
        ret.Memory.ContainerData.Pgmajfault = v
        ret.Memory.HierarchicalData.Pgmajfault = v
    }

    // The working set excludes inactive file pages, which the kernel can
    // reclaim without affecting the running workload.
    workingSet := ret.Memory.Usage
    if v, ok := s.MemoryStats.Stats["total_inactive_file"]; ok {
        if workingSet < v {
            workingSet = 0
        } else {
            workingSet -= v
        }
    }
    ret.Memory.WorkingSet = workingSet
}
Therefore, the memory usage queried by running the kubectl top pod command can be calculated by using the following formula: Memory Usage = Memory WorkingSet = memory.usage_in_bytes - memory.stat[total_inactive_file].
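The same working-set value can be reproduced directly from the memcg files. The following minimal Go sketch mirrors the cAdvisor logic above; the cgroup path is an assumption that you must adjust to the pod's cgroup.
// workingset.go: recomputes the kubectl top value for one cgroup:
// working set = memory.usage_in_bytes - memory.stat[total_inactive_file].
package main

import (
    "bufio"
    "fmt"
    "os"
    "strconv"
    "strings"
)

func main() {
    cg := "/sys/fs/cgroup/memory" // adjust to the pod's cgroup path (assumed)

    raw, err := os.ReadFile(cg + "/memory.usage_in_bytes")
    if err != nil {
        panic(err)
    }
    usage, _ := strconv.ParseUint(strings.TrimSpace(string(raw)), 10, 64)

    stat, err := os.Open(cg + "/memory.stat")
    if err != nil {
        panic(err)
    }
    defer stat.Close()

    var inactiveFile uint64
    sc := bufio.NewScanner(stat)
    for sc.Scan() {
        fields := strings.Fields(sc.Text())
        if len(fields) == 2 && fields[0] == "total_inactive_file" {
            inactiveFile, _ = strconv.ParseUint(fields[1], 10, 64)
        }
    }

    // Mirror the cAdvisor logic above: clamp at zero instead of underflowing.
    workingSet := usage
    if workingSet < inactiveFile {
        workingSet = 0
    } else {
        workingSet -= inactiveFile
    }
    fmt.Printf("working set: %d bytes\n", workingSet)
}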
Summary
| Command | Ecosystem | Calculation method of memory usage |
| --- | --- | --- |
| docker stats | Docker | memory.usage_in_bytes - memory.stat[total_cache] |
| kubectl top pod | Kubernetes | memory.usage_in_bytes - memory.stat[total_inactive_file] |
If you use the top and ps commands to query memory metrics, you can derive the memory usage value of each cgroup ecosystem from those metrics by using the following formulas.
| cgroup ecosystem | Formula |
| --- | --- |
| memcg | rss + cache (active cache + inactive cache) |
| Docker | rss |
| Kubernetes | rss + active cache |
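For example (with illustrative numbers): if ps reports a combined rss of 1,000 MB for the processes in a cgroup, and the cgroup's memory.stat shows 150 MB of active cache and 50 MB of inactive cache, then the memcg usage is approximately 1,200 MB, docker stats reports approximately 1,000 MB, and kubectl top pod reports approximately 1,150 MB.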
Java statistics
Virtual address spaces of Java processes
The following figure shows the distribution of data storage in the virtual address spaces of Java processes.
Use JMX to obtain memory metrics
You can obtain the memory metrics of Java processes by using exposed JMX data. For example, you can use JConsole to obtain memory metrics.
Memory data is exposed through MBeans.
The exposed JMX metrics do not cover all the memory of a JVM process. For example, the memory consumed by Java threads is not included. Therefore, the sum of the memory usage data exposed through JMX is not equal to the RSS value of the JVM process.
JMX MemoryUsage tool
JMX exposes MemoryUsage through MemoryPool MBeans. For more information, see the Oracle documentation.
The used metric indicates the physical memory that is consumed.
NMT tool
The Java HotSpot VM provides Native Memory Tracking (NMT) to track native memory usage. For more information, see the Oracle documentation.
NMT is not suitable for production environments because of its performance overhead.
You can use NMT to obtain the following metrics:
jcmd 7 VM.native_memory
Native Memory Tracking:
Total: reserved=5948141KB, committed=4674781KB
- Java Heap (reserved=4194304KB, committed=4194304KB)
(mmap: reserved=4194304KB, committed=4194304KB)
- Class (reserved=1139893KB, committed=104885KB)
(classes #21183)
( instance classes #20113, array classes #1070)
(malloc=5301KB #81169)
(mmap: reserved=1134592KB, committed=99584KB)
( Metadata: )
( reserved=86016KB, committed=84992KB)
( used=80663KB)
( free=4329KB)
( waste=0KB =0.00%)
( Class space:)
( reserved=1048576KB, committed=14592KB)
( used=12806KB)
( free=1786KB)
( waste=0KB =0.00%)
- Thread (reserved=228211KB, committed=36879KB)
(thread #221)
(stack: reserved=227148KB, committed=35816KB)
(malloc=803KB #1327)
(arena=260KB #443)
- Code (reserved=49597KB, committed=2577KB)
(malloc=61KB #800)
(mmap: reserved=49536KB, committed=2516KB)
- GC (reserved=206786KB, committed=206786KB)
(malloc=18094KB #16888)
(mmap: reserved=188692KB, committed=188692KB)
- Compiler (reserved=1KB, committed=1KB)
(malloc=1KB #20)
- Internal (reserved=45418KB, committed=45418KB)
(malloc=45386KB #30497)
(mmap: reserved=32KB, committed=32KB)
- Other (reserved=30498KB, committed=30498KB)
(malloc=30498KB #234)
- Symbol (reserved=19265KB, committed=19265KB)
(malloc=16796KB #212667)
(arena=2469KB #1)
- Native Memory Tracking (reserved=5602KB, committed=5602KB)
(malloc=55KB #747)
(tracking overhead=5546KB)
- Shared class space (reserved=10836KB, committed=10836KB)
(mmap: reserved=10836KB, committed=10836KB)
- Arena Chunk (reserved=169KB, committed=169KB)
(malloc=169KB)
- Tracing (reserved=16642KB, committed=16642KB)
(malloc=16642KB #2270)
- Logging (reserved=7KB, committed=7KB)
(malloc=7KB #267)
- Arguments (reserved=19KB, committed=19KB)
(malloc=19KB #514)
- Module (reserved=463KB, committed=463KB)
(malloc=463KB #3527)
- Synchronizer (reserved=423KB, committed=423KB)
(malloc=423KB #3525)
- Safepoint (reserved=8KB, committed=8KB)
(mmap: reserved=8KB, committed=8KB)
The JVM is divided into memory areas that serve different purposes, such as the Java heap and class areas, plus additional memory blocks. In addition, the data exposed through JMX does not include the memory usage of threads. However, Java programs often run hundreds or even thousands of threads, whose stacks consume a large amount of memory.
For more information about HotSpot memory types, see the Corretto documentation.
Reserved and Committed metrics
NMT statistics include the Reserved and Committed metrics. However, neither Reserved nor Committed maps directly to the physical memory that is used.
The following figure shows how the Reserved and Committed ranges of the virtual address space map to physical addresses. The Committed value is always greater than or equal to the Used value, and the Used value is what corresponds most closely to the RSS of the JVM process.
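The distinction can be illustrated outside the JVM with anonymous mmap: reserving address space consumes no physical memory, committing makes the range usable, and only touched pages count toward RSS. The following Go sketch illustrates the general mechanism; it is not a description of JVM internals.
// reserve_commit.go: illustrates Reserved vs Committed vs Used.
package main

import (
    "fmt"
    "syscall"
)

func main() {
    const size = 64 << 20 // 64 MiB

    // Reserve: address space only, no access rights, no physical memory.
    mem, err := syscall.Mmap(-1, 0, size,
        syscall.PROT_NONE, syscall.MAP_PRIVATE|syscall.MAP_ANONYMOUS)
    if err != nil {
        panic(err)
    }
    defer syscall.Munmap(mem)

    // Commit: make the range readable and writable. RSS is still near
    // zero until the pages are touched.
    if err := syscall.Mprotect(mem, syscall.PROT_READ|syscall.PROT_WRITE); err != nil {
        panic(err)
    }

    // Use: touching pages faults them in; only now does RSS grow.
    for i := 0; i < size; i += 4096 {
        mem[i] = 1
    }
    fmt.Println("reserved, committed, and touched 64 MiB")
}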
Summary
The metrics collected by common Java application tools are mainly exposed through JMX. JMX exposes some memory pools that can be tracked within the JVM, but the sum of these memory pools does not correspond to the RSS value of the JVM process.
NMT exposes the details of the internal memory usage of the JVM, but it measures Committed memory rather than Used memory. Therefore, the total Committed value may be slightly greater than the RSS value.
NMT cannot track some memory outside the JVM. For example, if a Java program allocates memory through additional malloc calls in native code, NMT does not track that memory. In this case, the RSS value is greater than the memory usage reported by NMT.
If the Committed value obtained from NMT and the RSS value differ significantly, a memory leak may have occurred.
You can use other NMT metrics for further troubleshooting:
Use the NMT baseline and diff features to troubleshoot issues in JVM memory areas.
Use NMT together with pmap to troubleshoot memory issues outside the JVM.