Use flame graphs to locate performance bottlenecks - Application Real-Time Monitoring Service

The out-of-the-box continuous profiling feature of Application Real-Time Monitoring Service (ARMS) Application Monitoring effectively discovers bottlenecks caused by CPU, memory, or I/O in Java programs, and helps developers optimize programs, reduce latency, increase throughput, and save costs. This topic describes how to use flame graphs of the continuous profiling feature to locate performance bottlenecks.

Prerequisites

The continuous profiling feature is enabled. For more information, see Use the continuous profiling feature.

Definition

A flame graph visualizes program performance. It shows the function calls of a program and the time that each call consumes, which helps developers trace where resources are spent.


A flame graph consists of an x-axis, a y-axis, and multiple boxes. Each box represents a function in the stack. The width of a box along the x-axis indicates the proportion of resource usage of the function, and the position of a box along the y-axis indicates the depth of the function in the call stack. By comparing flame graphs at different points in time, you can efficiently diagnose and resolve the performance bottlenecks of a program.
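To make the relationship between sampled stacks and box widths concrete, the following minimal sketch folds a list of stack samples into per-box proportions. It is an illustration only, not how ARMS builds its flame graphs: the class name FlameGraphWidths, the method boxWidths, and the sample frames are hypothetical, and each box is identified by its call path written from stack bottom to stack top.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, for illustration only.
class FlameGraphWidths {
    /**
     * Each sample lists frames from the stack bottom (first caller) to the stack top.
     * Returns, for every call path seen, the fraction of samples that contain it.
     * That fraction corresponds to the width of the box for that path.
     */
    static Map<String, Double> boxWidths(List<List<String>> samples) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (List<String> stack : samples) {
            StringBuilder path = new StringBuilder();
            for (String frame : stack) {
                if (path.length() > 0) {
                    path.append(';');
                }
                path.append(frame);
                // Every prefix of the stack is one box; count one hit per sample.
                counts.merge(path.toString(), 1, Integer::sum);
            }
        }
        Map<String, Double> widths = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            widths.put(entry.getKey(), entry.getValue() / (double) samples.size());
        }
        return widths;
    }
}
```

For example, with the four samples [main, readFile, get], [main, readFile, get], [main, readFile], and [main, invokeAPI], the box for main spans 100% of the width, main;readFile spans 75%, main;readFile;get spans 50%, and main;invokeAPI spans 25%. A parent box is always at least as wide as the boxes of the functions it calls.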

Use a flame graph

Because a flame graph represents call stacks, a function with a wide box consumes more of the profiled resource, such as CPU time, than a function with a narrow box.

In computer science, a stack is an abstract data type that serves as a collection of elements with two main operations: Push and Pop. A Push operation inserts an element at the top of the stack, and a Pop operation removes the element at the top of the stack. In a call stack, the stack bottom contains the functions that are called first, and the stack top contains the child functions that are called most recently. When the function at the stack top finishes executing, it is removed from the stack. The more time a function takes to execute, the more time its parent function takes and the wider the boxes of both functions, as shown in the following figure.

You can perform the following steps to analyze a flame graph:

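Find the stack top based on the flame graph type. In a flame graph, the stack top is at the top. In an icicle graph, the stack top is at the bottom.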
If the total resource usage of the flame graph is high, check whether the stack top has wide boxes.
If the stack top has a wide box, search from top to bottom, find the first function defined by the application, and then check whether the function can be optimized.
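The last step, searching from the stack top toward the stack bottom for the first function defined by the application, can be sketched as follows. This is a minimal illustration under stated assumptions: the class FirstApplicationFrame, its find method, and the com.example.app package prefix are hypothetical, and application code is assumed to be recognizable by its package prefix.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical helper, for illustration only.
class FirstApplicationFrame {
    /**
     * framesFromTop lists the frames of one stack, starting at the stack top
     * (the most recently called function) and ending at the stack bottom.
     * Returns the first frame whose name starts with the application's
     * package prefix, or an empty Optional if no frame matches.
     */
    static Optional<String> find(List<String> framesFromTop, String appPackagePrefix) {
        return framesFromTop.stream()
                .filter(frame -> frame.startsWith(appPackagePrefix))
                .findFirst();
    }
}
```

For a stack that lists java.util.LinkedList.node(int), java.util.LinkedList.get(int), com.example.app.ReportService.load(), and com.example.app.Main.main(String[]) from top to bottom, calling find with the prefix com.example.app. skips the two JDK frames and returns com.example.app.ReportService.load(). That function is the place to start looking for an optimization.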

Example

The following figure shows a flame graph with high resource usage. Perform the following steps to discover performance bottlenecks.

Because this is an icicle graph, with the stack top at the bottom and the stack bottom at the top, you need to analyze it from the bottom up.
The java.util.LinkedList.node(int) function on the right side of the stack top has a wide box.
Because the java.util.LinkedList.node(int) function is a library function of Java Development Kit (JDK), you need to search further up, where you can find the java.util.LinkedList.get(int) function and its parent function com.alibaba.cloud.pressure.memory.HotSpotAction.readFile(). As the first service function defined by the application, the com.alibaba.cloud.pressure.memory.HotSpotAction.readFile() function consumes 3.89 seconds, accounting for 76.06% of the stack. You can therefore conclude that the com.alibaba.cloud.pressure.memory.HotSpotAction.readFile() function consumes a large amount of resources in the specified time period. Start from this function, analyze its logic and the logic of the functions that it calls, and check whether they can be optimized.
In addition, starting from the java.net.SocketInputStream function in the lower-left corner of the flame graph, you can find that the first parent function defined by the application is com.alibaba.cloud.pressure.memory.HotSpotAction.invokeAPI, which accounts for about 23% of the stack.
