Application Real-Time Monitoring Service: Use flame graphs to locate performance bottlenecks

Last Updated: Apr 09, 2024

The continuous profiling feature of Application Real-Time Monitoring Service (ARMS) Application Monitoring works out of the box. It helps you find bottlenecks caused by CPU, memory, or I/O in Java programs so that you can optimize programs, reduce latency, increase throughput, and save costs. This topic describes how to use the flame graphs of the continuous profiling feature to locate performance bottlenecks.

Prerequisites

The continuous profiling feature is enabled. For more information, see Use the continuous profiling feature.

Definition

Flame graphs visualize program performance. They display the function calls of a program together with the time that each call consumes, which helps developers trace where the time is spent.

A flame graph consists of an x-axis, a y-axis, and multiple boxes. Each box represents a function in the call stack. The x-axis measures the proportion of resources that a function consumes, so a wider box means higher usage. The y-axis measures the depth of a function in the call stack. By comparing flame graphs captured at different points in time, you can efficiently diagnose and handle the performance bottlenecks of a program.
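
The relationship between sampled call stacks and box widths can be illustrated with a minimal sketch. It assumes that a profiler delivers each sample as a list of frame names ordered from the stack bottom to the stack top. The class name FlameBoxes and the method widths are hypothetical and are not part of ARMS.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper (not an ARMS API): turns sampled call stacks into flame graph boxes. */
class FlameBoxes {

    /**
     * Each sample is one call stack, ordered from the stack bottom (entry point) to the stack top.
     * Returns, for every call path, the share of samples in which that path appears.
     * The share corresponds to the box width (x-axis); the number of frames in the
     * path corresponds to the depth (y-axis).
     */
    static Map<String, Double> widths(List<List<String>> samples) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (List<String> stack : samples) {
            StringBuilder path = new StringBuilder();
            for (String frame : stack) {
                if (path.length() > 0) {
                    path.append(';');
                }
                path.append(frame);
                // Every prefix of the stack is a box that this sample contributes to.
                counts.merge(path.toString(), 1, Integer::sum);
            }
        }
        Map<String, Double> result = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            result.put(e.getKey(), (double) e.getValue() / samples.size());
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up samples for illustration only.
        List<List<String>> samples = List.of(
                List.of("main", "readFile", "get"),
                List.of("main", "readFile", "get"),
                List.of("main", "readFile"),
                List.of("main", "invokeAPI"));
        widths(samples).forEach((path, share) ->
                System.out.printf("%-20s depth=%d width=%.0f%%%n",
                        path, path.split(";").length, share * 100));
    }
}
```

In this sketch, a function that appears in more samples gets a proportionally wider box, which is why wide boxes are the first place to look.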

Categories

Flame graphs are classified into two categories: flame graph (narrow sense) and icicle graph. In a flame graph in the narrow sense, the stack top is at the top of the graph and the stack bottom is at the bottom, as shown in Figure 1. In an icicle graph, the layout is inverted: the stack top is at the bottom of the graph and the stack bottom is at the top, as shown in Figure 2.

Figure 1. Flame graph (narrow sense)

image

Figure 2. Icicle graph

image

Use a flame graph

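A flame graph represents call stacks, and the width of each box is proportional to the resources that the function consumes. Functions with wide boxes therefore consume more resources, such as CPU time, than functions with narrow boxes.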

In computer science, a stack is an abstract data type that serves as a collection of elements with two main operations: Push and Pop. A Push operation inserts an element at the top of the stack, and a Pop operation removes the element at the top. In a call stack, the stack bottom contains the functions that are called first, and the stack top contains the child functions that are called most recently. When the child function at the top finishes executing, it is removed from the stack. The time consumed by a function is included in the time consumed by its parent function. The longer a child function runs, the more time its parent function consumes and the wider the boxes of both functions are, as shown in the following figure.
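
The following minimal sketch shows why a parent box is always at least as wide as its child boxes. It assumes that each function has a known "self" time, that is, time spent in its own code, and that the total time is the self time plus the total time of every child. The class StackFrameNode and the numbers are hypothetical.

```java
import java.util.List;

/** Hypothetical model of one box in a flame graph (not an ARMS API). */
class StackFrameNode {
    final String name;
    final double selfSeconds;            // time spent in the function's own code
    final List<StackFrameNode> children; // functions that this function called

    StackFrameNode(String name, double selfSeconds, StackFrameNode... children) {
        this.name = name;
        this.selfSeconds = selfSeconds;
        this.children = List.of(children);
    }

    /** Total time = own time plus the total time of all child functions. */
    double totalSeconds() {
        double total = selfSeconds;
        for (StackFrameNode child : children) {
            total += child.totalSeconds();
        }
        return total;
    }

    public static void main(String[] args) {
        // Made-up numbers for illustration only.
        StackFrameNode leaf = new StackFrameNode("child", 2.0);
        StackFrameNode parent = new StackFrameNode("parent", 0.25, leaf);
        StackFrameNode root = new StackFrameNode("root", 0.5, parent);
        System.out.println(root.name + " total seconds: " + root.totalSeconds());
        System.out.println(parent.name + " total seconds: " + parent.totalSeconds());
        System.out.println(leaf.name + " total seconds: " + leaf.totalSeconds());
    }
}
```

A wide box near the stack bottom is therefore expected. What matters is which function near the stack top contributes most of that width.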

image

You can perform the following steps to analyze a flame graph:

  1. Determine where the stack top is based on the flame graph type.

  2. If the total resource usage of the flame graph is high, check whether the stack top has wide boxes.

  3. If the stack top has a wide box, search from the stack top toward the stack bottom, find the first function defined by the application, and then check whether the function can be optimized.
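
Step 3 can be sketched in code. The sketch assumes that the stack under a wide box is available as a list of frame names ordered from the stack top to the stack bottom, and that application code can be recognized by a package prefix. The class HotFrameFinder and the prefix com.alibaba.cloud. are assumptions made for illustration. The frame names are taken from the example in this topic.

```java
import java.util.List;
import java.util.Optional;

/** Hypothetical helper (not an ARMS API) that mirrors step 3 of the analysis. */
class HotFrameFinder {

    /**
     * Walks from the stack top toward the stack bottom and returns the first
     * frame whose name starts with the application's package prefix.
     */
    static Optional<String> firstApplicationFrame(List<String> stackTopFirst,
                                                  String appPackagePrefix) {
        return stackTopFirst.stream()
                .filter(frame -> frame.startsWith(appPackagePrefix))
                .findFirst();
    }

    public static void main(String[] args) {
        List<String> stackTopFirst = List.of(
                "java.util.LinkedList.node(int)",
                "java.util.LinkedList.get(int)",
                "com.alibaba.cloud.pressure.memory.HotSpotAction.readFile()");
        // The prefix is an assumption; use the package of your own application.
        System.out.println(firstApplicationFrame(stackTopFirst, "com.alibaba.cloud.")
                .orElse("no application frame found"));
    }
}
```

Library frames are skipped because you usually cannot change them. The first application frame is the place where a change to your own code can reduce the cost.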

Example

The following figure shows a flame graph with high resource usage. Perform the following steps to discover performance bottlenecks.

image

  1. The graph is an icicle graph, with the stack top at the bottom and the stack bottom at the top. Therefore, analyze it from the bottom up.

  2. The java.util.LinkedList.node(int) function on the right side of the stack top has a wide box.

  3. Because java.util.LinkedList.node(int) is a library function of the Java Development Kit (JDK), continue searching upward. You find the java.util.LinkedList.get(int) function and then its parent function, com.alibaba.cloud.pressure.memory.HotSpotAction.readFile(). As the first function defined by the application, com.alibaba.cloud.pressure.memory.HotSpotAction.readFile() consumes 3.89 seconds, which accounts for 76.06% of the stack. This indicates that com.alibaba.cloud.pressure.memory.HotSpotAction.readFile() consumes a large amount of resources in the specified time period. Start from this function to analyze the related logic and check whether it can be optimized.

    In addition, if you search upward from the java.net.SocketInputStream frames in the lower-left corner of the flame graph, the first parent function defined by the application is com.alibaba.cloud.pressure.memory.HotSpotAction.invokeAPI, which accounts for about 23% of the stack.
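
The source code of readFile() is not shown in this topic, so the following is only a hypothetical sketch of one common pattern that makes java.util.LinkedList.node(int) hot: index-based access to a LinkedList inside a loop. Each get(int) call walks the list from one end, so the loop as a whole takes time proportional to the square of the list size, whereas iterating over the list takes linear time. The class and method names are made up for illustration.

```java
import java.util.LinkedList;
import java.util.List;

/** Hypothetical illustration; this is not the actual HotSpotAction.readFile() code. */
class LinkedListAccess {

    /** Index-based loop: on a LinkedList, every get(i) traverses nodes to reach index i. */
    static long sumByIndex(List<Integer> values) {
        long sum = 0;
        for (int i = 0; i < values.size(); i++) {
            sum += values.get(i);
        }
        return sum;
    }

    /** Iterator-based loop: each element is visited once, without repeated traversal. */
    static long sumByIterator(List<Integer> values) {
        long sum = 0;
        for (int value : values) {
            sum += value;
        }
        return sum;
    }

    public static void main(String[] args) {
        List<Integer> values = new LinkedList<>();
        for (int i = 0; i < 20_000; i++) {
            values.add(i);
        }

        long start = System.nanoTime();
        long byIndex = sumByIndex(values);
        long indexNanos = System.nanoTime() - start;

        start = System.nanoTime();
        long byIterator = sumByIterator(values);
        long iteratorNanos = System.nanoTime() - start;

        System.out.println("same result: " + (byIndex == byIterator));
        System.out.println("index-based loop (ms): " + indexNanos / 1_000_000);
        System.out.println("iterator-based loop (ms): " + iteratorNanos / 1_000_000);
    }
}
```

If the real function follows a similar pattern, iterating over the list or switching to an index-friendly structure such as java.util.ArrayList are possible optimizations. Profile again after the change to confirm that the box becomes narrower.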