Collect performance data reported by an Alibaba Cloud Java agent from Java programs

Last Updated: Apr 12, 2024

Continuous Profiler Agent is an Alibaba Cloud Java agent that is developed by the JVM team of Alibaba Cloud to collect performance data. Continuous Profiler Agent has been tested in large-scale production environments. It provides high performance and high stability. You can use Logtail to collect performance data reported by Continuous Profiler Agent from Java programs to the Full-stack Observability application for visualized monitoring and analysis.

Prerequisites

A Full-stack Observability instance is created. For more information, see Create an instance.

Limits

  • Only Linux Logtail V1.7 or later is supported.

  • The following Linux distributions are supported: CentOS, Red Hat, Alibaba Cloud Linux, Ubuntu, and Debian. The kernel version must be 2.6.32-431.23.3.el6.x86_64 or later. Both the GNU C Library (glibc) and the musl C library are supported.

  • The supported JDK versions depend on the engine type and on the type of data that is profiled (CPU or memory):

    AUTO engine

    • CPU

      • OpenJDK 8u272 and later, JDK 11, and JDK 17 are supported.

      • OracleJDK 11 and OracleJDK 17 are supported.

      • OracleJDK 8 is not supported.

    • Memory

      • OpenJDK 8u352 and later, OpenJDK 11.0.17 and later, and OpenJDK 17.0.5 and later are supported.

      • OracleJDK 11.0.21 and later, and OracleJDK 17.0.9 and later are supported.

      • OracleJDK 8 is not supported.

    async_profiler engine

    • CPU: OpenJDK 8, OpenJDK 11, OpenJDK 17, OracleJDK 8, OracleJDK 11, and OracleJDK 17 are supported.

    • Memory: OpenJDK 8, OpenJDK 11, OpenJDK 17, OracleJDK 8, OracleJDK 11, and OracleJDK 17 are supported.
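Before you choose an engine, it can help to confirm which JDK a host actually runs. The following is a minimal sketch, not part of the agent: the helper name jdk_major is hypothetical, and it only extracts the major version from the first line of the java -version output. The update level (for example, 8u272 versus 8u352) still has to be compared against the limits above by hand.

```shell
#!/bin/sh
# Hypothetical helper: extract the major Java version from a `java -version` line.
# Handles the legacy "1.8.0_352" scheme and the modern "17.0.5" scheme.
jdk_major() {
    # $1: a version line such as: openjdk version "1.8.0_352"
    ver=$(printf '%s\n' "$1" | sed -n 's/.*version "\([^"]*\)".*/\1/p')
    case "$ver" in
        1.*) printf '%s\n' "$ver" | cut -d. -f2 ;;   # 1.8.0_352 -> 8
        *)   printf '%s\n' "$ver" | cut -d. -f1 ;;   # 17.0.5    -> 17
    esac
}

# Sample inputs; on a real host, java -version prints to stderr, so use:
#   jdk_major "$(java -version 2>&1 | head -n 1)"
jdk_major 'openjdk version "1.8.0_352"'
jdk_major 'openjdk version "17.0.5" 2022-10-18'
```

The two sample calls print 8 and 17. Whether the JDK is an OpenJDK or OracleJDK build is visible in the remaining lines of the java -version output and is not checked by this sketch.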

Resource consumption

In most scenarios, the performance overhead for Java programs is less than 5%.

Step 1: Create a Logtail configuration

  1. Log on to the Simple Log Service console.

  2. In the Log Application section, click the Intelligent O&M tab. Then, click Full-stack Observability.

  3. On the Simple Log Service Full-stack Observability page, click the instance that you want to manage.

  4. In the left-side navigation pane, click Performance Monitoring.

    If this is the first time you use Performance Monitoring in the instance, click Enable.

  5. In the left-side navigation tree, click Data Import. On the Data Access Configurations page, find Common Push Import in the Performance Monitoring section.

    The first time you create a Logtail configuration for this type of performance data, turn on the switch to go to the configuration page. If you have already created a Logtail configuration, click the Create icon to go to the configuration page.

  6. Create a machine group.

    • If a machine group is available, click Use Existing Machine Groups.

    • If no machine groups are available, perform the following steps:

      1. Check your server type.

        • If you use an Elastic Compute Service (ECS) instance that belongs to the same Alibaba Cloud account as Simple Log Service, click the ECS Instances tab, select Manually Select Instances and your ECS instance, and then click Create.

          For more information, see Install Logtail on ECS instances.

        • If your server is an ECS instance that belongs to another Alibaba Cloud account, a server provided by a third-party cloud service provider, or a server deployed in a self-managed data center, you must manually install Linux Logtail V1.7 or later on the server. For more information, see Install Logtail on a Linux server.

          Important

          After you manually install Logtail, you must configure a user identifier for the server. For more information, see Configure a user identifier.

        • If you use a Kubernetes cluster, install Logtail components by following the instructions in Collect monitoring data about Kubernetes resources.

      2. After Logtail is installed, click Complete Installation.

      3. In the Create Machine Group step, configure the Name parameter and click Next.

        Simple Log Service allows you to create IP address-based machine groups and custom identifier-based machine groups. For more information, see Create an IP address-based machine group and Create a custom identifier-based machine group.

        Important

        If you install Logtail in a Kubernetes cluster, a machine group named in the {instanceId}-{clusterId}-k8s-cluster format is automatically generated. You can skip this step.

  7. In the Machine Group Settings step, move your server from the Source Server Groups section to the Applied Server Groups section and click Next.

    Important

    If you enable a machine group immediately after you create the machine group, the heartbeat status of the machine group may be FAIL. This issue occurs because the machine group is not connected to Simple Log Service. To resolve this issue, you can click Automatic Retry. If the issue persists, see What do I do if a Logtail machine group has no heartbeats?

  8. In the Specify Data Source step, configure the parameters and click Complete. The following table describes the parameters.

    Parameter

    Description

    Config Name

    The name of the Logtail configuration. You can enter a custom name.

    Cluster

    The name of the cluster. You can enter a custom name.

    After you configure this parameter, Simple Log Service adds a cluster=<Cluster name> tag to the performance data that is collected by using the Logtail configuration.

    Important

    Make sure that the cluster name is unique. Otherwise, data conflicts may occur.

    Address

    The address for data collection. The default value is http://:4040, where 4040 is the default port of Pyroscope. If you retain the default value, the HTTP server uses the local address.

    • If you use an ECS instance, specify the value in the following format: IP address of the ECS instance:4040.

    • If you use a server that resides in a Kubernetes cluster, set the value to logtail-kubernetes-metrics.sls-monitoring:4040.

    • If you use a server that is from a third-party cloud service provider or a data center, specify the value in the following format: IP address of the server:4040.

    Endpoint

    The default endpoint of Pyroscope. Default value: /ingest.

    Read Timeout Period

    The timeout period for data read operations. Default value: 10. Unit: seconds.

    Maximum Body Size

    The maximum size of data that can be collected.

After you configure the settings, Simple Log Service automatically creates assets such as Metricstores. For more information, see Assets.
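The Address and Endpoint parameters together describe where Logtail listens for pushed data. The following sketch assumes that the full listener URL is simply the address followed by the endpoint; the helper name ingest_url and the sample IP address 192.0.2.10 are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: compose the listener URL from the Address and Endpoint settings.
# Defaults follow the table above: port 4040 and endpoint /ingest.
ingest_url() {
    host=$1
    port=${2:-4040}
    endpoint=${3:-/ingest}
    printf 'http://%s:%s%s\n' "$host" "$port" "$endpoint"
}

ingest_url 192.0.2.10                                  # ECS instance or other server
ingest_url logtail-kubernetes-metrics.sls-monitoring   # Kubernetes cluster

# Optional reachability probe from the application host. The status code that
# the listener returns for a plain GET request is not specified in this topic.
#   curl -s -o /dev/null -w '%{http_code}\n' "$(ingest_url 192.0.2.10)"
```

The two calls print http://192.0.2.10:4040/ingest and http://logtail-kubernetes-metrics.sls-monitoring:4040/ingest.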

Step 2: Download a Java agent

  • Regions in China

    wget https://logtail-release-cn-hangzhou.oss-cn-hangzhou.aliyuncs.com/jvm/continuous-profile-collector-agent-1.9.0.jar
  • Regions outside China

    wget https://logtail-release-ap-southeast-1.oss-ap-southeast-1.aliyuncs.com/jvm/continuous-profile-collector-agent-1.9.0.jar
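If the download is scripted, the region choice can be made explicit. The following sketch only wraps the two URLs listed above; the function name agent_url and its china and intl arguments are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the agent download URL for a region group.
# Only the two URLs listed in this step are known; no other versions are assumed.
agent_url() {
    case "$1" in
        china) bucket=logtail-release-cn-hangzhou.oss-cn-hangzhou ;;
        intl)  bucket=logtail-release-ap-southeast-1.oss-ap-southeast-1 ;;
        *)     echo "usage: agent_url china|intl" >&2; return 1 ;;
    esac
    printf 'https://%s.aliyuncs.com/jvm/continuous-profile-collector-agent-1.9.0.jar\n' "$bucket"
}

agent_url china

# To download the file:
#   wget "$(agent_url intl)"
```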

Step 3: Configure a Java program to push performance data

  • Configure the Java program by using JVM parameters

    java \
    -Dprofiling.app.name=your_service_name \
    -Dprofiling.agent.upload.server="http://{host}:{port}" \
    -Dprofiling.cpu.engine={engine} \
    -javaagent:{path for javaagent} \
    -jar demo.jar

    Parameter

    Description

    profiling.app.name

    The name of the service.

    profiling.agent.upload.server

    The address for data upload.

    • If you use an ECS instance, specify the value in the following format: IP address of the ECS instance:4040.

    • If you use a server that resides in a Kubernetes cluster, set the value to logtail-kubernetes-metrics.sls-monitoring:4040.

    • If you use a server that is from a third-party cloud service provider or a data center, specify the value in the following format: IP address of the server:4040.

    profiling.cpu.engine

    The engine used for CPU hotspot monitoring. Default value: off. Valid values: auto, async_profiler, jfr, and off.

    The value off specifies that CPU hotspot monitoring is disabled. Other values specify that CPU hotspot monitoring is enabled. We recommend that you set the value to auto.
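As an illustration, the placeholders in the command above can be filled in as follows. This is a sketch under stated assumptions: the service name order-service, the IP address 192.0.2.10, and the JAR path under /opt/agent are hypothetical values, and the helper profiling_jvm_opts is not part of the agent.

```shell
#!/bin/sh
# Hypothetical values; replace them with your own.
APP_NAME=order-service
LOGTAIL_HOST=192.0.2.10
AGENT_JAR=/opt/agent/continuous-profile-collector-agent-1.9.0.jar

# Print one JVM option per line, using the recommended auto engine for CPU.
profiling_jvm_opts() {
    printf '%s\n' \
        "-Dprofiling.app.name=$APP_NAME" \
        "-Dprofiling.agent.upload.server=http://$LOGTAIL_HOST:4040" \
        "-Dprofiling.cpu.engine=auto" \
        "-javaagent:$AGENT_JAR"
}

profiling_jvm_opts

# To launch the program (relies on word splitting, so values must not contain spaces):
#   java $(profiling_jvm_opts) -jar demo.jar
```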

  • Configure the Java program by using environment variables

    export PROFILING_APP_NAME="your_service_name"
    export PROFILING_AGENT_UPLOAD_SERVER="http://{host}:{port}"
    export PROFILING_CPU_ENGINE="{engine}"
    export PROFILING_ALLOC_ENGINE="{engine}"
    

    Parameter

    Description

    PROFILING_APP_NAME

    The name of the service.

    PROFILING_AGENT_UPLOAD_SERVER

    The address for data upload.

    • If you use an ECS instance, specify the value in the following format: IP address of the ECS instance:4040.

    • If you use a server that resides in a Kubernetes cluster, set the value to logtail-kubernetes-metrics.sls-monitoring:4040.

    • If you use a server that is from a third-party cloud service provider or a data center, specify the value in the following format: IP address of the server:4040.

    PROFILING_CPU_ENGINE

    The engine used for CPU hotspot monitoring. Default value: off. Valid values: auto, async_profiler, jfr, and off.

    The value off specifies that CPU hotspot monitoring is disabled. Other values specify that CPU hotspot monitoring is enabled. We recommend that you set the value to auto.
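Because an unsupported engine value is easy to mistype, a startup script can validate it before exporting. The following sketch uses a hypothetical helper named set_profiling_engine and accepts only the values listed as valid for the CPU and Alloc engines (auto, async_profiler, jfr, and off). It does not cover the wall clock engine, whose valid values differ.

```shell
#!/bin/sh
# Hypothetical helper: export an engine variable only if the value is valid.
set_profiling_engine() {
    # $1: variable name (PROFILING_CPU_ENGINE or PROFILING_ALLOC_ENGINE)
    # $2: engine value
    case "$2" in
        auto|async_profiler|jfr|off) export "$1=$2" ;;
        *) echo "invalid engine for $1: $2" >&2; return 1 ;;
    esac
}

set_profiling_engine PROFILING_CPU_ENGINE auto
set_profiling_engine PROFILING_ALLOC_ENGINE auto
echo "$PROFILING_CPU_ENGINE $PROFILING_ALLOC_ENGINE"
```

The final line prints auto auto. An invalid value leaves the variable unchanged and returns a non-zero status, so the script can stop before the Java program starts.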

    Remarks

    JVM parameter

    Environment variable

    Description

    profiling.app.name

    PROFILING_APP_NAME

    The name of the application.

    profiling.agent.upload.server

    PROFILING_AGENT_UPLOAD_SERVER

    The address of the server to which the Java Flight Recorder (JFR) file is uploaded. Default value: http://localhost:4040.

    • Do not start the address with http. The system automatically adds the http prefix to the address.

    • Do not end the address with a forward slash (/). The system automatically appends a forward slash (/) to the address.

    profiling.agent.timeout

    PROFILING_AGENT_TIMEOUT

    The timeout period for uploading the JFR file. Default value: 10. Unit: seconds.

    profiling.agent.ingest.max.tries

    PROFILING_AGENT_INGEST_MAX_TRIES

    The maximum number of retries that are allowed for uploading the JFR file. Default value: 2.

    profiling.app.http.headers

    PROFILING_APP_HTTP_HEADERS

    The HTTP headers that are sent when the JFR file is uploaded. Separate multiple headers with semicolons (;). This parameter is empty by default. Example: SESSION_ID=1111;XXX=YYY.

    profiling.app.labels

    PROFILING_APP_LABELS

    The tags that are added to the JFR file when the file is uploaded. Separate multiple tags with semicolons (;). This parameter is empty by default. Example: env=dev;lang=java;biz=member.

    profiling.agent.log.level

    PROFILING_AGENT_LOG_LEVEL

    The log level. Default value: info. Valid values: info, debug, and error.

    profiling.agent.log.file

    PROFILING_AGENT_LOG_FILE

    The path to the log file. Example: /path/to/profiling.log. By default, logs are written to Java stdout and stderr.

    profiling.period

    PROFILING_PERIOD

    The interval at which performance data is uploaded. Default value: 1. Unit: minutes.

    profiling.delay

    PROFILING_DELAY

    The delay before performance monitoring starts. Unit: seconds. Default value: 0, which indicates that performance monitoring starts immediately after the performance monitoring engine is enabled. If you set the value to N, performance monitoring starts N seconds after the performance monitoring engine is enabled.

    profiling.start.at.zero.second

    PROFILING_START_AT_ZERO_SECOND

    Specifies whether to start performance monitoring at the 0th second of a minute. If you set the value to true, the system waits until the next minute begins before it starts performance monitoring. For example, if the value is set to true and the current time is at the 30th second of the current minute, the system waits 30 seconds before it starts performance monitoring.

    Default value: false.

    profiling.compression.mode

    PROFILING_COMPRESSION_MODE

    The compression mode. Default value: none. Valid values: gzip and none.

    • none: The file is not compressed and is suffixed with .jfr.

    • gzip: The file is compressed and is suffixed with .jfr.gzip.

    profiling.trigger.mode

    PROFILING_TRIGGER_MODE

    The trigger mode. You can trigger periodic or one-time performance monitoring. Default value: periodic. Valid values: periodic and api.

    We recommend that you set the value to periodic in agent mode.

    profiling.output.format

    PROFILING_OUTPUT_FORMAT

    The format of the file. Default value: jfr. Valid values: jfr and collapsed.

    profiling.cpu.engine

    PROFILING_CPU_ENGINE

    The engine used for CPU hotspot monitoring. Default value: off. Valid values: auto, async_profiler, jfr, and off.

    The value off specifies that CPU hotspot monitoring is disabled. Other values specify that CPU hotspot monitoring is enabled. We recommend that you set the value to auto.

    profiling.cpu.interval

    PROFILING_CPU_INTERVAL

    The interval at which CPU hotspot monitoring is performed. A small value increases the overhead. Default value: 10. Unit: milliseconds.

    profiling.wallclock.engine

    PROFILING_WALLCLOCK_ENGINE

    The engine used for the monitoring of wall clock hotspots. Default value: off. Valid values: auto, async_profiler, and off.

    The value off specifies that the monitoring of wall clock hotspots is disabled. Other values specify that the monitoring of wall clock hotspots is enabled. We recommend that you set the value to off.

    profiling.wallclock.interval

    PROFILING_WALLCLOCK_INTERVAL

    The interval at which the monitoring of wall clock hotspots is performed. A small value increases the overhead. Default value: 20. Unit: milliseconds.

    profiling.wallclock.thread.filter

    PROFILING_WALLCLOCK_THREAD_FILTER

    The thread filter used for the monitoring of wall clock hotspots. Default value: 0, which indicates that no threads are involved.

    The following list provides examples on how to specify values:

    • Empty: ""

    • Single thread: 123

    • Multiple threads: 122,123

    • Thread range: 122 to 134

    profiling.wallclock.threads.per.tick

    PROFILING_WALLCLOCK_THREADS_PER_TICK

    The maximum number of threads used to monitor wall clock hotspots. Default value: 8.

    profiling.alloc.engine

    PROFILING_ALLOC_ENGINE

    The engine used for Alloc hotspot monitoring, which is the monitoring of memory allocation hotspots. Default value: off. Valid values: auto, async_profiler, jfr, and off.

    The value off specifies that Alloc hotspot monitoring is disabled. Other values specify that Alloc hotspot monitoring is enabled. We recommend that you set the value to auto.

    profiling.alloc.interval

    PROFILING_ALLOC_INTERVAL

    The sampling interval for Alloc hotspot monitoring, measured in allocated memory. A small value increases the overhead. Default value: 256. Unit: KB.

    profiling.jfr.max.size

    PROFILING_JFR_MAX_SIZE

    The upper limit of the size for the JFR file. If the size reaches the upper limit, the data in the file is automatically discarded. Default value: 64m. Example values: 256k and 10m.

    profiling.jfr.max.age

    PROFILING_JFR_MAX_AGE

    The upper limit of the age for the JFR file. If the age reaches the upper limit, the data in the file is automatically discarded. Default value: 10m. Example values: 1m, 1h, and 1d.

    profiling.jfr.max.stack.depth

    PROFILING_JFR_MAX_STACK_DEPTH

    The maximum stack depth that is allowed during JFR sampling. Default value: 64.
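In the table above, each environment variable name is the JVM parameter name in uppercase with periods replaced by underscores. The following sketch makes that pattern explicit; the helper name to_env_name is hypothetical, and the mapping is an observation about the listed names rather than a documented rule.

```shell
#!/bin/sh
# Hypothetical helper: derive the environment variable name from a JVM parameter name.
# Lowercase letters become uppercase and periods become underscores.
to_env_name() {
    printf '%s\n' "$1" | tr 'a-z.' 'A-Z_'
}

to_env_name profiling.agent.ingest.max.tries
to_env_name profiling.wallclock.threads.per.tick
```

The two calls print PROFILING_AGENT_INGEST_MAX_TRIES and PROFILING_WALLCLOCK_THREADS_PER_TICK, which match the corresponding rows in the table.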

What to do next

After you collect the performance data from Java programs to Full-stack Observability, you can use the performance monitoring feature to troubleshoot performance issues. For more information, see Data query and Data comparison.