This topic describes the trace sampling modes that are supported by Application Real-Time Monitoring Service (ARMS). You can select an appropriate mode based on your scenarios so that you can obtain the trace data that you want at a low cost.
Terms
span: a specific operation in a request, such as a remote call or an internal method call.
root span: the first span in a trace.
local root span: the first span of a trace in a single service.
span context: the context of a span. A span context is associated with a specific operation in a request.
head-based sampling: makes a sampling decision upfront at the root span and ensures that whole traces are sampled.
non-head based sampling: takes effect if head-based sampling is not triggered, and may be triggered at any local root span in a trace. In most cases, the integrity of the trace cannot be guaranteed.
Sampling policies and marks
ARMS provides two head-based sampling policies and three non-head based sampling policies to help you sample significant trace data.
Sampling marks
Sampling marks specify whether to sample trace data when trace contexts are passed across processes by using the EagleEye protocol. The key in the request header is EagleEye-Sampled, and the valid values are:
s0: not sampled
s1: sampled
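As an illustration only, the following sketch shows how a service might read and write the sampling mark in request headers. The header key and the s0 and s1 values come from the list above. The function names and the handling of missing or unrecognized values are assumptions made for this example and are not part of the ARMS agent.

```python
from typing import Dict, Mapping, Optional

SAMPLED_HEADER = "EagleEye-Sampled"  # header key described in this topic


def read_sampling_mark(headers: Mapping[str, str]) -> Optional[bool]:
    """Return True for s1, False for s0, and None if the mark is absent or unrecognized."""
    # HTTP header names are case-insensitive, so compare them in lower case.
    lowered = {key.lower(): value for key, value in headers.items()}
    value = lowered.get(SAMPLED_HEADER.lower())
    if value is None:
        return None
    return {"s1": True, "s0": False}.get(value.strip())


def write_sampling_mark(headers: Dict[str, str], sampled: bool) -> None:
    """Attach the sampling mark to outgoing request headers (hypothetical helper)."""
    headers[SAMPLED_HEADER] = "s1" if sampled else "s0"
```

Returning None for a missing mark lets the caller distinguish "no upstream decision" from "upstream decided not to sample".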
Sampling marks can also record sampling reasons in the local root span where trace data is sampled. The marks are stored in spans in the form of attributes. The key is sample.reason and valid values are:
s2: minimum sampling for all interfaces
s3: custom sampling
s4: fixed-rate sampling
s5: reserved
s6: adaptive sampling
s7: reserved
s8: Basic Edition sampling
s9: sampling for failed requests
s10: sampling for slow requests
s11: sampling for abnormal calls
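When you inspect span attributes, a small lookup table can make the sample.reason codes easier to read. The following sketch copies the codes from the list above. The function name and the fallback strings are hypothetical.

```python
from typing import Mapping

# Codes and meanings as listed in this topic.
SAMPLE_REASONS = {
    "s2": "minimum sampling for all interfaces",
    "s3": "custom sampling",
    "s4": "fixed-rate sampling",
    "s5": "reserved",
    "s6": "adaptive sampling",
    "s7": "reserved",
    "s8": "Basic Edition sampling",
    "s9": "sampling for failed requests",
    "s10": "sampling for slow requests",
    "s11": "sampling for abnormal calls",
}


def describe_sample_reason(attributes: Mapping[str, str]) -> str:
    """Translate the sample.reason attribute of a span into a readable label."""
    code = attributes.get("sample.reason")
    if code is None:
        return "no sampling reason recorded"
    return SAMPLE_REASONS.get(code, f"unknown reason ({code})")
```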
Head-based sampling policies
ARMS supports two head-based sampling policies: fixed-rate sampling and adaptive sampling. Fixed-rate sampling is the most common head-based trace sampling policy. Adaptive sampling is a cost-effective head-based sampling policy developed by ARMS.
Fixed-rate sampling
Traces are sampled based on the specified sampling rate at the ingress service. Spans that are sampled carry an attribute whose key is sample.reason and value is s4.
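The following minimal sketch shows what a fixed-rate decision at the ingress service could look like. It assumes a uniform random draw per trace, which is a common way to implement fixed-rate sampling but is not confirmed as the method used by the ARMS agent. The function name and return shape are hypothetical.

```python
import random


def fixed_rate_decision(rate_percent: float, rng: random.Random) -> dict:
    """Decide at the root span whether to sample the whole trace."""
    if not 0 <= rate_percent <= 100:
        raise ValueError("rate_percent must be between 0 and 100")
    # rng.random() is in [0, 1), so 0 never samples and 100 always samples.
    sampled = rng.random() * 100 < rate_percent
    attributes = {"sample.reason": "s4"} if sampled else {}
    return {"sampled": sampled, "attributes": attributes}


if __name__ == "__main__":
    rng = random.Random()
    hits = sum(fixed_rate_decision(10, rng)["sampled"] for _ in range(10_000))
    print(f"sampled {hits} of 10000 traces at a 10% rate")
```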
To configure a fixed-rate sampling policy, perform the following steps:
Log on to the ARMS console. In the left-side navigation pane, choose .
On the Application List page, select a region in the top navigation bar and click the name of the application that you want to manage.
Note: Icons displayed in the Language column indicate the languages in which applications are written.
: Java application
: Go application
: Python application
Hyphen (-): application monitored in Managed Service for OpenTelemetry.
In the top navigation bar, choose . In the Sampling Settings section, set the Sampling strategy parameter to Fixed sampling rate. In the Sample Rate Percentage field, enter a percentage. For example, if you enter 10, the sampling rate is 10%.
Note: The modifications take effect immediately. You do not need to restart the application. The default value is 10. If you increase the sampling rate, additional system resources are consumed. We recommend that you keep the default value.
Click Save.
Adaptive sampling
Traffic can vary greatly across business scenarios. Read traffic to an interface is often far heavier than write traffic, whereas the trace data for writes is usually more significant than the trace data for reads. To prevent the less significant trace data from crowding out the significant trace data, ARMS provides adaptive sampling. The 1,000 interfaces with the most requests are tracked based on the Least Frequently Used (LFU) algorithm and sampled separately. 10 traces are sampled per minute for each of these interfaces, and 10 traces are sampled per minute for all other interfaces. Spans that are sampled carry an attribute whose key is sample.reason and value is s6.
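The quota logic described above can be sketched as follows. This is a simplified illustration, not the ARMS implementation. It assumes that the remaining interfaces share a single 10-trace quota, that the first requests in each minute fill the quota, and that an untracked interface replaces the least frequently requested tracked one once it has more requests. The class and attribute names are hypothetical.

```python
from collections import Counter

_OTHERS = object()  # shared bucket for interfaces outside the tracked set


class AdaptiveSampler:
    def __init__(self, top_n: int = 1000, per_minute: int = 10):
        self.top_n = top_n
        self.per_minute = per_minute
        self.request_counts = Counter()     # request frequency per interface
        self.tracked = set()                # interfaces with their own quota
        self.window = None                  # minute currently being counted
        self.sampled_in_window = Counter()  # traces sampled per bucket this minute

    def should_sample(self, interface: str, now_seconds: float) -> bool:
        minute = int(now_seconds // 60)
        if minute != self.window:  # a new minute resets all quotas
            self.window = minute
            self.sampled_in_window.clear()
        self.request_counts[interface] += 1
        self._track(interface)
        bucket = interface if interface in self.tracked else _OTHERS
        if self.sampled_in_window[bucket] >= self.per_minute:
            return False
        self.sampled_in_window[bucket] += 1
        return True  # the sampled span would carry sample.reason = s6

    def _track(self, interface: str) -> None:
        if interface in self.tracked:
            return
        if len(self.tracked) < self.top_n:
            self.tracked.add(interface)
            return
        # LFU-style eviction: replace the least frequently requested interface.
        coldest = min(self.tracked, key=self.request_counts.__getitem__)
        if self.request_counts[interface] > self.request_counts[coldest]:
            self.tracked.remove(coldest)
            self.tracked.add(interface)
```

With separate quotas, a low-traffic write interface in the tracked set still yields up to 10 traces per minute, regardless of how much read traffic other interfaces receive.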
To configure an adaptive sampling policy, perform the following steps:
Log on to the ARMS console. In the left-side navigation pane, choose .
On the Application List page, select a region in the top navigation bar and click the name of the application that you want to manage.
Note: Icons displayed in the Language column indicate the languages in which applications are written.
: Java application
: Go application
: Python application
Hyphen (-): application monitored in Managed Service for OpenTelemetry.
In the top navigation bar, choose . In the Sampling Settings section, set the Sampling strategy parameter to Adaptive Sampling.
Note: The modifications take effect immediately. You do not need to restart the application.
Click Save.
Non-head based sampling policies
Head-based sampling alone may not sample all the significant trace data that you care about, such as spans related to slow or failed requests, or spans that are infrequent or user-defined. Non-head based sampling covers these cases. It may be triggered at any local root span in a trace, but it cannot guarantee the integrity of the trace.
Minimum sampling for all interfaces
The traces of each interface are automatically sampled at least once per minute. Spans that are sampled carry an attribute whose key is sample.reason and value is s2.
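One way to picture this guarantee is a per-minute set of interfaces that have already been sampled. The following sketch samples the first request that each interface receives in a minute. The choice of the first request and all names are assumptions made for this example.

```python
class MinimumSampler:
    def __init__(self):
        self.window = None          # minute currently being counted
        self.sampled_once = set()   # interfaces already sampled this minute

    def should_sample(self, interface: str, now_seconds: float) -> bool:
        minute = int(now_seconds // 60)
        if minute != self.window:   # a new minute starts with an empty set
            self.window = minute
            self.sampled_once.clear()
        if interface in self.sampled_once:
            return False
        self.sampled_once.add(interface)
        return True  # the sampled span would carry sample.reason = s2
```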
Sampling for failed or slow requests
Before you sample traces for failed or slow requests, go to the application details page, choose Configuration > Custom Configurations from the top navigation bar, and then turn on the Call chain compression switch in the Advanced Settings section. The switch is turned on by default.
If a request meets one of the following conditions, the relevant traces are automatically sampled.
For an HTTP interface, a status code other than 200 is returned. For other interfaces, exceptions are thrown by the methods used for instrumentation.
An exception occurs during the internal execution of the interface, and is not thrown to the ingress service of the framework.
The duration of operation calls exceeds the slow call threshold configured on the Custom Configurations page.
Note: If quantiles are enabled, calls with a duration greater than the 99th percentile of that operation are also recognized as slow calls.
Spans that are sampled carry an attribute whose key is sample.reason and value is s9, s11, or s10. The specific value depends on which condition is met.
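The conditions above can be sketched as a small classification function. The mapping of conditions to codes (s9 for failed requests, s11 for internal exceptions, s10 for slow calls) is inferred from the order of the values and from the list of sampling marks, and the precedence when several conditions hold at once is an assumption. All parameter names are hypothetical.

```python
from typing import Optional


def failed_or_slow_reason(
    *,
    http_status: Optional[int],       # None for non-HTTP interfaces
    duration_ms: float,
    slow_threshold_ms: float,
    exception_thrown: bool = False,   # thrown by the instrumented method
    internal_exception: bool = False, # raised internally, not thrown to the ingress
    p99_ms: Optional[float] = None,   # set only if quantiles are enabled
) -> Optional[str]:
    """Return the assumed sample.reason code, or None if no condition is met."""
    if http_status is not None:
        failed = http_status != 200
    else:
        failed = exception_thrown
    if failed:
        return "s9"
    if internal_exception:
        return "s11"
    slow = duration_ms > slow_threshold_ms
    if p99_ms is not None and duration_ms > p99_ms:
        slow = True
    return "s10" if slow else None
```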
Custom sampling
You can specify interface names, prefixes, or suffixes to select the interfaces whose traces you want to sample in full. Spans that are sampled carry an attribute whose key is sample.reason and value is s3.
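Matching by name, prefix, or suffix can be sketched as follows. The rule structure and its field names are hypothetical and only illustrate the three kinds of match. The actual matching semantics in ARMS, such as case sensitivity, are not stated in this topic.

```python
from dataclasses import dataclass, field
from typing import Set, Tuple


@dataclass
class CustomSamplingRule:
    names: Set[str] = field(default_factory=set)  # exact interface names
    prefixes: Tuple[str, ...] = ()
    suffixes: Tuple[str, ...] = ()

    def matches(self, interface: str) -> bool:
        """Return True if the interface should be sampled in full (sample.reason = s3)."""
        # str.startswith and str.endswith accept a tuple and return False for an empty one.
        return (
            interface in self.names
            or interface.startswith(tuple(self.prefixes))
            or interface.endswith(tuple(self.suffixes))
        )


if __name__ == "__main__":
    rule = CustomSamplingRule(names={"/health"}, prefixes=("/api/pay",), suffixes=("/refund",))
    for name in ("/health", "/api/pay/create", "/order/refund", "/api/list"):
        print(name, rule.matches(name))
```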
To configure a custom sampling policy, perform the following steps:
Log on to the ARMS console. In the left-side navigation pane, choose .
On the Application List page, select a region in the top navigation bar and click the name of the application that you want to manage.
Note: Icons displayed in the Language column indicate the languages in which applications are written.
: Java application
: Go application
: Python application
Hyphen (-): application monitored in Managed Service for OpenTelemetry.
In the top navigation bar, choose . In the Sampling Settings section, specify the interface names, prefixes, or suffixes.
Note: The modifications take effect immediately. You do not need to restart the application.
Click Save.
Flowchart
Take a trace that is generated among the A, B, and C services as an example. The preceding sampling policies determine whether the spans are sampled. The following flowchart describes how sampling decisions are made. Each decision depends on whether the request is at A, B, or C, and on whether the current span is a root span or a local root span.
The flowchart uses the following colors:
Purple: indicates head-based sampling, which is triggered only at the root span of the trace. Only one sampling decision is made at A.
Blue: indicates custom sampling and minimum sampling, which may be triggered at any local root span in the trace if head-based sampling is not triggered. Assume that A decides not to sample. When the request is at B, B decides whether to implement custom sampling, minimum sampling, or neither. If sampling is implemented, the attributes attached to the spans are passed on to C. A sampling decision is made at each of A, B, and C, for three decisions in total.
Green: indicates sampling for failed or slow requests, which may be triggered at any local root span in the trace if head-based sampling, custom sampling, and minimum sampling are not triggered. Assume that A decides not to sample. When the request is at B, B decides whether the request is slow or has failed, and whether to implement sampling. If sampling is implemented, the attributes attached to the spans are not passed on to C. A sampling decision is made at each of A, B, and C, for three decisions in total.
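The decision order described by the three colors can be sketched as a function that runs at each service. This is an illustrative model only. It assumes that "passed on" means that the downstream sampling mark is set to s1, and that a downstream service records no reason of its own when it inherits a decision. The function names and the dictionary keys head, propagated, and local_only are hypothetical.

```python
from typing import Dict, List, Optional, Tuple


def decide(
    upstream_mark: Optional[str],
    is_root: bool,
    head: Optional[str] = None,        # s4 or s6, purple
    propagated: Optional[str] = None,  # s3 or s2, blue
    local_only: Optional[str] = None,  # s9, s10, or s11, green
) -> Tuple[bool, Optional[str], str]:
    """Return (sampled, sample.reason, mark passed downstream)."""
    if upstream_mark == "s1":   # an upstream service already decided to sample
        return True, None, "s1"
    if is_root and head:        # head-based: only at the root span
        return True, head, "s1"
    if propagated:              # custom or minimum sampling: passed on
        return True, propagated, "s1"
    if local_only:              # failed or slow: not passed on
        return True, local_only, "s0"
    return False, None, "s0"


def walk(trace: List[Dict[str, str]]) -> List[Tuple[bool, Optional[str]]]:
    """Apply decide() at each service in call order, starting at the root."""
    mark: Optional[str] = None
    results = []
    for index, policies in enumerate(trace):
        sampled, reason, mark = decide(mark, is_root=(index == 0), **policies)
        results.append((sampled, reason))
    return results


if __name__ == "__main__":
    # A does not sample, B hits a slow request, C is unaffected.
    print(walk([{}, {"local_only": "s10"}, {}]))
```

In this model, a head-based decision at A yields a complete trace, a custom or minimum decision at B yields spans from B onward, and a failed or slow decision at B yields only the spans of B.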
References
After traces are sampled, you can configure filter conditions and aggregation dimensions to analyze the trace data in real time. For more information, see Trace analysis.