What is convergence?
The Application Real-Time Monitoring Service (ARMS) agent collects various types of metric data, such as the number of requests, the number of errors, and the response time. To make monitoring data more detailed and accurate, metrics are reported together with dimensions, such as the IP address, SQL statement, and URL. However, some dimensions may exhibit high cardinality, that is, a very large number of distinct values. For example, suppose that a RESTful API provides the /api/v1/users/{ID}/info URL template for querying user information. Each user ID produces a different request URL, so the metric data contains a huge number of distinct URLs, which complicates the display of metric data. High cardinality puts great pressure on storage, causes write loss and slow queries, and incurs high costs. To resolve high cardinality and improve observability, ARMS employs various convergence policies. This topic describes the convergence policies, scenarios, and formats.
Convergence formats and scenarios
The following table describes the convergence formats and scenarios.
Format | Scenario |
{ARMS_IP}:80 | The number of IP addresses accessing the same port exceeds the threshold, which is 50 by default. |
{ARMS_STATIC_REQ} or {ARMS_S_XXX} | The URL pertains to static resources. |
{ARMS_ATTACK_REQ} | The URL contains strings that can be exploited by attackers. |
{ARMS_PARAMED_REQ} | The URL contains parameters. |
{ARMS_OTHERS} | The number of dimension values recorded within a period of time exceeds the threshold. Note: For information about the default threshold values, see Cardinality space convergence. |
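{ARMS_NUMBER} | A URL term (split by /) consists of digits that are excessively long or divergent. |
{ARMS_WORD} | A URL term (split by /) consists of letters that are excessively long or divergent. |
{ARMS_ANY} | A URL term (split by /) is a word that contains excessively long digits, or a divergent mix of digits and letters. |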
{XXX} | The URL uses Spring Controller annotations. |
A string that contains * (Note: This format is applicable only to ARMS agent versions earlier than V4.x.) | Convergence based on the memory statistics of the ARMS agent is used. |
Convergence policies
All convergence policies are enabled by default and can be manually disabled, except for cardinality space convergence.
As different policies support different data types, the data type supported by each policy is clearly specified.
Spring annotation-based convergence
For web APIs other than RESTful APIs, using the request URL as a dimension value is feasible. However, for RESTful APIs with variables, directly recording request URLs leads to dimension divergence. Therefore, ARMS extracts information from annotations (such as @RequestMapping) as the dimension value for applications using the Spring Web framework.
Logic
Reads the path information from the routing annotations of Spring URLs.
Format
The value configured in the routing annotations is used.
Supported data type
URL: Only URLs through which the application provides services are supported. URLs of external services that the application calls are not supported.
Location
ARMS agent
Supported agent versions
2.9.1.2 and later
Example
The following APIController defines a RESTful interface that retrieves user information by using a path variable. The class-level annotation @RequestMapping("/api/v1") and the method-level annotation @RequestMapping("/user/{userId}/info") together define the URL pattern /api/v1/user/{userId}/info. In an actual request URL, {userId} is replaced by a specific user ID, but ARMS records the pattern from the annotations as the dimension value.
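import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
@RequestMapping("/api/v1") // class-level path prefix
public class APIController {
    // Full path: /api/v1/user/{userId}/info
    @RequestMapping("/user/{userId}/info")
    public String getUserInfo(@PathVariable("userId") String userId) {
        return "hello " + userId;
    }
}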
Result:
/api/v1/user/1234/info is converged to /api/v1/user/{userId}/info.
Memory statistics-based convergence
For applications that do not use the Spring Web framework or scenarios where annotations fail, convergence based on memory statistics is used.
The policy is applicable only to the ARMS agent earlier than V4.x.
Logic
Splits each input into a sequence of words by using predefined delimiters (such as "/" or "=").
Counts the number of distinct words that occur at each position, which is the cardinality of that position. If the cardinality exceeds a predefined threshold, the word at that position is replaced with *.
Format
String containing *
Supported data types
All
Location
ARMS agent
Supported agent versions
2.x and later, but earlier than 4.x
Example
Application A provides the URL template /api/v1/user/${userId}/info for querying user information, in which ${userId} specifies the user ID. The memory statistics module detects divergence in user IDs and converges the URL to /api/v1/user/*/info.
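The two logic steps above can be illustrated with a minimal sketch. It is not the ARMS agent's code: the class name MemoryStatsConvergence, the threshold passed to the constructor, and the choice to split on "/" only are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch of position-wise cardinality counting. */
final class MemoryStatsConvergence {
    private final int threshold; // assumed cardinality threshold per position
    private final List<Set<String>> seenPerPosition = new ArrayList<>();

    MemoryStatsConvergence(int threshold) {
        this.threshold = threshold;
    }

    /** Records the words of a URL and replaces high-cardinality positions with "*". */
    String converge(String url) {
        String[] words = url.split("/", -1); // this sketch uses "/" as the only delimiter
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < words.length; i++) {
            while (seenPerPosition.size() <= i) {
                seenPerPosition.add(new HashSet<>());
            }
            Set<String> seen = seenPerPosition.get(i);
            seen.add(words[i]);
            if (i > 0) {
                out.append('/');
            }
            // Replace the word once the number of distinct words at this position exceeds the threshold.
            out.append(seen.size() > threshold ? "*" : words[i]);
        }
        return out.toString();
    }
}
```

With a threshold of 3, the first three distinct user IDs pass through unchanged and the fourth request is returned as /api/v1/user/*/info. The sketch never evicts recorded words, so a real implementation would also need to bound its memory use.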
Custom convergence
The policy allows you to define custom convergence rules to meet specific needs.
Logic
Matches the custom convergence rules one by one. The matched rule is applied.
Format
The format is subject to specific configurations.
Supported data type
URL
Location
ARMS agent V4.x and later: agent
ARMS agent earlier than V4.x: server
Supported agent versions
All
Example
Custom rule: Converge all URLs that match the /api/v1/user/[\d]+/info regular expression to /api/v1/getUserInfo.
Result: All URLs that match the regular expression, such as /api/v1/user/1234/info, are converged to /api/v1/getUserInfo.
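As a rough illustration of "matches the rules one by one and applies the matched rule", the following sketch keeps rules in insertion order and returns the target of the first rule whose regular expression matches. The class CustomConvergence and its methods are hypothetical, and whether ARMS requires a full match or a partial match is an assumption (a full match is used here).

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

/** Hypothetical sketch of ordered, first-match custom rules. */
final class CustomConvergence {
    // LinkedHashMap preserves the order in which rules were added.
    private final Map<Pattern, String> rules = new LinkedHashMap<>();

    void addRule(String regex, String target) {
        rules.put(Pattern.compile(regex), target);
    }

    /** Returns the target of the first matching rule, or the URL unchanged. */
    String converge(String url) {
        for (Map.Entry<Pattern, String> rule : rules.entrySet()) {
            if (rule.getKey().matcher(url).matches()) { // assumption: the whole URL must match
                return rule.getValue();
            }
        }
        return url;
    }
}
```

After addRule("/api/v1/user/[\\d]+/info", "/api/v1/getUserInfo"), the call converge("/api/v1/user/1234/info") returns /api/v1/getUserInfo, while a URL that matches no rule is returned as is.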
Convergence for static resources
Some outdated versions of the ARMS agent monitor metrics about static resources. Convergence is enabled for these metrics by default because URLs change frequently.
Logic
Checks whether the suffix of a URL matches the default static resource extensions. If so, the URL is converged.
Default static resource extensions: .log .7z .tgz .jpg .jpeg .png .gif .css .js .ico .woff2 .xml .svg .pdf .txt .text .ppt .word .xlsx .tar.gz .tar.bz2 .sh .yml .yaml .zip .gz .ttf .woff .eot .rar .properties
Format
The default format is {ARMS_STATIC_REQ}. If you have submitted a ticket to enable advanced configurations, the format includes the resource suffix.
Supported data type
URL
Location
ARMS agent V4.x and later: agent
ARMS agent earlier than V4.x: server
Supported agent versions
All
Example
By default, /api/v1/hello.jpg is converged to {ARMS_STATIC_REQ}. If the advanced configurations are enabled, it is converged to {ARMS_S_JPG}.
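The suffix check can be sketched as follows. This is an illustration only: the class StaticResourceConvergence and the advanced flag are hypothetical, only a subset of the default extensions is listed, and the uppercase suffix in the advanced format is inferred from the single {ARMS_S_JPG} example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

/** Hypothetical sketch of suffix-based convergence for static resources. */
final class StaticResourceConvergence {
    // A subset of the default static resource extensions.
    private static final List<String> EXTENSIONS =
            Arrays.asList(".jpg", ".jpeg", ".png", ".gif", ".css", ".js", ".ico");

    /** Converges the URL if it ends with a known extension; otherwise returns it unchanged. */
    static String converge(String url, boolean advanced) {
        for (String ext : EXTENSIONS) {
            if (url.endsWith(ext)) {
                return advanced
                        // Assumed advanced format: {ARMS_S_<SUFFIX>} with the suffix in uppercase.
                        ? "{ARMS_S_" + ext.substring(1).toUpperCase(Locale.ROOT) + "}"
                        : "{ARMS_STATIC_REQ}";
            }
        }
        return url;
    }
}
```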
Convergence for cyberattack requests
Your services may encounter cyberattack requests (such as attempts to read the /etc/passwd file). These requests are crafted by attackers and change frequently. Recording all of them significantly strains storage resources.
Logic
Checks whether a URL contains characters that can be exploited for cyberattacks. If so, the URL is converged.
Default characters: ' $ \ ' !
Format
{ARMS_ATTACK_REQ}
Supported data type
URL
Location
ARMS agent V4.x and later: agent
ARMS agent earlier than V4.x: server
Supported agent versions
All
Example
/app/v1/user/info?cmd='more /etc/passwd' is converged to {ARMS_ATTACK_REQ}.
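A minimal sketch of the character check is shown below. The class name AttackRequestConvergence is hypothetical, and the character set used here (single quote, dollar sign, backslash, and exclamation mark) is an assumption based on the default characters listed above.

```java
/** Hypothetical sketch of character-based detection of attack requests. */
final class AttackRequestConvergence {
    // Assumed set of characters treated as exploitable: ' $ \ !
    private static final String SUSPICIOUS = "'$\\!";

    /** Returns {ARMS_ATTACK_REQ} if the URL contains any suspicious character. */
    static String converge(String url) {
        for (int i = 0; i < url.length(); i++) {
            if (SUSPICIOUS.indexOf(url.charAt(i)) >= 0) {
                return "{ARMS_ATTACK_REQ}";
            }
        }
        return url;
    }
}
```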
Query parameter convergence
By default, the ARMS agent does not obtain the parameter information when collecting URLs. However, in some scenarios, URLs contain query parameters which can lead to dimension divergence.
Logic
Checks whether a URL contains query parameters. If so, the URL is converged.
Default query parameter delimiters: ; ? &
Format
The default result is {ARMS_PARAMED_REQ}. If you have submitted a ticket to enable the advanced configurations, the convergence result retains the URL and replaces the parameter with {ARMS_REQ_PARAMS}.
Supported data type
URL
Location
ARMS agent V4.x and later: agent
ARMS agent earlier than V4.x: server
Supported agent versions
All
Example
By default, /api/v1/user/info?userId=12345 is converged to {ARMS_PARAMED_REQ}. If the advanced configurations are enabled, it is converged to /api/v1/user/info?{ARMS_REQ_PARAMS}.
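The following sketch shows one way the default and advanced results could be produced. The class QueryParamConvergence and the advanced flag are hypothetical. The sketch assumes that everything after the first delimiter is treated as parameters.

```java
/** Hypothetical sketch of query parameter convergence. */
final class QueryParamConvergence {
    /** Converges a URL that contains one of the delimiters ; ? & and returns other URLs unchanged. */
    static String converge(String url, boolean advanced) {
        int cut = -1;
        for (int i = 0; i < url.length(); i++) {
            char c = url.charAt(i);
            if (c == ';' || c == '?' || c == '&') {
                cut = i; // position of the first delimiter
                break;
            }
        }
        if (cut < 0) {
            return url; // no query parameters
        }
        if (!advanced) {
            return "{ARMS_PARAMED_REQ}";
        }
        // Advanced result: keep the path and the delimiter, replace the parameters.
        return url.substring(0, cut + 1) + "{ARMS_REQ_PARAMS}";
    }
}
```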
Meaningless word convergence
URLs with excessively long words or digits are likely to diverge. By default, ARMS replaces words or digits that are excessively long.
Logic
Splits a URL into an array of terms by slash (/) and checks whether any term in the array exceeds the specified length threshold. If so, the term is replaced.
Maximum length of a word: 64
Maximum length of digits: 10
Maximum length of digits in a word: 10
Format
Excessively long digits are converged to {ARMS_NUMBER}.
Excessively long words are converged to {ARMS_WORD}.
Excessively long digits in a word are converged to {ARMS_ANY}.
Supported data type
URL
Location
ARMS agent V4.x and later: agent
ARMS agent earlier than V4.x: server
Supported agent versions
All
Example
/api/2024040710/hello2024040710 is converged to /api/{ARMS_NUMBER}/{ARMS_ANY}.
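The term-by-term check can be sketched as follows. The class MeaninglessWordConvergence is hypothetical. Because the 10-digit term in the example is converged, the sketch assumes that a term is replaced as soon as its length reaches the limit. The order in which the three checks are applied is also an assumption.

```java
import java.util.regex.Pattern;

/** Hypothetical sketch of length-based convergence of URL terms. */
final class MeaninglessWordConvergence {
    static final int MAX_WORD = 64;   // maximum length of a word
    static final int MAX_DIGITS = 10; // maximum length of digits, alone or inside a word
    private static final Pattern LONG_DIGIT_RUN = Pattern.compile("\\d{" + MAX_DIGITS + ",}");

    static String convergeTerm(String term) {
        if (term.matches("\\d+")) {
            // Pure digits: replace when the length reaches the limit (assumption).
            return term.length() >= MAX_DIGITS ? "{ARMS_NUMBER}" : term;
        }
        if (LONG_DIGIT_RUN.matcher(term).find()) {
            return "{ARMS_ANY}"; // a word that contains an excessively long run of digits
        }
        if (term.length() >= MAX_WORD) {
            return "{ARMS_WORD}"; // an excessively long word
        }
        return term;
    }

    /** Splits the URL by "/" and converges each term independently. */
    static String converge(String url) {
        String[] terms = url.split("/", -1);
        for (int i = 0; i < terms.length; i++) {
            terms[i] = convergeTerm(terms[i]);
        }
        return String.join("/", terms);
    }
}
```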
Intelligent convergence
After the preceding convergence policies are applied, a large number of divergent URLs may still be recorded. ARMS periodically generates convergence rules using algorithms to replace these URLs.
Logic
The logic is complex. Only a brief description is provided here.
Groups sample URLs using algorithms.
Converges URLs based on the URL pattern of each group and generates convergence rules.
Merges convergence rules of different groups.
Format
Divergent pure digits are replaced with {ARMS_NUMBER}.
Divergent pure letters are replaced with {ARMS_WORD}.
Divergent mixed digits and letters are replaced with {ARMS_ANY}.
Supported data type
URL
Location
ARMS agent V4.x and later: agent
ARMS agent earlier than V4.x: server
Supported agent versions
All
Example
/api/product/1/info
/api/product/2/info
....
/api/product/N/info
For the URLs in the example, ARMS generates a convergence rule that uses the /api/product/[\d]+/info regular expression for matching.
Convergence result: URLs that match the regular expression are converged to /api/product/{ARMS_NUMBER}/info.
SQL normalization
The ARMS agent may collect a large number of distinct SQL statements due to sharding, comments, or literal values. By default, ARMS processes each SQL statement by replacing the digits or letters that are likely to diverge.
Logic
The logic is complex. Only a brief description is provided here.
Removes comments.
Replaces literal values.
Replaces the names of sharded databases or tables.
...
Format
The divergent digits or letters are replaced.
Supported data type
SQL
Location
ARMS agent
Supported agent versions
4.x and later
Example
select * from cache_0 where ckey='23'
Result:
select * from cache_{NUM} where ckey=?
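A much simplified, regex-based sketch that reproduces the example above is shown below. It is not the agent's normalization logic, which is described as complex. The class SqlNormalizationSketch and its rules (block comments only, single-quoted literals, standalone numbers, and numeric suffixes after an underscore) are assumptions for illustration.

```java
/** Hypothetical, regex-based sketch of SQL normalization. */
final class SqlNormalizationSketch {
    static String normalize(String sql) {
        return sql
                .replaceAll("(?s)/\\*.*?\\*/", " ") // remove /* ... */ comments
                .replaceAll("'[^']*'", "?")          // replace single-quoted string literals
                .replaceAll("\\b\\d+\\b", "?")       // replace standalone numeric literals
                .replaceAll("_\\d+\\b", "_{NUM}")    // replace numeric shard suffixes such as _0
                .replaceAll("\\s+", " ")             // collapse whitespace
                .trim();
    }
}
```

A regex-based approach cannot handle every SQL dialect or escaped quotes inside literals, so it only conveys the idea of the three steps.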
IP address convergence
If your application depends on many external services that are accessed through IP addresses, the ARMS agent may collect a large number of IP addresses, which leads to divergence.
Logic
Groups IP addresses by port.
Applies convergence if the number of IP addresses in a group exceeds the specified threshold, which is 50 by default.
Format
{ARMS_IP}:port
Supported data type
IP
Location
ARMS agent V4.x and later: agent
ARMS agent earlier than V4.x: server
Supported agent versions
All
Example
1.1.1.1:8080
...
1.1.1.255:8080
The preceding IP addresses are converged to {ARMS_IP}:8080.
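The grouping step can be sketched as a batch operation over a list of addresses. The class IpConvergence is hypothetical, the input is assumed to be in the IPv4 ip:port form, and a real implementation would more likely process values as they arrive rather than in a batch.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of port-based IP address convergence. */
final class IpConvergence {
    static List<String> converge(List<String> addresses, int threshold) {
        // Step 1: group the distinct IP addresses by port.
        Map<String, Set<String>> ipsByPort = new HashMap<>();
        for (String address : addresses) {
            int colon = address.lastIndexOf(':');
            ipsByPort.computeIfAbsent(address.substring(colon + 1), p -> new HashSet<>())
                     .add(address.substring(0, colon));
        }
        // Step 2: converge the addresses of every group that exceeds the threshold.
        List<String> result = new ArrayList<>();
        for (String address : addresses) {
            String port = address.substring(address.lastIndexOf(':') + 1);
            result.add(ipsByPort.get(port).size() > threshold ? "{ARMS_IP}:" + port : address);
        }
        return result;
    }
}
```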
Cardinality space convergence
The preceding convergence policies effectively resolve high cardinality issues in URLs. However, for SQL statements, a large number of dimension values may still be recorded. To fix the issue, ARMS limits the number of dimension values recorded within a period of time.
Logic
The logic is complex. Only a brief description is provided here.
Periodically generates a cardinality space of a fixed size.
Checks whether a dimension value already exists in the cardinality space. If it does, the value is returned as is. If it does not, ARMS attempts to add the value to the cardinality space: if the space still has room, the value is added and returned as is. Otherwise, {ARMS_OTHERS} is returned.
Format
Dimension values that exceed the cardinality space threshold are converged to {ARMS_OTHERS}.
Default threshold
Item | Hourly threshold |
URL interface | 500 |
Scheduling task | 1,000 |
RPC interface | 1,000 |
Upstream interface | 200 |
Normal SQL call | 100 |
Slow SQL call | 100 |
External request URL | 200 |
External request address | 100 |
Supported data types
All
Location
ARMS agent V4.x and later: agent
ARMS agent earlier than V4.x: server
Supported agent versions
All
Example
Assume that the cardinality space size is set to 100 records per hour. Addresses of the external services:
www.a1.com
www.a2.com
....
www.a1000.com
Only the first 100 addresses are recorded per hour. Subsequent addresses are converged to {ARMS_OTHERS}.
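The lookup-then-add logic can be sketched with a bounded set. The class CardinalitySpace and its reset method are hypothetical. The sketch assumes that the periodic regeneration of the space is done by an external timer that calls reset.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical sketch of a fixed-size cardinality space. */
final class CardinalitySpace {
    private final int capacity; // for example, 100 values per hour
    private final Set<String> values = new HashSet<>();

    CardinalitySpace(int capacity) {
        this.capacity = capacity;
    }

    /** Returns the value as is if it is known or can still be added; otherwise {ARMS_OTHERS}. */
    String converge(String value) {
        if (values.contains(value)) {
            return value;
        }
        if (values.size() < capacity) {
            values.add(value);
            return value;
        }
        return "{ARMS_OTHERS}";
    }

    /** Starts a new period with an empty space (assumed to be called periodically). */
    void reset() {
        values.clear();
    }
}
```

Values that entered the space earlier in the period keep their original form even after the space is full. Only new values are reported as {ARMS_OTHERS}.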
Execution order
ARMS agent earlier than V4.x
Agent
URL
Spring annotation-based convergence > Memory statistics-based convergence
SQL
Memory statistics-based convergence
IP address and others
Memory statistics-based convergence
Server
URL
Custom convergence > Convergence for cyberattack requests > Query parameter convergence > Convergence for static resources > Meaningless word convergence > Intelligent convergence > Cardinality space convergence
SQL
Custom convergence > Cardinality space convergence
IP address and others
Custom convergence > IP address convergence > Cardinality space convergence
ARMS agent V4.x and later
Agent
URL
Spring annotation-based convergence > Custom convergence > Convergence for cyberattack requests > Query parameter convergence > Convergence for static resources > Meaningless word convergence > Intelligent convergence > Cardinality space convergence
SQL
Custom convergence > SQL normalization > Cardinality space convergence
IP address and others
Custom convergence > IP address convergence > Cardinality space convergence
Server
N/A
Convergence is executed sequentially based on the preceding order. Once a dimension value is converged by a policy, the execution ends.
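The sequential, stop-at-first-match behavior can be sketched as a chain of functions. The class ConvergenceChain is hypothetical, and each policy is modeled as a function that returns its input unchanged when it does not apply, which is an assumption about how "converged by a policy" is detected.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.UnaryOperator;

/** Hypothetical sketch of running convergence policies in a fixed order. */
final class ConvergenceChain {
    private final List<UnaryOperator<String>> policies;

    @SafeVarargs
    ConvergenceChain(UnaryOperator<String>... policies) {
        this.policies = Arrays.asList(policies);
    }

    /** Applies the policies in order and stops at the first one that changes the value. */
    String apply(String value) {
        for (UnaryOperator<String> policy : policies) {
            String converged = policy.apply(value);
            if (!converged.equals(value)) {
                return converged;
            }
        }
        return value;
    }
}
```

For example, if a chain is built with an attack-request check followed by a query-parameter check, a URL that contains both a quote and a query string is reported as {ARMS_ATTACK_REQ}, because the earlier policy ends the execution.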
FAQ
How do I query the original values before convergence?
Both the original and converged values are recorded in trace data. You can use the Trace Explorer feature to view the original values. For more information, see Trace Explorer.
What do I do if the convergence result does not meet my requirements?
You can create custom convergence rules to define how values are converged.
On the application details page of the ARMS console, choose Configuration > Convergence in the top navigation bar.
Example:
Specify the /api/v1/user/\d+/info regular expression to converge all matched URLs to /api/v1/user/userId/info. For example, /api/v1/user/124343543/info is converged to /api/v1/user/userId/info.
How do I exclude URLs that are mistakenly matched from convergence?
Specify the URLs that you do not want to converge in the ARMS console. For more information, see the preceding question.
Example:
Assume that you exclude /api/v1/user/9999/info from convergence. In this case, /api/v1/user/9999/info is not converged to /api/v1/user/userId/info.
What are the differences between agent-side convergence and server-side convergence?
Agent-side convergence means that the convergence is implemented on the ARMS agent, resulting in already-converged data being reported to the server. This greatly reduces the processing pressure on the server and ensures 100% data accuracy.
If the agent version is outdated, many convergence policies are unsupported on the agent. In cases of divergence, convergence must be performed on the server side. As no convergence is applied on the agent side, the data packets sent to the server can be quite large. First, data may be lost if the packets are excessively large and are rejected. Second, due to the limitations of server-side processing, the accuracy of the converged data may be affected. Therefore, we recommend that you upgrade to the latest version of the ARMS agent.