This topic describes how to use the LogReduce feature of Simple Log Service. You can enable the feature, view log clustering results and raw logs, and compare the number of clustered logs in different time periods.
Prerequisites
- A Standard Logstore is created. For more information, see Create a Logstore.
- Logs are collected. For more information, see Data collection overview.
- Indexes are configured. For more information, see Configure indexes.
Background information
When you collect logs, the LogReduce feature clusters highly similar logs and extracts patterns from them, which gives you a quick overall view of the logs. The feature can cluster text logs in multiple formats. In DevOps scenarios, you can use the feature for O&M tasks such as identifying errors, detecting anomalies, and rolling back versions. In security scenarios, you can use it to detect intrusions. You can also save log clustering results as charts to a dashboard and view the clustered data in real time.
Benefits
The feature can cluster logs in multiple formats. Examples: Log4j, JSON, and single-line logs.
The feature can cluster hundreds of millions of logs in seconds.
The feature can cluster logs by multiple patterns.
You can retrieve raw logs that are clustered by pattern based on pattern signatures.
You can compare patterns that are extracted in different time periods.
You can adjust the precision of log clustering based on your business requirements.
Index traffic
After you enable the LogReduce feature, the total size of indexes increases by 10% of the size of the raw logs. For example, if the size of raw logs is 100 GB per day, the total size of indexes increases by 10 GB per day after you enable the LogReduce feature for the raw logs.
Size of raw logs | Index percentage | Size of indexes that are generated by LogReduce | Total size of indexes |
100 GB | 20% (20 GB) | 10 GB (100 GB × 10%) | 30 GB |
100 GB | 40% (40 GB) | 10 GB (100 GB × 10%) | 50 GB |
100 GB | 100% (100 GB) | 10 GB (100 GB × 10%) | 110 GB |
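The arithmetic in the preceding table can be reproduced with a short sketch. The function name `total_index_size_gb` and its parameters are hypothetical and exist only for illustration; the 10% overhead is the figure stated in this section.

```python
def total_index_size_gb(raw_gb: float, index_ratio: float,
                        logreduce_ratio: float = 0.10) -> float:
    """Estimate the total index size in GB after LogReduce is enabled.

    raw_gb: size of raw logs in GB.
    index_ratio: existing index size as a fraction of raw log size (0.2 for 20%).
    logreduce_ratio: extra index size added by LogReduce, as a fraction of
        raw log size (10% according to this topic).
    """
    existing_index_gb = raw_gb * index_ratio
    logreduce_index_gb = raw_gb * logreduce_ratio
    return existing_index_gb + logreduce_index_gb


# Recompute the three rows of the table for 100 GB of raw logs.
for ratio in (0.2, 0.4, 1.0):
    total = total_index_size_gb(100, ratio)
    print(f"index percentage {ratio:.0%}: total index size {total:.0f} GB")
```

Note that the LogReduce overhead depends only on the raw log size, not on the existing index percentage.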
Enable the LogReduce feature
Log on to the Simple Log Service console.
In the Projects section, click the project that you want to manage.
In the left-side navigation pane, click Log Storage. In the Logstores list, click the Logstore that you want to manage.
Enable the LogReduce feature.
Choose .
If the indexing feature is not enabled, click Enable.
In the Search & Analysis panel, turn on LogReduce.
Optional: Configure a whitelist or a blacklist to cluster logs by field.
Note: You cannot configure both a whitelist and a blacklist.
LogReduce Filter | Description |
Whitelist | After you configure a whitelist, Simple Log Service uses the fields in the whitelist to cluster logs. |
Blacklist | After you configure a blacklist, Simple Log Service does not use the fields in the blacklist to cluster logs. |
No whitelist or blacklist configured | If you do not configure a blacklist or a whitelist, Simple Log Service clusters logs based on all fields and the clustering rules that you specify. |
Click OK.
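The whitelist and blacklist rules described in this procedure can be illustrated with a minimal sketch. This is not how Simple Log Service implements the filter; the function `fields_for_clustering` and the sample field names are hypothetical, and the code only models which fields would take part in clustering under each setting.

```python
from typing import Dict, Iterable, Optional


def fields_for_clustering(
    log: Dict[str, str],
    whitelist: Optional[Iterable[str]] = None,
    blacklist: Optional[Iterable[str]] = None,
) -> Dict[str, str]:
    """Return the fields of one log that would be used for clustering."""
    if whitelist is not None and blacklist is not None:
        # Mirrors the rule that both filters cannot be configured together.
        raise ValueError("configure either a whitelist or a blacklist, not both")
    if whitelist is not None:
        allowed = set(whitelist)
        return {k: v for k, v in log.items() if k in allowed}
    if blacklist is not None:
        blocked = set(blacklist)
        return {k: v for k, v in log.items() if k not in blocked}
    # No filter configured: all fields are used.
    return dict(log)


# Hypothetical sample log; the field names are made up for this example.
sample = {"level": "ERROR", "message": "connection timed out", "request_id": "a1b2"}
print(fields_for_clustering(sample, whitelist=["message"]))
print(fields_for_clustering(sample, blacklist=["request_id"]))
```

A blacklist is often the simpler choice when only a few high-cardinality fields, such as request IDs, would otherwise fragment the patterns.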
View log clustering results and raw logs
On the query and analysis page, enter a search statement in the search box, specify the query time range, and then click Search & Analyze.
Note: You can use only search statements to filter logs. You cannot use analytic statements to filter logs because the LogReduce feature cannot cluster analysis results.
Click the LogReduce tab to view the log clustering results.
You can click Add to New Dashboard to save the log clustering results to a dashboard.
Parameter | Description |
Number | The ordinal number of the log cluster. |
Count | The number of logs for the pattern in the specified query time range. |
Pattern | The log pattern. Each log cluster has one or more sub-patterns. |
Move the pointer over a number in the Count column to view the sub-patterns of the log cluster. You can also view the percentage of each sub-pattern in the log cluster. Click the plus sign (+) next to a number in the Count column to expand the sub-pattern list.
Click a number in the Count column. You are navigated to the Raw Logs tab. On this tab, you can view the raw logs of the pattern.
Change the precision of log clustering
On the LogReduce tab, you can adjust the Pattern Count slider to change the precision of log clustering.
If you adjust the slider toward Many, you can obtain a more precise log clustering result that has more detailed patterns.
If you adjust the slider toward Little, you can obtain a less precise log clustering result that has less detailed patterns.
Compare the number of logs that are clustered in different time periods
On the LogReduce tab, click Log Compare.
Specify a time range and click OK.
For example, if you set the time range to 15 minutes when you query logs and specify 1Day for Log Compare, the start time and end time of log comparison are automatically displayed. The time ranges for comparison are the last 15 minutes and the 15 minutes on the previous day.
Parameter | Description |
Number | The ordinal number of the log cluster. |
Pre_Count | The number of logs for the pattern in the time range that is specified by Log Compare. |
Count | The number of logs for the pattern in the time range that is specified for the query. |
Diff | The difference between the values in the Pre_Count and Count columns, together with the growth rate. |
Pattern | The log pattern. |
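The way the comparison time range is derived from the query time range can be sketched as follows. The function `comparison_window` is a hypothetical helper, and the sketch assumes that the comparison range is simply the query range shifted back by the Log Compare interval, as in the 15-minute and 1Day example above.

```python
from datetime import datetime, timedelta
from typing import Tuple


def comparison_window(
    query_start: datetime, query_end: datetime, compare_interval_s: int
) -> Tuple[datetime, datetime]:
    """Shift the query time range back by compare_interval_s seconds."""
    if compare_interval_s <= 0:
        raise ValueError("compare_interval_s must be a positive integer")
    shift = timedelta(seconds=compare_interval_s)
    return query_start - shift, query_end - shift


# A 15-minute query range compared with the same 15 minutes on the previous day.
query_end = datetime(2024, 1, 2, 12, 0)
query_start = query_end - timedelta(minutes=15)
prev_start, prev_end = comparison_window(query_start, query_end, 86400)
print("query range:     ", query_start, "to", query_end)
print("comparison range:", prev_start, "to", prev_end)
```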
Examples of query statements
You can use query statements to obtain log clustering results.
Obtain log clustering results
Query statement
* | select a.pattern, a.count,a.signature, a.origin_signatures from (select log_reduce(3) as a from log) limit 1000
Note: When you view log clustering results, you can click Copy Query to obtain the query statement of the log clustering results.
Modify parameters
Modify the parameter settings in log_reduce(precision) of the query statement. The precision parameter specifies the precision of log clustering. A smaller value indicates a higher precision and more patterns. Valid values: 1 to 16. Default value: 3.
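If you generate the statement in a script, a small helper can enforce the valid range of the precision parameter before the query is sent. The following is a minimal sketch: `build_log_reduce_query` and its `search` and `limit` parameters are hypothetical names, and the helper only assembles a string equivalent to the statement above, with normalized spacing. It does not call Simple Log Service.

```python
def build_log_reduce_query(precision: int = 3, search: str = "*",
                           limit: int = 1000) -> str:
    """Build a LogReduce query statement for the given precision.

    precision must be in the documented range of 1 to 16; a smaller value
    yields a higher precision and more patterns.
    """
    if not 1 <= precision <= 16:
        raise ValueError("precision must be between 1 and 16")
    return (
        f"{search} | select a.pattern, a.count, a.signature, a.origin_signatures "
        f"from (select log_reduce({precision}) as a from log) limit {limit}"
    )


# Default precision, then a finer-grained clustering with more patterns.
print(build_log_reduce_query())
print(build_log_reduce_query(precision=1))
```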
Returned fields
You can view log clustering details on the Graph tab.
Parameter | Description |
pattern | The log pattern. |
count | The number of logs for the pattern in the time range that is specified for the query. |
signature | The signature of the log pattern. |
origin_signatures | The secondary signature of the log pattern. You can use the secondary signature to retrieve the raw logs. |
Compare the number of logs that are clustered in different time periods
Query statement
* | select v.pattern, v.signature, v.count, v.count_compare, v.diff from (select compare_log_reduce(3, 86400) as v from log) order by v.diff desc limit 1000
Note: When you use Log Compare to compare log clustering results in different time periods, you can click Copy Query to obtain the query statement of the log clustering results.
Modify parameters
Modify the parameter settings in compare_log_reduce(precision, compare_interval) of the query statement.
The precision parameter specifies the precision of log clustering. A smaller value indicates a higher precision and more patterns. Valid values: 1 to 16. Default value: 3.
The compare_interval parameter specifies the time difference between the two time ranges for comparison. The value is a positive integer. Unit: seconds.
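Because compare_interval is expressed in seconds, it can be convenient to derive it from a duration instead of hard-coding a number such as 86400. The following sketch assumes a hypothetical helper named `build_compare_query`; it only assembles the statement text and validates both parameters against the constraints listed above.

```python
from datetime import timedelta


def build_compare_query(precision: int, interval: timedelta,
                        limit: int = 1000) -> str:
    """Build a compare_log_reduce query statement.

    interval is converted to whole seconds, the unit that the
    compare_interval parameter expects.
    """
    if not 1 <= precision <= 16:
        raise ValueError("precision must be between 1 and 16")
    seconds = int(interval.total_seconds())
    if seconds <= 0:
        raise ValueError("compare_interval must be a positive number of seconds")
    return (
        "* | select v.pattern, v.signature, v.count, v.count_compare, v.diff "
        f"from (select compare_log_reduce({precision}, {seconds}) as v from log) "
        f"order by v.diff desc limit {limit}"
    )


# One day corresponds to compare_interval = 86400.
print(build_compare_query(3, timedelta(days=1)))
# Compare with the same time range one hour earlier.
print(build_compare_query(3, timedelta(hours=1)))
```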
Returned fields
Parameter | Description |
pattern | The log pattern. |
count_compare | The number of logs for the pattern in the previous time range that is specified for comparison. |
count | The number of logs for the pattern in the time range that is specified for the query. |
diff | The difference between the numbers of logs in the count and count_compare columns. |
signature | The signature of the log pattern. |
Disable the LogReduce feature
If you no longer need to use the LogReduce feature, you can disable the feature.
On the query and analysis page of the Logstore for which you want to disable this feature, choose .
Turn off LogReduce.
Click OK.