Simple Log Service: Field analysis

Last Updated: Jun 25, 2024

Simple Log Service supports field analysis, which runs statistical analysis on fields of the text, long, or double type. The feature reports the basic distribution and related metrics of each field and provides a time series chart for the top 5 values of each field. You can use the analysis results for data insight and visualization.

Prerequisites

The indexing and statistics features are enabled for the fields that you want to analyze. For more information, see Create indexes.

For example, if a log entry contains the request_method and request_time fields, you can configure indexes for the two fields. The following figure shows the configurations. [Figure: Query by specified fields]

Limits

Field analysis covers all logs that match the query conditions within the time range selected on the query page. If fewer than 100 million log entries match, all of them are analyzed. If more than 100 million log entries match, about 100 million log entries are sampled and analyzed. To avoid sampling, we recommend that you narrow the time range or add filter conditions.
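The sampling limit can be turned into a quick estimate of how much of the matching data is analyzed. The following is a minimal sketch, not the service's implementation: the function name `analyzed_fraction` and the constant `SAMPLE_LIMIT` are hypothetical, and the sketch assumes that exactly 100 million entries are sampled, whereas the limit is described only as "about" 100 million.

```python
# Hypothetical helper: estimates the share of matching log entries that
# field analysis covers, based on the 100 million sampling limit.
SAMPLE_LIMIT = 100_000_000  # assumption: treated here as an exact cutoff


def analyzed_fraction(matching_logs: int) -> float:
    """Return the estimated fraction of matching log entries that are analyzed."""
    if matching_logs <= SAMPLE_LIMIT:
        # At or below the limit, every matching entry is analyzed.
        return 1.0
    # Above the limit, roughly SAMPLE_LIMIT entries are sampled.
    return SAMPLE_LIMIT / matching_logs
```

For example, a query that matches 400 million log entries would have roughly a quarter of its entries analyzed, which suggests shortening the time range to about a quarter if full coverage is needed.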

View field analysis results

  1. Log on to the Simple Log Service console.

  2. In the Projects section, click the project that you want to manage.

  3. In the left-side navigation pane, click Log Storage. In the Logstores list, click the Logstore that you want to manage.

  4. View field analysis results.

Field description

Details of a field of the text type

Basic Distribution

| Parameter | Description |
| --- | --- |
| Total Logs | The total number of logs based on the time range and query conditions that are specified on the current query page. |
| Total Logs of Current Field | The total number of logs that contain the specified field in the current query conditions. |
| Total Missing Logs | Total Logs - Total Logs of Current Field |
| Missing Log Proportion | Total Missing Logs / Total Logs |
| Total Distinct Values | The number of distinct values of the current field, which is calculated by using the approx_distinct function. |
| Distinct Value Proportion | Total Distinct Values / Total Logs |
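The relationships among the Basic Distribution parameters can be reproduced locally on a small set of logs. The following is a minimal sketch under stated assumptions: logs are modeled as Python dictionaries, the function name `basic_distribution` is hypothetical, and distinct values are counted exactly with a set, whereas the console uses the approx_distinct function and may return an approximate count.

```python
def basic_distribution(logs, field):
    """Compute the Basic Distribution parameters for one field.

    `logs` is a list of dicts; a log "contains" the field if the key is present.
    """
    total_logs = len(logs)
    values = [log[field] for log in logs if field in log]
    total_with_field = len(values)
    total_missing = total_logs - total_with_field
    # Exact distinct count (assumption: stands in for approx_distinct).
    distinct = len(set(values))
    return {
        "total_logs": total_logs,
        "total_logs_of_current_field": total_with_field,
        "total_missing_logs": total_missing,
        "missing_log_proportion": total_missing / total_logs if total_logs else 0.0,
        "total_distinct_values": distinct,
        "distinct_value_proportion": distinct / total_logs if total_logs else 0.0,
    }


sample_logs = [
    {"request_method": "GET"},
    {"request_method": "POST"},
    {"request_method": "GET"},
    {"status": "200"},  # this log does not contain request_method
]
result = basic_distribution(sample_logs, "request_method")
```

Note that both proportions use Total Logs as the denominator, so a field that is missing from many logs has a low Distinct Value Proportion even if every present value is unique.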

Statistical Metrics

| Parameter | Description |
| --- | --- |
| Maximum Length | The maximum length of the field value. |
| Minimum Length | The minimum length of the field value. |
| Average Length | The average length of the field values. |
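The length metrics for a text field can be illustrated with a few lines of Python. This is a sketch only: the function name `length_metrics` is hypothetical, and it assumes that length is measured in characters, which this page does not specify.

```python
def length_metrics(values):
    """Return the maximum, minimum, and average length of text field values."""
    lengths = [len(v) for v in values]  # assumption: length in characters
    if not lengths:
        return None  # no values, so no metrics can be computed
    return {
        "maximum_length": max(lengths),
        "minimum_length": min(lengths),
        "average_length": sum(lengths) / len(lengths),
    }


metrics = length_metrics(["GET", "POST", "DELETE", "PUT"])
```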

Time Series Chart of TOP N Values

The top 5 values within a time range are collected, and the value changes over time are displayed.

Click the icon to the right of Time Series Chart of TOP N Values to add the time series chart to a dashboard. For more information, see Create a dashboard.
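The idea behind the chart can be sketched as follows: find the most frequent values over the whole time range, then count how often each of them occurs in each time bucket. The function name `top_n_time_series`, the (timestamp, value) input shape, and the 60-second bucket size are assumptions for illustration, as is ranking values by occurrence count. The console's actual bucketing is not described on this page.

```python
from collections import Counter, defaultdict


def top_n_time_series(events, n=5, bucket_seconds=60):
    """Build per-bucket counts for the n most frequent values.

    `events` is an iterable of (unix_timestamp, value) pairs.
    Returns {value: {bucket_start: count}} for the top n values.
    """
    events = list(events)
    # Rank values by how often they occur across the whole time range.
    totals = Counter(value for _, value in events)
    top_values = [value for value, _ in totals.most_common(n)]

    series = {value: defaultdict(int) for value in top_values}
    for timestamp, value in events:
        if value in series:
            # Align the timestamp to the start of its bucket.
            bucket = timestamp - timestamp % bucket_seconds
            series[value][bucket] += 1
    return {value: dict(sorted(buckets.items())) for value, buckets in series.items()}


events = [(0, "GET"), (10, "GET"), (70, "GET"), (5, "POST"), (61, "POST"), (65, "PUT")]
series = top_n_time_series(events, n=2)
```

In this example, `PUT` is excluded because only the two most frequent values are kept, and each remaining value maps to its counts per one-minute bucket.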

Details of a field of the long or double type

Basic Distribution

| Parameter | Description |
| --- | --- |
| Total Logs | The total number of logs based on the time range and query conditions that are specified on the current query page. |
| Total Logs of Current Field | The total number of logs that contain the specified field in the current query conditions. |
| Total Distinct Values | The number of distinct values of the current field, which is calculated by using the approx_distinct function. |
| Distinct Value Proportion | Total Distinct Values / Total Logs |

Statistical Metrics

| Parameter | Description |
| --- | --- |
| Maximum Value | The maximum value of the field. |
| Minimum Value | The minimum value of the field. |
| Average | The average value of the field. |
| Median Value | The value at the middle position after the data is sorted in ascending order. |
| Quartile Q1 | The value at the 25% position after the data is sorted in ascending order. |
| Quartile Q3 | The value at the 75% position after the data is sorted in ascending order. |
| Sample Standard Deviation | The sample standard deviation of the field, which is calculated by using the stddev_samp function. |
| Population Standard Deviation | The population standard deviation of the field, which is calculated by using the stddev_pop function. |
| Kurtosis | A statistical measure that indicates how concentrated (peaked) the data distribution is. |
| Skewness | A statistical measure that indicates the degree of asymmetry in the data distribution. |
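To make these metrics concrete, the following sketch computes them for a small list of numbers by using the Python standard library. It is an illustration under assumptions, not the service's implementation: the function name `numeric_metrics` is hypothetical, quartiles use the `inclusive` interpolation method, and kurtosis and skewness use population moment formulas (with kurtosis reported as excess kurtosis). The exact estimators used by the console are not stated on this page, so values may differ slightly.

```python
import statistics


def numeric_metrics(values):
    """Compute Statistical Metrics for a list of at least two numbers."""
    n = len(values)
    mean = statistics.fmean(values)
    # Three cut points that split the sorted data into four equal parts.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    sample_std = statistics.stdev(values)       # counterpart of stddev_samp
    population_std = statistics.pstdev(values)  # counterpart of stddev_pop

    # Assumption: population central moments for skewness and excess kurtosis.
    m3 = sum((x - mean) ** 3 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    skewness = m3 / population_std ** 3 if population_std else 0.0
    kurtosis = m4 / population_std ** 4 - 3 if population_std else 0.0

    return {
        "maximum_value": max(values),
        "minimum_value": min(values),
        "average": mean,
        "median_value": median,
        "quartile_q1": q1,
        "quartile_q3": q3,
        "sample_standard_deviation": sample_std,
        "population_standard_deviation": population_std,
        "kurtosis": kurtosis,
        "skewness": skewness,
    }


stats = numeric_metrics([1, 2, 3, 4, 5])
```

The sample standard deviation divides by n - 1 and is therefore always at least as large as the population standard deviation, which divides by n. A skewness near 0 indicates a roughly symmetric distribution.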

Numerical Value Distribution Histogram

The value range is split into 10 intervals, and an approximate histogram is generated.

Click the icon to the right of Numerical Value Distribution Histogram to add the histogram to a dashboard. For more information, see Create a dashboard.
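A 10-interval histogram can be sketched with equal-width bins between the minimum and maximum values. This is an assumption for illustration: the function name `histogram` is hypothetical, and the console generates an approximate histogram whose interval boundaries may not be equal-width.

```python
def histogram(values, bins=10):
    """Split the value range into equal-width intervals and count values in each.

    Returns a list of (interval_start, interval_end, count) tuples.
    """
    low, high = min(values), max(values)
    if low == high:
        # All values are identical, so a single interval holds everything.
        return [(low, high, len(values))]
    width = (high - low) / bins
    counts = [0] * bins
    for x in values:
        # The maximum value falls on the upper edge, so clamp it into the last bin.
        index = min(int((x - low) / width), bins - 1)
        counts[index] += 1
    return [(low + i * width, low + (i + 1) * width, c) for i, c in enumerate(counts)]


buckets = histogram(list(range(0, 101)))
counts = [count for _, _, count in buckets]
```

Each interval is closed on the left and open on the right, except the last one, which also includes the maximum value.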