Simple Log Service: Algorithms

Last Updated: Aug 15, 2024

The intelligent inspection feature of Simple Log Service inspects data such as metrics and business logs and identifies anomalies in the data in an automated, intelligent, and adaptive manner. The feature uses three algorithms to inspect data: the stream graph algorithm, the stream decomposition algorithm, and the supervised anomaly detection algorithm. This topic describes the scenarios, parameter settings, and preview of these algorithms.

Stream graph algorithm

The stream graph algorithm is based on Time2Graph. It reduces data noise and calculates the offset of each abnormal sample. Use this algorithm to inspect a large number of time series that have significant noise and no pronounced cyclic changes. For more information, see Time-Series Event Prediction with Evolutionary State Graph.

Scenarios

The stream graph algorithm uses online machine learning methods to analyze each sample and learn from the sample data in real time. You can use this algorithm to identify anomalies in the following types of time series:

  • Machine-level metrics, such as CPU utilization, memory usage, and disk read and write speeds

  • Performance metrics, such as queries per second (QPS), traffic volume, success rates, and latency

  • Golden metrics
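The "score each sample, then learn from it" pattern described above can be sketched as follows. This is only an illustration of online learning on a metric stream, not the stream graph algorithm itself: it swaps in a running mean and variance (Welford's method) with a z-score threshold. The names `OnlineZScore` and `inspect` and the threshold of 3.0 are assumptions made for this example.

```python
import math


class OnlineZScore:
    """Running mean/variance (Welford's method); a stand-in model, not Time2Graph."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # sum of squared deviations from the running mean

    def score(self, x):
        """Return |z| of x against the samples seen so far (0.0 while warming up)."""
        if self.n < 2:
            return 0.0
        std = math.sqrt(self.m2 / (self.n - 1))
        return abs(x - self.mean) / std if std > 0 else 0.0

    def update(self, x):
        """Learn from one sample in constant time and memory."""
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)


def inspect(stream, threshold=3.0):
    """Score each sample first, then learn from it; yield anomalies as (index, value, score)."""
    model = OnlineZScore()
    for i, x in enumerate(stream):
        s = model.score(x)
        if s > threshold:
            yield i, x, s
        model.update(x)


# Hypothetical CPU utilization samples (percent) with one spike.
cpu = [41.0, 39.5, 40.2, 40.8, 39.9, 40.4, 95.0, 40.1, 39.7]
anomalies = list(inspect(cpu))
```

Because the model is updated after every sample, including anomalous ones, no historical data has to be stored or reprocessed, which is what makes this style of inspection practical for a large number of time series.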

Parameters

You can configure the parameters of the stream graph algorithm in the Algorithm Configurations section in the Algorithm Configurations step of the Create Intelligent Inspection Job wizard. For more information, see Use SQL statements to aggregate metrics for real-time inspection.

Parameter

Subparameter

Description

(Required) Advanced Parameters

Time Series Segments

The number of segments into which the time series of the specified metric are discretized. The discretization helps construct metric charts.

  • Default value: 8.

  • We recommend that you set this parameter to a value within the range of 5 to 20.

  • The sensitivity of anomaly detection decreases linearly as the value of this parameter increases.

Observation Length

The number of historical samples that you want to inspect during anomaly detection.

  • Default value: 2880.

  • We recommend that you set this parameter to a value within the range of 200 to 4000.

  • If the time series that you want to inspect are cyclical, we recommend that you set this parameter to at least the number of samples in two observation cycles. For example, if the observation granularity is 1 minute and the observation cycle is 1 day, two cycles (2 days) contain 2 × 1,440 = 2,880 samples. In this case, we recommend that you set this parameter to a value that is greater than or equal to 2880.

Period-over-period Comparison Length

The period of time based on which period-over-period analysis is performed. Unit: days. Period-over-period analysis is performed on the metrics that you want to inspect during anomaly detection. If you set this parameter to 0, the algorithm does not perform period-over-period analysis.

Major Capture Type

The type of time series anomalies that require special attention. Valid values:

  • Upward Spike: The value of the metric suddenly increases at a specific point in time.

  • Downward Spike: The value of the metric suddenly decreases at a specific point in time.

  • Upward Shift: The value of the metric increases and stabilizes over a specific period of time.

  • Downward Shift: The value of the metric decreases and stabilizes over a specific period of time.

  • Upward Trend: The value of the metric continuously increases over a specific period of time.

  • Downward Trend: The value of the metric continuously decreases over a specific period of time.

Trees

The number of decision trees. The stream graph algorithm uses decision trees for auxiliary inspection.

Sample Size per Tree

The number of samples that the stream graph algorithm draws from the inspected data to construct each tree during anomaly detection.

Overall Anomaly Rate

The estimated rate of anomalous data that is included in the time series. Valid values: 0.001 to 0.01.

Minimum Window of Anomaly Type Check

The minimum length of the time series that are referenced during anomaly capture.

Maximum Window of Anomaly Type Check

The maximum length of the time series that are referenced during anomaly capture.

Minimum Window for Anomaly Confirmation

The minimum length of the time series that you want to inspect during anomaly capture.

Maximum Window for Anomaly Confirmation

The maximum length of the time series that you want to inspect during anomaly capture.

Single-dimension Feature Configuration

-

The features of the time series that you want to inspect. You must separately configure the following features:

  • Maximum Value: the maximum value of the time series.

  • Minimum Value: the minimum value of the time series.

  • Normalization: the method that is used to normalize the time series when you inspect the time series.

  • Anomaly Type to Follow: the type of the anomalies that require special attention when you inspect the time series.

Notification Sensitivity Configuration

-

The threshold based on which alert notifications are sent. You must configure different thresholds for anomalies that are detected in different periods of time. For example, you can ignore anomalies that are detected during the scheduled weekly maintenance period of the service.
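The idea behind the Notification Sensitivity Configuration parameter, different thresholds for different periods of time, can be sketched as follows. This is a hypothetical illustration and not the configuration format of Simple Log Service: the rule layout, the names `RULES`, `threshold_at`, and `should_notify`, the maintenance window, and the assumption that anomaly scores fall between 0 and 1 are all invented for this example.

```python
from datetime import datetime, time

# Hypothetical rules: (weekday, start, end, threshold), with Monday = 0.
# A threshold of None suppresses notifications for that window.
RULES = [
    (6, time(2, 0), time(4, 0), None),  # assumed weekly maintenance: Sunday 02:00 to 04:00
]
DEFAULT_THRESHOLD = 0.8  # assumed default anomaly-score threshold


def threshold_at(ts):
    """Return the threshold that applies at timestamp ts."""
    for weekday, start, end, threshold in RULES:
        if ts.weekday() == weekday and start <= ts.time() < end:
            return threshold
    return DEFAULT_THRESHOLD


def should_notify(ts, score):
    """Notify only if the window is not suppressed and the score reaches its threshold."""
    threshold = threshold_at(ts)
    return threshold is not None and score >= threshold


in_maintenance = should_notify(datetime(2024, 8, 18, 3, 0), 0.99)   # Sunday
outside_high = should_notify(datetime(2024, 8, 19, 3, 0), 0.99)     # Monday
outside_low = should_notify(datetime(2024, 8, 19, 3, 0), 0.50)      # Monday
```

With these assumed rules, a high score during the Sunday maintenance window is ignored, while the same score on Monday triggers a notification.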

Stream decomposition algorithm

The stream decomposition algorithm is developed based on RobustSTL. This algorithm supports batch processing but generates higher costs than the stream graph algorithm. The stream decomposition algorithm is suitable for scenarios in which you want to inspect a small number of performance metrics in a precise manner. If you want to analyze a large amount of data, we recommend that you split the data into batches or use the stream graph algorithm. For more information, see RobustSTL: A Robust Seasonal-Trend Decomposition Algorithm for Long Time Series.
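To make the idea of seasonal-trend decomposition concrete, the following sketch splits a series into trend, seasonal, and noise components and looks for the anomaly in the noise. It uses the classical moving-average decomposition, which is much simpler than RobustSTL and is shown only as an illustration. The function name `decompose` and the sample data are assumptions made for this example.

```python
from statistics import mean


def decompose(series, period):
    """Additive decomposition: series[i] = trend[i] + seasonal[i] + noise[i]."""
    n = len(series)
    half = period // 2
    # Trend: moving average over about one period; the window is clipped at the edges.
    trend = [mean(series[max(0, i - half):min(n, i + half + 1)]) for i in range(n)]
    detrended = [x - t for x, t in zip(series, trend)]
    # Seasonal: average detrended value at each phase of the cycle.
    phase_mean = [mean(detrended[p::period]) for p in range(period)]
    seasonal = [phase_mean[i % period] for i in range(n)]
    # Noise: whatever the trend and seasonal components do not explain.
    noise = [d - s for d, s in zip(detrended, seasonal)]
    return trend, seasonal, noise


# Hypothetical metric: a repeating 5-point pattern around 100, with a spike at index 17.
pattern = [0.0, 4.0, 10.0, 4.0, 2.0]
series = [100.0 + pattern[i % 5] for i in range(30)]
series[17] += 30.0

trend, seasonal, noise = decompose(series, period=5)
most_anomalous = max(range(len(noise)), key=lambda i: abs(noise[i]))
```

The regular cycle ends up in the seasonal component, so the spike stands out in the noise component even though its raw value is not far above the normal peaks of the cycle. Decomposition of this kind processes a whole window of history for every series, which is consistent with the higher cost noted above.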

Scenarios

You can use the stream decomposition algorithm to inspect data that has major cyclic changes, such as performance metrics that follow a pronounced cycle.

Note

Examples of data with cyclic changes include the number of visits to a game and the number of orders placed by customers.

Parameters

You can configure the parameters of the stream decomposition algorithm in the Algorithm Configurations section in the Algorithm Configurations step of the Create Intelligent Inspection Job wizard. For more information, see Use SQL statements to aggregate metrics for real-time inspection.

  1. Configure an algorithm

    Parameter

    Subparameter

    Description

    Automatic Periodic Detection

    -

    Specifies whether to enable automatic periodic detection. Automatic periodic detection is suitable for scenarios in which time series data has seasonality. If the seasonality of the time series is constant, we recommend that you disable automatic periodic detection and manually configure the period length.

    Periodic Detection Frequency

    -

    The frequency at which periodic detection is performed. This parameter takes effect only if you enable automatic periodic detection. The algorithm periodically updates the seasonality of the time series based on the configured frequency. For example, if you set the value to 12 hours, the algorithm automatically detects and updates the seasonality of the time series every 12 hours.

    Period Length

    -

    The time length of the seasonality of the time series. This parameter takes effect only if you disable automatic periodic detection. If the time series has no seasonality, set the value to 0.

    Observation Length

    -

    The length of time during which historical data is referenced during anomaly detection. If the time series has seasonality, we recommend that you set the value to three times the value of the Period Length parameter. For example, if you set the Period Length parameter to 1 day, set this parameter to 3 days.

    Sensitivity

    -

    The detection sensitivity. The number of detected anomalies and the anomaly score linearly increase with the value of this parameter. If you set this parameter to a large value, the anomaly recall rate is high and the detection accuracy is low.

    Advanced Parameters

    Trend Component Sensitivity

    The sensitivity of the trend component. The algorithm decomposes the time series into the trend component, seasonal component, and noise component. During the anomaly detection of the trend component, the number of detected anomalies and the anomaly score linearly increase with the sensitivity of the trend component. If you set this parameter to a large value, the anomaly recall rate is high and the detection accuracy is low.

    Noise Sensitivity

    The sensitivity of the noise component. The algorithm decomposes the time series into the trend component, seasonal component, and noise component. During the anomaly detection of the noise component, the number of detected anomalies and the anomaly score linearly increase with the sensitivity of the noise component. If you set this parameter to a large value, the anomaly recall rate is high and the detection accuracy is low.

    Trend Component Sampling Step

    The sampling step of the trend component. The algorithm decomposes the time series into the trend component, seasonal component, and noise component. If the length of the observed time series is excessively long, the analysis of the trend component is slow. If you set this parameter to a large value, the analysis of the trend component is fast. However, the detection accuracy of the trend component may be reduced. For example, if you set this parameter to 8, one data point out of every eight data points is sampled from the original time series for trend component analysis.

    Seasonal Component Sampling Step

    The sampling step of the seasonal component. The algorithm decomposes the time series into the trend component, seasonal component, and noise component. If the length of the observed time series is excessively long, the analysis of the seasonal component is slow. If you set this parameter to a large value, the analysis of the seasonal component is fast. However, the detection accuracy of the seasonal component may be reduced. For example, if you set this parameter to 8, one data point out of every eight data points is sampled from the original time series for seasonal component analysis. We recommend that you set this parameter to a value no greater than 5.

    Window Length

    The length of the sliding window that is used for segmented detection. If the observed time series is excessively long, anomaly detection is slow. After you specify this parameter, the algorithm detects data segment by segment in sliding windows to improve the detection speed. We recommend that you set this parameter to a value no greater than 5000. If you do not want the algorithm to detect data in sliding windows, set this parameter to 0.

  2. In the preview section, click Show to view the configuration result of the algorithm.

    1. Specify the time range during which detection is performed on the time series. Click Data Query to process the data within the specified time range and generate time series data by using the query statement that is configured in the Data Feature Settings step.

    2. Configure the Entity Information and Feature parameters to determine the sequence of features to be detected. Click Preview to call the detection algorithm to process the specified feature sequence. The detection result is displayed in the lower part of the page. Click Display Parameters to display the configurations of the algorithm.

    3. Trend Component Preview, Seasonal Component Preview, and Noise Preview are displayed in the detection result. You can change the anomaly thresholds for Trend Component Preview and Noise Preview. This way, alerts are generated only when the anomaly score is greater than the specified thresholds.
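The effect of the Observation Length, sampling step, and Window Length parameters from step 1 on the amount of data that is processed can be estimated with a short sketch. The helper names below are hypothetical, the 1-minute granularity is an assumption, and the use of consecutive, non-overlapping windows is a simplification, because this topic does not specify how the sliding windows are laid out.

```python
def observation_points(period_points, cycles=3):
    """Points of history when the observation length spans `cycles` periods."""
    return period_points * cycles


def downsample(series, step):
    """Keep one point out of every `step` points; a step of 1 or less keeps everything."""
    return list(series) if step <= 1 else list(series[::step])


def windows(n_points, window_length):
    """Split n_points into consecutive (start, end) index ranges; 0 disables windowing."""
    if window_length <= 0 or window_length >= n_points:
        return [(0, n_points)]
    return [(s, min(s + window_length, n_points)) for s in range(0, n_points, window_length)]


# Assumed setup: 1-minute granularity and a Period Length of 1 day (1,440 points).
history = observation_points(1440)                  # 3 days of history
sampled = downsample(list(range(history)), step=8)  # one point out of every eight
segments = windows(history, 2000)                   # Window Length of 2000 points
```

Under these assumptions, a sampling step of 8 reduces the 4,320 observed points to 540 points for component analysis, and a window length of 2000 splits the history into three segments. This is the trade-off described in the parameter table: larger steps and shorter windows speed up detection, but may reduce detection accuracy.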

Supervised anomaly detection algorithm

The supervised anomaly detection algorithm constructs features for time series data. The algorithm uses the features and anomaly labels of time series data to train supervised classification models, such as decision trees and random forests. After the models are trained, the algorithm uses the trained models to perform anomaly detection.
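The workflow above (construct features, train a classifier on labeled data, then detect with the trained model) can be sketched in a few lines. This is a minimal illustration and not the implementation used by Simple Log Service: the three features are assumptions, and the model is a decision stump, that is, a decision tree of depth 1, rather than a full decision tree or random forest. All function names and sample data are hypothetical.

```python
from statistics import mean, pstdev


def features(series, i, window=5):
    """Features for point i, built from the `window` points that precede it."""
    past = series[i - window:i]
    mu, sigma = mean(past), pstdev(past)
    return [
        series[i] - mu,                         # deviation from the recent mean
        abs(series[i] - mu) / (sigma or 1.0),   # deviation scaled by the recent spread
        series[i] - series[i - 1],              # first difference
    ]


def train_stump(X, y):
    """Fit a depth-1 decision tree: the (feature, threshold) pair with the fewest errors."""
    best = None
    for f in range(len(X[0])):
        for t in sorted({row[f] for row in X}):
            errors = sum((row[f] >= t) != bool(label) for row, label in zip(X, y))
            if best is None or errors < best[0]:
                best = (errors, f, t)
    return best[1], best[2]


def predict(stump, x):
    """Classify a feature vector as anomalous (True) or normal (False)."""
    f, t = stump
    return x[f] >= t


# Hypothetical labeled training series: 1 marks an anomaly.
train = [10, 11, 10, 12, 11, 10, 30, 11, 10, 12, 11, 10, 11, 28, 10, 11]
labels = [0] * len(train)
labels[6] = labels[13] = 1

W = 5
X = [features(train, i, W) for i in range(W, len(train))]
stump = train_stump(X, labels[W:])

# Detection on a new, unlabeled series.
new = [10, 11, 12, 10, 11, 10, 11, 29, 10, 11]
detected = [i for i in range(W, len(new)) if predict(stump, features(new, i, W))]
```

A stump of this form can only flag values on one side of a single threshold. Real decision trees and random forests combine many such splits, but the train-then-detect flow is the same.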

Scenarios

The supervised anomaly detection algorithm is suitable for time series data that contains anomaly labels and for time series data that cannot be processed by the stream graph algorithm or stream decomposition algorithm.

Parameters

You can configure the parameters of the supervised anomaly detection algorithm in the Algorithm Configurations section in the Algorithm Configurations step of the Create Model Training Job wizard. For more information, see Use SQL statements to aggregate metrics for model training.