Simple Log Service: Machine learning syntax and functions

Last Updated: Dec 13, 2024

Simple Log Service provides a machine learning feature that supports multiple algorithms and calling methods. You can call the algorithms from analytic statements by using machine learning functions to analyze the characteristics of one or more fields within a period of time. The feature offers various time series analysis algorithms for problems that involve time series data. For example, you can predict time series, detect time series anomalies, decompose time series, and cluster multiple time series. The algorithms are also compatible with standard SQL functions, which simplifies their use and makes troubleshooting more efficient.

Features

  • Supports various smoothing operations on single time series data.

  • Supports algorithms for prediction, anomaly detection, change point detection, inflection point detection, and multi-period estimation on single time series data.

  • Supports decomposition operations on single time series data.

  • Supports various clustering algorithms for multiple time series.

  • Supports multi-field pattern mining (based on sequences of numeric data or text).

Limits

The following limits apply when you use the machine learning feature of Simple Log Service:

  • The specified time series data must be sampled at a fixed interval.

  • The specified time series data cannot contain more than one sample for the same point in time.

  • The processing capacity cannot exceed the maximum capacity. The following table describes the limits.

    | Item | Limit |
    | --- | --- |
    | Capacity of the time-series data processing | Data can be collected from a maximum of 150,000 consecutive points in time. If the data volume exceeds the processing capacity, you must aggregate the data or reduce the sampling amount. |
    | Capacity of the density-based clustering algorithm | A maximum of 5,000 time series curves can be clustered at a time. Each curve cannot contain more than 1,440 points in time. |
    | Capacity of the hierarchical clustering algorithm | A maximum of 2,000 time series curves can be clustered at a time. Each curve cannot contain more than 1,440 points in time. |
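
The limits above can be checked on the client side before a query is written. The following Python sketch is an illustration only and is not part of Simple Log Service: the helper names `check_series` and `aggregate` are hypothetical, timestamps are assumed to be integer Unix seconds, and averaging is just one possible way to aggregate data.

```python
from collections import defaultdict

MAX_POINTS = 150_000  # maximum number of points quoted in the limits table


def check_series(timestamps):
    """Return a list of limit violations for a list of integer timestamps."""
    problems = []
    if len(timestamps) > MAX_POINTS:
        problems.append("too many points")
    if len(set(timestamps)) != len(timestamps):
        problems.append("duplicate timestamps")
    # Compare the gaps between consecutive distinct timestamps.
    ordered = sorted(set(timestamps))
    gaps = {b - a for a, b in zip(ordered, ordered[1:])}
    if len(gaps) > 1:
        problems.append("uneven sampling interval")
    return problems


def aggregate(points, bucket_seconds):
    """Average (timestamp, value) pairs into fixed-width time buckets.

    Hypothetical helper: averaging is an assumption, not a documented rule.
    """
    buckets = defaultdict(list)
    for ts, value in points:
        # Align each timestamp to the start of its bucket.
        buckets[ts - ts % bucket_seconds].append(value)
    return [(start, sum(vals) / len(vals)) for start, vals in sorted(buckets.items())]
```

For example, `check_series([0, 60, 180])` reports an uneven sampling interval, and `aggregate(points, 60)` reduces per-second samples to one averaged point per minute, which lowers the point count when a series exceeds the processing capacity.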

Machine learning functions

| Category | Subcategory | Function | Description |
| --- | --- | --- | --- |
| Time series | Smooth function | ts_smooth_simple | Uses the Holt Winters algorithm to smooth time series data. |
| | | ts_smooth_fir | Uses the finite impulse response (FIR) filter to smooth time series data. |
| | | ts_smooth_iir | Uses the infinite impulse response (IIR) filter to smooth time series data. |
| | Multi-period estimation function | ts_period_detect | Estimates time series data by period. |
| | Change point detection function | ts_cp_detect | Detects the intervals in which data has different statistical features. The interval endpoints are change points. |
| | | ts_breakout_detect | Detects the points in time at which data experiences dramatic changes. |
| | Maximum value detection function | ts_find_peaks | Detects the local maximum values of time series data in a specified window. |
| | Prediction and anomaly detection function | ts_predicate_simple | Uses default parameters to model time series data, predict time series data, and detect anomalies. |
| | | ts_predicate_ar | Uses an autoregressive (AR) model to model time series data, predict time series data, and detect anomalies. |
| | | ts_predicate_arma | Uses an autoregressive moving average (ARMA) model to model time series data, predict time series data, and detect anomalies. |
| | | ts_predicate_arima | Uses an autoregressive integrated moving average (ARIMA) model to model time series data, predict time series data, and detect anomalies. |
| | | ts_regression_predict | Predicts the long-run trend for a single periodic time series. |
| | Sequence decomposition function | ts_decompose | Uses the Seasonal and Trend decomposition using Loess (STL) algorithm to decompose time series data. |
| | Time series clustering function | ts_density_cluster | Uses a density-based clustering method to cluster multiple time series. |
| | | ts_hierarchical_cluster | Uses a hierarchical clustering method to cluster multiple time series. |
| | | ts_similar_instance | Queries time series curves that are similar to a specified time series curve. |
| | Kernel density estimation function | kernel_density_estimation | Uses the smooth peak function to fit the observed data points. In this way, the function simulates the real probability distribution curve. |
| | Time series padding function | series_padding | Pads data points that are missing in a time series. |
| | Anomaly comparison function | anomaly_compare | Compares the degree of difference of an observed object in two periods of time. |
| Pattern mining | Frequent pattern statistical function | pattern_stat | Mines representative attribute combinations from the given multi-attribute field samples to identify frequent patterns. |
| | Differential pattern statistical function | pattern_diff | Identifies the pattern that causes differences between two collections in specified conditions. |
| | Root cause analysis function | rca_kpi_search | Analyzes the subdimension attributes that cause anomalies in the monitored metric. |
| | Correlation analysis functions | ts_association_analysis | Identifies the metrics that are correlated to a specified metric among multiple observed metrics in the system. |
| | | ts_similar | Identifies the metrics that are correlated to specified time series data among multiple observed metrics in the system. |
| | Request URL classification function | url_classify | Classifies a request URL and attaches a tag to the URL. The function also provides the regular expression that defines the pattern of the tag. |
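
To make two of the entries above more concrete, the following Python sketch shows the general idea behind FIR smoothing and windowed local-maximum detection. It is a conceptual illustration under stated assumptions, not the implementation or signature of `ts_smooth_fir` or `ts_find_peaks`: the function names `fir_smooth` and `find_peaks`, their parameters, and the strict-maximum rule are all hypothetical.

```python
def fir_smooth(values, weights):
    """Apply a causal FIR filter.

    Each output is a weighted sum of the current sample and the
    len(weights) - 1 samples before it. Hypothetical helper.
    """
    n = len(weights)
    out = []
    for i in range(n - 1, len(values)):
        window = values[i - n + 1 : i + 1]
        # weights[0] is applied to the newest sample in the window.
        out.append(sum(w * v for w, v in zip(weights, reversed(window))))
    return out


def find_peaks(values, window):
    """Return the indices of local maxima.

    A point counts as a peak if it is strictly greater than every
    neighbour within `window` points on either side (an assumed rule).
    """
    peaks = []
    for i in range(len(values)):
        lo, hi = max(0, i - window), min(len(values), i + window + 1)
        neighbours = values[lo:i] + values[i + 1 : hi]
        if neighbours and all(values[i] > v for v in neighbours):
            peaks.append(i)
    return peaks
```

With equal weights such as `[0.5, 0.5]`, `fir_smooth` reduces to a two-point moving average. Widening `window` in `find_peaks` keeps only the more prominent maxima, which mirrors the role of the window mentioned in the description of the maximum value detection function.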