How to use SQL statements to aggregate metrics for model training - Simple Log Service

The intelligent inspection feature inspects service data and identifies anomalies in an automated, intelligent, and adaptive manner. This topic describes how to use SQL statements to aggregate metrics for model training.

Prerequisites

Data is collected and stored in a Logstore, which is referred to as the source Logstore. For more information, see Data collection overview.
Indexes are configured for the source Logstore. For more information, see Create indexes.
An Intelligent Anomaly Analysis instance is created. For more information, see Create an instance.

Create an intelligent inspection job

Go to the Create Intelligent Inspection Job wizard

Log on to the Simple Log Service console.
Go to the Create Intelligent Inspection Job wizard.
1. In the Log Application section, click Intelligent Anomaly Analysis.
2. In the instance list, click the ID of the instance for which you want to create an intelligent inspection job.
3. In the left-side navigation pane, click Intelligent Inspection.
4. Click Real-time Inspection.
5. In the Inspection Job section, click Create Now.

Basic Information

In the Basic Information step of the Create Intelligent Inspection Job wizard, configure the parameters and click Next. The following table describes the parameters.

Parameter	Description
Job Name	The name of the intelligent inspection job. You can enter a custom name.
Project	The project to which the source Logstore or Metricstore belongs.
Region	The region where the project resides.
Logstore Type	The storage unit in which your data is stored. If your data is stored in a Logstore, select Logstores. If your data is stored in a Metricstore, select Metricstores.
Source Logstore	The Logstore in which your source data is stored. This parameter is required only if you set the Logstore Type parameter to Logstores.
Metricstores	The Metricstore in which your source data is stored. This parameter is required only if you set the Logstore Type parameter to Metricstores.
Role	The Alibaba Cloud Resource Name (ARN) of `AliyunLogETLRole`. If you completed authorization when you created the instance, the ARN is automatically displayed.
Target Store	The destination Logstore. This parameter is automatically set to `internal-ml-log`.

Data Feature Settings

In the Data Feature Settings step, select the tab that matches your data. If the time series data that you want to analyze contains an anomaly label, select the Data Feature Settings tab. If the time series data does not contain an anomaly label, select the Anomaly Injection tab. The following sections describe the parameters on each tab.

For information about query statements, see Log search overview and Log analysis overview.

Data Feature Settings

Sample query statement

* | select (__time__ - __time__%60) as time, entity, count(*) as metric, if(count(*) > 1000, 1, 0) as label from log group by time, entity limit 1000000

Label Name: label
Entity: entity
Feature: metric

Parameter	Description
Time	The field that specifies time in source data.
Granularity	The interval at which data is observed. Unit: seconds. Valid values: 5 to 3600. We recommend that you set this parameter to a value that is no less than 60.
Entity	The field that specifies an entity in source data. The intelligent inspection job aggregates data to generate a time series for the entity based on the specified field.
Feature	The field that specifies a feature in source data.
Label Name	The field that identifies the anomaly label in the source data. Valid values: 1 (the data point is anomalous) and 0 (the data point is normal).
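
The sample query statement on this tab can be hard to read at a glance, so the following Python sketch mirrors what it computes: each log is assigned to a 60-second bucket, logs are counted per bucket and entity, and a bucket is labeled 1 when its count exceeds 1000. The function name `aggregate`, the `(timestamp, entity)` input shape, and the sample data are assumptions made for illustration. The sketch does not call Simple Log Service.

```python
from collections import Counter


def aggregate(logs, granularity=60, threshold=1000):
    """Bucket logs by time, count them per entity, and derive a 0/1 label.

    `logs` is an iterable of (unix_timestamp, entity) pairs (hypothetical shape).
    """
    counts = Counter()
    for ts, entity in logs:
        # Same arithmetic as (__time__ - __time__%60) in the sample query.
        bucket = ts - ts % granularity
        counts[(bucket, entity)] += 1
    return [
        {"time": t, "entity": e, "metric": n, "label": 1 if n > threshold else 0}
        for (t, e), n in sorted(counts.items())
    ]


# Tiny illustrative input; the threshold is lowered so that a label of 1 appears.
logs = [(125, "host-a"), (170, "host-a"), (185, "host-a"), (130, "host-b")]
rows = aggregate(logs, granularity=60, threshold=1)
for row in rows:
    print(row)
```

In the job configuration, the `time`, `entity`, `metric`, and `label` columns that this sketch produces correspond to the Time, Entity, Feature, and Label Name parameters.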

Anomaly Injection

Sample query statement

* | select (__time__ - __time__%60) as time, entity, count(*) as metric from log group by time, entity limit 1000000

Entity: entity
Feature: metric
Anomaly Rate: 0.001

Parameter	Description
Time	The field that specifies time in source data.
Granularity	The interval at which data is observed. Unit: seconds. Valid values: 5 to 3600. We recommend that you set this parameter to a value that is no less than 60.
Entity	The field that specifies an entity in source data. The intelligent inspection job aggregates data to generate a time series for the entity based on the specified field.
Feature	The field that specifies a feature in source data.
Anomaly Injection	Specifies whether to save the anomalous data that is injected.
Anomaly Rate	The ratio of the injected anomalous data to the time series data. For example, if you set this parameter to 0.001, 0.1% of the time series data is anomalous.
Anomaly Type	The types of anomalies that are injected into the feature sequence.
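
To make the Anomaly Rate parameter concrete, the following Python sketch marks a fraction of the points in a series as anomalous and perturbs them. How the service actually injects anomalies is not described in this topic, so the spike-by-multiplication approach, the function name `inject_anomalies`, and the `scale` and `seed` parameters are assumptions made for illustration only.

```python
import random


def inject_anomalies(series, anomaly_rate=0.001, scale=5.0, seed=0):
    """Return (injected_series, labels) with about `anomaly_rate` of points perturbed.

    The spike shape (value * scale) is a hypothetical stand-in for an anomaly type.
    """
    rng = random.Random(seed)
    n_anomalies = round(len(series) * anomaly_rate)
    positions = set(rng.sample(range(len(series)), n_anomalies))
    injected = [v * scale if i in positions else v for i, v in enumerate(series)]
    labels = [1 if i in positions else 0 for i in range(len(series))]
    return injected, labels


# With 10,000 points and a rate of 0.001, 10 points (0.1%) are marked anomalous.
series = [100.0] * 10000
injected, labels = inject_anomalies(series, anomaly_rate=0.001)
print(sum(labels), "of", len(series), "points are labeled as anomalies")
```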

Algorithm Configurations

In the Algorithm Configurations step, configure the Algorithm parameter. Only supervised anomaly detection algorithms are supported.

In the Scheduling Settings section, configure the parameters. The following table describes the parameters.

Parameter	Description
Start At	The start time of the time series whose data the model training task processes.
End Time	The end time of the time series whose data the model training task processes.
End Time of Model Learning	The end time of the time series that is used for model training. The value of this parameter must be later than the value of the Start At parameter and earlier than the value of the End Time parameter. The time series data generated between the Start At time and the End Time of Model Learning time is used for model training. The time series data generated between the End Time of Model Learning time and the End Time time is used for model validation.
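
The relationship among the three scheduling parameters can be sketched in Python as follows. The function name `split_series`, the `(timestamp, value)` input shape, and the choice of half-open intervals at the boundaries are assumptions. This topic does not state whether the boundary points are inclusive.

```python
def split_series(points, start_at, learn_end, end_time):
    """Split (timestamp, value) points into training and validation sets.

    start_at  -> Start At
    learn_end -> End Time of Model Learning
    end_time  -> End Time
    """
    if not (start_at < learn_end < end_time):
        raise ValueError("Expected Start At < End Time of Model Learning < End Time")
    # Assumption: each interval includes its start and excludes its end.
    train = [(t, v) for t, v in points if start_at <= t < learn_end]
    validate = [(t, v) for t, v in points if learn_end <= t < end_time]
    return train, validate


# Ten points at a 60-second granularity (timestamps 0, 60, ..., 540).
points = [(t, float(t)) for t in range(0, 600, 60)]
train, validate = split_series(points, start_at=60, learn_end=300, end_time=480)
print(len(train), "training points,", len(validate), "validation points")
```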

Related operations

Evaluate inspection results in alert notifications