Simple Log Service: Use SQL statements to aggregate metrics for model training

Last Updated: Jul 03, 2024

The intelligent inspection feature inspects service data and identifies anomalies in an automated, intelligent, and adaptive manner. This topic describes how to use SQL statements to aggregate metrics for model training.

Prerequisites

  • Data is collected and stored in a Logstore, which is referred to as the source Logstore. For more information, see Data collection overview.

  • Indexes are configured for the source Logstore. For more information, see Create indexes.

  • An Intelligent Anomaly Analysis instance is created. For more information, see Create an instance.

Create an intelligent inspection job

Go to the Create Intelligent Inspection Job wizard

  1. Log on to the Simple Log Service console.

  2. Go to the Create Intelligent Inspection Job wizard.

    1. In the Log Application section, click Intelligent Anomaly Analysis.

    2. In the instance list, click the ID of the instance for which you want to create an intelligent inspection job.

    3. In the left-side navigation pane, click Intelligent Inspection.

    4. Click Real-time Inspection.

    5. In the Inspection Job section, click Create Now.

Basic Information

In the Basic Information step of the Create Intelligent Inspection Job wizard, configure the parameters and click Next. The following table describes the parameters.

| Parameter | Description |
| --- | --- |
| Job Name | The name of the intelligent inspection job. You can enter a custom name. |
| Project | The project to which the source Logstore or Metricstore belongs. |
| Region | The region where the project resides. |
| Logstore Type | The storage unit in which your data is stored. If your data is stored in a Logstore, select Logstores. If your data is stored in a Metricstore, select Metricstores. |
| Source Logstore | The Logstore in which your source data is stored. This parameter is required only if you set the Logstore Type parameter to Logstores. |
| Metricstores | The Metricstore in which your source data is stored. This parameter is required only if you set the Logstore Type parameter to Metricstores. |
| Role | The Alibaba Cloud Resource Name (ARN) of AliyunLogETLRole. If you completed authorization when you created the instance, the ARN is automatically displayed. |
| Target Store | The destination Logstore. This parameter is automatically set to internal-ml-log. |

Data Feature Settings

In the Data Feature Settings step, the tab that you use depends on whether the time series data that you want to analyze contains an anomaly label. If the data contains an anomaly label, select the Data Feature Settings tab. If the data does not contain an anomaly label, select the Anomaly Injection tab.

  1. For information about query statements, see Log search overview and Log analysis overview.

    Data Feature Settings

    • Sample query statement

      * | select (__time__ - __time__%60) as time, entity, count(*) as metric, if(count(*) > 1000, 1, 0) as label from log group by time, entity limit 1000000

    • Label Name: label

    • Entity: entity

    • Feature: metric

      | Parameter | Description |
      | --- | --- |
      | Time | The field that specifies time in source data. |
      | Granularity | The interval at which data is observed. Unit: seconds. Valid values: 5 to 3600. We recommend that you set this parameter to at least 60. |
      | Entity | The field that specifies an entity in source data. The intelligent inspection job aggregates data to generate a time series for the entity based on the specified field. |
      | Feature | The field that specifies a feature in source data. |
      | Label Name | The field that is used to identify the anomaly label in the source data. Valid values: 1 (the corresponding data point is anomalous) and 0 (the corresponding data point is normal). |
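The sample query statement for labeled data can be easier to follow when its steps are spelled out. The following Python sketch is an illustration only, not how Simple Log Service runs the query. It mirrors the query's logic: it rounds each timestamp down to the start of its bucket (`__time__ - __time__%60`), counts the logs per bucket and entity, and sets the label to 1 when the count exceeds a threshold. The function name `aggregate`, the tuple layout of `logs`, and the small threshold in the usage lines are assumptions made for this example.

```python
from collections import Counter


def aggregate(logs, granularity=60, threshold=1000):
    """Mimic the sample query: bucket by time, group by (time, entity), count, label.

    logs: iterable of (unix_timestamp, entity) pairs (hypothetical input layout).
    """
    # The document states that valid granularity values are 5 to 3600 seconds.
    if not 5 <= granularity <= 3600:
        raise ValueError("granularity must be between 5 and 3600 seconds")

    counts = Counter()
    for ts, entity in logs:
        bucket = ts - ts % granularity  # same as (__time__ - __time__%60) when granularity is 60
        counts[(bucket, entity)] += 1

    # One row per (time, entity) pair, like the rows that the query returns.
    return [
        {"time": t, "entity": e, "metric": n, "label": 1 if n > threshold else 0}
        for (t, e), n in sorted(counts.items())
    ]


# Small illustrative input; the threshold is lowered so that a label of 1 appears.
logs = [(120, "a"), (125, "a"), (179, "a"), (180, "a"), (130, "b")]
rows = aggregate(logs, granularity=60, threshold=2)
for row in rows:
    print(row)
```

In this sketch, the `time`, `entity`, `metric`, and `label` keys correspond to the fields that you would select for the Time, Entity, Feature, and Label Name parameters.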

    Anomaly Injection

    • Sample query statement

      * | select (__time__ - __time__%60) as time, entity, count(*) as metric from log group by time, entity limit 1000000

    • Entity: entity

    • Feature: metric

    • Anomaly Rate: 0.001

    | Parameter | Description |
    | --- | --- |
    | Time | The field that specifies time in source data. |
    | Granularity | The interval at which data is observed. Unit: seconds. Valid values: 5 to 3600. We recommend that you set this parameter to at least 60. |
    | Entity | The field that specifies an entity in source data. The intelligent inspection job aggregates data to generate a time series for the entity based on the specified field. |
    | Feature | The field that specifies a feature in source data. |
    | Anomaly Injection | Specifies whether to save the anomalous data that is injected. |
    | Anomaly Rate | The ratio of the injected anomalous data to the time series data. For example, if you set this parameter to 0.001, 0.1% of the time series data is anomalous. |
    | Anomaly Type | The types of anomalies that are injected into the feature sequence. |
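To make the Anomaly Rate parameter concrete, the following Python sketch injects synthetic anomalies into an unlabeled series at a given rate and records a 0/1 label for each point. It is a conceptual illustration under stated assumptions, not the algorithm that the service uses: the function name `inject_anomalies`, the multiplicative spike, and the `scale` and `seed` parameters are all hypothetical, and the anomaly types that the service actually injects are not described in this topic.

```python
import random


def inject_anomalies(series, anomaly_rate=0.001, scale=5.0, seed=0):
    """Return (values, labels) with roughly anomaly_rate of the points turned into spikes.

    The spike (value * scale) is an assumed anomaly type chosen for illustration.
    """
    rng = random.Random(seed)  # fixed seed so that the sketch is reproducible
    n_anomalies = round(len(series) * anomaly_rate)
    anomalous_idx = set(rng.sample(range(len(series)), n_anomalies))

    values = [v * scale if i in anomalous_idx else v for i, v in enumerate(series)]
    labels = [1 if i in anomalous_idx else 0 for i in range(len(series))]
    return values, labels


# 10,000 points at a rate of 0.001 gives 10 injected anomalies (0.1% of the series).
series = [100.0] * 10_000
values, labels = inject_anomalies(series, anomaly_rate=0.001)
print(sum(labels))
```

The labels produced this way play the same role as the label field in the Data Feature Settings tab: 1 marks an anomalous point and 0 marks a normal point.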

Algorithm Configurations

  1. In the Algorithm Configurations step, configure the Algorithm parameter. Only supervised anomaly detection algorithms are supported.

  2. In the Scheduling Settings section, configure the parameters. The following table describes the parameters.

    | Parameter | Description |
    | --- | --- |
    | Start At | The start time of the time series whose data the model training task processes. |
    | End Time | The end time of the time series whose data the model training task processes. |
    | End Time of Model Learning | The end time of the time series that is used to perform model training. The value of this parameter must be later than the value of the Start At parameter and earlier than the value of the End Time parameter. The time series data generated between Start At and End Time of Model Learning is used for model training. The time series data generated between End Time of Model Learning and End Time is used for model validation. |
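The relationship between the three scheduling parameters can be sketched as a time-based split. The following Python example is a minimal illustration, assuming that each data point is a `(timestamp, value)` pair and that each interval includes its start and excludes its end. The boundary handling and the function name `split_by_time` are assumptions, because this topic does not specify them.

```python
def split_by_time(points, start_at, end_of_model_learning, end_time):
    """Split (timestamp, value) points into training and validation sets by time."""
    # End Time of Model Learning must fall strictly between Start At and End Time.
    if not start_at < end_of_model_learning < end_time:
        raise ValueError("expected Start At < End Time of Model Learning < End Time")

    # Points in [Start At, End Time of Model Learning) are used for training.
    train = [p for p in points if start_at <= p[0] < end_of_model_learning]
    # Points in [End Time of Model Learning, End Time) are used for validation.
    validation = [p for p in points if end_of_model_learning <= p[0] < end_time]
    return train, validation


# Ten points at timestamps 0, 10, ..., 90; training ends at timestamp 70.
points = [(t, float(t)) for t in range(0, 100, 10)]
train, validation = split_by_time(points, start_at=0, end_of_model_learning=70, end_time=100)
print(len(train), len(validation))  # prints "7 3"
```

Because the split is by time rather than random, validation always uses data that is newer than the training data.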

Related operations

Evaluate inspection results in alert notifications