Simple Log Service: SPL instructions

Last Updated: Mar 21, 2024

This topic describes Simple Log Service Processing Language (SPL) instructions.

Parameter data types

The following table describes the data types of parameters supported in SPL instructions.

| Parameter data type | Description |
| --- | --- |
| Bool | A Boolean value. A parameter of this type acts as a switch in SPL instructions. |
| Char | An ASCII character. You must enclose the value in single quotation marks (''). For example, 'a' indicates the character a, '\t' indicates the tab character, '\11' indicates the ASCII character whose code is the octal number 11, and '\x09' indicates the ASCII character whose code is the hexadecimal number 09. |
| Integer | An integer value. |
| String | A string value. You must enclose the value in single quotation marks (''). Example: 'this is a string'. |
| RegExp | An RE2 regular expression. You must enclose the value in single quotation marks (''). Example: '([\d.]+)'. For more information about the RE2 syntax, see Syntax. |
| JSONPath | A JSONPath value. You must enclose the value in single quotation marks (''). Example: '$.body.values[0]'. For more information about the JSONPath syntax, see JsonPath. |
| Field | A field name. Example: \| project level, content. If a field name contains characters other than letters, digits, and underscores (_), you must enclose the field name in double quotation marks (""). Example: \| project "a:b:c". Note: For more information about the case sensitivity of field names, see SPL in different scenarios. |
| FieldPattern | A field name, or a field name combined with a wildcard character. An asterisk (*) can be used as a wildcard character, which matches zero or more characters. You must enclose the value in double quotation marks (""). Example: \| project "__tag__:*". Note: For more information about the case sensitivity of field names, see SPL in different scenarios. |
| SPLExp | An SPL expression. |
| SQLExp | An SQL expression. |
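The three escape forms listed for the Char type all denote the same character. Python string literals happen to accept the same escape spellings, so the equivalence can be checked with a short sketch. Python is used here only as a stand-in for illustration; it is not part of SPL.

```python
# Python string escapes use the same spellings as the SPL Char examples:
# '\t' (named escape), '\11' (octal code 11), '\x09' (hexadecimal code 09).
tab_named = '\t'
tab_octal = '\11'
tab_hex = '\x09'

# All three spell the ASCII tab character, whose decimal code is 9.
same_char = tab_named == tab_octal == tab_hex
tab_code = ord(tab_named)
print(same_char, tab_code)
```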

Field processing instructions

project

This instruction retains the fields that match the specified pattern and renames the specified fields. During instruction execution, all retain-related expressions are executed before rename-related expressions.

Important

By default, the __time__ and __time_ns_part__ time fields are retained and cannot be renamed or overwritten. For more information, see Time fields.

Syntax

| project -wildcard <field-pattern>, <output>=<field>, ...

Parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| wildcard | Bool | No | Specifies whether to enable the wildcard match mode. By default, the exact match mode is used. To enable the wildcard match mode, specify this parameter. |
| field-pattern | FieldPattern | Yes | The name of the field to retain, or a field name combined with a wildcard character. All matched fields are processed. |
| output | Field | Yes | The new name of the field to rename. You cannot rename multiple fields to the same name. Important: If the new name is the same as an existing field name in the input data, which value is used varies. For more information, see Retain and overwrite old and new values. |
| field | Field | Yes | The original name of the field to rename. If the field does not exist in the input data, the rename operation is not performed. You cannot rename a field multiple times. |

Examples

  • Example 1: Retain a field.

    * | project level, err_msg
  • Example 2: Rename a field.

    * | project log_level=level, err_msg
  • Example 3: Retain the field whose name is exactly __tag__:*. Because the exact match mode is used by default, the asterisk is not treated as a wildcard character.

    * | project "__tag__:*"
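The retain-then-rename order and the difference between the exact and wildcard match modes can be modeled with a small Python sketch. This is a simplified model under stated assumptions, not the actual SPL engine: the function names, the sample record, and the choice to ignore name conflicts with existing fields are all illustrative.

```python
import re

# Time fields are retained by default, per the note in this section.
TIME_FIELDS = ("__time__", "__time_ns_part__")


def pattern_matches(pattern, name, wildcard):
    """Exact comparison by default; with wildcard, '*' matches zero or more characters."""
    if not wildcard:
        return pattern == name
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.fullmatch(regex, name) is not None


def project(record, patterns=(), renames=None, wildcard=False):
    """Simplified model of `| project`: retain first, then rename.

    `renames` maps new name -> original name. Conflicts between a new name
    and an existing field are not modeled (assumption).
    """
    renames = renames or {}
    sources = set(renames.values())
    # Step 1: retain matched fields, rename sources, and the time fields.
    kept = {
        name: value
        for name, value in record.items()
        if name in TIME_FIELDS
        or name in sources
        or any(pattern_matches(p, name, wildcard) for p in patterns)
    }
    # Step 2: apply renames; a missing source field is skipped.
    for new_name, old_name in renames.items():
        if old_name in kept:
            kept[new_name] = kept.pop(old_name)
    return kept


# Invented sample record, for illustration only.
log = {"__time__": 1710000000, "level": "ERROR", "err_msg": "timeout",
       "__tag__:host": "web-1", "__tag__:*": "literal"}

print(project(log, ["err_msg"], {"log_level": "level"}))  # like Example 2
print(project(log, ["__tag__:*"]))                 # exact match mode
print(project(log, ["__tag__:*"], wildcard=True))  # wildcard match mode
```

In the exact match call, only the field literally named `__tag__:*` is kept alongside the time field, whereas the wildcard call keeps every field whose name starts with `__tag__:`.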

project-away

This instruction removes the fields that match the specified pattern and retains all other fields as they are.

Important

By default, the __time__ and __time_ns_part__ time fields are retained. For more information, see Time fields.

Syntax

| project-away -wildcard <field-pattern>, ...

Parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| wildcard | Bool | No | Specifies whether to enable the wildcard match mode. By default, the exact match mode is used. To enable the wildcard match mode, specify this parameter. |
| field-pattern | FieldPattern | Yes | The name of the field to remove, or a field name combined with a wildcard character. All matched fields are processed. |
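As a rough illustration of how project-away behaves in the two match modes, the following Python sketch models the instruction on a single record. It is a simplified model, not the SPL implementation: the function name, the sample record, and the assumption that the time fields survive even when a pattern matches them are illustrative.

```python
import re

# Time fields are retained by default, per the note in this section.
TIME_FIELDS = ("__time__", "__time_ns_part__")


def project_away(record, patterns, wildcard=False):
    """Simplified model of `| project-away`: drop matched fields, keep the rest."""
    def matches(pattern, name):
        if not wildcard:
            return pattern == name  # exact match mode (default)
        # Wildcard match mode: '*' stands for zero or more characters.
        regex = ".*".join(re.escape(part) for part in pattern.split("*"))
        return re.fullmatch(regex, name) is not None

    return {
        name: value
        for name, value in record.items()
        if name in TIME_FIELDS or not any(matches(p, name) for p in patterns)
    }


# Invented sample record, for illustration only.
log = {"__time__": 1710000000, "level": "INFO",
       "__tag__:host": "web-1", "__tag__:path": "/var/log/app.log"}

exact = project_away(log, ["level"])                    # removes only level
wild = project_away(log, ["__tag__:*"], wildcard=True)  # removes both tag fields
print(exact)
print(wild)
```

Without the wildcard switch, the pattern `__tag__:*` would match only a field with that literal name, so the sample record would pass through unchanged.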

project-rename

This instruction renames the specified fields and retains all other fields as they are.

Important

By default, the __time__ and __time_ns_part__ time fields are retained and cannot be renamed or overwritten. For more information, see Time fields.

Syntax

| project-rename <output>=<field>, ...

Parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| output | Field | Yes | The new name of the field to rename. You cannot rename multiple fields to the same name. Important: If the new name is the same as an existing field name in the input data, which value is used varies. For more information, see Retain and overwrite old and new values. |
| field | Field | Yes | The original name of the field to rename. If the field does not exist in the input data, the rename operation is not performed. You cannot rename a field multiple times. |

Examples

Rename the specified fields.

* | project-rename log_level=level, log_err_msg=err_msg

SQL calculation instructions on structured data

extend

This instruction creates fields from the results of SQL expressions. For more information about the SQL functions that are supported by SPL, see SPL-supported SQL functions.

Syntax

| extend <output>=<sql-expr>, ...

Parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| output | Field | Yes | The name of the field to create. You cannot assign the results of multiple expressions to the same field. Important: If the new field name is the same as an existing field name in the input data, the new field overwrites the data type and value of the existing field. |
| sql-expr | SQLExp | Yes | The data processing expression. Important: For more information about how to process null values, see Process null values for SQL expressions. |

Examples

  • Example 1: Use a computation expression.

    * | extend Duration = EndTime - StartTime
  • Example 2: Use a regular expression.

    * | extend server_protocol_version=regexp_extract(server_protocol, '\d+')
  • Example 3: Extract JSONPath content and convert the data type of a field.

    • SPL statement

      *
      | extend a=json_extract(content, '$.body.a'), b=json_extract(content, '$.body.b')
      | extend b=cast(b as BIGINT)
    • Input data

      content: '{"body": {"a": 1, "b": 2}}'
    • Result

      content: '{"body": {"a": 1, "b": 2}}'
      a: '1'
      b: 2

where

This instruction filters data based on the result of an SQL expression. Data that matches the specified SQL expression is retained. For more information about the SQL functions that are supported by the where instruction, see SPL-supported SQL functions.

Syntax

| where <sql-expr>

Parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| sql-expr | SQLExp | Yes | The SQL expression. Data that matches this expression is retained. Important: For more information about how to process null values, see Process null values for SQL expressions. |

Examples

  • Example 1: Filter data based on field content.

    * | where userId='123'
  • Example 2: Filter data by matching a field value against a regular expression.

    * | where regexp_like(server_protocol, '\d+')
  • Example 3: Cast a field to a numeric type to match all server errors (status 500 or higher).

    * | where cast(status as BIGINT) >= 500

Extraction instructions on semi-structured data

parse-regexp

This instruction extracts the information that matches groups in the specified regular expression from the specified field.

Important
  • The data type of the output field is VARCHAR. If the output field name is the same as an existing field name in the input data, which value is used varies. For more information, see Retain and overwrite old and new values.

  • The __time__ and __time_ns_part__ time fields are not supported. For more information, see Time fields.

Syntax

| parse-regexp <field>, <pattern> as <output>, ...

Parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| field | Field | Yes | The original name of the field from which you want to extract information. Make sure that this field is included in the input data, the field type is VARCHAR, and the field value is a non-null value. Otherwise, the extract operation is not performed. |
| pattern | RegExp | Yes | The regular expression. The RE2 syntax is supported. |
| output | Field | No | The name of the output field that stores the result of the regular expression extraction. |

Examples

  • Example 1: Use the exploratory match mode.

    • SPL statement

      *
      | parse-regexp content, '(\S+)' as ip -- Generate the ip: 10.0.0.0 field. 
      | parse-regexp content, '\S+\s+(\w+)' as method -- Generate the method: GET field.
    • Input data

      content: '10.0.0.0 GET /index.html 15824 0.043'
    • Result

      content: '10.0.0.0 GET /index.html 15824 0.043'
      ip: '10.0.0.0'
      method: 'GET'
  • Example 2: Use the full pattern match mode and use unnamed capturing groups in a regular expression.

    • SPL statement

      * | parse-regexp content, '(\S+)\s+(\w+)' as ip, method
    • Input data

      content: '10.0.0.0 GET /index.html 15824 0.043'
    • Result

      content: '10.0.0.0 GET /index.html 15824 0.043'
      ip: '10.0.0.0'
      method: 'GET'
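The two examples above yield the same result: each capturing group is paired, in order, with an output field name. The following Python sketch models that pairing. It is a simplified model under assumptions: Python's `re` module stands in for RE2 (the two agree for these patterns), the function name is illustrative, and searching for the first match anywhere in the value is an assumption that is merely consistent with both examples.

```python
import re


def parse_regexp(record, field, pattern, outputs):
    """Simplified model of `| parse-regexp <field>, <pattern> as <output>, ...`."""
    value = record.get(field)
    if not isinstance(value, str):
        return dict(record)  # field missing, null, or not a string: no extraction
    match = re.search(pattern, value)  # assumption: first match anywhere in the value
    if match is None:
        return dict(record)
    result = dict(record)
    # Pair capturing groups with output field names in order.
    for name, group in zip(outputs, match.groups()):
        result[name] = group
    return result


log = {"content": "10.0.0.0 GET /index.html 15824 0.043"}

# Example 1: two chained instructions, one capturing group each.
step1 = parse_regexp(log, "content", r"(\S+)", ["ip"])
step2 = parse_regexp(step1, "content", r"\S+\s+(\w+)", ["method"])

# Example 2: one instruction with two unnamed capturing groups.
one_pass = parse_regexp(log, "content", r"(\S+)\s+(\w+)", ["ip", "method"])

print(step2)
print(one_pass)
```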

parse-csv

This instruction extracts information in the CSV format from the specified field.

Important
  • The data type of the output field is VARCHAR. If the output field name is the same as an existing field name in the input data, which value is used varies. For more information, see Retain and overwrite old and new values.

  • The __time__ and __time_ns_part__ time fields are not supported. For more information, see Time fields.

Syntax

| parse-csv -delim=<delim> -quote=<quote> -strict <field> as <output>, ...

Parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| delim | String | No | The delimiter of the input data. You can specify one to three valid ASCII characters. You can use escape characters to indicate special characters. For example, \t indicates the tab character, \11 indicates the ASCII character whose code is the octal number 11, and \x09 indicates the ASCII character whose code is the hexadecimal number 09. You can also use a combination of multiple characters as a delimiter, such as $$$ and ^_^. Default value: comma (,). |
| quote | Char | No | The quote character of the input data. You can specify a single valid ASCII character. If values in the input data contain the delimiter, you must specify a quote character. For example, you can specify a double quotation mark ("), a single quotation mark ('), or an unprintable character (0x01). Default value: double quotation mark ("). Important: This parameter takes effect only if you set the delim parameter to a single character. You must specify different values for the quote and delim parameters. |
| strict | Bool | No | Specifies whether to enable strict pairing when the number of values in the input data is different from the number of fields specified in the output parameter. False: non-strict pairing, which uses the maximum pairing policy. If the number of values exceeds the number of fields, the extra values are not returned. If the number of fields exceeds the number of values, the extra fields are returned as empty strings. True: strict pairing. No fields are returned. Default value: False. To enable strict pairing, specify this parameter. |
| field | Field | Yes | The name of the field that you want to parse. Make sure that this field is included in the input data, the field type is VARCHAR, and the field value is a non-null value. Otherwise, the extract operation is not performed. |
| output | Field | Yes | The name of the field that stores the parsing result of the input data. |

Examples

  • Example 1: Match data in simple mode.

    • SPL statement

      * | parse-csv content as x, y, z
    • Input data

      content: 'a,b,c'
    • Result

      content: 'a,b,c'
      x: 'a'
      y: 'b'
      z: 'c'
  • Example 2: Use double quotation marks as the quote to match data that contains special characters.

    • SPL statement

      * | parse-csv content as ip, time, host
    • Input data

      content: '192.168.0.100,"10/Jun/2019:11:32:16,127 +0800",example.aliyundoc.com'
    • Result

      content: '192.168.0.100,"10/Jun/2019:11:32:16,127 +0800",example.aliyundoc.com'
      ip: '192.168.0.100'
      time: '10/Jun/2019:11:32:16,127 +0800'
      host: 'example.aliyundoc.com'
  • Example 3: Use a combination of multiple characters as the delimiter.

    • SPL statement

      * | parse-csv -delim='||' content as time, ip, req
    • Input data

      content: '05/May/2022:13:30:28||127.0.0.1||POST /put?a=1&b=2'
    • Result

      content: '05/May/2022:13:30:28||127.0.0.1||POST /put?a=1&b=2'
      time: '05/May/2022:13:30:28'
      ip: '127.0.0.1'
      req: 'POST /put?a=1&b=2'
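The interaction between the delim, quote, and strict parameters can be sketched in Python. This is a simplified model of the documented behavior, not the SPL implementation: the function name is illustrative, Python's `csv` module stands in for the single-character delimiter case, and a plain string split is assumed for multi-character delimiters because the quote parameter does not take effect there.

```python
import csv


def parse_csv(value, outputs, delim=",", quote='"', strict=False):
    """Simplified model of `| parse-csv`; returns only the new output fields."""
    if len(delim) == 1:
        # The quote character applies only when the delimiter is one character.
        values = next(csv.reader([value], delimiter=delim, quotechar=quote))
    else:
        # Assumption: multi-character delimiters split the value literally.
        values = value.split(delim)
    if strict and len(values) != len(outputs):
        return {}  # strict pairing: a count mismatch yields no fields
    # Non-strict pairing: drop extra values, pad missing ones with "".
    padded = values[:len(outputs)] + [""] * (len(outputs) - len(values))
    return dict(zip(outputs, padded))


# Example 2 above: the quoted value keeps its embedded comma.
quoted = parse_csv(
    '192.168.0.100,"10/Jun/2019:11:32:16,127 +0800",example.aliyundoc.com',
    ["ip", "time", "host"],
)
# Example 3 above: a two-character delimiter.
multi = parse_csv("05/May/2022:13:30:28||127.0.0.1||POST /put?a=1&b=2",
                  ["time", "ip", "req"], delim="||")
# Count mismatches under both pairing policies.
fewer_fields = parse_csv("a,b,c", ["x", "y"])
fewer_values = parse_csv("a,b", ["x", "y", "z"])
strict_mismatch = parse_csv("a,b,c", ["x", "y"], strict=True)

print(quoted, multi, fewer_fields, fewer_values, strict_mismatch, sep="\n")
```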

parse-json

This instruction extracts the first-level JSON information from the specified field.

Important
  • The data type of the output field is VARCHAR. If the output field name is the same as an existing field name in the input data, which value is used varies. For more information, see Retain and overwrite old and new values.

  • The __time__ and __time_ns_part__ time fields are not supported. For more information, see Time fields.

Syntax

| parse-json -mode=<mode> -path=<path> -prefix=<prefix> <field>

Parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| mode | String | No | The mode that is used to extract information when the name of the output field is the same as an existing field name in the input data. Valid values include overwrite and preserve. The default value is overwrite. |
| path | JSONPath | No | The JSON path in the specified field. The JSON path is used to locate the information that you want to extract. The default value is an empty string. If you use the default value, the complete data of the specified field is extracted. |
| prefix | String | No | The prefix of the fields that are generated by expanding a JSON structure. The default value is an empty string. |
| field | Field | Yes | The name of the field that you want to parse. Make sure that this field is included in the input data, that the field value is a non-null value, and that one of the following conditions is met: the data type is JSON, or the data type is VARCHAR and the field value is a valid JSON string. Otherwise, the extract operation is not performed. |

Examples

  • Example 1: Extract all keys and values from the y field.

    • SPL statement

      * | parse-json y
    • Input data

      x: '0'
      y: '{"a": 1, "b": 2}'
    • Result

      x: '0'
      y: '{"a": 1, "b": 2}'
      a: '1'
      b: '2'
  • Example 2: Extract the value of the body key from the content field as different fields.

    • SPL statement

      * | parse-json -path='$.body' content
    • Input data

      content: '{"body": {"a": 1, "b": 2}}'
    • Result

      content: '{"body": {"a": 1, "b": 2}}'
      a: '1'
      b: '2'
  • Example 3: Extract information in preserve mode. For an existing field, retain the original value.

    • SPL statement

      * | parse-json -mode='preserve' y
    • Input data

      a: 'xyz'
      x: '0'
      y: '{"a": 1, "b": 2}'
    • Result

      x: '0'
      y: '{"a": 1, "b": 2}'
      a: 'xyz'
      b: '2'
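The three examples above can be reproduced with a small Python model of first-level expansion. This is a sketch under assumptions, not the SPL implementation: the function name is illustrative, only simple dotted paths such as '$.body' are handled (a hypothetical subset of JSONPath), the prefix is assumed to be prepended directly to each key, and non-string values are serialized with `json.dumps` to mimic VARCHAR output.

```python
import json


def parse_json(record, field, path="", prefix="", mode="overwrite"):
    """Simplified model of `| parse-json`: expand first-level keys into fields."""
    raw = record.get(field)
    if raw is None:
        return dict(record)  # field missing or null: no extraction
    try:
        node = json.loads(raw) if isinstance(raw, str) else raw
    except json.JSONDecodeError:
        return dict(record)  # not a valid JSON string: no extraction
    # Hypothetical path support: only dotted keys such as '$.body'.
    for key in [k for k in path.lstrip("$").split(".") if k]:
        if not isinstance(node, dict) or key not in node:
            return dict(record)
        node = node[key]
    if not isinstance(node, dict):
        return dict(record)
    result = dict(record)
    for key, value in node.items():
        name = prefix + key  # assumption: the prefix is prepended as-is
        if mode == "preserve" and name in record:
            continue  # preserve mode: keep the existing value
        # Output fields are VARCHAR, so non-string values are serialized.
        result[name] = value if isinstance(value, str) else json.dumps(value)
    return result


example1 = parse_json({"x": "0", "y": '{"a": 1, "b": 2}'}, "y")
example2 = parse_json({"content": '{"body": {"a": 1, "b": 2}}'}, "content",
                      path="$.body")
example3 = parse_json({"a": "xyz", "x": "0", "y": '{"a": 1, "b": 2}'}, "y",
                      mode="preserve")
print(example1, example2, example3, sep="\n")
```

With the default overwrite mode, the third call would instead replace the existing value 'xyz' of the a field with '1'.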