Simple Log Service:SPL instructions

Last Updated:Dec 21, 2024

This topic describes the Simple Log Service Processing Language (SPL) instructions.

Parameter type description

The table below details the data types for parameters in SPL instructions.

Parameter type

Description

Bool

The parameter specifies a Boolean value. This type of parameter is a switch in SPL instructions.

Char

The parameter specifies an ASCII character. You must enclose the character in single quotation marks (''). For example, 'a' indicates the character a, '\t' indicates the tab character, '\11' indicates the ASCII character whose code is the octal number 11, and '\x09' indicates the ASCII character whose code is the hexadecimal number 09.

Integer

The parameter specifies an integer value.

String

The parameter specifies a string. You must use single quotation marks ('') to enclose the string. For example, 'this is a string'.

RegExp

The parameter specifies a regular expression. The RE2 syntax is supported. You must use single quotation marks ('') to enclose the regular expression. For example, '([\d.]+)'.

For more information, see Syntax.

JSONPath

The parameter specifies a JSON path. You must use single quotation marks ('') to enclose the JSON path. For example, '$.body.values[0]'.

For more information, see JsonPath.

Field

The parameter specifies a field name. For example, | project level, content.

If the field name contains characters other than letters, digits, and underscores, you must enclose the field name in double quotation marks (""). For example, | project "a:b:c".

Note

For more information about case sensitivity of field names, see SPL functionality definitions in different scenarios.

FieldPattern

The parameter specifies a field name or a combination of a field name and a wildcard character. An asterisk (*) can be used as a wildcard character and matches zero or more characters. You must enclose the field pattern in double quotation marks (""). For example, | project "__tag__:*".

Note

For more information about case sensitivity of field names, see SPL functionality definitions in different scenarios.

SPLExp

The parameter specifies an SPL expression.

SQLExp

The parameter specifies an SQL expression.
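
The three escape forms listed for the Char type above can be checked with a short Python sketch. Python is used only for illustration because it happens to share these escape spellings. The sketch does not run SPL, and the variable names are hypothetical.

```python
# Three spellings of the same ASCII character (tab, code 9).
tab_named = '\t'    # named escape for the tab character
tab_octal = '\11'   # octal 11 equals decimal 9
tab_hex = '\x09'    # hexadecimal 09 equals decimal 9

# Collect the numeric code of each spelling to compare them.
codes = [ord(c) for c in (tab_named, tab_octal, tab_hex)]
print(codes)
```

All three spellings resolve to the same character code, which is why the source treats them as interchangeable ways to write a tab.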

List of SPL instructions

Instruction category

Instruction name

Description

Field processing instructions

project

This instruction retains the fields that match the specified pattern and renames the specified fields. During instruction execution, all retain-related expressions are executed before rename-related expressions.

project-away

This instruction removes the fields that match the specified pattern and retains all other fields as they are.

project-rename

This instruction renames the specified fields and retains all other fields as they are.

expand-values

This instruction expands the first-layer JSON object of the specified field and generates multiple results.

SQL calculation instructions on structured data

extend

This instruction creates fields based on the result of SQL expression-based data calculation. For more information about the supported SQL functions, see List of SQL functions supported by SPL.

where

This instruction filters data based on the result of SQL expression-based data calculation. Data that matches the specified SQL expression is retained. For more information about the supported SQL functions, see List of SQL functions supported by SPL.

Extraction instructions on semi-structured data

parse-regexp

This instruction extracts the information that matches groups in the specified regular expression from the specified field.

parse-csv

This instruction extracts information in the CSV format from the specified field.

parse-json

This instruction extracts the first-layer JSON information from the specified field.

parse-kv

This instruction extracts key-value pair information from the specified field.

Field processing instructions

project

The project instruction retains fields that match a specified pattern and renames designated fields. Retain-related expressions are executed prior to rename-related expressions during the execution of the instruction.

Important

By default, the time fields __time__ and __time_ns_part__ are preserved and cannot be renamed or overwritten. For more information, see Time fields.

Syntax

| project -wildcard <field-pattern>, <output>=<field>, ...

Parameter description

Parameter

Type

Required

Description

wildcard

Bool

No

Specifies whether to enable the wildcard match mode. By default, the exact match mode is used. If you want to enable the wildcard match mode, you must configure this parameter.

field-pattern

FieldPattern

Yes

The name of the field to retain, or a combination of a field and a wildcard character. All matched fields are processed.

output

Field

Yes

The new name of the field to rename. You cannot rename multiple fields to the same name.

Important

If the new field name is the same as an existing field name in the input data, see Retention and overwrite of old and new values.

field

Field

Yes

The original name of the field to rename.

  • If the field does not exist in the input data, the rename operation is not performed.

  • You cannot rename a field multiple times.

Sample statement

  • Example 1: Retain a field.

    * | project level, err_msg
  • Example 2: Rename a field.

    * | project log_level=level, err_msg
  • Example 3: Retain the field that exactly matches __tag__:*.

    * | project "__tag__:*"

project-away

The project-away instruction removes fields that match a specified pattern, keeping all other fields unchanged.

Important

By default, the time fields __time__ and __time_ns_part__ are retained. For more information, see Time fields.

Syntax

| project-away -wildcard <field-pattern>, ...

Parameter description

Parameter

Type

Required

Description

wildcard

Bool

No

Specifies whether to enable the wildcard match mode. By default, the exact match mode is used. If you want to enable the wildcard match mode, you must configure this parameter.

field-pattern

FieldPattern

Yes

The name of the field to remove, or a combination of a field and a wildcard character. All matched fields are processed.
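
The difference between the exact match mode and the wildcard match mode can be sketched in Python. This is a minimal model of the matching rules described above, not the SPL engine. The helper name project_away, the sample record, and the decision to always keep the time fields are assumptions made for illustration.

```python
import re

# Time fields that the text above says are retained by default.
TIME_FIELDS = {"__time__", "__time_ns_part__"}

def project_away(record, patterns, wildcard=False):
    """Return a copy of record without the fields matched by patterns.

    Hypothetical helper that models the described rules only.
    """
    def matches(name, pattern):
        if not wildcard:
            # Exact match mode (default): '*' is an ordinary character.
            return name == pattern
        # Wildcard match mode: '*' matches zero or more characters.
        regex = ".*".join(re.escape(part) for part in pattern.split("*"))
        return re.fullmatch(regex, name, flags=re.DOTALL) is not None

    return {
        name: value
        for name, value in record.items()
        if name in TIME_FIELDS or not any(matches(name, p) for p in patterns)
    }

record = {
    "__time__": 1700000000,
    "__tag__:host": "h1",
    "__tag__:*": "literal",
    "level": "INFO",
}
exact = project_away(record, ["__tag__:*"])
wild = project_away(record, ["__tag__:*"], wildcard=True)
print(exact)
print(wild)
```

In the exact call only the field literally named __tag__:* is removed, while the wildcard call removes every field whose name starts with __tag__:.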

project-rename

The project-rename instruction renames specified fields while retaining all others as is.

Important

By default, the time fields __time__ and __time_ns_part__ are preserved and cannot be renamed or overwritten. For more information, see Time fields.

Syntax

| project-rename <output>=<field>, ...

Parameter description

Parameter

Type

Required

Description

output

Field

Yes

The new name of the field to rename. You cannot rename multiple fields to the same name.

Important

If the new field name is the same as an existing field name in the input data, see Retention and overwrite of old and new values.

field

Field

Yes

The original name of the field to rename.

  • If the field does not exist in the input data, the rename operation is not performed.

  • You cannot rename a field multiple times.

Example

Rename the specified fields.

* | project-rename log_level=level, log_err_msg=err_msg

expand-values

This instruction expands the first-layer JSON object in a specified field, generating multiple results.

Syntax

| expand-values -path=<path> -limit=<limit> -keep <field> as <output>

Parameter description

Parameter

Type

Required

Description

path

JSONPath

No

The JSON path in the specified field. The JSON path is used to locate the information that you want to extract.

The default value is an empty string. If you use the default value, the complete data of the specified field is extracted.

limit

Integer

No

The maximum number of entries that can be expanded from each piece of raw data. The value is an integer from 1 to 10. The default value is 10.

keep

Bool

No

Specifies whether to retain the original field after expansion. By default, the original field is not retained. If you want to retain the original field, you must enable this switch.

field

Field

Yes

The original name of the field to expand. The data type must be VARCHAR. If the specified field does not exist, the expansion operation is not performed.

output

Field

No

The name of the field to create. If you do not specify this parameter, the output result is written to the input field by default.

The expansion logic for the original content is as follows:

  • JSON array: The array is expanded based on its elements.

  • JSON dictionary: The dictionary is expanded based on its key-value pairs.

  • Other JSON types: The initial value is returned.

  • Invalid JSON: null is returned.

Sample statement

  • Example 1: Expand an array, outputting multiple result sets.

    • SPL statement

      * | expand-values y
    • Input data

      x: 'abc'
      y: '[0,1,2]'
    • Output data: The array is expanded into three data sets.

      # Entry 1
      x: 'abc'
      y: '0'
      
      # Entry 2
      x: 'abc'
      y: '1'
      
      # Entry 3
      x: 'abc'
      y: '2'
  • Example 2: Expand a dictionary, outputting multiple result sets.

    • SPL statement

      * | expand-values y
    • Input data

      x: 'abc'
      y: '{"a": 1, "b": 2}'
    • Output data: The dictionary is expanded into two data sets.

      # Entry 1
      x: 'abc'
      y: '{"a": 1}'
      
      # Entry 2
      x: 'abc'
      y: '{"b": 2}'
  • Example 3: Expand content under a specified JSON path, outputting results to a new field.

    • SPL statement

      * | expand-values -path='$.body' -keep content as body
    • Input data

      content: '{"body": [0, {"a": 1, "b": 2}]}'
    • Output data: The content is expanded into two data sets.

      # Entry 1
      content: '{"body": [0, {"a": 1, "b": 2}]}'
      body: '0'
      
      # Entry 2
      content: '{"body": [0, {"a": 1, "b": 2}]}'
      body: '{"a": 1, "b": 2}'
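
The expansion logic described for expand-values can be sketched in Python as follows. This is a simplified model under stated assumptions, not the actual implementation. The helper name expand_values is hypothetical, the -path and -keep options are left out, and re-serializing each entry with json.dumps is an assumption about the output format.

```python
import json

def expand_values(value, limit=10):
    """Expand one field value into a list of output values (hypothetical helper)."""
    if not 1 <= limit <= 10:
        raise ValueError("limit must be an integer from 1 to 10")
    try:
        parsed = json.loads(value)
    except (TypeError, ValueError):
        return [None]  # invalid JSON: null is returned
    if isinstance(parsed, list):
        items = parsed  # JSON array: one entry per element
    elif isinstance(parsed, dict):
        # JSON dictionary: one entry per key-value pair
        items = [{key: val} for key, val in parsed.items()]
    else:
        return [value]  # other JSON types: the initial value is returned
    # Assumption: each entry is written back as JSON text, capped at limit.
    return [json.dumps(item) for item in items[:limit]]

print(expand_values('[0,1,2]'))
print(expand_values('{"a": 1, "b": 2}'))
print(expand_values('not json'))
```

Each returned value corresponds to one output entry; the other fields of the input entry would be copied unchanged into every entry.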

SQL calculation instructions on structured data

extend

This instruction creates fields based on SQL expression-based data calculations. For a list of supported SQL functions, see List of SQL functions supported by SPL.

Syntax

| extend <output>=<sql-expr>, ...

Parameter description

Parameter

Type

Required

Description

output

Field

Yes

The name of the field to create. You cannot assign the results of multiple expressions to the same field.

Important

If the new field name is the same as an existing field name in the input data, the new field overwrites the existing field based on the data type and value.

sql-expr

SQLExp

Yes

The data processing expression.

Important

For more information about null value processing, see Null value processing in SPL expressions.

Sample statement

  • Example 1: Apply a computation expression.

    * | extend Duration = EndTime - StartTime
  • Example 2: Utilize a regular expression.

    * | extend server_protocol_version=regexp_extract(server_protocol, '\d+')
  • Example 3: Extract JSONPath content and convert a field's data type.

    • SPL statement

      *
      | extend a=json_extract(content, '$.body.a'), b=json_extract(content, '$.body.b')
      | extend b=cast(b as BIGINT)
    • Input data

      content: '{"body": {"a": 1, "b": 2}}'
    • Output results

      content: '{"body": {"a": 1, "b": 2}}'
      a: '1'
      b: 2

where

This instruction filters data based on SQL expression-based calculations, retaining data that matches the specified SQL expression. For a list of supported SQL functions, see List of SQL functions supported by SPL.

Syntax

| where <sql-expr>

Parameter description

Parameter

Type

Required

Description

sql-expr

SQLExp

Yes

The SQL expression. Data that matches this expression is retained.

Important

For more information about null value processing in SQL expressions, see Null value processing in SPL expressions.

Sample statement

  • Example 1: Filter data based on field content.

    * | where userId='123'
  • Example 2: Filter data by matching a field value against a regular expression.

    * | where regexp_like(server_protocol, '\d+')
  • Example 3: Convert a field's data type to match all server error data.

    * | where cast(status as BIGINT) >= 500

Extraction instructions on semi-structured data

parse-regexp

This instruction extracts information matching groups in a specified regular expression from a field.

Important
  • The output field data type is VARCHAR. If the new field name conflicts with an existing field name in the input data, refer to Retention and overwrite of old and new values.

  • Operations on time fields __time__ and __time_ns_part__ are not permitted. For more information, see Time fields.

Syntax

| parse-regexp <field>, <pattern> as <output>, ...

Parameter description

Parameter

Type

Required

Description

field

Field

Yes

The original name of the field from which you want to extract information.

Make sure that this field is included in the input data, the data type is VARCHAR, and the field value is not null. Otherwise, the extract operation is not performed.

pattern

RegExp

Yes

The regular expression. The RE2 syntax is supported.

output

Field

No

The name of the output field that stores the result of the regular expression extraction.

Sample statement

  • Example 1: Employ exploratory match mode.

    • SPL statement

      *
      | parse-regexp content, '(\S+)' as ip -- Generate the ip: 10.0.0.0 field.
      | parse-regexp content, '\S+\s+(\w+)' as method -- Generate the method: GET field.
    • Input data

      content: '10.0.0.0 GET /index.html 15824 0.043'
    • Output results

      content: '10.0.0.0 GET /index.html 15824 0.043'
      ip: '10.0.0.0'
      method: 'GET'
  • Example 2: Utilize full pattern match mode with unnamed capturing groups.

    • SPL statement

      * | parse-regexp content, '(\S+)\s+(\w+)' as ip, method
    • Input data

      content: '10.0.0.0 GET /index.html 15824 0.043'
    • Output results

      content: '10.0.0.0 GET /index.html 15824 0.043'
      ip: '10.0.0.0'
      method: 'GET'
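
The way capturing groups map to output fields can be sketched with Python's re module. This is an illustration only: SPL uses the RE2 syntax, which differs from Python's regular expressions in some features, and the helper name parse_regexp is hypothetical. Leaving the entry unchanged when the pattern does not match is an assumption.

```python
import re

def parse_regexp(record, field, pattern, outputs):
    """Copy record and add one output field per capturing group (hypothetical helper)."""
    result = dict(record)
    value = record.get(field)
    if not isinstance(value, str):
        return result  # field missing, null, or not a string: nothing is extracted
    match = re.search(pattern, value)
    if match is None:
        return result  # assumption: no match leaves the entry unchanged
    # Pair the i-th output name with the i-th capturing group.
    for name, captured in zip(outputs, match.groups()):
        result[name] = captured
    return result

entry = {"content": "10.0.0.0 GET /index.html 15824 0.043"}
parsed = parse_regexp(entry, "content", r"(\S+)\s+(\w+)", ["ip", "method"])
print(parsed)
```

Calling the helper twice with one group each, as in Example 1, produces the same two fields as a single call with two groups, as in Example 2.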

parse-csv

This instruction extracts CSV-formatted information from a specified field.

Important
  • The output field data type is VARCHAR. If the new field name conflicts with an existing field name in the input data, refer to Retention and overwrite of old and new values.

  • Operations on time fields __time__ and __time_ns_part__ are not permitted. For more information, see Time fields.

Syntax

| parse-csv -delim=<delim> -quote=<quote> -strict <field> as <output>, ...

Parameter description

Parameter

Type

Required

Description

delim

String

No

The delimiter of the data content. The delimiter can be one to three valid ASCII characters.

You can use escape characters to indicate special characters. For example, \t indicates the tab character, \11 indicates the ASCII character whose code is the octal number 11, and \x09 indicates the ASCII character whose code is the hexadecimal number 09.

You can also use a combination of multiple characters as the delimiter. For example, $$$, ^_^.

The default value is a comma (,).

quote

Char

No

The quote of the data content. The quote is a single valid ASCII character and is used when the data content contains a delimiter.

For example, you can specify a double quotation mark ("), a single quotation mark ('), or an unprintable character (0x01).

By default, no quote is used.

Important

This parameter takes effect only if you set the delim parameter to a single character. You must specify different values for the quote and delim parameters.

strict

Bool

No

Specifies whether to enable strict pairing when the number of values in the data content is different from the number of fields specified in output.

  • False: non-strict pairing. The maximum pairing policy is used.

    • If the number of values exceeds the number of fields, the extra values are not returned.

    • If the number of fields exceeds the number of values, the extra fields are returned as empty strings.

  • True: strict pairing. If the number of values differs from the number of fields, no fields are returned.

Default value: False. If you want to enable strict pairing, configure this parameter.

field

Field

Yes

The name of the field that you want to parse.

Make sure that this field is included in the input data, the data type is VARCHAR, and the field value is not null. Otherwise, the extract operation is not performed.

output

Field

Yes

The name of the field that you want to use to store the parsing result of the input data.

Sample statement

  • Example 1: Match data in simple mode.

    • SPL statement

      * | parse-csv content as x, y, z
    • Input data

      content: 'a,b,c'
    • Output results

      content: 'a,b,c'
      x: 'a'
      y: 'b'
      z: 'c'
  • Example 2: Use double quotes as the quote character to match data containing special characters.

    • SPL statement

      * | parse-csv -quote='"' content as ip, time, host
    • Input data

      content: '192.168.0.100,"10/Jun/2019:11:32:16,127 +0800",example.aliyundoc.com'
    • Output results

      content: '192.168.0.100,"10/Jun/2019:11:32:16,127 +0800",example.aliyundoc.com'
      ip: '192.168.0.100'
      time: '10/Jun/2019:11:32:16,127 +0800'
      host: 'example.aliyundoc.com'
  • Example 3: Employ a combination of multiple characters as the separator.

    • SPL statement

      * | parse-csv -delim='||' content as time, ip, req
    • Input data

      content: '05/May/2022:13:30:28||127.0.0.1||POST /put?a=1&b=2'
    • Output results

      content: '05/May/2022:13:30:28||127.0.0.1||POST /put?a=1&b=2'
      time: '05/May/2022:13:30:28'
      ip: '127.0.0.1'
      req: 'POST /put?a=1&b=2'
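
The strict and non-strict pairing rules can be sketched in Python. This is a simplified model of the behavior described for the delim, quote, and strict parameters, not the actual parser. The helper name parse_csv is hypothetical, and it returns only the extracted fields rather than the whole entry.

```python
import csv

def parse_csv(content, outputs, delim=",", quote=None, strict=False):
    """Split content and pair the values with output field names (hypothetical helper)."""
    if quote is not None and len(delim) == 1:
        # The quote character only takes effect with a single-character delimiter.
        values = next(csv.reader([content], delimiter=delim, quotechar=quote))
    else:
        values = content.split(delim)
    if len(values) != len(outputs):
        if strict:
            return {}  # strict pairing: a count mismatch returns no fields
        # Non-strict pairing: drop extra values, pad missing ones with ''.
        values = values[:len(outputs)] + [""] * (len(outputs) - len(values))
    return dict(zip(outputs, values))

print(parse_csv("a,b,c", ["x", "y", "z"]))
print(parse_csv("a,b,c", ["x", "y"]))
print(parse_csv("a,b", ["x", "y", "z"]))
print(parse_csv("a,b", ["x", "y", "z"], strict=True))
```

The second and third calls show the maximum pairing policy: an extra value is dropped, and a missing value yields an empty string. The last call returns no fields because strict pairing is enabled.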

parse-json

This instruction extracts first-layer JSON information from a specified field.

Important
  • The output field data type is VARCHAR. If the new field name conflicts with an existing field name in the input data, refer to Retention and overwrite of old and new values.

  • Operations on time fields __time__ and __time_ns_part__ are not permitted. For more information, see Time fields.

Syntax

| parse-json -mode=<mode> -path=<path> -prefix=<prefix> <field>

Parameter description

Parameter

Type

Required

Description

mode

String

No

The mode that is used to extract information when the name of the output field is the same as an existing field name in the input data. The default value is overwrite.

path

JSONPath

No

The JSON path in the specified field. The JSON path is used to locate the information that you want to extract.

The default value is an empty string. If you use the default value, the complete data of the specified field is extracted.

prefix

String

No

The prefix of the fields that are generated by expanding a JSON structure. The default value is an empty string.

field

Field

Yes

The name of the field that you want to parse.

Make sure that this field is included in the input data and the field value is a non-null value and meets one of the following conditions. Otherwise, the extract operation is not performed.

  • The data type is JSON.

  • The data type is VARCHAR, and the field value is a valid JSON string.

Sample statement

  • Example 1: Extract all keys and values from the 'y' field.

    • SPL statement

      * | parse-json y
    • Input data

      x: '0'
      y: '{"a": 1, "b": 2}'
    • Output results

      x: '0'
      y: '{"a": 1, "b": 2}'
      a: '1'
      b: '2'
  • Example 2: Extract the 'body' key's value from the 'content' field as separate fields.

    • SPL statement

      * | parse-json -path='$.body' content
    • Input data

      content: '{"body": {"a": 1, "b": 2}}'
    • Output results

      content: '{"body": {"a": 1, "b": 2}}'
      a: '1'
      b: '2'
  • Example 3: Extract information in preserve mode, retaining the original value for existing fields.

    • SPL statement

      * | parse-json -mode='preserve' y
    • Input data

      a: 'xyz'
      x: '0'
      y: '{"a": 1, "b": 2}'
    • Output results

      x: '0'
      y: '{"a": 1, "b": 2}'
      a: 'xyz'
      b: '2'
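
The effect of the mode and prefix parameters can be sketched in Python. This is a minimal model under stated assumptions, not the actual implementation. The helper name parse_json is hypothetical, the -path option is left out, and the conversion of non-string values to JSON text is an assumption based on the VARCHAR output type.

```python
import json

def parse_json(record, field, mode="overwrite", prefix=""):
    """Copy record and add the first-layer keys of a JSON field (hypothetical helper)."""
    result = dict(record)
    try:
        parsed = json.loads(record[field])
    except (KeyError, TypeError, ValueError):
        return result  # missing field, null, or invalid JSON: nothing is extracted
    if not isinstance(parsed, dict):
        return result  # assumption: only JSON objects produce fields
    for key, value in parsed.items():
        name = prefix + key
        if mode == "preserve" and name in record:
            continue  # preserve mode keeps the existing value
        # Assumption: output values are text, so non-strings are serialized.
        result[name] = value if isinstance(value, str) else json.dumps(value)
    return result

entry = {"a": "xyz", "x": "0", "y": '{"a": 1, "b": 2}'}
print(parse_json(entry, "y"))
print(parse_json(entry, "y", mode="preserve"))
print(parse_json(entry, "y", prefix="y_"))
```

With the default overwrite mode the existing field a is replaced, with preserve it keeps the value xyz, and with a prefix the name conflict does not occur at all.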

parse-kv

This instruction extracts key-value pair information from a specified field.

Important
  • The output field data type is VARCHAR. If the new field name conflicts with an existing field name in the input data, refer to Retention and overwrite of old and new values.

  • Operations on time fields __time__ and __time_ns_part__ are not permitted. For more information, see Time fields.

Syntax

| parse-kv -mode=<mode> -prefix=<prefix> -regexp <field>, <pattern>

Parameter description

Parameter

Type

Required

Description

mode

String

No

If the new field name is the same as an existing field name in the input data, you can select the data overwrite mode.

The default value is overwrite. For more information, see the field value overwrite mode.

prefix

String

No

The prefix of the output field name. The default value is an empty string.

regexp

Bool

Yes

Specifies whether to enable the regular expression extraction mode.

field

Field

Yes

The original name of the field from which you want to extract information.

Make sure that this field is included in the input data, the data type is VARCHAR, and the field value is a non-null value. Otherwise, the extract operation is not performed.

pattern

RegExp

Yes

The regular expression contains two capturing groups. The first capturing group extracts the field name. The second capturing group extracts the field value. The RE2 syntax is supported.

Sample statement

  • Example 1: In regular extraction mode, process complex delimiters between key-value pairs and separators between keys and values.

    • SPL statement

      * | parse-kv -regexp content, '([^&?]+)(?:=|:)([^&?]+)'
    • Input data

      content: 'k1=v1&k2=v2?k3:v3'
      k1: 'xyz'
    • Output data

      content: 'k1=v1&k2=v2?k3:v3'
      k1: 'v1'
      k2: 'v2'
      k3: 'v3'
  • Example 2: In regular extraction mode, extract information in preserve mode, retaining the original value for existing fields.

    • SPL statement

      * | parse-kv -regexp -mode='preserve' content, '([^&?]+)(?:=|:)([^&?]+)'
    • Input data

      content: 'k1=v1&k2=v2?k3:v3'
      k1: 'xyz'
    • Output results

      content: 'k1=v1&k2=v2?k3:v3'
      k1: 'xyz'
      k2: 'v2'
      k3: 'v3'
  • Example 3: In regular extraction mode, handle complex unstructured data where the value is a number or a string enclosed in double quotes.

    • SPL statement

      * | parse-kv -regexp content, '([^&?]+)(?:=|:)([^&?]+)'
    • Input data

      content: 'verb="GET" URI="/healthz" latency="45.911µs" userAgent="kube-probe/1.30+" audit-ID="" srcIP="192.168.123.45:40092" contentType="text/plain; charset=utf-8" resp=200'
    • Output results

      content: 'verb="GET" URI="/healthz" latency="45.911µs" userAgent="kube-probe/1.30+" audit-ID="" srcIP="192.168.123.45:40092" contentType="text/plain; charset=utf-8" resp=200'
      verb: 'GET'
      URI: '/healthz'
      latency: '45.911µs'
      userAgent: 'kube-probe/1.30+'
      audit-ID: ''
      srcIP: '192.168.123.45:40092'
      contentType: 'text/plain; charset=utf-8'
      resp: '200'
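
The two-capturing-group rule and the overwrite and preserve modes shown in Examples 1 and 2 can be sketched with Python's re module. This is an illustration only: SPL uses the RE2 syntax, and the helper name parse_kv is hypothetical.

```python
import re

def parse_kv(record, field, pattern, mode="overwrite", prefix=""):
    """Copy record and add one field per key-value match (hypothetical helper)."""
    result = dict(record)
    value = record.get(field)
    if not isinstance(value, str):
        return result  # field missing, null, or not a string: nothing is extracted
    # Group 1 captures the field name, group 2 captures the field value.
    for key, val in re.findall(pattern, value):
        name = prefix + key
        if mode == "preserve" and name in record:
            continue  # preserve mode keeps the existing value
        result[name] = val
    return result

entry = {"content": "k1=v1&k2=v2?k3:v3", "k1": "xyz"}
pattern = r"([^&?]+)(?:=|:)([^&?]+)"
overwritten = parse_kv(entry, "content", pattern)
preserved = parse_kv(entry, "content", pattern, mode="preserve")
print(overwritten)
print(preserved)
```

The non-capturing group (?:=|:) accepts either separator without counting as a third group, so the pattern still has exactly two capturing groups.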