This topic describes Simple Log Service Processing Language (SPL) instructions.
Parameter data types
The following table describes the data types of parameters supported in SPL instructions.
Parameter data type | Description |
Bool | The parameter specifies a Boolean value. This type of parameter is a switch in SPL instructions. |
Char | The parameter specifies an ASCII character value. You must enclose the value in single quotation marks (''). For example, |
Integer | The parameter specifies an integer value. |
String | The parameter specifies a string value. You must enclose the value in single quotation marks (''). Example: |
RegExp | The parameter specifies an RE2 regular expression. You must enclose the value in single quotation marks (''). Example: For more information about the RE2 syntax, see Syntax. |
JSONPath | The parameter specifies a JSONPath value. You must enclose the value in single quotation marks (''). Example: For more information about the JSONPath syntax, see JsonPath. |
Field | The parameter specifies a field name. Example: If a field name contains characters other than letters, digits, and underscores (_), you must enclose the field name in double quotation marks (""). Example: Note: For more information about the case sensitivity of field names, see SPL in different scenarios. |
FieldPattern | The parameter specifies a field name or a combination of a field name and a wildcard character. An asterisk (*) can be used as a wildcard character, which matches zero or more characters. You must enclose the value in double quotation marks (""). Example: Note: For more information about the case sensitivity of field names, see SPL in different scenarios. |
SPLExp | The parameter specifies an SPL expression. |
SQLExp | The parameter specifies an SQL expression. |
Field processing instructions
project
This instruction retains the fields that match the specified pattern and renames the specified fields. During instruction execution, all retain-related expressions are executed before rename-related expressions.
By default, the __time__ and __time_ns_part__ time fields are retained and cannot be renamed or overwritten. For more information, see Time fields.
Syntax
| project -wildcard <field-pattern>, <output>=<field>, ...
Parameters
Parameter | Type | Required | Description |
wildcard | Bool | No | Specifies whether to enable the wildcard match mode. By default, the exact match mode is used. If you want to enable the wildcard match mode, you must configure this parameter. |
field-pattern | FieldPattern | Yes | The name of the field to retain, or a combination of a field and a wildcard character. All matched fields are processed. |
output | Field | Yes | The new name of the field to rename. You cannot rename multiple fields to the same name. Important: If the new name is the same as an existing field name in the input data, the value that is kept under that name varies. For more information, see Retain and overwrite old and new values. |
field | Field | Yes | The original name of the field to rename. |
Examples
Example 1: Retain a field.
* | project level, err_msg
Example 2: Rename a field.
* | project log_level=level, err_msg
Example 3: Retain fields based on __tag__:* in exact match mode.
* | project "__tag__:*"
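The retain-then-rename order and the difference between exact and wildcard match mode can be illustrated with a small Python model. This is only a sketch of the behavior described in this section, not the Simple Log Service implementation. The names project and field_matches are hypothetical, and the handling of name collisions is not modeled.

```python
import re

TIME_FIELDS = ("__time__", "__time_ns_part__")


def field_matches(name, pattern, wildcard):
    """Exact comparison by default; with wildcard=True, '*' matches zero or more characters."""
    if not wildcard:
        return name == pattern
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.fullmatch(regex, name, flags=re.DOTALL) is not None


def project(record, patterns=(), renames=(), wildcard=False):
    """Hypothetical model of `project`; renames is a list of (new_name, original_name)."""
    # Time fields are retained by default.
    out = {name: record[name] for name in TIME_FIELDS if name in record}
    # Retain step: runs before any rename.
    for name, value in record.items():
        if any(field_matches(name, p, wildcard) for p in patterns):
            out[name] = value
    # Rename step: reads from the input record and writes under the new name.
    for new_name, original in renames:
        if original in record:
            out[new_name] = record[original]
    return out


record = {"__time__": 1700000000, "level": "ERROR", "err_msg": "disk full", "host": "web-1"}
print(project(record, patterns=["err_msg"], renames=[("log_level", "level")]))

tagged = {"__tag__:*": "literal", "__tag__:host": "web-1"}
print(project(tagged, patterns=["__tag__:*"]))                 # exact match mode
print(project(tagged, patterns=["__tag__:*"], wildcard=True))  # wildcard match mode
```

In this model, the exact-mode call keeps only the field whose name is literally __tag__:*, while the wildcard call keeps every field whose name starts with __tag__:.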
project-away
This instruction removes the fields that match the specified pattern and retains all other fields as they are.
By default, the __time__ and __time_ns_part__ time fields are retained. For more information, see Time fields.
Syntax
| project-away -wildcard <field-pattern>, ...
Parameters
Parameter | Type | Required | Description |
wildcard | Bool | No | Specifies whether to enable the wildcard match mode. By default, the exact match mode is used. If you want to enable the wildcard match mode, you must configure this parameter. |
field-pattern | FieldPattern | Yes | The name of the field to remove, or a combination of a field and a wildcard character. All matched fields are processed. |
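The following Python sketch models how project-away might behave in exact and wildcard match mode. It is an illustration under assumptions, not the service implementation: project_away and field_matches are hypothetical names, and the sample record is invented for the example.

```python
import re

TIME_FIELDS = {"__time__", "__time_ns_part__"}


def field_matches(name, pattern, wildcard):
    """Exact comparison by default; with wildcard=True, '*' matches zero or more characters."""
    if not wildcard:
        return name == pattern
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.fullmatch(regex, name, flags=re.DOTALL) is not None


def project_away(record, patterns, wildcard=False):
    """Hypothetical model: drop matching fields, keep time fields and everything else."""
    return {
        name: value
        for name, value in record.items()
        if name in TIME_FIELDS
        or not any(field_matches(name, p, wildcard) for p in patterns)
    }


record = {"__time__": 1700000000, "__tag__:host": "web-1", "__tag__:path": "/var/log", "level": "INFO"}
print(project_away(record, ["__tag__:*"], wildcard=True))  # tag fields removed
print(project_away(record, ["__tag__:*"]))                 # exact mode: no field has this literal name
```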
project-rename
This instruction renames the specified fields and retains all other fields as they are.
By default, the __time__ and __time_ns_part__ time fields are retained and cannot be renamed or overwritten. For more information, see Time fields.
Syntax
| project-rename <output>=<field>, ...
Parameters
Parameter | Type | Required | Description |
output | Field | Yes | The new name of the field to rename. You cannot rename multiple fields to the same name. Important: If the new name is the same as an existing field name in the input data, the value that is kept under that name varies. For more information, see Retain and overwrite old and new values. |
field | Field | Yes | The original name of the field to rename. |
Examples
Rename the specified fields.
* | project-rename log_level=level, log_err_msg=err_msg
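As a rough illustration, the rename-and-keep-the-rest behavior can be modeled in Python as follows. The function project_rename is a hypothetical name, the sketch simply leaves time fields untouched, and it does not model what happens when a new name collides with an existing field.

```python
TIME_FIELDS = {"__time__", "__time_ns_part__"}


def project_rename(record, renames):
    """Hypothetical model; renames is a list of (new_name, original_name) pairs."""
    new_names = [new for new, _ in renames]
    if len(new_names) != len(set(new_names)):
        raise ValueError("cannot rename multiple fields to the same name")
    # Assumption: time fields are simply skipped rather than rejected.
    old_to_new = {old: new for new, old in renames if old not in TIME_FIELDS}
    return {old_to_new.get(name, name): value for name, value in record.items()}


record = {"__time__": 1700000000, "level": "ERROR", "err_msg": "disk full", "host": "web-1"}
renamed = project_rename(record, [("log_level", "level"), ("log_err_msg", "err_msg")])
print(renamed)
```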
SQL calculation instructions on structured data
extend
This instruction creates fields based on the result of SQL expression-based data calculation. For more information about the SQL functions that are supported by SPL, see SPL-supported SQL functions.
Syntax
| extend <output>=<sql-expr>, ...
Parameters
Parameter | Type | Required | Description |
output | Field | Yes | The name of the field to create. You cannot assign the results of multiple expressions to the same field. Important: If the new field name is the same as an existing field name in the input data, the new field overwrites the existing field based on the data type and value. |
sql-expr | SQLExp | Yes | The data processing expression. Important: For more information about how to process null values, see Process null values for SQL expressions. |
Examples
Example 1: Use a computation expression.
* | extend Duration = EndTime - StartTime
Example 2: Use a regular expression.
* | extend server_protocol_version=regexp_extract(server_protocol, '\d+')
Example 3: Extract JSONPath content and convert the data type of a field.
SPL statement
* | extend a=json_extract(content, '$.body.a'), b=json_extract(content, '$.body.b') | extend b=cast(b as BIGINT)
Input data
content: '{"body": {"a": 1, "b": 2}}'
Result
content: '{"body": {"a": 1, "b": 2}}'
a: '1'
b: 2
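To make the data flow of Example 3 concrete, the following Python sketch reproduces the same two steps: extract values as text, then convert one of them to an integer. The helper json_extract_path is hypothetical and handles only simple '$.key.key' paths. It is not the json_extract function that SPL provides.

```python
import json


def json_extract_path(text, path):
    """Follow a simple '$.key.key' path (object keys only) and return the value as JSON text."""
    node = json.loads(text)
    for key in (k for k in path.lstrip("$").split(".") if k):
        node = node[key]
    return json.dumps(node)


record = {"content": '{"body": {"a": 1, "b": 2}}'}

# First extend step: both extracted values are text.
record["a"] = json_extract_path(record["content"], "$.body.a")
record["b"] = json_extract_path(record["content"], "$.body.b")

# Second extend step: b is overwritten with its integer form, like cast(b as BIGINT).
record["b"] = int(record["b"])
print(record)
```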
where
This instruction filters data based on the result of SQL expression-based data calculation. Data that matches the specified SQL expression is retained. For more information about the SQL functions that are supported by the where instruction, see SPL-supported SQL functions.
Syntax
| where <sql-expr>
Parameters
Parameter | Type | Required | Description |
sql-expr | SQLExp | Yes | The SQL expression. Data that matches this expression is retained. Important: For more information about how to process null values, see Process null values for SQL expressions. |
Examples
Example 1: Filter data based on field content.
* | where userId='123'
Example 2: Filter data by using a regular expression that is matched against the value of a field.
* | where regexp_like(server_protocol, '\d+')
Example 3: Convert the data type of a field to match all server errors (status 500 or higher).
* | where cast(status as BIGINT) >= 500
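The filter in Example 3 can be pictured with a short Python model: the status value is text, so it is converted to an integer before the comparison. The function where and the sample records are hypothetical. Null handling and failed conversions are not modeled here.

```python
def where(records, predicate):
    """Hypothetical model of `where`: keep only the records for which the predicate is true."""
    return [record for record in records if predicate(record)]


logs = [
    {"status": "200", "path": "/index.html"},
    {"status": "502", "path": "/api/orders"},
    {"status": "500", "path": "/api/users"},
]

# Equivalent in spirit to: * | where cast(status as BIGINT) >= 500
server_errors = where(logs, lambda record: int(record["status"]) >= 500)
print(server_errors)
```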
Extraction instructions on semi-structured data
parse-regexp
This instruction extracts the information that matches groups in the specified regular expression from the specified field.
The data type of the output field is VARCHAR. If the output field name is the same as an existing field name in the input data, the value that is kept depends on the scenario. For more information, see Retain and overwrite old and new values.
The __time__ and __time_ns_part__ time fields are not supported. For more information, see Time fields.
Syntax
| parse-regexp <field>, <pattern> as <output>, ...
Parameters
Parameter | Type | Required | Description |
field | Field | Yes | The original name of the field from which you want to extract information. Make sure that this field is included in the input data, the field type is |
pattern | RegExp | Yes | The regular expression. The RE2 syntax is supported. |
output | Field | No | The name of the output field that stores the value extracted by the regular expression. |
Examples
Example 1: Use the exploratory match mode.
SPL statement
* | parse-regexp content, '(\S+)' as ip -- Generate the ip: 10.0.0.0 field.
| parse-regexp content, '\S+\s+(\w+)' as method -- Generate the method: GET field.
Input data
content: '10.0.0.0 GET /index.html 15824 0.043'
Result
content: '10.0.0.0 GET /index.html 15824 0.043'
ip: '10.0.0.0'
method: 'GET'
Example 2: Use the full pattern match mode and use unnamed capturing groups in a regular expression.
SPL statement
* | parse-regexp content, '(\S+)\s+(\w+)' as ip, method
Input data
content: '10.0.0.0 GET /index.html 15824 0.043'
Result
content: '10.0.0.0 GET /index.html 15824 0.043'
ip: '10.0.0.0'
method: 'GET'
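The two examples above can be approximated in Python to show how capturing groups map to output fields. This sketch uses Python's re module rather than RE2, so it is only a stand-in for patterns that behave the same in both engines. The function parse_regexp is a hypothetical name, and the sketch overwrites existing fields without modeling the retain-or-overwrite rules.

```python
import re


def parse_regexp(record, field, pattern, outputs):
    """Hypothetical model: assign capturing groups, in order, to the output field names."""
    result = dict(record)
    value = record.get(field)
    if value is None:
        return result
    match = re.search(pattern, value)
    if match:
        for name, group in zip(outputs, match.groups()):
            result[name] = group
    return result


record = {"content": "10.0.0.0 GET /index.html 15824 0.043"}

# Example 1: two passes, one capturing group each.
step1 = parse_regexp(record, "content", r"(\S+)", ["ip"])
step2 = parse_regexp(step1, "content", r"\S+\s+(\w+)", ["method"])

# Example 2: one pass, two unnamed capturing groups.
single = parse_regexp(record, "content", r"(\S+)\s+(\w+)", ["ip", "method"])
print(step2)
print(single)
```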
parse-csv
This instruction extracts information in the CSV format from the specified field.
The data type of the output field is VARCHAR. If the output field name is the same as an existing field name in the input data, the value that is kept depends on the scenario. For more information, see Retain and overwrite old and new values.
The __time__ and __time_ns_part__ time fields are not supported. For more information, see Time fields.
Syntax
| parse-csv -delim=<delim> -quote=<quote> -strict <field> as <output>, ...
Parameters
Parameter | Type | Required | Description |
delim | String | No | The delimiter of the input data. You can specify one to three valid ASCII characters. You can use escape sequences to indicate special characters. For example, \t indicates the tab character, \11 indicates the ASCII character whose octal code is 11, and \x09 indicates the ASCII character whose hexadecimal code is 09. You can also use a combination of multiple characters as a delimiter, such as $$$ and ^_^. Default value: comma (,). |
quote | Char | No | The quote character of the input data. You can specify a single valid ASCII character. If values in the input data contain the delimiter, you must specify a quote character. For example, you can specify a double quotation mark ("), a single quotation mark ('), or an unprintable character (0x01). Default value: double quotation mark ("). Important: This parameter takes effect only if you set the delim parameter to a single character. You must specify different values for the quote and delim parameters. |
strict | Bool | No | Specifies whether to enable strict pairing when the number of values in the input data is different from the number of fields specified in the output parameter. Default value: False. If you want to enable strict pairing, configure this parameter. |
field | Field | Yes | The name of the field that you want to parse. Make sure that this field is included in the input data, the field type is |
output | Field | Yes | The name of the field that you want to use to store the parsing result of the input data. |
Examples
Example 1: Match data in simple mode.
SPL statement
* | parse-csv content as x, y, z
Input data
content: 'a,b,c'
Result
content: 'a,b,c'
x: 'a'
y: 'b'
z: 'c'
Example 2: Use double quotation marks as the quote to match data that contains special characters.
SPL statement
* | parse-csv content as ip, time, host
Input data
content: '192.168.0.100,"10/Jun/2019:11:32:16,127 +0800",example.aliyundoc.com'
Result
content: '192.168.0.100,"10/Jun/2019:11:32:16,127 +0800",example.aliyundoc.com'
ip: '192.168.0.100'
time: '10/Jun/2019:11:32:16,127 +0800'
host: 'example.aliyundoc.com'
Example 3: Use a combination of multiple characters as the delimiter.
SPL statement
* | parse-csv -delim='||' content as time, ip, req
Input data
content: '05/May/2022:13:30:28||127.0.0.1||POST /put?a=1&b=2'
Result
content: '05/May/2022:13:30:28||127.0.0.1||POST /put?a=1&b=2'
time: '05/May/2022:13:30:28'
ip: '127.0.0.1'
req: 'POST /put?a=1&b=2'
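The three examples above can be reproduced with a small Python model built on the standard csv module. It is a sketch under assumptions: parse_csv is a hypothetical name, the quote character is applied only for a single-character delimiter (as the quote parameter describes), and a mismatch between the number of values and the number of output fields is rejected instead of modeling the strict parameter.

```python
import csv


def parse_csv(record, field, outputs, delim=",", quote='"'):
    """Hypothetical model: split the field value and pair the values with output names."""
    text = record[field]
    if len(delim) == 1:
        # Single-character delimiter: quoted values may contain the delimiter.
        values = next(csv.reader([text], delimiter=delim, quotechar=quote))
    else:
        # Multi-character delimiter: plain split, the quote character is not applied.
        values = text.split(delim)
    if len(values) != len(outputs):
        raise ValueError("value/field count mismatch is not modeled in this sketch")
    return {**record, **dict(zip(outputs, values))}


simple = parse_csv({"content": "a,b,c"}, "content", ["x", "y", "z"])
quoted = parse_csv(
    {"content": '192.168.0.100,"10/Jun/2019:11:32:16,127 +0800",example.aliyundoc.com'},
    "content",
    ["ip", "time", "host"],
)
multi = parse_csv(
    {"content": "05/May/2022:13:30:28||127.0.0.1||POST /put?a=1&b=2"},
    "content",
    ["time", "ip", "req"],
    delim="||",
)
print(simple, quoted, multi, sep="\n")
```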
parse-json
This instruction extracts the first-layer JSON information from the specified field.
The data type of the output field is VARCHAR. If the output field name is the same as an existing field name in the input data, the value that is kept depends on the scenario. For more information, see Retain and overwrite old and new values.
The __time__ and __time_ns_part__ time fields are not supported. For more information, see Time fields.
Syntax
| parse-json -mode=<mode> -path=<path> -prefix=<prefix> <field>
Parameters
Parameter | Type | Required | Description |
mode | String | No | The mode that is used to extract information when the name of the output field is the same as an existing field name in the input data. The default value is overwrite. |
path | JSONPath | No | The JSON path in the specified field. The JSON path is used to locate the information that you want to extract. The default value is an empty string. If you use the default value, the complete data of the specified field is extracted. |
prefix | String | No | The prefix of the fields that are generated by expanding a JSON structure. The default value is an empty string. |
field | Field | Yes | The name of the field that you want to parse. Make sure that this field is included in the input data and the field value is a non-null value and meets one of the following conditions. Otherwise, the extract operation is not performed. |
Examples
Example 1: Extract all keys and values from the y field.
SPL statement
* | parse-json y
Input data
x: '0'
y: '{"a": 1, "b": 2}'
Result
x: '0'
y: '{"a": 1, "b": 2}'
a: '1'
b: '2'
Example 2: Extract the value of the body key from the content field as different fields.
SPL statement
* | parse-json -path='$.body' content
Input data
content: '{"body": {"a": 1, "b": 2}}'
Result
content: '{"body": {"a": 1, "b": 2}}'
a: '1'
b: '2'
Example 3: Extract information in preserve mode. For an existing field, retain the original value.
SPL statement
* | parse-json -mode='preserve' y
Input data
a: 'xyz'
x: '0'
y: '{"a": 1, "b": 2}'
Result
x: '0'
y: '{"a": 1, "b": 2}'
a: 'xyz'
b: '2'
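The first-layer expansion and the overwrite and preserve modes shown in the examples above can be modeled in Python as follows. This is a sketch, not the service implementation: parse_json is a hypothetical name, the path handling covers only simple '$.key.key' paths, and rendering nested values with json.dumps is an assumption about how non-string values are converted to text.

```python
import json


def parse_json(record, field, mode="overwrite", path="", prefix=""):
    """Hypothetical model: expand the first layer of a JSON object into text fields."""
    result = dict(record)
    value = record.get(field)
    if value is None:
        return result
    node = json.loads(value)
    for key in (k for k in path.lstrip("$").split(".") if k):
        node = node[key]
    if not isinstance(node, dict):
        return result
    for key, item in node.items():
        name = prefix + key
        if mode == "preserve" and name in record:
            continue  # keep the original value of an existing field
        # Assumption: every extracted value is stored as text.
        result[name] = item if isinstance(item, str) else json.dumps(item)
    return result


all_keys = parse_json({"x": "0", "y": '{"a": 1, "b": 2}'}, "y")
from_body = parse_json({"content": '{"body": {"a": 1, "b": 2}}'}, "content", path="$.body")
preserved = parse_json({"a": "xyz", "x": "0", "y": '{"a": 1, "b": 2}'}, "y", mode="preserve")
print(all_keys, from_body, preserved, sep="\n")
```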