This topic describes the syntax of regular expression functions and provides examples of their usage.
Simple Log Service supports the following regular expression functions using RE2 syntax.
Function Name |
Syntax |
Description |
Supported in SQL |
Supported in SPL |
regexp_extract_all(x, regular expression) |
Extracts substrings from a target string that match the given regular expression and returns a collection of all such substrings. |
✓ |
× |
|
regexp_extract_all(x, regular expression, n) |
Extracts substrings from a target string that match the given regular expression and returns a collection of substrings that match the target capturing group. |
✓ |
× |
|
regexp_extract(x, regular expression) |
Extracts and returns the first substring from a target string that matches the given regular expression. |
✓ |
✓ |
|
regexp_extract(x, regular expression, n) |
Extracts substrings from a target string that match the given regular expression and returns the first substring that matches the target capturing group. |
✓ |
✓ |
|
regexp_extract_bool(x, regular expression) |
Extracts substrings from a target string that match a given regular expression and converts them to Boolean type. If the conversion fails, |
✓ |
× |
|
regexp_extract_bool(x, regular expression, n) |
Extracts substrings from a target string that match a given regular expression and returns the substrings corresponding to the nth capturing group, converting them to Boolean type. If the conversion fails, |
✓ |
× |
|
regexp_extract_long(x, regular expression) |
Extracts substrings from a target string that match a given regular expression and converts them to bigint type. If the conversion fails, |
✓ |
× |
|
regexp_extract_long(x, regular expression, n) |
Extracts substrings from a target string that match a given regular expression and returns the substring corresponding to the nth capturing group, converting it to bigint type. If the conversion fails, |
✓ |
× |
|
regexp_extract_double(x, regular expression) |
Extracts the first substring from a target string that matches a given regular expression and converts it to double type. If the conversion fails, |
✓ |
× |
|
regexp_extract_double(x, regular expression, n) |
Extracts substrings from a target string that match a given regular expression and returns the substring corresponding to the nth capturing group, converting it to double type. If the conversion fails, |
✓ |
× |
|
regexp_like(x, regular expression) |
Determines if a target string matches a specific regular expression. |
✓ |
✓ |
|
regexp_replace(x, regular expression) |
Deletes substrings from a target string that match a given regular expression and returns the remaining substrings. |
✓ |
✓ |
|
regexp_replace(x, regular expression, replace string) |
Replaces substrings in a target string that match a given regular expression with the replacement string, returning the modified string. |
✓ |
✓ |
|
regexp_split(x, regular expression) |
Splits a target string into a collection of substrings using the given regular expression as the delimiter. |
✓ |
× |
To use a regular expression function for extracting single quotation marks (') from a string, you need to include an additional single quotation mark (') in the regular expression. For specific examples, see the regexp_extract function (Example 3).
regexp_extract_all function
The regexp_extract_all function retrieves all substrings from a given string that match a specified regular expression.
Syntax
-
Returns an array of substrings that match a specified regular expression within a given string.
regexp_extract_all(x, regular expression)
-
Returns an array of substrings from a given string that match the nth capturing group in the specified regular expression.
regexp_extract_all(x, regular expression, n)
Parameter description
Parameter |
Description |
x |
The parameter value is of varchar type. |
regular expression |
The regular expression with capturing groups, such as |
n |
The nth capturing group, where n is an integer starting from 1. |
Return value type
An array.
Examples
-
Example 1: Extract all digits from the
server_protocol
field value.-
Sample field
server_protocol:HTTP/2.0
-
Query statement (Debug)
*| SELECT regexp_extract_all(server_protocol, '\d+')
-
Query result
-
-
Example 2: Extract the "Chrome" portion from the
http_user_agent
field value and count the number of Chrome browser requests.-
Sample field
http_user_agent:Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/535.1 (KHTML, like Gecko) Chrome/14.0.803.0 Safari/535.1
-
Query statement (Debug)
*| SELECT regexp_extract_all(http_user_agent, '(Chrome)',1) AS Chrome, count(*) AS count GROUP BY Chrome
-
Query result
-
regexp_extract function
The regexp_extract function is designed to extract substrings from a specified string that match a given regular expression.
Syntax
-
This function retrieves the first substring that matches the specified regular expression from the input string and returns it.
regexp_extract(x, regular expression)
-
It also extracts substrings that match the specified regular expression from the input string and returns the first substring corresponding to the nth capturing group in the expression.
regexp_extract(x, regular expression, n)
Parameter description
Parameter |
Description |
x |
The parameter value is of varchar type. |
regular expression |
A regular expression that includes capturing groups, such as |
n |
The nth capturing group, where n is an integer starting from 1. |
Return value type
The return value is of the varchar type.
Examples
SQL
-
Example 1: Extract the first digit from the
server_protocol
field value.-
Sample field
server_protocol:HTTP/2.0
-
Query statement (test)
*|SELECT regexp_extract(server_protocol, '\d+')
-
Query result
-
-
Example 2: Extract the file part from the
request_uri
field value and count the access frequency for each file.-
Sample field
request_uri:/request/path-3/file-5
-
Query statement (test)
* | SELECT regexp_extract(request_uri, '.*\/(file.*)', 1) AS file, count(*) AS count GROUP BY file
-
Query result
-
-
Example 3: Extract the single quotation mark (') and digit part from the
message
field value.-
Sample field
message:error'1232
-
Query statement
* | SELECT regexp_extract(message, '''\d+')
NoteTo extract single quotation marks (') using a regular expression function, you must escape them by adding an additional single quotation mark (').
-
Query result
-
SPL
-
Example 1: Extract the first digit from the server_protocol field value.
-
Sample field
-
server_protocol:HTTP/2.0
-
Query statement
* | extend a = regexp_extract(server_protocol, '\d+')
-
Query result
-
Example 2: Extract the file part from the
request_uri
field value.-
Sample field
-
request_uri:/request/path-3/file-5
-
Query statement
* | extend a = regexp_extract(request_uri, '.*\/(file.*)',1)
-
Query result
-
Example 3: Extract the single quotation mark (') and digit part from the
message
field value.-
Sample field
-
message:error'1232
-
Query statement
* | extend a = regexp_extract(message, '''\d+')
Description
To extract single quotation marks (') using a regular expression function, you must escape them by adding an additional single quotation mark (').
-
Query result
regexp_extract_bool function
The regexp_extract_bool function extracts substrings that match a specified regular expression from a given string and converts them to Boolean type. If the conversion fails, it returns null
. The conversion succeeds only if the substring is "true" or "false," regardless of case sensitivity.
Syntax
-
Extracts substrings matching a specified regular expression from a given string and converts them to Boolean type. If the conversion fails, it returns
null
.regexp_extract_bool(x, regular expression)
-
Extracts substrings matching a specified regular expression from a given string, returns the substring that matches the specified capturing group, and converts it to Boolean type. If the conversion fails, it returns
null
.regexp_extract_bool(x, regular expression, n)
Parameter description
Parameter |
Description |
x |
The parameter value is of the varchar type. |
regular expression |
A regular expression that includes capturing groups. For instance, |
n |
The nth capturing group, where n is an integer starting from 1. |
Return value type
Boolean type.
Examples
-
Extract the Boolean value from the field value.
-
Sample Field
false
-
Query Statement (test)
*| select regexp_extract_bool('false', '[a-zA-Z]+')
-
Query and Analysis Results
-
regexp_extract_long function
The regexp_extract_long function retrieves substrings from a given string that match a specified regular expression and converts them into the bigint type. Should the conversion be unsuccessful, it returns null
.
Syntax
-
Extracts substrings matching a specified regular expression from a given string and converts them into the bigint type. Should the conversion fail, it returns
null
.regexp_extract_long(x, regular expression)
-
Extracts substrings matching a specified regular expression from a given string, returns those corresponding to the nth capturing group, and converts them into the bigint type. Should the conversion fail, it returns
null
.regexp_extract_long(x, regular expression, n)
Parameter description
Parameter |
Description |
x |
The parameter value is of varchar type. |
regular expression |
The regular expression includes capturing groups. For instance, |
n |
The nth capturing group. n is an integer starting from 1. |
Return value type
The return value is of bigint type.
Examples
-
Extract digits from the
time
field.-
Sample field
time:19/Dec/2024:06:16:06
-
Query statement and analysis (test)
*|SELECT regexp_extract_long(time, '(\d{2})/', 1)
-
Query and analysis results
-
regexp_extract_double function
The regexp_extract_double function extracts substrings that match a specified regular expression from a given string and converts them to the double type. If the conversion is unsuccessful, it returns null
.
Syntax
Extracts substrings matching a specified regular expression from a given string and converts them to the double type. If the conversion is unsuccessful, it returns
null
.regexp_extract_double(x, regular expression)
Extracts substrings matching a specified regular expression from a given string, returns the substring that matches the nth capturing group, and converts it to the double type. If the conversion is unsuccessful, it returns
null
.regexp_extract_double(x, regular expression, n)
Parameter description
Parameter | Description |
x | The parameter value is of varchar type. |
regular expression | The regular expression that includes capturing groups. For instance, |
n | The nth capturing group, where n is an integer starting from 1. |
Return value type
The type returned is double.
Examples
Extract digits from the
server_protocol
field.Sample Field
server_protocol:HTTP/1.1
Query Statement and Analysis (test)
*|SELECT regexp_extract_double(server_protocol, '\d+')
Query and Analysis Results
Regexp_like function
The regexp_like function determines if a given string matches a specific regular expression.
Syntax
regexp_like(x, regular expression)
Parameter description
Parameter |
Description |
x |
This parameter is a varchar type value. |
regular expression |
Regular expression. |
Return value type
Boolean.
Examples
SQL
Determine if the server_protocol field contains any digits.
-
Sample Field
server_protocol:HTTP/2.0
-
Query Statement and Analysis (debug)
*| select regexp_like(server_protocol, '\d+')
-
Query and Analysis Results
SPL
Verify if the server_protocol field includes digits.
-
Sample Field
server_protocol:HTTP/2.0
-
Query Statement and Analysis
* |extend a = regexp_like(server_protocol, '\d+')
-
Query and Analysis Results
regexp_replace function
The regexp_replace function deletes or replaces substrings in a given string that match a specified regular expression.
Syntax
-
This function removes substrings that match a specified regular expression from a given string and returns the remaining substrings.
regexp_replace(x, regular expression)
-
It replaces substrings that match a specified regular expression in a given string and returns the modified string.
regexp_replace(x, regular expression, replace string)
Parameter description
Parameter |
Description |
x |
The parameter value is of varchar type. |
regular expression |
Regular expression. |
replace string |
The substring used for replacement. |
Return value type
The return value is of the varchar type.
Examples
SQL
-
Example 1: Replace all region names starting with
cn
in theregion
field with China and count the number of requests from China.-
Sample fields
region:cn-shanghai
-
Query statements (debug)
* | select regexp_replace(region, 'cn.*','中国') AS region, count(*) AS count GROUP BY region
-
Query results
-
-
Example 2: Delete the version number from the
server_protocol
field's value and count the number of requests for each communication protocol.-
Sample fields
server_protocol:HTTP/2.0
-
Query statements (debug)
*| select regexp_replace(server_protocol, '.\d+') AS server_protocol, count(*) AS count GROUP BY server_protocol
-
Query results
-
SPL
-
Example 1: Replace all region names starting with cn in the
region
field with China.-
Sample fields
-
region:cn-shanghai
-
Query statements
* | extend a = regexp_replace(region, 'cn.*','中国')
-
Query results
-
Example 2: Delete the version number from the
server_protocol
field's value.-
Sample fields
-
server_protocol:HTTP/2.0
-
Query statements (debug)
* | extend a = regexp_replace(server_protocol, '.\d+')
-
Query results
regexp_split function
The regexp_split function divides a given string into an array of substrings based on a specified regular expression.
Syntax
regexp_split(x, regular expression)
Parameter description
Parameter |
Description |
x |
The value of this parameter is of the varchar type. |
regular expression |
Regular expression. |
Return value type
An array.
Example
Split the request_uri
field value using a forward slash (/).
-
Sample Field
request_uri:/request/path-0/file-7
-
Query Statement (test)
* | SELECT regexp_split(request_uri,'/')
-
Query and Analysis Results