This topic describes the syntax of regular expression functions. This topic also provides examples on how to use the functions.
The following table describes the regular expression functions that are supported by Simple Log Service. The regular expression uses the RE2 syntax.
Function | Syntax | Description | Supported in SQL | Supported in SPL |
regexp_extract_all(x, regular expression) | Extracts the substrings that match a specified regular expression from a specified string and returns an array of all matched substrings. | ✓ | × | |
regexp_extract_all(x, regular expression, n) | Extracts the substrings that match a specified regular expression from a specified string and returns an array of substrings that match the nth capturing group in the regular expression. | ✓ | × | |
regexp_extract(x, regular expression) | Extracts the first substring that matches a specified regular expression from a specified string and returns the substring. | ✓ | ✓ | |
regexp_extract(x, regular expression, n) | Extracts the substrings that match a specified regular expression from a specified string and returns the first substring that matches the nth capturing group in the regular expression. | ✓ | ✓ | |
regexp_like(x, regular expression) | Checks whether a specified string matches a specified regular expression. | ✓ | ✓ | |
regexp_replace(x, regular expression) | Deletes the substrings that match a specified regular expression from a specified string and returns the substrings that remain. | ✓ | ✓ | |
regexp_replace(x, regular expression, replace string) | Replaces the substrings that match a specified regular expression in a specified string and returns the result string. | ✓ | ✓ | |
regexp_split(x, regular expression) | Splits a specified string into multiple substrings by using a specified regular expression and returns an array of the substrings. | ✓ | × |
If you want to use a regular expression function to extract single quotation marks (') from a string, you must add an additional single quotation mark (') to the regular expression. For more information, see regexp_extract function (Example 3).
regexp_extract_all function
The regexp_extract_all function extracts the substrings that match a specified regular expression from a specified string.
Syntax
The following syntax is used to extract the substrings that match a specified regular expression from a specified string and return an array of all matched substrings:
regexp_extract_all(x, regular expression)
The following syntax is used to extract the substrings that match a specified regular expression from a specified string and return an array of substrings that match the nth capturing group in the regular expression:
regexp_extract_all(x, regular expression, n)
Parameters
Parameter | Description |
x | The value of this parameter is of the varchar type. |
regular expression | The regular expression that contains capturing groups. For example, |
n | The nth capturing group. n is an integer that is greater than or equal to 1. |
Return value type
The array type.
Examples
Example 1: Extract all numbers from the value of the
server_protocol
field.Sample field
server_protocol:HTTP/2.0
Query statement (Debug)
*| SELECT regexp_extract_all(server_protocol, '\d+')
Query and analysis results
Example 2: Extract the Chrome information from the value of the
http_user_agent
field and calculate the number of requests that are initiated by the Chrome browser.Sample field
http_user_agent:Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/535.1 (KHTML, like Gecko) Chrome/14.0.803.0 Safari/535.1
Query statement (Debug)
*| SELECT regexp_extract_all(http_user_agent, '(Chrome)',1) AS Chrome, count(*) AS count GROUP BY Chrome
Query and analysis results
regexp_extract function
The regexp_extract function extracts the substrings that match a specified regular expression from a specified string.
Syntax
The following syntax is used to extract the first substring that matches a specified regular expression from a specified string and return the substring:
regexp_extract(x, regular expression)
The following syntax is used to extract the substrings that match a specified regular expression from a specified string and return the first substring that matches the nth capturing group in the regular expression:
regexp_extract(x, regular expression, n)
Parameters
Parameter | Description |
x | The value of this parameter is of the varchar type. |
regular expression | The regular expression that contains capturing groups. For example, |
n | The nth capturing group. n is an integer that is greater than or equal to 1. |
Return value type
The varchar type.
Examples
SQL
Example 1: Extract the first digit from the value of the
server_protocol
field.Sample field
server_protocol:HTTP/2.0
Query statement (Debug)
*|SELECT regexp_extract(server_protocol, '\d+')
Query and analysis results
Example 2: Extract the file information from the value of the
request_uri
field and calculate the number of visits for each file.Sample field
request_uri:/request/path-3/file-5
Query statement (Debug)
* | SELECT regexp_extract(request_uri, '.*\/(file.*)', 1) AS file, count(*) AS count GROUP BY file
Query and analysis results
Example 3: Extract single quotation marks (') and digits from the value of the
message
field.Sample field
message:error'1232
Query statement
* | SELECT regexp_extract(message, '''\d+')
NoteIf you want to use a regular expression function to extract single quotation marks (') from a string, you must add an additional single quotation mark (') to the regular expression.
Query and analysis results
SPL
Example 1: Extract the first digit from the value of the server_protocol field.
Sample field
server_protocol:HTTP/2.0
SPL statement
* | extend a = regexp_extract(server_protocol, '\d+')
SPL results
Example 2: Extract the file information from the value of the
request_uri
field.Sample field
request_uri:/request/path-3/file-5
SPL statement
* | extend a = regexp_extract(request_uri, '.*\/(file.*)',1)
SPL results
Example 3: Extract single quotation marks (') and digits from the value of the
message
field.Sample field
message:error'1232
SPL statement
* | extend a = regexp_extract(message, '''\d+')
Note
If you want to use a regular expression function to extract single quotation marks (') from a string, you must add an additional single quotation mark (') to the regular expression.
SPL results
regexp_like function
The regexp_like function checks whether a specified string matches a specified regular expression.
Syntax
regexp_like(x, regular expression)
Parameters
Parameter | Description |
x | The value of this parameter is of the varchar type. |
regular expression | The regular expression. |
Return value type
The Boolean type.
Examples
SQL
Check whether the value of the server_protocol field contains digits.
Sample field
server_protocol:HTTP/2.0
Query statement (Debug)
*| select regexp_like(server_protocol, '\d+')
Query and analysis results
SPL
Check whether the value of the server_protocol field contains digits.
Sample field
server_protocol:HTTP/2.0
SPL statement
* |extend a = regexp_like(server_protocol, '\d+')
SPL results
regexp_replace function
The regexp_replace function deletes or replaces the substrings that match a specified regular expression from or in a specified string.
Syntax
The following syntax is used to delete the substrings that match a specified regular expression from a specified string and return the substrings that remain:
regexp_replace(x, regular expression)
The following syntax is used to replace the substrings that match a specified regular expression in a specified string and return the result string:
regexp_replace(x, regular expression, replace string)
Parameters
Parameter | Description |
x | The value of this parameter is of the varchar type. |
regular expression | The regular expression. |
replace string | The substring used to replace the substrings that match the regular expression. |
Return value type
The varchar type.
Examples
SQL
Example 1: Replace the region name that starts with
cn
in the value of theregion
field with China and calculate the number of requests from China.Sample field
region:cn-shanghai
Query statement (Debug)
* | select regexp_replace(region, 'cn.*','China') AS region, count(*) AS count GROUP BY region
Query and analysis results
Example 2: Delete the version number from the value of the
server_protocol
field and calculate the number of requests for each communication protocol.Sample field
server_protocol:HTTP/2.0
Query statement (Debug)
*| select regexp_replace(server_protocol, '.\d+') AS server_protocol, count(*) AS count GROUP BY server_protocol
Query and analysis results
SPL
Example 1: Replace the region name that starts with cn in the value of the
region
field with China.Sample field
region:cn-shanghai
SPL statement
* | extend a = regexp_replace(region, 'cn.*','China')
SPL results
Example 2: Delete the version number from the value of the
server_protocol
field.Sample field
server_protocol:HTTP/2.0
SPL statement
* | extend a = regexp_replace(server_protocol, '.\d+')
SPL results
regexp_split function
The regexp_split function splits a specified string into multiple substrings by using a specified regular expression and returns an array of the substrings.
Syntax
regexp_split(x, regular expression)
Parameters
Parameter | Description |
x | The value of this parameter is of the varchar type. |
regular expression | The regular expression. |
Return value type
The array type.
Examples
Split the value of the request_uri
field with forward slashes (/).
Sample field
request_uri:/request/path-0/file-7
Query statement (Debug)
* | SELECT regexp_split(request_uri,'/')
Query and analysis results