HyperLogLog functions are approximate aggregate functions and are similar to the approx_distinct function. If a large amount of data is involved in computation, HyperLogLog functions can be used to return estimation results within a shorter period of time. This topic describes the syntax of HyperLogLog functions. This topic also provides examples on how to use HyperLogLog functions.
The following table describes the HyperLogLog functions that are supported by Simple Log Service.
Function | Syntax | Description | Supported in SQL | Supported in SPL |
approx_set(x) | Estimates the number of distinct values in the x field. The maximum standard error is 0.01625, which is the default value. | √ | × | |
cardinality(x) | Converts HyperLogLog data to bigint data. | √ | × | |
empty_approx_set() | Returns a null value of the HyperLogLog type. The maximum standard error is 0.01625, which is the default value. | √ | × | |
merge(x) | Aggregates all HyperLogLog values. | √ | × |
approx_set function
The approx_set function estimates the number of distinct values in the x field. The maximum standard error is 0.01625, which is the default value.
Syntax
approx_set(x)
Parameters
Parameter | Description |
x | The value of this parameter is of an arbitrary data type. |
Return value type
The HyperLogLog type.
Examples
Estimate the number of unique visitors (UVs) per minute. The return value is of the HyperLogLog type.
Query statement
* | SELECT date_trunc('minute', __time__) AS Time, approx_set(client_ip) AS UV FROM website_log GROUP BY Time ORDER BY Time
Query and analysis results
cardinality function
The cardinality function converts HyperLogLog data to bigint data.
Syntax
cardinality(x)
Parameters
Parameter | Description |
x | The value of this parameter is of the HyperLogLog type. |
Return value type
The bigint type.
Examples
Convert HyperLogLog data to bigint data. The approx_set function returns the estimated number of UVs per minute. The return value is of the HyperLogLog type. The cardinality function converts the return value to bigint data.
Query statement
* | SELECT Time, cardinality(UV) AS UV FROM ( SELECT date_trunc('minute', __time__) AS Time, approx_set(client_ip) AS UV FROM website_log GROUP BY Time ORDER BY Time ) AS UV
Query and analysis results
empty_approx_set function
The empty_approx_set function returns a null value of the HyperLogLog type. The maximum standard error is 0.01625, which is the default value.
Syntax
empty_approx_set()
Return value type
The HyperLogLog type.
Examples
Obtain a null value of the HyperLogLog type.
Query statement
* | SELECT empty_approx_set()
Query and analysis results
merge function
The merge function aggregates all HyperLogLog values.
Syntax
merge(x)
Parameters
Parameter | Description |
x | The value of this parameter is of the HyperLogLog type. |
Return value type
The HyperLogLog type.
Examples
Aggregate HyperLogLog values. The approx_set function returns the estimated number of UVs per minute. The merge function aggregates the numbers of UVs of 15 minutes. The cardinality function converts the HyperLogLog data into bigint data.
Query statement
* | SELECT Time, cardinality(UV) AS UV, cardinality(merge(UV) over()) AS Total_UV FROM ( SELECT date_trunc('minute', __time__) AS Time, approx_set(client_ip) AS UV FROM log GROUP BY Time ORDER BY Time )
Query and analysis results