All Products
Search
Document Center

Simple Log Service:HyperLogLog functions

Last Updated:Feb 22, 2024

HyperLogLog functions are approximate aggregate functions and are similar to the approx_distinct function. If a large amount of data is involved in computation, HyperLogLog functions can be used to return estimation results within a shorter period of time. This topic describes the syntax of HyperLogLog functions. This topic also provides examples on how to use HyperLogLog functions.

The following table describes the HyperLogLog functions that are supported by Simple Log Service.

Important If you want to use strings in analytic statements, you must enclose strings in single quotation marks (''). Strings that are not enclosed or enclosed in double quotation marks ("") indicate field names or column names. For example, 'status' indicates the status string, and status or "status" indicates the status log field.

Function

Syntax

Description

Supported in SQL

Supported in SPL

approx_set function

approx_set(x)

Estimates the number of distinct values in the x field. The maximum standard error is 0.01625, which is the default value.

×

cardinality function

cardinality(x)

Converts HyperLogLog data to bigint data.

×

empty_approx_set function

empty_approx_set()

Returns a null value of the HyperLogLog type. The maximum standard error is 0.01625, which is the default value.

×

merge function

merge(x)

Aggregates all HyperLogLog values.

×

approx_set function

The approx_set function estimates the number of distinct values in the x field. The maximum standard error is 0.01625, which is the default value.

Syntax

approx_set(x)

Parameters

Parameter

Description

x

The value of this parameter is of an arbitrary data type.

Return value type

The HyperLogLog type.

Examples

Estimate the number of unique visitors (UVs) per minute. The return value is of the HyperLogLog type.

  • Query statement

    * |
    SELECT
      date_trunc('minute', __time__) AS Time,
      approx_set(client_ip) AS UV
    FROM  website_log
    GROUP BY
      Time
    ORDER BY
      Time
  • Query and analysis resultsapprox_set

cardinality function

The cardinality function converts HyperLogLog data to bigint data.

Syntax

cardinality(x)

Parameters

Parameter

Description

x

The value of this parameter is of the HyperLogLog type.

Return value type

The bigint type.

Examples

Convert HyperLogLog data to bigint data. The approx_set function returns the estimated number of UVs per minute. The return value is of the HyperLogLog type. The cardinality function converts the return value to bigint data.

  • Query statement

    * |
    SELECT
      Time,
      cardinality(UV) AS UV
    FROM  (
        SELECT
          date_trunc('minute', __time__) AS Time,
          approx_set(client_ip) AS UV
        FROM      website_log
        GROUP BY
          Time
        ORDER BY
          Time
      ) AS UV
  • Query and analysis resultscardinality

empty_approx_set function

The empty_approx_set function returns a null value of the HyperLogLog type. The maximum standard error is 0.01625, which is the default value.

Syntax

empty_approx_set()

Return value type

The HyperLogLog type.

Examples

Obtain a null value of the HyperLogLog type.

  • Query statement

    * | SELECT  empty_approx_set()
  • Query and analysis resultsempty_approx_set

merge function

The merge function aggregates all HyperLogLog values.

Syntax

merge(x)

Parameters

Parameter

Description

x

The value of this parameter is of the HyperLogLog type.

Return value type

The HyperLogLog type.

Examples

Aggregate HyperLogLog values. The approx_set function returns the estimated number of UVs per minute. The merge function aggregates the numbers of UVs of 15 minutes. The cardinality function converts the HyperLogLog data into bigint data.

  • Query statement

    * |
    SELECT
      Time,
      cardinality(UV) AS UV,
      cardinality(merge(UV) over()) AS Total_UV
    FROM  (
        SELECT
          date_trunc('minute', __time__) AS Time,
          approx_set(client_ip) AS UV
        FROM      log
        GROUP BY
          Time
        ORDER BY
          Time
      )
  • Query and analysis resultsmerge