Promo Center

50% off for new user

Direct Mail-46% off

Learn More

Basic funnel function (windowFunnel)

Updated at: 2025-02-18 02:55

The basic funnel function windowFunnel can be used to calculate the funnel results of events within a specific time window. This topic describes how to use this function.

Limits

Only Hologres V0.9 and later support the windowFunnel function.

Precautions

To use funnel functions, you must execute the following statement as a superuser to install an extension:

CREATE extension flow_analysis; --Install the extension.
  • The extension is installed at the database level. For each database, you need to install the extension only once.

  • By default, the extension is loaded to the public schema. The extension cannot be loaded to other schemas. To use the extension in other schemas, you must specify a function name in the public.function format, such as public.windowFunnel.

Scenario description

All examples in this topic are based on the public event datasets on GitHub.

  • Dataset description

    A large number of developers develop open source projects on GitHub and generate many events during the development process. GitHub records the type and details of each event, developers, code repositories, and other information. It opens public events, such as the events generated when you add items to your favorites or submit code. For more information about event types, see Webhook events and payloads.

  • Dataset import

    You can use Hologres to import the github_event dataset into a Hologres instance with only a few clicks. For more information, see Import public datasets with a few clicks.

Statements for creating a table for GitHub public events:

BEGIN;
CREATE TABLE hologres_dataset_github_event.hologres_github_event (
    id BIGINT,
    actor_id BIGINT,
    actor_login TEXT,
    repo_id BIGINT,
    repo_name TEXT,
    org_id BIGINT,
    org_login TEXT,
    type TEXT,
    created_at TIMESTAMP WITH TIME ZONE NOT NULL,
    action TEXT,
    iss_or_pr_id BIGINT,
    number BIGINT,
    comment_id BIGINT,
    commit_id TEXT,
    member_id BIGINT,
    rev_or_push_or_rel_id BIGINT,
    ref TEXT,
    ref_type TEXT,
    state TEXT,
    author_association TEXT,
    language TEXT,
    merged BOOLEAN,
    merged_at TIMESTAMP WITH TIME ZONE,
    additions BIGINT,
    deletions BIGINT,
    changed_files BIGINT,
    push_size BIGINT,
    push_distinct_size BIGINT,
    hr TEXT,
    month TEXT,
    year TEXT,
    ds TEXT
);

CALL set_table_property('hologres_dataset_github_event.hologres_github_event', 'orientation', 'column');
CALL set_table_property('hologres_dataset_github_event.hologres_github_event', 'bitmap_columns', 'actor_login,repo_name,org_login,type,action,commit_id,ref,ref_type,state,author_association,language,hr,month,year,ds');
CALL set_table_property('hologres_dataset_github_event.hologres_github_event', 'clustering_key', 'created_at:asc');
CALL set_table_property('hologres_dataset_github_event.hologres_github_event', 'dictionary_encoding_columns', 'actor_login:auto,repo_name:auto,org_login:auto,type:auto,action:auto,commit_id:auto,ref:auto,ref_type:auto,state:auto,author_association:auto,language:auto,hr:auto,month:auto,year:auto,ds:auto');
CALL set_table_property('hologres_dataset_github_event.hologres_github_event', 'distribution_key', 'id');
CALL set_table_property('hologres_dataset_github_event.hologres_github_event', 'segment_key', 'created_at');
CALL set_table_property('hologres_dataset_github_event.hologres_github_event', 'time_to_live_in_seconds', '3153600000');

COMMENT ON TABLE hologres_dataset_github_event.hologres_github_event IS NULL;
END;

Function syntax

You can use the windowFunnel function to query events in a sliding time window and calculate the maximum number of events that match the specified conditions.

This function returns the maximum length of an ordered sequence of events starting from the first event. If a match fails, the function stops matching.

Examples in which the time window is long enough:

  • The function specifies events c1, c2, and c3, and the user data is c1, c2, c3, and c4. In this case, events c1, c2, and c3 are matched. The function returns 3.

  • The function specifies events c1, c2, and c3, and the user data is c4, c3, c2, and c1. In this case, the event c1 is matched. The function returns 1.

  • The function specifies events c1, c2, and c3, and the user data is c4 and c3. In this case, no events are matched. The function returns 0.

Syntax of the windowFunnel function:

windowFunnel(window, mode, timestamp, cond1, cond2, ..., condN)

Parameters:

Parameter

Description

Parameter

Description

window

The length of the time window. The windowFunnel function uses the time when the first matched event occurs as the start point of the time window and extracts data of subsequent events based on the length of the time window.

mode

The mode. Two modes are supported: default and strict. Default value: default.

  • default: The windowFunnel function matches as many events as possible starting from the first event in the specified time window. For example, if events c1, c2, and c3 are specified as conditions and the user data is c1, c2, c1, and c3, this function returns 3.

  • strict: windowFunnel() applies the matching conditions only to unique values. The function stops matching when it encounters the same event. For example, if events c1, c2, and c3 are specified as conditions and the user data is c1, c2, c1, and c3, this function returns 2.

timestamp

The time range of the occurrence of events. The TIMESTAMP, INT, and BIGINT data types are supported.

cond

The event type, which indicates each stage of the event.

Examples

The data in the public event datasets on GitHub in the preceding scenario description is used in the examples. If you want to analyze the funnel data of users following a fixed conversion path over a specified period of time, you can specify the following conditions in the SQL statement:

  • Statistics interval: 1,800 seconds (30 minutes)

  • Statistics period: from 10:00:00 on January 28, 2024 to 10:00:00 on January 31, 2024 (UTC+8)

  • User conversion path: CreateEvent > PushEvent > IssuesEvent

--Calculate the funnel data of each user.
SELECT
    actor_id,
    windowFunnel (1800, 'default', created_at,  type = 'CreateEvent',type = 'PushEvent',type = 'IssuesEvent') AS level
FROM
    hologres_dataset_github_event.hologres_github_event
WHERE
    created_at >= TIMESTAMP '2024-01-28 10:00:00+08'
    AND created_at < TIMESTAMP '2024-01-31 10:00:00+08'
GROUP BY
    actor_id;

The following content is a part of the returned result, where:

  • level=0 indicates that the first event is not matched for the user within the time window.

  • level=1 indicates that the first event is matched for the user within the time window.

  • level=2 indicates that the second event is matched for the user within the time window.

  • level=3 indicates that all three events are matched for the user within the time window.

 actor_id	| level
----------+------
143037332	| 0
38708562        | 0
157624788	| 1
137850795 	| 1
69616418 	| 2
158019532	| 2
727125  	| 3

To improve result readability, you can execute the following SQL statement to view the number of users converted at each stage:

WITH level_detail AS (
    SELECT
        level,
        COUNT(1) AS count_user
    FROM (
        SELECT
            actor_id,
            windowFunnel (1800, 'default', created_at, type = 'CreateEvent', type = 'PushEvent',type = 'IssuesEvent') AS level
        FROM
            hologres_dataset_github_event.hologres_github_event
        WHERE
            created_at >= TIMESTAMP '2024-01-28 10:00:00+08'
            AND created_at < TIMESTAMP '2024-01-31 10:00:00+08'
        GROUP BY
            actor_id) AS basic_table
    GROUP BY
        level
    ORDER BY
        level ASC
)
SELECT  CASE level    WHEN 0 THEN 'total'
                      WHEN 1 THEN 'CreateEvent'
                      WHEN 2 THEN 'PushEvent'
                      WHEN 3 THEN 'IssuesEvent'
                      END AS type
        ,SUM(count_user) over ( ORDER BY level DESC )
FROM
    level_detail
GROUP BY
    level,
    count_user
ORDER BY
    level ASC;

The following result is returned:

  type	    |  sum
------------+------
total	    | 1338166
CreateEvent | 461088
PushEvent   | 202221
IssuesEvent | 4727
  • On this page (1)
  • Limits
  • Precautions
  • Scenario description
  • Function syntax
  • Examples
Feedback
phone Contact Us

Chat now with Alibaba Cloud Customer Service to assist you in finding the right products and services to meet your needs.

alicare alicarealicarealicare