All Products
Search
Document Center

Realtime Compute for Apache Flink:Window Top-N

Last Updated:Aug 07, 2024

A Window Top-N query supports fewer compatible modifications because it must follow the modification rules of window table-valued functions (TVFs) and Top-N queries. This topic describes the compatibility between a job and the state data used to start the job after you modify Window Top-N queries for the job.

Compatible modifications

  • Fully compatible: Add or delete a window attribute field.

    -- Original SQL statement: 
    select a, b, c, window_start from (
      select *,
          row_number() over (partition by b, window_start, window_end order by c) as rk
      from (
        select a,
          sum(b) as b,
          max(c) as c,
            window_start,
            window_end
        from table (tumble(table MyTable, descriptor(ts), interval '1' minute))
          group by a, window_start, window_end)
      ) where rk < 3;
    
    
    -- Fully compatible: Add the window_end field in the query result. 
    select a, b, c, window_start, window_end from (
      select *,
          row_number() over (partition by b, window_start, window_end order by c) as rk
      from (
        select a,
          sum(b) as b,
          max(c) as c,
            window_start,
            window_end
        from table (tumble(table MyTable, descriptor(ts), interval '1' minute))
          group by a, window_start, window_end)
      ) where rk < 3;
  • Fully compatible: Include or exclude the field that specifies the ranking position from the query result.

    -- Original SQL statement: 
    select a, b, c, window_start from (
      select *,
          row_number() over (partition by b, window_start, window_end order by c) as rk
      from (
        select a,
          sum(b) as b,
          max(c) as c,
            window_start,
            window_end
        from table (tumble(table MyTable, descriptor(ts), interval '1' minute))
          group by a, window_start, window_end)
      ) where rk < 3;
    
    
    -- Fully compatible: Include the rk field in the query result. 
    select a, b, c, window_start, rk from (
      select *,
          row_number() over (partition by b, window_start, window_end order by c) as rk
      from (
        select a,
          sum(b) as b,
          max(c) as c,
            window_start,
            window_end
        from table (tumble(table MyTable, descriptor(ts), interval '1' minute))
          group by a, window_start, window_end)
      ) where rk < 3;
  • Fully compatible: Modify the order of partition keys.

    -- Original SQL statement: 
    select a, b, c, window_start from (
      select *,
          row_number() over (partition by a, b, window_start, window_end order by c) as rk
      from (
        select a,
          sum(b) as b,
          max(c) as c,
            window_start,
            window_end
        from table (tumble(table MyTable, descriptor(ts), interval '1' minute))
          group by a, b, window_start, window_end)
      ) where rk < 3;
    
    -- Fully compatible: Modify the order of partition keys. 
    select a, b, c, window_start from (
      select *,
          row_number() over (partition by b, a, window_start, window_end order by c) as rk
      from (
        select a,
          sum(b) as b,
          max(c) as c,
            window_start,
            window_end
        from table (tumble(table MyTable, descriptor(ts), interval '1' minute))
          group by a, b, window_start, window_end)
      ) where rk < 3;

Incompatible modifications

  • Modify a window-related attribute, such as the window type, window size, or time attribute.

    For information about sample modifications, see Modifications that cause full incompatibility.

  • Add, delete, or modify the fields in the group by clause or modify the computation logic of these fields.

    For information about sample modifications, see Modifications that cause full incompatibility.

  • Add, delete, or modify an aggregated field or change the input of the TOP-N query.

    -- Original SQL statement: 
    select a, b, c, window_start from (
      select *,
          row_number() over (partition by b, window_start, window_end order by c) as rk
      from (
        select a,
          sum(b) as b,
          max(c) as c,
            window_start,
            window_end
        from table (tumble(table MyTable, descriptor(ts), interval '1' minute))
          group by a, window_start, window_end)
      ) where rk < 3;
    
    -- Incompatible: Add min(d) as d, which is an aggregate field. 
    -- This modification leads to a change in the input of the TOP-N query. 
    select a, b, c, d, window_start from (
      select *,
          row_number() over (partition by b, window_start, window_end order by c) as rk
      from (
        select a,
          sum(b) as b,
          max(c) as c,
          min(d) as d,
            window_start,
            window_end
        from table (tumble(table MyTable, descriptor(ts), interval '1' minute))
          group by a, window_start, window_end)
      ) where rk < 3;
  • Add, delete, or modify the partition keys or modify the computation logic of the fields included in the partition keys.

    For information about sample modifications, see Incompatible modifications.

  • Modify the fields or order specified for the order by clause.

    For information about sample modifications, see Incompatible modifications.

  • Modify the value of N. In a Top-N query, N specifies the number of top-ranked results to return.

    For information about sample modifications, see Incompatible modifications.

  • Modify only the order of the window TVF-related fields in the group by clause or only the order of other fields in the group by clause.

    -- Original SQL statement: 
    select a, b, c, window_start from (
      select *,
          row_number() over (partition by b, window_start, window_end order by c) as rk
      from (
        select a,
          sum(b) as b,
          max(c) as c,
            window_start,
            window_end
        from table (tumble(table MyTable, descriptor(ts), interval '1' minute))
          group by a, b, window_start, window_end)
      ) where rk < 3;
    
    -- Incompatible: Modify only the order of the window TVF-related fields in the group by clause. 
    -- This modification leads to a change in the window-based ranking result.
    select a, b, c, window_start from (
      select *,
          row_number() over (partition by b, window_start, window_end order by c) as rk
      from (
        select a,
          sum(b) as b,
          max(c) as c,
            window_start,
            window_end
        from table (tumble(table MyTable, descriptor(ts), interval '1' minute))
          group by a, b, window_end, window_start)
      ) where rk < 3;
    
    -- Incompatible: Modify only the order of other fields in the group by clause. 
    select a, b, c, window_start from (
      select *,
          row_number() over (partition by b, window_start, window_end order by c) as rk
      from (
        select a,
          sum(b) as b,
          max(c) as c,
            window_start,
            window_end
        from table (tumble(table MyTable, descriptor(ts), interval '1' minute))
          group by b, a, window_start, window_end)
      ) where rk < 3;