全部产品
Search
文档中心

实时计算Flink版:Window Top-N

更新时间:Jul 17, 2024

Window TOP-N需要同时遵循Window TVF和Top-N两者的修改要求,支持的兼容性修改较少。本文为您介绍Window Top-N变更的可兼容性和不可兼容性详情。

可兼容的变更

  • 新增、删除window属性字段,该修改属于完全兼容变更。

    -- 原始SQL。
    select a, b, c, window_start from (
      select *,
          row_number() over (partition by b, window_start, window_end order by c) as rk
      from (
        select a,
          sum(b) as b,
          max(c) as c,
            window_start,
            window_end
        from table (tumble(table MyTable, descriptor(ts), interval '1' minute))
          group by a, window_start, window_end)
      ) where rk < 3;
    
    
    -- 新增window end,该修改属于完全兼容变更。
    select a, b, c, window_start, window_end from (
      select *,
          row_number() over (partition by b, window_start, window_end order by c) as rk
      from (
        select a,
          sum(b) as b,
          max(c) as c,
            window_start,
            window_end
        from table (tumble(table MyTable, descriptor(ts), interval '1' minute))
          group by a, window_start, window_end)
      ) where rk < 3;
  • 是否输出rank number的值,不影响兼容性,该修改属于完全兼容变更。

    -- 原始SQL。
    select a, b, c, window_start from (
      select *,
          row_number() over (partition by b, window_start, window_end order by c) as rk
      from (
        select a,
          sum(b) as b,
          max(c) as c,
            window_start,
            window_end
        from table (tumble(table MyTable, descriptor(ts), interval '1' minute))
          group by a, window_start, window_end)
      ) where rk < 3;
    
    
    -- 输入rank number值:rk,该修改属于完全兼容变更。
    select a, b, c, window_start, rk from (
      select *,
          row_number() over (partition by b, window_start, window_end order by c) as rk
      from (
        select a,
          sum(b) as b,
          max(c) as c,
            window_start,
            window_end
        from table (tumble(table MyTable, descriptor(ts), interval '1' minute))
          group by a, window_start, window_end)
      ) where rk < 3;
  • Partition Key顺序变化,该修改属于完全兼容变更。

    -- 原始SQL。
    select a, b, c, window_start from (
      select *,
          row_number() over (partition by a, b, window_start, window_end order by c) as rk
      from (
        select a,
          sum(b) as b,
          max(c) as c,
            window_start,
            window_end
        from table (tumble(table MyTable, descriptor(ts), interval '1' minute))
          group by a, b, window_start, window_end)
      ) where rk < 3;
    
    -- 修改Partition Key顺序,属于完全兼容变更。
    select a, b, c, window_start from (
      select *,
          row_number() over (partition by b, a, window_start, window_end order by c) as rk
      from (
        select a,
          sum(b) as b,
          max(c) as c,
            window_start,
            window_end
        from table (tumble(table MyTable, descriptor(ts), interval '1' minute))
          group by a, b, window_start, window_end)
      ) where rk < 3;

不兼容的变更

  • 修改window相关属性(window 的类型,window 的大小,时间相关属性),该修改属于不兼容变更。

    示例可参见Window TVF修改window相关属性示例,详情请参见不兼容的变更

  • 新增、删除、修改统计维度(group key)或者统计维度涉及字段的计算逻辑发生变化,该修改属于不兼容变更。

    示例可参见Window TVF修改group key示例,详情请参见不兼容的变更

  • 新增、删除、修改统计指标,该修改属于不兼容变更。TOP-N的输入发生变化,该修改属于不兼容变更。

    -- 原始SQL。
    select a, b, c, window_start from (
      select *,
          row_number() over (partition by b, window_start, window_end order by c) as rk
      from (
        select a,
          sum(b) as b,
          max(c) as c,
            window_start,
            window_end
        from table (tumble(table MyTable, descriptor(ts), interval '1' minute))
          group by a, window_start, window_end)
      ) where rk < 3;
    
    -- 新增统计指标:min(d) as d,属于不兼容修改。
    -- 该修改会导致TOP-N的输入发生变化。
    select a, b, c, d, window_start from (
      select *,
          row_number() over (partition by b, window_start, window_end order by c) as rk
      from (
        select a,
          sum(b) as b,
          max(c) as c,
          min(d) as d,
            window_start,
            window_end
        from table (tumble(table MyTable, descriptor(ts), interval '1' minute))
          group by a, window_start, window_end)
      ) where rk < 3;
  • 新增、删除、修改partition by key或者partition by key涉及字段的计算逻辑发生变化,该修改属于不兼容变更。

    示例可参见Top N修改partition by key示例,详情请参见不兼容的变更

  • 修改order by相关属性(排序字段和方向),该修改属于不兼容修改。

    示例可参见Top N修改order by key示例,详情请参见不兼容的变更

  • 修改rank range的值(Top N中N 的值),则该修改属于不兼容修改。

    示例可参见Top N修改rank range示例,详情请参见不兼容的变更

  • 修改group key顺序,窗口相关的group key顺序发生变化,但是普通group key相对顺序保持不变。或者普通group key顺序发生变化,但窗口group key相对顺序不变,都属于不兼容变更。

    -- 原始SQL。
    select a, b, c, window_start from (
      select *,
          row_number() over (partition by b, window_start, window_end order by c) as rk
      from (
        select a,
          sum(b) as b,
          max(c) as c,
            window_start,
            window_end
        from table (tumble(table MyTable, descriptor(ts), interval '1' minute))
          group by a, b, window_start, window_end)
      ) where rk < 3;
    
    -- 窗口相关的group key顺序变化,但是普通group key相对顺序不变,属于不兼容变更。
    -- 改变Window Rank的Schema
    select a, b, c, window_start from (
      select *,
          row_number() over (partition by b, window_start, window_end order by c) as rk
      from (
        select a,
          sum(b) as b,
          max(c) as c,
            window_start,
            window_end
        from table (tumble(table MyTable, descriptor(ts), interval '1' minute))
          group by a, b, window_end, window_start)
      ) where rk < 3;
    
    -- 普通group key顺序发生变化,但窗口group key相对顺序不变,属于不兼容变更。
    select a, b, c, window_start from (
      select *,
          row_number() over (partition by b, window_start, window_end order by c) as rk
      from (
        select a,
          sum(b) as b,
          max(c) as c,
            window_start,
            window_end
        from table (tumble(table MyTable, descriptor(ts), interval '1' minute))
          group by b, a, window_start, window_end)
      ) where rk < 3;