Window 関数 - MaxCompute - Alibaba Cloud ドキュメントセンター

ウィンドウ関数を使用して、動的に定義されたデータサブセットのデータを集計または計算できます。たとえば、ウィンドウ関数を使用して、時系列データを処理したり、特定のデータのランキングを計算したり、特定のデータの移動平均を計算したりできます。このトピックでは、MaxCompute SQLでサポートされているウィンドウ関数の構文とパラメーターについて説明します。このトピックでは、ウィンドウ関数を使用してデータを開発する方法の例も示します。

次の表に、MaxCompute SQLでサポートされているウィンドウ関数を示します。

関数	説明
AVG	ウィンドウ内のデータの平均値を計算します。
CLUSTER_SAMPLE	データのランダムな行をサンプリングします。 trueが返された場合、指定された行のデータがサンプリングされます。
COUNT	ウィンドウ内の行数を計算します。
CUME_DIST	パーティション内のデータの累積分布を計算します。
DENSE_RANK	行のグループ内の行のパーセンタイルランクを計算します。ランクは連続しています。
FIRST_VALUE	現在の行が属するウィンドウ内のデータの最初の行の計算結果を取得します。
LAG	ウィンドウ内の特定のオフセットで現在の行に先行するデータのN番目の行の計算結果を取得します。
LAST_VALUE	現在の行が属するウィンドウ内のデータの最後の行の計算結果を取得します。
LEAD	ウィンドウ内の特定のオフセットで現在の行に続くデータのN番目の行の計算結果を取得します。
MAX	ウィンドウ内の最大値を計算します。
MEDIAN	ウィンドウ内の中央値を計算します。
MIN	ウィンドウ内の最小値を計算します。
NTILE	パーティション内のデータの行を同じサイズのN個のグループに分割し、現在の行が属するグループの番号を返します。グループ番号は1〜Nの範囲である。
NTH_VALUE	現在の行が属するウィンドウ内のデータのN行目の計算結果を取得します。
PERCENT_RANK	行のグループ内の行のパーセンタイルランクを計算します。この関数はパーセント値を返します。
RANK	行のグループ内の行のパーセンタイルランクを計算します。ランクは連続した整数でなくてもよい。
ROW_NUMBER	行のシーケンス番号を計算します。行番号は1から始まります。
STDDEV	すべての入力値の母集団の標準偏差を返します。この関数はSTDDEV_POPとも呼ばれます。
STDDEV_SAMP	すべての入力値のサンプル標準偏差を返します。
SUM	ウィンドウ内のデータの合計を計算します。

制限事項

ウィンドウ関数を使用する前に、次の制限に注意してください。

Window関数はSELECT文でのみサポートされます。
ウィンドウ関数にネストされたウィンドウ関数または集計関数を含めることはできません。
同じレベルの集計関数と一緒にウィンドウ関数を使用することはできません。

構文

ウィンドウ関数の構文：

<function_name>([distinct][<expression> [, ...]]) over (<window_definition>)
<function_name>([distinct][<expression> [, ...]]) over <window_name>

function_name: 組み込みウィンドウ関数、組み込み集計関数、またはユーザー定義集計関数 (UDAF) 。組み込みの集計関数の詳細については、「集計関数」をご参照ください。 UDAFの詳細については、「概要」をご参照ください。
expression: ウィンドウ関数の形式。形式は関数構文に従います。
windowing_definition: ウィンドウの定義。 windowing_definitionの構文については、「windowing_definition」をご参照ください。
window_name: ウィンドウの名前。 windowキーワードを使用してウィンドウを設定し、windowing_definitionを使用してウィンドウの名前を指定できます。 named_window_defの構文:
```
window <window_name> as (<window_definition>)
```
SQL文内のnamed_window_defの位置:
```
select ... from ... [where ...] [group by ...] [having ...] named_window_def [order by ...] [limit ...]
```

windowing_definition

構文

-- partition_clause:
[partition by <expression> [, ...]]
-- orderby_clause:
[order by <expression> [asc|desc][nulls {first|last}] [, ...]]
[<frame_clause>]

SELECT文でwindow関数を使用すると、window関数を実行すると、windowing_definitionのpartition byとorder byに基づいてデータがパーティション分割され、ソートされます。 SELECTステートメントにpartition byが含まれていない場合、1つのパーティションのみが存在し、パーティションにはすべてのデータが含まれます。 SELECT文にorder byが含まれていない場合、パーティション内のデータはランダムな順序で配置され、データストリームが生成されます。データストリームが生成された後、windowing_definitionのframe_句に基づいてデータストリームから行のグループが抽出され、現在の行のウィンドウが作成されます。ウィンドウ関数は、現在の行が属するウィンドウに含まれるデータを計算します。

partition by <expression> [, ...]: オプション。このパラメータは、パーティション情報を指定します。パーティションキー列の値が行のグループで同じである場合、これらの行は同じウィンドウに含まれます。によるパーティションの形式の詳細については、「テーブル操作」をご参照ください。
order by <expression> [asc | desc][nulls {first | last}] [, ...]: オプション。ウィンドウ内のデータの行をソートする方法を指定します。
説明
order byで指定された列の値が同じである場合、ソート結果が正確でない可能性があります。データのランダムな順序付けを減らすには、order byで指定された列の値が一意であることを確認します。
frame_句: オプション。このパラメータは、ウィンドウのデータ境界を決定するために使用されます。 frame_句の詳細については、「frame_句」をご参照ください。

filter_句

構文

FILTER (WHERE filter_condition)

filter_conditionはブール式で、WHERE句と同じ方法で使用されます。select... から... ここでステートメントを使用します。

FILTER句が指定されている場合、filter_condition値がtrueの行のみがウィンドウフレームに含まれます。 COUNT、SUM、AVG、MAX、MIN、WM_CONCATなどの集計ウィンドウ関数の場合、行ごとに値が返されます。ただし、NULLやfalseなどの真でない値は、どの行のウィンドウフレームにも含まれません。 NULLはfalseと同じ方法で処理されます。

例

データ準備

-- Create a table.
CREATE TABLE IF NOT EXISTS mf_window_fun(key BIGINT,value BIGINT) STORED AS ALIORC;

-- Insert data into the table.
insert into mf_window_fun values (1,100),(2,200),(1,150),(2,250),(3,300),(4,400),(5,500),(6,600),(7,700);

-- Query data in the mf_window_fun table.
select * from mf_window_fun;

-- The following result is returned:
+------------+------------+
| key        | value      |
+------------+------------+
| 1          | 100        |
| 2          | 200        |
| 1          | 150        |
| 2          | 250        |
| 3          | 300        |
| 4          | 400        |
| 5          | 500        |
| 6          | 600        |
| 7          | 700        |
+------------+------------+

ウィンドウ内の値が100を超える行の累積合計を照会します。

select key,sum(value) filter(where value > 100) 
       over (partition by key order by key)  
       from mf_window_fun;

次の応答が返されます。

+------------+------------+
| key        | _c1        |
+------------+------------+
| 1          | NULL       | -- Skipped
| 1          | 150        |
| 2          | 200        |
| 2          | 450        |
| 3          | 300        |
| 4          | 400        |
| 5          | 500        |
| 6          | 600        |
| 7          | 700        |
+------------+------------+

説明

FILTER句は、filter_conditionで指定された条件を満たさない行をクエリ結果から削除せず、ウィンドウ関数の計算中に行が存在しないと見なします。特定の行を削除する場合は、WHERE句の行を指定する必要があります。select... から... ここでステートメントを使用します。この行のウィンドウ関数値は0またはNULLではなく、前の行のウィンドウ関数値です。
FILTER句は、ウィンドウ関数がCOUNT、SUM、AVG、MAX、MIN、WM_CONCATなどの集計関数である場合にのみ使用できます。ウィンドウ関数がRANK、ROW_NUMBER、NTILEなどの非集計関数の場合、FILTER句は使用できません。そうしないと、構文エラーが発生する可能性があります。
ウィンドウ関数でFILTER構文を使用する場合は、set odps.sql.window.function.newimpl=true; 設定を追加する必要があります。

frame_clause

構文

-- Syntax 1 
{ROWS|RANGE|GROUPS} <frame_start> [<frame_exclusion>]
-- Syntax 2 
{ROWS|RANGE|GROUPS} between <frame_start> and <frame_end> [<frame_exclusion>]

frame_clauseは、ウィンドウのデータ境界を決定するために使用される閉じた区間です。データ境界は、frame_startおよびframe_endで指定された行に基づいて決定されます。

ROWS | 範囲 | グループ: 必須。 ROWS、RANGE、およびGROUPSは、frame_句のタイプを示します。 frame_startとframe_endの実装ルールは、frame_clauseのタイプによって異なります。
- ROWS: ウィンドウのデータ境界は、行数に基づいて決定されます。
- RANGE: ウィンドウのデータ境界は、order by指定された列の値の比較結果に基づいて決定されます。ほとんどの場合、order byはwindowing_definitionで指定されます。 order byがwindowing_definitionで指定されていない場合、order byで指定された列の値は、パーティション内のすべての行で同じです。 null値は同等と見なされます。
- パーティションでは、列の同じ値を持つ行がorder byグループを形成します。 order byが指定されていない場合、パーティション内のすべての行がグループを形成します。 NULL値は同等と見なされます。

frame_startおよびframe_end: ウィンドウの開始行と終了行。 frame_startが必要です。 frame_endはオプションです。 frame_endが指定されていない場合、デフォルト値CURRENT ROWが使用されます。

frame_startで指定された行は、frame_endで指定された行の前にあるか、同じである必要があります。 frame_endで指定された行と比較して、frame_startで指定された行は、ウィンドウ内のすべてのデータがwindowing_definitionのorder by指定された列に基づいてソートされた後、ウィンドウの最初の行に近くます。 frame_clauseのタイプがROWS、RANGE、またはGROUPSの場合のframe_startおよびframe_endの有効な値とロジックを次の表に示します。

frame_clause型	frame_startまたはframe_end値	説明
ROWS, RANGE, and GROUPS	予約なし	パーティションの最初の行を示します。行は1からカウントされます。
ROWS, RANGE, and GROUPS	以下のUNBOUNDED	パーティションの最後の行を示します。
ROWS	現在の列	現在の行を示します。データの各行は、窓関数によって計算された結果に対応する。現在の行は、ウィンドウ関数を使用してデータを計算する行を示します。
	オフセットPRECEDING	指定された`オフセット`で現在の行の前にあるN番目の行を示します。たとえば、`0 PRECEDING`が現在の行を示す場合、`1 PRECEDING`は前の行を示します。 `offset`は負でない整数でなければなりません。
	次のオフセット	指定された`オフセット`で現在の行に続くN番目の行を示します。たとえば、`0 FOLLOWING`が現在の行を示す場合、`1 FOLLOWING`は次の行を示します。 `offset`は負でない整数でなければなりません。
RANGE	現在の列	frame_startがCURRENT ROWに設定されている場合、現在の行と`order by`で指定された列の値が同じ最初の行を示します。 frame_endがCURRENT ROWに設定されている場合、現在の行と`order by`で指定された列の値が同じ最後の行を示します。
	オフセットPRECEDING	frame_startとframe_endで指定された行は、`order by`で指定された並べ替え順序に基づいて決定されます。例えば、ウィンドウ内のデータは列Xに基づいてソートされ、Xiはi番目の行に対応するX値を示し、Xcは現在の行に対応するX値を示す。 frame_startとframe_endで指定した行の位置: `order by`はascに設定されます。 frame_startは、次の要件を満たす最初の行を示します。`Xc - Xi <= offset`。 frame_endは、`Xc − Xi >= offset`という要件を満たす最後の行を示す。 `order by`はdescに設定されます。 frame_startは、次の要件を満たす最初の行を示します。`Xi - Xc <= offset`。 frame_endは、次の要件を満たす最後の行を示します。`Xi - Xc >= offset`。 `order by`で指定する列には、TINYINT、SMALLINT、INT、BIGINT、FLOAT、DOUBLE、DECIMAL、DATETIME、DATE、およびTIMESTAMPのデータ型を指定できます。日付タイプの`オフセット`の構文: `N`: N日またはN秒を示します。負でない整数でなければなりません。 DATETIMEまたはTIMESTAMPタイプのオフセットの場合、N秒を示します。 DATEタイプのオフセットの場合、N日を示します。 `間隔 'N' {YEAR\MONTH\DAY\HOUR\MINUTE\SECOND}`: N年、月、日、時間、分、秒を示します。例えば、`INTERVAL '3' YEAR`は3年を示す。 `INTERVAL 'N-M' YEAR TO MONTH`: N年とMヶ月を示します。例えば、`INTERVAL「1-3」YEAR TO MONTH」`は、1年3ヶ月を示す。 `INTERVAL 'D[ H[:M[:S[:N]]]]'DAY TO SECOND`: D日、H時間、M分、S秒、およびNナノ秒を示す。例えば、`INTERVAL「1 2:3:4:5」DAY TO SECOND」`は、1日、2時間、3分、4秒、および5ナノ秒を示す。
	次のオフセット	frame_startとframe_endで指定された行は、`order by`で指定された並べ替え順序に基づいて決定されます。例えば、ウィンドウ内のデータは列Xに基づいてソートされ、Xiはi番目の行に対応するX値を示し、Xcは現在の行に対応するX値を示す。 frame_startとframe_endで指定した行の位置: `order by`はascに設定されます。 frame_start: 次の要件を満たす最初の行を示します。`Xi - Xc >= offset`。 frame_end: 次の要件を満たす最後の行を示します。`Xi - Xc <= offset`。 `order by`はdescに設定されます。 frame_start: 次の要件を満たす最初の行を示します。`Xc - Xi >= offset`。 frame_end: `Xc − Xi <= offset`という要件を満たす最後の行を示す。
グループ	現在の列	frame_startがCURRENT ROWに設定されている場合、現在の行が属するグループの最初の行を示します。 frame_endがCURRENT ROWに設定されている場合、現在の行が属するグループの最後の行を示します。
	オフセットPRECEDING	frame_startがオフセットPRECEDINGに設定されている場合、所与の`オフセット`で現在の行のグループに先行するN番目のグループの最初の行を示す。 frame_endがオフセットPRECEDINGに設定されている場合、所与の`オフセット`で現在の行のグループに先行するN番目のグループの最後の行を示す。説明 frame_startをUNBOUNDED FOLLOWINGに設定することはできません。また、frame_endをUNBOUNED PRECEDINGに設定することはできません。
	次のオフセット	frame_startがオフセットFOLLOWINGに設定されている場合、所与の`オフセット`で現在の行のグループに続くN番目のグループの最初の行を示します。 frame_endがオフセットFOLLOWINGに設定されている場合、指定された`オフセット`で現在の行のグループに続くN番目のグループの最後の行が使用されます。説明 frame_startをUNBOUNDED FOLLOWINGに設定することはできません。また、frame_endをUNBOUNED PRECEDINGに設定することはできません。

frame_exclusion: オプション。このパラメータは、ウィンドウから特定の行を削除するために使用されます。有効な値：
- EXCLUDE NO OTHERS: ウィンドウから除外される行はありません。
- EXCLUDE CURRENT ROW: 現在の行はウィンドウから除外されます。
- EXCLUDE GROUP: パーティション内の行のグループ全体がウィンドウから除外されます。グループ内で、すべての行は、現在の行として順序で指定された列の同じ値を持ちます。
- EXCLUDE TIES: 現在の行を除く行のグループ全体がウィンドウから除外されます。

デフォルトのframe_clause

frame_clauseを指定しない場合、MaxComputeはデフォルトのframe_clauseを使用してウィンドウのデータ境界を決定します。デフォルトのframe_clauseの値:

odps.sql.hive.com patibleがtrueに設定されている場合、次のデフォルトのframe_clauseが使用されます。このルールは、ほとんどのSQLシステムに適用されます。
```
RANGE between UNBOUNDED PRECEDING and CURRENT ROW EXCLUDE NO OTHERS
```
odps.sql.hive.com patibleがfalseに設定され、order byが指定され、次のウィンドウ関数のいずれかが使用される場合、ROWSモードのデフォルトのframe_clauseが使用されます。
```
ROWS between UNBOUNDED PRECEDING and CURRENT ROW EXCLUDE NO OTHERS
```

ウィンドウのデータ境界の例

この例では、tblという名前のテーブルに、BIGINT型のpid、oid、およびridの3つの列が含まれています。 tblテーブルには、次のデータが含まれます。

+------------+------------+------------+
| pid        | oid        | rid        |
+------------+------------+------------+
| 1          | NULL       | 1          |
| 1          | NULL       | 2          |
| 1          | 1          | 3          |
| 1          | 1          | 4          |
| 1          | 2          | 5          |
| 1          | 4          | 6          |
| 1          | 7          | 7          |
| 1          | 11         | 8          |
| 2          | NULL       | 9          |
| 2          | NULL       | 10         |
+------------+------------+------------+

ROWSモードのWindows

windowing_definition 1

partition by pid order by oid ROWS between UNBOUNDED PRECEDING and CURRENT ROW
-- Sample SQL statement: 
select pid, 
       oid, 
       rid, 
collect_list(rid) over(partition by pid order by 
oid ROWS between UNBOUNDED PRECEDING and CURRENT ROW) as window from tbl;

次の応答が返されます。

+------------+------------+------------+--------+
| pid        | oid        | rid        | window |
+------------+------------+------------+--------+
| 1          | NULL       | 1          | [1]    |
| 1          | NULL       | 2          | [1, 2] |
| 1          | 1          | 3          | [1, 2, 3] |
| 1          | 1          | 4          | [1, 2, 3, 4] |
| 1          | 2          | 5          | [1, 2, 3, 4, 5] |
| 1          | 4          | 6          | [1, 2, 3, 4, 5, 6] |
| 1          | 7          | 7          | [1, 2, 3, 4, 5, 6, 7] |
| 1          | 11         | 8          | [1, 2, 3, 4, 5, 6, 7, 8] |
| 2          | NULL       | 9          | [9]    |
| 2          | NULL       | 10         | [9, 10] |
+------------+------------+------------+--------+

windowing_definition 2

partition by pid order by oid ROWS between UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING
-- Sample SQL statement: 
select pid, 
       oid, 
       rid, 
collect_list(rid) over(partition by pid order by 
oid ROWS between UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING) as window from tbl;

次の応答が返されます。

+------------+------------+------------+--------+
| pid        | oid        | rid        | window |
+------------+------------+------------+--------+
| 1          | NULL       | 1          | [1, 2, 3, 4, 5, 6, 7, 8] |
| 1          | NULL       | 2          | [1, 2, 3, 4, 5, 6, 7, 8] |
| 1          | 1          | 3          | [1, 2, 3, 4, 5, 6, 7, 8] |
| 1          | 1          | 4          | [1, 2, 3, 4, 5, 6, 7, 8] |
| 1          | 2          | 5          | [1, 2, 3, 4, 5, 6, 7, 8] |
| 1          | 4          | 6          | [1, 2, 3, 4, 5, 6, 7, 8] |
| 1          | 7          | 7          | [1, 2, 3, 4, 5, 6, 7, 8] |
| 1          | 11         | 8          | [1, 2, 3, 4, 5, 6, 7, 8] |
| 2          | NULL       | 9          | [9, 10] |
| 2          | NULL       | 10         | [9, 10] |
+------------+------------+------------+--------+

windowing_definition 3

partition by pid order by oid ROWS between 1 FOLLOWING and 3 FOLLOWING
-- Sample SQL statement: 
select pid, 
       oid, 
       rid, 
collect_list(rid) over(partition by pid order by 
oid ROWS between 1 FOLLOWING and 3 FOLLOWING) as window from tbl;

次の応答が返されます。

+------------+------------+------------+--------+
| pid        | oid        | rid        | window |
+------------+------------+------------+--------+
| 1          | NULL       | 1          | [2, 3, 4] |
| 1          | NULL       | 2          | [3, 4, 5] |
| 1          | 1          | 3          | [4, 5, 6] |
| 1          | 1          | 4          | [5, 6, 7] |
| 1          | 2          | 5          | [6, 7, 8] |
| 1          | 4          | 6          | [7, 8] |
| 1          | 7          | 7          | [8]    |
| 1          | 11         | 8          | NULL   |
| 2          | NULL       | 9          | [10]   |
| 2          | NULL       | 10         | NULL   |
+------------+------------+------------+--------+

windowing_definition 4

partition by pid order by oid ROWS between UNBOUNDED PRECEDING and CURRENT ROW EXCLUDE CURRENT ROW
-- Sample SQL statement: 
select pid, 
oid, 
rid, 
collect_list(rid) over(partition by pid order by 
oid ROWS between UNBOUNDED PRECEDING and CURRENT ROW EXCLUDE CURRENT ROW) as window from tbl;

次の応答が返されます。

+------------+------------+------------+--------+
| pid        | oid        | rid        | window |
+------------+------------+------------+--------+
| 1          | NULL       | 1          | NULL   |
| 1          | NULL       | 2          | [1]    |
| 1          | 1          | 3          | [1, 2] |
| 1          | 1          | 4          | [1, 2, 3] |
| 1          | 2          | 5          | [1, 2, 3, 4] |
| 1          | 4          | 6          | [1, 2, 3, 4, 5] |
| 1          | 7          | 7          | [1, 2, 3, 4, 5, 6] |
| 1          | 11         | 8          | [1, 2, 3, 4, 5, 6, 7] |
| 2          | NULL       | 9          | NULL   |
| 2          | NULL       | 10         | [9]    |
+------------+------------+------------+--------+

windowing_definition 5

partition by pid order by oid ROWS between UNBOUNDED PRECEDING and CURRENT ROW EXCLUDE GROUP
-- Sample SQL statement: 
select pid, 
       oid, 
       rid, 
collect_list(rid) over(partition by pid order by oid ROWS between UNBOUNDED PRECEDING and CURRENT ROW EXCLUDE GROUP) as window from tbl;

次の応答が返されます。

+------------+------------+------------+--------+
| pid        | oid        | rid        | window |
+------------+------------+------------+--------+
| 1          | NULL       | 1          | NULL   |
| 1          | NULL       | 2          | NULL   |
| 1          | 1          | 3          | [1, 2] |
| 1          | 1          | 4          | [1, 2] |
| 1          | 2          | 5          | [1, 2, 3, 4] |
| 1          | 4          | 6          | [1, 2, 3, 4, 5] |
| 1          | 7          | 7          | [1, 2, 3, 4, 5, 6] |
| 1          | 11         | 8          | [1, 2, 3, 4, 5, 6, 7] |
| 2          | NULL       | 9          | NULL   |
| 2          | NULL       | 10         | NULL   |
+------------+------------+------------+--------+

windowing_definition 6

partition by pid order by oid ROWS between UNBOUNDED PRECEDING and CURRENT ROW EXCLUDE TIES
-- Sample SQL statement: 
select pid, 
       oid, 
       rid, 
collect_list(rid) over(partition by pid order by 
oid ROWS between UNBOUNDED PRECEDING and CURRENT ROW EXCLUDE TIES) as window from tbl;

次の応答が返されます。

+------------+------------+------------+--------+
| pid        | oid        | rid        | window |
+------------+------------+------------+--------+
| 1          | NULL       | 1          | [1]    |
| 1          | NULL       | 2          | [2]    |
| 1          | 1          | 3          | [1, 2, 3] |
| 1          | 1          | 4          | [1, 2, 4] |
| 1          | 2          | 5          | [1, 2, 3, 4, 5] |
| 1          | 4          | 6          | [1, 2, 3, 4, 5, 6] |
| 1          | 7          | 7          | [1, 2, 3, 4, 5, 6, 7] |
| 1          | 11         | 8          | [1, 2, 3, 4, 5, 6, 7, 8] |
| 2          | NULL       | 9          | [9]    |
| 2          | NULL       | 10         | [10]   |
+------------+------------+------------+--------+

EXCLUDE CURRENT ROWとEXCLUDE GROUPとの間の差は、構文5および構文6における2、4、および10の除外列値を有する行のウィンドウ列値の間の比較に基づいて得ることができる。 frame_exclusionがEXCLUDE GROUPに設定されている場合、パーティション内で同じpid列値を持つ行は、行が現在の行と同じoid列値を持つ場合に抽出されます。

RANGEモードのWindows

windowing_definition 1

partition by pid order by oid RANGE between UNBOUNDED PRECEDING and CURRENT ROW
-- Sample SQL statement: 
select pid, 
       oid, 
       rid, 
collect_list(rid) over(partition by pid order by 
oid RANGE between UNBOUNDED PRECEDING and CURRENT ROW) as window from tbl;

次の応答が返されます。

+------------+------------+------------+--------+
| pid        | oid        | rid        | window |
+------------+------------+------------+--------+
| 1          | NULL       | 1          | [1, 2] |
| 1          | NULL       | 2          | [1, 2] |
| 1          | 1          | 3          | [1, 2, 3, 4] |
| 1          | 1          | 4          | [1, 2, 3, 4] |
| 1          | 2          | 5          | [1, 2, 3, 4, 5] |
| 1          | 4          | 6          | [1, 2, 3, 4, 5, 6] |
| 1          | 7          | 7          | [1, 2, 3, 4, 5, 6, 7] |
| 1          | 11         | 8          | [1, 2, 3, 4, 5, 6, 7, 8] |
| 2          | NULL       | 9          | [9, 10] |
| 2          | NULL       | 10         | [9, 10] |
+------------+------------+------------+--------+

frame_endがCURRENT ROWに設定されている場合、現在の行とbyでoid列の同じ値を持つ最後の行が取得されます。したがって、rid column値が1である行のwindow column値は [1,2] となります。

windowing_definition 2

partition by pid order by oid RANGE between CURRENT ROW and UNBOUNDED FOLLOWING
-- Sample SQL statement: 
select pid, 
       oid, 
       rid, 
collect_list(rid) over(partition by pid order by 
oid RANGE between CURRENT ROW and UNBOUNDED FOLLOWING) as window from tbl;

次の応答が返されます。

+------------+------------+------------+--------+
| pid        | oid        | rid        | window |
+------------+------------+------------+--------+
| 1          | NULL       | 1          | [1, 2, 3, 4, 5, 6, 7, 8] |
| 1          | NULL       | 2          | [1, 2, 3, 4, 5, 6, 7, 8] |
| 1          | 1          | 3          | [3, 4, 5, 6, 7, 8] |
| 1          | 1          | 4          | [3, 4, 5, 6, 7, 8] |
| 1          | 2          | 5          | [5, 6, 7, 8] |
| 1          | 4          | 6          | [6, 7, 8] |
| 1          | 7          | 7          | [7, 8] |
| 1          | 11         | 8          | [8]    |
| 2          | NULL       | 9          | [9, 10] |
| 2          | NULL       | 10         | [9, 10] |
+------------+------------+------------+--------+

windowing_definition 3

partition by pid order by oid RANGE between 3 PRECEDING and 1 PRECEDING
-- Sample SQL statement: 

select pid, 
       oid, 
       rid, 
collect_list(rid) over(partition by pid order by 
oid RANGE between 3 PRECEDING and 1 PRECEDING) as window from tbl;

次の応答が返されます。

+------------+------------+------------+--------+
| pid        | oid        | rid        | window |
+------------+------------+------------+--------+
| 1          | NULL       | 1          | [1, 2] |
| 1          | NULL       | 2          | [1, 2] |
| 1          | 1          | 3          | NULL   |
| 1          | 1          | 4          | NULL   |
| 1          | 2          | 5          | [3, 4] |
| 1          | 4          | 6          | [3, 4, 5] |
| 1          | 7          | 7          | [6]    |
| 1          | 11         | 8          | NULL   |
| 2          | NULL       | 9          | [9, 10] |
| 2          | NULL       | 10         | [9, 10] |
+------------+------------+------------+--------+

oid in order byの値がnullである行の場合、frame_startがoffset PRECEDINGまたはoffset FOLLOWINGに設定されている場合、その行は、oid in order byの値がnullである最初の行です。 frame_endがオフセットPRECEDINGまたはオフセットFOLLOWINGに設定されている場合、行は、order byのoidの値がnullである最後の行です。

グループモードのWindows

windowing_definition

partition by pid order by oid GROUPS between 2 PRECEDING and CURRENT ROW
-- Sample SQL statement: 
select pid, 
       oid, 
       rid, 
collect_list(rid) over(partition by pid order by 
oid GROUPS between 2 PRECEDING and CURRENT ROW) as window from tbl;

次の応答が返されます。

+------------+------------+------------+--------+
| pid        | oid        | rid        | window |
+------------+------------+------------+--------+
| 1          | NULL       | 1          | [1, 2] |
| 1          | NULL       | 2          | [1, 2] |
| 1          | 1          | 3          | [1, 2, 3, 4] |
| 1          | 1          | 4          | [1, 2, 3, 4] |
| 1          | 2          | 5          | [1, 2, 3, 4, 5] |
| 1          | 4          | 6          | [3, 4, 5, 6] |
| 1          | 7          | 7          | [5, 6, 7] |
| 1          | 11         | 8          | [6, 7, 8] |
| 2          | NULL       | 9          | [9, 10] |
| 2          | NULL       | 10         | [9, 10] |
+------------+------------+------------+--------+

サンプルデータ

このセクションでは、関数の使用方法を理解するためのサンプルソースデータと例を示します。 empという名前のテーブルを作成し、サンプルデータをテーブルに挿入します。サンプルコマンド：

create table if not exists emp
   (empno bigint,
    ename string,
    job string,
    mgr bigint,
    hiredate datetime,
    sal bigint,
    comm bigint,
    deptno bigint);
tunnel upload emp.txt emp;

emp.txtファイルには、次のサンプルデータが含まれています。

7369,SMITH,CLERK,7902,1980-12-17 00:00:00,800,,20
7499,ALLEN,SALESMAN,7698,1981-02-20 00:00:00,1600,300,30
7521,WARD,SALESMAN,7698,1981-02-22 00:00:00,1250,500,30
7566,JONES,MANAGER,7839,1981-04-02 00:00:00,2975,,20
7654,MARTIN,SALESMAN,7698,1981-09-28 00:00:00,1250,1400,30
7698,BLAKE,MANAGER,7839,1981-05-01 00:00:00,2850,,30
7782,CLARK,MANAGER,7839,1981-06-09 00:00:00,2450,,10
7788,SCOTT,ANALYST,7566,1987-04-19 00:00:00,3000,,20
7839,KING,PRESIDENT,,1981-11-17 00:00:00,5000,,10
7844,TURNER,SALESMAN,7698,1981-09-08 00:00:00,1500,0,30
7876,ADAMS,CLERK,7788,1987-05-23 00:00:00,1100,,20
7900,JAMES,CLERK,7698,1981-12-03 00:00:00,950,,30
7902,FORD,ANALYST,7566,1981-12-03 00:00:00,3000,,20
7934,MILLER,CLERK,7782,1982-01-23 00:00:00,1300,,10
7948,JACCKA,CLERK,7782,1981-04-12 00:00:00,5000,,10
7956,WELAN,CLERK,7649,1982-07-20 00:00:00,2450,,10
7956,TEBAGE,CLERK,7748,1982-12-30 00:00:00,1300,,10

AVG

構文

double avg([distinct] double <expr>) over ([partition_clause] [orderby_clause] [frame_clause])
decimal avg([distinct] decimal <expr>) over ([partition_clause] [orderby_clause] [frame_clause])

説明
ウィンドウ内のexprの平均値を返します。
パラメーター
- expr: 必須です。返された結果を計算するために使用される式。 DOUBLEまたはDECIMAL型の値。
  - 入力値がSTRING型またはBIGINT型の場合は、計算前に暗黙的にDOUBLE型の値に変換されます。入力値が別のデータ型の場合、エラーが返されます。
  - 入力値がnullの場合、値を含む行は計算に使用されません。
  - distinctキーワードが指定されている場合、distinct値の平均値が計算されます。
- partition_clause、orderby_clause、およびframe_clause: これらのパラメーターの詳細については、「windowing_definition」をご参照ください。
戻り値
exprの入力値がDECIMAL型の場合、DECIMAL型の値が返されます。入力値が他のデータ型の場合、DOUBLE型の値が返されます。 exprの入力値がnullの場合、nullが返されます。

例

例1: deptno列を使用してウィンドウを定義し、sal列の平均値を計算します。 order by句は指定されていません。この関数は、現在のウィンドウの最初の行から最後の行までの値の累積平均値を返します。現在のウィンドウには、同じdeptno列値を持つ行が含まれます。例：

select deptno, sal, avg(sal) over (partition by deptno) from emp;

次の応答が返されます。

+------------+------------+------------+
| deptno     | sal        | _c2        |
+------------+------------+------------+
| 10         | 1300       | 2916.6666666666665 |   -- This row is the first row of this window. The return value is the cumulative average value of the values from the first row to the sixth row. 
| 10         | 2450       | 2916.6666666666665 |   -- The return value is the cumulative average value of the values from the first row to the sixth row. 
| 10         | 5000       | 2916.6666666666665 |   -- The return value is the cumulative average value of the values from the first row to the sixth row. 
| 10         | 1300       | 2916.6666666666665 |
| 10         | 5000       | 2916.6666666666665 |
| 10         | 2450       | 2916.6666666666665 |
| 20         | 3000       | 2175.0     |
| 20         | 3000       | 2175.0     |
| 20         | 800        | 2175.0     |
| 20         | 1100       | 2175.0     |
| 20         | 2975       | 2175.0     |
| 30         | 1500       | 1566.6666666666667 |
| 30         | 950        | 1566.6666666666667 |
| 30         | 1600       | 1566.6666666666667 |
| 30         | 1250       | 1566.6666666666667 |
| 30         | 1250       | 1566.6666666666667 |
| 30         | 2850       | 1566.6666666666667 |
+------------+------------+------------+

例2: Hive互換データ型エディションを無効にし、deptno列を使用してウィンドウを定義し、sal列の平均値を計算します。 ORDER BYを指定します。この関数は、現在のウィンドウの最初の行から現在の行までの値の累積平均値を返します。現在のウィンドウには、同じdeptno値を持つ行が含まれています。サンプル文:

-- Disable the Hive-compatible data type edition. 
set odps.sql.hive.compatible=false;
-- Execute the following statement: 
select deptno, sal, avg(sal) over (partition by deptno order by sal) from emp;

次の応答が返されます。

+------------+------------+------------+
| deptno     | sal        | _c2        |
+------------+------------+------------+
| 10         | 1300       | 1300.0     |           -- This row is the first row of this window. 
| 10         | 1300       | 1300.0     |           -- The return value is the cumulative average value of the values in the first and second rows. 
| 10         | 2450       | 1683.3333333333333 |   -- The return value is the cumulative average value of the values from the first row to the third row. 
| 10         | 2450       | 1875.0     |           -- The return value is the cumulative average value of the values from the first row to the fourth row. 
| 10         | 5000       | 2500.0     |           -- The return value is the cumulative average value of the values from the first row to the fifth row. 
| 10         | 5000       | 2916.6666666666665 |   -- The return value is the cumulative average value of the values from the first row to the sixth row. 
| 20         | 800        | 800.0      |
| 20         | 1100       | 950.0      |
| 20         | 2975       | 1625.0     |
| 20         | 3000       | 1968.75    |
| 20         | 3000       | 2175.0     |
| 30         | 950        | 950.0      |
| 30         | 1250       | 1100.0     |
| 30         | 1250       | 1150.0     |
| 30         | 1500       | 1237.5     |
| 30         | 1600       | 1310.0     |
| 30         | 2850       | 1566.6666666666667 |
+------------+------------+------------+

例3: Hive互換データ型エディションを有効にし、deptno列を使用してウィンドウを定義し、sal列の平均値を計算します。 ORDER BYを指定します。この関数は、最初の行から現在のウィンドウの現在の行と同じsal値を持つ行までの値の累積平均値を返します。同じsal値を持つ行の平均値は同じです。現在のウィンドウには、同じdeptno値を持つ行が含まれています。サンプル文:

-- Enable the Hive-compatible data type edition. 
set odps.sql.hive.compatible=true;
-- Execute the following statement: 
select deptno, sal, avg(sal) over (partition by deptno order by sal) from emp;

次の応答が返されます。

+------------+------------+------------+
| deptno     | sal        | _c2        |
+------------+------------+------------+
| 10         | 1300       | 1300.0     |          -- This row is the first row of this window. The average value for the first row is the cumulative average value of the values in the first and second rows because the two rows have the same sal value. 
| 10         | 1300       | 1300.0     |          -- The return value is the cumulative average value of the values in the first and second rows. 
| 10         | 2450       | 1875.0     |          -- The average value for the third row is the cumulative average value of the values from the first row to the fourth row because the third and fourth rows have the same sal value. 
| 10         | 2450       | 1875.0     |          -- The return value is the cumulative average value of the values from the first row to the fourth row. 
| 10         | 5000       | 2916.6666666666665 |
| 10         | 5000       | 2916.6666666666665 |
| 20         | 800        | 800.0      |
| 20         | 1100       | 950.0      |
| 20         | 2975       | 1625.0     |
| 20         | 3000       | 2175.0     |
| 20         | 3000       | 2175.0     |
| 30         | 950        | 950.0      |
| 30         | 1250       | 1150.0     |
| 30         | 1250       | 1150.0     |
| 30         | 1500       | 1237.5     |
| 30         | 1600       | 1310.0     |
| 30         | 2850       | 1566.6666666666667 |
+------------+------------+------------+

CLUSTER_SAMPLE

構文

boolean cluster_sample(bigint <N>) OVER ([partition_clause])
boolean cluster_sample(bigint <N>, bigint <M>) OVER ([partition_clause])

説明
- cluster_sample(bigint <N>): N個のランダムな行のデータをサンプリングすることを指定します。
- cluster_sample(bigint <N>, bigint <M>): 指定された比率 (M/N) に基づいて行をサンプリングすることを指定します。サンプリングされる行の数は、次の式を使用して計算されます。partition_row_count × M/N。 partition_row_countは、パーティション内の行数を指定します。
パラメーター
- N: 必須。 BIGINT型の定数。 Nがnullに設定されている場合、nullが返されます。
- M: 必須です。 BIGINT型の定数。 Mがnullに設定されている場合、nullが返されます。
- partition_clause: オプションです。詳細は、「windowing_definition」をご参照ください。
戻り値
BOOLEAN型の値が返されます。

例

各グループのデータエントリの20% についてのサンプル。例：

select deptno, sal
    from (
        select deptno, sal, cluster_sample(5, 1) over (partition by deptno) as flag
        from emp
        ) sub
    where flag = true;

次の応答が返されます。

+------------+------------+
| deptno     | sal        |
+------------+------------+
| 10         | 1300       |
| 20         | 3000       |
| 30         | 950        |
+------------+------------+

COUNT

構文

bigint count(*) over ([partition_clause] [orderby_clause] [frame_clause])
bigint count([distinct] <expr>[,...]) over ([partition_clause] [orderby_clause] [frame_clause])

説明
- count(*): 返される行の総数。
- count([distinct] <expr>[,...]): 行数を計算すると、exprがnullである行は無視されます。 exprパラメーターが複数存在する場合、exprがnullの行はすべて無視されます。また、distinctキーワードを指定した場合は、重複排除後の行数が計算されます。 exprがnullの行はすべて無視されます。
パラメーター
- expr: 必須です。このパラメータは、値をカウントする列を指定します。すべてのデータ型がサポートされています。行の値がnullの場合、この行は計算に使用されません。 DISTINCTキーワードが指定されている場合、異なる値のカウント値が取得されます。
- partition_clause、orderby_clause、およびframe_clause: これらのパラメーターの詳細については、「windowing_definition」をご参照ください。
戻り値
BIGINT型の値が返されます。

サンプル文

例1: sal列を使用してウィンドウを定義します。 order by句は指定されていません。この関数は、現在のウィンドウの最初の行から最後の行までの累積カウントを返します。現在のウィンドウには、同じsal値を持つ行が含まれています。例：

select sal, count(sal) over (partition by sal) as count from emp;

次の応答が返されます。

+------------+------------+
| sal        | count      |
+------------+------------+
| 800        | 1          |
| 950        | 1          |
| 1100       | 1          |
| 1250       | 2          |   -- This row is the first row of this window. The value of COUNT for the first row is the cumulative count for the second row because the two rows have the same sal value. 
| 1250       | 2          |   -- The return value is the cumulative count from the first row to the second row in the current window. 
| 1300       | 2          |
| 1300       | 2          |
| 1500       | 1          |
| 1600       | 1          |
| 2450       | 2          |
| 2450       | 2          |
| 2850       | 1          |
| 2975       | 1          |
| 3000       | 2          |
| 3000       | 2          |
| 5000       | 2          |
| 5000       | 2          |
+------------+------------+

例2: Hive互換データ型エディションを無効にし、sal列を使用してウィンドウを定義します。 ORDER BYを指定します。この関数は、現在のウィンドウの最初の行から現在の行までの累積カウントを返します。現在のウィンドウには、同じsal値を持つ行が含まれています。サンプル文:

-- Disable the Hive-compatible data type edition. 
set odps.sql.hive.compatible=false;
-- Execute the following statement: 
select sal, count(sal) over (partition by sal order by sal) as count from emp;

次の応答が返されます。

+------------+------------+
| sal        | count      |
+------------+------------+
| 800        | 1          |
| 950        | 1          |
| 1100       | 1          |
| 1250       | 1          |   -- This row is the first row of this window. The cumulative count for the first row is 1. 
| 1250       | 2          |   -- The cumulative count for the second row is 2. 
| 1300       | 1          |
| 1300       | 2          |
| 1500       | 1          |
| 1600       | 1          |
| 2450       | 1          |
| 2450       | 2          |
| 2850       | 1          |
| 2975       | 1          |
| 3000       | 1          |
| 3000       | 2          |
| 5000       | 1          |
| 5000       | 2          |
+------------+------------+

例3: Hive互換データ型エディションを有効にし、sal列を使用してウィンドウを定義します。 ORDER BYを指定します。この関数は、現在のウィンドウの最初の行から最後の行までの累積カウントを返します。現在のウィンドウには、同じsal値を持つ行が含まれています。サンプル文:

-- Enable the Hive-compatible data type edition. 
set odps.sql.hive.compatible=true;
-- Execute the following statement: 
select sal, count(sal) over (partition by sal order by sal) as count from emp;

次の応答が返されます。

+------------+------------+
| sal        | count      |
+------------+------------+
| 800        | 1          |
| 950        | 1          |
| 1100       | 1          |
| 1250       | 2          |   -- This row is the first row of this window. The value of COUNT for the first row is the cumulative count for the second row because the two rows have the same sal value. 
| 1250       | 2          |   -- The return value is the cumulative count from the first row to the second row in the current window. 
| 1300       | 2          |
| 1300       | 2          |
| 1500       | 1          |
| 1600       | 1          |
| 2450       | 2          |
| 2450       | 2          |
| 2850       | 1          |
| 2975       | 1          |
| 3000       | 2          |
| 3000       | 2          |
| 5000       | 2          |
| 5000       | 2          |
+------------+------------+

CUME_DIST

構文

double cume_dist() over([partition_clause] [orderby_clause])

説明
累積分布を計算します。累積分布は、値が現在の行の値以上である行の、パーティション内のすべての行に対する比率を示します。比率は、orderby_clauseに基づいて決定されます。
パラメーター
partition_clauseおよびorderby_clause: これらのパラメーターの詳細については、「windowing_definition」をご参照ください。
戻り値
DOUBLE型の値が返されます。戻り値は、次の式を使用して計算されます。row_number_of_last_peer/partition_row_count。 row_number_of_last_peerは、現在の行が属するグループの最後の行に対応するROW_NUMBERによって返される値を示します。 partition_row_countは、現在の行が属するパーティション内の行数を示します。

例

deptno列に基づいてすべての従業員をグループ化し、給与で各グループの従業員の累積分布を計算します。例：

select deptno, ename, sal, concat(round(cume_dist() over (partition by deptno order by sal desc)*100,2),'%') as cume_dist from emp;

次の応答が返されます。

+------------+------------+------------+------------+
| deptno     | ename      | sal        | cume_dist  |
+------------+------------+------------+------------+
| 10         | JACCKA     | 5000       | 33.33%     |
| 10         | KING       | 5000       | 33.33%     |
| 10         | CLARK      | 2450       | 66.67%     |
| 10         | WELAN      | 2450       | 66.67%     |
| 10         | TEBAGE     | 1300       | 100.0%     |
| 10         | MILLER     | 1300       | 100.0%     |
| 20         | SCOTT      | 3000       | 40.0%      |
| 20         | FORD       | 3000       | 40.0%      |
| 20         | JONES      | 2975       | 60.0%      |
| 20         | ADAMS      | 1100       | 80.0%      |
| 20         | SMITH      | 800        | 100.0%     |
| 30         | BLAKE      | 2850       | 16.67%     |
| 30         | ALLEN      | 1600       | 33.33%     |
| 30         | TURNER     | 1500       | 50.0%      |
| 30         | MARTIN     | 1250       | 83.33%     |
| 30         | WARD       | 1250       | 83.33%     |
| 30         | JAMES      | 950        | 100.0%     |
+------------+------------+------------+------------+

DENSE_RANK

構文

bigint dense_rank() over ([partition_clause] [orderby_clause])

説明
orderby_clauseで指定された順序に基づいて、パーティション内の現在の行のランクを返します。ランクは1からカウントされます。パーティション内で、order byで指定された列の同じ値を持つ行は同じランクを持ちます。 順番に指定した列の値が変わるたびにランクが1ずつ増加します。
パラメーター
partition_clauseおよびorderby_clause: これらのパラメーターの詳細については、「windowing_definition」をご参照ください。
戻り値
BIGINT型の値が返されます。 orderby_句が指定されていない場合、返される結果の値はすべて1になります。

例

deptno列に基づいてすべての従業員をグループ化します。各グループで、sal値に基づいて降順に従業員をソートし、グループ内の従業員の序数を取得します。例：

select deptno, ename, sal, dense_rank() over (partition by deptno order by sal desc) as nums from emp;

次の応答が返されます。

+------------+------------+------------+------------+
| deptno     | ename      | sal        | nums       |
+------------+------------+------------+------------+
| 10         | JACCKA     | 5000       | 1          |
| 10         | KING       | 5000       | 1          |
| 10         | CLARK      | 2450       | 2          |
| 10         | WELAN      | 2450       | 2          |
| 10         | TEBAGE     | 1300       | 3          |
| 10         | MILLER     | 1300       | 3          |
| 20         | SCOTT      | 3000       | 1          |
| 20         | FORD       | 3000       | 1          |
| 20         | JONES      | 2975       | 2          |
| 20         | ADAMS      | 1100       | 3          |
| 20         | SMITH      | 800        | 4          |
| 30         | BLAKE      | 2850       | 1          |
| 30         | ALLEN      | 1600       | 2          |
| 30         | TURNER     | 1500       | 3          |
| 30         | MARTIN     | 1250       | 4          |
| 30         | WARD       | 1250       | 4          |
| 30         | JAMES      | 950        | 5          |
+------------+------------+------------+------------+

FIRST_VALUE

構文

first_value(<expr>[, <ignore_nulls>]) over ([partition_clause] [orderby_clause] [frame_clause])

説明
ウィンドウの最初の行に対応するexprの値を返します。
パラメーター
- expr: 必須です。返された結果を計算するために使用される式。
- ignore_nulls: オプションです。 BOOLEAN型の値。 null値を無視するかどうかを指定します。デフォルト値：false このパラメーターをTrueに設定すると、ウィンドウの最初の行に対応するexprのnull以外の値が返されます。
- partition_clause、orderby_clause、およびframe_clause: これらのパラメーターの詳細については、「windowing_definition」をご参照ください。
戻り値
exprと同じデータ型の値が返されます。

サンプル文

すべての従業員を部門ごとにグループ化し、各グループのデータの最初の行を返します。例：

order byは指定されていません。

select deptno, ename, sal, first_value(sal) over (partition by deptno) as first_value from emp;

次の応答が返されます。

+------------+------------+------------+-------------+
| deptno     | ename      | sal        | first_value |
+------------+------------+------------+-------------+
| 10         | TEBAGE     | 1300       | 1300        |   -- This row is the first row of this window. 
| 10         | CLARK      | 2450       | 1300        |
| 10         | KING       | 5000       | 1300        |
| 10         | MILLER     | 1300       | 1300        |
| 10         | JACCKA     | 5000       | 1300        |
| 10         | WELAN      | 2450       | 1300        |
| 20         | FORD       | 3000       | 3000        |   -- This row is the first row of this window. 
| 20         | SCOTT      | 3000       | 3000        |
| 20         | SMITH      | 800        | 3000        |
| 20         | ADAMS      | 1100       | 3000        |
| 20         | JONES      | 2975       | 3000        |
| 30         | TURNER     | 1500       | 1500        |   -- This row is the first row of this window. 
| 30         | JAMES      | 950        | 1500        |
| 30         | ALLEN      | 1600       | 1500        |
| 30         | WARD       | 1250       | 1500        |
| 30         | MARTIN     | 1250       | 1500        |
| 30         | BLAKE      | 2850       | 1500        |
+------------+------------+------------+-------------+

order byが指定されています。

select deptno, ename, sal, first_value(sal) over (partition by deptno order by sal desc) as first_value from emp;

次の応答が返されます。

+------------+------------+------------+-------------+
| deptno     | ename      | sal        | first_value |
+------------+------------+------------+-------------+
| 10         | JACCKA     | 5000       | 5000        |   -- This row is the first row of this window. 
| 10         | KING       | 5000       | 5000        |
| 10         | CLARK      | 2450       | 5000        |
| 10         | WELAN      | 2450       | 5000        |
| 10         | TEBAGE     | 1300       | 5000        |
| 10         | MILLER     | 1300       | 5000        |
| 20         | SCOTT      | 3000       | 3000        |   -- This row is the first row of this window. 
| 20         | FORD       | 3000       | 3000        |
| 20         | JONES      | 2975       | 3000        |
| 20         | ADAMS      | 1100       | 3000        |
| 20         | SMITH      | 800        | 3000        |
| 30         | BLAKE      | 2850       | 2850        |   -- This row is the first row of this window. 
| 30         | ALLEN      | 1600       | 2850        |
| 30         | TURNER     | 1500       | 2850        |
| 30         | MARTIN     | 1250       | 2850        |
| 30         | WARD       | 1250       | 2850        |
| 30         | JAMES      | 950        | 2850        |
+------------+------------+------------+-------------+

LAG

構文

lag(<expr>[,bigint <offset>[, <default>]]) over([partition_clause] orderby_clause)

説明
指定されたオフセットで現在の行の前にあるexprの値を返します。式exprは、列、列操作、または関数操作にすることができます。
パラメーター
- expr: 必須です。返された結果を計算するために使用される式。
- offset: オプション。値はBIGINT型の定数であり、0以上でなければなりません。値0は現在の行を示し、値1は前の行を示します。デフォルト値は 1 です。入力値がSTRING型またはDOUBLE型の場合、計算前にBIGINT型の値に暗黙的に変換されます。
- default: オプション。 offsetの値が有効範囲外の場合のデフォルト値。このパラメータの値は定数でなければなりません。このパラメーターのデフォルト値はnullです。このパラメーターの値は、exprの値と同じデータ型である必要があります。 exprの値が定数でない場合、パラメーター値は現在の行に基づいて決定されます。
- partition_clauseおよびorderby_clause: これらのパラメーターの詳細については、「windowing_definition」をご参照ください。
戻り値
exprと同じデータ型の値が返されます。

例

deptno列に基づいてすべての従業員をグループ化し、特定のオフセットで各従業員のsalの値を計算します。例：

select deptno, ename, sal, lag(sal, 1) over (partition by deptno order by sal) as sal_new from emp;

次の応答が返されます。

+------------+------------+------------+------------+
| deptno     | ename      | sal        | sal_new    |
+------------+------------+------------+------------+
| 10         | TEBAGE     | 1300       | NULL       |
| 10         | MILLER     | 1300       | 1300       |
| 10         | CLARK      | 2450       | 1300       |
| 10         | WELAN      | 2450       | 2450       |
| 10         | KING       | 5000       | 2450       |
| 10         | JACCKA     | 5000       | 5000       |
| 20         | SMITH      | 800        | NULL       |
| 20         | ADAMS      | 1100       | 800        |
| 20         | JONES      | 2975       | 1100       |
| 20         | SCOTT      | 3000       | 2975       |
| 20         | FORD       | 3000       | 3000       |
| 30         | JAMES      | 950        | NULL       |
| 30         | MARTIN     | 1250       | 950        |
| 30         | WARD       | 1250       | 1250       |
| 30         | TURNER     | 1500       | 1250       |
| 30         | ALLEN      | 1600       | 1500       |
| 30         | BLAKE      | 2850       | 1600       |
+------------+------------+------------+------------+

LAST_VALUE

構文

last_value(<expr>[, <ignore_nulls>]) over([partition_clause] [orderby_clause] [frame_clause])

説明
exprで指定された式を使用して、ウィンドウ内のデータの最後の行の計算結果を返します。
パラメーター
- expr: 必須です。返された結果を計算するために使用される式。
- ignore_nulls: オプションです。 BOOLEAN型の値。 null値を無視するかどうかを指定します。デフォルト値：false このパラメーターをtrueに設定すると、ウィンドウの最後の行に対応するexprのnull以外の値が返されます。
- partition_clause、orderby_clause、およびframe_clause: これらのパラメーターの詳細については、「windowing_definition」をご参照ください。
戻り値
exprと同じデータ型の値が返されます。

例

すべての従業員を部門ごとにグループ化し、各グループのデータの最後の行を返します。例：

order byが指定されていない場合、最初の行から最後の行までの行が現在のウィンドウに属します。現在のウィンドウの最後の行の値が返されます。

select deptno, ename, sal, last_value(sal) over (partition by deptno) as last_value from emp;

次の応答が返されます。

+------------+------------+------------+-------------+
| deptno     | ename      | sal        | last_value |
+------------+------------+------------+-------------+
| 10         | TEBAGE     | 1300       | 2450        |
| 10         | CLARK      | 2450       | 2450        |
| 10         | KING       | 5000       | 2450        |
| 10         | MILLER     | 1300       | 2450        |
| 10         | JACCKA     | 5000       | 2450        |   
| 10         | WELAN      | 2450       | 2450        |   -- This row is the last row of this window. 
| 20         | FORD       | 3000       | 2975        |
| 20         | SCOTT      | 3000       | 2975        |
| 20         | SMITH      | 800        | 2975        |
| 20         | ADAMS      | 1100       | 2975        |
| 20         | JONES      | 2975       | 2975        |   -- This row is the last row of this window. 
| 30         | TURNER     | 1500       | 2850        |
| 30         | JAMES      | 950        | 2850        |
| 30         | ALLEN      | 1600       | 2850        |
| 30         | WARD       | 1250       | 2850        |
| 30         | MARTIN     | 1250       | 2850        |
| 30         | BLAKE      | 2850       | 2850        |   -- This row is the last row of the current window. 
+------------+------------+------------+-------------+

order byを指定した場合、最初の行から現在の行までの行が現在のウィンドウに属します。現在のウィンドウの現在の行の値が返されます。

select deptno, ename, sal, last_value(sal) over (partition by deptno order by sal desc) as last_value from emp;

次の応答が返されます。

+------------+------------+------------+-------------+
| deptno     | ename      | sal        | last_value |
+------------+------------+------------+-------------+
| 10         | JACCKA     | 5000       | 5000        |   -- This row is the current row of this window. 
| 10         | KING       | 5000       | 5000        |   -- This row is the current row of this window. 
| 10         | CLARK      | 2450       | 2450        |   -- This row is the current row of this window. 
| 10         | WELAN      | 2450       | 2450        |   -- This row is the current row of this window. 
| 10         | TEBAGE     | 1300       | 1300        |   -- This row is the current row of this window. 
| 10         | MILLER     | 1300       | 1300        |   -- This row is the current row of this window. 
| 20         | SCOTT      | 3000       | 3000        |   -- This row is the current row of this window. 
| 20         | FORD       | 3000       | 3000        |   -- This row is the current row of this window. 
| 20         | JONES      | 2975       | 2975        |   -- This row is the current row of this window. 
| 20         | ADAMS      | 1100       | 1100        |   -- This row is the current row of this window. 
| 20         | SMITH      | 800        | 800         |   -- This row is the current row of this window. 
| 30         | BLAKE      | 2850       | 2850        |   -- This row is the current row of this window. 
| 30         | ALLEN      | 1600       | 1600        |   -- This row is the current row of this window. 
| 30         | TURNER     | 1500       | 1500        |   -- This row is the current row of this window. 
| 30         | MARTIN     | 1250       | 1250        |   -- This row is the current row of this window. 
| 30         | WARD       | 1250       | 1250        |   -- This row is the current row of this window. 
| 30         | JAMES      | 950        | 950         |   -- This row is the current row of this window. 
+------------+------------+------------+-------------+

LEAD

構文

lead(<expr>[, bigint <offset>[, <default>]]) over([partition_clause] orderby_clause)

説明
指定されたオフセットで現在の行に続くN行目に対応するexprの値を返します。式exprは、列、列操作、または関数操作にすることができます。
パラメーター
- expr: 必須です。返された結果を計算するために使用される式。
- offset: オプション。値はBIGINT型の定数であり、0以上でなければなりません。値0は現在の行を示し、値1は次の行を示します。デフォルト値は 1 です。入力値がSTRING型またはDOUBLE型の場合、計算前にBIGINT型の値に暗黙的に変換されます。
- default: オプション。 offsetの値が有効範囲外の場合のデフォルト値。このパラメータの値は定数でなければなりません。このパラメーターのデフォルト値はnullです。このパラメーターの値は、exprの値と同じデータ型である必要があります。 exprの値が定数でない場合、パラメーター値は現在の行に基づいて決定されます。
- partition_clauseおよびorderby_clause: これらのパラメーターの詳細については、「windowing_definition」をご参照ください。
戻り値
exprと同じデータ型の値が返されます。

例

deptno列に基づいてすべての従業員をグループ化し、特定のオフセットで各従業員のsalの値を計算します。例：

select deptno, ename, sal, lead(sal, 1) over (partition by deptno order by sal) as sal_new from emp;

次の応答が返されます。

+------------+------------+------------+------------+
| deptno     | ename      | sal        | sal_new    |
+------------+------------+------------+------------+
| 10         | TEBAGE     | 1300       | 1300       |
| 10         | MILLER     | 1300       | 2450       |
| 10         | CLARK      | 2450       | 2450       |
| 10         | WELAN      | 2450       | 5000       |
| 10         | KING       | 5000       | 5000       |
| 10         | JACCKA     | 5000       | NULL       |
| 20         | SMITH      | 800        | 1100       |
| 20         | ADAMS      | 1100       | 2975       |
| 20         | JONES      | 2975       | 3000       |
| 20         | SCOTT      | 3000       | 3000       |
| 20         | FORD       | 3000       | NULL       |
| 30         | JAMES      | 950        | 1250       |
| 30         | MARTIN     | 1250       | 1250       |
| 30         | WARD       | 1250       | 1500       |
| 30         | TURNER     | 1500       | 1600       |
| 30         | ALLEN      | 1600       | 2850       |
| 30         | BLAKE      | 2850       | NULL       |
+------------+------------+------------+------------+

MAX

構文

max(<expr>) over([partition_clause] [orderby_clause] [frame_clause])

説明
ウィンドウ内のexprの最大値を返します。
パラメーター
- expr: 必須です。最大値の計算に使用される式。入力値は、BOOLEAN以外の任意のデータ型とすることができる。行の値がnullの場合、この行は計算に使用されません。
- partition_clause、orderby_clause、およびframe_clause: これらのパラメーターの詳細については、「windowing_definition」をご参照ください。
戻り値
exprと同じ型の値が返されます。

例

例1: deptno列を使用してウィンドウを定義し、sal列の最大値を取得します。 order by句は指定されていません。この関数は、現在のウィンドウの最大値を返します。現在のウィンドウには、同じdeptno値を持つ行が含まれています。例：

select deptno, sal, max(sal) over (partition by deptno) from emp;

次の応答が返されます。

+------------+------------+------------+
| deptno     | sal        | _c2        |
+------------+------------+------------+
| 10         | 1300       | 5000       |   -- This row is the first row of this window. The return value is the maximum value among the values from the first row to the sixth row. 
| 10         | 2450       | 5000       |   -- The return value is the maximum value among the values from the first row to the sixth row. 
| 10         | 5000       | 5000       |   -- The return value is the maximum value among the values from the first row to the sixth row. 
| 10         | 1300       | 5000       |
| 10         | 5000       | 5000       |
| 10         | 2450       | 5000       |
| 20         | 3000       | 3000       |
| 20         | 3000       | 3000       |
| 20         | 800        | 3000       |
| 20         | 1100       | 3000       |
| 20         | 2975       | 3000       |
| 30         | 1500       | 2850       |
| 30         | 950        | 2850       |
| 30         | 1600       | 2850       |
| 30         | 1250       | 2850       |
| 30         | 1250       | 2850       |
| 30         | 2850       | 2850       |
+------------+------------+------------+

例2: deptno列を使用してウィンドウを定義し、sal列の最大値を取得します。 order by句が指定されています。この関数は、現在のウィンドウの最初の行から現在の行までの値のうち、最大値を返します。現在のウィンドウには、同じdeptno値を持つ行が含まれています。例：

select deptno, sal, max(sal) over (partition by deptno order by sal) from emp;

次の応答が返されます。

+------------+------------+------------+
| deptno     | sal        | _c2        |
+------------+------------+------------+
| 10         | 1300       | 1300       |   -- This row is the first row of this window. 
| 10         | 1300       | 1300       |   -- The return value is the maximum value among the values in the first and second rows. 
| 10         | 2450       | 2450       |   -- The return value is the maximum value among the values from the first row to the third row. 
| 10         | 2450       | 2450       |   -- The return value is the maximum value among the values from the first row to the fourth row. 
| 10         | 5000       | 5000       |
| 10         | 5000       | 5000       |
| 20         | 800        | 800        |
| 20         | 1100       | 1100       |
| 20         | 2975       | 2975       |
| 20         | 3000       | 3000       |
| 20         | 3000       | 3000       |
| 30         | 950        | 950        |
| 30         | 1250       | 1250       |
| 30         | 1250       | 1250       |
| 30         | 1500       | 1500       |
| 30         | 1600       | 1600       |
| 30         | 2850       | 2850       |
+------------+------------+------------+

MEDIAN

構文

median(<expr>) over ([partition_clause] [orderby_clause] [frame_clause])

説明
ウィンドウ内のexprの中央値を返します。
パラメーター
- expr: 必須です。中央値の計算に使用される式。 DOUBLEまたはDECIMAL型の値。このパラメーターの値は1〜255桁である必要があります。
  - 入力値がSTRING型またはBIGINT型の場合は、計算前に暗黙的にDOUBLE型の値に変換されます。別のデータ型の場合は、エラーが返されます。
  - 入力値がnullの場合は、nullが返されます。
- partition_clause、orderby_clause、およびframe_clause: これらのパラメーターの詳細については、「windowing_definition」をご参照ください。
戻り値
DOUBLEまたはDECIMAL型の値が返されます。 exprで指定されたすべての式の値がnullの場合、nullが返されます。

例

deptno列を使用してウィンドウを定義し、sal列の中央値を計算します。この関数は、現在のウィンドウの中央値を返します。現在のウィンドウには、同じdeptno値を持つ行が含まれています。例：

select deptno, sal, median(sal) over (partition by deptno) from emp;

次の応答が返されます。

+------------+------------+------------+
| deptno     | sal        | _c2        |
+------------+------------+------------+
| 10         | 1300       | 2450.0     |   -- This row is the first row of this window. The return value is the median value of the values from the first row to the sixth row. 
| 10         | 2450       | 2450.0     |
| 10         | 5000       | 2450.0     |
| 10         | 1300       | 2450.0     |
| 10         | 5000       | 2450.0     |
| 10         | 2450       | 2450.0     |
| 20         | 3000       | 2975.0     |
| 20         | 3000       | 2975.0     |
| 20         | 800        | 2975.0     |
| 20         | 1100       | 2975.0     |
| 20         | 2975       | 2975.0     |
| 30         | 1500       | 1375.0     |
| 30         | 950        | 1375.0     |
| 30         | 1600       | 1375.0     |
| 30         | 1250       | 1375.0     |
| 30         | 1250       | 1375.0     |
| 30         | 2850       | 1375.0     |
+------------+------------+------------+

MIN

構文

min(<expr>) over([partition_clause] [orderby_clause] [frame_clause])

説明
ウィンドウ内のexprの最小値を返します。
パラメーター
- expr: 必須です。最小値の計算に使用される式。入力値は、BOOLEAN以外の任意のデータ型とすることができる。行の値がnullの場合、この行は計算に使用されません。
- partition_clause、orderby_clause、およびframe_clause: これらのパラメーターの詳細については、「windowing_definition」をご参照ください。
戻り値
exprと同じデータ型の値が返されます。

例

例1: deptno列を使用してウィンドウを定義し、sal列の最小値を取得します。 order by句は指定されていません。この関数は、現在のウィンドウの最小値を返します。現在のウィンドウには、同じdeptno値を持つ行が含まれています。例：

select deptno, sal, min(sal) over (partition by deptno) from emp;

次の応答が返されます。

+------------+------------+------------+
| deptno     | sal        | _c2        |
+------------+------------+------------+
| 10         | 1300       | 1300       |   -- This row is the first row of this window. The return value is the minimum value among the values from the first row to the sixth row. 
| 10         | 2450       | 1300       |   -- The return value is the minimum value among the values from the first row to the sixth row. 
| 10         | 5000       | 1300       |   -- The return value is the minimum value among the values from the first row to the sixth row. 
| 10         | 1300       | 1300       |
| 10         | 5000       | 1300       |
| 10         | 2450       | 1300       |
| 20         | 3000       | 800        |
| 20         | 3000       | 800        |
| 20         | 800        | 800        |
| 20         | 1100       | 800        |
| 20         | 2975       | 800        |
| 30         | 1500       | 950        |
| 30         | 950        | 950        |
| 30         | 1600       | 950        |
| 30         | 1250       | 950        |
| 30         | 1250       | 950        |
| 30         | 2850       | 950        |
+------------+------------+------------+

例2: deptno列を使用してウィンドウを定義し、sal列の最小値を取得します。 order by句が指定されています。この関数は、現在のウィンドウの最初の行から現在の行までの値の中で最小値を返します。現在のウィンドウには、同じdeptno値を持つ行が含まれています。例：

select deptno, sal, min(sal) over (partition by deptno order by sal) from emp;

次の応答が返されます。

+------------+------------+------------+
| deptno     | sal        | _c2        |
+------------+------------+------------+
| 10         | 1300       | 1300       |   -- This row is the first row of this window. 
| 10         | 1300       | 1300       |   -- The return value is the minimum value among the values in the first and second rows. 
| 10         | 2450       | 1300       |   -- The return value is the minimum value among the values from the first row to the third row. 
| 10         | 2450       | 1300       |
| 10         | 5000       | 1300       |
| 10         | 5000       | 1300       |
| 20         | 800        | 800        |
| 20         | 1100       | 800        |
| 20         | 2975       | 800        |
| 20         | 3000       | 800        |
| 20         | 3000       | 800        |
| 30         | 950        | 950        |
| 30         | 1250       | 950        |
| 30         | 1250       | 950        |
| 30         | 1500       | 950        |
| 30         | 1600       | 950        |
| 30         | 2850       | 950        |
+------------+------------+------------+

NTILE

構文

bigint ntile(bigint <N>) over ([partition_clause] [orderby_clause])

説明
パーティション内のデータの行を同じサイズのN個のグループに分割し、指定された行が属するグループの番号を返します。パーティション内のデータを等しいサイズのN個のグループに分割できない場合、もう1つの行が優先的に最初のM個のグループに割り当てられる。
パラメーター
- N: 必須。このパラメータは、分割の数を指定します。 BIGINT型の値。
- partition_clauseおよびorderby_clause: これらのパラメーターの詳細については、「windowing_definition」をご参照ください。
戻り値
BIGINT型の値が返されます。

例

すべての従業員をsal列に基づいて降順に3つのグループに分割し、各従業員が属するグループの番号を取得します。例：

select deptno, ename, sal, ntile(3) over (partition by deptno order by sal desc) as nt3 from emp;

次の応答が返されます。

+------------+------------+------------+------------+
| deptno     | ename      | sal        | nt3        |
+------------+------------+------------+------------+
| 10         | JACCKA     | 5000       | 1          |
| 10         | KING       | 5000       | 1          |
| 10         | CLARK      | 2450       | 2          |
| 10         | WELAN      | 2450       | 2          |
| 10         | TEBAGE     | 1300       | 3          |
| 10         | MILLER     | 1300       | 3          |
| 20         | SCOTT      | 3000       | 1          |
| 20         | FORD       | 3000       | 1          |
| 20         | JONES      | 2975       | 2          |
| 20         | ADAMS      | 1100       | 2          |
| 20         | SMITH      | 800        | 3          |
| 30         | BLAKE      | 2850       | 1          |
| 30         | ALLEN      | 1600       | 1          |
| 30         | TURNER     | 1500       | 2          |
| 30         | MARTIN     | 1250       | 2          |
| 30         | WARD       | 1250       | 3          |
| 30         | JAMES      | 950        | 3          |
+------------+------------+------------+------------+

NTH_VALUE

構文

nth_value(<expr>, <number> [, <ignore_nulls>]) over ([partition_clause] [orderby_clause] [frame_clause])

説明
ウィンドウ内のN番目の行に対応するexprの値を返します。
パラメーター
- expr: 必須です。返された結果を計算するために使用される式。
- number: 必須です。 BIGINT型の値。値は1以上の整数である必要があります。入力値が1の場合、この関数はFIRST_VALUEに相当します。
- ignore_nulls: オプションです。 BOOLEAN型の値。 null値を無視するかどうかを指定します。デフォルト値：false このパラメーターをTrueに設定すると、ウィンドウのN行目に対応するexprのnull以外の値が返されます。
- partition_clause、orderby_clause、およびframe_clause: これらのパラメーターの詳細については、「windowing_definition」をご参照ください。
戻り値
exprと同じデータ型の値が返されます。

例

すべての従業員を部門ごとにグループ化し、各グループの6行目のデータを返します。例：

order byが指定されていない場合、最初の行から最後の行までの行が現在のウィンドウに属します。現在のウィンドウの6行目の値が返されます。例：

select deptno, ename, sal, nth_value(sal,6) over (partition by deptno) as nth_value from emp;

次の応答が返されます。

+------------+------------+------------+------------+
| deptno     | ename      | sal        | nth_value  |
+------------+------------+------------+------------+
| 10         | TEBAGE     | 1300       | 2450       |
| 10         | CLARK      | 2450       | 2450       |
| 10         | KING       | 5000       | 2450       |
| 10         | MILLER     | 1300       | 2450       |
| 10         | JACCKA     | 5000       | 2450       |
| 10         | WELAN      | 2450       | 2450       |   -- This row is the sixth row of this window. 
| 20         | FORD       | 3000       | NULL       |
| 20         | SCOTT      | 3000       | NULL       |
| 20         | SMITH      | 800        | NULL       |
| 20         | ADAMS      | 1100       | NULL       |
| 20         | JONES      | 2975       | NULL       |   -- This current window has less than six rows, and null is returned. 
| 30         | TURNER     | 1500       | 2850       |
| 30         | JAMES      | 950        | 2850       |
| 30         | ALLEN      | 1600       | 2850       |
| 30         | WARD       | 1250       | 2850       |
| 30         | MARTIN     | 1250       | 2850       |
| 30         | BLAKE      | 2850       | 2850       |   -- This row is the sixth row of the current window. 
+------------+------------+------------+------------+

order byを指定した場合、最初の行から現在の行までの行が現在のウィンドウに属します。現在のウィンドウの6行目の値が返されます。

select deptno, ename, sal, nth_value(sal,6) over (partition by deptno order by sal) as nth_value from emp;

次の応答が返されます。

+------------+------------+------------+------------+
| deptno     | ename      | sal        | nth_value  |
+------------+------------+------------+------------+
| 10         | TEBAGE     | 1300       | NULL       |   
| 10         | MILLER     | 1300       | NULL       |   -- This window has only two rows, and null is returned. 
| 10         | CLARK      | 2450       | NULL       |
| 10         | WELAN      | 2450       | NULL       |
| 10         | KING       | 5000       | 5000       |  
| 10         | JACCKA     | 5000       | 5000       |
| 20         | SMITH      | 800        | NULL       |
| 20         | ADAMS      | 1100       | NULL       |
| 20         | JONES      | 2975       | NULL       |
| 20         | SCOTT      | 3000       | NULL       |
| 20         | FORD       | 3000       | NULL       |
| 30         | JAMES      | 950        | NULL       |
| 30         | MARTIN     | 1250       | NULL       |
| 30         | WARD       | 1250       | NULL       |
| 30         | TURNER     | 1500       | NULL       |
| 30         | ALLEN      | 1600       | NULL       |
| 30         | BLAKE      | 2850       | 2850       |
+------------+------------+------------+------------+

PERCENT_RANK

構文

double percent_rank() over([partition_clause] [orderby_clause])

説明
orderby_句に基づいて、パーティション内の現在の行のパーセンタイルランクを計算します。
パラメーター
partition_clauseおよびorderby_clause: これらのパラメーターの詳細については、「windowing_definition」をご参照ください。
戻り値
DOUBLE型の値が返されます。有効な値の範囲は [0.0, 1.0] です。戻り値は、次の式を使用して計算されます。"(rank - 1)/(partition_row_count - 1)" rankは、現在の行に対応するRANK関数の戻り値を示します。 partition_row_countは、現在の行が属するパーティション内の行数を示します。パーティションに1行のデータしか含まれていない場合は、0.0が返されます。

例

sal列に基づいて、グループ内の各従業員のパーセンタイルランクを計算します。例：

select deptno, ename, sal, percent_rank() over (partition by deptno order by sal desc) as sal_new from emp;

次の応答が返されます。

+------------+------------+------------+------------+
| deptno     | ename      | sal        | sal_new    |
+------------+------------+------------+------------+
| 10         | JACCKA     | 5000       | 0.0        |
| 10         | KING       | 5000       | 0.0        |
| 10         | CLARK      | 2450       | 0.4        |
| 10         | WELAN      | 2450       | 0.4        |
| 10         | TEBAGE     | 1300       | 0.8        |
| 10         | MILLER     | 1300       | 0.8        |
| 20         | SCOTT      | 3000       | 0.0        |
| 20         | FORD       | 3000       | 0.0        |
| 20         | JONES      | 2975       | 0.5        |
| 20         | ADAMS      | 1100       | 0.75       |
| 20         | SMITH      | 800        | 1.0        |
| 30         | BLAKE      | 2850       | 0.0        |
| 30         | ALLEN      | 1600       | 0.2        |
| 30         | TURNER     | 1500       | 0.4        |
| 30         | MARTIN     | 1250       | 0.6        |
| 30         | WARD       | 1250       | 0.6        |
| 30         | JAMES      | 950        | 1.0        |
+------------+------------+------------+------------+

RANK

構文

bigint rank() over ([partition_clause] [orderby_clause])

説明
orderby_clauseで指定された順序に基づいて、パーティション内の現在の行のランクを返します。ランクは1からカウントされます。
パラメーター
partition_clauseおよびorderby_clause: これらのパラメーターの詳細については、「windowing_definition」をご参照ください。
戻り値
BIGINT型の値が返されます。戻り値は、重複および不連続であり得る。戻り値は、現在の行が属するグループの最初の行のシーケンス番号です。最初の行のシーケンス番号は、ROW_NUMBER() 関数を使用して計算されます。 orderby_句が指定されていない場合、返される結果の値はすべて1になります。

例

select deptno, ename, sal, rank() over (partition by deptno order by sal desc) as nums from emp;

次の応答が返されます。

+------------+------------+------------+------------+
| deptno     | ename      | sal        | nums       |
+------------+------------+------------+------------+
| 10         | JACCKA     | 5000       | 1          |
| 10         | KING       | 5000       | 1          |
| 10         | CLARK      | 2450       | 3          |
| 10         | WELAN      | 2450       | 3          |
| 10         | TEBAGE     | 1300       | 5          |
| 10         | MILLER     | 1300       | 5          |
| 20         | SCOTT      | 3000       | 1          |
| 20         | FORD       | 3000       | 1          |
| 20         | JONES      | 2975       | 3          |
| 20         | ADAMS      | 1100       | 4          |
| 20         | SMITH      | 800        | 5          |
| 30         | BLAKE      | 2850       | 1          |
| 30         | ALLEN      | 1600       | 2          |
| 30         | TURNER     | 1500       | 3          |
| 30         | MARTIN     | 1250       | 4          |
| 30         | WARD       | 1250       | 4          |
| 30         | JAMES      | 950        | 6          |
+------------+------------+------------+------------+

ROW_NUMBER

構文

row_number() over([partition_clause] [orderby_clause])

説明
パーティション内の現在の行のシーケンス番号を返します。シーケンス番号は1からカウントされます。
パラメーター
詳細は、「windowing_definition」をご参照ください。 frame_句はサポートされていません。
戻り値
BIGINT型の値が返されます。

例

deptno列に基づいてすべての従業員をグループ化します。各グループで、sal値に基づいて降順で従業員をソートし、それぞれのグループの従業員のシーケンス番号を取得します。例：

select deptno, ename, sal, row_number() over (partition by deptno order by sal desc) as nums from emp;

次の応答が返されます。

+------------+------------+------------+------------+
| deptno     | ename      | sal        | nums       |
+------------+------------+------------+------------+
| 10         | JACCKA     | 5000       | 1          |
| 10         | KING       | 5000       | 2          |
| 10         | CLARK      | 2450       | 3          |
| 10         | WELAN      | 2450       | 4          |
| 10         | TEBAGE     | 1300       | 5          |
| 10         | MILLER     | 1300       | 6          |
| 20         | SCOTT      | 3000       | 1          |
| 20         | FORD       | 3000       | 2          |
| 20         | JONES      | 2975       | 3          |
| 20         | ADAMS      | 1100       | 4          |
| 20         | SMITH      | 800        | 5          |
| 30         | BLAKE      | 2850       | 1          |
| 30         | ALLEN      | 1600       | 2          |
| 30         | TURNER     | 1500       | 3          |
| 30         | MARTIN     | 1250       | 4          |
| 30         | WARD       | 1250       | 5          |
| 30         | JAMES      | 950        | 6          |
+------------+------------+------------+------------+

STDDEV

構文

double stddev|stddev_pop([distinct] <expr>) over ([partition_clause] [orderby_clause] [frame_clause])
decimal stddev|stddev_pop([distinct] <expr>) over ([partition_clause] [orderby_clause] [frame_clause])

説明
すべての入力値の母標準偏差を返します。この関数はSTDDEV_POPとも呼ばれます。
パラメーター
- expr: 必須です。母集団の標準偏差を計算するために使用される式。 DOUBLEまたはDECIMAL型の値。
  - 入力値がSTRING型またはBIGINT型の場合は、計算前に暗黙的にDOUBLE型の値に変換されます。別のデータ型の場合は、エラーが返されます。
  - 入力値がnullの場合、値を含む行は計算に使用されません。
  - distinctキーワードが指定されている場合、distinct値の母集団標準偏差が計算されます。
- partition_clause、orderby_clause、およびframe_clause: これらのパラメーターの詳細については、「windowing_definition」をご参照ください。
戻り値
exprと同じデータ型の値が返されます。 exprで指定されたすべての式の値がnullの場合、nullが返されます。

例

例1: deptno列を使用してウィンドウを定義し、sal列の母集団標準偏差を計算します。 order by句は指定されていません。この関数は、現在のウィンドウの累積母集団標準偏差を返します。現在のウィンドウには、同じdeptno値を持つ行が含まれています。例：

select deptno, sal, stddev(sal) over (partition by deptno) from emp;

次の応答が返されます。

+------------+------------+------------+
| deptno     | sal        | _c2        |
+------------+------------+------------+
| 10         | 1300       | 1546.1421524412158 |   -- This row is the first row of this window. The return value is the cumulative population standard deviation of the values from the first row to the sixth row. 
| 10         | 2450       | 1546.1421524412158 |   -- The return value is the cumulative population standard deviation of the values from the first row to the sixth row. 
| 10         | 5000       | 1546.1421524412158 |
| 10         | 1300       | 1546.1421524412158 |
| 10         | 5000       | 1546.1421524412158 |
| 10         | 2450       | 1546.1421524412158 |
| 20         | 3000       | 1004.7387720198718 |
| 20         | 3000       | 1004.7387720198718 |
| 20         | 800        | 1004.7387720198718 |
| 20         | 1100       | 1004.7387720198718 |
| 20         | 2975       | 1004.7387720198718 |
| 30         | 1500       | 610.1001739241042 |
| 30         | 950        | 610.1001739241042 |
| 30         | 1600       | 610.1001739241042 |
| 30         | 1250       | 610.1001739241042 |
| 30         | 1250       | 610.1001739241042 |
| 30         | 2850       | 610.1001739241042 |
+------------+------------+------------+

例2: Hive互換データ型エディションを無効にし、deptno列を使用してウィンドウを定義し、sal列の母集団標準偏差を計算します。 ORDER BY句が指定されています。この関数は、現在のウィンドウの最初の行から現在の行までの値の累積母集団標準偏差を返します。現在のウィンドウには、同じdeptno値を持つ行が含まれています。例：

-- Disable the Hive-compatible data type edition. 
set odps.sql.hive.compatible=false;
-- Execute the following statement: 
select deptno, sal, stddev(sal) over (partition by deptno order by sal) from emp;

次の応答が返されます。

+------------+------------+------------+
| deptno     | sal        | _c2        |
+------------+------------+------------+
| 10         | 1300       | 0.0        |           -- This row is the first row of this window. 
| 10         | 1300       | 0.0        |           -- The return value is the cumulative population standard deviation of the values in the first and second rows. 
| 10         | 2450       | 542.1151989096865 |    -- The return value is the cumulative population standard deviation of the values from the first row to the third row. 
| 10         | 2450       | 575.0      |           -- The return value is the cumulative population standard deviation of the values from the first row to the fourth row. 
| 10         | 5000       | 1351.6656391282572 |
| 10         | 5000       | 1546.1421524412158 |
| 20         | 800        | 0.0        |
| 20         | 1100       | 150.0      |
| 20         | 2975       | 962.4188277460079 |
| 20         | 3000       | 1024.2947268730811 |
| 20         | 3000       | 1004.7387720198718 |
| 30         | 950        | 0.0        |
| 30         | 1250       | 150.0      |
| 30         | 1250       | 141.4213562373095 |
| 30         | 1500       | 194.8557158514987 |
| 30         | 1600       | 226.71568097509268 |
| 30         | 2850       | 610.1001739241042 |
+------------+------------+------------+

例3: Hive互換データ型エディションを有効にし、deptno列を使用してウィンドウを定義し、sal列の母集団標準偏差を計算します。 ORDER BY句が指定されています。この関数は、最初の行から現在のウィンドウの現在の行と同じsal値を持つ行までの値の累積母集団標準偏差を返します。同じsal値を持つ行の母集団の標準偏差は同じです。現在のウィンドウには、同じdeptno値を持つ行が含まれています。サンプル文:

-- Enable the Hive-compatible data type edition. 
set odps.sql.hive.compatible=true;
-- Execute the following statement: 
select deptno, sal, stddev(sal) over (partition by deptno order by sal) from emp;

次の応答が返されます。

+------------+------------+------------+
| deptno     | sal        | _c2        |
+------------+------------+------------+
| 10         | 1300       | 0.0        |           -- This row is the first row of this window. The population standard deviation for the first row is the cumulative population standard deviation of the values in the first and second rows because the two rows have the same sal value. 
| 10         | 1300       | 0.0        |           -- The return value is the cumulative population standard deviation of the values in the first and second rows. 
| 10         | 2450       | 575.0      |           -- The population standard deviation for the third row is the cumulative population standard deviation of the values from the first row to the fourth row because the third and fourth rows have the same sal value. 
| 10         | 2450       | 575.0      |           -- The return value is the cumulative population standard deviation of the values from the first row to the fourth row. 
| 10         | 5000       | 1546.1421524412158 |
| 10         | 5000       | 1546.1421524412158 |
| 20         | 800        | 0.0        |
| 20         | 1100       | 150.0      |
| 20         | 2975       | 962.4188277460079 |
| 20         | 3000       | 1004.7387720198718 |
| 20         | 3000       | 1004.7387720198718 |
| 30         | 950        | 0.0        |
| 30         | 1250       | 141.4213562373095 |
| 30         | 1250       | 141.4213562373095 |
| 30         | 1500       | 194.8557158514987 |
| 30         | 1600       | 226.71568097509268 |
| 30         | 2850       | 610.1001739241042 |
+------------+------------+------------+

STDDEV_SAMP

構文

double stddev_samp([distinct] <expr>) over([partition_clause] [orderby_clause] [frame_clause])
decimal stddev_samp([distinct] <expr>) over([partition_clause] [orderby_clause] [frame_clause])

説明
すべての入力値のサンプル標準偏差を返します。
パラメーター
- expr: 必須です。このパラメーターは、サンプルの標準偏差を計算するために使用される式を指定します。 DOUBLEまたはDECIMAL型の値。
  - 入力値がSTRING型またはBIGINT型の場合は、計算前に暗黙的にDOUBLE型の値に変換されます。別のデータ型の場合は、エラーが返されます。
  - 入力値がnullの場合、値を含む行は計算に使用されません。
  - distinctキーワードが指定されている場合、distinct値のサンプル標準偏差が計算されます。
- partition_clause、orderby_clause、およびframe_clause: これらのパラメーターの詳細については、「windowing_definition」をご参照ください。
戻り値
exprと同じデータ型の値が返されます。 exprで指定されたすべての式の値がnullの場合、nullが返されます。ウィンドウにexpr値がnullでないデータの行が1つしかない場合、0が返されます。

例

例1: deptnoカラムを使用してウィンドウを定義し、salカラムのサンプル標準偏差を計算します。 order by句は指定されていません。この関数は、現在のウィンドウの累積サンプル標準偏差を返します。現在のウィンドウには、同じdeptno値を持つ行が含まれています。例：

select deptno, sal, stddev_samp(sal) over (partition by deptno) from emp;

次の応答が返されます。

+------------+------------+------------+
| deptno     | sal        | _c2        |
+------------+------------+------------+
| 10         | 1300       | 1693.7138680032904 |   -- This row is the first row of this window. The return value is the cumulative sample standard deviation of the values from the first row to the sixth row. 
| 10         | 2450       | 1693.7138680032904 |   -- The return value is the cumulative sample standard deviation of the values from the first row to the sixth row. 
| 10         | 5000       | 1693.7138680032904 |   -- The return value is the cumulative sample standard deviation of the values from the first row to the sixth row. 
| 10         | 1300       | 1693.7138680032904 |     
| 10         | 5000       | 1693.7138680032904 |
| 10         | 2450       | 1693.7138680032904 |
| 20         | 3000       | 1123.3320969330487 |
| 20         | 3000       | 1123.3320969330487 |
| 20         | 800        | 1123.3320969330487 |
| 20         | 1100       | 1123.3320969330487 |
| 20         | 2975       | 1123.3320969330487 |
| 30         | 1500       | 668.331255192114 |
| 30         | 950        | 668.331255192114 |
| 30         | 1600       | 668.331255192114 |
| 30         | 1250       | 668.331255192114 |
| 30         | 1250       | 668.331255192114 |
| 30         | 2850       | 668.331255192114 |
+------------+------------+------------+

例2: deptnoカラムを使用してウィンドウを定義し、salカラムのサンプル標準偏差を計算します。 ORDER BY句が指定されています。この関数は、現在のウィンドウの最初の行から現在の行までの値の累積サンプル標準偏差を返します。現在のウィンドウには、同じdeptno値を持つ行が含まれています。例：

select deptno, sal, stddev_samp(sal) over (partition by deptno order by sal) from emp;

次の応答が返されます。

+------------+------------+------------+
| deptno     | sal        | _c2        |
+------------+------------+------------+
| 10         | 1300       | 0.0        |          -- This row is the first row of this window. 
| 10         | 1300       | 0.0        |          -- The return value is the cumulative sample standard deviation of the values in the first and second rows. 
| 10         | 2450       | 663.9528095680697 |   -- The return value is the cumulative sample standard deviation of the values from the first row to the third row. 
| 10         | 2450       | 663.9528095680696 |
| 10         | 5000       | 1511.2081259707413 |
| 10         | 5000       | 1693.7138680032904 |
| 20         | 800        | 0.0        |
| 20         | 1100       | 212.13203435596427 |
| 20         | 2975       | 1178.7175234126282 |
| 20         | 3000       | 1182.7536725793752 |
| 20         | 3000       | 1123.3320969330487 |
| 30         | 950        | 0.0        |
| 30         | 1250       | 212.13203435596427 |
| 30         | 1250       | 173.20508075688772 |
| 30         | 1500       | 225.0      |
| 30         | 1600       | 253.4758371127315 |
| 30         | 2850       | 668.331255192114 |
+------------+------------+------------+

SUM

構文

sum([distinct] <expr>) over ([partition_clause] [orderby_clause] [frame_clause])

説明
ウィンドウ内のexprの合計を返します。
パラメーター
- expr: 必須です。このパラメータは、合計を計算する列を指定します。列は、DOUBLE、DECIMAL、またはBIGINTタイプです。
  - 入力値がSTRING型の場合は、計算前に暗黙的にDOUBLE型の値に変換されます。別のデータ型の場合は、エラーが返されます。
  - 入力値がnullの場合、値を含む行は計算に使用されません。
  - distinctキーワードが指定されている場合、distinct値の合計が計算されます。
- partition_clause、orderby_clause、およびframe_clause: これらのパラメーターの詳細については、「windowing_definition」をご参照ください。
戻り値
- 入力値がBIGINT型の場合、BIGINT型の値が返されます。
- 入力値がDECIMAL型の場合、DECIMAL型の値が返されます。
- 入力値がDOUBLE型またはSTRING型の場合、DOUBLE型の値が返されます。
- 入力値がnullの場合、nullが返されます。

例

例1: deptno列を使用してウィンドウを定義し、sal列の合計を計算します。 order by句は指定されていません。この関数は、現在のウィンドウの累積合計を返します。現在のウィンドウには、同じdeptno値を持つ行が含まれています。例：

select deptno, sal, sum(sal) over (partition by deptno) from emp;

次の応答が返されます。

+------------+------------+------------+
| deptno     | sal        | _c2        |
+------------+------------+------------+
| 10         | 1300       | 17500      |   -- This row is the first row of this window. The return value is the cumulative sum of the values from the first row to the sixth row. 
| 10         | 2450       | 17500      |   -- The return value is the cumulative sum of the values from the first row to the sixth row. 
| 10         | 5000       | 17500      |   -- The return value is the cumulative sum of the values from the first row to the sixth row. 
| 10         | 1300       | 17500      |
| 10         | 5000       | 17500      |
| 10         | 2450       | 17500      |
| 20         | 3000       | 10875      |
| 20         | 3000       | 10875      |
| 20         | 800        | 10875      |
| 20         | 1100       | 10875      |
| 20         | 2975       | 10875      |
| 30         | 1500       | 9400       |
| 30         | 950        | 9400       |
| 30         | 1600       | 9400       |
| 30         | 1250       | 9400       |
| 30         | 1250       | 9400       |
| 30         | 2850       | 9400       |
+------------+------------+------------+

例2: Hive互換データ型エディションを無効にし、deptno列を使用してウィンドウを定義し、sal列の合計を計算します。 ORDER BY句が指定されています。この関数は、現在のウィンドウの最初の行から現在の行までの値の累積合計を返します。現在のウィンドウには、同じdeptno値を持つ行が含まれています。サンプル文:

-- Disable the Hive-compatible data type edition. 
set odps.sql.hive.compatible=false;
-- Execute the following statement: 
select deptno, sal, sum(sal) over (partition by deptno order by sal) from emp;

次の応答が返されます。

+------------+------------+------------+
| deptno     | sal        | _c2        |
+------------+------------+------------+
| 10         | 1300       | 1300       |   -- This row is the first row of this window. 
| 10         | 1300       | 2600       |   -- The return value is the cumulative sum of the values in the first and second rows. 
| 10         | 2450       | 5050       |   -- The return value is the cumulative sum of the values from the first row to the third row. 
| 10         | 2450       | 7500       |
| 10         | 5000       | 12500      |
| 10         | 5000       | 17500      |
| 20         | 800        | 800        |
| 20         | 1100       | 1900       |
| 20         | 2975       | 4875       |
| 20         | 3000       | 7875       |
| 20         | 3000       | 10875      |
| 30         | 950        | 950        |
| 30         | 1250       | 2200       |
| 30         | 1250       | 3450       |
| 30         | 1500       | 4950       |
| 30         | 1600       | 6550       |
| 30         | 2850       | 9400       |
+------------+------------+------------+

例3: Hive互換データ型エディションを有効にし、deptno列を使用してウィンドウを定義し、sal列の合計を計算します。 ORDER BY句が指定されています。この関数は、最初の行から、現在のウィンドウの現在の行と同じsal値を持つ行までの値の累積合計を返します。同じsal値を持つ行の合計値は同じです。現在のウィンドウには、同じdeptno値を持つ行が含まれています。サンプル文:

-- Enable the Hive-compatible data type edition. 
set odps.sql.hive.compatible=true;
-- Execute the following statement: 
select deptno, sal, sum(sal) over (partition by deptno order by sal) from emp;

次の応答が返されます。

+------------+------------+------------+
| deptno     | sal        | _c2        |
+------------+------------+------------+
| 10         | 1300       | 2600       |   -- This row is the first row of this window. The sum for the first row is the cumulative sum of the values in the first and second rows because the two rows have the same sal value. 
| 10         | 1300       | 2600       |   -- The return value is the cumulative sum of the values in the first and second rows. 
| 10         | 2450       | 7500       |   -- The sum for the third row is the cumulative sum of the values from the first row to the fourth row because the third and fourth rows have the same sal value. 
| 10         | 2450       | 7500       |   -- The return value is the cumulative sum of the values from the first row to the fourth row. 
| 10         | 5000       | 17500      |
| 10         | 5000       | 17500      |
| 20         | 800        | 800        |
| 20         | 1100       | 1900       |
| 20         | 2975       | 4875       |
| 20         | 3000       | 10875      |
| 20         | 3000       | 10875      |
| 30         | 950        | 950        |
| 30         | 1250       | 3450       |
| 30         | 1250       | 3450       |
| 30         | 1500       | 4950       |
| 30         | 1600       | 6550       |
| 30         | 2850       | 9400       |
+------------+------------+------------+

MaxCompute:Window 関数

制限事項

構文

windowing_definition

filter_句

frame_clause

サンプルデータ

AVG

CLUSTER_SAMPLE

COUNT

CUME_DIST

DENSE_RANK

FIRST_VALUE

LAG

LAST_VALUE

LEAD

MAX

MEDIAN

MIN

NTILE

NTH_VALUE

PERCENT_RANK

RANK

ROW_NUMBER

STDDEV

STDDEV_SAMP

SUM

関連ドキュメント