Calculates the percentile rank of a row in an ordered group of rows.
Limits
Before you use window functions, take note of the following limits:
Window functions are supported only in
SELECT
statements.A window function cannot contain nested window functions or nested aggregate functions.
You cannot use window functions together with aggregate functions of the same level.
Syntax
double percent_rank() over([partition_clause] [orderby_clause])
Description
Calculates the percentile rank of the current row in a partition based on orderby_clause.
Parameters
partition_clause and orderby_clause: For more information about these parameters, see windowing_definition.
Return value
A value of the DOUBLE type is returned. The valid value range is [0.0, 1.0]. The return value is calculated by using the following formula: (rank - 1)/(partition_row_count - 1)
. rank
indicates the return value of the RANK function that corresponds to the current row. partition_row_count
indicates the number of rows in the partition to which the current row belongs. If the partition contains only one row of data, 0.0 is returned.
Sample data
This section provides sample source data and examples for you to understand how to use the functions. Create a table named emp and insert the sample data into the table. Sample statement:
create table if not exists emp
(empno bigint,
ename string,
job string,
mgr bigint,
hiredate datetime,
sal bigint,
comm bigint,
deptno bigint);
tunnel upload emp.txt emp;
The emp.txt file contains the following sample data:
7369,SMITH,CLERK,7902,1980-12-17 00:00:00,800,,20
7499,ALLEN,SALESMAN,7698,1981-02-20 00:00:00,1600,300,30
7521,WARD,SALESMAN,7698,1981-02-22 00:00:00,1250,500,30
7566,JONES,MANAGER,7839,1981-04-02 00:00:00,2975,,20
7654,MARTIN,SALESMAN,7698,1981-09-28 00:00:00,1250,1400,30
7698,BLAKE,MANAGER,7839,1981-05-01 00:00:00,2850,,30
7782,CLARK,MANAGER,7839,1981-06-09 00:00:00,2450,,10
7788,SCOTT,ANALYST,7566,1987-04-19 00:00:00,3000,,20
7839,KING,PRESIDENT,,1981-11-17 00:00:00,5000,,10
7844,TURNER,SALESMAN,7698,1981-09-08 00:00:00,1500,0,30
7876,ADAMS,CLERK,7788,1987-05-23 00:00:00,1100,,20
7900,JAMES,CLERK,7698,1981-12-03 00:00:00,950,,30
7902,FORD,ANALYST,7566,1981-12-03 00:00:00,3000,,20
7934,MILLER,CLERK,7782,1982-01-23 00:00:00,1300,,10
7948,JACCKA,CLERK,7782,1981-04-12 00:00:00,5000,,10
7956,WELAN,CLERK,7649,1982-07-20 00:00:00,2450,,10
7956,TEBAGE,CLERK,7748,1982-12-30 00:00:00,1300,,10
Examples
Calculate the percentile rank of each employee in a group based on the sal column. Sample statement:
select deptno, ename, sal, percent_rank() over (partition by deptno order by sal desc) as sal_new from emp;
The following result is returned:
+------------+------------+------------+------------+
| deptno | ename | sal | sal_new |
+------------+------------+------------+------------+
| 10 | JACCKA | 5000 | 0.0 |
| 10 | KING | 5000 | 0.0 |
| 10 | CLARK | 2450 | 0.4 |
| 10 | WELAN | 2450 | 0.4 |
| 10 | TEBAGE | 1300 | 0.8 |
| 10 | MILLER | 1300 | 0.8 |
| 20 | SCOTT | 3000 | 0.0 |
| 20 | FORD | 3000 | 0.0 |
| 20 | JONES | 2975 | 0.5 |
| 20 | ADAMS | 1100 | 0.75 |
| 20 | SMITH | 800 | 1.0 |
| 30 | BLAKE | 2850 | 0.0 |
| 30 | ALLEN | 1600 | 0.2 |
| 30 | TURNER | 1500 | 0.4 |
| 30 | MARTIN | 1250 | 0.6 |
| 30 | WARD | 1250 | 0.6 |
| 30 | JAMES | 950 | 1.0 |
+------------+------------+------------+------------+
Related functions
PERCENT_RANK is a window function. For more information about the functions that are used to calculate the sum of data of columns in a window and to sort data, see Window functions.