COUNT command format, parameters, and return value - MaxCompute

The COUNT function calculates the number of records.

Applicability

Window functions can be used only in SELECT statements.
Do not nest window functions or aggregate functions in a window function.
Do not use window functions together with aggregate functions at the same level.

Syntax

-- Calculate the number of records.
BIGINT COUNT([DISTINCT|ALL] <colname>)

-- Calculate the number of records in a window.
BIGINT COUNT(*) OVER ([partition_clause] [orderby_clause] [frame_clause])
BIGINT COUNT([DISTINCT] <expr>[,...]) OVER ([partition_clause] [orderby_clause] [frame_clause])

Parameters

DISTINCT|ALL: Optional. Specifies whether to remove duplicate records before counting. The default value is ALL, which counts all records. If you specify DISTINCT, the function counts only unique records.
colname: Required. The column whose values you want to count. This parameter can be of any data type. You can specify * for colname, as in COUNT(*). In this case, the function returns the total number of rows. Rows where the colname value is NULL are not included in the calculation.
expr: Required. An expression. This parameter can be of any data type. Rows where the expression evaluates to NULL are not included in the calculation. If you specify the DISTINCT keyword, the function returns the count of unique values.
COUNT([DISTINCT] <expr>[,...]): Counts the number of rows in the specified window where the values of all specified expressions are not NULL. If you specify the DISTINCT keyword, the function counts the rows after removing duplicate rows.
partition_clause, orderby_clause, and frame_clause: For more information, see windowing_definition.

Return value

The function returns a BIGINT value. Rows where colname is NULL are not included in the calculation.

Examples

Prepare test data

If you have existing data, you can skip this step.

Download the test_data.txt file.

Create a test table.

CREATE TABLE IF NOT EXISTS emp(
  empno BIGINT,
  ename STRING,
  job STRING,
  mgr BIGINT,
  hiredate DATETIME,
  sal BIGINT,
  comm BIGINT,
  deptno BIGINT
);

Load data.
Replace FILE_PATH with the actual path and name of your data file.
```
TUNNEL UPLOAD FILE_PATH emp;   
```

Example 1: Specify a column for the window and return the cumulative count without sorting

This example partitions the window by the salary (sal) column. The data is not sorted. The function returns the cumulative count of all rows in the current window. The window contains all rows that have the same sal value.

Sample command

SELECT sal, COUNT(sal) OVER (PARTITION BY sal) AS count FROM emp;

Result

+------------+------------+
| sal        | count      | 
+------------+------------+
| 800        | 1          | 
| 950        | 1          | 
| 1100       | 1          | 
| 1250       | 2          |  -- The partition contains two rows where sal is 1250. The function returns 2 for both rows.
| 1250       | 2          |  -- The count is also 2 for the second row in the partition.
| 1300       | 2          | 
| 1300       | 2          | 
| 1500       | 1          | 
| 1600       | 1          | 
| 2450       | 2          | 
| 2450       | 2          | 
| 2850       | 1          | 
| 2975       | 1          | 
| 3000       | 2          | 
| 3000       | 2          | 
| 5000       | 2          | 
| 5000       | 2          | 
+------------+------------+

Example 2: In non-Hive compatible mode, specify a column for the window and return the cumulative count after sorting

In non-Hive compatible mode, this example partitions the window by the salary (sal) column and sorts the data. The function returns the cumulative count from the first row to the current row in the current window. The window contains all rows that have the same sal value.

Sample command

-- Disable Hive compatible mode.
SET odps.sql.hive.compatible=false;

SELECT sal, COUNT(sal) OVER (PARTITION BY sal ORDER BY sal) AS count FROM emp;

Return value

+------------+------------+
| sal        | count      |
+------------+------------+
| 800        | 1          |
| 950        | 1          |
| 1100       | 1          |
| 1250       | 1          |   -- The window partition starts. The running count for the first row is 1.
| 1250       | 2          |   -- The running count for the second row is 2.
| 1300       | 1          |
| 1300       | 2          |
| 1500       | 1          |
| 1600       | 1          |
| 2450       | 1          |
| 2450       | 2          |
| 2850       | 1          |
| 2975       | 1          |
| 3000       | 1          |
| 3000       | 2          |
| 5000       | 1          |
| 5000       | 2          |
+------------+------------+

Example 3: In Hive compatible mode, specify a column for the window and return the cumulative count after sorting

In Hive compatible mode, this example partitions the window by the salary (sal) column and sorts the data. The function returns the cumulative count from the first row to the last row in the current window. The window contains all rows that have the same sal value.

Sample command

-- Enable Hive compatible mode.
SET odps.sql.hive.compatible=true;

SELECT sal, COUNT(sal) OVER (PARTITION BY sal ORDER BY sal) AS count FROM emp;

Result

+------------+------------+
| sal        | count      |
+------------+------------+
| 800        | 1          |
| 950        | 1          |
| 1100       | 1          |
| 1250       | 2          |   -- The partition contains two rows where sal is 1250. The function returns 2 for both rows.
| 1250       | 2          |   -- The count is also 2 for the second row in the partition.
| 1300       | 2          |
| 1300       | 2          |
| 1500       | 1          |
| 1600       | 1          |
| 2450       | 2          |
| 2450       | 2          |
| 2850       | 1          |
| 2975       | 1          |
| 3000       | 2          |
| 3000       | 2          |
| 5000       | 2          |
| 5000       | 2          |
+------------+------------+

Example 4: Return the total number of rows

Calculate the total number of employees in all departments.

Sample command
```
SELECT COUNT(*) FROM emp;
```

Result

+------------+
| _c0        |
+------------+
| 17         |
+------------+

Example 5: Group data and calculate the total for each group

This example uses the COUNT function with GROUP BY to group all employees by department (deptno) and calculate the number of employees in each department.

Sample command

SELECT deptno, COUNT(*) FROM emp GROUP BY deptno;

Result

+------------+------------+
| deptno     | _c1        | 
+------------+------------+
| 20         | 5          | 
| 30         | 6          | 
| 10         | 6          | 
+------------+------------+

Example 6: Count unique values

This example uses DISTINCT to remove duplicate values and count the number of departments.

Sample command
```
SELECT COUNT(DISTINCT deptno) FROM emp;
```

Return value

+------------+
| _c0        |
+------------+
| 3          |
+------------+

Related functions

COUNT is an aggregate function and a window function.

For more information about functions that calculate a single value from multiple input records, see Aggregate functions.
For more information about other window functions, see Window functions.