Multidimensional analysis is a common process used in data warehousing applications. This process helps you examine data by using various combinations of dimensions. Dimensions are categories used to classify data such as time, geography, departments, and product lines. The results associated with a particular set of dimensions are called facts. Facts are typically figures associated with dimensions such as product sales, profits, volumes, and counts.
You can use SQL aggregation to obtain these facts based on a set of dimensions in a relational database system. During SQL aggregation, data is grouped by certain criteria or dimensions. The result set consists of aggregates of facts, such as counts, sums, and averages of the data in each group.
The GROUP BY clause of the SQL SELECT statement supports the following extensions that simplify the process of generating aggregate results.
ROLLUP extension
CUBE extension
GROUPING SETS extension
In addition, the GROUPING function and the GROUPING_ID function can be used in the SELECT list or the HAVING clause to interpret the results when these extensions are used.
This topic describes how to use these extensions by taking the dept and emp tables for example. The following changes are used to these tables to provide more informative results.
UPDATE dept SET loc = 'BOSTON' WHERE deptno = 20;
INSERT INTO emp (empno,ename,job,deptno) VALUES (9001,'SMITH','CLERK',40);
INSERT INTO emp (empno,ename,job,deptno) VALUES (9002,'JONES','ANALYST',40);
INSERT INTO emp (empno,ename,job,deptno) VALUES (9003,'ROGERS','MANAGER',40);
The following rows from a join of the emp and dept tables are used:
SELECT loc, dname, job, empno FROM emp e, dept d
WHERE e.deptno = d.deptno
ORDER BY 1, 2, 3, 4;
loc | dname | job | empno
----------+------------+-----------+-------
BOSTON | OPERATIONS | ANALYST | 9002
BOSTON | OPERATIONS | CLERK | 9001
BOSTON | OPERATIONS | MANAGER | 9003
BOSTON | RESEARCH | ANALYST | 7788
BOSTON | RESEARCH | ANALYST | 7902
BOSTON | RESEARCH | CLERK | 7369
BOSTON | RESEARCH | CLERK | 7876
BOSTON | RESEARCH | MANAGER | 7566
CHICAGO | SALES | CLERK | 7900
CHICAGO | SALES | MANAGER | 7698
CHICAGO | SALES | SALESMAN | 7499
CHICAGO | SALES | SALESMAN | 7521
CHICAGO | SALES | SALESMAN | 7654
CHICAGO | SALES | SALESMAN | 7844
NEW YORK | ACCOUNTING | CLERK | 7934
NEW YORK | ACCOUNTING | MANAGER | 7782
NEW YORK | ACCOUNTING | PRESIDENT | 7839
(17 rows)
The loc, dname, and job columns are used for the dimensions of the SQL aggregations used in the examples. The COUNT(*) function is used to retrieve the number of employees as the resulting facts of the aggregations.
The following example shows a basic query where the loc, dname, and job columns are grouped.
SELECT loc, dname, job, COUNT(*) AS "employees" FROM emp e, dept d
WHERE e.deptno = d.deptno
GROUP BY loc, dname, job
ORDER BY 1, 2, 3;
The rows of this result set that uses the basic GROUP BY clause without extensions are called the base aggregate rows.
loc | dname | job | employees
----------+------------+-----------+-----------
BOSTON | OPERATIONS | ANALYST | 1
BOSTON | OPERATIONS | CLERK | 1
BOSTON | OPERATIONS | MANAGER | 1
BOSTON | RESEARCH | ANALYST | 2
BOSTON | RESEARCH | CLERK | 2
BOSTON | RESEARCH | MANAGER | 1
CHICAGO | SALES | CLERK | 1
CHICAGO | SALES | MANAGER | 1
CHICAGO | SALES | SALESMAN | 4
NEW YORK | ACCOUNTING | CLERK | 1
NEW YORK | ACCOUNTING | MANAGER | 1
NEW YORK | ACCOUNTING | PRESIDENT | 1
(12 rows)
The ROLLUP and CUBE extensions are added to the base aggregate rows and provide additional levels of subtotals to the result set.
The GROUPING SETS extension can be used to combine different types of groups into a single result set.
The GROUPING and GROUPING_ID functions are used to interpret the result set.
For more information about the additions provided by these extensions, see subsequent topics.