Syntax
[ WITH with_query [, ...] ]
SELECT [ ALL | DISTINCT ] select_expr [, ...]
[ FROM from_item [, ...] ]
[ WHERE condition ]
[ GROUP BY [ ALL | DISTINCT ] grouping_element [, ...] ]
[ HAVING condition]
[ { UNION | INTERSECT | EXCEPT } [ ALL | DISTINCT ] select ]
[ ORDER BY expression [ ASC | DESC ] [, ...] ]
[ LIMIT [ count | ALL ] ]
Parameters
from_item
: DLA supports the following two modes:table_name [ [ AS ] alias [ ( column_alias [, ...] ) ] ]
from_item join_type from_item [ ON join_condition | USING ( join_column [, ...] ) ]
join_type
: indicates how tables are connected.[ INNER ] JOIN
LEFT [ OUTER ] JOIN
RIGHT [ OUTER ] JOIN
FULL [ OUTER ] JOIN
grouping_element
()
expression
WITH clause
The WITH
clause defines a named relationship for use in a query statement. It flattens nested queries or simplifies subqueries. For example, the following queries are equivalent:
SELECT a, b
FROM (
SELECT a, MAX(b) AS b FROM t GROUP BY a
) AS x;
WITH x AS (SELECT a, MAX(b) AS b FROM t GROUP BY a)
SELECT a, b FROM x;
This clause is also applicable to multiple subqueries:
WITH
t1 AS (SELECT a, MAX(b) AS b FROM x GROUP BY a),
t2 AS (SELECT a, AVG(d) AS d FROM y GROUP BY a)
SELECT t1.*, t2.*
FROM t1
JOIN t2 ON t1.a = t2.a;
Additionally, the relationships defined in a WITH clause can be connected to each other.
WITH
x AS (SELECT a FROM t),
y AS (SELECT a AS b FROM x),
z AS (SELECT b AS c FROM y)
SELECT c FROM z;
GROUP BY clause
The GROUP BY
clause groups the output data of the SELECT
statement. A group contains data rows of the matching values. A simple GROUP BY
clause contains any expressions or column numbers (starting from 1) consisting of the input columns.
The following queries are equivalent. Both of them group the nationkey column. The first query uses the column number, while the second query uses the column name.
SELECT count(*), nationkey FROM customer GROUP BY 2;
SELECT count(*), nationkey FROM customer GROUP BY nationkey;
If the column name is not specified in a query statement, the GROUP BY
clause can still group the output data. For example, the following query uses the mktsegment column for grouping and counts the number of rows in the customer table.:
SELECT count(*) FROM customer GROUP BY mktsegment;
_col0
-------
29968
30142
30189
29949
29752
(5 rows)
When a GROUP BY
clause is used in a SELECT
statement, all output columns must be aggregate functions or columns in the GROUP BY clause.
HAVING clause
The HAVING clause is used together with the aggregate functions and GROUP BY clause to control and select groups. The HAVING clause removes groups that do not meet the conditions. After the grouping and aggregate computing operations are complete, the HAVING clause filters the groups. The following example shows how to query a customer table, groups the table, and queries the records with the account balance greater than a specified value.
SELECT count(*), mktsegment, nationkey,
CAST(sum(acctbal) AS bigint) AS totalbal
FROM customer
GROUP BY mktsegment, nationkey
HAVING sum(acctbal) > 5700000
ORDER BY totalbal DESC;
_col0 | mktsegment | nationkey | totalbal
-------+------------+-----------+----------
1272 | AUTOMOBILE | 19 | 5856939
1253 | FURNITURE | 14 | 5794887
1248 | FURNITURE | 9 | 5784628
1243 | FURNITURE | 12 | 5757371
1231 | HOUSEHOLD | 3 | 5753216
1251 | MACHINERY | 2 | 5719140
1247 | FURNITURE | 8 | 5701952
(7 rows)
UNION, INTERSECT, and EXCEPT clauses
UNION, INTERSECT, and EXCEPT are full set operators. These clauses are used to combine the results of multiple query statements into a final result.
query UNION [ALL | DISTINCT] query
query INTERSECT query
query EXCEPT query
The parameter ALL or DISTINCT controls the rows that you want to include in the final result set. If ALL is specified, all rows are included, even some rows are exactly the same. If DISTINCT is specified, the result set contains only unrepeated rows with repeated rows merged. If no parameter is specified, DISTINCT is used by default.
Multiple set operators are processed from left to right, unless the sequence is specified using parentheses. In addition, INTERSECT has a higher priority than EXCEPT and UNION, which means that the expression A UNION B INTERSECT C EXCEPT D is the same as A UNION (B INTERSECT C) EXCEPT D.
UNION
UNION combines two result sets. The following example shows the simplest way to use the UNION clause. The UNION clause selects the value 13 and merges it with the second selected value 42.
SELECT 13
UNION
SELECT 42;
_col0
-------
13
42
(2 rows)
INTERSECT
INTERSECT returns only the combinations of rows that appear in both the first and second queries. The following example shows the simplest way to use the INTERSECT clause. The INTERSECT clause selects the values 13 and 42, and merges them with the value 13 that is selected from the secondary query. As the value 42 is included in the result set of the first query, it is not included in the final result set.
SELECT 13
INTERSECT
SELECT 13;
_col0
-------
13
(2 rows)
EXCEPT
EXCEPT returns the rows that appear only in the result set of the first query and do not appear in the result set of the second query. The following example shows the simplest way to use the EXCEPT clause. The clause selects the value 42, and merges it with the value 13 that is selected from the second query. As the value 13 is included in the result set of the second query, it is not included in the final result set.
SELECT 42
EXCEPT
SELECT 13;
_col0
-------
42
(2 rows)
ORDER BY Clause
The ORDER BY clause is used to sort a result set by one or more output expressions.
ORDER BY expression [ ASC | DESC ] [, ...]
Each expression consists of the column names or numbers (starting from 1). Serving as the last step of a query, the ORDER BY clause is used after the GROUP BY and HAVING clauses.
LIMIT Clause
The LIMIT clause limits the number of rows in the final result set. The result of the LIMIT ALL clause is the same as that when the LIMIT clause is omitted. In the following example, a large table is queried, and the LIMIT clause restricts output to only five rows. (Rows are randomly returned because the ORDER BY clause is not used in the query.)
SELECT orderdate FROM orders LIMIT 5;
o_orderdate
-------------
1996-04-14
1992-01-15
1995-02-01
1995-11-12
1992-04-26
(5 rows)
Joins
This allows you to merge data from multiple Join clauses.
CROSS JOIN
SELECT *
FROM nation, region;
The tables nation and region contain 25 rows and 5 rows, respectively. Therefore, the two tables ultimately generate 125 rows.
Subqueries
A subquery is an expression that contains queries. A subquery column is joint to a peripheral query column. Logically, a subquery is evaluated on a row basis by its peripheral queries. Therefore, the referenced columns are fixed to the evaluation process of the subquery.
Note
Joined subqueries are limited. Not every type of the subqueries is supported.
EXISTS
The EXISTS assertion determines whether a subquery can return any row.
SELECT name
FROM nation
WHERE EXISTS (SELECT * FROM region WHERE region.regionkey = nation.regionkey)
IN
The IN assertion determines whether the value returned by a subquery is included in a given result set. The result of IN follows the standard result of nulls. A subquery can generate only one column:
SELECT name
FROM nation
WHERE regionkey IN (SELECT regionkey FROM region)
Scalar subqueries
A scalar subquery is a non-joined subquery that returns none or one data row. If the subquery returns more than one data row, an error is returned. If the subquery does not return any rows, NULL
is returned.
SELECT name
FROM nation
WHERE regionkey = (SELECT max(regionkey) FROM region)
The scalar subquery can be used for a single column.