All Products
Search
Document Center

Hologres:SELECT

Last Updated:Feb 04, 2026

This topic describes how to use the SELECT statement to query data in Hologres.

Command overview

The SELECT statement queries data from zero or more tables. The main parameters are listed in the following table.

Parameters

WITH clause

SELECT list

FROM clause

WHERE clause

GROUP BY clause

CUBE clause

DISTINCT clause

COUNT DISTINCT clause

UNION clause

INTERSECT clause

EXCEPT clause

ORDER BY clause

LIMIT clause

None

Command syntax

The basic syntax of the SELECT statement is as follows.

[ WITH with_query [, ...] ]
SELECT [ALL | DISTINCT [ON (expression [, ...])]]
  * | expression [[AS] output_name] [, ...]
  [FROM from_item [, ...]]
  [WHERE condition]
  [GROUP BY grouping_element [, ...]]
  [HAVING condition [, ...]]
  [{UNION | INTERSECT | EXCEPT} [ALL] select]
  [ORDER BY expression [ASC | DESC | USING operator] [, ...]]
  [LIMIT {count | ALL}]

In this syntax, grouping_element and from_item are defined as follows:

  • A grouping_element contains an expression. An expression can be a column name, a constant, a function, or any combination of column names and constants that are connected by arithmetic or bitwise operators.

  • from_item contains

    table_name [[AS] alias [( column_alias [, ...] )]]
    (select) [AS] alias [( column_alias [, ...] )]
    from_item [NATURAL] join_type from_item
              [ON join_condition | USING ( join_column [, ...] )]

Command usage

The following are common SELECT methods:

  1. All elements in the FROM list, which are real or virtual tables, are evaluated. If the list contains more than one element, the result is their union.

  2. If a WHERE clause is specified, it filters out all rows that do not satisfy the specified condition.

  3. If a GROUP BY clause is specified or if aggregate functions are used, the rows are combined into groups based on matching values. The aggregate functions are then computed for each group. If a HAVING clause is present, it filters out groups that do not meet the specified condition.

  4. The SELECT output expressions are used to compute the actual output rows from the selected rows or groups.

  5. You can combine multiple SELECT statements into a single result set using the UNION, INTERSECT, and EXCEPT set operators. The UNION operator returns all rows that are in either or both result sets. The INTERSECT operator returns all rows that are in both result sets. The EXCEPT operator returns rows that are in the first result set but not in the second. For all three operators, duplicate rows are removed from the final result set unless you specify ALL. By default, these operators remove duplicates, which is equivalent to specifying DISTINCT. Note that this behavior is different from a standard SELECT statement, where the default behavior is to return all rows, including duplicates.

  6. If you specify an ORDER BY clause, the returned rows are sorted in the specified order. If you do not specify an ORDER BY clause, the rows are returned in an arbitrary order, which is typically the fastest order for the system to produce.

  7. If a LIMIT (or FETCH FIRST) or OFFSET clause is specified, the SELECT statement returns only a subset of the result rows.

Parameter descriptions

  • WITH clause

    • Command overview

      The WITH clause lets you define one or more subqueries that can be referenced by name in the main query. These named subqueries are known as Common Table Expressions (CTEs). A WITH clause is typically placed before the primary SELECT statement. The syntax is as follows:

      with query_name [ ( column_name [, ...] ) ] AS ( select )
    • Parameter description

      Parameter

      Description

      query_name

      Specifies the name of the current CTE. It can be any valid identifier.

      column_name

      Lists the column names corresponding to the return values of the subquery. This is similar to the AS semantics in a SELECT clause. The subquery can be a regular SELECT query.

      CTEs are separated by commas. A later CTE definition can reference an earlier CTE. Recursive CTEs are not currently supported. In subsequent queries, you can use query_name directly as a view. If no column_name list is specified, the view’s column_name defaults to the column names returned by the subquery. If a column_name list is specified, you must use the defined column_name, and the number of column_name entries must match the number of columns returned by the SELECT statement.

  • SELECT list

    • Command overview

      The SELECT list, which is the part of the statement between the SELECT and FROM keywords, specifies the expressions that form the output rows of the SELECT statement. These expressions can refer to columns that are computed in the FROM clause.

    • Parameter description

      Each output column in a SELECT statement has a name. In a simple SELECT statement, this name is used to label the column for display. If the SELECT statement is a subquery, the outer query uses this name to reference the column of the virtual table that is produced by the subquery. To specify a name for an output column, add AS output_name after the column expression. You can omit AS, but only if the desired output name does not match any PostgreSQL keyword. To avoid potential conflicts with future keywords, we recommend that you use AS or enclose the output name in double quotes. If you do not specify a column name, PostgreSQL automatically assigns one. If the column expression is a simple column reference, the assigned name is the same as the column's name. In more complex cases that involve functions or type names, the system may generate a generic name, such as ?column?.

      You can reference an output column by its name in the ORDER BY and GROUP BY clauses. However, you cannot reference an output column by its name in the WHERE or HAVING clauses. In those clauses, you must use the expression itself.

      You can use an asterisk (*) in the output list as a shorthand for all columns of the selected rows. You can also use table_name.* as a shorthand for all columns from a specific table. In these cases, you cannot use AS to specify a new name. The output column names will be the same as the original table column names.

  • FROM clause

    • Command overview

      The FROM clause specifies one or more source tables for the SELECT statement. If you specify multiple source tables, the result is the Cartesian product (cross join) of all the source tables. Typically, you can add a WHERE clause with join conditions to restrict the returned rows to a small subset of the Cartesian product.

    • Parameter description

      The FROM clause can include the following elements:

      Element

      Description

      table_name

      The name of an existing table or view (optionally qualified with a schema).

      alias

      An alternative name for a FROM item that includes an alias. Aliases make writing concise or resolve ambiguity in self-joins (where the same table is scanned multiple times). When an alias is provided, the actual table or function name is hidden. For example, given FROM foo AS f, the rest of the SELECT must refer to this FROM item as f, not foo.

      select

      A sub-SELECT can appear in the FROM clause. This acts as if its output were a temporary table created for the duration of the SELECT command. Note that the sub-SELECT must be enclosed in parentheses and must have an alias.

      function_name

      A function call can appear in the FROM clause (especially useful for functions that return result sets, but any function can be used). This acts as if the function's output were a temporary table created for the duration of the SELECT command.

      You can provide an alias just like for a table. If you provide an alias, you can also specify a list of column aliases to give alternative names to one or more attributes of the function's composite return type, including any new column added by ORDINALITY (if present).

      You can combine multiple function calls into a single FROM-clause item by enclosing them in ROWS FROM(。). The output of such an item concatenates the first row of each function, then the second row of each function, and so on. If some functions produce fewer rows than others, NULLs fill the missing data positions, so the total number of returned rows always matches the function that produces the most rows.

      join_type

      Includes the following five types:

      • [ INNER ] JOIN

        For INNER and OUTER join types, you must specify a join condition: one of NATURAL, ON join_condition, or USING (join_column [, ...]) (only one). See below for meanings. For CROSS JOIN, these clauses cannot appear. A JOIN clause combines two FROM items (called “tables” for convenience, though they can be any FROM item type). Use parentheses to control nesting order if needed. Without parentheses, JOINs nest left to right. In all cases, JOIN binding is stronger than comma-separated FROM-list items.

      • LEFT [ OUTER ] JOIN

        LEFT OUTER JOIN returns all rows from the restricted Cartesian product (all combinations that satisfy the join condition), plus one copy of each unmatched row from the left table. Unmatched left rows are extended to full join-table rows by inserting NULLs in right-hand columns. Note that only the JOIN clause’s own condition determines which rows match. Outer conditions are applied afterward.

      • RIGHT [ OUTER ] JOIN

        RIGHT OUTER JOIN returns all joined rows plus each unmatched right-hand row (extended with NULLs on the left). This is merely notational convenience, as you can convert it to a LEFT OUTER JOIN by swapping left and right tables.

      • FULL [ OUTER ] JOIN

        FULL OUTER JOIN returns all joined rows, plus each unmatched left-hand row (extended with NULLs on the right), plus each unmatched right-hand row (extended with NULLs on the left).

      • CROSS JOIN

        CROSS JOIN and INNER JOIN produce a simple Cartesian product—the same result as listing two tables in FROM—but constrained by the join condition (if any). CROSS JOIN is equivalent to INNER JOIN ON (TRUE), meaning the condition removes no rows. These join types are purely notational convenience, as nothing they do cannot also be done with plain FROM and WHERE.

      join_condition

      join_condition is an expression that evaluates to a Boolean value (similar to a WHERE clause), specifying which rows are considered matching in a join.

      USING ( a, b, ... )

      The clause USING (a, b...) is shorthand for ON left_table.a = right_table.a AND left_table.b = right_table.b .... Additionally, USING ensures only one column from each equal pair appears in the join output.

      NATURAL

      Shorthand for USING that lists all columns with identical names in both tables.

  • WHERE clause

    • Command overview

      The optional WHERE clause has the following form:

      WHERE condition
    • Parameter description

      Parameter

      Description

      condition

      condition is any expression that evaluates to a Boolean result. Any row that does not satisfy this condition is removed from the output. A row satisfies the condition if substituting its actual values for variable references causes the expression to return true.

  • GROUP BY clause

    • Command overview

      The optional GROUP BY clause condenses all selected rows that share the same values for the grouping expressions into a single summary row. The GROUP BY clause has the following form:

      GROUP BY grouping_element [, ...]
    • Parameter description

      The expression in a grouping_element can be an input column name, the name or ordinal number of an output column from the SELECT list, or an expression that is formed from input column values. If a name is ambiguous, the GROUP BY clause interprets it as an input column name rather than an output column name.

      If a grouping element contains GROUPING SETS, ROLLUP, or CUBE, the GROUP BY clause defines multiple independent grouping sets. This is equivalent to a UNION ALL of subqueries, where each subquery has its own GROUP BY clause that corresponds to one of the grouping sets.

      If you use aggregate functions, they are computed across all rows in each group, which produces a single value for each group. If you use aggregate functions but do not include a GROUP BY clause, the query treats all selected rows as a single group. You can filter the rows that are passed to an aggregate function by appending a FILTER clause to the aggregate function call. If a FILTER clause is present, only the rows that match the filter condition are included in the input for that aggregate function.

      If a GROUP BY clause or any aggregate function is present, expressions in the SELECT list cannot reference ungrouped columns, except within an aggregate function or if the ungrouped column is functionally dependent on the grouped columns. This is because there would otherwise be more than one possible value to return for an ungrouped column. A functional dependency exists if the grouped columns include the primary key, or a subset of the primary key, of the table that contains the ungrouped column.

      All aggregate functions are computed before any scalar expressions in the HAVING clause or the SELECT list. This means, for example, that a CASE expression cannot be used to skip the computation of an aggregate function.

  • CUBE clause

    • CUBE

      The CUBE subclause automatically creates subtotals for all combinations of the grouping columns that are specified in the GROUP BY clause. The result set includes all possible combinations of values in the dimension columns, along with aggregated values from the base rows that match those dimension combinations. CUBE returns one summary row for each group. You can use CUBE to generate cross-tabulation values. For example, if you specify three expressions (n = 3) in the CUBE clause, the operation results in 2n = 23 = 8 groups. Rows that are grouped by all n expression values are regular rows. All other rows are super-aggregate rows. The syntax is as follows:

      CUBE ( { expression | ( expression [, …] ) } [, …] )
    • ROLLUP

      The ROLLUP subclause is an extension of the GROUP BY clause. When you generate statistics and reports, you can use ROLLUP to return a subtotal for each group and a grand total for all groups. The syntax is as follows:

      ROLLUP ( { expression | ( expression [, …] ) } [, …] ) 
    • GROUPING SETS

      The GROUPING SETS subclause is an extension of the GROUP BY clause that lets you specify multiple GROUP BY options in a single query. This improves efficiency by allowing you to select only the grouping sets that you need. When you specify only the required data groups, the database does not need to compute the full set of aggregations that is generated by CUBE or ROLLUP. The syntax is as follows:

      GROUPING SETS ( grouping_element [, …] ) 
  • DISTINCT clause

    If you specify SELECT DISTINCT, all duplicate rows are removed from the result set. Only one row is kept from each group of duplicate rows.

    Note

    SELECT DISTINCT on columns of the ARRAY data type is not supported.

    SELECT DISTINCT accountid FROM table;
  • COUNT DISTINCT clause

    • Command overview

      COUNT DISTINCT counts the number of distinct values in a column. A value that appears multiple times is counted only once. Similar to the COUNT function, NULL values in the column are not included in the count.

    • Syntax description

      • Syntax for exact calculation:

        SELECT c1, COUNT(DISTINCT c2) FROM table GROUP BY c1
      • An exact COUNT DISTINCT calculation can consume significant resources. For this reason, Hologres also supports approximate COUNT DISTINCT calculations. The syntax is as follows:

        SELECT c1, approx_count_distinct(c2) FROM table GROUP BY c1
  • UNION clause

    • Command overview

      The syntax of the UNION clause is as follows:

      select_statement UNION [ ALL | DISTINCT ] select_statement
    • Parameter description

      Parameter

      Description

      select_statement

      select_statement is any SELECT statement without ORDER BY, LIMIT, FOR NO KEY UPDATE, FOR UPDATE, FOR SHARE, or FOR KEY SHARE clauses (if the subexpression is enclosed in parentheses, ORDER BY and LIMIT can be attached to it. Without parentheses, these clauses apply to the UNION result, not the right-hand expression).

      UNION

      The UNION operator computes the union of rows returned by the involved SELECT statements. A row appears in the union if it appears in at least one of the two result sets. The SELECT statements used as UNION operands must produce the same number of columns, and corresponding columns must have compatible data types.

      The UNION result does not contain duplicate rows unless the ALL option is specified. ALL prevents duplicate removal (so UNION ALL is usually significantly faster than UNION; use ALL whenever possible). You can explicitly specify duplicate removal by writing DISTINCT.

      Multiple UNION operators in the same SELECT statement are evaluated left to right unless parentheses specify a different order.

  • INTERSECT clause

    • Command overview

      The INTERSECT clause has the following form:

      select_statement INTERSECT [ ALL | DISTINCT ] select_statement
    • Parameter description

      Parameter

      Description

      select_statement

      select_statement is any SELECT statement without an ORDER BY LIMIT clause.

      INTERSECT

      The INTERSECT operator computes the intersection of rows returned by the involved SELECT statements. A row appears in the intersection only if it appears in both result sets.

      The INTERSECT result does not contain duplicate rows unless the ALL option is specified. With ALL, a row that appears m times in the left table and n times in the right table appears min(m, n) times in the result. You can explicitly specify duplicate removal by writing DISTINCT.

      Multiple INTERSECT operators in the same SELECT statement are evaluated left to right unless parentheses specify a different order. INTERSECT has higher precedence than UNION. Thus, A UNION B INTERSECT C is read as A UNION (B INTERSECT C).

  • EXCEPT clause

    • Command overview

      The EXCEPT clause has the following form:

      select_statement EXCEPT [ ALL | DISTINCT ] select_statement
    • Parameter description

      Parameter

      Description

      select_statement

      select_statement is any SELECT statement without an ORDER BY LIMIT clause.

      EXCEPT

      The EXCEPT operator computes the set of rows that appear in the result of the left SELECT statement but not in the result of the right SELECT statement.

      The EXCEPT result does not contain duplicate rows unless the ALL option is specified. With ALL, a row that appears m times in the left table and n times in the right table appears max(m - n, 0) times in the result. You can explicitly specify duplicate removal by writing DISTINCT.

      Multiple EXCEPT operators in the same SELECT statement are evaluated left to right unless parentheses specify a different order. EXCEPT has the same precedence as UNION.

      Currently, FOR NO KEY UPDATE, FOR UPDATE, FOR SHARE, and FOR KEY SHARE cannot be used with EXCEPT results or any EXCEPT input.

  • ORDER BY clause

    • Command overview

      The optional ORDER BY clause has the following form:

      ORDER BY expression [ ASC | DESC | USING operator ] [ NULLS { FIRST | LAST } ] [, ...]
    • Parameter description

      The ORDER BY clause sorts the result rows based on the specified expressions. If two rows are equal according to the leftmost expression, they are compared based on the next expression, and so on. If the rows are equal for all specified expressions, they are returned in an implementation-dependent order.

      Each expression can be the name or ordinal number of an output column from the SELECT list, or an expression that is formed from input column values.

      The ordinal number refers to the position of the output column, from left to right. This feature lets you define an order for columns that do not have unique names. This is typically not necessary because you can use the AS clause to assign a unique name to an output column.

      You can also use arbitrary expressions in the ORDER BY clause, including columns that are not in the SELECT output list. For example, the following statement is valid:

      SELECT name FROM distributors ORDER BY code;

      A limitation of this feature is that when you apply an ORDER BY clause to the result of a UNION, INTERSECT, or EXCEPT clause, the ORDER BY clause can specify only output column names or ordinal numbers, not expressions.

      If an ORDER BY expression is a simple name that matches both an output column name and an input column name, ORDER BY interprets it as the output column name. This behavior differs from the GROUP BY clause, which interprets an ambiguous name as an input column name. This inconsistency is for compatibility with the SQL standard.

      You can add the ASC (ascending) or DESC (descending) keyword after any expression in the ORDER BY clause. If you do not specify a keyword, ASC is used by default. Alternatively, you can specify a sort operator name in a USING clause. A sort operator must be a less-than or greater-than member of a B-tree operator family. ASC is usually equivalent to USING <, and DESC is usually equivalent to USING >. However, the creator of a user-defined data type can define a specific default sort order, which might correspond to operators with other names.

      If you specify NULLS LAST, null values are sorted after non-null values. If you specify NULLS FIRST, null values are sorted before non-null values. If you do not specify either, the default behavior for ASC is NULLS LAST, and the default behavior for DESC is NULLS FIRST. By default, nulls are treated as being greater than non-nulls. When you use a USING clause, the default nulls order depends on whether the operator is a less-than or greater-than operator.

      Note that ordering options apply only to the expression they follow. For example, ORDER BY x, y DESC differs from ORDER BY x DESC, y DESC.

  • LIMIT clause

    • Command overview

      The LIMIT clause consists of two independent subclauses:

      LIMIT { count | ALL }
      OFFSET start
    • Parameter description

      The count parameter specifies the maximum number of rows to return. The start parameter specifies the number of rows to skip before the system starts to return rows. If you specify both parameters, the system skips the specified number of start rows before it begins to count the specified number of count rows to return.

      If the count expression evaluates to NULL, it is treated as LIMIT ALL, which means there is no limit. If the start expression evaluates to NULL, it is treated as OFFSET 0.

      When you use a LIMIT clause, you should also include an ORDER BY clause to constrain the rows to a unique order. Otherwise, you will retrieve an unpredictable subset of the query's rows. For example, if you request rows 10 through 20 without specifying an ORDER BY clause, there is no guarantee which rows will be returned.

      The query planner takes the LIMIT clause into account when it generates a query plan. Therefore, different LIMIT and OFFSET values will likely result in different query plans and different row orders. If you select different subsets of a query result using different LIMIT and OFFSET values, you might receive inconsistent results unless you enforce a predictable order with an ORDER BY clause. This is not a bug. It is an intended consequence of the fact that SQL does not guarantee a specific order for query results unless an ORDER BY clause is used to constrain the order.

      If you do not use an ORDER BY clause to enforce a deterministic subset, executing the same LIMIT query multiple times might return different subsets of the table rows. This is not a bug. Determinism is not guaranteed in such cases.

Usage examples

  • Join two tables

    SELECT f.title, f.did, d.name, f.date_prod, f.kind FROM 
    distributors d, films f WHERE f.did = d.did;
  • WITH clause

    WITH distributor_name(name) AS (SELECT name from distributors)
    SELECT name FROM distributor_name ORDER BY name;
  • GROUP BY

    SELECT kind, sum(length) AS total FROM films GROUP BY kind;
  • Filtering with HAVING

    SELECT kind, sum(length) AS total FROM films GROUP BY kind 
    HAVING sum(length) < interval '5 hours';
  • GROUP BY CUBE

    SELECT  l_returnflag
            ,l_shipmode
            ,SUM(l_quantity)
    FROM    public.lineitem
    GROUP BY cube((l_returnflag),(l_shipmode))
    ORDER BY l_returnflag
             ,l_shipmode;
  • GROUP BY ROLLUP

    SELECT  l_returnflag
            ,l_shipmode
            ,SUM(l_quantity)
    FROM    public.lineitem
    GROUP BY ROLLUP ((l_returnflag),(l_shipmode))
    ORDER BY l_returnflag
             ,l_shipmode;
  • GROUP BY GROUPING SETS

    SELECT  l_returnflag
            ,l_shipmode
            ,SUM(l_quantity)
    FROM    public.lineitem
    GROUP BY GROUPING SETS ((l_returnflag,l_shipmode),())
    ORDER BY l_returnflag
             ,l_shipmode;
  • ORDER BY

    SELECT * FROM distributors ORDER BY name;