This topic describes the types of subqueries supported by PolarDB-X 1.0 and the limits and additional considerations when you use subqueries in PolarDB-X 1.0.
Limits
Compared with the native MySQL, PolarDB-X 1.0 has the following limits when you use subqueries:
- Subqueries cannot be used in HAVING clauses. Example:
SELECT name, AVG( quantity ) FROM tb1 GROUP BY name HAVING AVG( quantity ) > 2* ( SELECT AVG( quantity ) FROM tb2 );
- Subqueries cannot be used in JOIN ON clauses. Example:
SELECT * FROM tb1 p JOIN tb2 s on (p.id=s.id and p.quantity>All(select quantity from tb3))
- ROW subqueries and scalar subqueries cannot be placed before and after equal signs
(=) simultaneously. Example:
select * from tb1 where row(id, name) = (select id, name from tb2)
- Subqueries cannot be used in UPDATE SET clauses. Example:
UPDATE t1 SET c1 = (SELECT c2 FROM t2 WHERE t1.c1 = t2.c1) LIMIT 10
Additional considerations
In PolarDB-X 1.0, some subqueries can be executed by using only the APPLY operator and result in inefficient queries. Avoid the following inefficient SQL statements:
- SQL statements whose WHERE clauses contain both OR operators and subqueries. The execution
efficiency is reduced based on the data in the foreign tables. Examples:
Efficient: select * from tb1 where id in (select id from tb2) Efficient: select * from tb1 where id in (select id from tb2) and id>3 Inefficient: select * from tb1 where id in (select id from tb2) or id>3
- Correlated subqueries whose correlated items are used in functions or used along with
non-equal signs. Examples:
Efficient: select * from tb1 a where id in (select id from tb2 b where a.name=b.name) Inefficient: select * from tb1 a where id in (select id from tb2 b where UPPER(a.name)=b.name) Inefficient: select * from tb1 a where id in (select id from tb2 b where a.decimal_test=abs(b.decimal_test)) Inefficient: select * from tb1 a where id in (select id from tb2 b where a.name! =b.name) Inefficient: select * from tb1 a where id in (select id from tb2 b where a.name>=b.name)
- Correlated subqueries whose correlated items are connected with other conditions by
using OR operators. Examples:
Efficient: select * from tb1 a where id in (select id from tb2 b where a.name=b.name and b.date_test<'2015-12-02') Inefficient: select * from tb1 a where id in (select id from tb2 b where a.name=b.name or b.date_test<'2015-12-02') Inefficient: select * from tb1 a where id in (select id from tb2 b where a.name=b.name or b.date_test=a.date_test)
- Scalar subqueries that have correlated items. Examples:
Efficient: select * from tb1 a where id > (select id from tb2 b where b.date_test<'2015-12-02') Inefficient: select * from tb1 a where id > (select id from tb2 b where a.name=b.name and b.date_test<'2015-12-02')
- Subqueries whose correlated items span the correlation levels. Examples:
- An SQL statement has multiple correlation levels. The correlated items in each subquery
are correlated only with the upper level. Such statements are efficient.
Efficient: select * from tb1 a where id in(select id from tb2 b where a.name=b.name and exists (select name from tb3 c where b.address=c.address))
- An SQL statement has multiple correlation levels. The correlated items of subqueries
in
table c
are correlated with columns intable a
. Such statements are inefficient.Inefficient: select * from tb1 a where id in(select id from tb2 b where a.name=b.name and exists (select name from tb3 c where a.address=c.address))
Note In the preceding example, bothtable a
andtable b
,table b
andtable c
belong to the same correlation level. The correlation betweentable a
andtable c
spans the correlation levels. - An SQL statement has multiple correlation levels. The correlated items in each subquery
are correlated only with the upper level. Such statements are efficient.
- Subqueries that contain GROUP BY clauses. Make sure that the correlated items are
correlated to the grouping columns. Examples:
- An SQL subquery contains aggregate functions and correlated items. The
b.pk
correlated item is correlated to thepk
grouping column. Such SQL statements are efficient.Efficient: select * from tb1 a where exists (select pk from tb2 b where a.pk=b.pk and b.date_test='2003-04-05' group by pk);
- An SQL subquery contains aggregate functions and correlated items. The
b.date_test
correlated item is not correlated to thepk
grouping column. Such SQL statements are inefficient.Inefficient: select * from tb1 a where exists (select pk from tb2 b where a.date_test=b.date_test and b.date_test='2003-04-05' group by pk);
- An SQL subquery contains aggregate functions and correlated items. The
Supported subqueries
PolarDB-X 1.0 supports the following types of subqueries:
- Comparisons using subqueries
Comparisons using subqueries indicate subqueries that use comparison operators. These subqueries are commonly used.
- Syntax:
non_subquery_operand comparison_operator (subquery) comparison_operator: = > < >= <= <> ! = <=> like
- Example:
select * from tb1 WHERE 'a' = (SELECT column1 FROM t1)
Note Subqueries can be placed only to the right of comparison operators.
- Syntax:
- Subqueries with ANY, ALL, IN/NOT IN, and EXISTS/NOT EXISTS
- Syntax:
operand comparison_operator ANY (subquery) operand comparison_operator ALL (subquery) operand IN (subquery) operand NOT IN (subquery) operand EXISTS (subquery) operand NOT EXISTS (subquery) comparison_operator:= > < >= <= <> ! =
- Examples
- ANY: If any row returned by the subquery meets the expression before ANY, TRUE is returned. Otherwise, FALSE is returned.
- ALL: If all rows returned by the subquery meet the expression before ALL, TRUE is returned. Otherwise, FALSE is returned.
- IN: If IN is used before the subquery, IN is equivalent to
=ANY
. Example:SELECT s1 FROM t1 WHERE s1 = ANY (SELECT s1 FROM t2); SELECT s1 FROM t1 WHERE s1 IN (SELECT s1 FROM t2);
- NOT IN: If NOT IN is used before the subquery, NOT IN is equivalent to
<>ALL
. Example:SELECT s1 FROM t1 WHERE s1 <> ALL (SELECT s1 FROM t2); SELECT s1 FROM t1 WHERE s1 NOT IN (SELECT s1 FROM t2);
- EXISTS: If the subquery returns any rows, TRUE is returned. Otherwise, FALSE is returned.
Example:
SELECT column1 FROM t1 WHERE EXISTS (SELECT * FROM t2);
Note If a subquery contains any rows, the WHERE condition returns TRUE even if the subquery contains only NULL rows. - NOT EXISTS: If the subquery returns any rows, FALSE is returned. Otherwise, TRUE is returned.
- Syntax:
- ROW subqueries
- ROW subqueries support the following comparison operators:
comparison_operator: = > < >= <= <> ! = <=>
- Examples:
SELECT * FROM t1 WHERE (col1,col2) = (SELECT col3, col4 FROM t2 WHERE id = 10); SELECT * FROM t1 WHERE ROW(col1,col2) = (SELECT col3, col4 FROM t2 WHERE id = 10);
The preceding two SQL statements are equivalent. Data rows in table t1 are returned only when the following conditions are met:- The subquery (
SELECT col3, col4 FROM t2 WHERE id=10
) returns only one row. An error is reported if multiple rows are returned. col3
andcol4
returned by the subquery are equal tocol1
andcol2
in the primary table.
- The subquery (
- ROW subqueries support the following comparison operators:
- Correlated subqueries
Correlated subqueries are subqueries that contain references to foreign tables in outer queries. Example:
SELECT * FROM t1 WHERE column1 = ANY (SELECT column1 FROM t2 WHERE t2.column2 = t1.column2);
In the example, the subquery does not contain table t1 and its column column2. In this case, the subquery finds the table in the outer query.
- Derived tables (subqueries in a FROM clause)
Derived tables are subqueries in a FROM clause.
- Syntax:
SELECT ... FROM (subquery) [AS] tbl_name ...
- Examples
- Prepare data:
Execute the following statements to create table t1:
CREATE TABLE t1 (s1 INT, s2 CHAR(5), s3 FLOAT); INSERT INTO t1 VALUES (1,'1',1.0); INSERT INTO t1 VALUES (2,'2',2.0);
Execute the following statement. The query result is2, '2', 4.0
.SELECT sb1,sb2,sb3 FROM (SELECT s1 AS sb1, s2 AS sb2, s3*2 AS sb3 FROM t1) AS sb WHERE sb1 > 1;
- Query data: Query the average value of grouped data that is processed by the SUM function.
If you execute the following SQL statement, an error is reported and no result is returned.
SELECT AVG(SUM(s1)) FROM t1 GROUP BY s1;
You can execute the following statement that contains a derived table. The query result is1.5000
.SELECT AVG(sum_s1) FROM (SELECT SUM(s1) AS sum_s1 FROM t1 GROUP BY s1) AS t1;
Note- A derived table must have an alias, such as
t1
in the previous statement. - A derived table can return a scalar, a column, a row, or a table.
- Derived tables cannot be correlated subqueries. Derived tables cannot contain references to foreign tables in outer queries.
- A derived table must have an alias, such as
- Prepare data:
- Syntax: