The JOIN function joins two or more tables and returns data that meets the join and query conditions.
Usage notes
In this topic, a left table indicates the table on the left side of the JOIN keyword, and a right table indicates the table on the right side of the JOIN keyword.
For example, in the following SQL statement, table_1 is the left table and table_2 is the right table.
SELECT * FROM table_1 JOIN table_2 on ......;
Syntax
table_references join_type table_references [ ON join_condition | USING ( join_column [, ...] ) ]
table_references : {
table_name [ [ AS ] alias_name ]
| select_statement
}
join_type : {
[ INNER ] JOIN
| LEFT [ OUTER ] JOIN
| RIGHT [ OUTER ] JOIN
| CROSS JOIN
}
Parameters
Parameters | Required | Description |
table_references | Yes | The information about the tables which you want to join. The value can be the names of the tables or the SELECT statement. The left table and right table are distinguished based on whether the tables are on the left side or right side of the JOIN keyword. |
join_type | Yes | The type of the JOIN function. Valid values:
|
join_condition | Yes | The columns based on which you want to join two tables. The following SQL statement shows that table_1 and table_2 are joined by matching the col_A column in table_1 and the col_B column in table_2.
If the names of the columns that are specified in the join condition are the same in the two tables, you can use
Simplified SQL statement:
|
JOIN algorithms
Tablestore provides the following algorithms for the JOIN action: INDEX JOIN and HASH JOIN. INDEX JOIN is the default JOIN algorithm. If the specified columns in the right table based on which you want to join the tables do not meet the condition for using INDEX JOIN, the system uses HASH JOIN.
INDEX JOIN (default): reads data from the left table, uses the index or primary key of the right table to read the rows in the right table that match the data in the left table, and then joins the two tables. For more information, see INDEX JOIN.
HASH JOIN: reads data from the left table to create a hash table, reads data from the right table to determine whether the rows in the right table match the data in the hash table, and then joins the two tables. For more information, see HASH JOIN.
INDEX JOIN
To use INDEX JOIN, ensure that the specified columns in the right table based on which you want to join the tables meet specific conditions for using indexes. The left table is not subject to these conditions.
If you use a secondary index or the primary key of the right table to read data from the right table, the specified columns in the right table based on which you want to join tables must comply with the leftmost matching principle.
If you use a search index of the right table to read data from the right table, the specified columns in the right table based on which you want to join tables must be included in the search index. If other filtering conditions are specified for the right table, the filtering conditions must be able to be pushed down to the search index that includes the specified columns in the right table based on which you want to join tables.
If the number of rows in the tables that you want to join is small, the amount of data in the left table is small, or the amount of data in the left table is large but the amount of data after data is filtered is small, INDEX JOIN can improve the join performance so long as the right table meets the condition for using INDEX JOIN. In scenarios in which INDEX JOIN is used, when you set join_type to INNER JOIN and the two tables meet the condition for using INDEX JOIN, we recommend that you set the table that has a smaller amount of data or the table that has a smaller amount of data after data is filtered to the left table.
HASH JOIN
Tablestore does not impose limits on the scenarios in which HASH JOIN can be used. If the condition for using INDEX JOIN is not met, the system uses HASH JOIN.
If the number of rows in the tables that you want to join is large, you can use HASH JOIN to obtain better performance.
In scenarios in which HASH JOIN is used, when you set the join_type parameter to INNER JOIN, we recommend that you set the table that has a smaller amount of data or the table that has a smaller amount of data after data is filtered to the left table.
Suggestions
If you want to use INDEX JOIN, make sure that the specified columns in the right table based on which you want to join the tables meet specific conditions for using indexes. Tablestore does not impose requirements on the left table.
If you use a secondary index or the primary key of the right table to read data from the right table, the specified columns in the right table based on which you want to join tables must comply with the leftmost matching principle.
If you use a search index of the right table to read data from the right table, make sure that the specified columns in the right table based on which you want to join tables are included in the search index and that the filtering conditions that are specified for the right table can be pushed down to the search index.
If a large number of filtering conditions are specified for the right table and no index is used, the read performance is compromised.
When you set the join_type parameter to INNER JOIN, the table on the left side of the JOIN keyword is the external table, and the table on the right side of the JOIN keyword is the internal table. We recommend that you use the table that has a smaller number of rows or the table that has a smaller number of rows after data is filtered as the external table, which is on the left side of the JOIN keyword.
If the number of rows in the tables that you want to join is small or the external table that you want to join has a smaller number of rows or has a smaller number of rows after data is filtered, INDEX JOIN provides better performance than HASH JOIN. If the amount of data or the number of rows in the tables that you want to join is large, the performance of INDEX JOIN is compromised. In this case, HASH JOIN provides better performance than INDEX JOIN.
If a large number of filtering conditions are used for a table specified in the SQL statement in which the JOIN clause is used, the read performance is compromised when no index is used. In this case, you can use indexes to accelerate data read operations. This may help increase the execution speed of the SQL statement.
Examples
The following examples use two tables named orders and customers:
SELECT * FROM orders;
+----------+-------------+------------+--------------+
| order_id | customer_id | order_date | order_amount |
+----------+-------------+------------+--------------+
| 1001 | 1 | 2023-01-01 | 50 |
| 1002 | 2 | 2023-01-02 | 80 |
| 1003 | 3 | 2023-01-03 | 180 |
| 1004 | 4 | 2023-01-04 | 220 |
| 1005 | 6 | 2023-01-05 | 250 |
+----------+-------------+------------+--------------+
SELECT * FROM customers;
+-------------+---------------+----------------+
| customer_id | customer_name | customer_phone |
+-------------+---------------+----------------+
| 1 | Alice | 11111111111 |
| 2 | Bob | 22222222222 |
| 3 | Carol | 33333333333 |
| 4 | David | 44444444444 |
| 5 | Eve | 55555555555 |
+-------------+---------------+----------------+
Example 1: INNER JOIN
SELECT * FROM orders JOIN customers ON orders.customer_id = customers.customer_id;
The preceding INNER JOIN statement is equivalent to the following SQL statement in which USING is used:
SELECT * FROM orders JOIN customers USING(customer_id);
The following output is returned:
+----------+-------------+------------+--------------+-------------+---------------+----------------+ | order_id | customer_id | order_date | order_amount | customer_id | customer_name | customer_phone | +----------+-------------+------------+--------------+-------------+---------------+----------------+ | 1001 | 1 | 2023-01-01 | 50 | 1 | Alice | 11111111111 | | 1002 | 2 | 2023-01-02 | 80 | 2 | Bob | 22222222222 | | 1003 | 3 | 2023-01-03 | 180 | 3 | Carol | 33333333333 | | 1004 | 4 | 2023-01-04 | 220 | 4 | David | 44444444444 | +----------+-------------+------------+--------------+-------------+---------------+----------------+
Example 2: LEFT JOIN
SELECT * FROM orders LEFT JOIN customers ON orders.customer_id = customers.customer_id;
The preceding LEFT JOIN statement is equivalent to the following SQL statement in which USING is used:
SELECT * FROM orders LEFT JOIN customers USING(customer_id);
The following output is returned:
+----------+-------------+------------+--------------+-------------+---------------+----------------+ | order_id | customer_id | order_date | order_amount | customer_id | customer_name | customer_phone | +----------+-------------+------------+--------------+-------------+---------------+----------------+ | 1001 | 1 | 2023-01-01 | 50 | 1 | Alice | 11111111111 | | 1002 | 2 | 2023-01-02 | 80 | 2 | Bob | 22222222222 | | 1003 | 3 | 2023-01-03 | 180 | 3 | Carol | 33333333333 | | 1004 | 4 | 2023-01-04 | 220 | 4 | David | 44444444444 | | 1005 | 6 | 2023-01-05 | 250 | NULL | NULL | NULL | +----------+-------------+------------+--------------+-------------+---------------+----------------+
Example 3: CROSS JOIN
SELECT * FROM orders CROSS JOIN customers;
The following output is returned:
+----------+-------------+------------+--------------+-------------+---------------+----------------+ | order_id | customer_id | order_date | order_amount | customer_id | customer_name | customer_phone | +----------+-------------+------------+--------------+-------------+---------------+----------------+ | 1001 | 1 | 2023-01-01 | 50 | 1 | Alice | 11111111111 | | 1002 | 2 | 2023-01-02 | 80 | 1 | Alice | 11111111111 | | 1003 | 3 | 2023-01-03 | 180 | 1 | Alice | 11111111111 | | 1004 | 4 | 2023-01-04 | 220 | 1 | Alice | 11111111111 | | 1005 | 6 | 2023-01-05 | 250 | 1 | Alice | 11111111111 | | 1001 | 1 | 2023-01-01 | 50 | 2 | Bob | 22222222222 | | 1002 | 2 | 2023-01-02 | 80 | 2 | Bob | 22222222222 | | 1003 | 3 | 2023-01-03 | 180 | 2 | Bob | 22222222222 | | 1004 | 4 | 2023-01-04 | 220 | 2 | Bob | 22222222222 | | 1005 | 6 | 2023-01-05 | 250 | 2 | Bob | 22222222222 | | 1001 | 1 | 2023-01-01 | 50 | 3 | Carol | 33333333333 | | 1002 | 2 | 2023-01-02 | 80 | 3 | Carol | 33333333333 | | 1003 | 3 | 2023-01-03 | 180 | 3 | Carol | 33333333333 | | 1004 | 4 | 2023-01-04 | 220 | 3 | Carol | 33333333333 | | 1005 | 6 | 2023-01-05 | 250 | 3 | Carol | 33333333333 | | 1001 | 1 | 2023-01-01 | 50 | 4 | David | 44444444444 | | 1002 | 2 | 2023-01-02 | 80 | 4 | David | 44444444444 | | 1003 | 3 | 2023-01-03 | 180 | 4 | David | 44444444444 | | 1004 | 4 | 2023-01-04 | 220 | 4 | David | 44444444444 | | 1005 | 6 | 2023-01-05 | 250 | 4 | David | 44444444444 | | 1001 | 1 | 2023-01-01 | 50 | 5 | Eve | 55555555555 | | 1002 | 2 | 2023-01-02 | 80 | 5 | Eve | 55555555555 | | 1003 | 3 | 2023-01-03 | 180 | 5 | Eve | 55555555555 | | 1004 | 4 | 2023-01-04 | 220 | 5 | Eve | 55555555555 | | 1005 | 6 | 2023-01-05 | 250 | 5 | Eve | 55555555555 | +----------+-------------+------------+--------------+-------------+---------------+----------------+