This topic describes how to use the constraint relationships between primary and foreign keys to eliminate unnecessary joins and optimize execution plans.
Prerequisites
An AnalyticDB for MySQL cluster of V3.1.10 or later is created.
To query the minor version of an AnalyticDB for MySQL Data Lakehouse Edition cluster, execute the SELECT adb_version();
statement. To update the minor version of a cluster, contact technical support.
Overview
In most cases, large volumes of data are stored in large-scale database tables to perform complex query and analysis operations. You can use the JOIN statement to combine rows of two or more tables. However, join operations may degrade query performance. To optimize query performance, you can use join elimination. One of the optimization methods is to eliminate unnecessary joins by using primary and foreign key constraints. The constraints provide information about the relationships between tables to eliminate unnecessary joins, reduce query time, and improve database performance.
AnalyticDB for MySQL allows you to use the /*+ PK_FK_DEPENDENCY_ENABLED*/
hint to enable or disable the feature of using the primary and foreign key constraint information to eliminate unnecessary joins.
/*+ PK_FK_DEPENDENCY_ENABLED = true*/
: enables the feature of obtaining the primary and foreign key constraint information. After the feature is enabled, the system applies join elimination rules based on the primary and foreign key constraint information./*+ PK_FK_DEPENDENCY_ENABLED = false*/
: disables the feature of obtaining the primary and foreign key constraint information. After the feature is disabled, the system cannot apply join elimination rules based on the primary and foreign key constraint information.
PK: the primary key.
FK: the foreign key.
Procedure
Declare foreign keys
When you create a table, use the CREATE TABLE statement to add a foreign key.
For an existing table, use the ALTER TABLE statement to add or remove a foreign key.
You cannot use multiple columns as a foreign key. Example:
FOREIGN KEY (sr_item_sk, sr_ticket_number) REFERENCES store_sales(ss_item_sk,d_date_sk)
.You must check the data constraint relationships between primary keys and foreign keys. AnalyticDB for MySQL does not check data constraints.
When you use the
DROP TABLE
statement to remove foreign keys, AnalyticDB for MySQL does not check this operation or report an error.You cannot add foreign key constraints for external tables.
Use the CREATE TABLE statement to add a foreign key
Syntax
For information about the syntax, see CREATE TABLE.
Example
Create a database named
db
and create a table nameditem
in the database. Set the primary key toi_item_sk
.CREATE DATABASE db; USE db; CREATE TABLE item ( i_item_sk bigint NOT NULL, i_current_price bigint, PRIMARY KEY(i_item_sk) ) DISTRIBUTED BY HASH(i_item_sk);
Use the CREATE TABLE statement to add a foreign key.
Add a foreign key for the same database.
Create a table named
store_returns
in thedb
database. Use theFOREIGN KEY
clause to associate thesr_item_sk
column of the store_returns table with the primary key columni_item_sk
of theitem
table.CREATE TABLE store_returns ( sr_sale_id bigint, sr_store_sk bigint, sr_item_sk bigint NOT NULL, FOREIGN KEY (sr_item_sk) REFERENCES item (i_item_sk) );
NoteThe
symbol
parameter is not specified. In this case, the parser automatically usessr_item_sk_fk
as the name of the foreign key constraint.Add a foreign key for another database.
Create a database named
db2
, create a table namedstore_sales
in the database, and then add a foreign key column namedss_item_sk
.CREATE DATABASE db2; USE db2; CREATE TABLE store_sales ( ss_sale_id bigint, ss_store_sk bigint, ss_item_sk bigint not null, CONSTRAINT fk_constraint FOREIGN KEY (ss_item_sk) REFERENCES db.item (i_item_sk) );
Use the CREATE TABLE statement to declare multiple foreign keys for the same table
Example:
Create two tables named
customer
andvendor
.USE db; CREATE TABLE customer ( i_customer_sk bigint NOT NULL, i_current_price bigint, PRIMARY KEY(i_customer_sk) ) DISTRIBUTED BY HASH(i_customer_sk); CREATE TABLE vendor ( id bigint primary key, name varchar(5) not null );
Create a table named
store_product
. Associate thesr_sale_id
column of the store_product table with the primary key columnid
of thevendor
table. Then, associate thesr_customer_sk
column of the store_product table with the primary key columni_customer_sk
of thecustomer
table.CREATE TABLE store_product ( sr_sale_id bigint, sr_store_sk bigint, sr_customer_sk bigint NOT NULL, FOREIGN KEY (sr_sale_id) REFERENCES vendor (id), FOREIGN KEY (sr_customer_sk) REFERENCES customer (i_customer_sk) );
Query foreign keys
Use the SHOW CREATE TABLE
statement to query foreign keys.
Example:
USE db;
SHOW CREATE TABLE store_returns;
Sample result:
-- Result (other statements omitted):
-- CONSTRAINT `sr_item_sk_fk` FOREIGN KEY (`sr_item_sk`) REFERENCES `db`.`item`(`i_item_sk`)
Use the ALTER TABLE statement to add or remove a foreign key
Syntax
For information about the syntax, see ALTER TABLE.
Examples
In the following examples, a database named db
is used. The table for which you want to add a foreign key is named store_returns
. The primary table is named item
.
Remove a foreign key from the
store_returns
table.USE db; ALTER TABLE store_returns DROP FOREIGN KEY sr_item_sk_fk;
Add a foreign key for the
store_returns
table.USE db; ALTER TABLE store_returns ADD CONSTRAINT sr_item_fk FOREIGN KEY (sr_item_sk) REFERENCES item (i_item_sk);
Use the ALTER TABLE statement to declare multiple foreign keys for the same table
You can use one ALTER TABLE statement to add a foreign key for a table at a time. To add multiple foreign keys for a table, execute multiple ALTER TABLE statements.
Example:
Create a table named
store
in thedb2
database.USE db2; CREATE TABLE store ( id bigint primary key, name varchar(5) not null );
Add a foreign key for the
store_returns
table in thedb
database.ALTER TABLE db.store_returns ADD FOREIGN KEY (sr_store_sk) REFERENCES store(id);
Apply join elimination rules
If you want to query data from only one of the joined tables, you can use primary and foreign key constraints to eliminate unnecessary joins, simplify execution plans, and improve query performance. Join elimination is commonly used in the following scenarios:
Two tables in the same database are joined
For example, you want to query data from the store_returns
table after you perform a join operation on the store_returns
and item
tables.
/*+ PK_FK_DEPENDENCY_ENABLED = true*/
EXPLAIN
SELECT
s.sr_sale_id,
s.sr_store_sk,
s.sr_item_sk
FROM
store_returns s,
item
WHERE
sr_item_sk = i_item_sk;
Sample result:
+---------------+
| Plan Summary |
+---------------+
1- Output[ Query plan ] {Est rowCount: 1.0}
2 -> Exchange[GATHER] {Est rowCount: 1.0}
3 - TableScan {table: store_returns, Est rowCount: 1.0}
The execution plan does not contain the INNER JOIN operator. In this case, the optimizer can use the constraint relationship between the sr_item_sk
and i_item_sk
columns to eliminate the join between the store_returns
and item
tables.
Two tables across databases are joined
For example, you want to query data from the store_sales table after you perform a join operation on the item
table in the db
database and the store_sales
table in the db2
database.
USE db2;
-- Sample query statement
/*+ PK_FK_DEPENDENCY_ENABLED = true*/
EXPLAIN
SELECT
s.ss_sale_id,
s.ss_item_sk
FROM
store_sales s,
db.item
WHERE
ss_item_sk = i_item_sk;
Sample result:
+---------------+
| Plan Summary |
+---------------+
1- Output[ Query plan ] {Est rowCount: 1.0}
2 -> Exchange[GATHER] {Est rowCount: 1.0}
3 - TableScan {table: store_sales, Est rowCount: 1.0}
Multiple tables are joined
For example, you want to query data from the store_returns table after you perform a join operation on the store_returns
, item
, and store
tables.
USE db;
-- Sample query statement
/*+ PK_FK_DEPENDENCY_ENABLED = true*/
EXPLAIN
SELECT
s.sr_sale_id,
s.sr_store_sk,
s.sr_item_sk
FROM
store_returns s,
item,
db2.store
WHERE
sr_item_sk = i_item_sk
AND sr_store_sk = id;
Sample result:
+---------------+
| Plan Summary |
+---------------+
1- Output[ Query plan ] {Est rowCount: 1.0}
2 -> Exchange[GATHER] {Est rowCount: 1.0}
3 - TableScan {table: store_returns, Est rowCount: 1.0}
A view is created based on joined tables
A view may be created based on two or more tables that are joined. When you use the view in another query statement that does not reference specific information about the view, the optimizer can eliminate unnecessary joins. For example, create a view that contains information about the store_returns
and item
tables.
CREATE VIEW sr_item_v AS
SELECT
s.sr_store_sk AS store_name,
s.sr_sale_id AS sale_id,
s.sr_item_sk AS sr_item_id,
item.i_current_price AS item_price,
item.i_item_sk as item_id
FROM
store_returns s,
item
WHERE
sr_item_sk = i_item_sk;
If a query does not reference the item_price
column of the item
table, the optimizer can scan only the store_returns table, not the item
table.
/*+ PK_FK_DEPENDENCY_ENABLED = true*/
EXPLAIN
SELECT store_name, sr_item_id, sale_id
FROM sr_item_v;
Sample result:
+---------------+
| Plan Summary |
+---------------+
1- Output[ Query plan ] {Est rowCount: 1.0}
2 -> Exchange[GATHER] {Est rowCount: 1.0}
3 - TableScan {table: store_returns, Est rowCount: 1.0}
A foreign key constraint exists between the item.i_item_sk
and store_returns.sr_item_sk
columns. The optimizer can replace the reference to the item.i_item_sk
column with the reference to the store_returns.sr_item_sk
column. In this case, the optimizer can query data from columns of the store_returns
table and eliminate unnecessary joins.
Exceptional case of join elimination
In the preceding view example, if a query references columns of the item
table, you cannot eliminate joins.
-- Sample query statement
/*+ PK_FK_DEPENDENCY_ENABLED = true*/
EXPLAIN
SELECT store_name, sr_item_id, sale_id, item_price
FROM sr_item_v;
Sample result:
+---------------+
| Plan Summary |
+---------------+
1- Output[ Query plan ] {Est rowCount: 1.0}
2 -> Exchange[GATHER] {Est rowCount: 1.0}
3 -> InnerJoin[Hash Join] {Est rowCount: 1.0}
4 -> Project {Est rowCount: 1.0}
5 -> Exchange[REPARTITION] {Est rowCount: 1.0}
6 - TableScan {table: store_returns, Est rowCount: 1.0}
7 -> LocalExchange[HASH] {Est rowCount: 1.0}
8 -> ScanProject {table: item, Est rowCount: 1.0}
9 - TableScan {table: item, Est rowCount: 1.0}