OceanBase is 100% proprietary. It has supported Double 11 steadily for seven consecutive years and launched a new city-level disaster recovery standard of three places and five centers. It is the only domestic native distributed database that has set world records in both TPC-C and TPC-H tests. It officially opened its source code in June 2021. The query optimizer is the core module of the relational database system, the focus and difficulty of database kernel development, and the measure of the maturity of the entire database system.
We will write a series of articles to help you understand the OceanBase query optimizer better. Each article will help you understand the essence of query rewriting, comprehend the equivalence of complex SQL, and write efficient SQL.
The OceanBase Optimizer Team, led by Xifeng (OceanBase Senior Technical Expert) and Shanwen (Technical Expert), is committed to building the world's leading distributed query optimizer.
This query rewriting series includes subquery optimization, aggregate function optimization, window function optimization, complex expression optimization, and more modules. Please stay tuned.
This article is Part 1 of the OceanBase Rewriting Series: OceanBase query rewriting practices. Most users usually use SELECT... FROM... WHERE operations when operating databases. GROUP-BY, ORDER-BY, and LIMIT are added in more complex statements. Features (like window functions) are not used in most cases. When you want to write a query statement, you must use your familiar grammar to write a semantically correct query. Intuitive and correct SQL is already a good SQL for users. Ee can write different SQL statements to complete the same query. For example, some application developers write SQL on the left side of figure (1) to implement "update type = 10 in the T table and order the smallest three rows by k." Semantically, this SQL has no problem. However, from a performance point of view, this SQL is not good enough. If no optimization is performed, the kernel will execute the subquery to calculate the set of id
that meet the conditions first. Then, it will execute the outer main query to update the corresponding rows. This is not a good SQL for the kernel. We can use SQL on the right side of figure (1) to accomplish the same function.
Figure 1: Example of Query Rewriting
When the kernel executes this rewritten query, it only needs to access the T
table once to find the rows that meet the conditions and update them directly. The service may write inefficient SQL statements since it does not have a comprehensive understanding of the syntax nor much about the way the kernel executes.
If every user that uses the database is a senior expert proficient in SQL writing with a deep understanding of the core processing of the database, ze can write good SQL. However, this is not what the language was originally designed to be. The goal of SQL is to make users only care about correctly describing their needs. How to handle it efficiently is the job of the database kernel. Therefore, the kernel development implements the query rewriting module, which is responsible for rewriting the user's good SQL equivalent to a kernel's good SQL. This article describes OceanBase's practices in query rewriting.
After OceanBase receives an SQL statement, it parses the syntax and converts it to an internal STMT structure. The input of the query rewriting module is an STMT S1
. After a series of rewriting algorithms, it is equivalent to another STMT S2
. The output S2
will be used to generate logical and physical execution plans. Query rewriting is essentially a pattern matching process. It traverses each object in the STMT structure to check whether the object matches the rewriting pattern. If satisfied, it will be rewritten to another form equivalently. Let’s take the constant-truth condition elimination in figure (2) as an example. The query rewriting traverses all the condition expressions in the S1
and checks whether they are constant truth conditions. Here, the 1=1
is a constant truth condition, which is removed by the rewriting algorithm.
Figure 2: Input and Output of Query Rewriting
The query rewriting module contains many rewriting algorithms. Different rewriting algorithms match different patterns and perform corresponding equivalent transformations. A complete query rewrite attempts to optimize the input STMT with all rewrite algorithms. The query rewriting framework is responsible for iterating various rewriting algorithms to optimize STMT. On the whole, we hope each query rewrite can make SQL better. However, not all rewrite algorithms will transform SQL in a good direction. We divide the rewriting algorithm into the following two categories:
Let's use the following example to explain why some rewrites are triggered based on cost.
Q1:
SELECT * FROM T1 WHERE C1 < 20000 OR C2 < 30 ;
=>
Q2:
SELECT /*SEL_1*/ * FROM T1 WHERE C1 < 20000 UNION ALL
SELECT /*SEL_2*/ * FROM T1 WHERE C2 < 30 AND LNNVL (C1 < 20000);
In OceanBase, Q1 is rewritten by OR Expansion to Q2. Semantically, the two queries are equivalent. Q1 finds the set of records that meet the C1 < 20000
or C2 < 30
. In Q2, SEL_1
finds the set of records that meets the C1 < 20000
, SEL_2
finds the set of records that meets the C2 < 30
and does not meet the C1 < 20000
, and the union of the two parts is the final result. It is worth noting that SEL_2
uses LNNVL(...)
instead of C1 >= 20000
to find records that do not meet the C1 < 20000
. Examining the following example, given the value of r1 (C1, C2) as (NULL, 40), we expect that SEL_2
can return r1. However, if C1 >= 20000
is used, r1 will be filtered out.
For this example of OR Expansion, consider the following two scenarios:
SEL_1
can use the index on C1, and SEL_2
can use the index on C2. If these two filter conditions have strong filtering, index scanning can reduce the overhead of reading data. As such, triggering OR Expansion rewrite helps generate a better execution plan.The author thinks the concept behind each type of rewrite algorithm is not complicated. If a person understands some rewrite rules, it is easy to manually rewrite SQL and convert it equivalently into a friendly form for the database kernel. However, it is challenging to implement these rewrite algorithms in the kernel. Rewriting SQL at the business layer is simple because our task is only to rewrite a known SQL correctly. Rewriting at the kernel layer is difficult since our task is to give any SQL. If we can rewrite it, we should rewrite it as much as possible, and the rewrite must be correct. This puts forward two requirements for the rewriting algorithm:
Correctness is easy to understand. If the result of a rewrite is wrong, it is negative for the value of the business. Completeness is equally important. It requires a rewriting algorithm to have good generality and can deal with complex scenes. With poor generality, the value of the business is limited. Of course, it is difficult for us to make a rewriting algorithm complete. Some complicated situations are difficult to deal with, and forced rewriting may introduce correctness issues. We will be as complete as possible on the premise of ensuring the correctness in the process of implementing a rewriting algorithm. In the following, we will explain the challenges of achieving a complete rewrite under the correctness premise through outer join elimination.
The outer join operation can be divided into left outer join, right outer join, and full outer join. During the process, the left and the right order of outer join is unable to be transformed, which limits the optimizer's choice of join order. Consider the left outer join (L LEFT JOIN R ON L.ID = R.ID
) of L and R. If a row in L is not joined to any row in R, the result will fill the output of the column of R (please see figure 3). You can see the result of the left outer join is a superset of the inner join, which mainly increases the rows generated by the R table.
Figure 3: Differences between Inner and Outer Join
SELECT L.ID, L.C1, R.C2 FROM L LEFT JOIN R ON L.ID = R.ID WHERE R.C2 ='XXX';
/* OuterJoin Elimination * /
SELECT L.ID, L.C1, R.C2 FROM L, R WHERE L.ID = R.ID AND R.C2 ='XXX';
Let’s consider the result of the L LEFT JOIN R
. If there is a filter condition R.C2 = 'XXX'
, the rows generated by filling will be filtered out. The result sets produced by the outer and inner join are the same. We can rewrite the outer join to the inner join. This helps the optimizer consider more of the join order and algorithm. We call condition R.C2 = 'XXX'
a reject-NULL condition in the outer join elimination. The biggest challenge is that given any expression, we can accurately determine whether it is a reject-NULL condition. Let’s consider the following scenario:
It is relatively easy to rewrite the first one, and the second one is more difficult. OceanBase implements the "NULL passing" determination of the expression. Given a nested expression and a set of parameters, it calculates whether the output of the expression is NULL when the input of the parameter is NULL. Based on this determination mechanism, OceanBase supports rewriting in the preceding two scenarios.
We have analyzed the reject-NULL conditions composed of a single column. All columns in the right table are filled in the result of the left outer join. Therefore, we can determine whether the filter expression composed of multiple right table columns is a reject-NULL condition. For example, the following SQL filter condition is an OR condition. Only considering R.C2
, R.C2 IS NOT NULL
is not a reject-NULL condition for the entire SQL. However, if both R.C2
and R.C3
are considered, the OR condition is a reject-NULL condition.
SELECT L.ID, L.C1, R.C2 FROM L LEFT JOIN R ON L.ID = R.ID
WHERE R.C2 IS NOT NULL OR R.C3 IS NOT NULL;
As you can see, outer join elimination is not complicated. It is easy for us to determine whether it is a reject-NULL condition for some simple scenes (i.e., R.C2 directly appears on one side of the judgment expressions). It is also relatively easy to determine some complex scenes (i.e., R.C2 is nested in four arithmetic operations (+,-,*,/)). The difficulty of judging is increased in some more complex scenarios (i.e., R.C2 is nested in some function expressions CASE WHEN, NVL, etc.). In terms of outer join elimination, we first need the abstract reject-NULL property to rewrite correctly. We also need the NULL passing property of abstract expressions to improve the generality of rewriting. Furthermore, the algorithm should be able to identify the reject-NULL condition formed by multiple NULL columns.
We have described how to determine whether a predicate is a reject-NULL predicate as complete as possible. Another important question is whether all reject-NULL predicates can be used to rewrite outer join to inner join. Compared with the previous problem, this problem is of higher importance. It forms the correctness boundary of outer join elimination. If the correctness boundary of the rewrite rule is not accurately summarized, the query rewrite will change the semantics of the business SQL statement, resulting in errors in the query results.
SELECT L.ID, R.ID, T.ID FROM (L LEFT JOIN R ON L.ID = R.ID)
LEFT JOIN T ON (R.ID = T.ID);
Let’s consider the preceding query. The result of the L LEFT JOIN R
will be outer joined with the T
again. The join predicate is R.ID = T.ID
. This predicate contains columns L LEFT JOIN R
in the right table. Independently, this predicate does have the property of reject-NULL. However, as a join predicate of an outer join, even if the judgment result of this predicate is non-true, it will not filter the data on the left side of the outer join. Therefore, although the R.ID = T.ID
determination result is not true when the R.ID
value is NULL, this predicate cannot be used to rewrite the outer join to the inner join.
The query rewriting module of OceanBase implements many rewriting rules. As described in Section 2, rewriting rules are divided into rule-based and cost-based.
After the SQL is parsed, an internal structure (STMT) is formed, which is then handed over to the query to rewrite for equivalence changes. The rewrite framework is responsible for trying each rewrite algorithm in a specific order. When a cost rewrite is encountered, it will trigger additional cost verification. The framework will continue to iterate over the rewrite strategy until the current STMT reaches the convergence state and no more rewrites can be triggered.
In the rewrite framework, the trigger order of each rewrite algorithm is important. Some rewrites need to be triggered before others. There are typical scenarios, such as view merging and predicate movement. View merging attempts to merge multiple query blocks into one, and predicate movement directly pulls up or down the predicate on multiple query blocks. Rewriting the framework will first attempt view merging before triggering predicate movement. This is because if view merging can be successful, it is no longer necessary to consider moving predicates between multiple query blocks. The query rewriting module of OceanBase carefully considers the relationships between the various rewriting algorithms and then determines their trigger order.
Another important issue in the rewrite framework is cost verification. Triggering a cost rewriting may or may not generate a better execution plan. The rewrite framework contains a module for cost verification. The framework iterates over the rewritten results for the rule rewrite set in the process of cost verification. After the cost is rewritten, there may be some new rewriting opportunities. Finally, the query optimizer generates an execution plan for the STMT before and after the rewrite. It determines whether the result after the rewrite can reduce the cost of the entire execution plan.
This article mainly introduces the query rewriting of OceanBase. It briefly introduces the rewriting methods, the types of rewriting algorithms, the main problems when implementing each rewriting algorithm, and how multiple rewriting rules are organized under a rewriting framework. Query rewriting is the focus and difficulty of the query optimizer module. It is also the basic knowledge that both SQL performance tuning workers and DBAs need to master. OceanBase implements many rewrite rules. This article provides an overview of the entire rewrite module. Based on this article, we will systematically launch a series of query rewriting articles to introduce different rewriting rules/algorithms implemented by OceanBase. We will introduce OceanBase's experience in optimizing different relational algebra and calculation expressions, such as subqueries, grouping operations, window functions, join types, and expressions. We believe this series of articles can be a references for readers. You are welcome to communicate and discuss the query rewriting issues with us.
An Interpretation of OceanBase Database Source Code (1): Module Structure
OceanBase - August 25, 2022
OceanBase - August 25, 2022
OceanBase - August 25, 2022
OceanBase - September 9, 2022
OceanBase - May 30, 2022
OceanBase - September 13, 2022
A financial-grade distributed relational database that features high stability, high scalability, and high performance.
Learn MoreAn on-demand database hosting service for SQL Server with automated monitoring, backup and disaster recovery capabilities
Learn MoreLeverage cloud-native database solutions dedicated for FinTech.
Learn MoreMigrate your legacy Oracle databases to Alibaba Cloud to save on long-term costs and take advantage of improved scalability, reliability, robust security, high performance, and cloud-native features.
Learn MoreMore Posts by OceanBase