By Guoqing Ma (Yiyun)
When you get started to learn about databases, you will always be asked a question: What is the biggest advantage of database systems over file systems? There are many specific advantages, and one of the most important parts is that the database system can perform better concurrent access control.
So, how does the database system perform concurrent access control?
This series of articles takes MySQL 8.0.35 code as an example, and is divided into two parts to provide an overall introduction to concurrent access control in MySQL. This article is part one, which focuses on table-level concurrent access control.
Next part: https://www.alibabacloud.com/blog/how-does-mysql-implement-concurrency-control-part-ii_601785
According to popular concepts in recent years, MySQL is a typical storage-computing separation architecture, with MySQL Server as the computing layer and Storage Engine as the storage layer. Therefore, concurrent access control also needs to be handled separately at the computing layer and the storage layer. It is worth noting that MySQL supported multiple storage engines at the beginning of its design, which is also one of the important reasons for MySQL's rapid popularity. However, with the development of MySQL, InnoDB has basically become the dominant engine in the era of MySQL 8.0. Therefore, the subsequent analysis in this article mainly focuses on the InnoDB engine.
When it comes to data access, MySQL data is divided into tables, rows, and columns from the perspective of users. The internal perspective of MySQL includes table, tablespace, index, B+tree, page, row, and column. In MySQL 8.0, a table occupies a tablespace exclusively by default, so for the sake of simplicity, the following content does not distinguish between tables and tablespaces.
Returning to the topic, the concurrent access control in MySQL is also designed based on the internal data structure of MySQL, including:
The subsequent content will focus on the above three parts, and this article will focus on "table-level concurrent access control".
In the process of using the database, an operation that cannot be bypassed is DDL, especially DDL operations directly on the database running online. One question MySQL users often wonder is: "Will my DDL lock the table? I am afraid that my business may suffer". The reason for this question is that in earlier MySQL versions (before 5.6), DML operations cannot be performed during DDL. As a result, if DDL operations are performed on a large table, the data cannot be written into the business for a long time. To reduce the application to the business during DDL operations, many third-party DDL functions are derived, among which one of the most used is pt-online-schema-change.
In fact, MySQL has already supported Online DDL operations since MySQL 5.6. By MySQL 5.7, the support scope of Online DDL operations has been further expanded. By MySQL 8.0, MySQL has further supported the Instant DDL function. Therefore, executing DDL operations on MySQL will basically not affect the business.
For a detailed introduction to Online DDL, please refer to the official documentation[1]. If you want to directly read the simplified version, you can refer to the article[2] compiled by the author earlier.
[1] Official Documentation: https://dev.mysql.com/doc/refman/8.4/en/innodb-online-ddl-operations.html
[2] http://mysql.taobao.org/monthly/2021/03/06/
Whether DDL will lock the table is actually the most important issue in table-level concurrent access control. The most important structure for implementing concurrent access to DDL, DML, and DQL in MySQL is the MDL lock. Let's look at a simple example:
CREATE TABLE `t1` (
`id` int NOT NULL,
`c1` int DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB;
INSERT INTO t1 VALUES (1, 10);
INSERT INTO t1 VALUES (2, 20);
INSERT INTO t1 VALUES (3, 30);
In the preceding example:
Why does the preceding situation occur? Here, the wait relationship can be seen by combining the metadata_locks table under the performance_schema:
It can be seen that:
From the code path, the MDL locking logic is in the process of opening the table, and the specific entry function is open_and_process_table
. The specific function stack is as follows:
|--> open_and_process_table
| |--> open_table
| | |--> mdl_request.is_write_lock_request
| | |--> thd->mdl_context.acquire_lock // Request the global MDL lock
| | |
| | |--> open_table_get_mdl_lock
| | | |--> thd->mdl_context.acquire_lock // Request the table MDL lock
The entry function for upgrading the MDL lock logic during DDL is mysql_alter_table
. The specific function stack is as follows:
|--> mysql_alter_table
| |--> mysql_inplace_alter_table
| | |--> wait_while_table_is_used
| | | |--> thd->mdl_context.upgrade_shared_lock // Upgrade the MDL lock
| | | | |--> acquire_lock // Request the table MDL EXCLUSIVE lock
Through the above simple example, we know the basic concept of the MDL lock and the so-called reason why DDL causes "table lock". Strictly speaking, an MDL lock is not a table-level lock, but a metadata lock. For a more in-depth introduction to MDL, please refer to the article. This article will not elaborate here. MySQL introduced the MDL lock in version 5.6, so with the MDL lock, is there no need for other table-level locks?
Before answering the above question, let's take a look at the basic process of handling table-level locks at the MySQL Server layer. A table-level lock is required for any operation on a table in MySQL. The specific entry function is lock_tables
. The specific function stack is as follows:
|--> lock_tables
| |--> mysql_lock_tables
| | |--> lock_tables_check // Determine whether a lock is required
| | |--> get_lock_data // Calculate the number of tables to be locked and initialize the MYSQL_LOCK structure
| | | |--> file->lock_count
| | |
| | |--> lock_external
| | | |--> ha_external_lock // Call the engine handler interface
| | |
| | |--> thr_multi_lock
| | | |--> sort_locks
| | | |--> // Traversal locks
| | | |--> thr_lock // Lock or wait
| | | | |--> wait_for_lock // Waiting for the table-level lock
As you can see from the stack above, the entire locking process includes the following steps:
For the InnoDB engine, the lock_count
interface directly returns 0, indicating that the tables of the InnoDB engine do not need to be locked at the Server layer, and all processing is completed directly in the external_lock
interface, which will be elaborated in the subsequent section. For other engines, taking the CSV engine as an example, the lock_count
interface returns 1, so you need to enter the subsequent thr_lock
locking logic. The types of thr_lock
locks and the conflicting relationships between different types of locks will not be elaborated here.
In a narrow sense, the lock added by the thr_lock
interface is the table-level lock at the Server layer. The specific locking logic, mutually exclusive relationship of lock types and lock wait logic will not be elaborated here. If you are interested, you can refer to the related code for understanding.
As mentioned earlier, the lock_tables interface at the Server layer calls the Handler interface at the Engine layer. Specifically, it calls the external_lock interface. Does InnoDB add table-level locks in this interface? Let's take a look at the call stack of the function first:
| |--> // lock_type == F_WRLCK
| |--> m_prebuilt->select_lock_type = LOCK_X
| |
| |--> // lock_type == F_RDLCK && trx->isolation_level == TRX_ISO_SERIALIZABLE
| |--> m_prebuilt->select_lock_type = LOCK_S
| |
| |--> // others
| |--> m_prebuilt->select_lock_type == LOCK_NONE
|--> row_search_mvcc
| |--> lock_table(..., prebuilt->select_lock_type == LOCK_S ? LOCK_IS : LOCK_IX, ...)
As can be seen from the stack above, when entering the locking logic of the InnoDB layer:
The specific types of table-level locks at the InnoDB layer and the conflicting relationships between different types of locks will not be elaborated here. Table-level locks at the Engine layer can be viewed in the data_locks table under the performance_schema:
The MDL locks in MySQL and the table-level locks at the Server and InnoDB layers have been introduced in the above section. What locks are added when performing the LOCK TABLES operation? First, take a look at the execution path of the LOCK TABLES operation:
|--> mysql_execute_command
| |--> // switch (lex->sql_command)
| |--> // SQLCOM_LOCK_TABLES
| |--> trans_commit_implicit // Implicitly commit the previous transaction
| |--> thd->locked_tables_list.unlock_locked_tables // Release the previous table-level lock
| |--> thd->mdl_context.release_transactional_locks // Release the previous MDL lock
| |
| |--> lock_tables_precheck
| |--> lock_tables_open_and_lock_tables
| | |--> open_tables
| | | |--> lock_table_names // Lock the table based on the table name (the table has not been opened at this time)
| | | | |--> mdl_requests.push_front
| | | | |--> thd->mdl_context.acquire_locks
| | | |
| | | |--> open_and_process_table
| | |
| | |--> lock_tables
As you can see from the stack above, for an explicit LOCK TABLES operation:
We will use a table to summarize the locking conditions of different LOCK TABLES operations (InnoDB table):
MDL Lock | Table-level Locks at the Server Layer | Table-level Locks at the InnoDB Layer | |
---|---|---|---|
LOCK TABLES t1 READ; | SHARED_READ_ONLY of the table | None | None |
LOCK TABLES t1 WRITE; | SHARED_NO_READ_WRITE of the table, INTENTION_EXCLUSIVE of the database, Global INTENTION_EXCLUSIVE |
None | None |
FLUSH TABLES WITH READ LOCK; | Global SHARED | None | None |
For the issues caused by table-level locks in MySQL, here are two common online cases:
It is the example raised earlier in the introduction of MDL locks. In fact, this kind of problem is relatively easy to find. You can see a large number of MDL lock waits by directly executing the show processlist. Here we mainly focus on how to deal with this kind of problem. There are mainly two methods to deal with:
A typical scenario is to enable general_log and set the output format to TABLE. Since the genelog_log table is a CSV engine, you need to use the table-level lock at the Server layer to control concurrent inserts. When the write volume is large, the performance bottleneck occurs when writing data to the CSV table. Phenomenally, a large number of connections are "waiting for table level lock". There is no good way to optimize the write performance of the CSV table, so the best way to deal with it is to directly close general_log.
The above content introduces the table-level locking process, let's make a summary:
open_and_process_table
interface), the lock type of MDL needs to be determined according to the type of operation (in fact, most requests have already completed the initialization of MDL requests in the lexical parsing phase).lock_tables
interface will be called to add table-level locks at different locations according to the type of operation. Table-level locks are divided into table-level locks at the Server layer and table-level locks at the Engine layer:external_lock
interface of the Engine layer to add the table-level lock at the Engine layer (It can be known through the previous code stack that it is only to determine the type of locking that needs to be added later, and the locking operation is post-placed). There is no need to add a table-level lock at the Server layer.external_lock
interface, so you need to add a table-level lock at the Server layer.Sevenfold Performance Improvement | Alibaba Cloud AnalyticDB Spark Vectorization Capability Analysis
ApsaraDB - November 22, 2024
Alibaba Clouder - February 3, 2021
Alibaba Cloud Community - May 2, 2024
ApsaraDB - June 1, 2022
Alibaba Cloud Community - May 3, 2024
ApsaraDB - February 19, 2020
Alibaba Cloud PolarDB for MySQL is a cloud-native relational database service 100% compatible with MySQL.
Learn MoreAnalyticDB for MySQL is a real-time data warehousing service that can process petabytes of data with high concurrency and low latency.
Learn MoreAn on-demand database hosting service for MySQL with automated monitoring, backup and disaster recovery capabilities
Learn MoreApsaraDB Dedicated Cluster provided by Alibaba Cloud is a dedicated service for managing databases on the cloud.
Learn MoreMore Posts by ApsaraDB