Hot row update optimization - PolarDB - Alibaba Cloud Documentation Center

Hot rows are rows in a database that are frequently updated or modified. In high-concurrency scenarios, multiple transactions may attempt to simultaneously update hot rows, which leads to significant row lock contention and wait times. This contention can severely degrade system performance. To resolve this issue, PolarDB implements innovative optimizations at the database kernel level to mitigate lock contention and improve system performance.

Background information

Pain points:

When a transaction updates a row in a database, the row remains locked until the transaction is committed or rolled back. If multiple transactions attempt to simultaneously update the same row, only one transaction can continue. The other transactions must wait. For this serialization of updates on the same hot row, traditional database and table sharding policies may offer limited benefits to performance.
On e-commerce platforms, promotional strategies such as purchase limits and flash sales often lead to a surge of requests to update a hot row in the database in a short period of time. This results in severe row lock contention and prolonged wait times. Prolonged wait times for update requests degrade system performance and negatively affect user experience.

Enhancing hardware is not an effective solution to the preceding issue. To resolve this issue, PolarDB implements innovative optimizations at the database kernel level. This allows the system to automatically identify update requests for hot rows and group update requests for the same hot row at specific time intervals. Different groups of updates are processed in parallel by using a pipelining approach. The optimizations reduce transaction wait times and significantly improve system performance.

Solution

Transition from serial processing to pipeline processing
Parallel processing is the most effective method to improve database system performance. However, issues occur when the system processes requests to update the same hot row in parallel. PolarDB resolves the preceding issue by using an innovative pipelining approach, which maximizes the parallel processing of update requests for hot rows.
The SQL statements used to update hot rows are marked with the autocommit or COMMIT_ON_SUCCESS label. The optimized MySQL kernel layer can automatically recognize the marked update requests. The layer collects the update requests at specific time intervals and hashes the update requests to different buckets based on their primary or unique keys. Updates within each bucket are grouped and processed based on the order of arrival.
To use a pipelining approach to handle update requests, two execution units are used to manage the grouped operations. When the update requests in the first group are ready to be committed, the second group starts collecting new update requests. When the update requests in the second group are ready to be committed, the update requests in the first group are committed and new update requests are collected. The two groups operate in a cyclic manner, which enables parallel processing.
Multi-core CPUs are widely used in the industry. This pipelining approach can make full use of hardware resources, enhance CPU utilization, improve the parallel processing capability of database systems, and maximize the throughput of database systems.
Reduce wait times for acquiring row locks
When a row in a database is updated, the row is locked to ensure data integrity. If another operation attempts to access the same row, the operation must wait until the lock is available. If a lock request cannot be immediately processed, the request enters a waiting state. This increases processing delays and triggers deadlock detection, which consumes additional resources.
The solution introduces a method of grouping updates based on chronological order. The first update request in a group is called the leader. The leader reads and locks the row. The subsequent update requests in the same group are called followers. When followers request to lock the row, they first check if the leader already holds the lock. If the leader has the lock, the followers immediately obtain the lock.
This strategy minimizes the number of times a row lock is requested and reduces the required time to obtain locks. As a result, the overall performance of the database system is improved.
Reduce the traversals of B-tree indexes
MySQL manages data based on B-tree indexes. During each query, MySQL traverses all indexes to locate the required row. If the table is large and has multiple index levels, an extended period of time is required to traverse the indexes.
In the update request grouping mechanism described in the preceding section, only the leader of each group needs to traverse indexes to locate the relevant rows. After the leader updates the rows, the updated rows are cached in memory. After followers in the same group obtain the row locks, the followers can directly read the rows from memory without the need to traverse all indexes.
This mechanism reduces the number of index traversals and the time consumption.

Prerequisites

Your PolarDB cluster runs one of the following database engine versions:
- PolarDB for MySQL 5.6 whose revision version is 20200601 or later.
- PolarDB for MySQL 5.7 whose revision version is 5.7.1.0.17 or later.
- PolarDB for MySQL 8.0 whose revision version is 8.0.1.1.10 or later.
The binary logging feature is enabled for the PolarDB for MySQL cluster.
The rds_ic_reduce_hint_enable parameter is set to OFF.

Note

For information about how to update the minor database engine version of a cluster, see Minor version update.
For information about how to modify the cluster parameters, see Configure cluster and node parameters.
For compatibility with MySQL configuration files, all parameters in the PolarDB console console are prefixed loose_. If you want to modify the rds_ic_reduce_hint_enable parameter in the PolarDB console, use the MySQL-compatible version that has the loose_ prefix, which is loose_rds_ic_reduce_hint_enable.

Limits

The hot row update optimization feature is not supported in the following scenarios:

The table that contains the hot rows is partitioned.
A trigger is defined for the table that contains the hot rows.
The statement queue mechanism is applied to hot rows.
If global binary logging is enabled but session-level binary logging is disabled, the hot row update optimization feature cannot be used for UPDATE statements.

Usage

Enable the hot row update optimization feature
You can use the following parameter in the PolarDB console to enable or disable the hot row update optimization feature.
Parameter
Description
hotspot
Specifies whether to enable the hot row update optimization feature. Valid values:
ON
OFF (default)
Note
- For more information about how to modify the cluster parameters, see Configure cluster and node parameters.
- For compatibility with MySQL configuration files, all parameters in the PolarDB console console is prefixed with loose_. If you want to modify the hotspot parameter in the PolarDB console, use the MySQL-compatible version that has the loose_ prefix, which is loose_hotspot.

The hot row update optimization feature involves three new optimizer hints.

Hint	Required?	Description
COMMIT_ON_SUCCESS	Required	Commits a transaction when the update is successful.
ROLLBACK_ON_FAIL	Optional	Rolls back a transaction when the update fails.
TARGET_AFFECT_ROW(1)	Optional	Explicitly specifies that the request updates only one row. If more than one row or no row is updated, the update fails.

Note

A transaction is automatically committed when a hint in the transaction takes effect. Therefore, you must place a hint in the last SQL statement of the transaction.

Example: Update the values of column c in the sbtest table.

UPDATE /*+ COMMIT_ON_SUCCESS ROLLBACK_ON_FAIL TARGET_AFFECT_ROW(1) */ sbtest SET c = c + 1 WHERE id = 1;

Related operations

Custom parameter configuration

You cannot modify the following parameters in the PolarDB console. To modify the parameters, go to Quota Center, find the PolarDB hotspot row parameter adjustment quota name, and click Apply in the Actions column.

Parameter	Description
hotspot_for_autocommit	Specifies whether to enable the hot row update optimization feature for the `UPDATE` statements in `auto-commit` mode. Valid values: ON OFF (default)
hotspot_update_max_wait_time	The maximum time that the leader in a group waits for the followers to join the group during the process during the grouping and processing of update requests for the same data row. Unit: microseconds. Default value: 100 μs.
hotspot_lock_type	Specifies whether to use a new type of row lock during the grouping and processing of update requests for the same data row. Valid values: ON OFF (default) Note If you set this parameter to true, followers can directly use the lock for the same row after the leader in the group obtains the lock.

View parameter configurations

You can execute the following statement to view the parameter configurations for the hot row update optimization feature:

SHOW variables LIKE "hotspot%";

Sample result:

+------------------------------+-------+
|Variable_name                 | Value |
+------------------------------+-------+
|hotspot                       | OFF   |
|hotspot_for_autocommit        | OFF   |
|hotspot_lock_type             | OFF   |
|hotspot_update_max_wait_time  | 100   |
+------------------------------+-------+

View the usage of the hot row update optimization feature

You can execute the following statement to view the usage of the hot row update optimization feature:

SHOW GLOBAL status LIKE 'Group_update%';

Performance testing

Test table and statement:

Table

CREATE TABLE sbtest (id INT UNSIGNED NOT NULL, c BIGINT UNSIGNED NOT NULL, PRIMARY KEY (id));

Statement

UPDATE /*+ COMMIT_ON_SUCCESS ROLLBACK_ON_FAIL TARGET_AFFECT_ROW(1) */ sbtest SET c = c + 1 WHERE id = 1;

Test tool: Sysbench.
Test scenarios and results
- Scenario 1: one hot row and an 8-core CPU
  In this scenario, an 8-core CPU is used and a large number of concurrent requests are sent to update a hot row. After the hot row update optimization feature is enabled, the performance of the database is improved by up to 64 times.
- Scenario 2: one hot row and a 32-core CPU
  In this scenario, a 32-core CPU is used and a large number of concurrent requests are sent to update a hot row. After the hot row update optimization feature is enabled, the peak queries per second (QPS) are increased by 63 times. When the number of concurrent requests reaches 8,000, the performance of the database is improved by 46 times.
- Scenario 3: eight hot rows and a 32-core CPU
  In this scenario, a 32-core CPU is used and a large number of concurrent requests are sent to update eight hot rows. After the hot row update optimization feature is enabled, the peak QPS is improved by 20 times.
  When the hot row update optimization feature is disabled and the number of concurrent requests reaches 16,000, database failures occur and no results are returned. In contrast, when the hot row update optimization feature is enabled, row data is updated as expected and the QPS value remains stable.