Performance of PolarDB-X Global Binlog Interpretation
This article introduces the design and thinking behind the PolarDB-X global binlog from a performance perspective. We first demonstrate its performance through several real test cases, and then use those cases to explain the optimization story in depth.
Test preparation
Prepare a PolarDB-X 2.0 instance. The version of the instance used in this test is 5.4.14-16576195. The instance configuration is as follows:
*Instance topology: 8CN+8DN+2CDC
*Single CN node specification: 32 cores, 128GB
*Single DN node specification: 32 cores, 128GB
*Single CDC node specification: 16 cores, 32GB
Prepare two ECS stress-test machines, each configured with 64 cores and 128GB
Terminology
*EPS
Events Per Second: the number of events written to the binlog file per second
*DML EPS
DML Events Per Second: the number of DML events written to the binlog file per second. A DML event here refers to a TableMapEvent, WriteRowsEvent, UpdateRowsEvent, or DeleteRowsEvent in the binlog
*BPS
Bytes Per Second: the number of bytes written to the binlog file per second. For convenience, M/s is used as the unit of measurement
*TPS
Transactions Per Second: the number of transactions written to the binlog file per second
*FPM
Files Per Minute: the number of binlog files generated per minute; a single file is 500M
*Delay Time
The latency of the global binlog, in ms
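As a quick illustration of how these metrics relate: since each binlog file is capped at 500M, FPM follows directly from BPS. A minimal sketch (the write rate below is an assumed value, not a measured one):

```shell
# Illustrative only: derive FPM from BPS given the 500M-per-file cap.
bps=250                      # assumed write rate in M/s, not a measured value
fpm=$(( bps * 60 / 500 ))    # 500M files generated per minute
echo "$fpm"
```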
Test plan
TPCC
Refer to: https://help.aliyun.com/document_detail/405018.html
In this test case, the configuration of TPCC core parameters is as follows:
*warehouses=2000
*loadWorkers=500
*terminals=1024
JVM parameters in runLoader.sh: -Xms60g -Xmx60g
JVM parameters in runBenchmark.sh: -Xms60g -Xmx60g
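These parameters typically live in the BenchmarkSQL-style props file shipped with the TPCC package. A hedged excerpt (connection settings omitted; only the three values below come from this article, and the key names assume the standard tool layout):

```properties
# Hypothetical excerpt of the TPCC props file; connection settings elided
warehouses=2000
loadWorkers=500
terminals=1024
```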
Scenario 1: TPCC data import
Test purpose
When pressure-test data is imported, the DN nodes instantaneously generate a large volume of physical binlogs; observe the performance indicators of the global binlog under this load
Test method
Deploy multiple TPCC packages on each ECS and run multiple ./runDatabaseBuild.sh instances at the same time to generate traffic
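The fan-out pattern can be sketched as follows. `run_loader` is a stand-in for the real ./runDatabaseBuild.sh, which this sketch does not assume is installed; the point is running several loaders concurrently to maximize write load on the DN nodes:

```shell
# Sketch of the parallel-loader pattern; run_loader is a hypothetical
# stand-in for ./runDatabaseBuild.sh in each deployed TPCC directory.
run_loader() {
  echo "loader $1 finished"
}
for i in 1 2 3; do
  run_loader "$i" &    # each loader runs concurrently to maximize DN writes
done
wait                   # block until every background loader exits
```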
Scenario 2: TPCC transaction test
Test purpose
Execute TPCC test, simulate real transaction scenarios, and investigate the performance of global binlog (focus on delay)
Test method
Adjust the stress-test concurrency to construct different tpmC reference values and observe the global binlog delay. Since the global binlog latency of 8CN+8DN remains low even at full pressure, the following tests are not limited to the 8CN+8DN topology
Sysbench
Refer to: https://help.aliyun.com/document_detail/405017.html
Scenario 1: Sysbench data import
Test purpose
When pressure-test data is imported, the DN nodes instantaneously generate a large volume of physical binlogs; observe the performance indicators of the global binlog under this load
Test method
Adjust the values of the --tables and --threads parameters, and test the performance indicators of the global binlog under different pressure levels
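A hedged sketch of the sweep: print one `prepare` invocation per pressure level. The table counts, row counts, and thread counts below are illustrative, and the connection flags for the real CN endpoint are omitted:

```shell
# Illustrative sweep of --tables/--threads for the data-import phase;
# all numbers are assumed, and connection flags are omitted.
for tables in 16 32 64; do
  echo "sysbench oltp_write_only --tables=$tables --table-size=1000000 --threads=$((tables * 2)) prepare"
done
```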
Scenario 2: Sysbench oltp_write_only
Test purpose
Execute sysbench oltp_write_only to test the global binlog performance in a mixed-write scenario
Test method
Execute oltp_write_only, construct different QPS reference values, and observe the global binlog latency
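A hedged sketch of the run phase: vary --threads to hit different QPS targets while watching the global binlog delay. The --time and --report-interval values are assumed, and real connection flags are omitted:

```shell
# Illustrative run-phase invocation; all values are assumed, and the
# CN endpoint connection flags are omitted.
threads=128
run_cmd="sysbench oltp_write_only --tables=32 --threads=$threads --time=300 --report-interval=10 run"
echo "$run_cmd"    # inspect, then run against the CN endpoint
```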
Large Transaction
Test purpose
Test the performance and stability of CDC in very large transaction scenarios, focusing on the delay time
Test method
Refer to the following script to construct transactions of different sizes for testing. With this script, roughly one 500M transaction is produced for every 200,000 rows inserted
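The article's original script is not shown here; the following is an illustrative sketch only. It emits one explicit transaction as SQL, and per the ratio above, scaling `rows` to about 200,000 of this shape would yield roughly a 500M transaction (table name, columns, and row width are hypothetical):

```shell
# Illustrative only -- not the article's original script. Emit a single
# explicit transaction as SQL; t_large and the 2048-byte pad are made up.
rows=5                         # demo size; use ~200000 per 500M in practice
{
  echo "BEGIN;"
  for ((i = 1; i <= rows; i++)); do
    echo "INSERT INTO t_large (id, pad) VALUES ($i, REPEAT('x', 2048));"
  done
  echo "COMMIT;"
} > big_txn.sql
wc -l < big_txn.sql            # rows + BEGIN + COMMIT
```

Feed the generated file to the instance (e.g. via the mysql client) to replay the transaction at the desired size.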
Large Transaction
Let's introduce some parameters first
storage.isPersistOn
Whether to enable the swap capability, i.e., whether data can be temporarily swapped to disk (RocksDB) when memory is insufficient or very large events are encountered. The default value is true
storage.persist.mode
There are two persistence modes: AUTO and FORCE. In AUTO mode, the system automatically decides whether to swap data to disk based on memory utilization; in FORCE mode, the system always swaps data to disk
storage.persistNewThreshold
Threshold for triggering swap when new data is added to memory. That is, by default, once memory utilization reaches 85%, newly added data is swapped to disk
storage.persistAllThreshold
Threshold for triggering swap of existing data in memory. That is, by default, once memory utilization reaches 95%, the data already held in memory is swapped to disk
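Put together, a hedged example of these settings. The exact configuration file location and value format may differ by version (the thresholds might be expressed as fractions or percentages); the 85%/95% defaults are the ones described above:

```properties
# Hypothetical configuration excerpt; value format is assumed
storage.isPersistOn=true
storage.persist.mode=AUTO
storage.persistNewThreshold=0.85
storage.persistAllThreshold=0.95
```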
5G Close Persist
The size of a single transaction is 5G, with the swap function turned off. In this test scenario, the old-generation heap is 14G, which is enough to hold 5G of data even allowing for data expansion
Delay Time: 7 s by the time all data is sorted, and 17 s by the time all data has been written to the global binlog file