By digoal
During peak hours, many people in a certain area may book a ride at the same time. For example, a lot of people may book a ride outside an office building when work shifts end or when a concert ends at a stadium.
In these scenarios, a request is initiated from the same location, the nearest car is found, and then the car is locked.
These scenarios may lead to locking conflicts in a database. A previous article describes advisory locks that can significantly improve the database throughput. However, there is still a hidden drawback. If everyone sends a request to lock nearby cars from the same location, redundant scanning and filtering problems will occur. (After A locks the first nearby car, B must skip the first car and lock the second nearby car, and C must skip the first two nearby cars and lock the third car, and so forth.) The higher the concurrency is, the more cars are skipped. If this drawback is resolved, the performance will improve greatly.
How to resolve this drawback?
Optimization method: When more than one person books a ride at the same time and place, and if the same rule of selecting a nearby car is used, the same car will be selected multiple times. This article describes how to use advisory locks to prevent row lock conflicts. Although this method prevents lock conflicts, scanning is still performed from near to far. When concurrent requests are initiated, multiple rows in sessions are scanned for unlocked rows. Therefore, better optimization methods are required.
Here is a method that is similar to "group commit" in a database. In this method, multiple cars are selected simultaneously and allocated by the system to people who book a ride at the same place. There is also a method that must be implemented in a database. A discrete factor is provided, and the car selected each time may not be the nearest car.
Cars within a specified distance can be selected randomly, but this selection must be done in the index, and only one result can be selected during both the index scan and heap scan in the database. This is similar to random scanning of indexed points.
According to the big bang theory, the beginning of the universe was a point. After the big bang, the point gradually expanded and formed the present universe.
The ride-hailing situation for DiDi is similar. During peak hours, many people from an office building concentrate at one point, and available DiDi cars are on the periphery or nearby. As mentioned earlier, if the distance to a car is calculated from this point for all these people, the nearest car is the same car. Therefore, conflict occurs.
You can scatter concentrated points in an office building to prevent conflicts when the system searches for the nearest cars. This reduces the filtering effort, improves the search performance, and increases the throughput.
This article provides only one example because the methods for spatial data processing are similar (specify a discrete range for X and Y axes to scatter the data).
Assume that we have 10 million data records, the ID is the primary key, and the requirement is to search for the ID closest to an input value and lock the ID. If the ID is locked, we need to lock the next ID closest to the input value.
For example, if the input value is 50, lock 50 first. If 50 has been locked by another session, lock 49 or 51, and so forth.
1) Create a test table.
postgres=# create table a(id int primary key, info text, crt_time timestamp);
CREATE TABLE
2) Write 10 million test data records into the table.
postgres=# insert into a select generate_series(1,10000000), 'test', now();
3) Create a GiST index that supports distance search operators.
postgres=# create extension btree_gist;
CREATE EXTENSION
postgres=# create index idx_a_1 on a using gist(id);
Create a test script to simulate the ride-hailing scenario during peak hours: Enter the same point to lock the closest point. If the closest point has been locked, skip it and lock the point next to it. In this example, we use pg_try_advisory_xact_lock() to skip locked rows.
vi test.sql
select * from a where pg_try_advisory_xact_lock(id) order by id <-> 5000000 for update limit 1;
Start the stress testing. The throughput can be up to 49,000 TPS.
pgbench -M prepared -n -r -P 1 -f ./test.sql -c 56 -j 56 -T 120
progress: 3.0 s, 45775.8 tps, lat 1.224 ms stddev 0.828
progress: 4.0 s, 45571.5 tps, lat 1.229 ms stddev 0.826
progress: 5.0 s, 49345.6 tps, lat 1.135 ms stddev 0.747
progress: 6.0 s, 48948.0 tps, lat 1.144 ms stddev 0.856
progress: 7.0 s, 49578.2 tps, lat 1.129 ms stddev 0.758
As previously mentioned, there is one problem when concurrent requests are initiated from the same point to search for and lock the nearest car. After A locks the first nearby car, B must skip the first car and lock the second nearest car, C must skip the first two and lock the third nearest car, and so forth. More cars are skipped if more parallel requests are received.
The following SQL statements can help understand this problem.
1) Session A
postgres=# begin;
BEGIN
postgres=# explain (analyze,verbose,timing,costs,buffers) select * from a where pg_try_advisory_xact_lock(id) order by id <-> (5000000-500000+100) for update limit 1;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.42..0.54 rows=1 width=27) (actual time=0.096..0.097 rows=1 loops=1)
Output: id, info, crt_time, ((id <-> 4500100)), ctid
Buffers: shared hit=5
-> LockRows (cost=0.42..397168.88 rows=3333333 width=27) (actual time=0.095..0.095 rows=1 loops=1)
Output: id, info, crt_time, ((id <-> 4500100)), ctid
Buffers: shared hit=5
-> Index Scan using idx_a_1 on public.a (cost=0.42..363835.55 rows=3333333 width=27) (actual time=0.092..0.092 rows=1 loops=1)
Output: id, info, crt_time, (id <-> 4500100), ctid
Order By: (a.id <-> 4500100)
Filter: pg_try_advisory_xact_lock((a.id)::bigint)
Buffers: shared hit=4
Planning time: 0.111 ms
Execution time: 0.135 ms
(13 rows)
2) Session B: Before Session A releases the lock, a request is initiated from the same point to lock the nearest point.
Rows Removed by Filter: 1
postgres=# explain (analyze,verbose,timing,costs,buffers) select * from a where pg_try_advisory_xact_lock(id) order by id <-> (5000000-500000+100) for update limit 1;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.42..0.54 rows=1 width=27) (actual time=0.128..0.128 rows=1 loops=1)
Output: id, info, crt_time, ((id <-> 4500100)), ctid
Buffers: shared hit=5
-> LockRows (cost=0.42..397168.88 rows=3333333 width=27) (actual time=0.127..0.127 rows=1 loops=1)
Output: id, info, crt_time, ((id <-> 4500100)), ctid
Buffers: shared hit=5
-> Index Scan using idx_a_1 on public.a (cost=0.42..363835.55 rows=3333333 width=27) (actual time=0.114..0.114 rows=1 loops=1)
Output: id, info, crt_time, (id <-> 4500100), ctid
Order By: (a.id <-> 4500100)
Filter: pg_try_advisory_xact_lock((a.id)::bigint)
Rows Removed by Filter: 1
Buffers: shared hit=4
Planning time: 0.112 ms
Execution time: 0.168 ms
(14 rows)
You can see in the preceding snippet that filtering is performed. Car #4500100 should be returned based on the distance, but it has already been locked. Therefore, the system skips it and looks for the next lockable point. Here, one row is removed by the filter.
The higher the number of parallel requests, the more rows are locked and removed by the filter, which affects the performance.
Scatter concentrated points onto a plane with a specified radius, for example, with 1 km radius. That is to say, your current location is not used to calculate the distance from the nearest car. Instead, a random point that is one kilometer away from your current location is used for calculation.
We have set a distance range in this example. Specifically, we distribute concentrated points (customers) to scattered points (customers) within a 5,000 m distance from the concentrated points. Then we calculate the nearest point (car) for a scattered point (customer) and lock it (the car).
vi test.sql
\set seed random(1,5000)
select * from a where pg_try_advisory_xact_lock(id) order by id <-> (5000000+2500-:seed) for update limit 1;
Stress testing result:
pgbench -M prepared -n -r -P 1 -f ./test.sql -c 56 -j 56 -T 120
progress: 5.0 s, 150380.9 tps, lat 0.372 ms stddev 0.165
progress: 6.0 s, 151711.9 tps, lat 0.369 ms stddev 0.168
progress: 7.0 s, 152098.8 tps, lat 0.368 ms stddev 0.154
progress: 8.0 s, 152003.3 tps, lat 0.368 ms stddev 0.156
progress: 9.0 s, 152421.4 tps, lat 0.367 ms stddev 0.154
progress: 10.0 s, 153108.7 tps, lat 0.366 ms stddev 0.148
progress: 11.0 s, 151427.8 tps, lat 0.370 ms stddev 0.156
We can randomly scatter concentrated points (customers) onto a plane of a specified range to lock the nearby points (cars). This can more than double the throughput, from 49,000 TPS to 151,000 TPS.
When you find something optimizable, go ahead and perform the optimization and improve the performance. This idea is helpful for all application scenarios involving hotspot elimination.
Use Geometry Split to Optimize PostgreSQL Performance in GiST Index-based Polygon Search
PostgreSQL-enabled Electronic Fences: Application Scenarios and Performance
digoal - January 18, 2021
Alibaba Clouder - November 22, 2018
digoal - May 16, 2019
ApsaraDB - October 22, 2024
Iain Ferguson - May 25, 2022
ApsaraDB - September 7, 2021
Alibaba Cloud PolarDB for PostgreSQL is an in-house relational database service 100% compatible with PostgreSQL and highly compatible with the Oracle syntax.
Learn MoreLeverage cloud-native database solutions dedicated for FinTech.
Learn MoreMigrate your legacy Oracle databases to Alibaba Cloud to save on long-term costs and take advantage of improved scalability, reliability, robust security, high performance, and cloud-native features.
Learn MoreMigrating to fully managed cloud databases brings a host of benefits including scalability, reliability, and cost efficiency.
Learn MoreMore Posts by digoal