The sequential_uuid extension can produce two sequential UUID generators to solve the random I/O problem caused by random UUID generators.
Prerequisites
The extension is supported on the PolarDB for PostgreSQL clusters that run the following engine:
PostgreSQL 14 (revision version 14.5.1.0 or later)
PostgreSQL 11 (revision version 1.1.28 or later)
You can execute one of the following statements to view the revision version of a PolarDB for PostgreSQL cluster:
PostgreSQL 14
SELECT version();
PostgreSQL 11
SHOW polar_version;
Background information
A random UUID generator produces even values within a given range. This means that the pages where all indexes are located has the same probability of being hit when data is inserted into the indexes. Therefore, all indexes are forced into memory. This problem does not exist when small indexes are used. When the index size exceeds the size of the shared buffer (or RAM), the cache hit rate drops rapidly.
Compared with random UUIDs, sequence-based or timestamp-based UUIDs are sequential. New data is almost always inserted in the rightmost side of the index to improve the cache hit rate. The new sequence value is greater than all previous values and has the same timestamp.
Sequential UUID generators increase the predictability of UUIDs and the probability of cross-machine UUID collisions.
For more information about the advantages of sequential UUIDs, see Sequential UUID Generators.
The sequential_uuid extension is mainly used to create more sequential UUID generators without significantly reducing randomness. Low randomness may increase the probability of collisions and the predictability of UUIDs.
Generator design
A simple way to generate sequential UUIDs is to prefix UUIDs with sequential values. For example, you can use sequences or timestamps and add random data until the randomness of UUIDs reaches 16 B. The UUIDs generated in this way are continuous, but this method causes the following problems:
Low randomness: If you use the sequences that generate bigint values, the randomness of resulting UUIDs is reduced from 16 B to 8 B. Timestamps reduce randomness in a similar way, depending on the precision of the timestamps. Low randomness increases the probability of collisions and the predictability of UUIDs. For example, you can determine which UUIDs are closely generated, and even infer their timestamps.
Bloating: If values continue to grow, indexes may bloat after historical data is deleted. For example, timestamp indexes may bloat in log tables.
To solve the two problems, the sequential_uuid extension produces UUID generators that use periodical loopback. The loopback occurs after a specified number of UUIDs are generated or after a period of time. In either case, UUIDs are generated in units of block and in the form of
(Block ID;Random data)
The size of the block ID depends on the number of blocks and is fixed (depending on the generator parameter). For example, for blocks with the default value of 64 KB, two bytes are used to store block IDs. Loopback eventually occurs as block IDs periodically increase.
A sequential UUID generator can use blocks with 256 UUIDs. The formula to calculate a two-byte block ID:
(nextval('s') / 256) % 65536
Note×Loopback occurs once when the generator produces 16 M (256 × 65536) UUIDs.
The block size is determined by the number of generated UUIDs.
By default, a timestamp-based UUID generator produces 64 KB blocks, which is consistent with that of a sequential generator). The formula to calculate a block ID:
(timestamp / 60) % 65536
NoteLoopback occurs once on the generator every 45 days.
The block size is defined as an interval. The default value is 60 seconds.
UUID generation functions
The sequential_uuid extension provides two functions that produce sequential UUID generators. One function uses sequences and the other uses timestamps.
The uuid_sequence_nextval function accepts the following parameters:
An object of the regclass type (sequence).
A block size of the integer type (default value 65536).
A block number of the integer type (default value 65536).
Use sequences to produce a sequential UUID generator:
CREATE EXTENSION sequential_uuids; CREATE SEQUENCE s; SELECT uuid_sequence_nextval('s'::regclass, 256, 65536);
Sample result:
uuid_sequence_nextval -------------------------------------- 00005547-8a67-452d-bdf7-b390f1edc49b (1 row)
The uuid_time_nextval function accepts the following parameters:
A time interval of the integer type (default value 60).
A block number of the integer type (default value 65536).
Use timestamps to produce a sequential UUID generator:
CREATE EXTENSION sequential_uuids; SELECT uuid_time_nextval(1, 256);
Sample result:
uuid_time_nextval -------------------------------------- 08dac705-8776-4ce3-a45c-123fd65e11e8 (1 row)
The default values of the preceding parameters are applicable in most scenarios.