By digoal
PostgreSQL (PG) supports variable-length file types such as text and bytea. The maximum size of these files is 1 GB, so you can store large texts or files in the database. PG uses pglz compression algorithm to compress and store data. However, if the user wants to query the client’s data, the transmitted content is not compressed. This leads to consuming a large bandwidth during transmission.
The following two methods are commonly used to solve the compression problem during network transmission:
1) Use Secure Socket Layer (SSL) to support data compression. With SSL, the data transmitted between the client and the database is encrypted and compressed. However, this will incur extra CPU overhead for encryption and decryption.
2) Use gzip plug-in. It uses the standard compression library, so it supports decompression as long as the client supports gzip. You can use the compressed data for storage and transmission.
client - search(gzip result) - pgdb
The gzip plug-in is a compression and decompression plug-in. The interfaces for calling this plug-in are as follows:
gzip(uncompressed BYTEA, [compression_level INTEGER]) returns BYTEA
gzip(uncompressed TEXT, [compression_level INTEGER]) returns BYTEA
gunzip(compressed BYTEA) returns BYTEA
To solve the transmission problem, return the data compressed by gzip plug-in to the client.
select gzip (column, 压缩级别)
To solve the storage problem, use the values compressed by gzip plug-in for storage in bytea type.
insert into table values (gzip (column, 压缩级别))
For data query, the decompression is supported both on the database and the client, which requires gzipLIB on the client.
wget http://api.pgxn.org/dist/gzip/1.0.0/gzip-1.0.0.zip
unzip gzip-1.0.0.zip
cd gzip-1.0.0/
USE_PGXS=1 make
USE_PGXS=1 make install
pg12@pg11-test-> psql
psql (12beta2)
Type "help" for help.
postgres=# create extension gzip ;
CREATE EXTENSION
This example is quoted from this site
> SELECT gzip('this is my this is my this is my this is my text');
gzip
--------------------------------------------------------------------------
\x1f8b08000000000000132bc9c82c5600a2dc4a851282ccd48a12002e7a22ff30000000
What, the compressed output is longer? No, it looks that way because in hex every character requires two hex digits. The original string looks like this in hex:
> SELECT 'this is my this is my this is my this is my text'::bytea;
bytea
----------------------------------------------------------------------------------------------------
\x74686973206973206d792074686973206973206d792074686973206973206d792074686973206973206d792074657874
For really long, repetitive things, compression naturally works like a charm:
> SELECT gzip(repeat('this is my ', 100));
bytea
----------------------------------------------------------------------------------------------------
\x1f8b08000000000000132bc9c82c5600a2dc4a859251e628739439ca24970900d1341c5c4c040000
Converting a bytea back into an equivalent text uses the encode() function with the escape encoding:
> SELECT encode(gunzip(gzip('this is my this is my this is my this is my text')), 'escape')
encode
--------------------------------------------------
this is my this is my this is my this is my text
PostgreSQL 12: B-tree Index Improvements – Duplicate Key and Sort by CTID
digoal - December 11, 2019
Alibaba Cloud Serverless - March 19, 2019
digoal - February 5, 2020
digoal - May 17, 2021
digoal - February 5, 2020
Alibaba Clouder - June 1, 2018
Alibaba Cloud PolarDB for PostgreSQL is an in-house relational database service 100% compatible with PostgreSQL and highly compatible with the Oracle syntax.
Learn MoreAn online MPP warehousing service based on the Greenplum Database open source program
Learn MoreReach global users more accurately and efficiently via IM Channel
Learn MoreAn on-demand database hosting service for PostgreSQL with automated monitoring, backup and disaster recovery capabilities
Learn MoreMore Posts by digoal