By Peng Xiaoqiang (Lufeng)
Alibaba Cloud AnalyticDB for PostgreSQL (ADB PG, formerly known as HybridDB for PostgreSQL) is a real-time data warehousing service based on the PostgreSQL kernel Massively Parallel Processing (MPP) architecture. This product supports complex extract-transform-load (ETL) tasks and high-performance online queries. It is closely integrated with the Alibaba Cloud ecosystem.
AWS Redshift is also an MPP data warehouse server based on the PostgreSQL kernel engine and is widely used as a data warehouse in AWS. ADB PG is highly compatible with Redshift in terms of architecture and syntax. This article describes how to migrate data between these two data warehouse platforms.
The latest Alibaba Cloud ADB PG 6.0 is built on PostgreSQL 9.4, whereas Redshift is based on PostgreSQL 8.2. Compared to ADB PG, Redshift provides a richer set of features and is fully compatible with PostgreSQL ecosystem tools, including PostGIS, MADlib, and other extended analysis tools. Redshift only supports column-store tables, not PostgreSQL native row-store tables. Comparatively, ADB PG retains support for PostgreSQL row-store tables to implement high-throughput data update operations. It also supports column-store tables for OLAP large table aggregation operations.
Item | ADB PG | Redshift |
---|---|---|
PostgreSQL version | PG 9.4 | PG 8.2 |
SQL syntax | Compatible with PG and partially compatible with Oracle syntax | Compatible with PG |
Transactions | Supported | Supported |
Row-store | Supported | Not supported |
Column-store | Supported | Supported |
Table partitions | Supported | Supported |
Cloud storage | Supports online access to OSS data | Supports online access to S3 data |
Multimodal analysis | PostGIS, MADLib, and vector search |
Both, Alibaba Cloud ADB PG and AWS Redshift are based on the standalone PostgreSQL kernel engine, so their syntax is highly compatible. Some of their syntax descriptions are slightly different, as detailed below.
Syntax | Redshift | ADB PG |
---|---|---|
DISTKEY(col) | DISTRIBUTED BY(col) | |
Random table distribution | DISTSTYLE EVEN | DISTRIBUTED RANDOMLY |
Table replica distribution | DISTSTYLE ALL | DISTRIBUTED REPLICATED |
Data encoding and compression | AZ64 /BYTEDICT /DELTA /LZO /RAW /RUNLENGTH /ZSTD | (COMPRESSTYPE={ZStD/ZLIB/QUICKLZ/RLE_TYPE/NONE}) |
Column-store sort key | SORTKEY (col) | "with(APPENDONLY=true, ORIENTATION=column)sortkey(volume)" |
System functions | PG8.2 and some custom functions | PG9.4 and some custom functions |
Table creation statement for Redshift, including the distribution key DISTKEY and sort column:
CREATE TABLE schema1.table1(
filed1 VARCHAR(100) ENCODE lzo,
filed2 INTEGER DISTKEY,
filed3 INTEGER,
filed4 BIGINT ENCODE lzo,
filed5 INTEGER,)
INTERLEAVED SORTKEY (
filed1,
filed2);
Table creation statement for ADB PG:
CREATE TABLE schema1.table1
(
filed1 VARCHAR(100) ,
filed3 INTEGER,
filed5 INTEGER
)
WITH(APPENDONLY=true,ORIENTATION=column,COMPRESSTYPE=zlib)
DISTRIBUTED BY (filed2)
SORTKEY
(
filed1,
filed2
)
Table creation statement for Redshift, including the ENCODE and SORTKEY options:
CREATE TABLE schema2.table2
(
filed1 VARCHAR(50) ENCODE lzo,
filed2 VARCHAR(50) ENCODE lzo,
filed3 VARCHAR(20) ENCODE lzo,
)
DISTSTYLE EVEN
INTERLEAVED SORTKEY
(
filed1
);
Table creation statement for ADB PG:
CREATE TABLE schema2.table2(
filed1 VARCHAR(50),
filed2 VARCHAR(50),
filed3 VARCHAR(20))
WITH(APPENDONLY=true, ORIENTATION=column, COMPRESSTYPE=zlib)
DISTRIBUTED randomly
SORTKEY
(
filed1
);
Both Redshift and ADB PG support high-speed parallel data import and export from cloud storage.
Migrating data from Redshift to ADB PG involves the following steps:
The migration process is as follows:
Please visit our official documentation page to learn more about migrating data from Amazon Redshift to ApsaraDB AnalyticDB for PostgreSQL.
Who Said All Redis Data Must Be Stored in Memory? Explore Redis Hybrid Storage Instances
New Insights into X-Engine: The New Storage Engine of RDS for MySQL
ApsaraDB - June 15, 2023
Alibaba Cloud MaxCompute - January 4, 2022
Alibaba Clouder - February 7, 2018
ApsaraDB - September 30, 2021
ApsaraDB - July 23, 2021
Alibaba Clouder - January 3, 2018
Build a Data Lake with Alibaba Cloud Object Storage Service (OSS) with 99.9999999999% (12 9s) availability, 99.995% SLA, and high scalability
Learn MoreA real-time data warehouse for serving and analytics which is compatible with PostgreSQL.
Learn MoreAlibaba Cloud PolarDB for PostgreSQL is an in-house relational database service 100% compatible with PostgreSQL and highly compatible with the Oracle syntax.
Learn MoreTair is a Redis-compatible in-memory database service that provides a variety of data structures and enterprise-level capabilities.
Learn MoreMore Posts by ApsaraDB