5 Simple Steps to Migrate Data Warehouses from Redshift to Alibaba Cloud AnalyticDB for PG

This article describes how to migrate data between the two data warehouse platforms (from Redshift to Alibaba Cloud AnalyticDB for PG) in just five simple steps.

By Peng Xiaoqiang (Lufeng)

Alibaba Cloud AnalyticDB for PostgreSQL (ADB PG, formerly known as HybridDB for PostgreSQL) is a real-time data warehousing service based on the PostgreSQL kernel Massively Parallel Processing (MPP) architecture. This product supports complex extract-transform-load (ETL) tasks and high-performance online queries. It is closely integrated with the Alibaba Cloud ecosystem.

AWS Redshift is also an MPP data warehouse server based on the PostgreSQL kernel engine and is widely used as a data warehouse in AWS. ADB PG is highly compatible with Redshift in terms of architecture and syntax. This article describes how to migrate data between these two data warehouse platforms.

Product Architecture Comparison

The latest Alibaba Cloud ADB PG 6.0 is built on PostgreSQL 9.4, whereas Redshift is based on PostgreSQL 8.2. Compared to ADB PG, Redshift provides a richer set of features and is fully compatible with PostgreSQL ecosystem tools, including PostGIS, MADlib, and other extended analysis tools. Redshift only supports column-store tables, not PostgreSQL native row-store tables. Comparatively, ADB PG retains support for PostgreSQL row-store tables to implement high-throughput data update operations. It also supports column-store tables for OLAP large table aggregation operations.

Comparison of ADB PG and Redshift

Item	ADB PG	Redshift
PostgreSQL version	PG 9.4	PG 8.2
SQL syntax	Compatible with PG and partially compatible with Oracle syntax	Compatible with PG
Transactions	Supported	Supported
Row-store	Supported	Not supported
Column-store	Supported	Supported
Table partitions	Supported	Supported
Cloud storage	Supports online access to OSS data	Supports online access to S3 data
Multimodal analysis	PostGIS, MADLib, and vector search

Key Syntax Comparison and Migration

Both, Alibaba Cloud ADB PG and AWS Redshift are based on the standalone PostgreSQL kernel engine, so their syntax is highly compatible. Some of their syntax descriptions are slightly different, as detailed below.

Differences in Syntax for DDL Table Creation

Syntax	Redshift	ADB PG
	DISTKEY(col)	DISTRIBUTED BY(col)
Random table distribution	DISTSTYLE EVEN	DISTRIBUTED RANDOMLY
Table replica distribution	DISTSTYLE ALL	DISTRIBUTED REPLICATED
Data encoding and compression	AZ64 /BYTEDICT /DELTA /LZO /RAW /RUNLENGTH /ZSTD	(COMPRESSTYPE={ZStD/ZLIB/QUICKLZ/RLE_TYPE/NONE})
Column-store sort key	SORTKEY (col)	"with(APPENDONLY=true, ORIENTATION=column)sortkey(volume)"
System functions	PG8.2 and some custom functions	PG9.4 and some custom functions

Syntax Guide

DDL Conversion Example 1

Table creation statement for Redshift, including the distribution key DISTKEY and sort column:

CREATE TABLE schema1.table1(
    filed1 VARCHAR(100) ENCODE lzo,
    filed2 INTEGER DISTKEY,
    filed3 INTEGER,
    filed4 BIGINT ENCODE lzo,
    filed5 INTEGER,)
INTERLEAVED SORTKEY (
    filed1,
    filed2);

Table creation statement for ADB PG:

CREATE TABLE schema1.table1
(
    filed1 VARCHAR(100) ,
    filed3 INTEGER,
    filed5 INTEGER
)
WITH(APPENDONLY=true,ORIENTATION=column,COMPRESSTYPE=zlib)
DISTRIBUTED BY (filed2)
SORTKEY
(
    filed1,
    filed2
)

DDL Conversion Example 2

Table creation statement for Redshift, including the ENCODE and SORTKEY options:

CREATE TABLE schema2.table2
(
    filed1 VARCHAR(50) ENCODE lzo,
    filed2 VARCHAR(50) ENCODE lzo,
    filed3 VARCHAR(20) ENCODE lzo,
)
DISTSTYLE EVEN
INTERLEAVED SORTKEY
(
    filed1
);

Table creation statement for ADB PG:

CREATE TABLE schema2.table2(
    filed1 VARCHAR(50),
    filed2 VARCHAR(50),
    filed3 VARCHAR(20))
WITH(APPENDONLY=true, ORIENTATION=column, COMPRESSTYPE=zlib)
DISTRIBUTED randomly
SORTKEY
(
    filed1
);

Data Migration

Both Redshift and ADB PG support high-speed parallel data import and export from cloud storage.

Migrating data from Redshift to ADB PG involves the following steps:

Prepare resources related to Amazon Redshift, Amazon S3 (Amazon Simple Storage Service), ADB PG, and Alibaba Cloud OSS before performing operations.
Import Redshift data to Amazon S3.
Use OSSImport to import CSV data files from Amazon S3 to OSS.
On the destination ADB PG instance, create an object that corresponds to the source Redshift, including Schema, Table, View, and Function.
Use an OSS external table to import data to ADB PG.

The migration process is as follows:

Please visit our official documentation page to learn more about migrating data from Amazon Redshift to ApsaraDB AnalyticDB for PostgreSQL.

Community

5 Simple Steps to Migrate Data Warehouses from Redshift to Alibaba Cloud AnalyticDB for PG

Product Architecture Comparison

Comparison of ADB PG and Redshift

Key Syntax Comparison and Migration

Differences in Syntax for DDL Table Creation

Syntax Guide

DDL Conversion Example 1

DDL Conversion Example 2

Data Migration

Read previous post:

Read next post:

ApsaraDB

You may also like

Comments

ApsaraDB

Related Products

Data Lake Storage Solution

Hologres

PolarDB for PostgreSQL

Tair