AnalyticDB for PostgreSQL provides the one-click data warehousing feature to synchronize data from ApsaraDB RDS for PostgreSQL to AnalyticDB for PostgreSQL. To synchronize data in real time, you can create a publication on ApsaraDB RDS for PostgreSQL, and then create a one-click data warehousing job as the subscription to the publication on AnalyticDB for PostgreSQL.
Usage notes
The one-click data warehousing feature is supported only for AnalyticDB for PostgreSQL instances of V6.3.8.0 or later in elastic storage mode.
The one-click data warehousing feature is supported only for ApsaraDB RDS for PostgreSQL instances that run PostgreSQL 10.0 or later.
The ApsaraDB RDS for PostgreSQL and AnalyticDB for PostgreSQL instances must reside within the same virtual private cloud (VPC).
The synchronization objects must be tables.
Only the INSERT, UPDATE, DELETE, and TRUNCATE operations can be synchronized. Schema changes cannot be synchronized.
The destination table must have the same schema as the source table.
The source table on ApsaraDB RDS for PostgreSQL must contain a primary key.
Only the following data types are supported for synchronization:
Numeric, character, BOOLEAN, time and date, enumeration, GEO, INTERNET, RANGE, JSON, BYTEA, BIT, and an array of the preceding types.
NoteIf you synchronize data of other types, the one-click data warehousing job may be interrupted.
If data of the source table on ApsaraDB RDS for PostgreSQL is updated by using a statement such as
UPDATE/DELETE WHERE CTID = ******
, the update is not synchronized to the destination table on AnalyticDB for PostgreSQL.
Prerequisites
AnalyticDB for PostgreSQL
An AnalyticDB for PostgreSQL instance is created. For more information, see Create an instance.
A privileged account is created. For more information, see Create a database account.
ApsaraDB RDS for PostgreSQL
An ApsaraDB RDS for PostgreSQL instance is created. For more information, see Create an ApsaraDB RDS for PostgreSQL instance.
A privileged account is created. For more information, see Create an account.
The
wal_level
parameter is set tological
. For information about how to modify parameters, see Modify the parameters of an ApsaraDB RDS for PostgreSQL instance.NoteThe modification to the parameter takes effect only after you restart the instance. We recommend that you restart the instance during off-peak hours.
The internal endpoint of the AnalyticDB for PostgreSQL instance is added to a whitelist of the ApsaraDB RDS for PostgreSQL instance.
SQL syntax
This section describes the SQL syntax that is used to create, modify, and delete a one-click data warehousing job.
Create a one-click data warehousing job
Syntax
CREATE SUBSCRIPTION <subscription_name>
CONNECTION 'conninfo'
PUBLICATION <publication_name> [, ...]
[ WITH ( subscription_parameter [= value] [, ... ] ) ]
Parameters
Parameter | Description |
subscription_name | The name of the one-click data warehousing job. |
conninfo | The connection information of the ApsaraDB RDS for PostgreSQL instance, including the endpoint, port number, database name, account, and password.
Example: |
publication_name | The name of the publication that is created on the ApsaraDB RDS for PostgreSQL instance. |
subscription_parameter | The optional parameter of the one-click data warehousing job. Only
|
Modify a one-click data warehousing job
Refresh the synchronization objects of a one-click data warehousing job
When a new source table is added to the publication on the ApsaraDB RDS for PostgreSQL instance, you must perform a REFRESH operation to allow the destination table to subscribe to data of the new source table.
Syntax
ALTER SUBSCRIPTION <subscription_name> REFRESH PUBLICATION [ WITH ( refresh_option [= value] [, ... ] ) ]
Parameters
Parameter
Description
subscription_name
The name of the one-click data warehousing job.
refresh_option
The optional parameter of the one-click data warehousing job. Only
copy_data (boolean)
is supported.copy_data (boolean)
: specifies whether to synchronize existing data from the source table to the destination table when the one-click data warehousing job is created. Valid values:true
false
Start a one-click data warehousing job
Syntax
ALTER SUBSCRIPTION <subscription_name> ENABLE
Parameters
Parameter
Description
subscription_name
The name of the one-click data warehousing job.
Stop an ongoing one-click data warehousing job
Syntax
ALTER SUBSCRIPTION <subscription_name> DISABLE
Parameters
Parameter
Description
subscription_name
The name of the one-click data warehousing job.
Modify the owner of a one-click data warehousing job
Syntax
ALTER SUBSCRIPTION <subscription_name> OWNER TO { <new_owner> | CURRENT_USER | SESSION_USER }
Parameters
Parameter
Description
subscription_name
The name of the one-click data warehousing job.
new_owner
The new owner of the one-click data warehousing job.
Modify the name of a one-click data warehousing job
Syntax
ALTER SUBSCRIPTION <subscription_name> RENAME TO <new_name>
Parameters
Parameter
Description
subscription_name
The name of the one-click data warehousing job.
new_name
The new name of the one-click data warehousing job.
Delete a one-click data warehousing job
Syntax
DROP SUBSCRIPTION [ IF EXISTS ] <subscription_name>
Parameters
Parameter | Description |
subscription_name | The name of the one-click data warehousing job that you want to delete. |
For more information about the syntax, see the PostgreSQL documentation.
Start data synchronization
The following example shows how to synchronize a source table named logical_tb1
from the ApsaraDB RDS for PostgreSQL instance to the AnalyticDB for PostgreSQL instance:
Create a source table named logical_tb1 on the ApsaraDB RDS for PostgreSQL instance.
CREATE TABLE logical_tb1(a int primary key, b varchar(20));
Create a publication named pub1 on the ApsaraDB RDS for PostgreSQL instance and add the logical_tb1 table to the pub1 publication.
CREATE PUBLICATION pub1 FOR TABLE logical_tb1;
Create a destination table that has the same schema as the source table on the AnalyticDB for PostgreSQL instance.
CREATE TABLE logical_tb1(a int primary key, b varchar(20));
Create a one-click data warehousing job on the AnalyticDB for PostgreSQL instance.
If you want to synchronize existing data and new data from the source table, execute the following statement:
CREATE SUBSCRIPTION sub1 CONNECTION 'host=****** port=5432 user=*** dbname=*** password=***' PUBLICATION pub1;
If you want to synchronize only new data from the source table, execute the following statement:
CREATE SUBSCRIPTION sub1 CONNECTION 'host=****** port=5432 user=*** dbname=*** password=***' PUBLICATION pub1 WITH(copy_data = false);
Modify the tables that you want to synchronize
Assume that a one-click data warehousing job named sub1 is created on the AnalyticDB for PostgreSQL instance to synchronize the logical_tb1 table that is contained in the pub1 publication on the ApsaraDB RDS for PostgreSQL instance. The following example shows how to synchronize another table named logical_tb2 to the AnalyticDB for PostgreSQL instance:
Create a source table named logical_tb2 on the ApsaraDB RDS for PostgreSQL instance.
CREATE TABLE logical_tb2(id int primary key, name varchar(20), age int);
Add the logical_tb2 table to the pub1 publication on the ApsaraDB RDS for PostgreSQL instance.
ALTER PUBLICATION pub1 ADD TABLE logical_tb2;
Create a destination table that has the same schema as the source table on the AnalyticDB for PostgreSQL instance.
CREATE TABLE logical_tb2(id int primary key, name varchar(20), age int);
Refresh the subscribed publication information on the AnalyticDB for PostgreSQL instance. The logical_tb2 table is synchronized to the AnalyticDB for PostgreSQL instance.
ALTER SUBSCRIPTION sub1 REFRESH PUBLICATION;
Stop data synchronization
Assume that a publication named pub1 is created on the ApsaraDB RDS for PostgreSQL instance and the pub1 publication contains the logical_tb1 and logical_tb2 tables. The following example shows how to continue synchronizing the logical_tb1 table and stop synchronizing the logical_tb2 table:
Remove the logical_tb2 table from the pub1 publication on the ApsaraDB RDS for PostgreSQL instance.
ALTER PUBLICATION pub1 DROP TABLE logical_tb2;
Refresh the subscribed publication information on the AnalyticDB for PostgreSQL instance. The logical_tb1 table is synchronized to the AnalyticDB for PostgreSQL instance but the logical_tb1 table is not.
ALTER SUBSCRIPTION sub1 REFRESH PUBLICATION;
Query the state of a one-click data warehousing job
Query whether a one-click data warehousing job is complete
Execute the following statement on the ApsaraDB RDS for PostgreSQL instance to query whether a one-click data warehousing job is complete:
SELECT confirmed_flush_lsn = pg_current_wal_lsn() FROM pg_replication_slots WHERE slot_name = '<subscription_name>';
subscription_name specifies the name of the one-click data warehousing job that you want to query.
Return values:
t: The one-click data warehousing job is complete.
f: The one-click data warehousing job is in progress.
null: The one-click data warehousing job does not exist or is abnormal.
Query the state of a one-click data warehousing job
Execute the following statement on the ApsaraDB RDS for PostgreSQL instance to query the state of a one-click data warehousing job:
SELECT active FROM pg_replication_slots WHERE slot_name='<subscription_name>';
subscription_name specifies the name of the one-click data warehousing job that you want to query.
Return values:
t: The one-click data warehousing job is normal.
f: The one-click data warehousing job is abnormal.
null: The one-click data warehousing job does not exist.
Query the number of one-click data warehousing jobs on an AnalyticDB for PostgreSQL instance
Execute the following statement on the AnalyticDB for PostgreSQL instance to query the number of one-click data warehousing jobs on the instance:
SELECT count(*) FROM pg_subscription WHERE subenabled = true;
References
For information about how to create a publication on the ApsaraDB RDS for PostgreSQL instance, see Use the logical subscription feature.
After you use one-click data warehousing to synchronize data, you can analyze the data on the AnalyticDB for PostgreSQL instance. For more information, see Vector analysis overview and Real-time materialized views.