ApsaraDB for OceanBase: Migrate data from an ApsaraDB RDS for PostgreSQL instance to an Oracle tenant of OceanBase Database

Last Updated: Jul 03, 2024

This topic describes how to use the data transmission service to migrate data from an ApsaraDB RDS for PostgreSQL instance to an Oracle tenant of OceanBase Database.

Important

If a data migration project remains in an inactive state for a long time, it may fail to resume, depending on the retention period of incremental logs. Inactive states include Failed, Paused, and Completed. The data transmission service automatically releases data migration projects that remain in an inactive state for more than 7 days to recycle resources. We recommend that you configure alerting for projects and handle project exceptions in a timely manner.

Prerequisites

  • The data transmission service has the privilege to access cloud resources. For more information, see Grant privileges to roles for data transmission.

  • You have created a dedicated privileged account for data migration in the source ApsaraDB RDS for PostgreSQL instance. For more information, see PostgreSQL data source.

  • You have created a dedicated database user for data migration in the destination Oracle tenant of OceanBase Database and granted corresponding privileges to the user. For more information, see Create a database user.

  • If you need to perform incremental synchronization, take note of the following requirements first:

    • The data transmission service does not support automatic synchronization of DDL statements during incremental synchronization. If a DDL statement needs to be executed on the table to be migrated, manually execute the DDL statement at the destination and then execute it in the source ApsaraDB RDS for PostgreSQL instance.

      To correctly parse incremental DML operations performed after the DDL statement is executed, you must create a corresponding trigger and a table for recording the DDL statement. For more information, see Create a trigger.

    • If you have selected Incremental Synchronization, you must set the wal_level parameter to logical. For more information, see Change the log level for an ApsaraDB RDS for PostgreSQL instance.

Limitations

  • Limitations on the source database

    Do not perform DDL operations that modify database or table schemas during schema migration or full data migration. Otherwise, the data migration project may be interrupted.

  • Only instances of ApsaraDB RDS for PostgreSQL V11.x and V12.x are supported.

  • The data transmission service does not support the migration of partitioned tables, unlogged tables, or temporary tables from an ApsaraDB RDS for PostgreSQL instance.

  • The data transmission service does not support triggers in the destination database. If triggers exist in the destination database, the data migration may fail.

  • The data transmission service supports the migration of an object only when the following conditions are met: the database name, table name, and column name of the object are ASCII-encoded without special characters. The special characters are line breaks, spaces, and the following characters: . | " ' ` ( ) = ; / & \.

  • The data transmission service supports incremental synchronization only from the primary database.
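
The naming rule in the limitations above can be checked before you configure a project. The following Python sketch is illustrative only: the function name is_supported_name and the decision to reject empty names are assumptions, not part of the data transmission service.

```python
# Characters that the limitation lists as unsupported, plus space and line breaks.
SPECIAL_CHARACTERS = set(".|\"'`()=;/&\\ \n\r")


def is_supported_name(name):
    """Return True if name is ASCII-only and free of the listed special characters.

    Rejecting empty names is an assumption of this sketch.
    """
    if not name or not name.isascii():
        return False
    return not any(ch in SPECIAL_CHARACTERS for ch in name)
```

For example, is_supported_name("orders") returns True, whereas names such as "order items" or "sales.orders" are rejected because they contain a space or a period.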

Considerations

  • If you have selected Incremental Synchronization, the table-level REPLICA IDENTITY option must meet the following requirements:

    • If you select Specify Objects to specify the synchronization objects, the specified tables must have a primary key or you need to set the table-level REPLICA IDENTITY option to FULL. Otherwise, the business data update and delete operations will fail.

    • If you select Match Rules to specify the synchronization objects, the ApsaraDB RDS for PostgreSQL instance must subscribe to the changes to all tables of the selected databases, and all tables must have a primary key or you need to set the table-level REPLICA IDENTITY option to FULL. Otherwise, the business data update and delete operations will fail.

    You can execute the following statement to set the table-level REPLICA IDENTITY option to FULL:

    ALTER TABLE table_name REPLICA IDENTITY FULL;

  • When you migrate schemas or incremental DDL operations from an ApsaraDB RDS for PostgreSQL instance to an Oracle tenant of OceanBase Database, lowercase letters in table names and field names are converted into uppercase letters based on the default strategies of the data transmission service. For example, the source table name a is converted into A at the destination by default. You can specify a table name or field name in the format of a, A, or "A", but not "a".

  • The Incr-Sync component of an ApsaraDB RDS for PostgreSQL instance automatically creates publications and slots, but you must monitor the disk usage of the instance logs. By default, every 10 minutes the data transmission service notifies the instance to advance the confirmed_flush_lsn value of a slot to the latest log sequence number (LSN) as of 10 minutes earlier. Therefore, each Incr-Sync component retains at least 10 minutes of logs of the ApsaraDB RDS for PostgreSQL instance.

    Note

    If you want to modify the notification interval, or the retention period of generated log files of the ApsaraDB RDS for PostgreSQL instance, contact OceanBase Technical Support.

    If logs of the ApsaraDB RDS for PostgreSQL instance cannot be cleared during data migration due to the existence of slots, you must delete the data migration project and then clear the logs. The smallest slot restart_lsn value among all slots determines whether the log files of the ApsaraDB RDS for PostgreSQL instance can be recycled. If the smallest value is within the log file range, the log files are not recycled.

  • When you synchronize data to the destination, duplicate data may exist if the destination table does not have a primary key or non-null unique key.

  • In a reverse incremental migration scenario, if data migration is performed in full-column matching mode for UPDATE and DELETE operations, the following issues may occur:

    • Performance issues

      Due to the absence of a primary key index, each UPDATE or DELETE operation will be performed after a full-table scan.

    • Data inconsistency

      The LIMIT clause is not supported for UPDATE and DELETE operations in an ApsaraDB RDS for PostgreSQL instance. Therefore, if multiple records are matched in full-column matching mode, an UPDATE or DELETE operation may affect more records than expected. For example, assume that table t1 has no primary key and has two columns, c1 and c2, and that two records with c1 = 1 and c2 = 2 exist at the source. If you delete only one of these records at the source, the condition WHERE c1 = 1 AND c2 = 2 matches both records at the destination, so both are deleted. This causes data inconsistency between the source and the destination.

  • The data transmission service supports reverse incremental migration of tsvector fields from OceanBase Database to an ApsaraDB RDS for PostgreSQL instance. The tsvector fields must be written to OceanBase Database in the supported formats. Here is an example.

    • Data written to OceanBase Database in the 'a b c' format will be converted into the "'a' 'b' 'c'" format in the ApsaraDB RDS for PostgreSQL instance.

    • Data written to OceanBase Database in the 'a:1 b:2 c:3' format will be converted into the "'a':1 'b':2 'c':3" format in the ApsaraDB RDS for PostgreSQL instance.

    Data written to OceanBase Database in a non-tsvector format such as "'a':cccc" cannot be migrated to the ApsaraDB RDS for PostgreSQL instance. For more information about the supported formats, see section 8.11, Text Search Types, in the PostgreSQL documentation.

  • If the UTF-8 character set is used in the source, we recommend that you use a compatible character set, such as UTF-8 or UTF-16, in the destination to avoid garbled characters.

  • Check whether the migration precision of the data transmission service for columns of data types such as DECIMAL, FLOAT, and DOUBLE is as expected. If the precision of a destination field type is lower than the precision of the corresponding source field type, the value with a higher precision may be truncated. This may result in data inconsistency between the source and destination fields.

  • If you modify a unique index at the destination, you must restart the data migration project to avoid data inconsistency.

  • If the clocks between nodes or between the client and the server are out of synchronization, the latency may be inaccurate during incremental synchronization or reverse incremental migration.

    For example, if the clock is earlier than the standard time, the latency can be negative. If the clock is later than the standard time, the latency can be positive.

  • Take note of the following considerations if you want to aggregate multiple tables:

    • We recommend that you configure the mappings between the source and destination databases by importing objects or specifying matching rules.

    • We recommend that you manually create schemas at the destination. If you create a schema by using the data transmission service, skip the failed objects in the schema migration step.

  • If you have selected only Incremental Synchronization when you created the data migration project, the data transmission service requires that the local incremental logs of the source database be retained for more than 48 hours.

    If you have selected Full Migration and Incremental Synchronization when you created the data migration project, the data transmission service requires that the local incremental logs of the source database be retained for at least 7 days. If the data transmission service cannot obtain incremental logs, the data migration project may fail, or the data in the source and destination databases may even be inconsistent after migration.
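
The data inconsistency described above for full-column matching can be reproduced on a small scale. The following Python sketch uses the standard-library sqlite3 module purely as a stand-in for both databases. The helper names are hypothetical, and the single-row delete at the source relies on SQLite's rowid, which is an assumption of this demo rather than how either product identifies rows.

```python
import sqlite3


def make_table(rows):
    """Create an in-memory table t1 (c1, c2) without a primary key and load rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t1 (c1 INTEGER, c2 INTEGER)")
    conn.executemany("INSERT INTO t1 VALUES (?, ?)", rows)
    return conn


def count_rows(conn):
    """Return the number of rows currently in t1."""
    return conn.execute("SELECT COUNT(*) FROM t1").fetchone()[0]


def delete_by_full_column_match(conn, c1, c2):
    """Delete by matching every column and return the number of deleted rows."""
    cur = conn.execute("DELETE FROM t1 WHERE c1 = ? AND c2 = ?", (c1, c2))
    return cur.rowcount


def demo():
    """Delete one duplicate at the source, replay it at the destination, return both counts."""
    rows = [(1, 2), (1, 2)]
    source, destination = make_table(rows), make_table(rows)
    # Delete exactly one of the duplicates at the source (rowid is SQLite-specific).
    source.execute("DELETE FROM t1 WHERE rowid = (SELECT MIN(rowid) FROM t1)")
    # Replay the change at the destination by matching all columns.
    delete_by_full_column_match(destination, 1, 2)
    return count_rows(source), count_rows(destination)
```

In this sketch, demo() leaves one row at the source and none at the destination, which mirrors the t1 example: a single logical delete removes every matching duplicate when no primary key or non-null unique key distinguishes the rows.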

Supported source and destination instances

In the following table, OB_Oracle stands for an Oracle tenant of OceanBase Database.

| Source | Destination |
| --- | --- |
| PostgreSQL (ApsaraDB RDS instance) | OB_Oracle (OceanBase cluster instance) |

Data type mappings

| ApsaraDB RDS for PostgreSQL instance | Oracle tenant of OceanBase Database |
| --- | --- |
| integer | NUMBER(10) |
| smallint | NUMBER(5) |
| bigint | NUMBER(20) |
| decimal | NUMBER(p,s) |
| numeric | NUMBER(p,s) |
| real | BINARY_FLOAT |
| double precision | BINARY_DOUBLE |
| smallserial | NUMBER(5) |
| serial | NUMBER(10) |
| bigserial | NUMBER(20) |
| char | CHAR(n). Note: The default length and maximum length of a column of the CHAR data type are 1 byte and 2,000 bytes, respectively. |
| varchar | VARCHAR2(n) |
| text | CLOB |
| timestamp | TIMESTAMP(p) |
| timestamp with time zone | TIMESTAMP [(p)] WITH TIME ZONE |
| time | DATE |
| time with time zone | TIMESTAMP [(p)] WITH TIME ZONE |
| boolean | NUMBER(1) |
| bytea | BLOB |
| citext | CLOB |
| tsvector | CLOB |
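
As a rough aid when you review schemas, the mappings above can be expressed as a lookup. The following Python sketch covers only part of the table. The function name map_pg_type and the choice to return None for anything the sketch does not cover (including parameterized types written without arguments) are assumptions for illustration.

```python
import re

# Subset of the mapping table above for types without arguments.
FIXED_TYPES = {
    "integer": "NUMBER(10)",
    "smallint": "NUMBER(5)",
    "bigint": "NUMBER(20)",
    "real": "BINARY_FLOAT",
    "double precision": "BINARY_DOUBLE",
    "text": "CLOB",
    "boolean": "NUMBER(1)",
    "bytea": "BLOB",
}

# Types whose arguments, such as (p,s) or (n), carry over to the destination type.
PARAMETERIZED_TYPES = {
    "decimal": "NUMBER",
    "numeric": "NUMBER",
    "char": "CHAR",
    "varchar": "VARCHAR2",
}

# A type name followed by optional numeric arguments, for example "numeric(10, 2)".
_TYPE_RE = re.compile(r"^\s*([a-z ]+?)\s*(\(\s*\d+\s*(?:,\s*\d+\s*)?\))?\s*$")


def map_pg_type(pg_type):
    """Return the destination type for a PostgreSQL type, or None if not covered here."""
    match = _TYPE_RE.match(pg_type.lower())
    if not match:
        return None
    base, args = match.group(1), match.group(2)
    if args is None:
        return FIXED_TYPES.get(base)
    target = PARAMETERIZED_TYPES.get(base)
    if target is None:
        return None
    # Drop whitespace inside the argument list: "(10, 2)" becomes "(10,2)".
    return target + re.sub(r"\s+", "", args)
```

For example, map_pg_type("numeric(10, 2)") yields "NUMBER(10,2)" and map_pg_type("double precision") yields "BINARY_DOUBLE". Lookups are case-insensitive, so "Integer" and "integer" are treated the same.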

Procedure

  1. Log on to the ApsaraDB for OceanBase console and purchase a data migration project.

    For more information, see Purchase a data migration project.

  2. Choose Data Transmission > Data Migration. On the page that appears, click Configure for the data migration project.

    If you want to reference the configurations of an existing project, click Reference Configuration. For more information, see Reference and clear data migration project configurations.

  3. On the Select Source and Destination page, configure the related parameters.

    Parameter

    Description

    Migration Project Name

    We recommend that you set it to a combination of digits and letters. It must not contain any spaces and cannot exceed 64 characters in length.

    Tag (Optional)

    Click the field and select a target tag from the drop-down list. You can also click Manage Tags to create, modify, and delete tags. For more information, see Use tags to manage data migration projects.

    Source

    If you have created a PostgreSQL data source, select it from the drop-down list. If not, click New Data Source in the drop-down list to create one in the dialog box on the right side. For more information, see Create a PostgreSQL data source.

    Destination

    If you have created an Oracle tenant in OceanBase Database as a data source, select it from the drop-down list. If not, click New Data Source in the drop-down list to create one in the dialog box on the right side. For more information about the parameters, see Create an OceanBase data source.

    Important

    If the destination is an Oracle tenant of OceanBase Database, Instance Type cannot be set to Self-Managed Database in VPC.

  4. Click Next. On the Select Migration Type page, specify the migration types of the current data migration project.

    Options available for Migration Type are Schema Migration, Full Migration, Incremental Synchronization, Full Verification, and Reverse Incremental Migration.

    Migration type

    Description

    Schema migration

    After a schema migration task is started, the data transmission service migrates the definitions of database objects (such as tables, indexes, constraints, comments, and views) from the source database to the destination database and automatically filters out temporary tables.

    Full migration

    After a full migration task is started, the data transmission service will migrate existing data from tables in the source database to corresponding tables in the destination database.

    Incremental synchronization

    After an incremental synchronization task is started, the data transmission service synchronizes changed data (data that is added, modified, or removed) from the source database to corresponding tables in the destination database.

    DML Synchronization is supported for Incremental Synchronization. You can select operations as needed. For more information, see Configure DDL/DML synchronization.

    Full verification

    After the full migration and incremental synchronization tasks are completed, the data transmission service automatically initiates a full verification task to verify the tables in the source and destination databases.

    Note
    • If you have selected Incremental Synchronization but did not select all DML operations in the DML Synchronization section, you cannot select Full Verification.

    • The data transmission service supports full verification only for tables with primary keys or non-null unique keys.

    Reverse incremental migration

    Data changes made in the destination database after the business database switchover are synchronized to the source database in real time through reverse incremental migration.

    Generally, incremental synchronization configurations are reused for reverse incremental migration. You can also customize the configurations for reverse incremental migration as needed.

  5. Click Next. On the Select Migration Objects page, specify the migration objects for the data migration project.

    You can select Specify Objects or Match Rules to specify the migration objects.

    Important
    • The names of tables to be migrated, as well as the names of columns in the tables, must not contain Chinese characters.

    • If a database or table name contains double dollar signs ($$), you cannot create the migration project.

    • If you select Specify Objects, select the objects to be migrated on the left and click > to add them to the list on the right. You can select tables and views of one or more databases as the migration objects.

      The data transmission service allows you to import objects from text files, rename destination objects, specify settings, and remove a single migration object or all migration objects.

      Note

      When you select Match Rules to specify migration objects, object renaming is implemented based on the syntax of the specified matching rules. In the operation area, you can only set filtering conditions. For more information, see Configure matching rules.

      Operation

      Description

      Import objects

      1. In the list on the right, click Import Objects in the upper-right corner.

      2. In the dialog box that appears, click OK.

        Important

        This operation will overwrite previous selections. Proceed with caution.

      3. In the Import Objects dialog box, import the objects to be migrated.

        You can import CSV files to rename databases or tables and set row filtering conditions. For more information, see Download and import the settings of migration objects.

      4. Click Validate.

        After you import the migration objects, check their validity. Column field mapping is not supported at present.

      5. After the validation succeeds, click OK.

      Rename an object

      The data transmission service allows you to rename migration objects. For more information, see Rename a database table.

      Configure settings

      The data transmission service allows you to filter rows by using WHERE conditions. For more information, see Use SQL conditions to filter data.

      You can also view column information of the migration objects in the View Columns section.

      Remove one or all objects

      The data transmission service allows you to remove a single object or all migration objects that are added to the right-side list during data mapping.

      • Remove a single migration object

        In the list on the right, move the pointer over the object that you want to remove, and click Remove to remove the migration object.

      • Remove all migration objects

        In the list on the right, click Remove All in the upper-right corner. In the dialog box that appears, click OK to remove all migration objects.

    • Select Match Rules. For more information, see Configure matching rules.

  6. Click Next. On the Migration Options page, configure the parameters.

    • Full migration

      The following table describes the parameters for full migration, which are displayed only if you have selected Full Migration on the Select Migration Type page.

      Parameter

      Description

      Read Concurrency Configuration

      The concurrency for reading data from the source during full migration. The maximum value is 512. A high read concurrency may incur excessive stress on the source, affecting the business.

      Write Concurrency Configuration

      The concurrency for writing data to the destination during full migration. The maximum value is 512. A high write concurrency may incur excessive stress on the destination, affecting the business.

      Full Data Migration Rate Limit

      You can choose whether to limit the full migration rate as needed. If you choose to limit the full migration rate, you must specify the records per second (RPS) and bytes per second (BPS). The RPS specifies the maximum number of data rows migrated to the destination per second during full migration, and the BPS specifies the maximum amount of data in bytes migrated to the destination per second during full migration.

      Note

      The RPS and BPS values specified here are only for throttling. The actual full migration performance is subject to factors such as the settings of the source and destination and the instance specifications.

      Processing Strategy When Destination Table Has Records

      • If you select Ignore, when the data to be inserted conflicts with existing data of a destination table, the data transmission service logs the conflicting data while retaining the existing data.

        Important

        If you select Ignore, data is pulled in IN mode during full verification. In this case, verification is inapplicable if the destination contains data that does not exist in the source, and the verification performance is downgraded.

      • If you select Stop Migration and a destination table contains records, an error is reported during full migration, indicating that the migration is not supported. In this case, you must process the data in the destination table and then continue with the migration.

        Important

        If you click Resume in the dialog box prompting the error, the data transmission service ignores this error and continues to migrate data. Proceed with caution.

      Whether to Allow Post-indexing

      Specifies whether to create indexes after the full migration is completed. Post-indexing can shorten the time required for full migration. For more information about the considerations on post-indexing, see the description below.

      Important
      • This parameter is displayed only if both Schema Migration and Full Migration are selected on the Select Migration Type page.

      • Only non-unique key indexes can be created after the migration is completed.

      • If the "name is already used by an existing object" error occurs in the destination Oracle tenant of OceanBase Database during indexing, the data transmission service ignores the error and treats the index as already created, without creating it again.

      If post-indexing is allowed, we recommend that you adjust the parameter settings based on the hardware conditions of OceanBase Database and the business traffic.

      • If you use OceanBase Database V4.x, adjust the settings of the following parameters of the sys tenant and business tenants by using a command-line interface (CLI) client.

        • Adjust the parameter settings of the sys tenant

          -- parallel_servers_target specifies the queue condition for parallel queries on each server.
          -- To maximize performance, we recommend that you set this parameter to a value greater than the number of physical CPU cores, for example, 1.5 times that number. In addition, make sure that the value does not exceed 64, to prevent database kernels from contending for locks.
          set global parallel_servers_target = 64;

        • Adjust the parameter settings of a business tenant

          -- Specify the limit on the file memory buffer size.
          alter system set _temporary_file_io_area_size = '10' tenant = 'xxx';
          -- Disable throttling in V4.x.
          alter system set sys_bkgd_net_percentage = 100;

      • If you use OceanBase Database V3.x, adjust the settings of the following parameters of the sys tenant by using a CLI client.

        -- parallel_servers_target specifies the queue condition for parallel queries on each server.
        -- To maximize performance, we recommend that you set this parameter to a value greater than the number of physical CPU cores, for example, 1.5 times that number. In addition, make sure that the value does not exceed 64, to prevent database kernels from contending for locks.
        set global parallel_servers_target = 64;
        -- data_copy_concurrency specifies the maximum number of concurrent data migration and replication tasks allowed in the system.
        alter system set data_copy_concurrency = 200;

    • Incremental synchronization

      The following table describes the parameters for incremental synchronization, which are displayed only if you have selected Incremental Synchronization on the Select Migration Type page.

      Parameter

      Description

      Write Concurrency Configuration

      The concurrency for writing data to the destination during incremental synchronization. The maximum value is 512. A high write concurrency may incur excessive stress on the destination, affecting the business.

      Incremental Synchronization Rate Limit

      You can choose whether to limit the incremental synchronization rate as needed. If you choose to limit the incremental synchronization rate, you must specify the RPS and BPS. The RPS specifies the maximum number of data rows synchronized to the destination per second during incremental synchronization, and the BPS specifies the maximum amount of data in bytes synchronized to the destination per second during incremental synchronization.

      Note

      The RPS and BPS values specified here are only for throttling. The actual incremental synchronization performance is subject to factors such as the settings of the source and destination and the instance specifications.

      Incremental Synchronization Start Timestamp

      This parameter is not displayed when Full Migration is selected on the Select Migration Type page.

      When you migrate data from an ApsaraDB RDS for PostgreSQL instance to an Oracle tenant of OceanBase Database, the default start timestamp for incremental synchronization is the time when incremental synchronization is started, and cannot be modified.

    • Reverse incremental migration

      The following table describes the parameters for reverse incremental migration, which are displayed only if you have selected Reverse Incremental Migration on the Select Migration Type page. By default, incremental synchronization configurations are reused for reverse incremental migration.

      You can choose not to reuse the incremental synchronization configurations and configure reverse incremental migration as needed.

      Parameter

      Description

      Write Concurrency Configuration

      The concurrency for writing data to the source during reverse incremental migration. The maximum value is 512. A high concurrency may incur excessive stress on the source, thereby affecting the business.

      Incremental Synchronization Rate Limit

      You can choose whether to limit the reverse incremental migration rate as needed. If you choose to limit the reverse incremental migration rate, you must specify the RPS and BPS. The RPS specifies the maximum number of data rows synchronized to the source per second during reverse incremental migration, and the BPS specifies the maximum amount of data in bytes synchronized to the source per second during reverse incremental migration.

      Note

      The RPS and BPS values specified here are only for throttling. The actual reverse incremental migration performance is subject to factors such as the settings of the source and destination and the instance specifications.

      Incremental Synchronization Start Timestamp

      By default, the forward switchover start timestamp (if any) prevails. This parameter cannot be modified.

    • Advanced migration configuration

      This section is displayed only if the destination is an Oracle tenant of OceanBase Database and you have selected Schema Migration on the Select Migration Type page.

      This parameter specifies the storage type for destination table objects during schema migration or incremental synchronization. The storage types supported for destination table objects are Default, Row storage, Column storage, and Hybrid columnar storage. For more information, see default_table_store_format.

      Note

      The value Default means that other parameters are automatically set based on the parameter configurations of the destination. For table objects in schema migration, the schemas are subject to the specified storage type.

  7. Click Precheck to start a precheck on the data migration project.

    During the precheck, the data transmission service checks the read and write privileges of the database users and the network connections of the databases. The data migration project can be started only after it passes all check items. If an error is returned during the precheck, you can perform the following operations:

    • Identify and troubleshoot the problem and then perform the precheck again.

    • Click Skip in the Actions column of the failed precheck item. In the dialog box that prompts the consequences of the operation, click OK.

  8. After the precheck succeeds, click Start Project.

    If you do not need to start the project now, click Save. After that, you can only manually start the project or start it in a batch operation on the Migration Projects page. For more information about batch operations, see Perform batch operations on data migration projects. After the data migration project is started, it will be executed based on the selected migration types. For more information, see View details of a data migration project.

    The data transmission service allows you to remove migration objects during data migration. For more information, see Remove migration objects.

    Note

    You cannot add migration objects during data migration from an ApsaraDB RDS for PostgreSQL instance to an Oracle tenant of OceanBase Database.
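
The RPS and BPS limits described for the migration options in step 6 can be reasoned about with simple arithmetic. The following Python sketch assumes that both limits apply at the same time and that the stricter one governs. The source does not state how the service enforces them, and the function name min_transfer_seconds is hypothetical. Because actual performance also depends on the source, the destination, and the instance specifications, the result is only a lower bound.

```python
def min_transfer_seconds(rows, total_bytes, rps_limit=None, bps_limit=None):
    """Return a lower bound, in seconds, on the time needed to move the data.

    rows and total_bytes describe the data volume. rps_limit and bps_limit are
    the configured rate limits. A limit of None means that no limit is set.
    """
    bounds = [0.0]
    if rps_limit:
        bounds.append(rows / rps_limit)
    if bps_limit:
        bounds.append(total_bytes / bps_limit)
    # Under the stated assumption, the stricter limit determines the bound.
    return max(bounds)
```

For example, moving 1,000,000 rows that total 2,000,000,000 bytes with an RPS of 10,000 and a BPS of 10,000,000 takes at least 200 seconds under these assumptions, because the byte limit (200 seconds) is stricter than the row limit (100 seconds).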

References