Key considerations for migrating an Oracle database to a PolarDB cluster

Updated at: 2025-01-03 12:32

This topic describes the key points that you must consider before you migrate an Oracle database to a PolarDB cluster. Carefully read this topic before the migration to ensure a smooth migration process.

Sequence value synchronization

Question

Are sequence values in the source Oracle database automatically synchronized to the destination PolarDB cluster during the migration? How do I synchronize the sequence values from the Oracle database to the PolarDB cluster?

Analysis

PolarDB does not automatically synchronize sequence values from the source database. PolarDB fully supports logical replication as defined by the PostgreSQL protocol, but sequence values are not within the scope of PostgreSQL logical replication.

  • During the initial schema migration, a one-time snapshot of the sequence values (for example, Q1) is synchronized to the PolarDB cluster. After the initial migration, data continues to be added or updated in the Oracle database, so the sequence values in the Oracle database may increase to Q2 while the sequence values in the PolarDB cluster remain at Q1. However, the table data in the destination PolarDB cluster is synchronized with the source Oracle database in real time.

  • If you then insert a new record into the PolarDB cluster by using a value generated from the stale sequence (a value between Q1 and Q2), a duplicate key error occurs because the same sequence number has already been used in the Oracle database.

Solutions
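
As a minimal sketch (an assumption, not an official DTS procedure), one common approach is to stop writes to the Oracle database after the incremental migration catches up, read the last value issued by each sequence in the Oracle database, and then advance the corresponding sequence in the PolarDB cluster so that newly generated values do not collide with values already used in Oracle. ORDER_SEQ, order_seq, and the value 20000 below are placeholders.

-- In the Oracle database: read the last value issued by the sequence.
SELECT last_number FROM user_sequences WHERE sequence_name = 'ORDER_SEQ';

-- In the PolarDB cluster: advance the sequence past that value so that new
-- inserts do not reuse sequence numbers already consumed in Oracle.
SELECT setval('order_seq', 20000);   -- replace 20000 with the value read above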

Encoding settings

Question

The Oracle database may use GBK or UTF-8 encoding. Which encoding do I use for the PolarDB cluster?

Solution

An Oracle database may use the GBK or UTF-8 character set. To migrate an Oracle database to a PolarDB cluster, we recommend that you specify the UTF-8 character set for the PolarDB cluster. The UTF-8 character set has better compatibility and provides the following advantages:

  • Avoids the risk of encoding conflicts: We recommend that you do not use GBK as the character set of the PolarDB cluster. Some GBK characters share the same encoding range as ASCII codes, which may result in encoding errors during data migration.

  • Prevents the insertion of garbled data: Oracle allows the insertion of invalid GBK-encoded data, which may be read back as garbled text. PolarDB does not support the insertion of invalid GBK encoding. If you try to insert invalid GBK encoding into PolarDB, the data may be rejected.

Therefore, we recommend that you use UTF-8 encoding to ensure a smooth and error-free migration from the Oracle database to the PolarDB cluster.
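
For reference, you can confirm the character set of the PolarDB cluster from a client connection. The following is a minimal sketch that uses the standard PostgreSQL SHOW command; it is not part of the migration procedure.

-- Check the character set of the current database in the PolarDB cluster.
SHOW server_encoding;   -- UTF8 is expected for a UTF-8 cluster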

Case sensitivity of table names

Question

By default, table names and column names are in uppercase in an Oracle database and in lowercase in a PolarDB cluster. How do I address the differences in case sensitivity when I migrate data from an Oracle database to a PolarDB cluster?

Solution

By default, Oracle converts all unquoted names, such as schema names, table names, and column names, to uppercase. By default, PolarDB for PostgreSQL (Compatible with Oracle) converts all unquoted names to lowercase. When you migrate data from an Oracle database to a PolarDB for PostgreSQL (Compatible with Oracle) cluster, DTS handles the difference in the following ways based on your business requirements:

  • Convert unquoted names from uppercase to lowercase

    • Applicable scenario: Schema names, table names, and column names in the Oracle database are not quoted.

    • Solution: DTS automatically converts all unquoted names to lowercase during the migration to the PolarDB for PostgreSQL (Compatible with Oracle) cluster. After the migration, you can continue to use the names without double quotation marks. No changes or special configurations are required.

  • Keep the original case of quoted names

    • Applicable scenario: Table and column names use mixed case and are enclosed in double quotation marks in the Oracle database.

    • Solution: DTS keeps the original case of the names during the migration. Make sure that name mappings that would change the case of the names are disabled in DTS. After the migration to the PolarDB for PostgreSQL (Compatible with Oracle) cluster, you can continue to use the names with double quotation marks. No changes or special configurations are required.

  • Allow case-insensitive queries

    By default, you can query a table whose name is in uppercase in a PolarDB cluster by using the table name in uppercase or lowercase.
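
The following sketch illustrates the case-folding behavior in a PolarDB for PostgreSQL (Compatible with Oracle) cluster. The table names orders and MixedCase are hypothetical and are used only for illustration.

-- Unquoted names are folded to lowercase, so these statements refer to the same table.
CREATE TABLE orders (id INT);
SELECT * FROM ORDERS;        -- resolved to the table named orders
SELECT * FROM orders;        -- resolved to the table named orders

-- Quoted names keep their original case. Continue to use them with double quotation marks.
CREATE TABLE "MixedCase" (id INT);
SELECT * FROM "MixedCase";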

CHAR and VARCHAR data length semantics

Question

Does VARCHAR(20) specify the same data length in Oracle and PolarDB?

Analysis

The VARCHAR(20) data type is implemented differently in Oracle and PolarDB.

  • In an Oracle database, the CHAR and VARCHAR data types specify lengths in bytes.

  • In a PolarDB for PostgreSQL (Compatible with Oracle) cluster, the CHAR and VARCHAR data types specify lengths in characters by default.

  • The PolarDB for PostgreSQL (Compatible with Oracle) cluster allows you to use the polar_default_char_length_semantics parameter to change the default length semantics of CHAR and VARCHAR data types. By default, the parameter value is OFF and the length of CHAR and VARCHAR data types is measured in characters. If you set the parameter to ON, the length of CHAR and VARCHAR data types is measured in bytes.

    Note

    You can configure the polar_default_char_length_semantics parameter in the PolarDB console. For more information, see Configure cluster parameters.
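
    As a quick check, you can view the current value of the parameter from a client connection. The following is a minimal sketch that uses the standard PostgreSQL SHOW command and assumes that the parameter is visible to your account.

    -- Display the current length semantics of CHAR and VARCHAR columns.
    SHOW polar_default_char_length_semantics;   -- OFF: characters, ON: bytes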

Solution

  • When the polar_default_char_length_semantics parameter is set to ON:

    • Scenario: In the Oracle database, a CHAR(10) column uses the GBK character set. In the PolarDB for PostgreSQL (Compatible with Oracle) cluster, the corresponding CHAR(10) column uses the UTF-8 character set.

    • Issue: During the forward synchronization from the Oracle database to the PolarDB for PostgreSQL (Compatible with Oracle) cluster, the string "测试测试测" is written into the CHAR(10) column in the Oracle database. In GBK, the string occupies 2 × 5 = 10 bytes, which is exactly the maximum length of the column. However, in the UTF-8 character set used by the PolarDB for PostgreSQL (Compatible with Oracle) cluster, the same string occupies 3 × 5 = 15 bytes, which exceeds the maximum number of bytes of the column. An error is reported. A character-versus-byte check is shown in the sketch after this list.

    • Solutions

      Solution 1: Set the polar_default_char_length_semantics parameter to OFF to ensure that no related errors occur during the migration.

      Solution 2: Change the length of the column in the PolarDB for PostgreSQL (Compatible with Oracle) cluster to a larger size.

  • When the polar_default_char_length_semantics parameter is set to OFF:

    • Scenario: In the Oracle database, a CHAR(10) column uses the GBK character set. In the PolarDB for PostgreSQL (Compatible with Oracle) cluster, the corresponding CHAR(10) column uses the UTF-8 character set.

    • Issue: During the reverse synchronization from the PolarDB for PostgreSQL (Compatible with Oracle) cluster to the Oracle database, the string "测试" is written into the CHAR(10) column in the PolarDB for PostgreSQL (Compatible with Oracle) cluster. The column pads the string with spaces, so it stores 2 + 8 = 10 characters (the two Chinese characters plus 8 padding spaces). When the string is converted to bytes, it occupies 2 × 3 + 8 = 14 bytes, which exceeds the length of the CHAR(10) column in the Oracle database. A data truncation error is reported.

    • Solutions

      Solution 1: Change the length of the column in the Oracle database to a larger size.

      Solution 2: Configure an extract, transform, load (ETL) task in DTS to truncate the excess characters. For more information, see Configure ETL in a data migration or synchronization task.
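
The character and byte counts in the scenarios above can be verified in the PolarDB for PostgreSQL (Compatible with Oracle) cluster by using the standard PostgreSQL char_length and octet_length functions. The following is a minimal sketch that assumes a UTF-8 cluster.

-- 5 Chinese characters occupy 5 characters, but 15 bytes under UTF-8 encoding.
SELECT char_length('测试测试测') AS characters,
       octet_length('测试测试测') AS bytes;
--  characters | bytes
-- ------------+-------
--           5 |    15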

Table without a primary key or unique key

Question

What do I do if a table without a primary key or unique key exists in the Oracle database? Is data validation affected?

Solution

If a table does not have a primary key or unique key, data consistency cannot be guaranteed during the migration process, which may result in duplicate entries in the destination cluster. You can use one of the following solutions based on your business requirements:

  • If data consistency is not important to your business, or you plan to perform deduplication after the migration, you can proceed with the migration process.

  • If data consistency is important to your business, you can use the ROWID column in the Oracle database as the hidden primary key (non-empty unique key) of the destination cluster. Log on to the DTS console and set the Add a hidden primary key for a table without a primary key/a table without a unique key parameter to Yes. For more information, see Migrate data from a self-managed Oracle database to a PolarDB for PostgreSQL (Compatible with Oracle) cluster. After the migration task is complete, click the corresponding button to delete the hidden primary key. Then, you can perform data validation. Take note that tables without a primary key do not support data validation.
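
To identify the affected tables before the migration, you can run a query similar to the following in the source Oracle database. This is a minimal sketch that assumes access to the ALL_TABLES and ALL_CONSTRAINTS data dictionary views; APP_SCHEMA is a placeholder schema name.

-- List tables in the schema that have neither a primary key nor a unique constraint.
SELECT t.owner, t.table_name
  FROM all_tables t
 WHERE t.owner = 'APP_SCHEMA'
   AND NOT EXISTS (
         SELECT 1
           FROM all_constraints c
          WHERE c.owner = t.owner
            AND c.table_name = t.table_name
            AND c.constraint_type IN ('P', 'U'));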

Handling of \0 characters

Question

How do I handle hidden \0 characters in the Oracle database?

Solution

PolarDB for PostgreSQL (Compatible with Oracle) does not support \0 characters. DTS removes \0 characters during the migration process.

  • If a common field contains \0 characters, the removal of the characters may result in data discrepancies before and after the migration.

  • If a primary key field contains \0 characters, the removal of the characters may lead to data conflicts. For example, you have two records with primary key values dts\0 and dts. After the migration, the two key values are transformed to dts, which results in a data conflict.
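
To locate records that contain hidden \0 characters before the migration, you can run a query similar to the following in the Oracle database. This is a minimal sketch; orders and order_name are hypothetical table and column names.

-- CHR(0) is the \0 character. List rows whose column value contains it.
SELECT order_name
  FROM orders
 WHERE INSTR(order_name, CHR(0)) > 0;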

Handling of empty strings

Question

In an Oracle database, an empty string ('') is treated as a NULL value. Does PolarDB support this behavior?

Solution

According to the SQL standard, an empty string is distinct from a NULL value. However, Oracle treats empty strings as NULL values. PolarDB supports this behavior of Oracle by using the polar_empty_string_is_null_enable parameter. By default, the parameter is set to ON and empty strings are treated as NULL values in a PolarDB cluster. If you no longer need the feature, you can disable it in the PolarDB console. For more information, see Configure cluster parameters.
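
The following sketch shows the effect of the parameter in a PolarDB for PostgreSQL (Compatible with Oracle) cluster. The output shown in the comment is an assumption based on the behavior described above.

-- With polar_empty_string_is_null_enable set to ON (the default),
-- an empty string is treated as a NULL value, which matches the Oracle behavior.
SELECT '' IS NULL AS empty_is_null;   -- expected to return t (true) when the parameter is ON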

Time zone settings

Question

How do I set the time zone for a PolarDB cluster? What are the impacts of the time zone setting?

Solution

By default, the time zone of a PolarDB cluster is UTC (Coordinated Universal Time). All timestamps stored in or retrieved from the cluster are based on UTC.

For example, you query the current date and time of the cluster at 18:13:34 on December 3, 2024 (UTC+8).

SELECT * FROM now();

The following result is returned:

              now               
--------------------------------
 2024-12-03 10:13:34.018557 +00
(1 row)

If you want to use UTC+8 time in a PolarDB cluster, set the timezone parameter to PRC in the PolarDB console. For more information, see Configure cluster parameters. Then, query the current time again.

SELECT * FROM now();

The following result is returned:

              now               
--------------------------------
 2024-12-03 18:14:34.841027 +08
(1 row)
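
Before you change the cluster parameter in the console, you can also verify the behavior at the session level. The following is a minimal sketch that uses the standard PostgreSQL SET command; it affects only the current connection and is not a substitute for the console procedure.

-- Switch the current session to UTC+8 and query the time again.
SET timezone = 'PRC';
SELECT now();   -- the result is now displayed with a +08 offset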

INTEGER data type precision

Question

In an Oracle database, the INTEGER type is equivalent to NUMBER(38), which can store integers with up to 38 digits. In a PolarDB for PostgreSQL (Compatible with Oracle) cluster, the INTEGER type is a 32-bit integer. How do I address the difference in INTEGER data type precision between the source Oracle database and the destination PolarDB cluster?

Solution

To prevent integer out of range errors in the PolarDB cluster, replace CAST(val AS INTEGER) in queries with CAST(val AS NUMBER(38)).

In the following example, the number 99999999999 exceeds the 32-bit integer limit of the PolarDB for PostgreSQL (Compatible with Oracle) cluster. This results in an integer out of range error on the PolarDB for PostgreSQL (Compatible with Oracle) cluster. In contrast, the Oracle database can correctly handle the number.

SELECT CAST(99999999999 AS INTEGER) FROM dual;

  • Sample result for the Oracle database:

       CAST(99999999999ASINTEGER)
    _____________________________
                      99999999999
  • Sample result for the PolarDB for PostgreSQL (Compatible with Oracle) cluster:

    ERROR:  integer out of range

To avoid the error, change the query as follows in the PolarDB for PostgreSQL (Compatible with Oracle) cluster.

SELECT CAST(99999999999 AS NUMBER(38)) FROM dual;

Sample result for the PolarDB for PostgreSQL (Compatible with Oracle) cluster:

   numeric   
-------------
 99999999999
(1 row)