Data Transmission Service: Migrate data from a self-managed Oracle database to a Message Queue for Apache Kafka instance

Last Updated: Oct 17, 2024

You can use Data Transmission Service (DTS) to migrate data from a self-managed Oracle database to a Message Queue for Apache Kafka instance or a self-managed Kafka cluster. The data migration feature allows you to extend message processing capabilities. This topic describes how to migrate data from a self-managed Oracle database to a Message Queue for Apache Kafka instance.

Prerequisites

  • The version number of the self-managed Oracle database is 9i, 10g, 11g, 12c, 18c, or 19c.

  • Supplemental logging, SUPPLEMENTAL_LOG_DATA_PK, and SUPPLEMENTAL_LOG_DATA_UI are enabled for the self-managed Oracle database. For more information, see Supplemental Logging.

  • The self-managed Oracle database is running in ARCHIVELOG mode. Archived log files are accessible and a suitable retention period is set for archived log files. For more information, see Managing Archived Redo Log Files.

  • The network environment is deployed for the source self-managed Oracle database. For more information, see Preparation overview.

  • The tables to be migrated from the self-managed Oracle database contain primary keys or UNIQUE NOT NULL indexes.

  • The version number of the Message Queue for Apache Kafka instance is in the range from 0.10.1.0 to 2.x. The version number of the self-managed Kafka cluster is in the range from 0.10.1.0 to 2.7.0.

  • The available storage space of the destination Message Queue for Apache Kafka instance is larger than the total size of the data in the self-managed Oracle database.

  • In the destination Kafka instance, a topic is created to receive the migrated data. For more information, see Create a topic.
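Several of the source-side prerequisites above can be spot-checked with queries against the Oracle data dictionary. The following is a minimal sketch, assuming an account that can read the V$ and DBA views; MY_SCHEMA is a hypothetical schema name that stands in for the schema to be migrated.

```sql
-- Database version (the prerequisites list 9i, 10g, 11g, 12c, 18c, and 19c).
select banner from v$version;

-- LOG_MODE should be ARCHIVELOG.
-- SUPPLEMENTAL_LOG_DATA_MIN, _PK, and _UI should not be NO.
select log_mode,
       supplemental_log_data_min,
       supplemental_log_data_pk,
       supplemental_log_data_ui
from   v$database;

-- Time range covered by archived logs that have not been deleted.
select min(first_time) as oldest_log,
       max(next_time)  as newest_log,
       count(*)        as log_count
from   v$archived_log
where  deleted = 'NO';

-- Approximate size, in GB, of the segments owned by the schema to be migrated.
-- MY_SCHEMA is a placeholder, not a name from this topic.
select round(sum(bytes) / 1024 / 1024 / 1024, 2) as size_gb
from   dba_segments
where  owner = 'MY_SCHEMA';
```

The size query gives only a rough lower bound for the storage that the destination instance needs, because data written to Kafka is serialized differently than it is stored in Oracle.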

Usage notes

  • DTS uses read and write resources of the source and destination databases during full data migration, which may increase the load on the database servers. If database performance is poor, the specifications are low, or the data volume is large, database services may become unavailable. For example, DTS occupies a large amount of read and write resources in the following cases: a large number of slow SQL queries are performed on the source database, the tables have no primary keys, or a deadlock occurs in the destination database. Before you migrate data, evaluate the impact of data migration on the performance of the source and destination databases. We recommend that you migrate data during off-peak hours, for example, when the CPU utilization of the source and destination databases is less than 30%.

  • If a data migration task fails and stops, DTS automatically resumes the task. Before you switch your workloads to the destination database, stop or release the data migration task. Otherwise, the data from the source database overwrites the data in the destination database after the task is resumed.

  • If the self-managed Oracle database is deployed in a Real Application Cluster (RAC) architecture and is connected to DTS over an Alibaba Cloud virtual private cloud (VPC), you must connect the Single Client Access Name (SCAN) IP address of the Oracle RAC and the virtual IP address (VIP) of each node to the VPC and configure routes. The settings ensure that your DTS task can run as expected. For more information, see Connect a data center to DTS by using VPN Gateway.

    Important

    When you configure the source Oracle database in the DTS console, you can specify the SCAN IP address of the Oracle RAC as the database endpoint or IP address.

  • If the version of your Oracle database is 12c or later, the names of the tables to be migrated cannot exceed 30 bytes in length.

  • The tables to be migrated in the source database must have PRIMARY KEY or UNIQUE constraints and all fields must be unique. Otherwise, the destination database may contain duplicate data records.
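The last two usage notes can be checked before you configure the task. The following is a sketch, assuming the tables to be migrated belong to one schema; MY_SCHEMA is a hypothetical placeholder. The second query looks only at declared constraints, so a table that relies on a unique index without a constraint is also listed and needs a manual review.

```sql
-- Tables whose names are longer than 30 bytes.
select table_name, lengthb(table_name) as name_bytes
from   all_tables
where  owner = 'MY_SCHEMA'
and    lengthb(table_name) > 30;

-- Tables that have neither a PRIMARY KEY (P) nor a UNIQUE (U) constraint.
select t.table_name
from   all_tables t
where  t.owner = 'MY_SCHEMA'
and    not exists (
         select 1
         from   all_constraints c
         where  c.owner = t.owner
         and    c.table_name = t.table_name
         and    c.constraint_type in ('P', 'U')
       );
```

If either query returns rows, review those tables before you select them as objects to be migrated.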

Billing rules

Migration type | Task configuration fee | Internet traffic fee
Schema migration and full data migration | Free of charge. | Charged only when data is migrated from Alibaba Cloud over the Internet. For more information, see Billing overview.
Incremental data migration | Charged. For more information, see Billing overview. | Charged only when data is migrated from Alibaba Cloud over the Internet. For more information, see Billing overview.

Migration types

Migration type

Description

Schema migration

DTS migrates the schemas of the required objects from the source database to the destination database. In this scenario, DTS can migrate only the schemas of tables.

Full data migration

DTS migrates the historical data of required objects from the source database to the destination database.

Note

During schema migration and full data migration, do not perform DDL operations on the objects to be migrated. Otherwise, the objects may fail to be migrated.

Incremental data migration

After full data migration is complete, DTS retrieves redo log files from the source Oracle database. Then, DTS migrates incremental data from the source Oracle database to the destination database in real time. Incremental data migration ensures service continuity when you migrate data between self-managed databases.

During incremental data migration, DTS can migrate DML and DDL operations.

Before you begin

Log on to the self-managed Oracle database, create an account that you want to use to collect data, and grant permissions to the account.

Note

If you have created a database account and the account has permissions that are listed in the following table, skip this step.

Database | Schema migration | Full data migration | Incremental data migration
Self-managed Oracle database | Permissions of the schema owner | Permissions of the schema owner | Database administrator (DBA)

For more information about how to create a database account and grant permissions to the database account, see the following topics:

Self-managed Oracle database: CREATE USER and GRANT

Enable logging and grant fine-grained permissions to an Oracle database account

Important

If you need to migrate incremental data from an Oracle database but the database administrator (DBA) permissions cannot be granted to the database account, you can enable archive logging and supplemental logging, and grant fine-grained permissions to the account.

  1. Enable archive logging and supplemental logging.

    Type

    Procedure

    Archive logging

    Execute the following statements to enable archive logging:

    shutdown immediate;
    startup mount;
    alter database archivelog;
    alter database open;
    archive log list;

    Supplemental logging

    Enable supplemental logging at the database or table level based on your business requirements.

    Note

    You can enable database-level supplemental logging to ensure the stability of Data Transmission Service (DTS) tasks. You can enable table-level supplemental logging to reduce the disk usage of the source Oracle database.

    • Enable database-level supplemental logging

      1. Execute the following statement to enable minimal supplemental logging:

        alter database add supplemental log data;
      2. Execute the following statement to enable primary key and unique key supplemental logging at the database level:

        alter database add supplemental log data (primary key,unique index) columns;
    • Enable table-level supplemental logging

      1. Execute the following statement to enable minimal supplemental logging:

        alter database add supplemental log data;
      2. Enable table-level supplemental logging by using one of the following methods:

        • Enable primary key supplemental logging at the table level

          alter table table_name add supplemental log data (primary key) columns;
        • Enable table-level supplemental logging for all columns

          alter table table_name add supplemental log data (all) columns;

    Force logging

    Execute the following statement to enable force logging:

    alter database force logging;
  2. Grant fine-grained permissions to an Oracle database account.

    Oracle versions 9i to 11g

    -- Create a database account named rdsdt_dtsacct and grant permissions to the account.
    create user rdsdt_dtsacct IDENTIFIED BY rdsdt_dtsacct;
    grant create session to rdsdt_dtsacct;
    grant connect to rdsdt_dtsacct;
    grant resource to rdsdt_dtsacct;
    grant execute on sys.dbms_logmnr to rdsdt_dtsacct;
    grant select on V_$LOGMNR_LOGS to rdsdt_dtsacct;
    grant select on  all_objects to rdsdt_dtsacct;
    grant select on  all_tab_cols to rdsdt_dtsacct;
    grant select on  dba_registry to rdsdt_dtsacct;
    grant select any table to rdsdt_dtsacct;
    grant select any transaction to rdsdt_dtsacct;
    -- v$log privileges
    grant select on v_$log to rdsdt_dtsacct;
    -- v$logfile privileges
    grant select on v_$logfile to rdsdt_dtsacct;
    -- v$archived_log privileges
    grant select on v_$archived_log to rdsdt_dtsacct;
    -- v$parameter privileges
    grant select on v_$parameter to rdsdt_dtsacct;
    -- v$database privileges
    grant select on v_$database to rdsdt_dtsacct;
    -- v$active_instances privileges
    grant select on v_$active_instances to rdsdt_dtsacct;
    -- v$instance privileges
    grant select on v_$instance to rdsdt_dtsacct;
    -- v$logmnr_contents privileges
    grant select on v_$logmnr_contents to rdsdt_dtsacct;
    -- system tables
    grant select on sys.USER$ to rdsdt_dtsacct;
    grant select on SYS.OBJ$ to rdsdt_dtsacct;
    grant select on SYS.COL$ to rdsdt_dtsacct;
    grant select on SYS.IND$ to rdsdt_dtsacct;
    grant select on SYS.ICOL$ to rdsdt_dtsacct;
    grant select on SYS.CDEF$ to rdsdt_dtsacct;
    grant select on SYS.CCOL$ to rdsdt_dtsacct;
    grant select on SYS.TABPART$ to rdsdt_dtsacct;
    grant select on SYS.TABSUBPART$ to rdsdt_dtsacct;
    grant select on SYS.TABCOMPART$ to rdsdt_dtsacct;
    grant select_catalog_role TO rdsdt_dtsacct;

    Oracle versions 12c to 19c that use the multitenant architecture

    -- Switch to the pluggable database (PDB). Create a database account named rdsdt_dtsacct and grant permissions to the account.
    ALTER SESSION SET container = ORCLPDB1;
    create user rdsdt_dtsacct IDENTIFIED BY rdsdt_dtsacct;
    grant create  session to rdsdt_dtsacct;
    grant connect  to rdsdt_dtsacct;
    grant resource to rdsdt_dtsacct;
    grant execute on sys.dbms_logmnr to rdsdt_dtsacct;
    grant select on  all_objects to rdsdt_dtsacct;
    grant select on  all_tab_cols to rdsdt_dtsacct;
    grant select on  dba_registry to rdsdt_dtsacct;
    grant select any table to rdsdt_dtsacct;
    grant select any transaction to rdsdt_dtsacct;
    -- v$log privileges
    grant select on v_$log to rdsdt_dtsacct;
    -- v$logfile privileges
    grant select on v_$logfile to rdsdt_dtsacct;
    -- v$archived_log privileges
    grant select on v_$archived_log to rdsdt_dtsacct;
    -- v$parameter privileges
    grant select on v_$parameter to rdsdt_dtsacct;
    -- v$database privileges
    grant select on v_$database to rdsdt_dtsacct;
    -- v$active_instances privileges
    grant select on v_$active_instances to rdsdt_dtsacct;
    -- v$instance privileges
    grant select on v_$instance to rdsdt_dtsacct;
    -- v$logmnr_contents privileges
    grant select on v_$logmnr_contents to rdsdt_dtsacct;
    grant select on sys.USER$ to rdsdt_dtsacct;
    grant select on SYS.OBJ$ to rdsdt_dtsacct;
    grant select on SYS.COL$ to rdsdt_dtsacct;
    grant select on SYS.IND$ to rdsdt_dtsacct;
    grant select on SYS.ICOL$ to rdsdt_dtsacct;
    grant select on SYS.CDEF$ to rdsdt_dtsacct;
    grant select on SYS.CCOL$ to rdsdt_dtsacct;
    grant select on SYS.TABPART$ to rdsdt_dtsacct;
    grant select on SYS.TABSUBPART$ to rdsdt_dtsacct;
    grant select on SYS.TABCOMPART$ to rdsdt_dtsacct;
    -- V$PDBS privileges
    grant select on V_$PDBS to rdsdt_dtsacct;
    grant select on v_$database to rdsdt_dtsacct;
    grant select on dba_objects to rdsdt_dtsacct;
    grant select on DBA_TAB_COMMENTS to rdsdt_dtsacct;
    grant select on dba_tab_cols to rdsdt_dtsacct;
    grant select_catalog_role TO rdsdt_dtsacct;
    
    -- Switch to CDB$ROOT, the root container of the container database (CDB). Create a database account and grant permissions to the account.
    ALTER SESSION SET container = CDB$ROOT;
    -- Create a database account named rdsdt_dtsacct and grant permissions to the account. The next statement modifies a default (hidden) parameter of the Oracle database for the current session, which is required to create the account in CDB$ROOT.
    alter session set "_ORACLE_SCRIPT"=true;
    create user rdsdt_dtsacct IDENTIFIED BY rdsdt_dtsacct;
    grant create session to rdsdt_dtsacct;
    grant connect to rdsdt_dtsacct;
    grant select on v_$logmnr_contents to rdsdt_dtsacct;
    grant LOGMINING TO rdsdt_dtsacct;
    grant EXECUTE_CATALOG_ROLE to rdsdt_dtsacct;
    grant execute on sys.dbms_logmnr to rdsdt_dtsacct;

    Oracle versions 12c to 19c that use a non-multitenant architecture

    -- Create a database account named rdsdt_dtsacct and grant permissions to the account.
    create user rdsdt_dtsacct IDENTIFIED BY rdsdt_dtsacct;
    grant create  session to rdsdt_dtsacct;
    grant connect  to rdsdt_dtsacct;
    grant resource to rdsdt_dtsacct;
    grant select on V_$LOGMNR_LOGS to rdsdt_dtsacct;
    grant select on  all_objects to rdsdt_dtsacct;
    grant select on  all_tab_cols to rdsdt_dtsacct;
    grant select on  dba_registry to rdsdt_dtsacct;
    grant select any table to rdsdt_dtsacct;
    grant select any transaction to rdsdt_dtsacct;
    grant select on v_$database to rdsdt_dtsacct;
    grant select on dba_objects to rdsdt_dtsacct;
    grant select on DBA_TAB_COMMENTS to rdsdt_dtsacct;
    grant select on dba_tab_cols to rdsdt_dtsacct;
    -- v$log privileges
    grant select on v_$log to rdsdt_dtsacct;
    -- v$logfile privileges
    grant select on v_$logfile to rdsdt_dtsacct;
    -- v$archived_log privileges
    grant select on v_$archived_log to rdsdt_dtsacct;
    -- v$parameter privileges
    grant select on v_$parameter to rdsdt_dtsacct;
    -- v$database privileges
    grant select on v_$database to rdsdt_dtsacct;
    -- v$active_instances privileges
    grant select on v_$active_instances to rdsdt_dtsacct;
    -- v$instance privileges
    grant select on v_$instance to rdsdt_dtsacct;
    -- v$logmnr_contents privileges
    grant select on v_$logmnr_contents to rdsdt_dtsacct;
    grant select on sys.USER$ to rdsdt_dtsacct;
    grant select on SYS.OBJ$ to rdsdt_dtsacct;
    grant select on SYS.COL$ to rdsdt_dtsacct;
    grant select on SYS.IND$ to rdsdt_dtsacct;
    grant select on SYS.ICOL$ to rdsdt_dtsacct;
    grant select on SYS.CDEF$ to rdsdt_dtsacct;
    grant select on SYS.CCOL$ to rdsdt_dtsacct;
    grant select on SYS.TABPART$ to rdsdt_dtsacct;
    grant select on SYS.TABSUBPART$ to rdsdt_dtsacct;
    grant select on SYS.TABCOMPART$ to rdsdt_dtsacct;
    grant LOGMINING TO rdsdt_dtsacct;
    grant EXECUTE_CATALOG_ROLE to rdsdt_dtsacct;
    grant execute on sys.dbms_logmnr to rdsdt_dtsacct;
    grant select_catalog_role TO rdsdt_dtsacct;
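After you complete the two steps above, you can confirm the result from the data dictionary. The following is a sketch, assuming the account name rdsdt_dtsacct used in the examples and a session that can read the DBA views; MY_SCHEMA is a hypothetical placeholder for the schema to be migrated. In a multitenant database, run the queries in each container in which the account was created.

```sql
-- Table-level supplemental log groups (relevant only if table-level logging was chosen).
select owner, table_name, log_group_name, log_group_type
from   dba_log_groups
where  owner = 'MY_SCHEMA';

-- System privileges granted directly to the account.
select privilege
from   dba_sys_privs
where  grantee = 'RDSDT_DTSACCT'
order  by privilege;

-- Roles granted to the account.
select granted_role
from   dba_role_privs
where  grantee = 'RDSDT_DTSACCT'
order  by granted_role;

-- Object privileges granted to the account, such as SELECT on V_$LOG.
select owner, table_name, privilege
from   dba_tab_privs
where  grantee = 'RDSDT_DTSACCT'
order  by owner, table_name;
```

Compare the output with the grant statements for your Oracle version. A missing row usually means that a grant statement failed or was run in a different container.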

Procedure

  1. Log on to the DTS console.

    Note

    If you are redirected to the Data Management (DMS) console, you can click the old icon to go to the previous version of the DTS console.

  2. In the left-side navigation pane, click Data Migration.

  3. In the upper part of the Migration Tasks page, select the region in which the destination instance resides.

  4. In the upper-right corner of the page, click Create Migration Task.

  5. Configure the source and destination databases.

    Configure the source and destination databases

    Section

    Parameter

    Description

    N/A

    Task Name

    The task name that DTS automatically generates. We recommend that you specify a name that indicates your business requirements for easy identification. You do not need to use a unique name.

    Source Database

    Instance Type

    The access method of the source database. In this example, User-Created Database with Public IP Address is selected.

    Note

    If the source self-managed database is of another type, you must set up the environment that is required for the self-managed database. For more information, see Preparation overview.

    Instance Region

    If you select User-Created Database with Public IP Address as the instance type, you do not need to specify the Instance Region parameter.

    Note

    If an IP address whitelist is configured for the self-managed Oracle database, you must add the CIDR blocks of DTS servers to the IP address whitelist of the database. You can click Get IP Address Segment of DTS next to Instance Region to obtain the CIDR blocks of DTS servers.

    Database Type

    The type of the source database. Select Oracle.

    Hostname or IP Address

    The endpoint that is used to connect to the self-managed Oracle database. In this example, the public IP address of the database is used.

    Port Number

    The service port number of the self-managed Oracle database. Default value: 1521.

    Note

    The service port of the self-managed Oracle database must be accessible over the Internet.

    Instance Type

    The architecture type of the self-managed Oracle database.

    • If you select Non-RAC Instance, you must specify the SID parameter.

    • If you select RAC or PDB Instance, you must specify the Service Name parameter.

    Database Account

    The account of the self-managed Oracle database. For information about the permissions that are required for the account, see Before you begin.

    Database Password

    The password of the account of the self-managed Oracle database.

    Note

    After you specify the information about the source database, you can click Test Connectivity next to Database Password to check whether the information is valid. If the information is valid, the Passed message appears. If the Failed message appears, click Check next to Failed. Then, modify the information based on the check results.

    Destination Database

    Instance Type

    The access method of the destination database. Select User-Created Database Connected Over Express Connect, VPN Gateway, or Smart Access Gateway.

    Note

    You cannot specify Message Queue for Apache Kafka for the Instance Type parameter. Instead, configure the Message Queue for Apache Kafka instance as a self-managed Kafka cluster for data migration.

    Instance Region

    The region in which the destination Message Queue for Apache Kafka instance resides.

    Peer VPC

    The ID of the virtual private cloud (VPC) to which the destination Message Queue for Apache Kafka instance belongs. To obtain the VPC ID, perform the following operations: Log on to the Message Queue for Apache Kafka console and go to the Instance Details page of the Message Queue for Apache Kafka instance. In the Configuration Information section, view the VPC ID.

    Database Type

    The type of the destination database. Select Kafka.

    IP Address

    The IP address that is included in the Default Endpoint parameter of the Message Queue for Apache Kafka instance.

    Note

    To obtain an IP address, perform the following operations: Log on to the Message Queue for Apache Kafka console and go to the Instance Details page of the Message Queue for Apache Kafka instance. In the Endpoint Information section of the Instance Information tab, view the IP address included in the Default Endpoint parameter.

    Port Number

    The service port number of the Message Queue for Apache Kafka instance. Default value: 9092.

    Database Account

    The database account that is used to log on to the Message Queue for Apache Kafka instance.

    Note

    If the Message Queue for Apache Kafka instance is of the VPC Instance type, you do not need to specify the Database Account and Database Password parameters.

    Database Password

    The password of the database account that is used to log on to the Message Queue for Apache Kafka instance.

    Topic

    Click Get Topic list next to Topic and select a topic from the drop-down list.

    Topic for storing DDL

    Click Get Topic list next to Topic for storing DDL, and select a topic from the drop-down list. The topic is used to store the DDL information. If you do not specify this parameter, the DDL information is stored in the topic that is specified by the Topic parameter.

    Kafka Version

    The version of the Message Queue for Apache Kafka instance.

    Encryption

    Specifies whether to encrypt the connection to the destination cluster. Select Non-encrypted or SCRAM-SHA-256 based on your business and security requirements.

    Whether to Use Kafka schema registry

    Kafka Schema Registry provides a serving layer for your metadata. It provides a RESTful API to store and retrieve your Avro schemas.

    • No: does not use Kafka Schema Registry.

    • Yes: uses Kafka Schema Registry. In this case, you must enter the URL or IP address that is registered in Kafka Schema Registry for your Avro schemas.

  6. In the lower-right corner of the page, click Set Whitelist and Next.

    Warning

    If the CIDR blocks of DTS servers are automatically or manually added to the whitelist of the database or instance, or to the ECS security group rules, security risks may arise. Therefore, before you use DTS to migrate data, you must understand and acknowledge the potential risks and take preventive measures, including but not limited to the following measures: enhance the security of your username and password, limit the ports that are exposed, authenticate API calls, regularly check the whitelist or ECS security group rules and forbid unauthorized CIDR blocks, or connect the database to DTS by using Express Connect, VPN Gateway, or Smart Access Gateway.

  7. Select the migration types, the migration policy, and the objects to be migrated.

    Select the objects to be migrated

    Parameter or setting

    Description

    Migration Types

    Select Schema Migration, Full Data Migration, and Incremental Data Migration.

    Important

    If Incremental Data Migration is not selected, we recommend that you do not write data to the source database during full data migration. This ensures data consistency between the source and destination databases.

    Select the data format used in Kafka

    The data that is migrated to the Kafka cluster is stored in the Avro format. You must parse the migrated data based on the Avro schema. For more information, see DTS Avro schema.

    Select the policy for migrating data to Kafka partitions

    Select a migration policy based on your business requirements. For more information, see Specify the policy for migrating data to Kafka partitions.

    Select the objects to be migrated

    Select one or more tables from the Available section and click the Rightwards arrow icon to add the tables to the Selected section.

    Note

    DTS maps the table names to the name of the topic that you select in Step 5. For more information about how to change the topic name, see Object name mapping.

    Specify whether to rename objects

    You can use the object name mapping feature to rename the objects that are migrated to the destination instance. For more information, see Object name mapping.

    Specify the retry time range for failed connections to the source or destination database

    By default, if DTS fails to connect to the source or destination database, DTS retries within the following 12 hours. You can specify the retry time range based on your business requirements. If DTS is reconnected to the source or destination database within the specified retry time range, DTS resumes the data migration task. Otherwise, the data migration task fails.

    Note

    Within the retry time range in which DTS attempts to reconnect to the source and destination databases, you are charged for using the DTS instance. We recommend that you specify the retry time range based on your business requirements. You can also release the DTS instance at the earliest opportunity after the source and destination instances are released.

  8. In the lower-right corner of the page, click Precheck.

    Important
    • Before you can start the data migration task, a precheck is performed. You can start the data migration task only after the task passes the precheck.

    • If the task fails to pass the precheck, you can click the Info icon next to each failed item to view details.

      • After you troubleshoot the issues based on the causes, you can run a precheck again.

      • If you do not need to troubleshoot the issues, you can ignore failed items and run a precheck again.

  9. After the task passes the precheck, click Next.

  10. In the Confirm Settings dialog box, specify the Channel Specification parameter and select Data Transmission Service (Pay-As-You-Go) Service Terms.

  11. Click Buy and Start to start the data migration task.
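Some of the source database values requested in Step 5 can be looked up in the Oracle database itself. The following is a sketch, assuming a session that can read V$INSTANCE, V$PARAMETER, and V$DATABASE. The instance name usually matches the SID but is not guaranteed to, and the CDB column exists only in Oracle 12c and later.

```sql
-- Instance name (normally the SID) and host of the current instance.
select instance_name, host_name from v$instance;

-- Service names registered for the database (candidates for the Service Name parameter).
select value from v$parameter where name = 'service_names';

-- TRUE indicates a Real Application Cluster (RAC) database.
select value from v$parameter where name = 'cluster_database';

-- YES indicates a multitenant container database (CDB). Oracle 12c and later only.
select cdb from v$database;
```

If the last two queries return FALSE and NO, the database is a non-RAC, non-PDB instance and the SID parameter applies. Otherwise, the Service Name parameter applies.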

Stop the data migration task

Warning

We recommend that you prepare a rollback solution to migrate incremental data from the destination database to the source database in real time. This allows you to minimize the negative impact of switching your workloads to the destination database. For more information, see Switch workloads to the destination database. If you do not need to switch your workloads, you can perform the following steps to stop the data migration task.

  • Full data migration

    Do not manually stop a task during full data migration. Otherwise, the system may fail to migrate all data. Wait until the migration task automatically ends.

  • Incremental data migration

    The task does not automatically end during incremental data migration. You must manually stop the migration task.

    1. Wait until the task progress bar shows Incremental Data Migration and The migration task is not delayed. Then, stop writing data to the source database for a few minutes. In some cases, the progress bar shows the delay time of incremental data migration.
    2. After the status of incremental data migration changes to The migration task is not delayed, manually stop the migration task.
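Before you stop the task, you may want to confirm on the source side that writes have actually stopped. The following is only a sketch and is not a DTS feature. It assumes a session that can read V$TRANSACTION and V$DATABASE, and it counts every open transaction on the instance, not only those that touch the migrated tables. The CURRENT_SCN column is available in Oracle 10g and later.

```sql
-- Number of transactions that are currently open on the instance.
select count(*) as open_transactions from v$transaction;

-- Current system change number (SCN). Record it as a reference point for the cutover.
select current_scn from v$database;
```

A count of 0 that holds for the few minutes described in the first step suggests that no application is still writing to the source database.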

What to do next

The database accounts that are used for data migration have read and write permissions. After data migration is complete, you must delete the account of the self-managed Oracle database. You must also modify the permissions of the RAM user in the destination Kafka instance. For more information, see Grant permissions to RAM users.
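The account cleanup on the Oracle side can be sketched as follows, assuming the account is named rdsdt_dtsacct as in the examples in this topic and that the session has the DROP USER privilege. ORCLPDB1 in the comment is the PDB name used in the earlier example, not a required value.

```sql
-- Confirm that the account exists before you drop it.
select username, created
from   dba_users
where  username = 'RDSDT_DTSACCT';

-- Drop the account together with any objects that it owns.
-- In a multitenant database, run this statement in each container in which the
-- account was created (for example, after ALTER SESSION SET container = ORCLPDB1;).
drop user rdsdt_dtsacct cascade;
```

For an account created in CDB$ROOT, the same session setting that was used when the account was created may be needed again before the DROP USER statement succeeds.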