ApsaraDB for OceanBase: Migrate data from a Lindorm database to OBKV

Last Updated: Dec 12, 2024

This topic describes how to use the data transmission service to migrate data from a cloud-native multi-model Lindorm database to OBKV.

Important

If a data migration task remains in an inactive state for a long time, it may fail to resume, depending on the retention period of incremental logs. The inactive states are Failed, Stopped, and Completed. The data transmission service releases data migration tasks that remain inactive for more than 3 days to reclaim the related resources. We recommend that you configure alerting for data migration tasks and handle task exceptions in a timely manner.
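As a rough illustration of the 3-day rule, the following sketch flags tasks that would qualify for release. The function names and the `Running` state used in the usage note are hypothetical; the service does not necessarily expose task data in this shape.

```python
from datetime import datetime, timedelta, timezone

# Inactive states as listed in this topic.
INACTIVE_STATES = {"Failed", "Stopped", "Completed"}
# Release threshold stated in this topic.
RELEASE_AFTER = timedelta(days=3)


def is_release_candidate(state: str, inactive_since: datetime, now: datetime) -> bool:
    """Return True if the task has been inactive for more than 3 days."""
    if state not in INACTIVE_STATES:
        return False
    return now - inactive_since > RELEASE_AFTER


def time_left_before_release(inactive_since: datetime, now: datetime) -> timedelta:
    """Return how long an inactive task can wait before it qualifies for release."""
    return max(RELEASE_AFTER - (now - inactive_since), timedelta(0))


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    stopped_at = now - timedelta(days=2, hours=12)
    print(is_release_candidate("Stopped", stopped_at, now))
    print(time_left_before_release(stopped_at, now))
```

An alerting job could run a check like this periodically and notify the task owner while there is still time to resume the task. A task in any other state, such as a hypothetical `Running` state, is never flagged.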

Background

You can create a task in the ApsaraDB for OceanBase console to seamlessly migrate existing and incremental business data from a Lindorm database to OBKV through schema migration, full migration, and incremental synchronization.

Lindorm is a cloud-native multi-model integrated database designed and optimized for the Internet of Things (IoT), Internet, and Internet of Vehicles (IoV). The data transmission service supports LindormTable, a wide table engine service. For more information, see Overview.

OBKV belongs to the Standard Edition (Key-Value) series supported by ApsaraDB for OceanBase Standard Edition. It is developed based on OceanBase Database, a relational SQL database system, and fully reuses the Shared-Nothing architecture of OceanBase Database. For more information, see Product series.

Prerequisites

  • The data transmission service has the privilege to access cloud resources. For more information, see Grant privileges to roles for data transmission.

  • You have created a dedicated user in the source Lindorm database for data migration, and granted required permissions to the user. For more information, see Lindorm data source.

  • You have created a dedicated user in the destination OBKV instance for data migration, and granted required permissions to the user.

Limitations

  • At present, the data transmission service supports LindormTable V2.x, and OBKV V4.2.1 and V4.2.4.

  • The data transmission service can migrate only the data whose tablename and columnfamily values consist of digits (0 to 9), lowercase letters (a to z), uppercase letters (A to Z), and underscores (_) from a Lindorm database to OBKV.

  • When you migrate data from a Lindorm database to OBKV, only HBase table objects are supported. Other types of table objects may cause data quality issues.

  • We recommend that you set the number of Kafka partitions to 1 when you configure incremental synchronization settings.
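The naming restriction in this list can be checked before you create a task. The following is a minimal sketch, assuming that you already have the table and column family names as plain strings; the function names are hypothetical, and treating an empty name as unsupported is an assumption.

```python
import re

# Digits, lowercase letters, uppercase letters, and underscores only.
_NAME_RE = re.compile(r"[0-9A-Za-z_]+")


def is_migratable_name(name: str) -> bool:
    """Return True if the name uses only the characters allowed by this topic."""
    return _NAME_RE.fullmatch(name) is not None


def unsupported_objects(objects):
    """Return the (tablename, columnfamily) pairs that violate the restriction."""
    return [
        (table, family)
        for table, family in objects
        if not (is_migratable_name(table) and is_migratable_name(family))
    ]


if __name__ == "__main__":
    candidates = [("orders_2024", "cf1"), ("user-events", "cf1"), ("metrics", "c.f")]
    for table, family in unsupported_objects(candidates):
        print(f"cannot migrate {table}${family}")
```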

Considerations

  • We recommend that you use a data migration task between a Lindorm database and OBKV only for data migration, not for long-term data synchronization. In the following cases, data quality issues may occur in the incremental synchronization phase and will be detected in the full verification phase.

    • If you update values by specifying the version, which is a timestamp, the data transmission service cannot identify the updates.

    • If you perform a Put operation that sets the field value of a rowkey to an empty string (""), the value may be deleted during the incremental synchronization phase.

  • When you migrate data from a Lindorm database to OBKV, data is pulled only in IN mode during full verification. In this mode, data that exists in the destination but not in the source cannot be verified, and verification performance is degraded.

  • If the source or destination contains table objects whose names differ only in letter case, the data migration results may not be as expected because the source or destination is case-insensitive.

  • If the clocks between nodes or between the client and the server are out of synchronization, the latency may be inaccurate during incremental synchronization or reverse incremental migration.

    For example, if the clock is earlier than the standard time, the latency can be negative. If the clock is later than the standard time, the latency can be positive.

  • If the Time To Live (TTL) values configured for the Lindorm database and OBKV are inconsistent, the data in the source and destination may be inconsistent.
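The IN mode consideration in this list can be pictured with a small sketch. It assumes that IN mode means the destination is queried only for the keys read from the source, which is one plausible reading of this topic and not a description of the service internals. The function name and the dictionary-based data model are hypothetical.

```python
def verify_in_mode(source: dict, destination: dict) -> dict:
    """Compare only the keys that exist in the source (assumed IN-mode behavior)."""
    missing = sorted(key for key in source if key not in destination)
    mismatched = sorted(
        key for key in source
        if key in destination and destination[key] != source[key]
    )
    return {"missing_in_destination": missing, "mismatched": mismatched}


if __name__ == "__main__":
    source = {"row1": "a", "row2": "b"}
    destination = {"row1": "a", "row2": "x", "row9": "extra"}
    # row9 exists only in the destination, so it is never looked up or reported.
    print(verify_in_mode(source, destination))
```

Under this assumption, extra rows in the destination go unnoticed, which is why a clean verification report does not prove that the destination contains nothing beyond the source data.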

Procedure

  1. Log on to the ApsaraDB for OceanBase console and purchase a data migration task.

    For more information, see Purchase a data migration task.

  2. Choose Data Transmission > Data Migration. On the page that appears, click Configuration for the data migration task.

    If you want to reference the configurations of an existing task, click Reference Configuration. For more information, see Reference the configuration of a data migration task.

  3. On the Select Source and Destination page, configure the parameters.

    | Parameter | Description |
    |---|---|
    | Data Migration Task Name | We recommend that you set it to a combination of digits and letters. It must not contain any spaces and cannot exceed 64 characters in length. |
    | Tag | Click the field and select a target tag from the drop-down list. You can also click Manage Tags to create, modify, and delete tags. For more information, see Use tags to manage data migration tasks. |
    | Source | If you have created a Lindorm data source, select it from the drop-down list. If not, click New Data Source in the drop-down list and create one in the dialog box that appears on the right. For more information about the parameters, see Create a Lindorm data source. |
    | Destination | If you have created an OBKV data source, select it from the drop-down list. If not, click New Data Source in the drop-down list and create one in the dialog box that appears on the right. For more information about the parameters, see Create an OceanBase data source. |
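The task name rules above can be checked with a few lines of code before you enter a name in the console. This is a minimal sketch: the function name is hypothetical, the space and length rules are treated as hard errors, and the digits-and-letters recommendation is reported only as a warning.

```python
import re

MAX_NAME_LENGTH = 64  # limit stated in this topic


def check_task_name(name: str):
    """Return (errors, warnings) for a proposed data migration task name."""
    errors, warnings = [], []
    if not name:
        errors.append("name is empty")
    if any(ch.isspace() for ch in name):
        errors.append("name contains whitespace")
    if len(name) > MAX_NAME_LENGTH:
        errors.append(f"name is longer than {MAX_NAME_LENGTH} characters")
    if name and not re.fullmatch(r"[0-9A-Za-z]+", name):
        warnings.append("recommended: use only digits and letters")
    return errors, warnings


if __name__ == "__main__":
    for candidate in ("lindorm2obkv01", "my task", "lindorm_to_obkv"):
        print(candidate, check_task_name(candidate))
```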

  4. Click Next. On the Select Migration Type page, specify migration types for the current data migration task.

    Options for Migration Type are Schema Migration, Full Data Migration, Incremental Synchronization, Full Verification, and Reverse Increment.

    Migration type

    Description

    Schema migration

    When a schema migration task starts, the data transmission service migrates the definitions of data objects, such as tables and column families, from the source database to the destination database, and automatically filters out temporary tables.

    Full data migration

    After a full migration task is started, the data transmission service migrates existing data of tables in the source database to corresponding tables in the destination database.

    Incremental synchronization

    After an incremental synchronization task is started, the data transmission service synchronizes changed data (data that is added, modified, or removed) from the source database to corresponding tables in the destination database.

    DML Synchronization is supported for Incremental Synchronization. You can select operations as needed. For more information, see Configure DDL/DML synchronization.

    Note
    • If you create a Lindorm data source without binding it to a Kafka data source, you cannot select Incremental Synchronization.

    • If you want to select DML Synchronization in Incremental Synchronization, contact Alibaba Cloud Technical Support to confirm that the data delivered to Kafka is ordered. Otherwise, the data in the source and destination may be inconsistent. For more information about the configuration for delivering data from Lindorm to Kafka, see Overview.

    Full verification

    After the full migration and incremental synchronization tasks are completed, the data transmission service automatically initiates a full verification task to verify the tables in the source and destination databases.

    Reverse incremental migration

    Data changes made in the destination database after the business database switchover are synchronized to the source database in real time through reverse incremental migration.

    Generally, incremental synchronization configurations are reused for reverse incremental migration. You can also customize the configurations for reverse incremental migration as needed.

  5. (Optional) Click Next. In the More about Data Sources dialog box, select the Kafka data source and the corresponding topic. Then, click Test Connectivity. After the test succeeds, click Save.

    Note

    The More about Data Sources dialog box appears during task creation only if you did not select Incremental Synchronization Settings when you created the Lindorm data source, but selected Incremental Synchronization in the Select Migration Type step. For more information about the parameters, see Create a Lindorm data source.

  6. Click Next. On the Select Migration Objects page, select the migration objects of the current data migration task.

    Select the object to be migrated on the left, and click > to add it to the list on the right.

    The data transmission service allows you to import objects from text files, rename destination objects, set partitions, and remove a single migration object or all migration objects.

    Operation

    Description

    Import objects

    1. In the list on the right, click Import Objects in the upper-right corner.

    2. In the dialog box that appears, click OK.

      Important

      This operation will overwrite previous selections. Proceed with caution.

    3. In the Import Objects dialog box, import the objects to be migrated.

      You can import CSV files to rename databases or tables and set row filtering conditions. For more information, see Download and import the settings of migration objects.

    4. Click Validate.

      After you import the migration objects, check their validity.

    5. After the validation succeeds, click OK.

    Rename objects

    The data transmission service allows you to rename migration objects. The name of a migration object is in the tablename$columnfamily format. For more information, see Rename a database table.

    Important

    When you migrate data from a Lindorm database to OBKV, you can migrate only objects in the `default` namespace to the `default` database of OBKV, and you cannot rename the `default` database.

    Configure settings

    The data transmission service allows you to set partitions only if Schema Migration is selected.

    In the Partition Settings dialog box, Partition Method supports only Key Partition. You can configure the following options.

    • Partition Key

      The default value is `K` and cannot be modified.

    • Number of partitions

      You can specify a value ranging from 1 to 1024.

    • Use combined virtual column partition

      By default, this option is not selected, which means that KEY partitioning is used without adding a virtual column.

      If you select this option, you need to specify Virtual Column Definition. Only substring virtual columns are supported, for example, substring(K,1,4).

    Remove one or all objects

    The data transmission service allows you to remove a single object or all migration objects that are added to the right-side list during data mapping.

    • Remove a single migration object

      In the list on the right, move the pointer over the object that you want to remove, and click Remove to remove the migration object.

    • Remove all migration objects

      In the list on the right, click Remove All in the upper-right corner. In the dialog box that appears, click OK to remove all migration objects.
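Two details from the operations above lend themselves to a short sketch: the `tablename$columnfamily` object name format and the partition settings. The helper names are hypothetical. The substring helper only mimics the 1-based indexing of a SQL `substring(K,1,4)` expression on a sample string, so it shows which part of the rowkey the virtual column would hold, not how OBKV assigns rows to partitions.

```python
from typing import Tuple

MIN_PARTITIONS, MAX_PARTITIONS = 1, 1024  # range stated in this topic


def split_object_name(name: str) -> Tuple[str, str]:
    """Split a migration object name in the tablename$columnfamily format."""
    table, sep, family = name.partition("$")
    if not sep or not table or not family or "$" in family:
        raise ValueError(f"not in tablename$columnfamily format: {name!r}")
    return table, family


def check_partition_count(count: int) -> None:
    """Raise ValueError if the partition count is outside 1 to 1024."""
    if not MIN_PARTITIONS <= count <= MAX_PARTITIONS:
        raise ValueError(f"number of partitions must be 1 to 1024, got {count}")


def sql_substring(value: str, pos: int, length: int) -> str:
    """Mimic SQL substring(value, pos, length) for a 1-based pos >= 1."""
    if pos < 1 or length < 0:
        raise ValueError("this sketch supports only pos >= 1 and length >= 0")
    return value[pos - 1:pos - 1 + length]


if __name__ == "__main__":
    print(split_object_name("orders$cf1"))
    check_partition_count(16)
    # substring(K,1,4) keeps the first four characters of the rowkey.
    print(sql_substring("user0001_20241212", 1, 4))
```

A substring virtual column is useful when rowkeys share a meaningful prefix, because rows with the same prefix then yield the same virtual column value.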

  7. Click Next. On the Migration Options page, configure the parameters.

    • Full data migration

      The following table describes the parameters for full migration, which are displayed only if you have selected Full Data Migration on the Select Migration Type page.

      Parameter

      Description

      Read Concurrency Configuration

      The concurrency for reading data from the source during full migration. The maximum value is 512. A high read concurrency may incur excessive stress on the source, affecting the business.

      Write Concurrency Configuration

      The concurrency for writing data to the destination during full migration. The maximum value is 512. A high write concurrency may incur excessive stress on the destination, affecting the business.

      Full Data Migration Rate Limit

      You can choose whether to limit the full migration rate as needed. If you choose to limit the full migration rate, you must specify the records per second (RPS) and bytes per second (BPS). The RPS specifies the maximum number of data rows migrated to the destination per second during full migration, and the BPS specifies the maximum amount of data in bytes migrated to the destination per second during full migration.

      Note

      The RPS and BPS values specified here are only for throttling. The actual full migration performance is subject to factors such as the settings of the source and destination and the instance specifications.

      Processing Strategy When Destination Table Has Records

      The processing strategy adopted when a destination table contains records. This parameter is displayed only if Full Data Migration is selected on the Select Migration Type page. Valid values: Overwrite and Stop Migration.

      • If you select Overwrite, existing data in the destination may be overwritten by data in the source. For example, if the destination contains a data record with the same key and timestamp as a data record in the source, other values of this data record in the destination will be replaced with those in the source.

      • If you select Stop Migration and a destination table contains records, an error indicating that the migration is not supported is reported during full migration. In this case, you must process the data in the destination table and then continue with the migration.

        Important

        If you click Resume in the dialog box that reports the error, the data transmission service ignores this error and continues to migrate data. Proceed with caution.

      Migrate Data Version

      Lindorm supports multi-version storage and marks data row versions by timestamps. If you select All Versions, all versions of rows are migrated during full data migration. If you select Latest Version, only the rows with the latest timestamp are migrated during full data migration.

    • Incremental synchronization

      The following table describes the parameters for incremental synchronization, which are displayed only if you have selected Incremental Synchronization on the Select Migration Type page.

      Parameter

      Description

      Write Concurrency Configuration

      The concurrency for writing data to the destination during incremental synchronization. The maximum value is 512. A high write concurrency may incur excessive stress on the destination, affecting the business.

      Incremental Synchronization Rate Limit

      You can choose whether to limit the incremental synchronization rate as needed. If you choose to limit the incremental synchronization rate, you must specify the records per second (RPS) and bytes per second (BPS). The RPS specifies the maximum number of data rows synchronized to the destination per second during incremental synchronization, and the BPS specifies the maximum amount of data in bytes synchronized to the destination per second during incremental synchronization.

      Note

      The RPS and BPS values specified here are only for throttling. The actual incremental synchronization performance is subject to factors such as the settings of the source and destination and the instance specifications.

      Incremental Synchronization Start Timestamp

      • If you have set the migration type to Full Data Migration, this parameter is not displayed.

      • If you have selected Incremental Synchronization but not Full Data Migration, specify a point in time after which the data is to be synchronized. The default value is the current system time. For more information, see Set an incremental synchronization timestamp.

    • Reverse incremental migration

      The following table describes the parameters for reverse incremental migration, which are displayed only if you have selected Reverse Increment on the Select Migration Type page. By default, incremental synchronization configurations are reused for reverse incremental migration.

      You can choose not to reuse the incremental synchronization configurations and configure reverse incremental migration as needed.

      Parameter

      Description

      Write Concurrency Configuration

      The concurrency for writing data to the source during reverse incremental migration. The maximum value is 512. A high concurrency may incur excessive stress on the source, thereby affecting the business.

      Incremental Synchronization Rate Limit

      You can choose whether to limit the reverse incremental migration rate as needed. If you choose to limit the reverse incremental migration rate, you must specify the RPS and BPS. The RPS specifies the maximum number of data rows synchronized to the source per second during reverse incremental migration, and the BPS specifies the maximum amount of data in bytes synchronized to the source per second during reverse incremental migration.

      Note

      The RPS and BPS values specified here are only for throttling. The actual reverse incremental migration performance is subject to factors such as the settings of the source and destination and the instance specifications.

      Incremental Synchronization Start Timestamp

      By default, the forward switchover start timestamp (if any) prevails. This parameter cannot be modified.
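The RPS and BPS limits that appear in the rate limit parameters above act as two independent caps: a batch goes through only if it fits under both. The following token-bucket sketch illustrates that idea. It is not the throttling implementation of the data transmission service, and the class and parameter names are hypothetical.

```python
import time


class DualRateLimiter:
    """Token buckets for rows per second (RPS) and bytes per second (BPS)."""

    def __init__(self, rps: float, bps: float, clock=time.monotonic):
        self.rps = rps
        self.bps = bps
        self.clock = clock
        # Start with one second's worth of budget in each bucket.
        self.row_tokens = float(rps)
        self.byte_tokens = float(bps)
        self.last = clock()

    def _refill(self) -> None:
        now = self.clock()
        elapsed = now - self.last
        self.last = now
        # Each bucket holds at most one second's worth of budget.
        self.row_tokens = min(self.rps, self.row_tokens + elapsed * self.rps)
        self.byte_tokens = min(self.bps, self.byte_tokens + elapsed * self.bps)

    def try_send(self, rows: int, nbytes: int) -> bool:
        """Consume budget and return True only if both limits allow the batch."""
        self._refill()
        if rows <= self.row_tokens and nbytes <= self.byte_tokens:
            self.row_tokens -= rows
            self.byte_tokens -= nbytes
            return True
        return False


if __name__ == "__main__":
    limiter = DualRateLimiter(rps=1000, bps=1_000_000)
    print(limiter.try_send(rows=500, nbytes=200_000))
    print(limiter.try_send(rows=600, nbytes=200_000))
```

Whichever limit is reached first becomes the bottleneck, so wide rows tend to hit the BPS cap while narrow rows tend to hit the RPS cap.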

    • Full verification

      This section is displayed only if you have selected Full Verification on the Select Migration Type page.

      Lindorm supports multi-version storage and marks data row versions by timestamps. If you select All Versions for Verification Data Version, all data row versions are verified during full verification. If you select Latest Version for Verification Data Version, only data rows of the latest timestamp are verified during full verification.
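The difference between All Versions and Latest Version can be sketched on a small in-memory data set. The cell layout `(rowkey, qualifier, timestamp, value)` and the function name are assumptions made for illustration, and picking the latest version per rowkey and qualifier follows the usual HBase-style data model rather than a documented detail of the service.

```python
from typing import Dict, List, Tuple

# Assumed cell layout: (rowkey, qualifier, timestamp, value)
Cell = Tuple[str, str, int, str]


def select_versions(cells: List[Cell], mode: str) -> List[Cell]:
    """Return the cells that a given version option would cover."""
    if mode == "All Versions":
        return sorted(cells)
    if mode != "Latest Version":
        raise ValueError(f"unknown mode: {mode!r}")
    latest: Dict[Tuple[str, str], Cell] = {}
    for cell in cells:
        key = cell[:2]
        # Keep the cell with the highest timestamp for each rowkey and qualifier.
        if key not in latest or cell[2] > latest[key][2]:
            latest[key] = cell
    return sorted(latest.values())


if __name__ == "__main__":
    cells = [
        ("r1", "q", 100, "old"),
        ("r1", "q", 200, "new"),
        ("r2", "q", 150, "only"),
    ]
    print(select_versions(cells, "All Versions"))
    print(select_versions(cells, "Latest Version"))
```

Choosing the same option for migration and verification keeps the comparison meaningful: verifying all versions after migrating only the latest ones would report the older versions as missing.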

  8. Click Precheck to start a precheck on the data migration task.

    During the precheck, the data transmission service checks the read and write privileges of the database users and the network connections of the databases. A data migration task can be started only after it passes all check items. If an error is returned during the precheck, you can perform the following operations:

    • Identify and troubleshoot the problem and then perform the precheck again.

    • Click Skip in the Actions column of the failed precheck item. In the dialog box that describes the consequences of the operation, click OK.

  9. After the precheck succeeds, click Start Task.

    If you do not need to start the task now, click Save. You can start the task later on the Migration Tasks page or by performing batch operations. For more information about batch operations, see Perform batch operations on data migration tasks. After the data migration task is started, it will be executed based on the selected migration types. For more information, see View the details of a data migration task.

    The data transmission service allows you to remove migration objects when the data migration task is running. For more information, see View and modify migration objects.

    Important

    You cannot add migration objects while a task for migrating data from a Lindorm database to OBKV is running.

References