
Data Online Migration: Migrate data

Last Updated: Nov 15, 2024

Data migration between Alibaba Cloud Object Storage Service (OSS) buckets copies data from one OSS bucket to another. This feature helps you efficiently transfer and manage data between OSS buckets in scenarios such as data backup, data migration, and disaster recovery. This topic describes the usage notes, limits, and procedure for data migration between OSS buckets.

Usage notes

When you migrate data by using Data Online Migration, take note of the following items:

  • Data Online Migration accesses the source data address by using the public interfaces provided by the storage service provider of the source data address. The access behavior depends on the interface implementation of the storage service provider.

  • Data Online Migration consumes resources at the source and destination data addresses during migration, which may interrupt your business. To ensure business continuity, we recommend that you assess the impact and then enable throttling for your migration tasks or run the migration tasks during off-peak hours.

  • Before a migration task starts, Data Online Migration checks the files at the source and destination data addresses. If a file at the source data address and a file at the destination data address have the same name, and the Overwrite Method parameter of the migration task is set to overwrite files, the file at the destination data address is overwritten during migration. If the two files contain different information and the file at the destination data address needs to be retained, we recommend that you rename one of the files or back up the file at the destination data address.

  • The LastModifyTime attribute of the source file is retained after the file is migrated to the destination bucket. If a lifecycle rule is configured for the destination bucket and takes effect, a migrated file whose last modification time falls within the time period specified by the lifecycle rule may be deleted or archived to a specific storage class.
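
The overwrite behavior noted above depends on the Overwrite Method parameter that is described in Step 4. The following Python sketch restates those rules as a decision function. The FileMeta class, the method strings, and the function name are hypothetical and are not part of any Data Online Migration API. This topic does not spell out the case in which the source file is older than the destination file, so the sketch assumes that such a file is skipped.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class FileMeta:
    """Metadata compared by the overwrite check (hypothetical container)."""
    last_modified: datetime
    size: int
    content_type: str


def should_overwrite(method: str, source: FileMeta,
                     destination: Optional[FileMeta]) -> bool:
    """Return True if the source file should be written to the destination."""
    if destination is None:
        # No file with the same name exists, so nothing is overwritten.
        return True
    if method == "do_not_overwrite":
        return False
    if method == "overwrite_all":
        return True
    if method == "overwrite_by_last_modified":
        if source.last_modified > destination.last_modified:
            return True
        if source.last_modified == destination.last_modified:
            # Same timestamp: overwrite only if size or Content-Type differs.
            return (source.size != destination.size
                    or source.content_type != destination.content_type)
        # Assumption: an older source file is skipped.
        return False
    raise ValueError(f"unknown overwrite method: {method}")
```

Running a check of this kind against your own file listings before you create a task can help you identify which destination files need to be renamed or backed up.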

Limits

  • If the static website hosting feature is enabled for the files at the source data address, the scan for data migration finds directories that do not exist. For example, if you upload the myapp/resource/1.jpg file and enable the static website hosting feature for the file, the scan finds the following objects: myapp/, myapp/resource/, and myapp/resource/1.jpg. The myapp/ and myapp/resource/ directories fail to be migrated because they do not exist. The myapp/resource/1.jpg file is migrated as expected.

  • Symbolic links that exist at the source data address are directly migrated to the destination data address. For more information, see Create symbolic links.

  • Data Online Migration allows you to migrate only the data of a single bucket in a task. You cannot migrate all data that belongs to your account in a single task.

  • Data Online Migration does not support data migration in Alibaba Finance Cloud or Alibaba Gov Cloud.

  • Only specific attributes of data can be migrated between OSS buckets.

    • Attributes that can be migrated are x-oss-meta-*, LastModifyTime, Content-Type, Cache-Control, Content-Encoding, Content-Disposition, Content-Language, and Expires.

    • Attributes that cannot be migrated include but are not limited to StorageClass, Acl, server-side encryption, Tagging, and user-defined x-oss-persistent-headers.

      Note

      The attributes that cannot be migrated include but are not limited to the preceding attributes. Check the actual migration results to find out other attributes that cannot be migrated.
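
To estimate which attributes of an object are carried over, the attribute lists above can be expressed as a small filter. The following Python sketch is illustrative only: the function name is hypothetical, the case-insensitive comparison is an assumption, and, as the note above states, the actual migration result is the authority.

```python
from typing import Dict, Tuple

# Attributes that this topic lists as migrated (compared in lowercase).
MIGRATED_ATTRIBUTES = {
    "lastmodifytime",
    "content-type",
    "cache-control",
    "content-encoding",
    "content-disposition",
    "content-language",
    "expires",
}
USER_META_PREFIX = "x-oss-meta-"


def split_attributes(attributes: Dict[str, str]) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Split attributes into (expected to be migrated, expected to be dropped)."""
    kept: Dict[str, str] = {}
    dropped: Dict[str, str] = {}
    for name, value in attributes.items():
        lowered = name.lower()
        if lowered.startswith(USER_META_PREFIX) or lowered in MIGRATED_ATTRIBUTES:
            kept[name] = value
        else:
            dropped[name] = value
    return kept, dropped
```

For example, passing Content-Type, x-oss-meta-owner, StorageClass, and Tagging places the first two in the kept dictionary and the last two in the dropped dictionary.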

Step 1: Select a region

How to select a region

The region in which you access the Data Migration console determines whether you are charged for reading data from the source OSS bucket. The following figure shows how to select the region in which you access the Data Migration console. You must select a region before you create a migration task.

1. If the source OSS bucket resides in the region in which you access the Data Migration console, you are not charged for reading data from the source OSS bucket over the Internet.

Note

For example, if the source and destination OSS buckets reside in the China (Beijing) region and you select the China (Beijing) region in the Data Migration console, no fees are generated for reading data from the source OSS bucket over the Internet during the migration.

[Figure: Same region]

2. If the source OSS bucket does not reside in the region in which you access the Data Migration console, you are charged for reading data from the source OSS bucket over the Internet.

Note

For example, if you migrate data from an OSS bucket that resides in the China (Beijing) region to an OSS bucket that resides in the Singapore region, and you select the Singapore region in the Data Migration console, fees are generated for reading data from the source OSS bucket over the Internet.

[Figure: Cross-region]

Important

To transfer data by using the shortest connection, we recommend that you select the region in which the source OSS bucket resides when you access the Data Migration console. If no region is available, we recommend that you select a region that is close to your business to ensure high migration performance.

Procedure

  1. Log on to the Data Online Migration console as the Resource Access Management (RAM) user that you created for data migration.

    Note

    To migrate data across Alibaba Cloud accounts, you can log on as a RAM user that is created within the source or destination Alibaba Cloud account.

  2. In the upper-left corner of the top navigation bar, select the region in which the source data address resides or select the region that is closest to the region in which the source data address resides.

    The region that you select is the region in which your Data Online Migration is deployed. Supported regions inside China include China (Beijing), China (Shanghai), China (Hangzhou), China (Shenzhen), China (Ulanqab), and China (Hong Kong), and supported regions outside China include Singapore, Germany (Frankfurt), and US (Virginia).

    Important
    • The data addresses and migration tasks that you create in a region cannot be used in another region. Select the region with caution.

    • We recommend that you select the region in which the source data address resides. If the region in which the source data address resides is not supported by Data Online Migration, select the region that is closest to the region in which the source data address resides.

    • To speed up cross-border data migration, we recommend that you enable transfer acceleration. If you enable transfer acceleration for OSS buckets, you are charged transfer acceleration fees. For more information, see Transfer acceleration.
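
The region rules in this step can be summarized in a few lines of Python. This is a minimal sketch that assumes regions are identified by the display names used in this topic. The function names are hypothetical, and "closest region" is left to the caller because this topic does not define how distance is measured.

```python
from typing import Optional

# Regions in which Data Online Migration can be deployed, as listed in this topic.
SUPPORTED_REGIONS = {
    "China (Beijing)", "China (Shanghai)", "China (Hangzhou)",
    "China (Shenzhen)", "China (Ulanqab)", "China (Hong Kong)",
    "Singapore", "Germany (Frankfurt)", "US (Virginia)",
}


def choose_console_region(source_region: str) -> Optional[str]:
    """Return the source region if it is supported.

    Returns None otherwise, in which case the closest supported region
    must be chosen manually.
    """
    return source_region if source_region in SUPPORTED_REGIONS else None


def source_read_charged(source_region: str, console_region: str) -> bool:
    """Reading the source over the Internet is charged when the regions differ."""
    return source_region != console_region
```

With these definitions, choose_console_region("China (Beijing)") returns "China (Beijing)", and source_read_charged("China (Beijing)", "Singapore") returns True, which matches the cross-region example above.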

Step 2: Create a source data address

  1. In the left-side navigation pane, choose Data Online Migration > Address Management. On the Address Management page, click Create Address.

  2. In the Create Address panel, configure the parameters and click OK. The following table describes the parameters.

    Parameter

    Required

    Description

    Name

    Yes

    The name of the source data address. The name must meet the following requirements:

    • The name is 3 to 63 characters in length.

    • The name is case-sensitive and can contain lowercase letters, digits, hyphens (-), and underscores (_).

    • The name is encoded in the UTF-8 format and cannot start with a hyphen (-) or an underscore (_).

    Type

    Yes

    The type of the source data address. Select Alibaba OSS.

    Custom Domain Name

    No

    Specifies whether custom domain names are supported.

    Region

    Yes

    The region in which the source data address resides. Example: China (Hangzhou).

    Authorize Role

    Yes

    • The source bucket belongs to the Alibaba Cloud account that is used to log on to the Data Online Migration console

      • We recommend that you create and authorize a RAM role in the Data Online Migration console. For more information, see Authorize a RAM role in the Data Online Migration console.

      • You can also manually attach policies to a RAM role in the RAM console. For more information, see the "Step 3: Grant permissions on the source bucket to a RAM role" section of the Preparations topic.

    • The source bucket does not belong to the Alibaba Cloud account that is used to log on to the Data Online Migration console

      • You can attach policies to a RAM role in the OSS console. For more information, see the "Step 3: Grant permissions on the source bucket to a RAM role" section of the Preparations topic.

    Bucket

    Yes

    The name of the OSS bucket in which the data to be migrated is stored.

    Prefix

    No

    The prefix of the source data address. You can specify a prefix to migrate specific data. The prefix cannot start with a forward slash (/) and must end with a forward slash (/). Example: data/to/oss/.

    • Specify a prefix for the source data address: For example, you set the prefix of the source data address to example/src/, store a file named example.jpg in example/src/, and set the prefix of the destination data address to example/dest/. After the example.jpg file is migrated to the destination data address, the full path of the file is example/dest/example.jpg.

    • Do not specify a prefix for the source data address: For example, you specify no prefix for the source data address, the path of the file to be migrated is srcbucket/example.jpg, and you set the prefix of the destination data address to destbucket/. After the example.jpg file is migrated to the destination data address, the full path of the file is destbucket/srcbucket/example.jpg.

    Tunnel

    No

    The name of the tunnel that you want to use.

    Important
    • This parameter is required only when you migrate data to the cloud by using Express Connect circuits or VPN gateways or migrate data from self-managed databases to the cloud.

    • If data at the destination data address is stored in a local file system or you need to migrate data over an Express Connect circuit in an environment such as Alibaba Finance Cloud or Apsara Stack, you must create and deploy an agent.

    Agent

    No

    The name of the agent that you want to use.

    Important
    • This parameter is required only when you migrate data to the cloud by using Express Connect circuits or VPN gateways or migrate data from self-managed databases to the cloud.

    • You can select up to 30 agents at a time for a specific tunnel.
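
The Name requirements in the table above (the same requirements apply to destination data addresses and migration tasks) can be checked before you open the console. The following Python sketch encodes them as a regular expression. The function name is hypothetical, and the console performs the authoritative validation.

```python
import re

# First character: lowercase letter or digit (no leading hyphen or underscore).
# Remaining 2 to 62 characters: lowercase letters, digits, hyphens, underscores.
NAME_PATTERN = re.compile(r"[a-z0-9][a-z0-9_-]{2,62}")


def is_valid_name(name: str) -> bool:
    """Return True if the name is 3 to 63 characters and uses only allowed characters."""
    return NAME_PATTERN.fullmatch(name) is not None
```

For example, is_valid_name("oss-src_01") returns True, whereas "ab" (too short), "-src" (leading hyphen), and "Src-01" (uppercase letter) are rejected.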

Step 3: Create a destination data address

  1. In the left-side navigation pane, choose Data Online Migration > Address Management. On the Address Management page, click Create Address.

  2. In the Create Address panel, configure the parameters and click OK. The following table describes the parameters.

    Parameter

    Required

    Description

    Name

    Yes

    The name of the destination data address. The name must meet the following requirements:

    • The name is 3 to 63 characters in length.

    • The name is case-sensitive and can contain lowercase letters, digits, hyphens (-), and underscores (_).

    • The name is encoded in the UTF-8 format and cannot start with a hyphen (-) or an underscore (_).

    Type

    Yes

    The type of the destination data address. Select Alibaba OSS.

    Custom Domain Name

    No

    Specifies whether custom domain names are supported.

    Region

    Yes

    The region in which the destination data address resides. Example: China (Hangzhou).

    Authorize Role

    Yes

    • The destination bucket belongs to the Alibaba Cloud account that is used to log on to the Data Online Migration console

      • We recommend that you create and authorize a RAM role in the Data Online Migration console. For more information, see Authorize a RAM role in the Data Online Migration console.

      • You can also manually attach policies to a RAM role in the RAM console. For more information, see the "Step 4: Grant permissions on the destination bucket to the RAM role" section of the Preparations topic.

    • The destination bucket does not belong to the Alibaba Cloud account that is used to log on to the Data Online Migration console

      • You can attach policies to a RAM role in the OSS console. For more information, see the "Step 4: Grant permissions on the destination bucket to the RAM role" section of the Preparations topic.

    Bucket

    Yes

    The name of the OSS bucket to which the data is migrated.

    Prefix

    No

    The prefix of the destination data address. You can specify a prefix to migrate specific data. The prefix cannot start with a forward slash (/) and must end with a forward slash (/). Example: data/to/oss/.

    • Specify a prefix for the destination data address: For example, you set the prefix of the source data address to example/src/, store a file named example.jpg in example/src/, and set the prefix of the destination data address to example/dest/. After the example.jpg file is migrated to the destination data address, the full path of the file is example/dest/example.jpg.

    • Do not specify a prefix for the destination data address: If you do not specify a prefix for the destination data address, the source data is migrated to the root directory of the destination bucket.

    Tunnel

    No

    The name of the tunnel that you want to use.

    Important
    • This parameter is required only when you migrate data to the cloud by using Express Connect circuits or VPN gateways or migrate data from self-managed databases to the cloud.

    • If data at the destination data address is stored in a local file system or you need to migrate data over an Express Connect circuit in an environment such as Alibaba Finance Cloud or Apsara Stack, you must create and deploy an agent.

    Agent

    No

    The name of the agent that you want to use.

    Important
    • This parameter is required only when you migrate data to the cloud by using Express Connect circuits or VPN gateways or migrate data from self-managed databases to the cloud.

    • You can select up to 30 agents at a time for a specific tunnel.
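
The Prefix examples for the source and destination data addresses follow one pattern: the source prefix is removed from the object path and the destination prefix is added in front of the remainder. The following Python sketch reproduces those examples. The function name and the error handling are hypothetical and only illustrate the mapping.

```python
def destination_key(source_key: str, source_prefix: str = "",
                    destination_prefix: str = "") -> str:
    """Map a source object path to its expected path at the destination."""
    for label, prefix in (("source", source_prefix),
                          ("destination", destination_prefix)):
        # A prefix cannot start with "/" and must end with "/".
        if prefix and (prefix.startswith("/") or not prefix.endswith("/")):
            raise ValueError(f"invalid {label} prefix: {prefix!r}")
    if not source_key.startswith(source_prefix):
        raise ValueError("source_key is outside the source prefix")
    # Drop the source prefix, then prepend the destination prefix.
    return destination_prefix + source_key[len(source_prefix):]
```

For example, destination_key("example/src/example.jpg", "example/src/", "example/dest/") returns "example/dest/example.jpg", and destination_key("srcbucket/example.jpg", "", "destbucket/") returns "destbucket/srcbucket/example.jpg".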

Step 4: Create a migration task

Important

Up to five concurrent migration tasks can be executed in each region. If the number of concurrent migration tasks in a region exceeds this limit, periodic task scheduling may not be performed as expected.

  1. In the left-side navigation pane, choose Data Online Migration > Migration Tasks. On the Migration Tasks page, click Create Task.

  2. In the Select Address step, configure the parameters and click Next. The following table describes the parameters.

    Parameter

    Required

    Description

    Name

    Yes

    The name of the migration task. The name must meet the following requirements:

    • The name is 3 to 63 characters in length.

    • The name is case-sensitive and can contain lowercase letters, digits, hyphens (-), and underscores (_).

    • The name is encoded in the UTF-8 format and cannot start with a hyphen (-) or an underscore (_).

    Source Address

    Yes

    The source data address that you created.

    Destination Address

    Yes

    The destination data address that you created.

  3. In the Task Configurations step, configure the parameters that are described in the following table.

    Parameter

    Required

    Description

    Migration Bandwidth

    No

    The maximum bandwidth that is available to the migration task. Valid values:

    • Default: Use the default upper limit for the migration bandwidth. The actual migration bandwidth depends on the file size and the number of files.

    • Specify an upper limit: Specify a custom upper limit for the migration bandwidth as prompted.

    Important
    • The actual migration speed depends on multiple factors, such as the source data address, network, throttling at the destination data address, and file size. Therefore, the actual migration speed may not reach the specified upper limit.

    • Specify a reasonable value for the upper limit of the migration bandwidth based on the evaluation of the source data address, migration purpose, business situation, and network bandwidth. Inappropriate throttling may affect business performance.

    Files Migrated Per Second

    No

    The maximum number of files that can be migrated per second. Valid values:

    • Default: Use the default upper limit for the number of files that can be migrated per second.

    • Specify an upper limit: Specify a custom upper limit as prompted for the number of files that can be migrated per second.

    Important
    • The actual migration speed depends on multiple factors, such as the source data address, network, throttling at the destination data address, and file size. Therefore, the actual migration speed may not reach the specified upper limit.

    • Specify a reasonable value for the upper limit of the number of files migrated per second based on the evaluation of the source data address, migration purpose, business situation, and network bandwidth. Inappropriate throttling may affect business performance.

    Overwrite Method

    Yes

    Specifies whether to overwrite a file at the destination data address if the file has the same name as a file at the source data address. Valid values:

    • Do not overwrite: does not migrate the file at the source data address.

    • Overwrite All: overwrites the file at the destination data address.

    • Overwrite based on the last modification time:

      • If the last modification time of the file at the source data address is later than that of the file at the destination data address, the file at the destination data address is overwritten.

      • If the last modification time of the file at the source data address is the same as that of the file at the destination data address, the file at the destination data address is overwritten if the two files differ in size or Content-Type header.

    Warning
      • If you select Overwrite based on the last modification time, a newer file may be overwritten by an older one that has the same name.

      • If you select Overwrite based on the last modification time, make sure that the file at the source data address contains information such as the last modification time, size, and Content-Type header. Otherwise, the overwrite policy may become invalid and unexpected migration results may occur.

      • If you select Do not overwrite or Overwrite based on the last modification time, the system sends a request to the source and destination data addresses to obtain the meta information and determines whether to overwrite a file. Therefore, request fees are generated for the source and destination data addresses.

    Migration Report

    Yes

    Specifies whether to push a migration report. Valid values:

    • Do not push (default): does not push the migration report to the destination bucket.

    • Push: pushes the migration report to the destination bucket. For more information, see Subsequent operations.

    Important
    • The migration report occupies storage space at the destination data address.

    • The migration report may be pushed with a delay. Wait until the migration report is generated.

    • A unique ID is generated for each execution of a task. A migration report is pushed only once. We recommend that you do not delete the migration report unless necessary.

    Migration Logs

    Yes

    Specifies whether to push migration logs to Simple Log Service (SLS). Valid values:

    • Do not push (default): does not push migration logs.

    • Push: pushes migration logs to SLS. You can view the migration logs in the SLS console.

    • Push only file error logs: pushes only error migration logs to SLS. You can view the error migration logs in the SLS console.

    If you select Push or Push only file error logs, Data Online Migration creates a project in SLS. The project name is in the following format: aliyun-oss-import-log-<Alibaba Cloud account ID>-<region of the Data Online Migration console>. Example: aliyun-oss-import-log-137918634953****-cn-hangzhou.

    Important

    To prevent errors in the migration task, make sure that the following requirements are met before you select Push or Push only file error logs:

    • SLS is activated.

    • You have confirmed the authorization on the Authorize page.

    Authorize

    No

    This parameter is displayed if you set the Migration Logs parameter to Push or Push only file error logs.

    Click Authorize to go to the Cloud Resource Access Authorization page. On this page, click Confirm Authorization Policy. The RAM role AliyunOSSImportSlsAuditRole is created and permissions are granted to the RAM role.

    File Name

    No

    The filter based on the file name.

    Both inclusion and exclusion rules are supported. Only part of the regular expression syntax is supported. For more information about the supported syntax, see re2. Examples:

    • .*\.jpg$ indicates all files whose names end with .jpg.

    • By default, ^file.* indicates all files whose names start with file in the root directory.

      If a prefix is configured for the source data address and the prefix is data/to/oss/, you need to use the ^data/to/oss/file.* filter to match all files whose names start with file in the specified directory.

    • .*/picture/.* indicates files whose paths contain a subdirectory called picture.

    Important
    • If an inclusion rule is configured, all files that meet the inclusion rule are migrated. If multiple inclusion rules are configured, files are migrated as long as one of the inclusion rules is met.

      For example, the picture.jpg and picture.png files exist and the inclusion rule .*\.jpg$ is configured. In this case, only the picture.jpg file is migrated. If the inclusion rule .*\.png$ is configured at the same time, both files are migrated.

    • If an exclusion rule is configured, all files that meet the exclusion rule are not migrated. If multiple exclusion rules are configured, files are not migrated as long as one of the exclusion rules is met.

      For example, the picture.jpg and picture.png files exist and the exclusion rule .*\.jpg$ is configured. In this case, only the picture.png file is migrated. If the exclusion rule .*\.png$ is configured at the same time, neither file is migrated.

    • Exclusion rules take precedence over inclusion rules. If a file meets both an exclusion rule and an inclusion rule, the file is not migrated.

      For example, the file.txt file exists, and the exclusion rule .*\.txt$ and the inclusion rule file.* are configured. In this case, the file is not migrated.

    File Modification Time

    No

    The filter based on the last modification time of files.

    You can specify the last modification time as a filter rule. If you specify a time period, only the files whose last modification time is within the specified time period are migrated. Examples:

    • If you specify January 1, 2019 as the start time and do not specify the end time, only the files whose last modification time is not earlier than January 1, 2019 are migrated.

    • If you specify January 1, 2022 as the end time and do not specify the start time, only the files whose last modification time is not later than January 1, 2022 are migrated.

    • If you specify January 1, 2019 as the start time and January 1, 2022 as the end time, only the files whose last modification time is not earlier than January 1, 2019 and not later than January 1, 2022 are migrated.

    Execution Time

    No

    Important
    1. If the current execution of a migration task is not complete by the next scheduled start time, the task starts its next execution at the subsequent scheduled start time after the current migration is complete. This process continues until the task is run the specified number of times.

    2. If Data Online Migration is deployed in the China (Hong Kong) region or the regions in the Chinese mainland, up to 10 concurrent migration tasks are supported. If Data Online Migration is deployed in regions outside China, up to five concurrent migration tasks are supported. If the number of concurrent tasks exceeds the limit, executions of tasks may not be complete as scheduled.

    The time when the migration task is run. Valid values:

    • Immediately: The task is immediately run.

    • Scheduled Task: The task is run within the specified time period every day. By default, the task is started at the specified start time and stopped at the specified stop time.

    • Periodic Scheduling: The task is run based on the execution frequency and number of execution times that you specify.

      • Execution Frequency: You can specify the execution frequency of the task. Valid values: Every Hour, Every Day, Every Week, Certain Days of the Week, and Custom. For more information, see the Supported execution frequencies section of this topic.

      • Executions: You can specify the maximum number of execution times of the task as prompted. By default, if you do not specify this parameter, the task is run once.

    Important

    You can manually start and stop tasks at any point in time. This is not affected by the custom execution time of tasks.

  4. Read the Data Online Migration Agreement. Select "I have read and agree to the Alibaba Cloud International Website Product Terms of Service." and "I have understood that when the migration task is complete, the migrated data may be different from the source data. Therefore, I have the obligation and responsibility to confirm the consistency between the migrated data and source data. Alibaba Cloud is not responsible for the confirmation of the consistency between the migrated data and source data." Then, click Next.

  5. Verify that the configurations are correct and click OK. The migration task is created.
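
The File Name and File Modification Time filters described in the Task Configurations step can be tried out locally before you create a task. The following Python sketch is an approximation under stated assumptions: it uses Python's re module rather than RE2 (the sample patterns in this topic behave the same in both), it assumes that all files pass when no inclusion rule is configured, and it assumes that the name filter and the time filter must both be satisfied. The function name is hypothetical.

```python
import re
from datetime import datetime
from typing import Optional, Sequence


def passes_filters(key: str,
                   last_modified: datetime,
                   include: Sequence[str] = (),
                   exclude: Sequence[str] = (),
                   start: Optional[datetime] = None,
                   end: Optional[datetime] = None) -> bool:
    """Return True if a file is selected for migration by the filters."""
    # Exclusion rules take precedence over inclusion rules.
    if any(re.search(pattern, key) for pattern in exclude):
        return False
    # If inclusion rules exist, at least one of them must match.
    if include and not any(re.search(pattern, key) for pattern in include):
        return False
    # The time range is inclusive at both ends.
    if start is not None and last_modified < start:
        return False
    if end is not None and last_modified > end:
        return False
    return True
```

For example, with the exclusion rule .*\.txt$ and the inclusion rule file.*, the file.txt file is not selected, which matches the precedence example in the table.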

Supported execution frequencies

Frequency

Description

Example

Every Hour

Schedule a migration task to run every hour. If you select this execution frequency, you can also specify the maximum number of execution times of the task.

Schedule a migration task to run every hour, three times in total. If the current time is 08:05, the task starts its first execution at the beginning of the next hour, which is 09:00.

  • If the task completes its first execution before the beginning of the next hour, which is 10:00, the task starts its second execution at 10:00. This process continues until the task is run the specified number of times.

  • If the task completes its first execution at 12:30 on that same day, the task starts its second execution at the beginning of the next hour, which is 13:00. This process continues until the task is run the specified number of times.

Every Day

Schedule a migration task to run every day. If you select this execution frequency, you must schedule the task to run at the beginning of an hour from 00:00 to 23:00. You can also specify the maximum number of execution times of the task.

Schedule a migration task to run at 10:00 every day, five times in total. If the current time is 08:05, the task starts its first execution at 10:00 on the same day.

  • If the task completes its first execution before 10:00 on the next day, the task starts its second execution at 10:00 on the next day. This process continues until the task is run the specified number of times.

  • If the task completes its first execution at 12:05 on the next day, the task starts its second execution at 10:00 on the day after the next day. This process continues until the task is run the specified number of times.

Every Week

Schedule a migration task to run every week. If you select this execution frequency, you must specify a day of the week and schedule the task to run at the beginning of an hour from 00:00 to 23:00. You can also specify the maximum number of execution times of the task.

Schedule a migration task to run at 10:00 every Monday, 10 times in total. If the current time is 08:05 on Monday, the task starts its first execution at 10:00 on the same day.

  • If the task completes its first execution before 10:00 on the next Monday, the task starts its second execution at 10:00 on the next Monday. This process continues until the task is run the specified number of times.

  • If the task completes its first execution at 12:05 on the next Monday, the task starts its second execution at 10:00 on the Monday after the next Monday. This process continues until the task is run the specified number of times.

Certain Days of the Week

Schedule a migration task to run on specific days of the week. If you select this execution frequency, you must specify several days of the week and schedule the task to run at the beginning of an hour from 00:00 to 23:00.

Schedule a migration task to run at 10:00 every Monday, Wednesday, and Friday. If the current time is 08:05 on Wednesday, the task starts its first execution at 10:00 on the same day.

  • If the task completes its first execution before 10:00 on Friday, the task starts its second execution at 10:00 on Friday. This process continues until the task is run the specified number of times.

  • If the task completes its first execution at 12:05 on the next Monday, the task starts its second execution at 10:00 on the next Wednesday. This process continues until the task is run the specified number of times.

Custom

Use a CRON expression to specify a custom start time for a migration task.

Note

A CRON expression consists of six fields that are separated by spaces. The six fields specify the start time of a migration task in the following order: second, minute, hour, day of the month, month, and day of the week.

The following sample CRON expressions are for reference only. To generate more CRON expressions, use a CRON expression generator.

  • 0 0 * * * *: specifies that a migration task is run at the beginning of each hour.

  • 0 0 0/1 * * ?: specifies that a migration task is run at an interval of 1 hour, which is the minimum interval.

  • 0 0 12 * * MON-FRI: specifies that a migration task is run at 12:00 every Monday to Friday.

  • 0 30 8 1,15 * *: specifies that a migration task is run at 8:30 on the 1st and 15th days of each month.
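
To make the six-field layout concrete, the following Python sketch splits a CRON expression into the named fields in the order given in the note above. It only labels the fields and checks their count. It does not validate the values or compute run times, and the function name is hypothetical.

```python
from typing import Dict

# Field order as described in this topic.
CRON_FIELDS = ("second", "minute", "hour",
               "day_of_month", "month", "day_of_week")


def split_cron(expression: str) -> Dict[str, str]:
    """Label each field of a six-field CRON expression."""
    parts = expression.split()
    if len(parts) != len(CRON_FIELDS):
        raise ValueError(
            f"expected {len(CRON_FIELDS)} fields, got {len(parts)}")
    return dict(zip(CRON_FIELDS, parts))
```

For example, split_cron("0 30 8 1,15 * *") maps minute to "30", hour to "8", and day_of_month to "1,15". A five-field expression such as "0 12 * * *" raises ValueError.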

Step 5: Verify data

Data Online Migration only migrates data and does not guarantee data consistency or integrity. After a migration task is complete, you must review all the migrated data and verify the data consistency between the source and destination data addresses.

Warning

Make sure that you verify the migrated data at the destination data address after a migration task is complete. If you delete the data at the source data address before you verify the migrated data at the destination data address, you are liable for the losses and consequences caused by any data loss.
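
One simple way to start the verification is to compare object listings of the source and destination data addresses. The following Python sketch assumes that you have already exported each listing as a dictionary that maps an object path (relative to the configured prefix) to the object size in bytes. How you obtain the listings is outside the scope of this sketch, and the function name is hypothetical. A size comparison detects missing and truncated files only. Comparing checksums is a stronger check.

```python
from typing import Dict, List, Tuple


def compare_listings(source: Dict[str, int],
                     destination: Dict[str, int]) -> Tuple[List[str], List[str]]:
    """Return (paths missing at the destination, paths whose sizes differ)."""
    missing = sorted(path for path in source if path not in destination)
    mismatched = sorted(
        path for path in source
        if path in destination and source[path] != destination[path]
    )
    return missing, mismatched
```

If both returned lists are empty, every source object has a destination object of the same size. Delete source data only after all of your own checks pass.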