Migrate data of an Internet-based application to OSS - - Alibaba Cloud Documentation Center

Background information

Enterprise A provides an Internet-based application that can be used to edit media files such as images and videos. The main components of the application are deployed in a cloud-based architecture that is provided by Enterprise B. Enterprise A uses a cloud-based storage service that is provided by Enterprise B to store data. The media file editing application contains approximately 100,000,000 existing files and the total size of data is approximately 320 TB. The size of data increases by 20 GB per day. The business scenario of the application requires a maximum bandwidth of 50 Mbit/s. OSS provides a bandwidth of 250 Mbit/s for data writes and data reads.

To grow their business, Enterprise A wants to migrate data to OSS. The existing data and the incremental data must be migrated to OSS. To ensure that the data migration does not interrupt the business of the application, the following requirements must be met by the migration job:

During the migration, customers of Enterprise A can read data.
After the migration is complete, all data of the application is migrated to OSS and the application can run as expected without interruptions.

Migration solution

The following migration solution is created based on the requirements and the background information of Enterprise A:

Use Data Online Migration to migrate the existing data to OSS. Before the migration job is complete, the storage configuration of the application cannot be changed.
Create pull-from-origin rules in OSS. After the existing data is migrated to OSS, the pull-from-origin feature must be enabled. This way, customers of Enterprise A can access incremental data that is not migrated to OSS.
Change the data store of the application to OSS.
After the data store of the application is changed to OSS, use Data Online Migration to migrate the incremental data to OSS.
After the migration is complete, you must check and make sure that all data is migrated to OSS, and then delete the data from the source.

Step 1: Migrate the existing data

Create an OSS bucket that is used to store the migrated data. For more information, see Create buckets.
Obtain the AccessKey pairs that are used to migrate the data:
- Obtain the AccessKey pair that is used to access the data source. You can log on to the service console that is provided by Enterprise B and view the AccessKey pair.
- Obtain the AccessKey pair of the RAM user that you use to access the destination OSS bucket. For more information, see Create a RAM user and grant permissions to the RAM user.
In Data Online Migration console, create a migration job. For more information, see Implement migration.
In the Job Config step, configure the migration job. The following figure shows a sample configuration.
In the Performance step, configure the parameters to control the migration performance. The following figure shows a sample configuration.
After the migration is complete, you can export and view the result report to check whether all data is migrated. For information about how to obtain the result report, see View a migration report.

Note If a file fails to be migrated, troubleshoot the failure. For information about how to troubleshoot migration failures, see Common causes of a migration failure and solutions.

Step 2: Create pull-from-origin rules

The job takes approximately 25 days to migrate the existing data. During the migration, incremental data is generated and stored in the source bucket. To ensure business continuity and a seamless business switchover, you must create pull-from-origin rules to enable the pull-from-origin feature. After a pull-from-origin rule is created, OSS can fetch incremental data from the source bucket when incremental data that is not migrated to the OSS bucket is queried.

Log on to the OSS console.
In the left-side navigation pane, click Buckets.
On the Buckets page, click the name of the bucket for which you want to create a pull-from-origin rule.
In the left-side navigation pane, choose Basic Settings > Back-to-Origin. In the Back-to-Origin section, click Configure.
On the page that appears, click Create Rule. In the Create Rule panel, configure the parameters.
- Mode: Select Mirroring.
- Prerequisite: By default, HTTP Status Code 404 is selected. You can specify a prefix and a suffix for the names of files.
- Origin URL: Enter the URL of the source bucket.
- For more information about how to configure the parameters, see Create pull-from-origin rules.
Note You can create a maximum of five pull-from-origin rules for an OSS bucket. The created rules take effect at the same time. If the data is migrated from multiple source buckets, you can create multiple pull-from-origin rules and specify different prefixes for the names of files from different source buckets.
Click OK.

Step 3: Change the data store of the application to OSS

Modify the configurations of the servers to change the URL to which the servers connect to write and read data to the URL of the OSS bucket.

Step 4: Migrate the incremental data

During data migration, the application generates approximately 100,000 files that contain approximately 500 GB of data. These files must be migrated to the OSS bucket.

Create a migration job to migrate the incremental data. For more information, see Step 1: Migrate the existing data.

In the Job Config step, configure the migration job. The following figure shows a sample configuration.
In the Performance step, configure the parameters to control the migration performance. The following figure shows a sample configuration.
Click Create. After the migration job is created, the system starts to migrate data.
After the migration is complete, you can export and view the result report to check whether all data is migrated. For information about how to obtain the result report, see View a migration report.

Note If a file fails to be migrated, troubleshoot the failure. For information about how to troubleshoot migration failures, see Common causes of a migration failure and solutions.

Step 5: Delete the data from the source bucket

To avoid extra storage costs, you can set the lifecycle of the objects that are stored in the source bucket to one day after all data is migrated to OSS. This way, all data in the source bucket is deleted the next day.