All Products
Search
Document Center

Data Management:Archive data to an OSS bucket

Last Updated:Jun 27, 2024

This topic describes how to archive data to an Object Storage Service (OSS) bucket and restore the archived data to the source database.

Prerequisites

  • The source database from which you want to archive data is of one of the following types:

    • MySQL: ApsaraDB RDS for MySQL, PolarDB for MySQL, and AnalyticDB for MySQL V3.0

    • PostgreSQL: ApsaraDB RDS for PostgreSQL and PolarDB for PostgreSQL

    • MongoDB 3.6 and later

    • PolarDB-X

    Note

    The account of a MySQL database must have the REPLICATION CLIENT permission.

  • The source and destination database instances are managed in Stable Change or Security Collaboration mode.

    Note

    If you want to configure a periodic data archiving task and the source and destination database instances are managed in Stable Change mode, we recommend that you enable security hosting for the database instances or change the control mode of the database instances to Security Collaboration. Otherwise, the data archiving task may fail due to instance logon expiration. For more information, see the Enable security hosting section of the "Security hosting" topic and Change the control mode of an instance.

  • A Standard or an Infrequent Access (IA) OSS bucket is purchased. For more information, see the Activate OSS section of the "Get started by using the OSS console" topic.

  • The OSS bucket is registered with Data Management (DMS). For more information, see Register an Alibaba Cloud database instance.

    Important
    • We recommend that you register an OSS bucket with DMS over a public endpoint. If an OSS bucket is registered with DMS over an internal endpoint, you cannot archive data to the OSS bucket. If you have other questions, contact technical support in the DingTalk group (ID 19040023327).

    • Before you register the OSS bucket with DMS, make sure that you have an AccessKey ID and AccessKey secret of the current Alibaba Cloud account to complete identity verification. For more information about how to obtain an AccessKey pair, see Create an AccessKey pair.

  • The source tables whose data you want to archive have a primary key or a unique key.

    Note

    We recommend that you provide a field that indicates the time of each data modification operation in the source tables. You can use this field as a filter condition when you archive data from the source tables.

Usage notes

  • If the source table has a composite primary key that consists of three columns or the source table has three or more unique keys, the data archiving task may fail. We recommend that you fully test the source tables that have a three-column composite primary key before you perform data archiving, or use other archiving methods such as Database Backup (DBS).

  • The data archiving feature is supported only in the Singapore and Indonesia (Jakarta) regions.

Procedure

  1. Log on to the DMS console V5.0.
  2. In the top navigation bar, choose Solution > Data archiving.
  3. In the upper-right corner of the Data Archiving Tickets page, click Data Archiving.

  4. On the Ticket Application page, configure parameters to create a data archiving ticket. The following table describes the parameters.

    Parameter

    Description

    Task Name

    The name of the data archiving task. We recommend that you specify a descriptive name that makes it easy to identify the task.

    Archiving Destination

    The destination to which you want to archive data. Select OSS.

    OSS Instance

    The destination OSS instance to which you want to archive data.

    Note

    Select a Standard or an IA OSS instance.

    OSS BUCKET

    The OSS bucket that belongs to the selected OSS instance.

    Important
    • If you cannot find an OSS bucket that you require in the selected OSS instance, log on to the OSS instance in the DMS console. For more information, see Register an OSS bucket.

    • The archived data is stored in the archive folder in the OSS bucket. Do not delete the folder.

    Source Database

    The source database from which you want to archive data.

    Archive Configuration

    Specifies one or more tables whose data you want to archive.

    • Optional. Specify one or more filter conditions for querying data to be archived in the specified tables.

      Note

      If you want to use time variables in scenarios such as archiving data generated six months ago, you can set the variables in the Variable Configuration section before you configure the parameters in the Archiving Configuration section.

    • Optional. Click Add to add more source tables.

    Archive Table Mapping

    The table settings to be used in the destination bucket. You can find a table and click Edit in the Actions column to configure the table information.

    Variable Configuration

    Optional. The variables that are used when you configure conditions for filtering archived data. For example, you have created a time variable named 6_month_ago in the yyyy-MM-dd format and set the offset to - 6 Month. In this case, if the current date is August 12, 2021, the value of the ${6_month_ago} variable is 2021-02-11, which indicates February 11, 2021. For more information about how to configure time variables, see the Configure time variables section of the "Variables" topic.

    Post-behavior

    • Specifies whether to delete the archived data from the source tables.

      • If you clear Clean up the archived data of the original table (delete-No Lock), the archived data is not deleted from the source tables.

      • If you select Clean up the archived data of the original table (delete-No Lock), DMS automatically deletes the archived data from the source tables. The source tables are not locked during the deletion process.

    • Specifies whether to verify archived data.

      • If you clear Verify the archived data (time-consuming), DMS does not verify the archived data.

      • If you select Verify the archived data (time-consuming), DMS compares the archived data with the original data in the source database and displays data discrepancies in addition to the archived data.

        The amount of time consumed for archived data verification depends on the amount of archived data.

    Operation Mode

    The method to be used for running the data archiving task. Valid values:

    • Single execution: After the data archiving ticket is approved, DMS runs the data archiving task only once.

    • Cyclic scheduling: After the data archiving ticket is approved, DMS runs the data archiving task based on the schedule that you specify. For more information, see the Periodic scheduling section of the "Archive data to a Lindorm instance" topic.

  5. Click Submit.

    After the data archiving ticket is approved, DMS automatically runs the data archiving task.

    Note

    The data archiving feature implements data archiving and post-behavior based on the UTC+0 time zone regardless of the time zone of the database to be archived.

  6. Optional. After the data is archived and you verify that the archived data is correct, delete the archived data from the source tables and optimize the storage usage of the source tables.

    1. Delete the archived data from the source tables.

      • If you select Clean up the archived data of the original table (delete-No Lock) for the Post-behavior parameter, DMS deletes the archived data from the source tables and generates temporary backup tables in the source database.

        To delete the temporary backup tables from the source database, create a Normal Data Modify ticket. For more information, see Normal data modify.

        Note

        In the lower part of the Data Archiving Ticket Details panel, click Click here to view the DROP statement used to drop the temporary table. and copy the SQL statements that are used to delete the temporary backup tables.

      • If you clear Clean up the archived data of the original table (delete-No Lock) for the Post-behavior parameter, you need to manually delete the archived data from the source tables.

        To delete the archived data from the source tables, create a Normal Data Modify ticket.

    2. To optimize the storage usage of the source tables, create a lock-free change ticket. For more information, see Perform lock-free DDL operations.

      Note

      In the lower part of the Data Archiving Ticket Details panel, click View SQL Statements Used to Optimize Storage Usage of Source Table and copy the SQL statements that are used to optimize the storage usage of the source tables.

  7. Query the archived data. For more information, see the Query data archived to OSS buckets section of the "Query data archived to OSS buckets or dedicated storage" topic.

Restore the archived data to the source database

Billing

You can restore the archived data to the source database free of charge.

Limits

You can use this feature only if the source database engine is ApsaraDB RDS for MySQL, PolarDB for MySQL, or AnalyticDB for MySQL V3.0.

Procedure

  1. On the Data Archiving Tickets page, set the View parameter to Ticket View.

  2. Find the ticket that you want to manage and click Details in the Actions column.

  3. On the Ticket Details page, click Execute. In the Execute section, click Restore to Source Database in the Actions column.

  4. In the message that appears, click OK.

    A data restoration task is created. The system automatically executes the data restoration task. Wait until the task is complete. The duration of this process depends on the amount of data to be restored.

    Important

    After you restore the archived data to the source database, the system automatically creates temporary tables in the source database. This may cause insufficient storage space in the source database. Proceed with caution.

  5. Query the restored data.

    1. In the Basic Information section of the Ticket Details page, move the pointer over the database name next to Database to Change and click Query in the pop-up window.

    2. Double-click the table name and click Execute.

      You can view the restored data on the Execution History tab that appears in the lower part of the SQLConsole tab.