Object Storage Service: Snapshot (Use snapshots to back up and restore data)

Last Updated: Dec 08, 2024

You can use snapshots to restore data that is accidentally deleted, or to back up data so that your service can continue to run when an error occurs. The snapshot feature of OSS-HDFS works in the same way as the snapshot feature of HDFS and supports directory-level operations.

Important

This feature is in the trial phase and is intended for small-scale use. We recommend that you do not use it at large scale.

Prerequisites

Step 1: Configure environment variables

  1. Connect to an Elastic Compute Service (ECS) instance. For more information, see Connect to an instance.

  2. Go to the bin directory of the installed JindoSDK JAR package.

    cd jindosdk-x.x.x/bin/

    Note

    x.x.x indicates the version number of the JindoSDK JAR package.

  3. Grant the owner read, write, and execute permissions on the jindo-util file in the bin directory.

    chmod 700 jindo-util

  4. Rename the jindo-util file to jindo.

    mv jindo-util jindo

  5. Create a configuration file named jindosdk.cfg, and then add the following parameters to the configuration file. In the [common] section, retain the default configurations. In the [jindosdk] section, specify the endpoint of OSS-HDFS and the AccessKey ID and AccessKey secret that are used to access OSS-HDFS. In this example, the endpoint of the China (Hangzhou) region is used. Specify your actual endpoint.

    [common]
    logger.dir = /tmp/jindo-util/
    logger.sync = false
    logger.consolelogger = false
    logger.level = 0
    logger.verbose = 0
    logger.cleaner.enable = true
    hadoopConf.enable = false

    [jindosdk]
    fs.oss.endpoint = cn-hangzhou.oss-dls.aliyuncs.com
    fs.oss.accessKeyId = LTAI********
    fs.oss.accessKeySecret = KZo1********

  6. Configure environment variables.

    export JINDOSDK_CONF_DIR=<JINDOSDK_CONF_DIR>

    Set <JINDOSDK_CONF_DIR> to the absolute path of the directory that contains the jindosdk.cfg configuration file.
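
Steps 5 and 6 can also be scripted. The following is a minimal sketch, not an official tool: the function name write_jindo_conf and its arguments are hypothetical, and the sketch assumes that JINDOSDK_CONF_DIR points to the directory that holds jindosdk.cfg. The [common] values are the defaults listed in step 5. The file is created with owner-only permissions because it contains the AccessKey secret.

```shell
# Hypothetical helper (not part of JindoSDK): write jindosdk.cfg into a directory.
# Usage: write_jindo_conf <conf_dir> <endpoint> <access_key_id> <access_key_secret>
write_jindo_conf() {
    mkdir -p "$1" || return 1
    (
        # Restrict the new file to its owner because it stores credentials.
        umask 077
        cat > "$1/jindosdk.cfg" <<EOF
[common]
logger.dir = /tmp/jindo-util/
logger.sync = false
logger.consolelogger = false
logger.level = 0
logger.verbose = 0
logger.cleaner.enable = true
hadoopConf.enable = false

[jindosdk]
fs.oss.endpoint = $2
fs.oss.accessKeyId = $3
fs.oss.accessKeySecret = $4
EOF
    )
}

# Example call (placeholders taken from step 5; the directory name is an assumption):
#   write_jindo_conf "$HOME/jindo-conf" cn-hangzhou.oss-dls.aliyuncs.com "LTAI********" "KZo1********"
#   export JINDOSDK_CONF_DIR="$HOME/jindo-conf"
```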

Step 2: Perform snapshot-related operations

Enable the snapshot feature

For example, you have a bucket named examplebucket and a directory named exampledir in the bucket. To enable the snapshot feature for the exampledir directory, run the following command in the JindoSDK shell CLI:

./jindo admin -allowSnapshot -dlsUri oss://examplebucket.cn-shanghai.oss-dls.aliyuncs.com/exampledir

For more information about how to configure the endpoint of OSS-HDFS, see Connect non-EMR clusters to OSS-HDFS.

Create a snapshot

After you enable the snapshot feature for the exampledir directory in the examplebucket bucket, perform the following operations to create a snapshot:

  1. Create subdirectories and objects.

    Create subdirectories named dir1 and dir2, and objects named file1.txt and file2.txt in the exampledir directory.

    # Create the dir1 subdirectory.
    hdfs dfs -mkdir oss://examplebucket.cn-shanghai.oss-dls.aliyuncs.com/exampledir/dir1
    # Create the dir2 subdirectory.
    hdfs dfs -mkdir oss://examplebucket.cn-shanghai.oss-dls.aliyuncs.com/exampledir/dir2
    # Create the file1.txt object.
    hdfs dfs -touchz oss://examplebucket.cn-shanghai.oss-dls.aliyuncs.com/exampledir/file1.txt
    # Create the file2.txt object.
    hdfs dfs -touchz oss://examplebucket.cn-shanghai.oss-dls.aliyuncs.com/exampledir/file2.txt

  2. Create a snapshot named S1 for the exampledir directory.

    hdfs dfs -createSnapshot oss://examplebucket.cn-shanghai.oss-dls.aliyuncs.com/exampledir S1
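
For recurring backups, a fixed name such as S1 can be used only once per directory. The following sketch wraps the command above so that each snapshot name encodes its UTC creation time. The function name create_dated_snapshot and the S-YYYYMMDD-HHMMSS naming scheme are hypothetical, and the sketch assumes that OSS-HDFS accepts the same snapshot name characters as HDFS.

```shell
# Hypothetical helper: create a snapshot named after the current UTC time
# and print the chosen name on standard output.
# Usage: create_dated_snapshot <snapshottable_dir_uri>
create_dated_snapshot() {
    snap_name="S-$(date -u +%Y%m%d-%H%M%S)"
    # Send the command's own messages to stderr so that stdout carries only the name.
    hdfs dfs -createSnapshot "$1" "$snap_name" >&2 || return 1
    echo "$snap_name"
}

# Example call:
#   name=$(create_dated_snapshot oss://examplebucket.cn-shanghai.oss-dls.aliyuncs.com/exampledir)
```

Two calls within the same second produce the same name, so the second call fails. That is acceptable for hourly or daily schedules.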

Rename a snapshot

You can run the following command in the HDFS shell CLI to rename the S1 snapshot to S2:

hdfs dfs -renameSnapshot oss://examplebucket.cn-shanghai.oss-dls.aliyuncs.com/exampledir S1 S2

Access directories and objects in a snapshot

To access the dir1 subdirectory in the exampledir root directory of the examplebucket bucket, run the following command in the HDFS shell CLI:

hdfs dfs -ls oss://examplebucket.cn-shanghai.oss-dls.aliyuncs.com/exampledir/dir1

Each snapshot exposes the contents of the exampledir directory as they were when the snapshot was created, under the .snapshot path of that directory. For example, to access the dir1 subdirectory in the S1 snapshot, run the following command in the HDFS shell CLI:

hdfs dfs -ls oss://examplebucket.cn-shanghai.oss-dls.aliyuncs.com/exampledir/.snapshot/S1/dir1
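
The mapping from a live path to its snapshot path is mechanical: insert .snapshot/<snapshot name> between the snapshottable directory and the relative path. The following sketch shows that mapping. The function name snapshot_path is hypothetical.

```shell
# Hypothetical helper: print the path of an entry as seen through a snapshot.
# Usage: snapshot_path <snapshottable_dir_uri> <snapshot_name> <relative_path>
snapshot_path() {
    root=${1%/}   # drop one trailing slash from the directory URI, if present
    rel=${3#/}    # drop one leading slash from the relative path, if present
    printf '%s/.snapshot/%s/%s\n' "$root" "$2" "$rel"
}

# Example call:
#   snapshot_path oss://examplebucket.cn-shanghai.oss-dls.aliyuncs.com/exampledir S1 dir1
#   prints oss://examplebucket.cn-shanghai.oss-dls.aliyuncs.com/exampledir/.snapshot/S1/dir1
```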

Compare snapshots

To compare the S1 and S2 snapshots in the exampledir directory, run the following command in the JindoSDK shell CLI:

./jindo admin -snapshotDiff \
                -dlsUri oss://examplebucket.cn-shanghai.oss-dls.aliyuncs.com/exampledir \
                -fromSnapshot S1 \
                -toSnapshot S2

Use a snapshot to restore data

If you accidentally delete data, you can restore it from a snapshot. For example, you delete the dir1 subdirectory from the exampledir root directory of the examplebucket bucket by running the following command:

hdfs dfs -rm -r oss://examplebucket.cn-shanghai.oss-dls.aliyuncs.com/exampledir/dir1

To restore the deleted subdirectory, copy it from the S1 snapshot of the exampledir directory back to the exampledir directory by running the following command in the HDFS shell CLI:

hdfs dfs -cp oss://examplebucket.cn-shanghai.oss-dls.aliyuncs.com/exampledir/.snapshot/S1/dir1 oss://examplebucket.cn-shanghai.oss-dls.aliyuncs.com/exampledir

After the data is restored, run the following command to verify that the dir1 subdirectory exists again:

hdfs dfs -ls oss://examplebucket.cn-shanghai.oss-dls.aliyuncs.com/exampledir/dir1
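
The restore procedure above can be wrapped with two safety checks. The following is a minimal sketch under stated assumptions: the function name restore_from_snapshot is hypothetical, the entry must be a direct child of the snapshottable directory, and the sketch assumes that hdfs dfs -test -e works against OSS-HDFS paths in the same way as it does against HDFS paths.

```shell
# Hypothetical helper: copy one direct child of a snapshottable directory
# from a snapshot back into the live directory.
# Usage: restore_from_snapshot <snapshottable_dir_uri> <snapshot_name> <entry_name>
restore_from_snapshot() {
    src="$1/.snapshot/$2/$3"
    dst="$1/$3"
    # Stop if the entry is not present in the snapshot.
    if ! hdfs dfs -test -e "$src"; then
        echo "not found in snapshot: $src" >&2
        return 1
    fi
    # Stop if a live entry with the same name exists, so that current data is left untouched.
    if hdfs dfs -test -e "$dst"; then
        echo "destination already exists: $dst" >&2
        return 2
    fi
    # Same copy as in the example above: snapshot entry into the live directory.
    hdfs dfs -cp "$src" "$1"
}

# Example call:
#   restore_from_snapshot oss://examplebucket.cn-shanghai.oss-dls.aliyuncs.com/exampledir S1 dir1
```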

Delete a snapshot

If you no longer need the S1 snapshot of the exampledir directory, or the S2 snapshot obtained by renaming the S1 snapshot, run one of the following commands in the HDFS shell CLI to delete it:

  • Delete the S1 snapshot

    hdfs dfs -deleteSnapshot oss://examplebucket.cn-shanghai.oss-dls.aliyuncs.com/exampledir S1

  • Delete the S2 snapshot

    hdfs dfs -deleteSnapshot oss://examplebucket.cn-shanghai.oss-dls.aliyuncs.com/exampledir S2

Disable the snapshot feature

If you no longer need to use the snapshot feature, run the following command in the JindoSDK shell CLI to disable the snapshot feature:

./jindo admin -disallowSnapshot -dlsUri oss://examplebucket.cn-shanghai.oss-dls.aliyuncs.com/exampledir

Important

Before you disable the snapshot feature, make sure that all snapshots in the destination path are deleted. Otherwise, an error occurs when you disable the snapshot feature.
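
The required order, delete every snapshot first and then disable the feature, can be sketched as a single function. The function name disable_snapshots and the JINDO_BIN variable are hypothetical. JINDO_BIN defaults to ./jindo, the executable renamed in Step 1. You pass the snapshot names explicitly; the sketch does not discover them.

```shell
# Hypothetical helper: delete the named snapshots, then disable the snapshot feature.
# Usage: disable_snapshots <snapshottable_dir_uri> <snapshot_name>...
disable_snapshots() {
    dir_uri=$1
    shift
    for snap in "$@"; do
        # Abort on the first failed deletion so that disallowSnapshot is not attempted too early.
        hdfs dfs -deleteSnapshot "$dir_uri" "$snap" || return 1
    done
    "${JINDO_BIN:-./jindo}" admin -disallowSnapshot -dlsUri "$dir_uri"
}

# Example call:
#   disable_snapshots oss://examplebucket.cn-shanghai.oss-dls.aliyuncs.com/exampledir S2
```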