E-MapReduce: Use JindoTable to archive and unarchive tables or partitions in OSS

Last Updated: Nov 01, 2023

JindoTable allows you to run the archiveTable and unarchiveTable commands to archive and unarchive tables or partitions in Object Storage Service (OSS). This topic describes the archiveTable and unarchiveTable commands.

Limits

This topic applies only to scenarios in which a Hive metastore is used to store metadata.

archiveTable

You can use the archiveTable command to archive tables or partitions in OSS.

Obtain help information

Run the following command to obtain help information:

jindotable -help archiveTable

Parameter description

jindotable -archiveTable -t <dbName.tableName> -i/-a/-ca [-c "<condition>" | -fullTable] [-b/-before <before days>] [-p/-parallel <parallelism>] [-mr/-mapReduce] [-e/-explain] [-w/-workingDir <working directory>] [-l/-logDir <log directory>]

Parameter

Description

Required

-t <dbName.tableName>

The name of the table that you want to archive. You must configure this parameter in the Database name.Table name format.

  • Separate the database name and table name with a period (.).

  • The table can be a partitioned table or a non-partitioned table.

Yes

-i/-a/-ca

The storage class in which you want to archive data. You can use one of the following options to specify a storage class:

  • -a: Archive

  • -i: Infrequent Access (IA)

    Note

    If you use the -i option in the command, the files whose storage class is Archive are skipped.

  • -ca: Cold Archive

Yes

-c "<condition>" / -fullTable

You must configure either -c "<condition>" or -fullTable.

  • If you configure -c "<condition>", only the partitions that meet the filter condition are archived. Common comparison operators, such as greater than (>), are supported. For example, if the partition key column is ds, its data type is String, and you want to archive the partitions whose ds value is greater than 'd', use -c " ds > 'd' ".

  • If you configure -fullTable, the entire partitioned or non-partitioned table is archived.

No

-b/-before <before days>

Only the tables or partitions that were created at least the specified number of days ago are archived.

No

-p/-parallel <parallelism>

The parallelism of archive operations.

No

-mr/-mapReduce

Hadoop MapReduce instead of local multithreading is used to archive data.

No

-e/-explain

The explain mode is used. In explain mode, the list of partitions to be archived is displayed, but no data is archived.

No

-w/-workingDir <working directory>

The working directory of a MapReduce job. This option is used only when you run a MapReduce job. The directory can be left empty or not. Temporary files are created when you run the MapReduce job and are automatically deleted after the job is complete.

No

-l/-logDir <log directory>

The directory in which log files are stored.

No
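
The parameters above can be combined into a preview-then-archive workflow. The following is a minimal sketch rather than an official script: the archive_to_ia helper, the sales_db.orders table name, and the values 30 and 8 are hypothetical, and only flags described in this topic are passed to jindotable.

```shell
# Hypothetical helper: archive the partitions of a table to Infrequent Access (-i).
#   $1 = table in dbName.tableName format
#   $2 = partition filter, for example: ds > 'd'
#   remaining arguments are passed through unchanged (for example -b, -p, -e)
archive_to_ia() {
  table=$1
  condition=$2
  shift 2
  # Quoting "$condition" keeps the whole filter as a single argument to -c.
  jindotable -archiveTable -t "$table" -i -c "$condition" "$@"
}

# Usage sketch (requires JindoTable on the PATH of an EMR node):
#   archive_to_ia sales_db.orders "ds > 'd'" -b 30 -e     # explain mode: list partitions only
#   archive_to_ia sales_db.orders "ds > 'd'" -b 30 -p 8   # archive with a parallelism of 8
```

Running the command with -e first lists the partitions that match the filter, so the condition can be checked before any storage class is changed.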

unarchiveTable

You can use the unarchiveTable command to unarchive tables or partitions in OSS.

Obtain help information

Run the following command to obtain help information:

jindotable -help unarchiveTable

Parameter description

jindotable -unarchiveTable -t <dbName.tableName> [-i/-a/-o/-cr] [-notWait] [-c "<condition>" | -fullTable] [-d/-restoreDays <restore days>] [-b/-before <before days>] [-p/-parallel <parallelism>] [-mr/-mapReduce] [-e/-explain] [-w/-workingDir <working directory>] [-l/-logDir <log directory>]

Parameter

Description

Required

-t <dbName.tableName>

The name of the table that you want to unarchive. You must configure this parameter in the Database name.Table name format.

  • Separate the database name and table name with a period (.).

  • The table can be a partitioned table or a non-partitioned table.

Yes

-i/-a/-o/-cr

If you do not configure this parameter, the storage class of the data that you want to unarchive is changed to Standard.

You can use one of the following options:

  • -i: The storage class of the data is changed to IA. Files whose storage class is Standard are skipped.

  • -a: The storage class of the data is changed to Archive.

  • -o: Data is only temporarily unarchived and the storage class of the data is retained. Files whose storage class is Standard or IA are skipped. Files that were previously unarchived are also skipped so that they are not unarchived again.

  • -cr: Checks whether all unarchive jobs are complete.

No

-notWait

This parameter is valid only when you unarchive data. If you configure this parameter, the system exits the current process without waiting for the completion of the unarchive operation performed by the OSS server. If you do not configure this parameter, the system exits the current process after the unarchive operation is complete or times out. The timeout period is 10 minutes.

No

-c "<condition>" / -fullTable

You must configure either -fullTable or -c "<condition>".

  • If you configure -fullTable, the entire partitioned or non-partitioned table is unarchived.

  • If you configure -c "<condition>", only the partitions that meet the filter condition are unarchived. Common comparison operators, such as greater than (>), are supported. For example, if the partition key column is ds, its data type is String, and you want to unarchive the partitions whose ds value is greater than 'd', use -c " ds > 'd' ".

No

-d/-restoreDays <restore days>

The number of days to retain the unarchive state when you perform only the unarchive operation. The default value is one day.

For example, this parameter takes effect if you use the -o option to perform only the unarchive operation on tables that are stored in the Cold Archive storage class. It also takes effect during the intermediate stage in which the storage class of a table that has never been unarchived is changed from Archive or Cold Archive to Standard. It does not take effect when the storage class of a table is changed from IA to Standard.

No

-b/-before <before days>

Only the tables or partitions that were created at least the specified number of days ago are unarchived.

No

-p/-parallel <parallelism>

The parallelism of unarchive operations.

No

-mr/-mapReduce

Hadoop MapReduce instead of local multithreading is used to unarchive data.

No

-e/-explain

The explain mode is used. In explain mode, the list of partitions to be unarchived is displayed, but no data is unarchived.

No

-w/-workingDir <working directory>

The working directory of a MapReduce job. This option is used only when you run a MapReduce job. The directory can be left empty or not. Temporary files are created when you run the MapReduce job and are automatically deleted after the job is complete.

No

-l/-logDir <log directory>

The directory in which log files are stored.

No
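
The -o, -notWait, -d, and -cr options described above suggest a two-step restore: start a temporary restore without blocking, then check on it later. The following is a minimal sketch under that assumption. The restore_start and restore_check helpers and the sales_db.orders table name are hypothetical. This topic does not describe how -cr reports completion (output or exit code), so the sketch only issues the command and does not interpret the result.

```shell
# Hypothetical helper: start a temporary restore (-o) and return immediately (-notWait).
#   $1 = table in dbName.tableName format
#   $2 = partition filter, for example: ds > 'd'
#   $3 = number of days to keep the data in the unarchived state (-d)
restore_start() {
  jindotable -unarchiveTable -t "$1" -o -notWait -c "$2" -d "$3"
}

# Hypothetical helper: ask whether the unarchive jobs for the same partitions are complete (-cr).
restore_check() {
  jindotable -unarchiveTable -t "$1" -cr -c "$2"
}

# Usage sketch (requires JindoTable on the PATH of an EMR node):
#   restore_start sales_db.orders "ds > 'd'" 3
#   restore_check sales_db.orders "ds > 'd'"
```

Without -notWait, the command waits until the unarchive operation is complete or the 10-minute timeout is reached, so separating the start from the check may be more practical for restores that are expected to run longer than that.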