JindoTable allows you to run the archiveTable and unarchiveTable commands in SDK mode to archive and unarchive data in Object Storage Service (OSS). The commands do not rely on the Jindo Namespace Service component of SmartData. This topic describes how to use the archiveTable and unarchiveTable commands.
Prerequisites
- Java Development Kit (JDK) 8 is installed on your computer.
- An E-MapReduce (EMR) cluster is created. For more information, see Create a cluster.
- The partitioned table or non-partitioned table that you want to archive is stored in OSS. Only table data can be archived.
Background information
You can use the original archive and unarchive commands of JindoTable to archive or unarchive tables or partitions in OSS. However, these commands rely on the Jindo Namespace Service component of SmartData. The new archiveTable and unarchiveTable commands do not rely on this component and offer the following advantages:
- You can run the archiveTable and unarchiveTable commands even if the SmartData service is not deployed in your cluster. For example, you can run the commands on a self-managed cluster.
- You can configure filter parameters in the archiveTable or unarchiveTable command to archive or unarchive a large number of partitions on multiple threads at the same time. If local multithreading cannot meet your business requirements, you can run MapReduce tasks on the entire cluster to archive or unarchive data.
For more information about the original archive and unarchive commands, see Use JindoTable.
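The local multithreading described above can be pictured with a minimal sketch. It is an illustration of the general idea only, not JindoTable's implementation: the partition names, the `keep` filter, and the `archive_partition` function are all hypothetical placeholders, and a real run would call the archiveTable command rather than this code.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def process_partitions(
    partitions: List[str],
    keep: Callable[[str], bool],
    action: Callable[[str], str],
    threads: int = 4,
) -> Dict[str, str]:
    """Apply `action` to every partition accepted by `keep`, using a thread pool."""
    # Step 1: the filter decides which partitions are processed at all.
    selected = [p for p in partitions if keep(p)]
    # Step 2: the selected partitions are handled concurrently on local threads.
    with ThreadPoolExecutor(max_workers=threads) as pool:
        results = list(pool.map(action, selected))  # map() preserves input order
    return dict(zip(selected, results))


def archive_partition(partition: str) -> str:
    """Hypothetical stand-in for the work done on one partition."""
    return "archived"


# Hypothetical partitions of a table that is partitioned by date.
example_partitions = ["dt=2021-01-01", "dt=2021-01-02", "dt=2021-02-01"]

# Hypothetical filter: only partitions dated before February 2021.
january_only = lambda p: p < "dt=2021-02-01"

outcome = process_partitions(example_partitions, january_only, archive_partition)
```

When the number of partitions is too large for one machine to handle in reasonable time, the same filter-then-apply pattern is what a cluster-wide MapReduce task distributes across nodes.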
Limits
The archiveTable and unarchiveTable commands are supported only in EMR V3.36.0 and later minor versions, and EMR V5.2.0 and later minor versions.
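The version rule above can be expressed as a small check. This is a sketch under one stated assumption: "later minor versions" is read as staying within the same major version line, so a V4.x cluster is treated as unsupported. The function names and the accepted version string format are hypothetical.

```python
import re
from typing import Tuple

# Minimum supported version for each major version line, taken from the limit above.
MINIMUM_BY_MAJOR = {3: (3, 36, 0), 5: (5, 2, 0)}


def parse_version(text: str) -> Tuple[int, ...]:
    """Extract the first 'X.Y.Z' triple from a string such as 'EMR V3.36.0'."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no X.Y.Z version found in {text!r}")
    return tuple(int(part) for part in match.groups())


def is_supported(text: str) -> bool:
    """Return True if the EMR version meets the limit for its major version line."""
    version = parse_version(text)
    minimum = MINIMUM_BY_MAJOR.get(version[0])
    # Assumption: major version lines not listed in the limit are unsupported.
    return minimum is not None and version >= minimum
```

For example, under this reading `is_supported("EMR V3.36.0")` and `is_supported("EMR V5.2.1")` hold, while `is_supported("EMR V3.35.0")` and `is_supported("EMR V5.1.0")` do not.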
archiveTable
You can use the archiveTable command to archive tables or partitions in OSS.
unarchiveTable
The syntax of the unarchiveTable command is similar to the syntax of the archiveTable command. You can use the unarchiveTable command to unarchive tables or partitions in OSS.
The only difference between the parameters of the two commands is that the unarchiveTable command takes the optional -i/-o parameter in place of the required -a/-i parameter of the archiveTable command. The option that you specify determines how the data is unarchived:
- If you do not specify -i/-o, the storage class of the data that you want to unarchive is changed to Standard.
- If you specify the -i option, the storage class of the data that you want to unarchive is changed to IA. Files whose storage class is Standard are skipped.
- If you specify the -o option, data is only temporarily unarchived and its storage class is retained. Files whose storage class is Standard or IA are skipped. Files that have already been temporarily unarchived are also skipped, so they are not unarchived again.
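The three cases above can be summarized as a decision function. This is a minimal sketch of the rules as stated in this topic, not JindoTable code: the `StorageClass` enum, the `plan_unarchive` function, and the action names are hypothetical, and the sketch does not model cases that the rules above leave unstated.

```python
from enum import Enum
from typing import Optional


class StorageClass(Enum):
    STANDARD = "Standard"
    IA = "IA"
    ARCHIVE = "Archive"


def plan_unarchive(
    option: Optional[str],
    current: StorageClass,
    already_restored: bool = False,
) -> str:
    """Return the action for one file: 'to_standard', 'to_ia', 'restore', or 'skip'.

    `option` is None when neither -i nor -o is specified, otherwise '-i' or '-o'.
    `already_restored` marks a file that was previously unarchived temporarily.
    """
    if option is None:
        # No -i/-o: the storage class is changed to Standard.
        return "to_standard"
    if option == "-i":
        # -i: the storage class is changed to IA; Standard files are skipped.
        return "skip" if current is StorageClass.STANDARD else "to_ia"
    if option == "-o":
        # -o: temporary unarchive only; the storage class is retained.
        if current in (StorageClass.STANDARD, StorageClass.IA):
            return "skip"
        if already_restored:
            return "skip"  # avoid unarchiving the same file repeatedly
        return "restore"
    raise ValueError(f"unknown option: {option!r}")
```

For an archived file, the function returns `to_standard` without an option, `to_ia` with -i, and `restore` with -o, unless the file was already temporarily unarchived.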