In E-MapReduce (EMR) V3.30, JindoFS provides a tiered storage feature. This feature allows you to store hot and cold data on different storage media, which helps reduce storage costs and accelerate data access.
Use jindo jfs
Run the following command to obtain help information:
jindo jfs -help archive
Tiered-storage commands in JindoFS are asynchronous: each command only starts the related task and returns without waiting for the task to complete.
Cache
You can use this command to back up data stored in a specific path to local disks. You can then read the data from the local disks instead of reading it from Object Storage Service (OSS).
jindo jfs -cache -p <path>
If you specify the -p option, the local data is not cleared based on disk usage.
Uncache
You can use this command to delete backup data from local disks so that the data is stored only in OSS Standard storage.
jindo jfs -uncache <path>
Archive
You can use this command to delete backup data from local disks and store the data in OSS Infrequent Access (IA), Archive, or Cold Archive storage. For information about the storage classes, see Overview.
jindo jfs -archive -i|-a|-c <path>
- If you specify the -i option, data is stored in OSS IA storage.
- If you specify the -a option, data is stored in OSS Archive storage.
- If you specify the -c option, data is stored in OSS Cold Archive storage.
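The option list above can be wrapped in a small shell helper so that scripts name the target storage class instead of remembering single-letter flags. This is only a sketch: the function name archive_path and the class names ia, archive, and cold are hypothetical. Only the -i, -a, and -c flags come from the list above.

```shell
# Hypothetical helper: map a readable storage-class name to the archive flag.
# The names "ia", "archive", and "cold" are illustrative; the flags -i, -a,
# and -c are the ones listed for jindo jfs -archive.
archive_path() {
  local class="$1" path="$2" flag
  case "$class" in
    ia)      flag="-i" ;;
    archive) flag="-a" ;;
    cold)    flag="-c" ;;
    *)
      echo "unknown storage class: $class" >&2
      return 2
      ;;
  esac
  # The command is asynchronous: it only starts the archive task.
  jindo jfs -archive "$flag" "$path"
}

# Example call (the path reuses the placeholder bucket from this topic):
#   archive_path ia oss://xxxx/warehouse
```

An unknown class name returns exit code 2 without calling jindo, so a typo does not start a task by accident.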
Unarchive
You can use this command to convert the storage class of data from Archive to IA or Standard. You can also temporarily unarchive data that is stored in Archive storage so that the data is readable for one day.
jindo jfs -unarchive -i|-o <path>
By default, this command converts data to OSS Standard storage.
- If you specify the -i option, data is converted to OSS IA storage.
- If you specify the -o option, data stored in Archive storage is temporarily unarchived and becomes readable.
Status
You can use this command to view task details. By default, the command reports the number of files in the specified directory for which tiered storage is requested and the amount of data to which tiered storage has already been applied.
jindo jfs -status -detail|-sync <path>
- If you specify the -detail option, the storage progress of file data can be viewed.
- If you specify the -sync option, the command exits only after a tiered-storage task is completed.
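Because tiered-storage commands only start a task, a script that needs the data in place before it continues can follow the command with a -status -sync call. The sketch below does this for a cache task. The function name cache_and_wait is hypothetical, and the sketch assumes that -status -sync on the same path waits for the task that was just started.

```shell
# Sketch: start a cache task, then block until the task has completed.
# cache_and_wait is a hypothetical helper name. Assumption: -status -sync
# on the same path waits for the task started by the preceding command.
cache_and_wait() {
  local path="$1"
  # -cache returns as soon as the task has been started (asynchronous).
  # Add -p here if the local data must not be cleared based on disk usage.
  jindo jfs -cache "$path" || return 1
  # With -sync, the status command exits only after the task is completed.
  jindo jfs -status -sync "$path"
}

# Example call (the path reuses the placeholder bucket from this topic):
#   cache_and_wait oss://xxxx/warehouse
```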
ls2
JindoFS provides the ls2 command, which extends the Hadoop ls command so that the output also shows the storage status of each file.
hadoop fs -ls2 <path>
Example of command output, which includes the file storage class:
drwxrwxrwx - - 0 2020-06-05 04:27 oss://xxxx/warehouse
-rw-rw-rw- 1 Archive 1484 2020-09-23 16:40 oss://xxxx/wikipedia_data.csv
-rw-rw-rw- 1 Standard 1676 2020-06-07 20:04 oss://xxxx/wikipedia_data.json
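The storage class column makes the output easy to filter in a script. The following sketch prints the paths of files whose storage class is Archive. The function name archived_files is hypothetical, and the filter assumes the column layout of the sample output above (permissions, replication, storage class, size, date, time, path) and paths without spaces.

```shell
# Sketch: read ls2-style output on stdin and print the paths of files whose
# storage class is Archive. archived_files is a hypothetical helper name.
# Assumes the column layout of the sample output:
# permissions, replication, storage class, size, date, time, path.
archived_files() {
  # Skip directories (permissions start with "d"); field 3 is the storage
  # class and field 7 is the path.
  awk '$1 !~ /^d/ && $3 == "Archive" { print $7 }'
}

# Example call:
#   hadoop fs -ls2 <path> | archived_files
```

Comparing field 3 against a different class name, such as Standard, gives the matching list for that class.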