You can use Jindo CLI commands to upload, download, delete, and otherwise manage objects in a bucket for which OSS-HDFS is enabled.
Environment preparation
You can use one of the following methods to access OSS-HDFS:
To access OSS-HDFS from an Alibaba Cloud EMR cluster, make sure that you have created an EMR cluster of version 3.44.0 or later, or 5.10.0 or later. EMR clusters that meet these version requirements are integrated with JindoSDK by default. For more information, see Create a cluster.
In a non-EMR environment, install JindoSDK first. For more information, see Deploy JindoSDK in an environment other than EMR.
Procedure
Configure environment variables.
If you want to access OSS-HDFS by using an Alibaba Cloud EMR cluster, skip this step and proceed to Step 2.
If you want to access OSS-HDFS from a non-EMR environment, perform the following steps:
Connect to an ECS instance. For more information, see Connect to an ECS instance.
Go to the bin directory of the installed JindoSDK JAR package.
cd jindosdk-x.x.x/bin/
Note: x.x.x indicates the version number of the JindoSDK JAR package.
Grant the file owner read, write, and execute permissions on the jindo-util file in the bin directory.
chmod 700 jindo-util
Rename the jindo-util file to jindo.
mv jindo-util jindo
Create a configuration file named jindosdk.cfg, and then add the following parameters to the configuration file.
[common]
Retain the following default configurations.
logger.dir = /tmp/jindo-util/
logger.sync = false
logger.consolelogger = false
logger.level = 0
logger.verbose = 0
logger.cleaner.enable = true
hadoopConf.enable = false
[jindosdk]
Specify the following parameters.
<!-- In this example, the China (Hangzhou) region is used. Specify your actual region. -->
fs.oss.endpoint = cn-hangzhou.oss-dls.aliyuncs.com
<!-- Configure the AccessKey ID and AccessKey secret that are used to access OSS-HDFS. -->
fs.oss.accessKeyId = LTAI********
fs.oss.accessKeySecret = KZo1********
Configure the JINDOSDK_CONF_DIR environment variable.
export JINDOSDK_CONF_DIR=<JINDOSDK_CONF_DIR>
Set <JINDOSDK_CONF_DIR> to the absolute path of the directory that contains the jindosdk.cfg configuration file.
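The two configuration steps above can be scripted. The following is a minimal sketch, not an official installer. The CONF_DIR location and the <yourAccessKeyId> and <yourAccessKeySecret> placeholders are illustrative assumptions. It also assumes that JINDOSDK_CONF_DIR should point to the directory that holds jindosdk.cfg, as the variable name suggests. The explanatory annotation lines from the sample configuration are omitted, so only section headers and key-value pairs are written.

```shell
#!/usr/bin/env bash
# Sketch: write jindosdk.cfg into a configuration directory and export its location.
# CONF_DIR is a hypothetical location; override it before running if needed.
CONF_DIR="${CONF_DIR:-$HOME/jindosdk-conf}"
mkdir -p "$CONF_DIR"

# The quoted heredoc delimiter writes the content literally, with no shell expansion.
cat > "$CONF_DIR/jindosdk.cfg" <<'EOF'
[common]
logger.dir = /tmp/jindo-util/
logger.sync = false
logger.consolelogger = false
logger.level = 0
logger.verbose = 0
logger.cleaner.enable = true
hadoopConf.enable = false

[jindosdk]
fs.oss.endpoint = cn-hangzhou.oss-dls.aliyuncs.com
fs.oss.accessKeyId = <yourAccessKeyId>
fs.oss.accessKeySecret = <yourAccessKeySecret>
EOF

# The file holds credentials, so restrict access to the current user.
chmod 600 "$CONF_DIR/jindosdk.cfg"

# Assumption: the variable points to the directory that contains jindosdk.cfg.
export JINDOSDK_CONF_DIR="$CONF_DIR"
```

Replace the endpoint and the two placeholders with your actual region and AccessKey pair before you run any Jindo CLI command. To keep the variable across sessions, add the export line to your shell profile.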
Use Jindo CLI commands to access OSS-HDFS.
Upload an object
Run the following command to upload a local file named examplefile.txt from the current local directory to the root directory of a bucket named examplebucket:
./jindo fs -put examplefile.txt oss://examplebucket.cn-shanghai.oss-dls.aliyuncs.com/
Create a directory
Run the following command to create a directory named dir/ in a bucket named examplebucket:
./jindo fs -mkdir oss://examplebucket.cn-shanghai.oss-dls.aliyuncs.com/dir/
Query objects or directories
Run the following command to query the objects or directories in a bucket named examplebucket:
./jindo fs -ls oss://examplebucket.cn-shanghai.oss-dls.aliyuncs.com/
Query the sizes of objects or directories
Run the following command to query the sizes of all objects or directories in a bucket named examplebucket:
./jindo fs -du oss://examplebucket.cn-shanghai.oss-dls.aliyuncs.com/
Query the content of an object
Run the following command to query the content of an object named localfile.txt in a bucket named examplebucket:
./jindo fs -cat oss://examplebucket.cn-shanghai.oss-dls.aliyuncs.com/localfile.txt
Important: The content of the queried object is displayed in plain text. If the content is encoded, use the HDFS API for Java to read and decode the content.
Download an object
Run the following command to download an object named exampleobject.txt from a bucket named examplebucket to the /tmp directory on your computer:
./jindo fs -get oss://examplebucket.cn-shanghai.oss-dls.aliyuncs.com/exampleobject.txt /tmp/
Delete directories or objects
Run the following command to delete a directory named destfolder/ and all objects in the directory from a bucket named examplebucket:
./jindo fs -rm -r oss://examplebucket.cn-shanghai.oss-dls.aliyuncs.com/destfolder/
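The commands above all address the bucket through the same URI pattern, so a small wrapper can reduce typing mistakes. The following is a sketch under stated assumptions. The function names oss_hdfs_uri and jindo_round_trip, the JINDO_BIN variable, and the jindo-cli-demo/ directory name are hypothetical and are not part of the Jindo CLI. Only the fs -mkdir, fs -put, fs -ls, and fs -rm -r forms shown in this topic are used.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build an OSS-HDFS URI from a bucket name, a region ID,
# and an optional path inside the bucket.
oss_hdfs_uri() {
  local bucket="$1" region="$2" path="${3:-}"
  # Strip one leading slash from the path so that the URI has a single "/" after the host.
  printf 'oss://%s.%s.oss-dls.aliyuncs.com/%s\n' "$bucket" "$region" "${path#/}"
}

# Hypothetical round trip: create a directory, upload a local file into it,
# list the directory, and then delete the directory and its content.
# JINDO_BIN defaults to ./jindo; set JINDO_BIN=echo to print the commands
# instead of running them.
jindo_round_trip() {
  local file="$1" bucket="$2" region="$3"
  local jindo="${JINDO_BIN:-./jindo}"
  local dir
  dir="$(oss_hdfs_uri "$bucket" "$region" "jindo-cli-demo/")"

  # Each step runs only if the previous step succeeded.
  "$jindo" fs -mkdir "$dir" &&
    "$jindo" fs -put "$file" "$dir" &&
    "$jindo" fs -ls "$dir" &&
    "$jindo" fs -rm -r "$dir"
}
```

For example, running JINDO_BIN=echo jindo_round_trip examplefile.txt examplebucket cn-shanghai from the bin directory prints the four commands without contacting OSS-HDFS, which is a convenient way to check the generated URIs before running them for real.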