In MaxCompute, an external volume serves as a distributed file system for storing unstructured data. By creating an external volume, you can use the MaxCompute engine to query and process files stored in Object Storage Service (OSS) without importing the data into MaxCompute tables, which reduces data redundancy and transmission overhead. This topic describes common operations that you can perform on external volumes.
The following table describes common operations that you can perform on external volumes.
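Operation | Description | Authorized user | Operation platform |
Create an external volume | Creates an external volume in a project. | | |
View the list of external volumes and the directory structure of an external volume | Views the directory structure of an external volume. | | |
Delete an external volume | Deletes an external volume. | | |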
Prerequisites
The MaxCompute client V0.43.0 or later is installed. For more information, see MaxCompute client (odpscmd). You can also run the commands in this topic on the DataStudio page or the SQL Query page of the DataWorks console. In that case, the MaxCompute client that is integrated with DataStudio or SQL Query must be V0.43.2 or later. To query the integrated client version, run the Show version; command on the DataStudio page or the SQL Query page. For more information, see Use MaxCompute in DataWorks.
If you use the SDK for Java, the SDK version must be V0.43.0 or later.
An application for trial use of the external volume feature is submitted, and the application is approved.
Before you use the external volume feature, you must submit an application for enabling this feature at the project level. For more information, see Apply for trial use of new features.
The Alibaba Cloud account or RAM user is granted access permissions on OSS. For more information about authorization, see STS authorization for OSS.
Create an external volume
Syntax
vfs -create <volume_name>
-storage_provider <oss>
-url <oss://oss_endpoint/bucket/path>
-acd <true|false>
-role_arn <arn:aliyun:xxx/aliyunodpsdefaultrole>
The following table describes the parameters.
Parameter | Required | Description |
volume_name | Yes | The name of the external volume that you want to create. |
storage_provider | Yes | The storage provider. Only OSS is supported, so you must set this parameter to oss. |
url | Yes | The OSS directory in which data files are stored. The OSS directory is in the oss://oss_endpoint/bucket/path format. Important: You must specify both the bucket name and the level-2 directory. |
acd | No | Specifies whether to automatically create the directory if it does not exist. Valid values: true and false.
Note If the |
role_arn | Yes | The Alibaba Cloud Resource Name (ARN) of the RAM role that has the permissions to access OSS. For more information about how to obtain the ARN, see Use temporary credentials provided by STS to access OSS. |
The path of the created external volume is in the odps://[project_name]/[volume_name] format. project_name specifies the name of the MaxCompute project. volume_name specifies the name of the external volume. This path can be used by the Spark engine and MapReduce tasks.
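The path format described above can be sketched in a few lines of Python. This is only an illustration: the helper names build_volume_path and parse_volume_path and the project name my_project are hypothetical and are not part of MaxCompute or its SDKs.

```python
from urllib.parse import urlparse


def build_volume_path(project_name: str, volume_name: str) -> str:
    """Return a path in the odps://[project_name]/[volume_name] format."""
    if not project_name or not volume_name:
        raise ValueError("project_name and volume_name must be non-empty")
    return f"odps://{project_name}/{volume_name}"


def parse_volume_path(path: str) -> tuple:
    """Split an odps:// volume path into (project_name, volume_name)."""
    parsed = urlparse(path)
    segments = [s for s in parsed.path.split("/") if s]
    if parsed.scheme != "odps" or not parsed.netloc or not segments:
        raise ValueError(f"not an odps://[project_name]/[volume_name] path: {path}")
    # The first path segment after the project name is the volume name.
    return parsed.netloc, segments[0]


if __name__ == "__main__":
    # my_project is a hypothetical project name used for illustration only.
    print(build_volume_path("my_project", "test_ext_l"))
```

A Spark or MapReduce job configuration could then reference the string returned by build_volume_path instead of hard-coding it.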
Examples
Create an external volume named test_ext_l.
vfs -create test_ext_l -storage_provider oss -url oss://oss-cn-hangzhou-internal.aliyuncs.com/test/ex_volume/ -role_arn acs:ram::xxxxxxx:role/aliyunodpsdefaultrole;
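If you generate the command from a script, a small helper can assemble it from the parameters in the syntax above and catch an incomplete url early. The following is a minimal sketch, not an official tool: the function name build_create_command is hypothetical, and the check that the url contains an endpoint, a bucket, and at least one directory is one assumed reading of the url requirement. The helper only builds the command text and does not run it.

```python
from typing import Optional


def build_create_command(volume_name: str, url: str, role_arn: str,
                         acd: Optional[bool] = None,
                         storage_provider: str = "oss") -> str:
    """Assemble a vfs -create command string from the documented parameters."""
    if storage_provider != "oss":
        raise ValueError("only oss is supported as the storage provider")
    if not url.startswith("oss://"):
        raise ValueError("url must be in the oss://oss_endpoint/bucket/path format")
    # Assumption: endpoint, bucket, and at least one directory must be present.
    segments = [s for s in url[len("oss://"):].split("/") if s]
    if len(segments) < 3:
        raise ValueError("url must include the endpoint, the bucket, and a directory")
    parts = ["vfs", "-create", volume_name,
             "-storage_provider", storage_provider,
             "-url", url]
    if acd is not None:
        # acd is optional; emit it only when the caller sets it explicitly.
        parts += ["-acd", "true" if acd else "false"]
    parts += ["-role_arn", role_arn]
    return " ".join(parts) + ";"


if __name__ == "__main__":
    print(build_create_command(
        "test_ext_l",
        "oss://oss-cn-hangzhou-internal.aliyuncs.com/test/ex_volume/",
        "acs:ram::xxxxxxx:role/aliyunodpsdefaultrole",
    ))
```

With the values from the example above, the helper reproduces the same command. Passing acd=True adds -acd true before -role_arn.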
View the list of external volumes and the directory structure of an external volume
Syntax
-- View the list of external volumes.
vfs -ls /;
-- View the directory structure of an external volume.
vfs -ls [-R] /<volume_name>;
The following table describes the parameters.
Parameter | Required | Description |
volume_name | Yes | The name of the external volume that you want to view. |
Examples
View the list of external volumes.
vfs -ls /;
Sample response:
> vfs -ls /;
Found 2 items
drwxrwxrwx - 0 2023-03-11 12:06 /test_ext_l -> oss://oss-cn-shanghai-internal.aliyuncs.com/test/ex_volume
drwxrwxrwx - 0 2023-03-21 07:33 /myfirst_volume4 -> oss://oss-cn-shanghai-internal.aliyuncs.com/paristech/data
If a user does not have permissions on an external volume, no information about the volume is displayed in the returned result. For example, a user named dev01 does not have permissions on the myfirst_volume4 external volume. If dev01 wants to query data from the myfirst_volume4 external volume, you must run the following command to grant dev01 the Read permission on the volume:
grant Read on volume myfirst_volume4 to RAM$xxxxxx:dev01;
Note: The following permissions on external volumes can be granted: Read, Write, and CreateVolume.
View the directory structure of an external volume named test_ext_l.
vfs -ls -R /test_ext_l;
Sample response:
drwxrwxrwx - 0 2023-03-27 07:31 /test_ext_l/test -> oss://oss-cn-hangzhou-internal.aliyuncs.com/test/ex_volume/test
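If you capture the output of vfs -ls in a script, the mapping from volume paths to OSS directories can be extracted from the lines that contain an arrow. The sketch below assumes that the output keeps the layout of the sample responses in this topic (a path, then ->, then the OSS URL). That layout is not a documented contract, and the function name parse_ls_output is hypothetical.

```python
def parse_ls_output(output: str) -> dict:
    """Map each listed volume path to its OSS target, based on the '->' marker."""
    mapping = {}
    for line in output.splitlines():
        line = line.strip()
        # Skip the echoed command, the "Found N items" header, and blank lines.
        if " -> " not in line:
            continue
        left, target = line.split(" -> ", 1)
        # The volume path is the last whitespace-separated field before the arrow.
        path = left.split()[-1]
        mapping[path] = target.strip()
    return mapping


if __name__ == "__main__":
    sample = (
        "Found 2 items\n"
        "drwxrwxrwx - 0 2023-03-11 12:06 /test_ext_l -> "
        "oss://oss-cn-shanghai-internal.aliyuncs.com/test/ex_volume\n"
        "drwxrwxrwx - 0 2023-03-21 07:33 /myfirst_volume4 -> "
        "oss://oss-cn-shanghai-internal.aliyuncs.com/paristech/data\n"
    )
    for path, target in parse_ls_output(sample).items():
        print(path, target)
```

Because volumes that the current user has no permissions on are not displayed, a volume missing from the parsed result may indicate a missing grant rather than a missing volume.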
Delete an external volume
Syntax
Syntax 1:
vfs -rm -r /<volume_name>
Syntax 2:
vfs -rmv /<volume_name>
The following table describes the parameters.
Parameter | Required | Description |
volume_name | Yes | The name of the external volume that you want to delete. |
Examples
Delete an external volume named test_ext_l.
vfs -rm -r /test_ext_l;
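The two delete syntaxes above can be generated from one helper when volume cleanup is scripted. This is a minimal sketch: the function name build_delete_command and its use_rmv flag are hypothetical, and the helper only produces the command text without running it.

```python
def build_delete_command(volume_name: str, use_rmv: bool = False) -> str:
    """Return the delete command in Syntax 1 (vfs -rm -r) or Syntax 2 (vfs -rmv)."""
    name = volume_name.strip("/")
    if not name or "/" in name:
        raise ValueError("volume_name must be a single, non-empty volume name")
    verb = "vfs -rmv" if use_rmv else "vfs -rm -r"
    return f"{verb} /{name};"


if __name__ == "__main__":
    print(build_delete_command("test_ext_l"))
    print(build_delete_command("test_ext_l", use_rmv=True))
```

Rejecting names that contain a slash keeps the helper from producing a command that targets a subdirectory instead of the volume itself.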
References
For more information about how to manage external volumes by using SDKs, see Manage external volumes by using SDKs.
In MaxCompute, you can create an external volume and mount it to an OSS path. You can then use the MaxCompute permission management system to implement fine-grained access control on the external volume and use the MaxCompute engine to process the files stored in it. For examples of how to use external volumes, see Use MaxCompute external volumes to process unstructured data.