This topic describes how to create and manage the dataflow tasks of Cloud Parallel File Storage (CPFS) for Lingjun file systems and view the causes of task failures in the File Storage NAS (NAS) console.
Background information
The dataflow tasks that you create in the NAS console are batch tasks. A batch dataflow task imports or exports all files from one directory to another at a time; it cannot import or export files one by one. If you need to transfer files one by one, use a streaming dataflow task by calling API operations. For more information, see Best practice of streaming dataflow tasks.
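If you prefer scripting to the console, a batch dataflow task can also be created by calling the NAS API directly. The following is a minimal Python sketch, not a definitive implementation: it assumes the CreateDataFlowTask operation of the NAS API (version 2017-06-26), the credentials, region, file system ID, dataflow ID, and directory are placeholders, and the parameter names should be verified against the current API reference.

```python
# A minimal sketch: create a batch import task through the NAS API.
# All IDs and credentials are placeholders; parameter names are
# assumptions to verify against the current NAS API reference.
from aliyunsdkcore.client import AcsClient
from aliyunsdkcore.request import CommonRequest

client = AcsClient('<access-key-id>', '<access-key-secret>', 'cn-hangzhou')

request = CommonRequest()
request.set_accept_format('json')
request.set_domain('nas.cn-hangzhou.aliyuncs.com')
request.set_method('POST')
request.set_version('2017-06-26')
request.set_action_name('CreateDataFlowTask')
request.add_query_param('FileSystemId', 'bmcpfs-xxxx')  # placeholder
request.add_query_param('DataFlowId', 'df-xxxx')        # placeholder
request.add_query_param('TaskAction', 'Import')         # or 'Export'
request.add_query_param('DataType', 'MetaAndData')      # data blocks + metadata
request.add_query_param('Directory', '/path/to/dir/')   # must start and end with '/'

print(client.do_action_with_exception(request).decode('utf-8'))
```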
Prerequisites
A dataflow is created. For more information, see the Create a dataflow within the same account or Create a dataflow across accounts section of the "Manage dataflows" topic.
If you create a dataflow task to export data, versioning is enabled for the source Object Storage Service (OSS) bucket that is associated with your CPFS for Lingjun file system. Do not disable versioning while you use the dataflow feature. Otherwise, an error is reported when you run a dataflow task to export data. For more information, see Overview.
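Before you create an export task, you can verify the versioning state of the source bucket programmatically. The following is a minimal Python sketch that uses the oss2 SDK; the endpoint and bucket name are placeholders.

```python
# A minimal sketch: check that versioning is enabled on the source OSS
# bucket before creating an export dataflow task, and enable it if not.
# Endpoint and bucket name are placeholders.
import oss2

auth = oss2.Auth('<access-key-id>', '<access-key-secret>')
bucket = oss2.Bucket(auth, 'https://oss-cn-hangzhou.aliyuncs.com', 'example-bucket')

result = bucket.get_bucket_versioning()
if result.status != oss2.BUCKET_VERSIONING_ENABLE:
    config = oss2.models.BucketVersioningConfig()
    config.status = oss2.BUCKET_VERSIONING_ENABLE
    bucket.put_bucket_versioning(config)
    print('Versioning enabled. Do not disable it while the dataflow is in use.')
else:
    print('Versioning is already enabled.')
```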
Create a dataflow task
Log on to the NAS console.
In the left-side navigation pane, choose File System > File System List.
In the top navigation bar, select a region.
On the File System List page, click the name of the CPFS for Lingjun file system that you want to manage.
On the details page of the file system, click Dataflow in the left-side pane.
On the Dataflow page, find the dataflow that you want to manage and click Task Management in the Actions column.
In the Task Management panel, click Create Job.
In the Create Job panel, select the type of task that you want to create and configure its parameters.
Import data
After a symbolic link is imported to CPFS for Lingjun, the symbolic link is converted into a regular data file that contains no symbolic link information.
If an OSS bucket contains data of multiple versions, only data of the latest version is imported.
The name of a file or a subdirectory can be up to 255 bytes in length.
If a file and a subdirectory have the same name, an object conflict occurs in the CPFS for Lingjun file system. In this case, only one of them can be imported.
Data Type: The type of the data to be imported. Set the value to Data + Metadata. This value specifies that both the data blocks and the metadata of an object are imported.
Specify OSS Object Prefix Subdirectory: The directory or list of files whose data you want to import. Select Import Objects from OSS and specify a path relative to the OSS object prefix that is configured for the dataflow. The OSS path that you specify must start and end with a forward slash (/).
Note: If the CPFS directory that you specify for the dataflow does not exist, you can select If the CPFS directory you created does not exist, the system automatically creates a CPFS directory to prevent data import failures.
Conflict Resolution Policy: The policy that is used when the CPFS for Lingjun file system and the OSS bucket contain objects with the same name. Valid values (the sketch after this list illustrates the decision that each policy makes):
Skip Files with the Same Name (Default): ignores the objects with the same name and does not synchronize them.
Keep the Latest File: compares the modification time (mtime) of the objects with the same name and keeps the latest one. Both OSS and CPFS for Lingjun use the modification time for the comparison.
Overwrite Files with the Same Name: replaces the file with the same name in the CPFS for Lingjun file system with the source object in the OSS bucket. Select Use the source file to overwrite the existing file with the same name on the destination. Make sure that you have backed up key data.
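The following standalone Python sketch is illustrative only: the resolve_conflict function and its inputs are hypothetical and not part of any CPFS or OSS API. It shows the decision that each conflict resolution policy makes for a pair of same-named objects.

```python
# Illustrative only: how the three conflict resolution policies decide
# whether a source object replaces a same-named destination file.
# mtimes are POSIX timestamps; all names and values are hypothetical.
def resolve_conflict(policy, source_mtime, dest_mtime):
    if policy == 'skip':         # Skip Files with the Same Name (default)
        return 'keep destination'
    if policy == 'keep_latest':  # Keep the Latest File: newer mtime wins
        return 'overwrite' if source_mtime > dest_mtime else 'keep destination'
    if policy == 'overwrite':    # Overwrite Files with the Same Name
        return 'overwrite'
    raise ValueError(f'unknown policy: {policy}')

# Example: the source object was modified after the destination file, so it wins.
print(resolve_conflict('keep_latest', 1700000100, 1700000000))  # overwrite
```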
Export data
Make sure that versioning is enabled for the source OSS bucket that is associated with your CPFS for Lingjun file system. Do not disable versioning when you use the dataflow feature. Otherwise, an error is reported when you run a dataflow task to export data. For more information, see Overview.
After a symbolic link is synchronized to OSS, the file to which the symbolic link points is not synchronized to OSS. In this case, the symbolic link is converted into a regular object that contains no data.
Hard links can be synchronized to OSS only as regular files that contain no link information.
Files of the socket, device, or pipe type cannot be exported to an OSS bucket.
The path of a directory can be up to 1,023 characters in length.
CPFS for Lingjun exports the file modification timestamp attribute to the custom metadata of the OSS bucket. The metadata field is named x-oss-meta-alihbr-sync-mtime and cannot be deleted or modified. Otherwise, an error occurs when you access the file modification timestamp attribute of the file system.
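To confirm that an exported object carries the timestamp field, you can read the object's metadata with the oss2 SDK. The following is a minimal sketch with placeholder endpoint, bucket name, and object key.

```python
# A minimal sketch: read the mtime metadata field that CPFS for Lingjun
# attaches to exported objects. Endpoint, bucket, and key are placeholders.
import oss2

auth = oss2.Auth('<access-key-id>', '<access-key-secret>')
bucket = oss2.Bucket(auth, 'https://oss-cn-hangzhou.aliyuncs.com', 'example-bucket')

meta = bucket.head_object('exported/dir/file.txt')
# Read-only: deleting or modifying this field breaks access to the file
# modification timestamp attribute on the CPFS side.
print(meta.headers.get('x-oss-meta-alihbr-sync-mtime'))
```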
Export Data Type: The type of the data to be exported. Select Data + Metadata. This value specifies that both the data blocks and the metadata of a file are exported.
Specify CPFS Subdirectory: The directory or list of files whose data you want to export. Select Export Files from CPFS and specify a subdirectory of the CPFS directory that is configured for the dataflow. The directory that you specify must start and end with a forward slash (/).
Conflict Resolution Policy: The policy that is used when the CPFS for Lingjun file system and the OSS bucket contain objects with the same name. Valid values:
Skip Files with the Same Name (Default): ignores the objects with the same name and does not synchronize them.
Keep the Latest File: compares the modification time (mtime) of the objects with the same name and keeps the latest one. Both OSS and CPFS for Lingjun use the modification time for the comparison.
Overwrite Files with the Same Name: replaces the object with the same name in the OSS bucket with the source file in the CPFS for Lingjun file system. Select Use the source file to overwrite the existing file with the same name on the destination. Make sure that you have backed up key data.
Click OK.
View the cause of a task failure
If a dataflow task fails, the system displays the failure cause or generates a task report about the failure. You can view the failure cause or download the task report in the NAS console and troubleshoot the issue.
Log on to the NAS console.
In the left-side navigation pane, choose File System > File System List.
In the top navigation bar, select a region.
On the File System List page, click the name of the file system.
On the details page of the file system, click Dataflow.
On the Dataflow page, find the dataflow that you want to manage and click Task Management in the Actions column.
In the Task Management panel, find the failed task and move the pointer over the icon next to Failed in the Status column to view the failure cause or download the task report.
Note: If no failure cause is displayed, no task report is generated, or you cannot troubleshoot the issue based on the failure cause or task report, submit a ticket for troubleshooting.
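If you manage many tasks, you can also poll task status programmatically instead of checking the console. The following is a minimal Python sketch that assumes the DescribeDataFlowTasks operation of the NAS API (version 2017-06-26); the credentials, region, and file system ID are placeholders, and the response is printed raw because its exact shape should be verified against the current API reference.

```python
# A minimal sketch: list dataflow tasks and their status through the NAS
# API. All IDs and credentials are placeholders; the operation name and
# parameters are assumptions to verify against the current API reference.
from aliyunsdkcore.client import AcsClient
from aliyunsdkcore.request import CommonRequest

client = AcsClient('<access-key-id>', '<access-key-secret>', 'cn-hangzhou')

request = CommonRequest()
request.set_accept_format('json')
request.set_domain('nas.cn-hangzhou.aliyuncs.com')
request.set_method('POST')
request.set_version('2017-06-26')
request.set_action_name('DescribeDataFlowTasks')
request.add_query_param('FileSystemId', 'bmcpfs-xxxx')  # placeholder

# Inspect the raw JSON for the task IDs, statuses, and failure details.
print(client.do_action_with_exception(request).decode('utf-8'))
```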
What to do next
Operation | Description | Procedure
View a task | You can view the configurations and status of a dataflow task in the console. |
Cancel a task | You can cancel a running dataflow task in the console. |
Copy a task | You can copy a dataflow task that has been run to run the task again. |
View the report of a successful task | After a dataflow task is complete, the system generates a task report that contains the details of the successful task. You can download the report from the console and view the details of the task. |