
Cloud Parallel File Storage: Manage data flows

Last Updated: Dec 16, 2024

Before data can flow between a CPFS Intelligent Edition file system and an OSS Bucket, the corresponding data flow must be created. This topic describes how to create and manage CPFS Intelligent Edition data flows in the NAS console.

Prerequisites

  • The source OSS Bucket has been tagged (key: cpfs-dataflow, value: true). Do not delete or modify this tag while the data flow is in use; otherwise, the CPFS Intelligent Edition data flow cannot access data in the Bucket. For more information, see OSS Bucket Tagging.

  • To prevent data conflicts when multiple data flows export data to the same OSS Bucket, versioning must be enabled for the OSS Bucket. For more information, see Versioning Introduction. Both prerequisites can also be applied programmatically, as shown in the sketch after this list.
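If you manage Buckets with the OSS Python SDK (oss2), the following minimal sketch shows one way to apply both prerequisites. The endpoint, Bucket name, and credentials are placeholders, and the versioning status string is an assumption to verify against the oss2 documentation:

  # Sketch: tag the source bucket for CPFS data flows and enable versioning.
  # Assumes the oss2 SDK (pip install oss2); all placeholders must be replaced.
  import oss2
  from oss2.models import BucketVersioningConfig, Tagging, TaggingRule

  auth = oss2.Auth('<access-key-id>', '<access-key-secret>')
  bucket = oss2.Bucket(auth, 'https://oss-cn-hangzhou.aliyuncs.com', '<bucket-name>')

  # Add the tag that CPFS data flows require (key: cpfs-dataflow, value: true).
  rule = TaggingRule()
  rule.add('cpfs-dataflow', 'true')
  bucket.put_bucket_tagging(Tagging(rule))

  # Enable versioning to avoid conflicts when multiple data flows export
  # data to the same bucket.
  config = BucketVersioningConfig()
  config.status = 'Enabled'
  bucket.put_bucket_versioning(config)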

Create data flow within the same account

  1. Log on to the NAS console.

  2. In the left-side navigation pane, choose File System > File System List.

  3. In the top navigation bar, select a region.

  4. On the File System List page, click the name of the file system.

  5. On the file system details page, click Dataflow.

  6. On the Dataflow tab, click Create Dataflow.

  7. In the Create Dataflow dialog box, configure the following parameters.

    • CPFS File System Path: the path in the CPFS Intelligent Edition file system that exchanges data with OSS.

      Restrictions:

      • The length is 1 to 1023 characters.

      • The path must start and end with a forward slash (/).

    • OSS Bucket: the source OSS Bucket to associate with the CPFS Intelligent Edition file system path.

      Select Select A Bucket In The Current Account, and then choose the name of the target OSS Bucket from the drop-down list.

    • OSS Object Prefix: the path in the source OSS Bucket.

      Restrictions:

      • The length is 1 to 1023 characters.

      • The prefix must start and end with a forward slash (/).

      • The prefix must already exist in the OSS Bucket.

    • OSS Bucket SSL: specifies whether to use HTTPS to access OSS.

    • SLR Authorization: when you create a data flow for the first time, you must agree to the authorization that allows CPFS to access Object Storage Service (OSS) resources through a service-linked role. For more information, see Cloud Parallel File Storage Service-linked Role.

  8. Click OK.

    After you click OK, the system verifies that the information you entered is correct, which usually takes 1 to 2 minutes. The dialog box closes automatically after verification; do not close it manually.
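If you automate this step instead of using the console, the NAS OpenAPI provides a CreateDataFlow action. The following is a minimal sketch using the CommonRequest pattern from aliyun-python-sdk-core; the parameter names FileSystemPath and SourceStorage are assumptions inferred from the console fields, so verify them against the API reference for your file system edition:

  # Sketch: create a data flow through the NAS OpenAPI (CreateDataFlow).
  # Parameter names below are assumptions to verify against the API reference.
  import json
  from aliyunsdkcore.client import AcsClient
  from aliyunsdkcore.request import CommonRequest

  client = AcsClient('<access-key-id>', '<access-key-secret>', 'cn-hangzhou')

  request = CommonRequest()
  request.set_accept_format('json')
  request.set_domain('nas.cn-hangzhou.aliyuncs.com')
  request.set_version('2017-06-26')
  request.set_action_name('CreateDataFlow')
  request.add_query_param('FileSystemId', '<cpfs-file-system-id>')
  request.add_query_param('FileSystemPath', '/path/in/cpfs/')      # CPFS File System Path
  request.add_query_param('SourceStorage', 'oss://<bucket-name>')  # source OSS Bucket

  print(json.loads(client.do_action_with_exception(request)))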

Create cross-account data flow

When you need to flow data between a source OSS Bucket under account B and a CPFS Intelligent Edition file system under account A, first log on to account B, where the Bucket resides, complete the authorization for the AliyunNasCrossAccountDataFlowDefaultRole role, and add the UID of account A, where the file system resides, to the role's trust policy. Then log on to account A to create the cross-account data flow and data import and export tasks.

This topic uses the data flow between a CPFS Intelligent Edition file system under Alibaba Cloud account A and an OSS Bucket under account B as an example to describe the procedure.

Procedure

  1. Authorize the account where the source OSS Bucket is located.

    1. Log on to the NAS console using account B.

    2. On the Overview page, in the Common Entry area, click Authorization Management.


    3. In the Authorization Management panel, click Go To Authorization in the Cross-account Data Flow Authorization area.

    4. Click Agree To Authorize.

    5. Return to the Authorization Management panel of the NAS console, click View Details in the Cross-account Data Flow Authorization area to enter the details page of the AliyunNasCrossAccountDataFlowDefaultRole role.

    6. On the Trust Policy tab, click Edit Trust Policy.

    7. Change the Service field to the format <Alibaba Cloud account ID>@nas.aliyuncs.com.

      For example, if the ID of Alibaba Cloud account A is 178321033379****, change Service from nas.aliyuncs.com to 178321033379****@nas.aliyuncs.com. This indicates that the role can be assumed by the data flow service under Alibaba Cloud account A.

      {
        "Statement": [
          {
            "Action": "sts:AssumeRole",
            "Effect": "Allow",
            "Principal": {
              "Service": [
                "178321033379****@nas.aliyuncs.com" 
              ]
            }
          }
        ],
        "Version": "1"
      }
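      The same change can be made programmatically with the RAM UpdateRole action. The following minimal sketch uses the CommonRequest pattern from aliyun-python-sdk-core and must run with account B's credentials; the account ID is a placeholder:

        # Sketch: update the role's trust policy through the RAM OpenAPI (UpdateRole).
        import json
        from aliyunsdkcore.client import AcsClient
        from aliyunsdkcore.request import CommonRequest

        # Trust policy allowing the data flow service under account A to assume the role.
        policy = {
            "Statement": [{
                "Action": "sts:AssumeRole",
                "Effect": "Allow",
                "Principal": {"Service": ["<account-A-id>@nas.aliyuncs.com"]}
            }],
            "Version": "1"
        }

        client = AcsClient('<account-B-access-key-id>', '<account-B-access-key-secret>', 'cn-hangzhou')
        request = CommonRequest()
        request.set_accept_format('json')
        request.set_domain('ram.aliyuncs.com')
        request.set_version('2015-05-01')
        request.set_action_name('UpdateRole')
        request.add_query_param('RoleName', 'AliyunNasCrossAccountDataFlowDefaultRole')
        request.add_query_param('NewAssumeRolePolicyDocument', json.dumps(policy))
        print(client.do_action_with_exception(request))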
  2. Create cross-account data flow.

    1. Log on to the NAS console using account A.

    2. In the left-side navigation pane, choose File System > File System List.

    3. In the top navigation bar, select a region.

    4. On the File System List page, click the name of the file system.

    5. On the file system details page, click Dataflow.

    6. On the Dataflow tab, click Create Dataflow.

    7. In the Create Dataflow dialog box, configure the following parameters.

      • CPFS File System Path: the path in the CPFS Intelligent Edition file system that exchanges data with OSS.

        Restrictions:

        • The length is 1 to 1023 characters.

        • The path must start and end with a forward slash (/).

      • OSS Bucket: the source OSS Bucket to associate with the CPFS Intelligent Edition file system path.

        Select Specify A Bucket In Another Account, enter the UID of the account where the source OSS Bucket is located in the Account ID box, and enter the name of the source OSS Bucket in the Bucket Name box.

      • OSS Object Prefix: the path in the source OSS Bucket.

        Restrictions:

        • The length is 1 to 1023 characters.

        • The prefix must start and end with a forward slash (/).

        • The prefix must already exist in the OSS Bucket.

      • OSS Bucket SSL: specifies whether to use HTTPS to access OSS.

    8. Click OK.

      After you click OK, the system verifies that the information you entered is correct, which usually takes 1 to 2 minutes. The dialog box closes automatically after verification; do not close it manually.
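After the dialog box closes, you can confirm that the data flow was created, either on the Dataflow tab or with the DescribeDataFlows action. The following minimal sketch uses the same CommonRequest pattern as above; the response field names are assumptions to verify against the API reference:

  # Sketch: list the data flows of a file system (DescribeDataFlows).
  # The response structure below is an assumption to verify.
  import json
  from aliyunsdkcore.client import AcsClient
  from aliyunsdkcore.request import CommonRequest

  client = AcsClient('<access-key-id>', '<access-key-secret>', 'cn-hangzhou')
  request = CommonRequest()
  request.set_accept_format('json')
  request.set_domain('nas.cn-hangzhou.aliyuncs.com')
  request.set_version('2017-06-26')
  request.set_action_name('DescribeDataFlows')
  request.add_query_param('FileSystemId', '<cpfs-file-system-id>')

  response = json.loads(client.do_action_with_exception(request))
  for flow in response.get('DataFlowInfo', {}).get('DataFlow', []):
      print(flow.get('DataFlowId'), flow.get('Status'))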

Related operations

You can view the created data flows, modify data flow configurations, delete data flows, or stop data flows through the console.

• View data flow

  You can view the created data flows and create data flow tasks on a specified data flow.

  Step: On the Dataflow tab, query the configuration information of the specified data flow.

• Modify data flow

  Only the description of a data flow can be modified.

  Steps:

  1. On the Dataflow tab, find the target data flow.

  2. Click Modify and edit the description of the data flow.

  3. Click OK.

• Delete data flow

  After you delete a data flow, all tasks of that data flow are purged and data can no longer be synchronized.

  Important: A data flow cannot be deleted while it has stream tasks or batch tasks in progress.

  Steps:

  1. On the Dataflow tab, find the target data flow.

  2. Click Delete.

  3. In the confirmation dialog box, click Confirm.
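The same operations are available through the API. For example, a data flow can be removed with the DeleteDataFlow action, as in this minimal sketch (both IDs are placeholders, and the call is expected to fail while stream or batch tasks are still running):

  # Sketch: delete a data flow through the NAS OpenAPI (DeleteDataFlow).
  from aliyunsdkcore.client import AcsClient
  from aliyunsdkcore.request import CommonRequest

  client = AcsClient('<access-key-id>', '<access-key-secret>', 'cn-hangzhou')
  request = CommonRequest()
  request.set_accept_format('json')
  request.set_domain('nas.cn-hangzhou.aliyuncs.com')
  request.set_version('2017-06-26')
  request.set_action_name('DeleteDataFlow')
  request.add_query_param('FileSystemId', '<cpfs-file-system-id>')
  request.add_query_param('DataFlowId', '<data-flow-id>')
  print(client.do_action_with_exception(request))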

What to do next

After successfully creating a data flow, you must also create export or import tasks as needed to flow data between the CPFS Intelligent Edition file system and the OSS Bucket. For specific operations, see Create Task.
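As a preview of that step, an import task can be submitted with the CreateDataFlowTask action. The following minimal sketch again uses the CommonRequest pattern; the TaskAction and DataType values shown are assumptions to verify against the API reference:

  # Sketch: start an import task on an existing data flow (CreateDataFlowTask).
  # TaskAction and DataType values are assumptions to verify.
  from aliyunsdkcore.client import AcsClient
  from aliyunsdkcore.request import CommonRequest

  client = AcsClient('<access-key-id>', '<access-key-secret>', 'cn-hangzhou')
  request = CommonRequest()
  request.set_accept_format('json')
  request.set_domain('nas.cn-hangzhou.aliyuncs.com')
  request.set_version('2017-06-26')
  request.set_action_name('CreateDataFlowTask')
  request.add_query_param('FileSystemId', '<cpfs-file-system-id>')
  request.add_query_param('DataFlowId', '<data-flow-id>')
  request.add_query_param('TaskAction', 'Import')     # or 'Export'
  request.add_query_param('DataType', 'MetaAndData')  # metadata plus file data
  print(client.do_action_with_exception(request))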