MaxCompute: Migrate data from MaxCompute to OSS

Last Updated: Feb 20, 2025

This topic describes how to use the data synchronization feature of DataWorks to migrate data from MaxCompute to Object Storage Service (OSS).

Prerequisites

Procedure

  1. Create a table in the DataWorks console.

    1. Log on to the DataWorks console.

    2. In the left-side navigation pane, click Workspace.

    3. On the Workspaces page, find the desired workspace and choose Shortcuts > Data Development in the Actions column.

    4. Right-click a created workflow and choose Create Table > Table.

    5. On the Create Table page, select the engine type and enter a table name.

    6. On the table editing page, click DDL Statement.

    7. In the DDL dialog box, enter the following CREATE TABLE statement and click Generate Table Schema.

      CREATE TABLE transs
      (name    STRING,
      id    STRING,
      gender    STRING);
    8. Click Submit to Production Environment.

  2. Import data to the table transs.

    1. Click Import on the DataStudio page.

    2. In the Data Import Wizard dialog box that appears, enter at least three letters to search for the table into which you want to import data, and then click Next.

    3. In the dialog box that appears, set Select Data Import Method to Upload Local File and click Browse next to Select File. Select the local file that you want to import and specify other parameters.

      Example:

      qwe,145,F
      asd,256,F
      xzc,345,M
      rgth,234,F
      ert,456,F
      dfg,12,M
      tyj,4,M
      bfg,245,M
      nrtjeryj,15,F
      rwh,2344,M
      trh,387,F
      srjeyj,67,M
      saerh,567,M
    4. Click Next.

    5. Select how destination table fields match the source fields.

    6. Click Import Data.
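
Before uploading a local file, it can help to confirm that every row has the same number of fields as the transs table. The following is a minimal local sketch in Python, assuming a comma-delimited file without a header row; the function name `find_malformed_rows` is hypothetical and is not part of DataWorks.

```python
import csv
import io

# Column names taken from the CREATE TABLE statement for transs.
TRANSS_COLUMNS = ["name", "id", "gender"]


def find_malformed_rows(csv_text, expected_fields=len(TRANSS_COLUMNS)):
    """Return the 1-based row numbers whose field count is wrong."""
    reader = csv.reader(io.StringIO(csv_text))
    bad_rows = []
    for row_number, row in enumerate(reader, start=1):
        # Skip completely empty lines, such as a blank line at the end of the file.
        if not row:
            continue
        if len(row) != expected_fields:
            bad_rows.append(row_number)
    return bad_rows


sample = "qwe,145,F\nasd,256,F\nxzc,345\n"
print(find_malformed_rows(sample))  # prints [3]: the third row is missing a field
```

An empty result means that every non-empty row has three fields. The check does not validate the field values themselves.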

  3. Create a table in the OSS console.

    1. Log on to the OSS console and create a bucket. For more information, see Create buckets.

    2. Create the file qwee.csv and upload it to OSS. For more information, see Upload objects.

      Note

      Make sure that fields in the file qwee.csv are exactly the same as the fields in the transs table.

  4. Add data sources in the DataWorks console.

    1. Log on to the DataWorks console.

    2. In the left-side navigation pane, click Workspace.

    3. On the Workspaces page that appears, find the desired workspace and choose Shortcuts > Data Integration in the Actions column.

    4. In the left-side navigation pane of the Data Integration page, click Data source to go to the Data Sources page.

    5. On the Data Sources page, click Add Data Source. In the Add data source dialog box, click MaxCompute.

    6. In the Add MaxCompute data source dialog box, configure the parameters and click Complete. For more information, see Add a MaxCompute data source.

    7. Add OSS as a data source. For more information, see Add an OSS data source.

  5. Configure MaxCompute as the reader and OSS as the writer.

    1. Go to the data analytics page. Right-click the specified workflow and choose Create Node > Data Integration > Offline synchronization.

    2. In the Create Node dialog box, enter a node name and click Submit.

    3. On the Configure Network Connections and Resource Group page, select the exclusive resource group for Data Integration, and then click the script mode icon in the top menu bar to switch to script mode.

    4. In the top navigation bar, click the Conversion script icon.

    5. In script mode, click the Import Template icon.

    6. In the Import Template dialog box, specify the source type, the source data source, the destination type, and the destination data source, and then click Confirm.

    7. Modify the JSON code and click the Run icon.

      Sample code:

      {
          "order":{
              "hops":[
                  {
                      "from":"Reader",
                      "to":"Writer"
                  }
              ]
          },
          "setting":{
              "errorLimit":{
                  "record":"0"
              },
              "speed":{
                  "concurrent":1,
                  "dmu":1,
                  "throttle":false
              }
          },
          "steps":[
              {
                  "category":"reader",
                  "name":"Reader",
                  "parameter":{
                      "column":[
                          "name",
                          "id",
                          "gender"
                      ],
                      "datasource":"odps_first",
                      "partition":[],
                      "table":"transs"
                  },
                  "stepType":"odps"
              },
              {
                  "category":"writer",
                  "name":"Writer",
                  "parameter":{
                      "datasource":"Trans",
                      "dateFormat":"yyyy-MM-dd HH:mm:ss",
                      "encoding":"UTF-8",
                      "fieldDelimiter":",",
                      "fileFormat":"csv",
                      "nullFormat":"null",
                      "object":"qwee.csv",
                      "writeMode":"truncate"
                  },
                  "stepType":"oss"
              }
          ],
          "type":"job",
          "version":"2.0"
      }
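
The structure of the sample job can be sanity-checked locally before it is pasted into script mode. The following Python sketch assumes only what the sample shows: each hop in `order.hops` names a step by its `name`, and the reader's `column` list should contain only columns of the transs table. The helper `check_job` is hypothetical, and the embedded JSON is a trimmed copy that keeps only the keys the check reads. This is not how DataWorks itself validates a job.

```python
import json

# Trimmed copy of the sample job: only the keys that this check reads are kept.
JOB_JSON = """
{
    "order": {"hops": [{"from": "Reader", "to": "Writer"}]},
    "steps": [
        {"category": "reader", "name": "Reader", "stepType": "odps",
         "parameter": {"column": ["name", "id", "gender"]}},
        {"category": "writer", "name": "Writer", "stepType": "oss",
         "parameter": {"fieldDelimiter": ",", "object": "qwee.csv"}}
    ]
}
"""


def check_job(job, table_columns):
    """Return a list of human-readable problems found in a job definition."""
    problems = []
    step_names = {step["name"] for step in job["steps"]}
    # Every hop endpoint must refer to a step that is actually defined.
    for hop in job["order"]["hops"]:
        for end in ("from", "to"):
            if hop[end] not in step_names:
                problems.append(f"hop {end!r} refers to unknown step {hop[end]!r}")
    # The reader must select only columns that exist in the source table.
    for step in job["steps"]:
        if step["category"] == "reader":
            unknown = [c for c in step["parameter"]["column"] if c not in table_columns]
            if unknown:
                problems.append(f"reader selects unknown columns: {unknown}")
    return problems


job = json.loads(JOB_JSON)
print(check_job(job, ["name", "id", "gender"]))  # an empty list means no problems were found
```
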
  6. View the data of the newly created table in the OSS console. For more information, see Download objects.
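
After the object is downloaded, its rows can be compared with the rows that were imported into transs. The following Python sketch assumes that both files are comma-delimited, have no header row, and are small enough to read into memory; the function name `diff_rows` is hypothetical. Row order is ignored because the order of the exported rows is not specified here.

```python
import csv
import io
from collections import Counter


def diff_rows(source_text, exported_text, delimiter=","):
    """Compare two CSV texts as multisets of rows, ignoring row order.

    Returns (missing, unexpected): rows present only in the source,
    and rows present only in the exported text.
    """
    def load(text):
        rows = csv.reader(io.StringIO(text), delimiter=delimiter)
        # Count each non-empty row so that duplicate rows are compared correctly.
        return Counter(tuple(row) for row in rows if row)

    source_rows = load(source_text)
    exported_rows = load(exported_text)
    # Counter subtraction keeps only rows with a positive remaining count.
    missing = sorted((source_rows - exported_rows).elements())
    unexpected = sorted((exported_rows - source_rows).elements())
    return missing, unexpected


source = "qwe,145,F\nasd,256,F\nxzc,345,M\n"
exported = "asd,256,F\nqwe,145,F\n"
missing, unexpected = diff_rows(source, exported)
print(missing)     # rows that were imported but are absent from the exported file
print(unexpected)  # rows in the exported file that were never imported
```

If the source table contains NULL values, the writer's `nullFormat` setting in the sample job writes them as the string `null`, so the same representation would have to be used in the source file for the comparison to match.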