
DataWorks: Upload data

Last Updated: Jan 27, 2026

Data Upload in DataWorks allows you to upload local files, Data Analysis spreadsheets, Object Storage Service (OSS) files, and HTTP files to engines such as MaxCompute, EMR Hive, Hologres, and StarRocks. This topic describes how to use the Data Upload feature.

Precautions

  • If your operations involve cross-border data transfers (such as transferring data out of the Chinese mainland), read the Cross-border compliance statement in this topic beforehand. Failure to comply may result in upload failures and legal liability.

  • Ensure table headers are in English before uploading. Chinese headers may cause parsing failures and upload errors.

Limitations

Billing

Data upload incurs the following fees:

  • Data transmission fees.

  • Computing and storage fees (if a new table is created).

These fees are charged by the engine. For details, see the billing documentation of the corresponding engine: MaxCompute Billing, Hologres Billing, E-MapReduce Billing, and EMR Serverless StarRocks Billing.

Go to the Data Upload page

  1. Go to the Upload and Download page.

    Log on to the DataWorks console. In the top navigation bar, select the desired region. In the left-side navigation pane, choose Data Integration > Data Upload and Download. On the page that appears, click Go to Data Upload and Download.

  2. Click the upload icon in the left-side navigation pane to go to the Upload Data page.

  3. Click Upload Data and upload the target data by following the on-screen instructions.

Select data to upload

You can upload local files, spreadsheets, Object Storage Service (OSS) files, and HTTP files. Select a data source based on your business requirements.

Note

When uploading files, specify whether to ignore dirty data.

  • Yes: The platform automatically ignores dirty data and continues the upload.

  • No: The platform does not ignore dirty data. The upload process terminates if dirty data is encountered.
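
For a rough local pre-check before you choose, the following sketch flags one common kind of dirty data: rows whose field count differs from the header's. This is a minimal sketch in Python; the file name sales.csv is a placeholder, and the platform's own definition of dirty data (for example, values that cannot be converted to the destination field types) may be broader than this check.

```python
import csv

def find_dirty_rows(path, encoding="utf-8"):
    """Yield (line_number, row) for rows whose field count differs
    from the header's -- one simple, local notion of dirty data."""
    with open(path, newline="", encoding=encoding) as f:
        reader = csv.reader(f)
        header = next(reader)
        expected = len(header)
        for line_no, row in enumerate(reader, start=2):
            if len(row) != expected:
                yield line_no, row

# Placeholder file name; substitute the file you plan to upload.
for line_no, row in find_dirty_rows("sales.csv"):
    print(f"line {line_no}: unexpected field count: {row}")
```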

Local file

Select this method to upload local files.

  1. Data Source: Select Local File.

  2. Specify Data to Be Uploaded: Drag the local file to the Select File area.

    Note
    • Supported formats: CSV, XLS, XLSX, and JSON. The maximum size for a CSV file is 5 GB. The maximum size for other files is 100 MB.

    • Only the first sheet is uploaded by default. To upload multiple sheets, save each sheet as a separate file (see the sketch after this list).

    • SQL files are not supported.
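
Because only the first sheet of a workbook is uploaded by default, you may want to split a multi-sheet file into separate files locally before uploading. A minimal sketch, assuming Python with pandas (plus openpyxl for .xlsx files) and a placeholder file name:

```python
import os
import pandas as pd  # reading .xlsx files also requires openpyxl

MAX_CSV_BYTES = 5 * 1024**3  # the 5 GB CSV limit noted above

def split_workbook(xlsx_path, out_dir="sheets"):
    """Save every sheet of a workbook as its own CSV file."""
    os.makedirs(out_dir, exist_ok=True)
    # sheet_name=None loads all sheets as a dict: {sheet name: DataFrame}
    for name, df in pd.read_excel(xlsx_path, sheet_name=None).items():
        out_path = os.path.join(out_dir, f"{name}.csv")
        df.to_csv(out_path, index=False)
        if os.path.getsize(out_path) > MAX_CSV_BYTES:
            print(f"warning: {out_path} exceeds the 5 GB CSV limit")

split_workbook("report.xlsx")  # placeholder file name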

Spreadsheet

Select this method if the data to be uploaded is a DataWorks Data Analysis Spreadsheet.

  1. Data Source: Select Workbook.

  2. Specify Data to Be Uploaded:

    1. Select the spreadsheet to upload from the drop-down list next to Select File.

    2. If the spreadsheet does not exist, click Create. Alternatively, go to the Data Analysis module to Create a spreadsheet and Import Data.

OSS

Select this method if the data to be uploaded is stored in OSS.

Prerequisites: The data to be uploaded is already stored in an OSS bucket, and the bucket resides in the same region as the current DataWorks workspace.

Procedure:

  1. Data Source: Select Object Storage OSS.

  2. Specify Data to Be Uploaded:

    1. In the Select Bucket drop-down list, select the bucket where the data is stored.

      Note

      Data can only be uploaded from buckets located in the same region as the current DataWorks workspace.

    2. In the Select File area, select the file to upload.

      Note

      Supported formats: CSV, XLS, XLSX, and JSON.
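
If the file is not in OSS yet, you can stage it there with the OSS Python SDK (oss2) before selecting it on the upload page. A minimal sketch; the credentials, endpoint, bucket, and object names are placeholders, and the bucket must be in the same region as the workspace, as noted above:

```python
import oss2  # Alibaba Cloud OSS Python SDK

# Placeholder credentials and endpoint -- substitute your own. The
# endpoint's region must match the current DataWorks workspace's region.
auth = oss2.Auth("<ACCESS_KEY_ID>", "<ACCESS_KEY_SECRET>")
bucket = oss2.Bucket(auth, "https://oss-cn-hangzhou.aliyuncs.com", "my-bucket")

# Stage the local file in OSS so it can be selected on the upload page.
bucket.put_object_from_file("uploads/sales.csv", "sales.csv")
```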

HTTP file

Select this method if the data to be uploaded is an HTTP file.

  1. Data Source: Select HTTP File.

  2. Specify Data to Be Uploaded:

    • File URL: Enter the URL of the file. Both HTTP and HTTPS addresses are supported.

    • File Type: The system automatically identifies the file type. Supported formats: CSV, XLS, and XLSX. The maximum size for a CSV file is 5 GB; the maximum size for other files is 50 MB.

    • Request Method: GET, POST, and PUT are supported. GET is recommended. Select the method that your file server supports.

    • Advanced Parameters: You can configure Request Header and Request Body information based on your business requirements.
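
Before pointing the upload at a URL, you can check that the server answers the chosen request method and that the file fits the size limits. A minimal sketch using the Python requests library; the URL and header values are placeholders:

```python
import requests

url = "https://example.com/data/sales.csv"     # placeholder File URL
headers = {"Authorization": "Bearer <TOKEN>"}  # optional Request Header

resp = requests.get(url, headers=headers, timeout=30)
resp.raise_for_status()  # raises if the server rejects the GET request

# CSV files may be up to 5 GB; XLS and XLSX files up to 50 MB.
size = int(resp.headers.get("Content-Length", 0))
print(f"status={resp.status_code}, size={size / 1024**2:.1f} MiB")
```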

Set the destination table

In the Configure Destination Table area, select the Target Engine for data upload and configure the related parameters.

Important

Ensure you select the correct data source environment (PROD or DEV). Selecting the incorrect environment will result in data being uploaded to the wrong destination.

MaxCompute

To upload data to an internal table in MaxCompute, configure the following parameters.

  • MaxCompute Project Name: Select a MaxCompute data source in the current region. If the required data source is not found, Associate a MaxCompute compute resource in the current workspace to generate a data source with the same name.

  • Destination Table: Select Existing Table or Create Table.

  • If you select Existing Table:

    • Select destination table: The table in which the uploaded data is stored. Keyword search is supported.

      Note

      You can only upload data to tables that you own (where you are the table owner). For details, see Limitations.

    • Upload Mode: Select how to write data to the destination table.

      • Clear Table Data First: Clears existing data in the destination table and imports the new data into the mapped fields.

      • Append: Appends the uploaded data to the mapped fields in the destination table.

  • If you select Create Table:

    • Table Name: Enter a name for the new table.

      Note

      DataWorks uses the MaxCompute account configured in the compute resource to create the table.

    • Table Type: Select Non-partitioned Table or Partitioned Table as needed. If you select Partitioned Table, specify the partition fields and their values.

    • Lifecycle: Specify the lifecycle of the table. The table is recycled after the specified period. For more information about the table lifecycle, see Lifecycle and Lifecycle management operations.
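
For orientation, Table Type and Lifecycle correspond to ordinary MaxCompute DDL, which the upload feature generates for you when you choose Create Table. A minimal sketch using PyODPS (the MaxCompute Python SDK) with placeholder credentials and a hypothetical table:

```python
from odps import ODPS  # PyODPS, the MaxCompute Python SDK

# Placeholder credentials, project, and endpoint -- substitute your own.
o = ODPS("<ACCESS_KEY_ID>", "<ACCESS_KEY_SECRET>",
         project="my_project",
         endpoint="https://service.cn-hangzhou.maxcompute.aliyun.com/api")

# A partitioned table whose data is recycled after 30 days, mirroring
# the Partitioned Table and Lifecycle settings described above.
o.execute_sql("""
    CREATE TABLE IF NOT EXISTS uploaded_sales (
        order_id STRING,
        amount   DOUBLE
    )
    PARTITIONED BY (ds STRING)
    LIFECYCLE 30
""")
```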

EMR Hive

To upload data to an internal table in EMR Hive, configure the following parameters.

  • Data Source: Select the EMR Hive Data Source (Alibaba Cloud Instance Mode) bound to the workspace in the current region.

  • Destination Table: Data can only be uploaded to an Existing Table.

  • Select destination table: The table in which the uploaded data is stored. Keyword search is supported.

    Note
    • If the destination table does not exist, follow the on-screen instructions to go to Manage settings for tables in Data Studio to create a table.

    • You can only upload data to tables that you own (where you are the table owner). For details, see Limitations.

  • Upload Mode: Select how to add the uploaded data to the destination table.

    • Clear Table Data First: Clears data in the destination table and imports all data into the mapped fields.

    • Append: Appends the uploaded data to the mapped fields in the destination table.
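
The two upload modes have familiar Hive SQL counterparts: Clear Table Data First behaves like INSERT OVERWRITE, and Append behaves like INSERT INTO. A minimal illustration using PyHive; the connection details and the staging_sales and uploaded_sales tables are hypothetical:

```python
from pyhive import hive  # PyHive client for HiveServer2

# Placeholder connection details -- substitute your EMR cluster's.
cur = hive.Connection(host="<emr-master-host>", port=10000,
                      username="hive").cursor()

# "Clear Table Data First" is equivalent to replacing the table's data ...
cur.execute("INSERT OVERWRITE TABLE uploaded_sales SELECT * FROM staging_sales")

# ... while "Append" adds rows without touching existing data.
cur.execute("INSERT INTO TABLE uploaded_sales SELECT * FROM staging_sales")
```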

Hologres

To upload data to an internal table in Hologres, configure the following parameters.

  • Data Source: Select the Hologres data source bound to the workspace in the current region. If the required data source is not found, Associate a Hologres computing resource in the current workspace to generate a data source with the same name.

  • Destination Table: Data can only be uploaded to an Existing Table.

  • Select destination table: The table in which the uploaded data is stored. Keyword search is supported.

    Note
    • If the destination table does not exist, follow the on-screen instructions to go to the Hologres console to create a table.

    • You can only upload data to tables that you own (where you are the table owner). For details, see Limitations.

  • Upload Mode: Select how to add the uploaded data to the destination table.

    • Clear Table Data First: Clears data in the destination table and imports all data into the mapped fields.

    • Append: Appends the uploaded data to the mapped fields in the destination table.

  • Primary Key Conflict Strategy: Select how to handle uploaded data that causes a primary key conflict in the destination table.

    • Ignore: Ignores the conflicting uploaded data. Existing data in the destination table remains unchanged.

    • Replace: Overwrites the existing data with the uploaded data. Unmapped fields are set to NULL.

    • Update: Updates existing rows with the uploaded data. Only mapped fields are updated.
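
Because Hologres speaks the PostgreSQL protocol, the conflict strategies correspond to INSERT ... ON CONFLICT semantics. A minimal sketch using psycopg2; the connection details are placeholders, and uploaded_sales is a hypothetical table with primary key order_id. DO NOTHING mirrors Ignore, and DO UPDATE mirrors Update:

```python
import psycopg2  # Hologres is compatible with the PostgreSQL protocol

# Placeholder connection details -- substitute your Hologres instance's.
conn = psycopg2.connect(host="<instance>.hologres.aliyuncs.com", port=80,
                        dbname="my_db", user="<ACCESS_KEY_ID>",
                        password="<ACCESS_KEY_SECRET>")
cur = conn.cursor()

# "Ignore": keep the existing row when the primary key conflicts.
cur.execute("""
    INSERT INTO uploaded_sales (order_id, amount) VALUES (%s, %s)
    ON CONFLICT (order_id) DO NOTHING
""", ("A001", 9.9))

# "Update": rewrite the mapped fields of the existing row.
cur.execute("""
    INSERT INTO uploaded_sales (order_id, amount) VALUES (%s, %s)
    ON CONFLICT (order_id) DO UPDATE SET amount = EXCLUDED.amount
""", ("A001", 19.9))
conn.commit()
```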

StarRocks

To upload data to a table in the default catalog of StarRocks, configure the following parameters.

  • Data Source: Select the StarRocks data source bound to the workspace in the current region.

  • Destination Table: Data can only be uploaded to an Existing Table.

  • Select destination table: The table in which the uploaded data is stored. Keyword search is supported.

    Note
    • If the destination table does not exist, follow the on-screen instructions to go to the EMR Serverless StarRocks instance page to create a table.

    • You can only upload data to tables that you own (where you are the table owner). For details, see Limitations.

  • Upload Mode: Select how to add the uploaded data to the destination table.

    • Clear Table Data First: Clears data in the destination table and imports all data into the mapped fields.

    • Append: Appends the uploaded data to the mapped fields in the destination table.

  • Advanced Parameters: You can configure Stream Load request parameters.
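
Stream Load is StarRocks' HTTP-based loading interface, so these advanced parameters travel as HTTP request headers. A minimal sketch of a raw Stream Load request in Python, shown only to clarify what the parameters mean; the host, port, database, table, credentials, and file name are placeholders:

```python
import requests

url = "http://<fe-host>:8030/api/my_db/uploaded_sales/_stream_load"
headers = {
    "label": "upload-2026-01-27-001",  # idempotency label for this load job
    "format": "csv",
    "column_separator": ",",
    "max_filter_ratio": "0.1",         # tolerate up to 10% filtered rows
    "Expect": "100-continue",
}
with open("sales.csv", "rb") as f:
    resp = requests.put(url, headers=headers, data=f.read(),
                        auth=("<user>", "<password>"))
print(resp.json())  # the response body reports the load job's status
```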

Preview file data to upload

After setting the destination table, you can adjust the file encoding and data mapping based on the data preview.

Note

The preview displays only the first 20 records.

  • File Encoding Format: If characters appear garbled, change the encoding format. Supported formats: UTF-8, GB18030, Big5, UTF-16LE, and UTF-16BE.

  • Preview data and set destination table fields:

    • Upload data to an existing table: Map file columns to destination table fields. Field mapping must be completed before uploading. Mapping methods include Mapping by Column Name and Mapping by Order. You can also customize the field names in the destination table after mapping.

      Note
      • Unmapped source data is grayed out and will not be uploaded.

      • Duplicate mapping relationships are not allowed.

      • Field names and field types cannot be empty.

    • Upload data to a new table: You can use Intelligent Field Generation to automatically fill in field information or manually modify the information.

      Note
      • Field names and field types cannot be empty.

      • EMR Hive, Hologres, and StarRocks engines do not support creating new tables during data upload.

  • Ignore First Row: Specify whether to upload the first row of the file data (usually column names) to the destination table (see the sketch after this list).

    • Selected: Select this option if the first row contains column headers. The first row is excluded from the upload.

    • Cleared: Clear this option if the first row contains actual data. The first row is included in the upload.
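
To reproduce these preview adjustments locally, a minimal pandas sketch with a placeholder file name; it previews the first 20 records, switches encodings, and toggles how the first row is treated:

```python
import pandas as pd

path = "sales.csv"  # placeholder file name

# Preview the first 20 records, as the console's preview does.
print(pd.read_csv(path, nrows=20))

# If characters appear garbled, try another supported encoding.
print(pd.read_csv(path, nrows=20, encoding="gb18030"))

# Ignore First Row selected: the first row is treated as column names.
with_header = pd.read_csv(path, header=0)

# Ignore First Row cleared: the first row is treated as data.
as_data = pd.read_csv(path, header=None)
```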

Upload data

After previewing the data, click Upload Data in the lower-left corner to upload the data.

Subsequent operations

After the upload is complete, click the upload icon in the left-side navigation pane to return to the Data Upload page. Locate the upload task to perform the following operations:

  • Continue upload: Click Continue Upload in the Actions column to upload data again.

  • Data query: Click Query Data in the Actions column to query and analyze the data.

  • View uploaded data details: Click the Table Name to go to Data Map and view details about the destination table. For details, see Metadata retrieval.

Cross-border compliance statement

Important

If your operations involve cross-border data transfers (for example, transferring data from the Chinese mainland to an overseas region, or between different countries/regions), read the relevant compliance statement beforehand. Failure to comply may result in upload failures and legal liability.

Cross-border data operations involve transferring your cloud business data to the region you selected or the product deployment region. Ensure that your operations comply with the following requirements:

  • You have the authority to process the relevant cloud business data.

  • You have adopted sufficient data security protection technologies and strategies.

  • The data transmission complies with relevant laws and regulations. For example, the transferred data does not contain any content restricted or prohibited from transmission or disclosure by applicable laws.

Alibaba Cloud reminds you to consult with professional legal or compliance advisors if your data upload involves cross-border transmission. Ensure that the cross-border data transmission complies with the requirements of applicable laws, regulations, and regulatory policies (such as obtaining valid authorization from personal information subjects, signing and filing relevant contract clauses, and completing security assessments).

If you fail to comply with this compliance statement when performing cross-border data operations, you shall bear the corresponding legal consequences. You shall also be liable for any losses suffered by Alibaba Cloud and its affiliates.


FAQ

  1. Resource group configuration issues.

    Error message: A resource group is required for the current file source or target engine. Contact the workspace administrator to configure the resource group.

    Solution: Configure the resource group used by the engine via Data Analysis. For details, see System administration.

  2. Resource group binding issues.

    Error message: The resource group configured for global data upload is not bound to the workspace containing the destination table. Contact the workspace administrator to bind the resource group.

    Solution: Bind the resource group configured in System Management to the workspace.