DataWorks: Configure DataHub Writer

Last Updated: Nov 18, 2024

DataHub is a platform designed to process streaming data. You can publish and subscribe to streaming data in DataHub and distribute the data to other platforms. DataHub allows you to analyze streaming data and build applications based on the streaming data.

Prerequisites

A reader or transformation node is configured. For more information, see Data source types that support real-time synchronization.

Background information

DataHub Writer writes data to DataHub by using the DataHub SDK for Java. The following Maven dependency shows the SDK version that DataHub Writer uses.

<dependency>
    <groupId>com.aliyun.datahub</groupId>
    <artifactId>aliyun-sdk-datahub</artifactId>
    <version>2.5.1</version>
</dependency>

Procedure

  1. Go to the DataStudio page.

    1. Log on to the DataWorks console.

    2. In the left-side navigation pane, click Workspaces.

    3. In the top navigation bar, select the region in which the workspace that you want to manage resides. On the Workspaces page, find the workspace and click Shortcuts > Data Development in the Actions column.

  2. In the Scheduled Workflow pane, move the pointer over the Create a table icon and choose Create Node > Data Integration > Real-time synchronization.

    Alternatively, right-click the required workflow and choose Create Node > Data Integration > Real-time synchronization.

  3. In the Create Node dialog box, set the Sync Method parameter to End-to-end ETL and configure the Name and Path parameters.

    Important

    The node name cannot exceed 128 characters in length and can contain letters, digits, underscores (_), and periods (.).

  4. Click Confirm.

  5. On the configuration tab of the real-time sync node, drag DataHub under Output to the canvas on the right. Connect the new node to a reader or transformation node.

  6. Click the new DataHub node. In the configuration pane that appears, set the parameters in the Node configuration section.

    DataHub Writer

    | Parameter | Description |
    | --- | --- |
    | Data source | The connection to the DataHub data store. In this example, you can select only a DataHub connection. If no connection is available, click New data source on the right to create one on the Data Source page. For more information, see Add a DataHub data source. |
    | Topic | The name of the topic to which data is written in DataHub. You can click Data preview on the right to preview the selected topic. |
    | Batch number | The number of records that are written at a time. |
    | Field Mapping | The mappings between fields in the source and destination data stores. DataWorks synchronizes data based on the field mappings. |

  7. Click Save in the top toolbar to save the node.
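
The Batch number and Field Mapping parameters in step 6 can be illustrated with a small, self-contained sketch. It is not DataWorks or DataHub SDK code: the class `DataHubWriterSketch`, the method `applyFieldMapping`, the class `BatchingWriter`, and the field names are all hypothetical, and the `sink` callback only stands in for whatever call actually sends a batch to DataHub. The sketch also assumes that source fields without a mapping are dropped, which the text above does not state.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

/** Illustrative only: hypothetical names, no DataHub SDK calls. */
public class DataHubWriterSketch {

    /**
     * Field Mapping: copies each mapped source field to its destination field name.
     * Assumption: source fields that have no mapping are dropped.
     */
    public static Map<String, Object> applyFieldMapping(Map<String, Object> sourceRecord,
                                                        Map<String, String> fieldMapping) {
        Map<String, Object> destinationRecord = new LinkedHashMap<>();
        for (Map.Entry<String, String> mapping : fieldMapping.entrySet()) {
            destinationRecord.put(mapping.getValue(), sourceRecord.get(mapping.getKey()));
        }
        return destinationRecord;
    }

    /** Batch number: buffers records and passes them to the sink in groups of at most batchSize. */
    public static class BatchingWriter {
        private final int batchSize;
        private final Consumer<List<Map<String, Object>>> sink; // stand-in for the real write call
        private final List<Map<String, Object>> buffer = new ArrayList<>();

        public BatchingWriter(int batchSize, Consumer<List<Map<String, Object>>> sink) {
            if (batchSize <= 0) {
                throw new IllegalArgumentException("batchSize must be positive");
            }
            this.batchSize = batchSize;
            this.sink = sink;
        }

        /** Adds one record and flushes as soon as the buffer reaches batchSize. */
        public void write(Map<String, Object> record) {
            buffer.add(record);
            if (buffer.size() >= batchSize) {
                flush();
            }
        }

        /** Sends any buffered records as one (possibly smaller) batch. */
        public void flush() {
            if (buffer.isEmpty()) {
                return;
            }
            sink.accept(new ArrayList<>(buffer));
            buffer.clear();
        }
    }

    public static void main(String[] args) {
        Map<String, String> fieldMapping = new LinkedHashMap<>();
        fieldMapping.put("id", "user_id");     // hypothetical source -> destination fields
        fieldMapping.put("name", "user_name");

        List<Integer> batchSizes = new ArrayList<>();
        BatchingWriter writer = new BatchingWriter(2, batch -> batchSizes.add(batch.size()));

        for (int i = 0; i < 5; i++) {
            Map<String, Object> source = new LinkedHashMap<>();
            source.put("id", i);
            source.put("name", "n" + i);
            source.put("ignored", true);       // has no mapping, so it is dropped
            writer.write(applyFieldMapping(source, fieldMapping));
        }
        writer.flush();                        // send the final partial batch

        System.out.println(batchSizes);        // prints [2, 2, 1]
    }
}
```

With a batch size of 2 and five records, the sink receives two full batches and one final batch of a single record. A larger batch size means fewer write calls, at the cost of records waiting longer in the buffer before they are sent.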