DataHub is a platform designed to process streaming data. You can publish and subscribe to streaming data in DataHub and distribute the data to other platforms. DataHub allows you to analyze streaming data and build applications based on the streaming data.
Prerequisites
A reader or transformation node is configured. For more information, see Data source types that support real-time synchronization.
Background information
DataHub Writer writes data to DataHub by using the DataHub SDK for Java. The following Maven dependency shows the SDK version that DataHub Writer uses.
<dependency>
    <groupId>com.aliyun.datahub</groupId>
    <artifactId>aliyun-sdk-datahub</artifactId>
    <version>2.5.1</version>
</dependency>
Procedure
Go to the DataStudio page.
Log on to the DataWorks console.
In the left-side navigation pane, click Workspaces.
In the top navigation bar, select the region in which the workspace that you want to manage resides. On the Workspaces page, find the workspace and click in the Actions column.
In the Scheduled Workflow pane, move the pointer over the icon and choose .
Alternatively, right-click the required workflow, and then choose .
In the Create Node dialog box, set the Sync Method parameter to End-to-end ETL and configure the Name and Path parameters.
Important: The node name cannot exceed 128 characters in length and can contain only letters, digits, underscores (_), and periods (.).
Click Confirm.
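The naming rule above can be checked before you open the dialog box. The following is a minimal sketch, not part of DataWorks or the DataHub SDK: the class name NodeNameValidator is hypothetical, and the sketch assumes that "letters" means ASCII letters and that an empty name is not allowed.

```java
import java.util.regex.Pattern;

/** Hypothetical helper for illustration; not a DataWorks or DataHub SDK API. */
final class NodeNameValidator {
    // Assumption: "letters" means A-Z and a-z; the console may accept more.
    // Assumption: a name must contain at least one character.
    private static final Pattern ALLOWED = Pattern.compile("[A-Za-z0-9_.]{1,128}");

    private NodeNameValidator() {}

    /** Returns true if the name uses only the allowed characters and has 1 to 128 characters. */
    static boolean isValid(String name) {
        return name != null && ALLOWED.matcher(name).matches();
    }
}
```

For example, under these assumptions NodeNameValidator.isValid("datahub_writer.v1") returns true, and a name that contains a hyphen is rejected.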
On the configuration tab of the real-time sync node, drag DataHub under to the canvas on the right. Connect the new node to a reader or transformation node.
Click the new DataHub node. In the configuration pane that appears, set the parameters in the Node configuration section.
Parameter: Description
Data source: The connection to the DataHub data store. In this example, you can select only a DataHub connection. If no connection is available, click New data source on the right to create one on the Add a DataHub data source page.
Topic: The name of the DataHub topic to which data is written. You can click Data preview on the right to preview the selected topic.
Batch number: The number of records that are written at a time.
Field Mapping: The mappings between fields in the source and destination data stores. DataWorks synchronizes data based on the field mappings.
Click the save icon in the top toolbar to save the node.
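To make the Batch number and Field Mapping parameters more concrete, the following sketch shows one way a writer could rename fields and then group records into batches. It is an illustration only and does not describe how DataHub Writer is implemented. The class WriterSketch and its methods are hypothetical, and the sketch assumes that source fields without a mapping are dropped.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration; these names are not DataWorks or DataHub SDK APIs. */
final class WriterSketch {
    private WriterSketch() {}

    /**
     * Copies each mapped source field to its destination field name.
     * Assumption: source fields that have no mapping are dropped.
     */
    static Map<String, Object> applyMapping(Map<String, Object> sourceRow,
                                            Map<String, String> fieldMapping) {
        Map<String, Object> destinationRow = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : fieldMapping.entrySet()) {
            if (sourceRow.containsKey(entry.getKey())) {
                destinationRow.put(entry.getValue(), sourceRow.get(entry.getKey()));
            }
        }
        return destinationRow;
    }

    /** Splits records into consecutive batches that each hold at most batchSize records. */
    static <T> List<List<T>> toBatches(List<T> records, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < records.size(); start += batchSize) {
            int end = Math.min(start + batchSize, records.size());
            batches.add(new ArrayList<>(records.subList(start, end)));
        }
        return batches;
    }
}
```

With a batch size of 2, five records are split into three batches of 2, 2, and 1 records. A larger batch size means fewer write calls, but each call carries more records.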