Utilize Hologres's real-time write capability to build an interactive data warehouse for real-time analysis.
Prerequisites
Before configuring the Hologres Writer node, ensure the corresponding input or transform data source is set up. For more information, see the referenced document.
Background information
- Supported Hologres data source versions: V0.7, V0.8, V0.9, V0.10, and V1.1.
- Fields of the UUID data type cannot be synchronized.
Procedure
- Go to the DataStudio page.
Log on to the DataWorks console. In the top navigation bar, select the desired region. In the left-side navigation pane, choose . On the page that appears, select the desired workspace from the drop-down list and click Go to Data Development.
- Hover over the icon and click . Alternatively, expand the business flow, right-click the target business flow, and select .
- In the Create Node dialog box, set Synchronization Method to Single Table (topic) to Single Table (topic) ETL, enter a Name, and select a Path.
- Click Confirm.
- On the edit page of the real-time synchronization node, drag the Hologres node to the editing panel, then connect it to the previously configured input or transform node.
- Click the Hologres node. In the Node Configuration dialog box, set the parameters.
Parameter
Description
Data Source
The name of the Hologres data source that you configured. You can select only a Hologres data source.
If you have not configured a data source, click Create Data Source on the right to go to the page where you can create one. For more information, see the referenced document.
Table
Select the name of the table to which you want to write data from the current data source.
You can click Create Table on the right to create a new table, or click Data Preview to check the data in the selected table.
Partition Information
Partition Method
The default is Dynamic Partitioning Based On Field Content.
If the Hologres table is a partitioned table, you need to set these partition parameters.
Partition Field Value Source
You can select the table field information from the input data source configured in the ancestor node.
Partition Field Name
The default is the partition field name set for the table.
Partition Field Value
The partition field value can be an Enumeration Value or a Time Value.
If the partition field value is an enumeration value, a partition is created for each distinct value of the partition field. The number of distinct values must therefore not exceed 1,000 per day. If this limit is exceeded, partition creation fails and the real-time task stops running.
If the partition field value is a time value, you must configure the Source Format of the time value and the Target Format in which it is saved.
Partitioned Cache Queue Size
The larger the partitioned cache queue size, the greater the memory consumption. If the source data is severely out of order with respect to the partition field, increase this value and allocate more memory accordingly.
Job Type
Valid values: Replay and Insert.
Replay: mirrors the operations of the source. When the source performs an INSERT operation, Hologres also performs an INSERT operation. When the source performs an UPDATE or DELETE operation, Hologres performs the corresponding UPDATE or DELETE operation.
Insert: uses Hologres as stream storage. The data synchronized from the source is saved by using INSERT.
Write Conflict Policy
Valid values: Overwrite and Ignore.
Overwrite: Use the new data synchronized from the source end to overwrite the existing data.
Ignore: Ignore the new data synchronized from the source end and retain the existing data.
Field Mapping
Click Field Mapping to set the mapping of fields between the source and destination ends. The synchronization node synchronizes data based on the field mappings.
- Click the icon in the toolbar.
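
The Partition Field Value setting described in the parameter list can be illustrated with a small Python sketch. It is not how the synchronization node is implemented. The function names are hypothetical, and Python `strptime`/`strftime` directives stand in for whatever pattern syntax the Source Format and Target Format fields actually accept, which this page does not specify. Only the limit of 1,000 distinct enumeration values per day comes from the parameter description.

```python
from datetime import datetime

# Limit on distinct enumeration values per day, as stated in the parameter description.
MAX_ENUM_PARTITIONS_PER_DAY = 1000


def time_partition_value(raw: str, source_format: str, target_format: str) -> str:
    """Parse a field with the source format and render it in the target format.

    Hypothetical helper: the formats here are Python directives, used only
    for illustration.
    """
    return datetime.strptime(raw, source_format).strftime(target_format)


def check_enum_partitions(values) -> int:
    """Count distinct enumeration values and fail if the daily limit is exceeded."""
    distinct = len(set(values))
    if distinct > MAX_ENUM_PARTITIONS_PER_DAY:
        raise ValueError(
            f"{distinct} distinct partition values exceed the limit of "
            f"{MAX_ENUM_PARTITIONS_PER_DAY}"
        )
    return distinct


if __name__ == "__main__":
    # A timestamp field mapped to a daily partition value.
    print(time_partition_value("2023-05-17 08:30:00", "%Y-%m-%d %H:%M:%S", "%Y%m%d"))
    # Three records, two distinct enumeration values.
    print(check_enum_partitions(["cn", "us", "cn"]))
```

Running a check like `check_enum_partitions` against a sample of source data before starting the task is one way to estimate whether an enumeration field is a safe choice for partitioning.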
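
The difference between the Job Type and Write Conflict Policy options can also be sketched on an in-memory table. This is a simplified model under stated assumptions, not the behavior of Hologres itself: the event layout, the function names, the choice to skip an UPDATE whose key is missing, and the modeling of Insert mode as an append-only list are all illustrative.

```python
def replay(table: dict, event: dict, conflict_policy: str = "overwrite") -> None:
    """Replay mode: mirror one source operation onto `table` (a dict keyed by primary key)."""
    op, key = event["op"], event["key"]
    if op == "INSERT":
        if key in table and conflict_policy == "ignore":
            return  # Ignore: retain the existing row, drop the new one
        table[key] = event["row"]  # Overwrite, or no conflict at all
    elif op == "UPDATE":
        if key in table:  # assumption: an UPDATE for a missing key is skipped
            table[key] = event["row"]
    elif op == "DELETE":
        table.pop(key, None)
    else:
        raise ValueError(f"unknown operation: {op}")


def insert_only(log: list, event: dict) -> None:
    """Insert mode (assumed model): every synchronized record is appended as a new row."""
    log.append({"op": event["op"], "key": event["key"], "row": event.get("row")})


def run_replay(events, conflict_policy: str) -> dict:
    """Apply a sequence of events in Replay mode and return the resulting table."""
    table = {}
    for event in events:
        replay(table, event, conflict_policy)
    return table


if __name__ == "__main__":
    events = [
        {"op": "INSERT", "key": 1, "row": {"v": 1}},
        {"op": "INSERT", "key": 1, "row": {"v": 2}},  # primary key conflict
    ]
    print(run_replay(events, "overwrite"))  # new row replaces the existing one
    print(run_replay(events, "ignore"))     # existing row is kept

    log = []
    for event in events + [{"op": "DELETE", "key": 1}]:
        insert_only(log, event)
    print(len(log))  # every record is stored, including the delete
```

In this model, Replay keeps the destination table in the same state as the source, while Insert keeps a history of every synchronized record.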