This topic explains what a node dependency loop is, why it forms, and how to resolve it.
What is a node dependency loop?
A node dependency loop is formed when a node is an ancestor of other nodes and also depends on one or more of its own descendant nodes. The nodes in the loop cannot be automatically scheduled. In the production environment, an alert notification is automatically sent when a node dependency loop is formed. For more information, see the Monitoring and alerting section in this topic.
Cause and solution
A loop forms when a dependency points from a node back to one of its own descendant nodes, so that the node is both an ancestor and a descendant in the same chain. In this case, analyze the workflow and remove the dependency that causes the loop as soon as possible.
For example, if an ancestor node depends on the data in the table generated by its descendant node in the same scheduling cycle, and writes data processing results to the table, a node dependency loop is formed. In this case, confirm the business scenario and modify the dependency so that the ancestor node depends on the data that its descendant node generated in the previous scheduling cycle.
Sample scenario: Node A queries data in Table C and generates Table A. Node B cleanses data in Table A and writes the obtained data to Table B. Node C then cleanses data in Table B and writes the obtained data to Table C. Because Node A depends on Node C, Node C depends on Node B, and Node B depends on Node A, a node dependency loop is formed. The following figure shows the node dependency loop.
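The sample scenario can also be checked programmatically. The following is a minimal sketch, not a DataWorks API: the function name find_dependency_loop and the dictionary layout (each node mapped to the nodes it depends on) are assumptions made for illustration. It uses a depth-first search to return one loop if the graph contains any.

```python
from typing import Dict, List, Optional


def find_dependency_loop(depends_on: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one dependency loop as a list of node names, or None.

    depends_on maps each node to the nodes it depends on (its ancestors).
    This layout is a hypothetical model, not a DataWorks data structure.
    """
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    color: Dict[str, int] = {}
    path: List[str] = []

    def visit(node: str) -> Optional[List[str]]:
        color[node] = GRAY
        path.append(node)
        for parent in depends_on.get(node, []):
            state = color.get(parent, WHITE)
            if state == GRAY:
                # parent is already on the current path, so the path loops back.
                return path[path.index(parent):] + [parent]
            if state == WHITE:
                loop = visit(parent)
                if loop:
                    return loop
        path.pop()
        color[node] = BLACK
        return None

    for node in depends_on:
        if color.get(node, WHITE) == WHITE:
            loop = visit(node)
            if loop:
                return loop
    return None


# The sample scenario: each node depends on the node that produces its input table.
sample = {
    "Node A": ["Node C"],  # Node A queries Table C
    "Node B": ["Node A"],  # Node B cleanses Table A
    "Node C": ["Node B"],  # Node C cleanses Table B
}
loop = find_dependency_loop(sample)
print(loop)  # ['Node A', 'Node C', 'Node B', 'Node A']
```

The returned list follows the "depends on" direction, so it reads as Node A depends on Node C, which depends on Node B, which depends on Node A.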
Solution: Analyze the workflow and remove the dependency that causes the loop. If you want an ancestor node to cleanse data generated by its descendant node in the previous scheduling cycle, you can configure cross-cycle scheduling dependencies. In the sample scenario shown in the following figure, you can configure cross-cycle scheduling dependencies between Node A and Node C.
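To illustrate why a cross-cycle dependency resolves the loop, the following sketch models each dependency with a flag that marks it as cross-cycle. The tuple layout and the function name same_cycle_order are assumptions for illustration only; they do not reflect how DataWorks stores or evaluates dependencies. The sketch treats only same-cycle dependencies as constraints on the run order within one scheduling cycle and uses the Python standard library module graphlib (Python 3.9 or later) to compute that order.

```python
from graphlib import CycleError, TopologicalSorter
from typing import List, Optional, Tuple

# (node, node it depends on, True if the dependency is cross-cycle)
Dependency = Tuple[str, str, bool]


def same_cycle_order(dependencies: List[Dependency]) -> Optional[List[str]]:
    """Return a valid run order within one scheduling cycle, or None if a loop exists.

    Cross-cycle dependencies point at the previous cycle's instance, so they
    are left out of the same-cycle graph (an assumption of this sketch).
    """
    graph = {}
    for node, parent, cross_cycle in dependencies:
        graph.setdefault(node, set())
        graph.setdefault(parent, set())
        if not cross_cycle:
            graph[node].add(parent)
    try:
        return list(TopologicalSorter(graph).static_order())
    except CycleError:
        return None


# All three dependencies are in the same cycle: a loop is formed.
before = [
    ("Node A", "Node C", False),
    ("Node B", "Node A", False),
    ("Node C", "Node B", False),
]

# Node A now depends on the previous cycle's instance of Node C.
after = [
    ("Node A", "Node C", True),
    ("Node B", "Node A", False),
    ("Node C", "Node B", False),
]

print(same_cycle_order(before))  # None
print(same_cycle_order(after))   # ['Node A', 'Node B', 'Node C']
```

With the dependency between Node A and Node C marked as cross-cycle, the remaining same-cycle dependencies form a simple chain, so the nodes can be scheduled in order.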
Monitoring and alerting
DataWorks provides built-in alert rules to monitor and scan auto triggered tasks on a regular basis. This ensures that auto triggered tasks can be run as scheduled and instances can be generated for auto triggered tasks. If an exception occurs, an alert is triggered. An alert notification is automatically sent if a node dependency loop is formed. We recommend that you handle the alert at the earliest opportunity.
DataWorks scans auto triggered tasks at 09:00, 12:00, 16:00, 20:00, and 22:00 every day. If an exception is found, DataWorks sends an alert notification. However, if an exception occurs within 10 minutes before a scan starts, the exception is outside the scope of that scan and cannot be detected until the next scan.
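The scan schedule determines how long a dependency loop may go unreported. The following sketch estimates which scan first detects an exception. It rests on several assumptions that this topic does not confirm: the function name first_detecting_scan is hypothetical, an exception that occurs exactly 10 minutes before a scan is treated as in scope, the exception persists until the scan runs, and all times are in the same time zone as the scan schedule.

```python
from datetime import datetime, time, timedelta

# Daily scan times listed in this topic.
SCAN_TIMES = [time(9, 0), time(12, 0), time(16, 0), time(20, 0), time(22, 0)]

# Exceptions that occur less than this long before a scan are missed by that scan.
BLIND_WINDOW = timedelta(minutes=10)


def first_detecting_scan(exception_at: datetime) -> datetime:
    """Return the start time of the first scan expected to detect the exception."""
    day = exception_at.date()
    while True:
        for scan_time in SCAN_TIMES:
            scan_at = datetime.combine(day, scan_time)
            # The scan must start at least BLIND_WINDOW after the exception.
            if scan_at - exception_at >= BLIND_WINDOW:
                return scan_at
        # No scan later today qualifies; try the next day's schedule.
        day += timedelta(days=1)


# 11:30 is more than 10 minutes before the 12:00 scan, so 12:00 detects it.
print(first_detecting_scan(datetime(2024, 1, 15, 11, 30)))  # 2024-01-15 12:00:00

# 11:55 falls inside the 10-minute window, so detection waits for 16:00.
print(first_detecting_scan(datetime(2024, 1, 15, 11, 55)))  # 2024-01-15 16:00:00

# 21:55 misses the last scan of the day and is detected at 09:00 the next day.
print(first_detecting_scan(datetime(2024, 1, 15, 21, 55)))  # 2024-01-16 09:00:00
```

Under these assumptions, the longest gap is overnight: a loop formed shortly before the 22:00 scan is not reported until 09:00 the next day.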
Alert rules for node dependency loops are built-in rules provided by DataWorks. After an alert rule is triggered, an alert notification is sent to the node owner by text message or email. You can change the alert contact on the Rule Management page. For more information, see Create a custom alert rule.