The single-source shortest path problem is to find the shortest path from a start vertex to every other vertex in a graph. The Single-source Shortest Path component solves it by using the Dijkstra algorithm. For each vertex that can be reached from the start vertex, the component returns the shortest distance and the number of shortest paths.
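The following Python sketch illustrates how the Dijkstra algorithm can track the number of shortest paths along with the distance. It is not the component's implementation. It assumes directed edges with strictly positive weights, and the function name `sssp_with_counts` is hypothetical. The sketch counts one path from the start vertex to itself, whereas the sample output later in this topic lists 0 for the start vertex.

```python
import heapq
from collections import defaultdict


def sssp_with_counts(edges, start):
    """Return {vertex: (distance, shortest_path_count)} for reachable vertices.

    edges: iterable of (source, target, weight) tuples. This sketch assumes
    directed edges and strictly positive weights.
    """
    graph = defaultdict(list)
    for src, dst, weight in edges:
        graph[src].append((dst, weight))

    dist = {start: 0.0}
    count = {start: 1}  # the empty path from the start vertex to itself
    done = set()
    heap = [(0.0, start)]

    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue  # stale heap entry; u was already finalized
        done.add(u)
        for v, w in graph[u]:
            nd = d + w
            if v not in dist or nd < dist[v]:
                # Strictly shorter path: replace distance and path count.
                dist[v] = nd
                count[v] = count[u]
                heapq.heappush(heap, (nd, v))
            elif nd == dist[v]:
                # Equally short path: add the paths that reach v through u.
                count[v] += count[u]
    return {v: (dist[v], count[v]) for v in dist}


# A diamond-shaped graph: two equally short routes from "s" to "t".
diamond = [("s", "a", 1.0), ("s", "b", 1.0), ("a", "t", 1.0), ("b", "t", 1.0)]
result = sssp_with_counts(diamond, "s")
print(result["t"])  # (2.0, 2)
```

Vertices that cannot be reached from the start vertex never enter `dist`, so they are absent from the result.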
Configure the component
Method 1: Configure the component on the pipeline page
You can add the Single-source Shortest Path component on the pipeline page of Machine Learning Designer in the Platform for AI (PAI) console. The following table describes the parameters.
| Tab | Parameter | Description |
| --- | --- | --- |
| Fields Setting | Source Vertex Column | The start vertex column in the edge table. |
| Fields Setting | Target Vertex Column | The end vertex column in the edge table. |
| Fields Setting | Edge Weight Column | The edge weight column in the edge table. |
| Parameters Setting | Initial Node ID | The start vertex that is used to calculate the shortest path. |
| Tuning | Number of Workers | The number of workers for parallel job execution. The degree of parallelism and framework communication costs increase with the value of this parameter. |
| Tuning | Worker Memory (MB) | The maximum size of memory that a single job can use. Unit: MB. Default value: 4096. If the size of used memory exceeds the value of this parameter, the |
Method 2: Configure the component by using PAI commands
You can configure the Single-source Shortest Path component by using PAI commands. You can use the SQL Script component to run PAI commands. For more information, see Scenario 4: Execute PAI commands within the SQL script component in the "SQL Script" topic.
PAI -name SSSP
-project algo_public
-DinputEdgeTableName=SSSP_func_test_edge
-DfromVertexCol=flow_out_id
-DtoVertexCol=flow_in_id
-DoutputTableName=SSSP_func_test_result
-DhasEdgeWeight=true
-DedgeWeightCol=edge_weight
-DstartVertex=a;
| Parameter | Required | Default value | Description |
| --- | --- | --- | --- |
| inputEdgeTableName | Yes | No default value | The name of the input edge table. |
| inputEdgeTablePartitions | No | Full table | The partitions in the input edge table. |
| fromVertexCol | Yes | No default value | The start vertex column in the input edge table. |
| toVertexCol | Yes | No default value | The end vertex column in the input edge table. |
| outputTableName | Yes | No default value | The name of the output table. |
| outputTablePartitions | No | No default value | The partitions in the output table. |
| lifecycle | No | No default value | The lifecycle of the output table. |
| workerNum | No | No default value | The number of workers for parallel job execution. The degree of parallelism and framework communication costs increase with the value of this parameter. |
| workerMem | No | 4096 | The maximum size of memory that a single job can use. Unit: MB. If the size of used memory exceeds the value of this parameter, the |
| splitSize | No | 64 | The data split size. Unit: MB. |
| startVertex | Yes | No default value | The ID of the start vertex. |
| hasEdgeWeight | No | false | Specifies whether the edges in the input edge table have weights. |
| edgeWeightCol | No | No default value | The edge weight column in the input edge table. |
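If you generate the command from a script, a small helper can check the required parameters before the command is submitted. The following Python sketch is one possible approach and is not part of PAI. The function `build_sssp_command` and the constant `REQUIRED` are hypothetical names, and the helper only assembles text in the `-D<name>=<value>` form shown in the preceding command.

```python
# Parameters marked "Yes" in the Required column of the preceding table.
REQUIRED = (
    "inputEdgeTableName",
    "fromVertexCol",
    "toVertexCol",
    "outputTableName",
    "startVertex",
)


def build_sssp_command(params, project="algo_public"):
    """Assemble a PAI SSSP command string from a dict of parameters."""
    missing = [name for name in REQUIRED if name not in params]
    if missing:
        raise ValueError("missing required parameters: " + ", ".join(missing))

    lines = ["PAI -name SSSP", f"    -project {project}"]
    for name, value in params.items():
        if isinstance(value, bool):
            # The command uses lowercase true/false.
            value = "true" if value else "false"
        lines.append(f"    -D{name}={value}")
    return "\n".join(lines) + ";"


command = build_sssp_command({
    "inputEdgeTableName": "SSSP_func_test_edge",
    "fromVertexCol": "flow_out_id",
    "toVertexCol": "flow_in_id",
    "outputTableName": "SSSP_func_test_result",
    "hasEdgeWeight": True,
    "edgeWeightCol": "edge_weight",
    "startVertex": "a",
})
print(command)
```

The helper does not validate values or contact PAI. It only reproduces the command text so that a missing required parameter fails early in your own script.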
Example
Add the SQL Script component as a node to the canvas and execute the following SQL statements to generate the test data.
drop table if exists SSSP_func_test_edge;
create table SSSP_func_test_edge as
select flow_out_id, flow_in_id, edge_weight from (
    select "a" as flow_out_id, "b" as flow_in_id, 1.0 as edge_weight
    union all select "b" as flow_out_id, "c" as flow_in_id, 2.0 as edge_weight
    union all select "c" as flow_out_id, "d" as flow_in_id, 1.0 as edge_weight
    union all select "b" as flow_out_id, "e" as flow_in_id, 2.0 as edge_weight
    union all select "e" as flow_out_id, "d" as flow_in_id, 1.0 as edge_weight
    union all select "c" as flow_out_id, "e" as flow_in_id, 1.0 as edge_weight
    union all select "f" as flow_out_id, "g" as flow_in_id, 3.0 as edge_weight
    union all select "a" as flow_out_id, "d" as flow_in_id, 4.0 as edge_weight
) tmp;
Data structure
Add another SQL Script component as a node to the canvas and run the following PAI commands to calculate the shortest paths.
drop table if exists ${o1};
PAI -name SSSP
    -project algo_public
    -DinputEdgeTableName=SSSP_func_test_edge
    -DfromVertexCol=flow_out_id
    -DtoVertexCol=flow_in_id
    -DoutputTableName=${o1}
    -DhasEdgeWeight=true
    -DedgeWeightCol=edge_weight
    -DstartVertex=a;
Right-click the SQL Script component and choose View Data > SQL Script Output to view the results.
| start_node | dest_node | distance | distance_cnt |
| ---------- | --------- | -------- | ------------ |
| a | a | 0.0 | 0 |
| a | b | 1.0 | 1 |
| a | c | 3.0 | 1 |
| a | d | 4.0 | 3 |
| a | e | 3.0 | 1 |
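To see why distance_cnt is 3 for vertex d, you can enumerate the paths from a to d by hand or with a short script. The following Python sketch lists every simple path in the sample edge table and keeps the shortest ones. It treats the edges as directed, which is an assumption of this sketch, and the function name `simple_paths` is hypothetical.

```python
# The sample edge table: (flow_out_id, flow_in_id, edge_weight).
edges = [
    ("a", "b", 1.0), ("b", "c", 2.0), ("c", "d", 1.0), ("b", "e", 2.0),
    ("e", "d", 1.0), ("c", "e", 1.0), ("f", "g", 3.0), ("a", "d", 4.0),
]


def simple_paths(edges, start, goal):
    """Return (length, path) for every simple path from start to goal."""
    graph = {}
    for src, dst, weight in edges:
        graph.setdefault(src, []).append((dst, weight))

    found = []

    def walk(vertex, path, length):
        if vertex == goal:
            found.append((length, path))
            return
        for nxt, weight in graph.get(vertex, []):
            if nxt not in path:  # do not revisit a vertex
                walk(nxt, path + [nxt], length + weight)

    walk(start, [start], 0.0)
    return found


paths = simple_paths(edges, "a", "d")
best = min(length for length, _ in paths)
shortest = sorted(path for length, path in paths if length == best)

print(best)  # 4.0
for path in shortest:
    print(" -> ".join(path))
```

Three paths share the minimum length of 4.0: a -> b -> c -> d, a -> b -> e -> d, and a -> d. Vertices f and g do not appear in the output table because no path leads to them from a.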