
Networkx: Calculating And Storing Shortest Paths On A Graph To A Pandas Data Frame

I have a pandas dataframe as shown below. There are many more columns in that frame that are not important concerning the task. The column id shows the sentenceID while the columns

Solution 1:

Here is one way to do it, broken into three steps so that it is easier to follow.

  • Step 1: Build the networkx graph object from a list of edges.
  • Step 2: Create a data frame with two columns, e1 and e2. For each row, we want the shortest path from the entity in e1 to the entity in e2, along with its length.
  • Step 3: Go through the data frame row by row, calculate the shortest path and its length, and store them as new columns.

Step 1: Build the graph and add edges, one by one

import pandas as pd
import networkx as nx
import matplotlib.pyplot as plt

elist = [[('a-5', 'b-17'), ('b-17', 'c-1')], #sentence 1
         [('c-1', 'a-23'), ('a-23', 'c-1')], #sentence 2
         [('b-17', 'g-2'), ('g-20', 'c-1')]] #sentence 3

graph = nx.Graph()

for sentence_edges in elist:
    for fromnode, tonode in sentence_edges:
        graph.add_edge(fromnode, tonode)

nx.draw(graph, with_labels=True, node_color='lightblue')
plt.show()  # display the figure when running outside a notebook
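The nested loop can also be collapsed into a single call. Here is a minimal sketch, assuming the same elist as above: `Graph.add_edges_from` accepts any iterable of edge tuples, so the per-sentence lists are first flattened with `itertools.chain.from_iterable`. Drawing is left out to keep the example short.

```python
from itertools import chain

import networkx as nx

elist = [[('a-5', 'b-17'), ('b-17', 'c-1')],   # sentence 1
         [('c-1', 'a-23'), ('a-23', 'c-1')],   # sentence 2
         [('b-17', 'g-2'), ('g-20', 'c-1')]]   # sentence 3

graph = nx.Graph()
# Flatten the per-sentence lists into one stream of (from, to) tuples.
graph.add_edges_from(chain.from_iterable(elist))

# An undirected Graph stores ('c-1', 'a-23') and ('a-23', 'c-1') as one edge.
print(graph.number_of_nodes(), graph.number_of_edges())
```

Because the graph is undirected, the two edges of sentence 2 collapse into one. If direction matters for your sentences, `nx.DiGraph` would be the class to look at instead.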


Step 2: Create a data frame of desired distances

#Create a data frame to store distances from the element in column e1 to e2
DF = pd.DataFrame({"e1":['c-1', 'a-23', 'c-1', 'g-2'],
             "e2":['b-17', 'a-5', 'g-20', 'g-20']})
DF


Step 3: Calculate the shortest path and length, and store them in the data frame

This is the final step. For each row, look up the shortest path between the nodes in e1 and e2 and its length, then store both in the data frame.

pathlist, len_list = [], [] #placeholders

for row in DF.itertuples():
    so, tar = row.e1, row.e2  # source and target nodes for this row
    path = nx.shortest_path(graph, source=so, target=tar)
    length = nx.shortest_path_length(graph, source=so, target=tar)
    pathlist.append(path)
    len_list.append(length)

#Add these lists as new columns in the DF
DF['length'] = len_list
DF['path'] = pathlist

This produces the desired data frame, with the new length and path columns alongside e1 and e2.
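The result can be checked with a compact, self-contained variant of the three steps. This is a sketch rather than the exact code above: it calls `nx.shortest_path` once per row and derives the length as `len(path) - 1`, which is valid here only because the graph is unweighted.

```python
import networkx as nx
import pandas as pd

# Same sample edges as in Step 1, already flattened into one list.
edges = [('a-5', 'b-17'), ('b-17', 'c-1'), ('c-1', 'a-23'),
         ('a-23', 'c-1'), ('b-17', 'g-2'), ('g-20', 'c-1')]
graph = nx.Graph(edges)

DF = pd.DataFrame({"e1": ['c-1', 'a-23', 'c-1', 'g-2'],
                   "e2": ['b-17', 'a-5', 'g-20', 'g-20']})

# One shortest_path call per row; on an unweighted graph the length is
# the number of edges on the path, i.e. len(path) - 1.
DF['path'] = [nx.shortest_path(graph, source=s, target=t)
              for s, t in zip(DF['e1'], DF['e2'])]
DF['length'] = DF['path'].map(len) - 1

print(DF[['e1', 'e2', 'length', 'path']].to_string(index=False))
```

For the sample data the lengths come out as 1, 3, 1 and 3. For example, the path from a-23 to a-5 runs through c-1 and b-17.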

Hope this helps you.


Solution 2:

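For anyone who is interested, here is the solution I ended up with (thanks to Ram Narasimhan). It builds a separate graph for each sentence id and records a placeholder when no path exists: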

pathlist, len_list = [], []
so, tar = DF["e1"].tolist(), DF["e2"].tolist()
ids = DF["id"].tolist()  # named ids to avoid shadowing the built-in id()

for sid, s, t in zip(ids, so, tar):
    # graph_edges is not defined in this snippet; indexing it by
    # sentence id yields the edges of that sentence.
    graph = nx.Graph(graph_edges[sid])  # construct the graph for this sentence
    try:
        path = nx.shortest_path(graph, source=s, target=t)
        length = nx.shortest_path_length(graph, source=s, target=t)
    except nx.NetworkXNoPath:
        path = "No Path"
        length = "No Pathlength"
    pathlist.append(path)
    len_list.append(length)

# Add these lists as new columns in the DF
DF['length'] = len_list
DF['path'] = pathlist
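One case the loop above does not cover is a source or target that does not appear in the sentence's graph at all; networkx raises `nx.NodeNotFound` for that rather than `nx.NetworkXNoPath`. The sketch below is one possible way to handle both. The `graph_edges` dictionary, the sample rows and the helper name `path_and_length` are made up for illustration, and the helper returns `None` instead of placeholder strings so the columns do not mix strings with lists and numbers.

```python
import networkx as nx
import pandas as pd

# Hypothetical sample data: sentence id -> list of edges for that sentence.
graph_edges = {
    1: [('a-5', 'b-17'), ('b-17', 'c-1')],
    2: [('c-1', 'a-23'), ('x-9', 'y-4')],   # two disconnected components
}
DF = pd.DataFrame({"id": [1, 2, 2],
                   "e1": ['a-5', 'c-1', 'c-1'],
                   "e2": ['c-1', 'y-4', 'z-0']})   # 'z-0' is not in sentence 2


def path_and_length(edges, source, target):
    """Return (path, length), or (None, None) if no path can be found."""
    graph = nx.Graph(edges)
    try:
        path = nx.shortest_path(graph, source=source, target=target)
    except (nx.NetworkXNoPath, nx.NodeNotFound):
        # NetworkXNoPath: both nodes exist but are not connected.
        # NodeNotFound: source or target is missing from this graph.
        return None, None
    return path, len(path) - 1   # unweighted graph: edges on the path


results = [path_and_length(graph_edges[sid], s, t)
           for sid, s, t in zip(DF['id'], DF['e1'], DF['e2'])]
DF['path'] = [p for p, _ in results]
DF['length'] = [n for _, n in results]
print(DF)
```

Rows without a path can then be selected with `DF['path'].isna()`, which is harder to do reliably when the placeholder is a string.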
