Networkx: Calculating And Storing Shortest Paths On A Graph To A Pandas Data Frame
I have a pandas dataframe as shown below. There are many more columns in that frame that are not important concerning the task. The column id shows the sentenceID while the columns
Solution 1:
Here's one way to do what you are trying to do, in three distinct steps so that it is easier to follow along.
- Step 1: From a list of edges, build the networkx graph object.
- Step 2: Create a data frame with 2 columns (for each row in this DF, we want the shortest distance and path from the entity in e1 to the entity in e2).
- Step 3: Row by row for the DF, calculate shortest path and length. Store them in the DF as new columns.
Step 1: Build the graph and add edges, one by one
import pandas as pd
import networkx as nx
import matplotlib.pyplot as plt
elist = [[('a-5', 'b-17'), ('b-17', 'c-1')],  # sentence 1
         [('c-1', 'a-23'), ('a-23', 'c-1')],  # sentence 2
         [('b-17', 'g-2'), ('g-20', 'c-1')]]  # sentence 3
graph = nx.Graph()
for sentence_edges in elist:
    for fromnode, tonode in sentence_edges:
        graph.add_edge(fromnode, tonode)
nx.draw(graph, with_labels=True, node_color='lightblue')
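The nested loop above can also be written with `add_edges_from`, which accepts any iterable of edge tuples. This is a minimal sketch using the same `elist`; the sanity checks at the end are an addition for illustration and are not part of the original answer.

```python
import itertools

import networkx as nx

elist = [[('a-5', 'b-17'), ('b-17', 'c-1')],  # sentence 1
         [('c-1', 'a-23'), ('a-23', 'c-1')],  # sentence 2
         [('b-17', 'g-2'), ('g-20', 'c-1')]]  # sentence 3

# Flatten the per-sentence lists into one stream of edges and add them in one call.
graph = nx.Graph()
graph.add_edges_from(itertools.chain.from_iterable(elist))

# nx.Graph is undirected, so ('c-1', 'a-23') and ('a-23', 'c-1') are the same edge.
print(graph.number_of_nodes(), graph.number_of_edges())

# Check connectivity up front: shortest_path raises NetworkXNoPath for unreachable pairs.
print(nx.is_connected(graph))
```

Because all sentences share one graph here, nodes that appear in several sentences (such as `c-1`) link the sentences together.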
Step 2: Create a data frame of desired distances
#Create a data frame to store distances from the element in column e1 to e2
DF = pd.DataFrame({"e1": ['c-1', 'a-23', 'c-1', 'g-2'],
                   "e2": ['b-17', 'a-5', 'g-20', 'g-20']})
DF
Step 3: Calculate Shortest path and length, and store in the data frame
This is the final step. Calculate shortest paths and store them.
pathlist, len_list = [], [] #placeholders
for row in DF.itertuples():
    so, tar = row[1], row[2]
    path = nx.shortest_path(graph, source=so, target=tar)
    length = nx.shortest_path_length(graph, source=so, target=tar)
    pathlist.append(path)
    len_list.append(length)
#Add these lists as new columns in the DF
DF['length'] = len_list
DF['path'] = pathlist
Which produces the desired resulting data frame:
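The frame can be reproduced with the self-contained sketch below. It deviates from the loop above in one respect, which is my own substitution: each path is computed once, and the length is taken as the number of edges on that path. That shortcut is only valid for an unweighted graph like this one.

```python
import networkx as nx
import pandas as pd

# The five distinct undirected edges from elist
edges = [('a-5', 'b-17'), ('b-17', 'c-1'), ('c-1', 'a-23'),
         ('b-17', 'g-2'), ('g-20', 'c-1')]
graph = nx.Graph(edges)

DF = pd.DataFrame({"e1": ['c-1', 'a-23', 'c-1', 'g-2'],
                   "e2": ['b-17', 'a-5', 'g-20', 'g-20']})

# One shortest-path call per row
DF['path'] = [nx.shortest_path(graph, source=s, target=t)
              for s, t in zip(DF['e1'], DF['e2'])]

# In an unweighted graph, path length = number of nodes on the path minus one
DF['length'] = DF['path'].map(len) - 1

print(DF)
```

This graph happens to be a tree, so every shortest path is unique and the result does not depend on tie-breaking. The four rows come out with lengths 1, 3, 1 and 3.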
Hope this helps you.
Solution 2:
For anyone who's interested in the solution (thanks to Ram Narasimhan):
pathlist, len_list = [], []
so, tar = DF["e1"].tolist(), DF["e2"].tolist()
ids = DF["id"].tolist()
for sent_id, s, t in zip(ids, so, tar):
    # graph_edges (from the question) is indexed by sentence id
    graph = nx.Graph(graph_edges[sent_id])  # Constructing each Graph
    try:
        path = nx.shortest_path(graph, source=s, target=t)
        length = nx.shortest_path_length(graph, source=s, target=t)
    except nx.NetworkXNoPath:
        path = "No Path"
        length = "No Pathlength"
    pathlist.append(path)
    len_list.append(length)
#Add these lists as new columns in the DF
DF['length'] = len_list
DF['path'] = pathlist
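A possible variation on the per-sentence approach, sketched under assumptions: `graph_edges` below is a hypothetical stand-in for the question's data (a dict from sentence id to edge list), `safe_shortest_path` is a helper name made up for this example, and the sample rows are invented to exercise the failure cases. It also catches `nx.NodeNotFound`, which networkx raises when an endpoint is missing from the graph, and returns `None` instead of marker strings so the columns stay easy to filter.

```python
import networkx as nx
import pandas as pd

# Hypothetical per-sentence edge lists, keyed by sentence id
graph_edges = {
    1: [('a-5', 'b-17'), ('b-17', 'c-1')],
    2: [('c-1', 'a-23')],
    3: [('b-17', 'g-2'), ('g-20', 'c-1')],
}
DF = pd.DataFrame({"id": [1, 2, 3],
                   "e1": ['a-5', 'c-1', 'g-2'],
                   "e2": ['c-1', 'x-9', 'g-20']})


def safe_shortest_path(edges, source, target):
    """Return (path, length), or (None, None) when no path can be found."""
    graph = nx.Graph(edges)
    try:
        path = nx.shortest_path(graph, source=source, target=target)
    except (nx.NetworkXNoPath, nx.NodeNotFound):
        # No route between the nodes, or one of them is not in this sentence's graph
        return None, None
    # Unweighted graph: length is the number of edges on the path
    return path, len(path) - 1


results = [safe_shortest_path(graph_edges[i], s, t)
           for i, s, t in zip(DF["id"], DF["e1"], DF["e2"])]
DF["path"] = [p for p, _ in results]
DF["length"] = [n for _, n in results]
print(DF)
```

With missing values instead of strings, rows without a path can be selected with `DF["length"].isna()` or dropped with `DF.dropna(subset=["length"])`.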