Events News Research CBS CBS Publications Bioinformatics
Staff Contact About Internal CBS CBS Other

Shortest path in graph

Description
The program is given as input a file containing connected nodes in a graph and a weight assigned to the edge between the nodes. The program shall answer the questions: Is there a path between two given nodes in the graph? If so, what is the shortest path ?
This is useful in a number of situations: Protein interaction, which proteins interact together, thereby discovering f.ex. new pathways. Social networks, who knows who, proving the "six degrees" theory.

Input and output
The program is given a graph file as input on the command line. A graph file is a tab delimited file, that looks like this (social network example):

Chris Workman	Peter Wad Sackett	1
Chris Workman	Søren Brunak	2
Carsten Friis	Sune Frankild	4
Each line contains an edge in the graph, given by the name of the two connected nodes, and a weight. All edges are two-way, i.e. if Chris knows Peter, then Peter knows Chris.
Having read the file and constructed the internal data structures, that represents the graph the program should ask for two nodes (names in this example) and find the shortest path between them, if any, and print out the result. Example output:
No node is named Anders Fjog
There is no path between Sune Frankild and Søren Brunak
Shortest path: Chris Workman -> Peter Wad Sackett, weight 1
Shortest path: Søren Brunak -> Chris Workman -> Peter Wad Sackett, weight 3

Examples of program execution:
shortest_path.pl graphfile.txt

Details
If there is more than one path between two nodes, then the shortest path should be selected. The length of the path is determined by the weight of the edge between two nodes. The weights are simply added together, smallest number wins. The program is a general purpose tool. The nature of the graphs and thereby the problems that can be solved, are changed by either giving all edges the same weight, or different weights. Even negative weights are possible, making certain edges hugely preferred.
You can see what Wikipedia has to say about the shortest path problem.. It will be good for inspiration.
Here is a list of human protein-protein interactions. Your program should detect that COL4A2 <=> Nid2 <=> ITGA5 <=> Vpr <=> IL12A are interacting like this and TPD52 and RasGRP3  do not interact.