Shortest path in Python (Breadth first search)

In this post, we’ll see an implementation of shortest path finding in a graph of connected nodes using Python. The shortest path will be found by traversing the graph in breadth first order.

As a related topic, see some common Python programming mistakes.

Breadth first search

Let’s consider the following graph. We will traverse it in breadth first order starting from node 0. Our goal will be to find node x.

First, we will traverse the nodes that are directly connected to 0. Those are {1, 2, 3}. If node x is part of {1, 2, 3}, we stop. If not, we continue traversing the graph.

Next, we consider the set of nodes that are connected to or previous set {1, 2, 3}. Those would be {4, 5, 6}. Nodes 4 and 5 are connected to node 1 and node 6 is connected to node 3. We look for node x again and then we stop becausre there aren’t any more nodes.

If the graph was larger, we would continue traversing the graph by considering the nodes connected to {4, 5, 6} and so on.

See that this order of traversal guarantees that we find the shortest path between node 0 and node x because we start by searching the nodes that are one edge away from node1, then those that are two edges distant, and so on.

Graph representation

We will represent our graph as a dictionary, mapping each node to the set of the nodes it is connected to.

For example, let’s consider the following graph.

In this graph, node 4 is connected to nodes 3, 5, and 6. Our graph dictionary would then have the following key: value pair:

graph[4] = {3, 5, 6}

We would have similar key: value pairs for each one of the nodes in the graph.

Shortest path function input and output

Function input

Our BFS function will take a graph dictionary, and two node ids (node1 and node2). For this tutorial, each graph will be identified using integer numbers (1, 2, etc).

Function output

The function will return a list of nodes that connect node1 and node2, starting with node1 and including node2: [node1, node_x, node_y, ..., node2]. This list will be the shortest path between node1 and node2.

If there is more than one possible shortest path, it will return any of them.

In case no path is found, it will return an empty list [].

from typing import Dict, List

def shortest_path(graph: Dict[int, set], node1: int, node2: int) -> List[int]:
     pass

Shortest path algorithm

Our algorithm starts by defining a list of possible paths. A “path” is a list of connected nodes. Initially, we have only one “path” possible: [node1], because we start traversing the graph from that node. We also define a set of previously visited nodes to avoid backtracking.

After these initial steps the algorithm does the following:

Take the next path from the list of paths.
- If all possible paths have been traversed, stop. No path was found.
Set the “current node” to the last node in the current path.
Search the goal node, node2, between the nodes that are connected to the current node.
- If node2 is connected to the current node, we have found path from node1 to node2. Stop.
If node2 isn’t connected to the current node, update the list of paths to traverse.
- Add a new path from node1 to each one of the connected nodes to traverse next.
Go back to step 1.

Shortest path implementation in Python

Finally, we have the implementation of the shortest path algorithm in Python.

def shortest_path(graph, node1, node2):
    path_list = [[node1]]
    path_index = 0
    # To keep track of previously visited nodes
    previous_nodes = {node1}

    if node1 == node2:
        return path_list[0]
        
    while path_index < len(path_list):
        current_path = path_list[path_index]
        last_node = current_path[-1]
        next_nodes = graph[last_node]
        # Search goal node
        if node2 in next_nodes:
            current_path.append(node2)
            return current_path
        # Add new paths
        for next_node in next_nodes:
            if not next_node in previous_nodes:
                new_path = current_path[:]
                new_path.append(next_node)
                path_list.append(new_path)
                # To avoid backtracking
                previous_nodes.add(next_node)
        # Continue to next path in list
        path_index += 1
    # No path is found
    return []

We return the trivial path [node1] for the case node1 == node2.

Variable path_index keeps track of the path that we’re currently following. We stop the loop when we reach the end of path_list.

At all times, we have a shortest path from node1 to last_node. Based on this path, we can find the path from node1 to node2 if node2 is connected to last_node.

The order in which new paths are added to path_list guarantees that we traverse the graph in breadth first order.

Verification

Let’s check our algorithm with the graph shared at the beginning of this post.

We create a graph dictionary.

graph = {}

graph[1] = {2, 5}
graph[2] = {1, 3, 5}
graph[3] = {2, 4}
graph[4] = {3, 5, 6}
graph[5] = {1, 2, 4}
graph[6] = {4}

Now, let’s find the shortest path from node 1 to node 6.

>>> shortest_path(graph, 1, 6)
[1, 5, 4, 6]