Intro to Breadth First Search (BFS)

When dealing with a graph it might be necessary to traverse the graph.
Graph traversal is a systemic method to traverse the graph.
Breadth first search is one of the most popular graph traversal algorithms that work very well for certain purposes.
Today, I am going to introduce breadth first search in this post.

How does breadth first search traverse?

BFS is a graph traversal algorithm to visit vertices from one starting point in a certain way.
How does it visit other nodes? It first visits its nearest neighbors and processes them before reaching any further nodes.
And it marks nodes as it visits so the same node won’t be visited more than once.
How it marks really depends on each implementation but I will use the following notation in this post.

UNDISCOVERED – node is not visited in the past
DISCOVERED – node is visited but not completely processed
PROCESSED – node is processed (you can do any custom process of the node if necessary)

BFS Example

Now, let’s take a look at the below example.
Let’s say we have a graph to traverse from node 1 using BFS.

Here are the algorithm steps for BFS

  1. Start from node 1 (or any node you would like to start), look at neighbors of node 1 and save the node 2 and 3 (you typically use a queue for this purpose. Enqueue for save and dequeue for pop) for next processing. Mark node 1 processed, node 2 and 3 as discovered so they are not enqueued more than once.
  2. Dequeue node 2 and enqueue neighbors of it which are node 1 and 4 but only node 4 will be enqueued since node 1 is already processed. Mark node 2 processed and node 4 discovered.
  3. Dequeue node 3 and enqueue neighbors which are node 1 and node 4 but nothing will be enqueued since node 1 is already processed and node 4 is discovered – already enqueued. Mark node 3 processed
  4. Dequeue node 4 and enqueue neighbors which are the node 2, 3 and 5 but only node 5 will be enqueued since 2 and 3 are already processed. Mark node 4 processed
  5. Dequeue node 5 and mark it processed (nothing to enqueue – node 4 is already processed)

Since it might be hard to fully understand with the above algorithm let’s take a look at each step.

Starting from node 1 take a look at unvisited neighbors.
By the way, an unvisited node means its status is UNDISCOVERED.
Enqueue unvisited neighbor nodes and mark them DISCOVERED.
Now node 1 should be marked as PROCESSED.

breadth first search example

Now dequeue node 2 (it could be node 3 depending on the enqueue order but it is node 2 here because I enqueued it first) and check unvisited neighbors of the node.
Enqueue node 4 and mark it DISCOVERED.
Although node 1 is the neighbor of node 2 we don’t enqueue it because it’s already marked PROCESSED.
Node 2 should be marked as PROCESSED.

breadth first search example

Dequeue node 3 and look for any unvisited neighbor node.
It looks like node 3 has two neighbors – nodes 1 and 4.
However, we don’t enqueue any node because node 1 is marked PROCESSED and node 4 is marked DISCOVERED.
Mark node 3 as PROCESSED.

breadth first search example

Dequeue node 4 and look for any unvisited neighbor node which is node 5 only.
Enqueue node 5 and mark it DISCOVERED.
Node 4 is also processed.

breadth first search example

Dequeue node 5 and there isn’t any unvisited or processed nodes at the moment.
Mark node 5 as PROCESSED and breadth first search is complete.

breadth first search example

Internal Data Structure for BFS

BFS uses queue which is FIFO (First in First Out). As you can already imagine it processes all of the neighbor nodes of the current node before moving on.
Therefore it radiates out from the starting node slowly.

Where do you use BFS for?

Now you might wonder where breadth first search algorithm is used for?
There are many BFS applications but I will just list two common ones here.

  1. Shortest path of an unweighted graph. Due to its nature BFS guarantees to find the shortest path of the unweighted graph. All you need to do is keep track of the parent node.
  2. The number of connected components in a graph. All the nodes may be connected for some graphs but there may be a graph that may have multiple components that are not connected. BFS could be very useful to find the number of connected components in a graph

Code Example

Here I provided a code example, finding the shortest path from the node.

#include <iostream>
#include <queue>
#include <vector>
#include <algorithm>
using namespace std;

enum NodeStatus
{
    UNDISCOVERED,
    DISCOVERED,
    PROCESSED
};

struct Node
{
    // default status is UNDISCOVERED
    NodeStatus status;
    
    // node id which is index of vector in this case
    int nodeId;
    
    // parent node id
    int parentNodeId;
    
    // set of edges belong to this node
    // destination node id for this edge
    vector<int> edges;
};

// this is really for testing purpose.
// I initialized graph as the above example.
vector<Node> initializeGraph()
{
    vector<Node> nodes(5);
    for (int i = 0; i < 5; ++i)
    {
        nodes[i].status = UNDISCOVERED;
        nodes[i].nodeId = i;
    }
    
    // edges for node 0
    nodes[0].edges.push_back(1);
    nodes[0].edges.push_back(2);
    
    // edges for node 1
    nodes[1].edges.push_back(0);
    nodes[1].edges.push_back(3);
    
    // edges for node 2
    nodes[2].edges.push_back(0);
    nodes[2].edges.push_back(3);
    
    // edges for node 3
    nodes[3].edges.push_back(1);
    nodes[3].edges.push_back(2);
    nodes[3].edges.push_back(4);
    
    // edges for node 4
    nodes[4].edges.push_back(3);
    
    return nodes;
}

// print each node as traversing
void bfs(vector<Node> &graph)
{
    queue<int> q;
    q.push(graph[0].nodeId);
    
    while (!q.empty())
    {
        int currentNodeId = q.front();
        q.pop();
        
        // mark it processed first to prevent enqueuing itself
        // for self-loop case
        graph[currentNodeId].status = PROCESSED;
        
        for_each(
            graph[currentNodeId].edges.begin(), 
            graph[currentNodeId].edges.end(),
            [&graph, &q, currentNodeId](int destNodeId)
            {
                if (graph[destNodeId].status == UNDISCOVERED)
                {
                    q.push(destNodeId);
                    graph[destNodeId].status = DISCOVERED;
                    graph[destNodeId].parentNodeId = currentNodeId;
                }
                
                // you can do any edge processing you want here
            });
            
        //cout << "processed node[" << currentNodeId << "]" << endl;
    }
}

void printShortestPath(int currentNodeId, const vector<Node> &graph)
{
    if (currentNodeId == 0)
    {
        cout << "root node[" << currentNodeId << "]" << endl;
    }
    else
    {
        cout << "node[" << currentNodeId << "]" << endl;
        printShortestPath(graph[currentNodeId].parentNodeId, graph);
    }
}

int main() 
{	
	vector<Node> graph = initializeGraph();

	bfs(graph);

	// print shortest path from node 0 -> node 4
	printShortestPath(graph.size() - 1, graph);
	return 0;
}

Conclusion

Breadth first search is a popular algorithm that is very good for certain applications such as finding the shortest path of an unweighted graph.
There is another graph traversal algorithm called depth first search which I will discuss in another post.

Leave a Reply

Your email address will not be published.