Minimum Spanning Tree – Kruskal’s Algorithm

This post explains what minimum spnning tree is and how to construct minimum spanning tree using Kruskal’s algorithm

What is minimum spanning tree

Given you have a list of nodes in a graph and weighted edges connecting the nodes. What is the minimum cost to connect the nodes that all the nodes are connected? For example, looking at the graph below, all the nodes are connected. However, some edges can be removed so that total weight of the graph can be minimized and still connecting all the nodes. Minimum spanning tree is a graph that all the nodes are connected with the minimum cost of edges.

There are some characteristics of minimum spanning tree.
1. Any subset of solution cannot have cycle. Cycle means there is a redundant connection that can be dropped
2. Need to span all nodes
3. connected

How to construct minimum spanning tree?

Given a list nodes and weighted edges, how to construct minimum spanning tree? In this section, we will use Kruskal’s algorithm – greedy approach. Here is the algorithm.

  1. Start with lists of nodes and edges
  2. At each step, add the cheapest edge to the tree
  3. Skip the edge if it creates the cycle

The algorithm itself is surprisingly simple and you might wonder if this actually results in a correct minimum spanning tree. Since greedy algorithm is typically wrong for many problems, it requires proof that the algorithm is correct. I will explain the proof later in the post.

Now, let’s look at the code. The key part of the algorithm is
1. Need to be able to select the cheapest edges, which requires the edges to be sorted based on the cost
2. Need to be able to detect if the given edge creates any cycle

#1 is fairly easy since you just need to sort the algorithm. But what about detecting a cycle? Would you you typical BFS or DFS for this? Running BFS or DFS every time you add the edge could be quite expensive as it will always result in quadratic time complexity. Union Find will be used in this algorithm since time complexity of the union find is practically O(n). Please refer to this post if you are unfamiliar with it. Pseudo code is used for simplicity

// n is a list of nodes from id 0 to n-1
n = [0,1,2,...n-1]

// edges is a list of edges. u is source, v is dest, w is weight
// assume any n log n sorting algorithm is used to sort the edges in ascending order of w
edges = [(u,v,w)...]

// parents are sizes are used for union find. Refer to the union find post above
parents = [0,1,2,...,n-1]
sizes = [1,1,1,...1]

total_cost = 0
num_components = n
mst = []

for u,v,w in edges:
    rootU = find(u)
    rootV = find(v)
    // if rootU and rootV are same, adding this edge will create a cycle
    if rootU != rootV:
        if sizes[rootU] < sizes[rootV]:
            parents[rootU] = rootV
            sizes[rootV] += sizes[rootU]
            parents[rootV] = rootU
            sizes[rootU] += sizes[rootV]

        // decrease every time connecting different components. This will check if all the nodes are connected
        // keeping track of the total cost
        total_cost += w
        // edges used in minimum spanning tree

// not all nodes are connected 
if num_components != 1:
    return -1
// found minimum spanning tree
print (total_cost, mst)

Following things are happening in the above code.

  1. Iterating all the edges which is sorted in ascending order of weights
  2. Using union find, check if adding the edge will create any cycle
  3. If creating a cycle, skip
  4. else add the edge to the graph
  5. If all the nodes are connected (checked by num_components), minimum spanning tree is found

Time complexity of the algorithm:
1. O(n log n) for sorting the edges. Assume the edges are sparse
2. O(n) running the edges using union find
Total Time complexity is O(n log n)

Proof of Kruskal’s algorithm

I will prove the algorithm using proof by contradiction.

Suppose I constructed minimum spanning tree using Kruskal’s algorithm (greedy approach) – I will call this K. And suppose there is another minimum spanning tree that has less total cost – I will call this B. Now, looking at all the edges used in both K and B, it will look like this
K1, K2, K3, K(n-1) edges for graph K, sorted in ascending order
B1, B2, B3, B(n-1) edges for graph B, sorted in ascending order

Since we are claiming graph B is better than graph K, there will be some i-th edge that is different between K and B. Let’s call that K(i) and B(i). It means all the edges until i-1 are the same. There are two cases possible.

Case 1. K(i) > B(i)
At K(i), let’s say the edge value is 12 and edge value of B(i) is 9. Is this actually possible? Up to this point, all the edges are same. Now given I add edge in increasing order, the fact K(i) edge value 12 and B(i) edge value 9 means I didn’t add edge 9 to K. What would be the case I didn’t add it? The only case I skip the edge is when it creates a cycle. If you added all the edges up to this point and add edge 9, it creates a cycle which means the edge is redundant. Therefore, this case is not possible.

Case 2. K(i) < B(i)
Now let’s say edge value of K(i) is 9 and B(i) is 12 instead. Given all the edges up to this point exactly matching, adding edge 9 to K is the best choice because it doesn’t create a cycle. Since the edges are sorted in ascending order of weights, there isn’t any way graph B will be better than graph K. So adding edge 9 to K is the best choice indeed.

As you see in both cases, Kruskal’s algorithm actually works. The implementation is simple yet works perfectly using union find.