Kruskal’s Algorithm - Greedy Edge Selection for MST

The Minimum Spanning Tree Problem

Given a connected, weighted graph, find a tree that:

  1. Spans all vertices (connects everything)
  2. Minimizes total edge weight
  3. Is acyclic (it’s a tree)

A spanning tree uses exactly V-1 edges (for V vertices). The question is: which V-1 edges?

Kruskal’s Greedy Strategy

The algorithm is remarkably simple:

1. Sort all edges by weight (ascending)
2. Initialize empty tree T
3. For each edge (u,v) in sorted order:
   - If adding (u,v) to T doesn't create a cycle:
     - Add (u,v) to T
4. Return T when we have V-1 edges

Why Greedy Works Here

The Cut Property: For any cut (partition of vertices into two sets), the minimum weight edge crossing the cut is in some MST.

Proof sketch:

  • Suppose edge e is the minimum crossing the cut
  • If e is not in MST T, then T has some other edge e’ crossing the cut
  • Since e has smaller weight, we can swap e’ for e
  • This creates a tree T’ with smaller total weight
  • Contradiction! T wasn’t minimal.

Kruskal exploits this: Each time we add edge (u,v), it’s the minimum edge connecting u’s component to v’s component (a cut!).

The Cycle Detection Problem

The key operation: “Does adding (u,v) create a cycle?”

Equivalently: “Are u and v already connected?”

This is exactly what Union Find solves efficiently!

Simple Implementation

def kruskal(edges, num_vertices):
    """
    edges: list of (weight, u, v) tuples
    returns: list of edges in MST
    """
    # Sort edges by weight
    edges.sort()
    
    # Union-Find for cycle detection
    parent = list(range(num_vertices))
    rank = [0] * num_vertices
    
    def find(x):
        if parent[x] != x:
            parent[x] = find(parent[x])  # Path compression
        return parent[x]
    
    def union(x, y):
        px, py = find(x), find(y)
        if px == py:
            return False  # Already connected (would create cycle)
        # Union by rank
        if rank[px] < rank[py]:
            parent[px] = py
        elif rank[px] > rank[py]:
            parent[py] = px
        else:
            parent[py] = px
            rank[px] += 1
        return True
    
    mst = []
    for weight, u, v in edges:
        if union(u, v):  # If u and v not connected
            mst.append((weight, u, v))
            if len(mst) == num_vertices - 1:
                break  # Early termination
    
    return mst

Example Execution

Graph:
    A---1---B
    |       |
    4       2
    |       |
    C---3---D

Edges sorted: [(1,A,B), (2,B,D), (3,C,D), (4,A,C)]

Step 1: Add (1,A,B) - connects A and B
Step 2: Add (2,B,D) - connects {A,B} to D
Step 3: Add (3,C,D) - connects C to {A,B,D}
Step 4: Skip (4,A,C) - would create cycle (A and C already connected)

MST: {(1,A,B), (2,B,D), (3,C,D)} with total weight 6

Time Complexity Analysis

OperationComplexityCount
Sort edgesO(E log E)Once
Union-Find operationsO(α(V)) amortizedO(E) times
TotalO(E log E)

Note: O(E log E) = O(E log V) since E ≤ V² implies log E ≤ 2 log V.

The sorting dominates! Union-Find is nearly constant time with path compression and union by rank.

When Kruskal Beats Prim

Kruskal is better when:

  • Graph is sparse (E << V²)
  • Edges are already sorted or cheap to sort
  • You have edge list representation
  • Simple implementation is valued

Prim is better when:

  • Graph is dense (E ≈ V²)
  • You have adjacency matrix or list
  • You want to avoid sorting all edges

Both produce an MST; the choice is about efficiency for your graph structure.

Correctness: The Exchange Argument

Claim: Kruskal’s greedy choice is always safe.

Proof by exchange: Let A be edges chosen so far (acyclic, minimal weight so far). Let (u,v) be the next edge Kruskal adds.

If (u,v) is not in some optimal MST T*, then:

  1. Adding (u,v) to T* creates a cycle
  2. This cycle must contain another edge (x,y) connecting u’s component to v’s component
  3. Since we chose edges in order, weight(u,v) ≤ weight(x,y)
  4. We can swap (x,y) for (u,v) to get MST with weight ≤ T*
  5. Contradiction! T* wasn’t optimal.

Therefore, (u,v) is in some optimal MST.

Disconnected Graphs

If the graph is disconnected, Kruskal produces a minimum spanning forest - one MST per connected component.

def kruskal(edges, num_vertices):
    # ... same algorithm ...
    
    if len(mst) < num_vertices - 1:
        # Graph is disconnected
        num_components = num_vertices - len(mst)
        return mst, num_components
    return mst, 1  # Single component

Variants and Applications

Maximum Spanning Tree

Sort edges in descending order instead:

edges.sort(reverse=True)  # Largest first

Kruskal’s Reverse Delete Algorithm

Start with all edges, remove heaviest while maintaining connectivity (slower but interesting).

Real-world Applications

  1. Network design: Minimize cable length to connect all buildings
  2. Clustering: Remove k-1 heaviest edges to get k clusters
  3. Approximation algorithms: MST is used in traveling salesman approximations
  4. Image segmentation: Pixels as vertices, similarity as edge weights

The “Greedy on Global, Local Checks” Pattern

Kruskal exhibits a common algorithmic pattern:

  1. Sort globally (all edges by weight)
  2. Decide locally (check each edge independently)

This works because:

  • Sorting ensures we consider minimum weight edges first
  • Union-Find makes local decisions (cycle check) efficient
  • The cut property guarantees global optimality

Common Mistakes

  1. Forgetting to sort edges - breaks the greedy guarantee
  2. Using BFS/DFS for cycle detection - too slow (O(V) per edge instead of O(α(V)))
  3. Not checking edge count - could fail to terminate for disconnected graphs
  4. Wrong comparison in union by rank - should use rank, not tree size

Key Insight

Kruskal’s elegance is in its simplicity: sort edges, greedily add safe ones. The cut property guarantees optimality, and Union-Find makes it efficient. It’s a perfect example of how the right data structure (Union Find) enables an elegant algorithm.


Reconstruction: The cut property guarantees that the minimum crossing edge is always safe. Since Kruskal always picks the globally minimum unchecked edge, and Union-Find determines if it crosses a cut (connects components), the greedy choice is optimal.