Union-Find (Disjoint Set) Data Structure

One-Line Benefit

Supports efficient dynamic connectivity: both merging groups and checking connectivity run in nearly O(1), avoiding the O(n) update cost of naive set merging.

When to Use

  • Cycle Detection in undirected graphs (e.g., Kruskal’s MST - check if edge creates cycle)
  • Connected Components - grouping elements into disjoint sets
  • Dynamic connectivity / Equivalence relations - efficiently track transitive relationships (if AB and BC, then A~C) and answer “are X and Y connected?” after series of unions
  • Network reliability - detecting if networks remain connected after link failures, percolation problems

Time Complexity

  • find(): O(α(n)) amortized
  • union(): O(α(n)) amortized
  • α(n) is the Inverse Ackermann Function - grows extremely slowly (effectively constant for all practical values of n)

Why so fast?

  • Path compression flattens trees during find(), making future operations faster
  • Union by rank keeps trees balanced, limiting maximum depth
  • Combined, these optimizations achieve nearly O(1) amortized time per operation

Implementation

class UnionFind:
    def __init__(self, n):
        # For non-integer inputs, use a map: 
        # self.parent = {node: node for node in nodes} and
        # self.rank = {node: 0 for node in nodes}
        self.parent = list(range(n))
		
        # Rank is an upper bound on tree height 
        # (not exact height after path compression)
        # and directionality in this context is larger
        # rank has greater priority.
        #
        # Alternative strategy: union by size 
        # (track number of elements in each set)
        self.rank = [0] * n
        
        
    
    def find(self, x):
        if self.parent[x] != x:
            self.parent[x] = self.find(self.parent[x])
        return self.parent[x]
    
    def union(self, x, y):
        x = self.find(x)
        y = self.find(y)
        
        if x == y:
            return
        
        if self.rank[x] >= self.rank[y]:
            self.parent[y] = x
        else:
            self.parent[x] = y
        
        if self.rank[x] == self.rank[y]:
            self.rank[x] += 1

Key Insights

  • Path compression: Flattens tree structure by pointing nodes directly to root during find()
  • Union by rank: Keeps trees balanced by attaching smaller rank tree under larger rank tree
  • When ranks equal, increment rank of the new parent (tree gets one level deeper)
  • Initially all nodes are disjoint (each is its own parent)
  • Works only on undirected graphs: Union operation is symmetric (union(A,B) = union(B,A)), so it cannot track edge direction. For directed cycle detection, use DFS with recursion stack.

References