Picture the queue as a leading edge: nodes at distance d from the source enter before any node at distance d+1, so each dequeue step is the wavefront advancing by one unit. The queue is the wavefront, holding every node currently at the leading edge; enqueuing a neighbor means the wave will touch it next.
The analogy is mathematically precise, not just illustrative. It explains BFS’s optimality for unweighted shortest paths directly: because the algorithm visits all distance-d nodes before any distance-(d+1) node, the first time it reaches a target is necessarily along the shortest path. Dijkstra’s algorithm is the weighted generalization of this same principle, with a priority queue replacing the FIFO queue so that the wave expands at variable speed governed by edge weights.
The model also predicts where the technique works well outside graph search: image flood-fill, network routing, “people you may know” two hops out, percolation simulation, even ripple-effect propagation in cellular automata. Anywhere uniform spreading from a source is the right semantic, BFS is the natural fit. See DFS and BFS Are Complementary Search Strategies for what changes when you swap a queue for a stack, and Graph Search Is O(V+E)-Optimal for why O(V+E) is the floor.