Minimum Spanning Tree
Core Concepts
Spanning Tree — a subgraph of a connected undirected graph that includes all V vertices, is connected, and is acyclic; always has exactly V−1 edges.
Minimum Spanning Tree (MST) — a spanning tree whose total edge weight is minimal over all possible spanning trees; may not be unique when edge weights are not all distinct.
Spanning Forest — the generalization of a spanning tree to disconnected graphs; one spanning tree per connected component, total edges = V − (number of components).
Cut — a partition of the graph's vertices into two disjoint non-empty sets S and V\S; the cut-set is all edges crossing between S and V\S.
Cut Property (safe edge theorem) — for any cut that respects a partial MST (no MST edge crosses the cut), the minimum-weight edge crossing that cut is safe to add to the MST. This is the correctness foundation of all greedy MST algorithms.
Cycle Property — the maximum-weight edge in any cycle is never in any MST (assuming unique weights); if weights are equal, at least one such maximum edge is excluded.
Light Edge — the minimum-weight edge crossing a given cut; guaranteed to be in some MST by the cut property.
Safe Edge — any edge that can be added to the current partial MST without violating the MST property; the cut property identifies safe edges.
MST Uniqueness — an MST is unique if and only if all edge weights are distinct; with ties, multiple MSTs can exist but all share the same total weight.
Bottleneck Path — the path between two vertices in the MST minimizes the maximum edge weight over all paths; the MST doubles as a minimum bottleneck spanning tree.
Types / Variants
| Variant | What it is | When to use | Key property |
|---|---|---|---|
| Minimum Spanning Tree | Spanning tree of minimum total weight | Default MST problem | Greedy correctness via cut property |
| Maximum Spanning Tree | Spanning tree of maximum total weight | Maximize minimum edge on any path; maximize reliability | Negate all weights and run standard MST, or sort edges descending |
| Minimum Spanning Forest | MST generalized to disconnected graphs | Graph is not guaranteed to be connected; k-connected-components result | Kruskal's naturally handles this; just stop when all components are spanned |
| Second Minimum Spanning Tree | MST with second-lowest total weight | "What if one edge is removed?" interview escalations | Enumerate: for each non-MST edge (u,v,w), swap it with the heaviest MST edge on path u→v; best swap gives second MST |
| Minimum Bottleneck Spanning Tree | Spanning tree minimizing the maximum edge weight | Minimize worst-case link in a network | Every MST is also a minimum bottleneck spanning tree (the converse is not true) |
| Steiner Tree | Minimum-weight connected subgraph spanning a required subset of vertices | Only a subset of vertices must be connected | NP-hard in general; polynomial only when the required set is small (bitmask DP on small k) |
Key Algorithms
Kruskal's Algorithm
Builds the MST by greedily adding the globally cheapest edge that does not form a cycle. Processes edges in non-decreasing weight order; uses Union-Find to detect cycles in O(α(V)) per edge.
Time: O(E log E) dominated by sorting. Space: O(V + E).
Key insight: sorting ensures each edge examined is a candidate light edge for the cut separating its two components; Union-Find rejects it if both endpoints are already in the same component (cycle).
edges.sort(key=lambda e: e[2])
for u, v, w in edges:
if find(u) != find(v): mst.append((u,v,w)); union(u, v)
Natural choice when E << V² (sparse graphs) and when edges are given as an explicit list. Produces minimum spanning forest automatically — just run on a disconnected graph.
Prim's Algorithm (Lazy Heap)
Grows the MST from a start vertex by repeatedly adding the cheapest edge that connects the current tree to a new vertex. Uses a min-heap of (weight, vertex) pairs; may process stale entries.
Time: O(E log E) with lazy deletion. Space: O(V + E).
Key insight: maintains a "frontier" cut between tree vertices and non-tree vertices; the heap always yields the light edge across this cut.
heap = [(0, start)]
while heap:
w, u = heappop(heap)
if visited[u]: continue
visited[u] = True
for v, wt in adj[u]:
if not visited[v]: heappush(heap, (wt, v))
Prim's Algorithm (Eager / Dense Graph)
Instead of a heap, maintains a key[] array where key[v] = cheapest known edge connecting v to the current tree. Updates key values in-place; scans all vertices to find the minimum.
Time: O(V²) — better than O(E log E) when E = O(V²) (dense graphs). Space: O(V).
Key insight: for dense graphs (E ≈ V²), the linear scan over V vertices per iteration is cheaper than heap operations over E edges.
Use this variant when the graph is given as an adjacency matrix or when V ≤ 1000 and E is dense.
Borůvka's Algorithm
Each component simultaneously selects its cheapest outgoing edge and merges; repeats until one component remains. Each phase halves the number of components.
Time: O(E log V). Space: O(V + E).
Key insight: O(log V) phases, each doing O(E) work to find the cheapest edge per component. Naturally parallelizable and works well when edges can be evaluated in bulk.
# each phase: for each component find min-weight outgoing edge, then union all
Less common in competitive programming than Kruskal's but appears in parallel/distributed MST problems.
Reverse-Delete Algorithm
Start with all edges; repeatedly remove the heaviest edge if the graph remains connected. Produces the MST.
Time: O(E log E · α(E, V)) — slow in practice. Space: O(V + E).
Rarely used in practice; included for theoretical completeness. Demonstrates the cycle property directly.
Advanced Techniques
Second Minimum Spanning Tree via Edge Swap
Problem class: "Find the MST if one edge is forced in/out" or "find the spanning tree with second-lowest weight."
Mechanism: Build the MST. For each non-tree edge (u, v, w), the alternative tree formed by adding this edge and removing the maximum-weight edge on the MST path u→v has cost MST_cost − max_path_weight + w. Track the minimum such delta. Requires finding the heaviest edge on any tree path efficiently.
When to use over simpler alternatives: Only when the problem explicitly asks for the second MST or a constrained MST swap. Standard MST problems do not need this.
Path maximum query: Preprocess the MST with binary lifting (sparse table on the tree path) to answer "heaviest edge on path u→v" in O(log V). Total: O(E log V).
Minimum Spanning Tree on Implicit/Dense Graphs
Problem class: Complete graphs where edges are defined by a formula (e.g., Manhattan distance between all pairs of points).
Mechanism: For Manhattan distance MST, reduce to O(n log n) candidate edges using coordinate compression and sorted sweeps — the true MST only uses edges between geometrically close neighbors. For Euclidean MST, Delaunay triangulation provides O(n) candidate edges.
When to use: When naively generating all O(V²) edges is too slow and coordinates or a metric structure exist.
Kruskal's with Offline LCA for Path Queries (Kruskal's Reconstruction Tree)
Problem class: "Minimum bottleneck path between all pairs" or "for each query (u, v, threshold), are u and v connected using only edges of weight ≤ threshold?"
Mechanism: Build a virtual binary tree (Kruskal's reconstruction tree) where each merge in Kruskal's creates an internal node with weight = the merged edge. The LCA of two leaf nodes is the internal node whose weight equals the bottleneck of the path between them. Answer each query by finding LCA.
When to use over simpler alternatives: Only when multiple offline bottleneck-path queries exist. A single query is answered with binary search on the MST path using binary lifting. The reconstruction tree amortizes LCA preprocessing across all queries in O((V + Q) log V).
Minimum Spanning Forest with Union-Find (Disconnected Graphs)
Problem class: Graph may be disconnected; connect as many components as possible under a budget, or find each component's MST.
Mechanism: Run Kruskal's unmodified; edges within a component are handled naturally; cross-component edges are never added. The result is the minimum spanning forest.
When to use: Whenever the problem says "graph may not be connected" or asks for minimum cost to connect all components given constraints.
Problem Taxonomy
By Problem Category
| Category | What it asks | Key technique |
|---|---|---|
| Standard MST | Minimum cost to connect all nodes | Kruskal's or Prim's |
| Maximum MST | Maximum cost spanning tree / maximize minimum edge on any path | Negate weights + MST or sort descending |
| Minimum connection cost with constraints | Connect subset of nodes or restrict which edges can be used | Kruskal's with filtered edge set or Steiner tree (small k) |
| Path bottleneck query | Minimum of the maximum edge weight on path u→v | MST + binary lifting path max query |
| Second MST / forced edge | Second cheapest spanning tree | Edge swap over non-tree edges with path max |
| Network reliability | Maximize probability product on spanning tree | Maximum spanning tree (treat log-probability or reliability as weight) |
| Cluster / group connectivity | Connect groups at minimum inter-group cost | Contract groups to super-nodes; run MST on contracted graph |
| Online dynamic connectivity | Add/remove edges; maintain MST | Link-Cut Tree (advanced, rarely asked in interviews) |
Problem Signal → Technique
| Signal in problem statement | Likely technique |
|---|---|
| "Minimum cost to connect all N nodes" | Kruskal's or Prim's (standard MST) |
| "Connect all cities / computers / islands" | Kruskal's (sparse), Prim's (dense/matrix input) |
| "Minimize the maximum edge used" | MST (bottleneck property) |
| "Maximize the minimum edge on any path" | Maximum spanning tree |
| "Points in 2D plane, minimize total distance" | Euclidean/Manhattan MST (Prim's with all-pairs edges for small n) |
| "Already some edges fixed, add minimum cost" | Pre-union fixed edges, then run Kruskal's on remaining |
| "Cost of connecting k groups" | Contract groups; MST on contracted graph |
| "Second cheapest network" | Second MST via edge swap |
| "Is path u→v possible with max weight threshold" | Kruskal's reconstruction tree + LCA |
| "Find critical and pseudo-critical edges in MST" | Include/exclude edge from MST and compare total weight |
Common Patterns
| Pattern | When it applies | Mechanism |
|---|---|---|
| Pre-union fixed edges | Some edges must be in the MST; others are optional | Union-Find the forced edges first; run Kruskal's on the rest, skipping edges already in same component |
| Contracted super-node | Nodes in a group are treated as a single entity | Union-Find group members; relabel edges to connect group representatives; run MST |
| Virtual node | Adding a zero-cost hub node to connect to multiple targets | Insert a node with 0-weight edges to each target; MST of augmented graph connects all targets via hub |
| Coordinate-based edge pruning | Generating all O(n²) edges is too slow for point-set MST | Sort by x, then y; only adjacent points in sorted order are MST candidates (for Manhattan distance) |
| Negate weights for maximum MST | Maximize total weight or maximize minimum bottleneck | Multiply all weights by −1 and run standard min-MST; or sort edges descending |
| Include/exclude edge to find critical edges | Determine if an edge is in every MST (critical) or some MST (pseudo-critical) | Remove edge and check if MST cost increases (critical); force-include edge and check if MST cost stays same (pseudo-critical) |
Edge Cases & Pitfalls
-
Disconnected graph treated as connected: Kruskal's loop ends with fewer than V−1 edges; if the problem requires a spanning tree (not forest), the answer is "impossible." Check
len(mst_edges) == V−1before returning. -
Duplicate edge weights causing non-unique MST: Algorithms still produce a valid MST, but the specific MST may differ between runs or implementations. Problems asking for MST uniqueness must check whether all edge weights are distinct or use the cycle/cut property to verify.
-
Off-by-one in Union-Find initialization: If nodes are 1-indexed and the parent array is 0-indexed (or vice versa),
find(0)becomes the root of an unintended component. Initializeparent[i] = ifor the correct range. -
Stale heap entries in lazy Prim's: The heap may contain outdated (weight, vertex) pairs for vertices already added to the tree. Skipping with
if visited[u]: continueis required; omitting it gives incorrect results and may process O(E) stale entries. -
Self-loops in the edge list: A self-loop (u, u, w) never contributes to a spanning tree. Union-Find's
find(u) == find(v)check correctly rejects self-loops in Kruskal's; no special case needed, but be aware they exist. -
Negative edge weights: Both Kruskal's and Prim's handle negative weights correctly — sort/heap ordering still works. No modification needed, unlike Dijkstra's.
-
Assuming MST equals shortest-path tree: MST minimizes total edge weight; shortest-path tree minimizes sum of distances from a source. They are different structures and solve different problems.
-
Using Prim's on disconnected graph without checking connectivity: Prim's grows from one seed vertex; if the graph is disconnected, unreachable vertices are never visited and their
keyvalues remain infinity. The result is a spanning tree for one component only. -
Integer overflow in total weight: With V up to 10⁵ and weights up to 10⁶, MST total can reach ~10¹¹. Use 64-bit integers (Python ints are arbitrary precision, so not an issue in Python, but relevant in C++/Java).
-
Forgetting to handle the case where E < V−1: If the graph has fewer edges than V−1, no spanning tree exists regardless of the algorithm.
Implementation Notes
- Python's
heapqis a min-heap; for maximum spanning tree with Prim's, push(-weight, vertex)to simulate a max-heap. heapq.heappushandheappopoperate on the first element of a tuple; use(weight, vertex)or(weight, vertex, parent)tuples consistently.- Union-Find with both path compression and union by rank achieves amortized O(α(V)) per operation; omitting union by rank degrades to O(log V); omitting path compression degrades to O(log V) amortized; omitting both degrades to O(V) in the worst case.
- Python's recursion limit (default 1000) means recursive Union-Find
findwill hitRecursionErroron large inputs; use iterative path compression. - For Prim's on a dense graph given as a distance matrix, the O(V²) implementation avoids heap overhead and is simpler to implement correctly.
collections.defaultdict(list)is the standard adjacency list in Python; initializing with a fixed-size list[[] for _ in range(V)]avoids key-error surprises on isolated vertices.- When edges are given as
(u, v, w)tuples and nodes are 0-indexed, ensure Union-Find parent array has size V (not V+1); 1-indexed inputs need size V+1.
Cross-Topic Relations
| Related topic | Specific connection |
|---|---|
| Union-Find (Disjoint Set Union) | Kruskal's algorithm requires Union-Find for cycle detection in O(α(V)); MST is the primary application driving Union-Find study. See union-find.md for full coverage. |
| Graphs (BFS/DFS) | MST is defined on graphs; graph traversal needed for Prim's visited tracking and for verifying connectivity before/after MST construction. |
| Heaps / Priority Queue | Prim's algorithm is heap-driven; heap efficiency directly determines Prim's time complexity. |
| Greedy Algorithms | Both Kruskal's and Prim's are greedy; correctness proofs rely on the cut property, a greedy exchange argument. |
| Binary Lifting / Sparse Table | Used to answer "maximum edge weight on MST path u→v" in O(log V) per query; required for second MST and bottleneck path queries. |
| Shortest Paths | Frequently confused with MST: Dijkstra's produces shortest-path tree (minimize path sum from source); Kruskal's/Prim's produce MST (minimize total tree weight). They coincide only on unit-weight graphs. |
| Segment Tree / LCT | Link-Cut Trees enable dynamic MST (edge insertions and deletions); overkill for standard interview problems but appears in competitive programming. |
| Dynamic Programming (Bitmask) | Steiner tree on k required terminals is solved with bitmask DP over subsets of terminals: dp[S][v] = min cost to connect subset S with v as the root. |
Interview Reference
Must-Know Problems
Standard MST Construction
- LeetCode 1584 — Min Cost to Connect All Points (Prim's on complete graph, n ≤ 1000)
- LeetCode 1135 — Connecting Cities With Minimum Cost (Kruskal's, straightforward)
- LeetCode 1489 — Find Critical and Pseudo-Critical Edges in Minimum Spanning Tree (include/exclude pattern)
MST with Constraints / Variants
- LeetCode 1168 — Optimize Water Distribution in a Village (virtual node trick: add a well-node with edge cost = individual well cost)
- LeetCode 1579 — Remove Max Number of Edges to Keep Graph Fully Traversable (two MSTs — one per traversal type — built simultaneously)
- LeetCode 1724 — Checking Existence of Edge Length Limited Paths (offline sort queries + Kruskal's; bottleneck reachability)
- LeetCode 1697 — Checking Existence of Edge Length Limited Paths (same pattern)
Maximum Spanning Tree / Bottleneck
- LeetCode 1631 — Path With Minimum Effort (minimax path = MST bottleneck; also solvable with binary search + BFS/DFS)
- LeetCode 778 — Swim in Rising Water (minimax path; Prim's / binary search + BFS)
Hard / Multi-Step
- LeetCode 1489 — Critical and pseudo-critical edges (exhaustive include/exclude over each edge)
- LeetCode 1928 — Minimum Cost to Reach Destination in Time (not pure MST; DP + graph, but shares bottleneck intuition)
Clarification Questions to Ask
- "Is the graph guaranteed to be connected, or do I need to handle a spanning forest?"
- "Are edge weights distinct? Does uniqueness of the MST matter for correctness?"
- "Can there be self-loops or multi-edges (parallel edges between the same pair of nodes)?"
- "Are nodes 0-indexed or 1-indexed in the input?"
- "Is the graph directed or undirected? MST is only defined for undirected graphs."
- "What should I return if no spanning tree exists (disconnected graph)?"
- "Are all edge weights non-negative? (Algorithms still work with negatives, but worth confirming for bottleneck problems.)"
Common Interview Mistakes
- Running Prim's without checking if the final MST covers all V vertices; silently returning a partial spanning tree on disconnected input.
- Using Dijkstra's when the problem asks for MST; conflating "minimum path sum" with "minimum spanning tree."
- Forgetting the
if visited[u]: continueguard in lazy Prim's, causing vertices to be processed multiple times and producing incorrect weight totals. - Reinitializing Union-Find with size V when nodes are 1-indexed up to V, causing index-out-of-bounds or root collisions at index 0.
- Sorting edges by weight ascending for minimum MST but forgetting to reverse/negate for maximum MST.
- For the virtual-node trick (e.g., water distribution), forgetting to add V to the total node count when sizing the Union-Find array.
- Returning the MST edge list instead of the total weight (or vice versa) without re-reading the output format.
- On "critical edges" problems, only testing removal (critical) and forgetting to also test force-inclusion (pseudo-critical).
Typical Follow-Up Escalations
- "Now solve it if some edges are mandatory (must be in the MST)." → Pre-union mandatory edges in Union-Find; run Kruskal's on remaining edges, skipping those whose endpoints are already connected.
- "What if you need the second minimum spanning tree?" → For each non-tree edge, compute cost of swapping it with the heaviest tree edge on its path (binary lifting); take the minimum delta.
- "What if the graph changes dynamically (edges added/removed)?" → Link-Cut Tree for O(log V) per operation; acknowledge this is beyond standard interview scope.
- "How would you find all critical edges (edges in every MST)?" → An edge is critical if removing it increases the MST cost; check using Tarjan's bridge-finding adapted for weighted graphs, or the include/exclude method.
- "Can you do this in O(E α(V)) instead of O(E log E)?" → Not generally possible due to the sort in Kruskal's; if edge weights are integers bounded by W, use counting sort to sort in O(E + W), then Kruskal's runs in O(E α(V)).
- "What if only k of the n nodes need to be connected?" → Steiner tree; NP-hard in general. For small k (k ≤ 15), bitmask DP:
dp[mask][v]= min cost tree spanning the terminals inmaskrooted atv.
Read Next
Start reading to get personalized recommendations
Explore Unread
Great job! You've read all available articles