Depth-First Search
Topic type: Technique-Based (Graph / Tree Traversal) LeetCode problems: 700+
Core Concepts
Depth-First Search (DFS) — A traversal strategy that explores as far down a branch as possible before backtracking. Uses a call stack (recursive) or explicit stack (iterative) to track the current path.
Stack discipline — DFS processes the most recently discovered unvisited neighbor first; LIFO ordering produces depth-first behavior. BFS uses FIFO (a queue) and produces breadth-first behavior.
Visited state — On graphs (which may have cycles), each node is marked visited on first encounter to prevent revisiting. On trees (acyclic), no visited set is needed.
Backtracking — When a node has no unvisited neighbors, DFS returns to the caller; any state accumulated for the current path must be undone at this point.
DFS tree / DFS forest — The subgraph of edges actually traversed during DFS. All remaining edges fall into: back edges (visited ancestor, indicate cycles in directed graphs), cross edges (between separate subtrees, directed graphs only), forward edges (ancestor to non-child descendant, directed only).
Discovery / finish timestamps — Assigning entry[v] on first visit and exit[v] after all of v's subtree is processed. The interval [entry[v], exit[v]] of a descendant is entirely nested inside that of its ancestor.
Recursion depth — DFS on a path graph of n nodes recurses n levels deep. Python's default recursion limit is 1000; switch to iterative DFS or call sys.setrecursionlimit on large inputs.
Structural Invariants
| Invariant | What it means | What breaks if violated |
|---|---|---|
| Mark visited before recursing | Set visited[v] = True before the recursive call, not after | Node pushed to stack multiple times; exponential work on dense graphs |
| Backtrack undo | Every mutation made on entry to a node must be reversed on exit | State leaks into sibling branches; path tracking, grid marking, counters become wrong |
| Ancestor interval nesting | For any ancestor a of v: entry[a] < entry[v] < exit[v] < exit[a] | Tarjan's low-link values and timestamp-based algorithms produce wrong answers |
| Stack LIFO for iterative DFS | Push neighbors in reverse order to match recursive visit order | Processing order differs from recursive variant; affects order-sensitive problems |
Types / Variants
| Variant | What it is | When to use over alternatives |
|---|---|---|
| Recursive DFS | Uses the implicit call stack | Default; clean for trees and small-depth graphs |
| Iterative DFS | Explicit stack; avoids recursion limit | Graphs with depth > ~500; when RecursionError is a risk |
| DFS with timestamps | Records entry/exit per node | Topological sort, SCC, bridges, articulation points |
| DFS with state (backtracking) | Mutates path state before recursing; restores on return | Enumeration: all paths, subsets, permutations |
| Multi-source DFS | Outer loop calls DFS on every unvisited node | Connected components, island counting, graph coloring |
| DFS on implicit graphs | Nodes generated on-the-fly via a neighbors(state) function | Word ladder, puzzle states, grid problems modeled as graphs |
Traversals & Access Patterns
Pre-order (process before recursing) — Node is handled on first visit, before any children. Use: serializing a tree, path prefix recording, detecting problems early in a subtree.
Post-order (process after recursing) — Node is handled after all descendants finish. Use: subtree aggregates (height, size, sum), topological sort (reversed post-order), SCC pop order.
In-order (left → node → right) — Binary trees only. Produces sorted order on BSTs; rarely useful for arbitrary graphs.
DFS with entry + exit hooks — Run one action at discovery and another at finish. Use: timestamp assignment, gray/black coloring for directed cycle detection (gray = in current stack, black = done).
Iterative post-order:
stack = [(node, False)]
while stack:
v, done = stack.pop()
if done:
# post-order action on v
else:
stack.append((v, True))
for w in children(v):
stack.append((w, False))
Euler tour / DFS flattening — Assign in[v] and out[v] during DFS so the subtree of v maps to the contiguous range [in[v], out[v]] in a flat array. Converts subtree queries into range queries answerable by a segment tree or BIT.
See trees.md for tree-specific traversal patterns. See graphs.md for graph representation (adjacency list vs. matrix) details.
Key Algorithms
Connected Components (Undirected Graph)
Iterate over all nodes; for each unvisited node, run DFS and label all reachable nodes with the same component ID. O(V + E) time, O(V) space.
Key insight: a single DFS from a source visits exactly its connected component — no more, no less.
Cycle Detection — Undirected Graph
During DFS, if a visited neighbor is found that is not the direct parent of the current node, a back edge (and therefore a cycle) exists. O(V + E).
Key insight: in an undirected DFS tree, every non-tree edge is a back edge; the only way a visited neighbor is not a back edge is if it is the parent.
Cycle Detection — Directed Graph
Color nodes: white (unvisited), gray (on current DFS stack), black (fully processed). A back edge is detected when a gray node is re-encountered.
if color[v] == 'gray': # back edge → cycle
O(V + E). Do not use the undirected parent-check here — a black node reached via a cross edge is not a cycle.
Topological Sort (DFS-based)
Post-order DFS: append each node to a result list after all nodes it points to have finished; reverse the list. O(V + E).
Key insight: a node finishes only after everything it depends on finishes, so reversed post-order is a valid dependency ordering.
Flood Fill / Island Counting
For each unvisited cell matching the target, DFS marks all connected matching cells visited. Count the number of DFS initiations. O(R × C) for an R×C grid.
Key insight: define adjacency as 4-directional (or 8-directional per problem) moves within bounds.
Bipartite Check (2-Coloring)
Assign alternating colors to neighbors during DFS. If a neighbor already holds the same color as the current node, the graph is not bipartite. O(V + E).
All Paths from Source to Target
DFS with backtracking: push the current node onto a path list, recurse into each neighbor, pop on return. Record path when target is reached. Worst case exponential (all paths in a complete DAG).
Bridges (Cut Edges) — Tarjan's
Edge (u, v) is a bridge if removing it disconnects the graph. Condition: low[v] > disc[u] where low[v] is the minimum discovery time reachable from v's subtree via at most one back edge. O(V + E).
if low[v] > disc[u]: # (u, v) is a bridge
Articulation Points (Cut Vertices) — Tarjan's
Node u is an articulation point if: (1) u is the DFS root and has ≥ 2 children, or (2) u is non-root and has a child v with low[v] >= disc[u]. O(V + E).
Note the >= vs. > difference from bridges — a single character that is frequently confused.
Strongly Connected Components — Kosaraju's
Two-pass DFS: (1) DFS on original graph, push nodes to a stack in finish order; (2) DFS on the transposed graph in reverse finish order — each DFS tree in pass 2 is one SCC. O(V + E).
Strongly Connected Components — Tarjan's
Single-pass DFS with a node stack. Node v is an SCC root when low[v] == disc[v]; pop the stack up to and including v to extract the SCC. O(V + E).
Key insight: Tarjan's unifies SCC detection with the low-link computation used for bridges and articulation points — one traversal does everything.
Euler Path / Circuit — Hierholzer's
DFS variant: recurse until no outgoing edges remain, then push the node to the result. Reverse the result. Precondition for circuit (undirected): all vertices have even degree. For path: exactly 0 or 2 odd-degree vertices. O(V + E).
Advanced Techniques
Tarjan's Low-Link Framework
Solves: bridges, articulation points, SCCs — all in a single DFS pass.
Mechanism: Maintain disc[v] (discovery time) and low[v] (minimum disc reachable from v's subtree via at most one back edge). Update: low[u] = min(low[u], low[v]) for tree edges; low[u] = min(low[u], disc[v]) for back edges.
When to prefer over simpler alternatives: Any problem asking about structural graph properties (cut edges, cut vertices, SCC). Do not apply to simple cycle detection — gray/black coloring is simpler and sufficient.
Complexity: O(V + E) time, O(V) space.
DFS + Memoization (Top-Down DP on Graphs / Trees)
Solves: Longest path in a DAG, count of paths, tree DP (subtree max, rerooting), grid DP where the recurrence flows along edges. Mechanism: Cache DFS return values keyed by node (and additional state). On DAGs, each subproblem is computed once. On trees, post-order gives the natural computation order so memoization is implicit. When to prefer over bottom-up DP: When the recursion structure is natural (trees, sparse DAGs) and the state space is irregular. For dense layered DAGs, bottom-up in topological order is more cache-friendly. Complexity: O(V × S) where S is the size of additional state per node.
Euler Path / Circuit Detection via Hierholzer's DFS
Solves: Problems requiring traversal of every edge exactly once (distinct from Hamiltonian path which covers every node — NP-hard). Mechanism: DFS that appends a node to the result only when it has no remaining outgoing edges; reverse the result. Start from a node with odd degree (path) or any node (circuit). When to prefer: Only when the problem explicitly asks about edge coverage. Node-coverage problems are a different category entirely. Complexity: O(E) with an adjacency list that supports O(1) edge removal (use index pointer, not repeated deletion).
DFS on Implicit State Graphs
Solves: Puzzle problems (sliding puzzle, word transformations, number games) where states are too numerous to enumerate upfront.
Mechanism: Define neighbors(state) that generates valid transitions. DFS with a visited set on the state space. If optimality (minimum steps) is required, switch to BFS.
When to prefer over BFS: When the problem asks for existence, enumeration, or a depth-bounded search — not shortest path. BFS guarantees level-order; DFS does not.
Complexity: O(S × B) where S = number of reachable states, B = branching factor.
DSU on Tree (Small-to-Large Merging)
Solves: For each node, aggregate information over its entire subtree (e.g., count distinct values, find most frequent element) efficiently. Mechanism: DFS processes the heavy child (largest subtree) last and inherits its accumulated data structure without copying. Light children contribute and are discarded. Each node becomes a "light child" at most O(log n) times up the tree. When to prefer over brute-force O(n²): When the subtree aggregate requires maintaining a non-trivial data structure (frequency map, sorted set) and n > 10^4. Complexity: O(n log n) or O(n log² n) with a sorted structure.
Problem Taxonomy
By Problem Category
| Category | What it asks | Key technique |
|---|---|---|
| Connectivity | Are nodes reachable from each other? | Multi-source DFS; union-find for static queries |
| Component counting | How many connected groups exist? | DFS loop over all unvisited nodes |
| Path existence / enumeration | Does a path exist? Enumerate all paths | DFS backtracking |
| Cycle detection | Is there a cycle? | Directed: gray/black coloring; Undirected: non-parent visited neighbor |
| Topological ordering | Valid ordering of dependencies | Post-order DFS reversed; or Kahn's BFS |
| Tree aggregation | Subtree sums, heights, sizes, diameters | Post-order DFS with returned values |
| Tree rerooting | Compute answer treating each node as root | Two DFS passes (down then up) |
| Structural graph properties | Bridges, articulation points, SCCs | Tarjan's low-link DFS |
| Grid traversal | Islands, regions, flood fill | DFS with 4/8-directional adjacency |
| Combinatorial search | All subsets, permutations, combinations | DFS backtracking with pruning |
| DAG path optimization | Longest/shortest path, path count in DAG | DFS + memoization |
Problem Signal → Technique
| Signal in problem statement | Likely technique |
|---|---|
| "number of islands", "connected regions", "number of provinces" | Multi-source DFS, flood fill |
| "all possible paths from source to target" | DFS backtracking |
| "detect cycle", "can finish all courses" | Directed: gray/black DFS; Undirected: non-parent check |
| "course schedule", "build order", "prerequisites" | Topological sort via post-order DFS |
| "critical connections", "bridges" | Tarjan's bridge algorithm |
| "strongly connected components" | Tarjan's or Kosaraju's |
| "reconstruct itinerary", "find Euler path" | Hierholzer's DFS |
| "diameter of tree", "longest path in tree" | Two DFS or post-order with global max |
| "subtree" + aggregate (sum, count, max) | Post-order DFS |
| "clone graph" | DFS with {original: copy} hashmap |
| "word search", "path exists in grid" | DFS with in-place visited marking |
| "combinations", "subsets", "permutations" | DFS backtracking |
| "serialize / deserialize tree" | Pre-order DFS |
Common Patterns
| Pattern | When it applies | Mechanism |
|---|---|---|
| Visited set | Any graph with possible cycles | visited.add(v) before recursing; if v in visited: return at top |
| In-place grid marking | Grid problems where cells can be mutated | Overwrite cell with sentinel ('#'); restore only if backtracking is needed |
| Backtrack restore | Enumeration: all paths, subsets, permutations | path.append(x); dfs(); path.pop() |
| Post-order accumulation | Subtree aggregates | Recurse first, combine children's return values at the current node |
| Two-pass DFS (rerooting) | Answer changes based on which node is root | Pass 1: compute subtree info bottom-up; Pass 2: propagate top-down using subtraction trick |
| Coloring (white/gray/black) | Directed cycle detection, topological sort | Gray = on current path; black = subtree fully processed |
| Iterative post-order | Post-order needed but recursion depth is risky | Push (v, False); when popped as True, execute post-order action |
| DFS on complement graph | Dense graphs where non-edges outnumber edges | Maintain a "remaining unvisited" set; iterate over non-neighbors |
Edge Cases & Pitfalls
-
Marking visited after instead of before recursing — On graphs with multiple edges to the same node, the node is pushed to the stack multiple times before any instance executes, causing exponential work. Always mark visited before the recursive call or push.
-
Parent-node vs. parent-edge check for undirected multigraphs —
if neighbor != parentfails when two separate edges connect u and v. A second edge to the parent is a legitimate back edge. Fix: track the parent edge's index, not just the parent node. -
Forgetting to backtrack mutable state — When DFS mutates a path list, counter, or grid cell and does not restore it after returning, sibling branches inherit corrupted state. Always pair mutation with restoration; check by tracing a path that fails to reach the target.
-
Applying undirected cycle detection to directed graphs — A visited non-parent neighbor in a directed graph may be reached via a cross edge (not a cycle). Directed cycle detection requires gray/black coloring; a cycle exists only when a gray node is re-encountered.
-
Disconnected graphs — DFS from a single source does not reach all nodes. Always wrap DFS in a loop over all nodes to handle multiple components.
-
Python recursion limit — Default is 1000. Graphs with depth > ~500 raise
RecursionError. Fix:sys.setrecursionlimit(n + 100)at script start, or convert to iterative DFS. -
Iterative DFS push order — Pushing neighbors left-to-right processes them right-to-left (LIFO). If the problem requires left-to-right processing (e.g., lexicographic smallest path), push neighbors in reversed order.
-
Bridge vs. articulation point condition — Bridge:
low[v] > disc[u]; articulation point:low[v] >= disc[u]. The>=vs.>difference is a single character and is the most common mistake in Tarjan's implementations. -
Empty graph or isolated nodes — DFS returns immediately; verify the outer loop correctly initializes component count to 0 and increments for each DFS call made.
-
Rerooting subtraction error — When moving the root from
uto childv, the portion ofu's aggregate that came fromv's subtree must be subtracted before addingu's updated value tov. Wrong subtraction order produces incorrect answers for rerooting problems.
Implementation Notes
sys.setrecursionlimitmust be called before the DFS function is first invoked; placing it inside the function is redundant after the first call but harmless.defaultdict(list)is the standard adjacency list; explicitly add all nodes (even isolated ones) if the problem has vertices with no edges.- Grid DFS: define
dirs = [(0,1),(0,-1),(1,0),(-1,0)]once and loop, rather than writing four separate recursive calls. - For undirected graphs, each edge appears twice in the adjacency list (once per direction); this is correct and not a bug.
- In-place grid marking for flood fill: do not restore the sentinel — restoration is only needed when the DFS is a backtracking search over multiple candidate paths.
lowanddiscarrays in Tarjan's must be indexed up to the maximum node label; if nodes are labeled by arbitrary integers (not 0-indexed), use a dict.- The iterative
(node, processed)tuple pattern is the most portable way to get post-order in Python when recursion depth is a concern.
Cross-Topic Relations
| Related topic | Specific connection |
|---|---|
| BFS | Alternative traversal; BFS guarantees shortest path (unweighted); DFS is preferred for path enumeration and topological sort |
| Backtracking | Backtracking is DFS with state-restore on return; the distinction is emphasis, not mechanics |
| Dynamic Programming | Top-down DP on trees/DAGs is DFS + memoization; post-order DFS provides the natural subproblem evaluation order |
| Union-Find | Alternative to DFS for static connectivity queries; DFS is needed when path details or ordering matter |
| Topological Sort | DFS post-order is one of two standard algorithms; Kahn's (BFS + in-degree) is the other and handles tie-breaking more directly |
| Trees | DFS is the default tree traversal; all tree DP and structural tree problems use DFS |
| Graphs | DFS is a fundamental graph primitive; SCC, bridge, articulation-point algorithms are all DFS-based |
| Segment Tree / BIT | Euler tour (DFS in/out indices) converts subtree queries into range queries, enabling O(log n) subtree aggregates |
| Binary Search | Combined in problems that binary-search on an answer and use DFS to check feasibility at each candidate value |
Interview Reference
Must-Know Problems
Graph Connectivity
- Number of Islands (LC 200) — grid multi-source DFS
- Number of Provinces (LC 547) — adjacency matrix DFS
- Clone Graph (LC 133) — DFS with
{original: copy}hashmap - Max Area of Island (LC 695) — flood fill returning subtree size
Cycle Detection & Ordering
- Course Schedule (LC 207) — directed cycle detection
- Course Schedule II (LC 210) — topological sort output
- Find Eventual Safe States (LC 802) — reverse topological reasoning
Path Problems
- Path Sum II (LC 113) — all root-to-leaf paths, backtracking
- All Paths From Source to Target (LC 797) — DAG all-paths DFS
- Word Search (LC 79) — grid DFS with backtracking
- Longest Increasing Path in a Matrix (LC 329) — DFS + memoization on implicit DAG
Tree Aggregation
- Maximum Depth of Binary Tree (LC 104) — post-order DFS
- Diameter of Binary Tree (LC 543) — post-order with global max
- Binary Tree Maximum Path Sum (LC 124) — post-order with global max
- Lowest Common Ancestor (LC 236) — post-order detection
- Serialize and Deserialize Binary Tree (LC 297) — pre-order DFS
Structural Graph Properties
- Critical Connections in a Network (LC 1192) — Tarjan's bridges
- Redundant Connection (LC 684) — cycle detection in undirected graph
Combinatorial Search
- Subsets (LC 78) / Subsets II (LC 90) — DFS backtracking
- Permutations (LC 46) / Permutations II (LC 47) — DFS backtracking
- Combination Sum (LC 39) — DFS with pruning
- N-Queens (LC 51) — DFS with constraint propagation
Advanced
- Reconstruct Itinerary (LC 332) — Euler path via Hierholzer's DFS
- Longest Path With Different Adjacent Characters (LC 2246) — post-order tree DFS
- Count Good Nodes in Binary Tree (LC 1448) — DFS with running max passed down
Additional LeetCode Coverage
- N-ary Tree Preorder Traversal (LC 589)
- N-ary Tree Postorder Traversal (LC 590)
Clarification Questions to Ask
- Is the graph directed or undirected? (Determines cycle detection method and SCC applicability.)
- Can there be self-loops or multiple edges between the same pair of nodes? (Affects parent-check correctness.)
- Is the graph guaranteed to be connected, or must disconnected components be handled?
- For grid problems: is diagonal movement allowed? (4-directional vs. 8-directional adjacency.)
- What constitutes a "path" — must it be simple (no repeated nodes or edges)?
- Are edge weights present? (DFS handles unweighted reachability; weighted shortest path requires Dijkstra or Bellman-Ford.)
- What are the input size constraints? (n > 500 on a path graph → iterative DFS or recursion limit increase.)
Common Interview Mistakes
- Using BFS when DFS is required (or vice versa) without explaining the tradeoff — state why one is chosen over the other.
- Forgetting to mark visited before the recursive call, causing TLE on dense or looped graphs.
- Using the parent-node check for undirected multigraphs instead of the parent-edge index.
- Writing backtracking that mutates but forgets to restore — always trace through a path that fails to verify the undo executes.
- Applying topological sort without first verifying or handling the case where the graph contains a cycle.
- Confusing
low[v] > disc[u](bridge) withlow[v] >= disc[u](articulation point) — wrong condition produces incorrect structural results. - Running DFS from only one source and reporting "no cycle" or "fully visited" without covering disconnected components.
Typical Follow-Up Escalations
- "Now return the actual path, not just whether one exists" → add a path list with backtrack restore.
- "Now do it without recursion" → iterative DFS with
(node, processed)tuples for post-order. - "The graph can have 10^6 nodes in a chain" → iterative DFS or
sys.setrecursionlimit; explain the trade-off. - "Now find all bridges / articulation points" → extend the DFS with Tarjan's
discandlowarrays. - "What if multiple valid topological orderings exist and you want the lexicographically smallest?" → switch to Kahn's algorithm with a min-heap.
- "Now find the shortest path instead" → switch to BFS; explain that DFS does not guarantee shortest path in general graphs.
Read Next
Start reading to get personalized recommendations
Explore Unread
Great job! You've read all available articles