Trees
Core Concepts
Node — Basic unit; holds data and references to children.
Edge — Connection between a parent and child node.
Root — Topmost node; has no parent.
Leaf — Node with no children.
Internal node — Non-leaf node.
Height of node — Longest edge-path from that node down to any leaf. Height of tree = height of root.
Depth of node — Edge-count from root to that node. Root depth = 0.
Level — depth + 1; root is level 1.
Degree — Number of children a node has.
Subtree — A node and all its descendants.
Ancestor / Descendant — All nodes on path from root to a node / all nodes in that node's subtree.
Forest — Collection of disjoint trees.
Width — Maximum number of nodes at any single level.
Path — Sequence of nodes connected by edges; no repeated nodes.
Branch — Any path from root to a leaf.
Key invariant: A tree with n nodes has exactly n − 1 edges.
Structural Invariants
Each tree type enforces invariants that every algorithm in that family silently relies on. Violating one corrupts all operations that depend on it.
| Tree Type | Invariant |
|---|---|
| Binary Tree | Each node has ≤ 2 children |
| BST | left subtree values < node.val < right subtree values — holds for ALL nodes, not just immediate parent-child |
| AVL | ` |
| Red-Black | No two consecutive red nodes; every root-to-null path has equal black-height |
| Min-Heap | parent.val ≤ child.val at every node |
| Complete Binary Tree | All levels full except possibly last; last filled left-to-right |
| Segment Tree | Leaf covers exactly one array element; internal node range = union of children ranges |
| Augmented BST | Extra field (e.g., size) must be updated consistently on every structural change |
Internal Representation
Pointer-based (linked nodes) — Each node is an object: val, left, right, optionally parent. No random access by index. Natural for most recursive tree algorithms.
Array-based (implicit) — Used when tree is complete/perfect (heap, segment tree). For 1-indexed node i:
- Parent:
i // 2 - Left child:
2i, Right child:2i + 1
Eliminates pointer overhead; cache-friendly; requires knowing max size upfront.
Parent pointer — Augmenting nodes with parent enables bottom-up traversal, O(1) sibling access, and iterative inorder without a stack.
Euler tour array — Flatten tree to 1D via DFS timestamps in[v] and out[v]. Subtree of v = contiguous range [in[v], out[v]]. Bridges tree algorithms and 1D array data structures (BIT, segment tree, sparse table).
Adjacency list — Used when tree arrives as edge list (n−1 edges). Build adj[u].append(v), pick arbitrary root, DFS with parent-tracking to avoid revisiting. Standard in competitive-programming tree problems.
Types of Trees
General / N-ary Tree
Each node can have any number of children. Often given as Node(val, children=[]). Algorithms generalize from binary trees by iterating over node.children instead of accessing .left/.right.
Binary Tree
Each node has at most 2 children.
| Variant | Definition | Key property |
|---|---|---|
| Full | Every node has 0 or 2 children | No single-child nodes |
| Complete | All levels full except possibly last; last filled left-to-right | Enables array representation |
| Perfect | All internal nodes have 2 children; all leaves at same level | 2^h leaves; 2^(h+1) − 1 total nodes |
| Balanced | Height = O(log n) | AVL, Red-Black guarantee this |
| Degenerate / Skewed | Each parent has exactly one child | Degenerates to linked list; height O(n) |
Binary Search Tree (BST)
Left subtree values < node.val < right subtree values for every node — not just immediate children. This non-local property is the source of most BST bugs.
- Balanced BST: O(log n) search/insert/delete.
- Worst case on sorted input: O(n).
- Inorder traversal yields sorted output — the defining property exploited in almost every BST problem.
Self-Balancing BSTs
AVL — Balance factor (height_left − height_right) ∈ {−1, 0, 1} at every node. Rebalanced via single or double rotations after every insert/delete. Stricter balance → faster lookups, slightly more rotation work on writes.
Red-Black — Nodes colored red/black with five enforced properties that keep height ≤ 2 log n. Fewer rotations than AVL → preferred for write-heavy workloads. Backs C++ std::map, Java TreeMap, Linux CFS scheduler.
Splay — Every accessed node rotated to root. Amortized O(log n). Excellent cache behavior for skewed-access patterns. Basis of Link-Cut Trees.
B-Tree / B+ Tree
Multi-way search tree for disk-based storage. Maximizes branching factor to minimize I/O. Each node holds multiple keys (order-m: up to m−1 keys, m children).
B+ Tree: all data in leaves; leaves linked in order for O(range) scans. Used in RDBMS indexes (InnoDB, PostgreSQL) and file systems.
Trie (Prefix Tree)
Each edge represents a character; path from root to node spells a prefix. O(L) insert/search per string of length L, independent of total string count. Applications: autocomplete, spell-check, longest common prefix, IP routing (compressed/radix trie), word search grids.
Segment Tree
Binary tree over an array supporting range queries (sum, min, max, GCD, XOR…) and point or range updates. Leaves = array elements; internal nodes = aggregate of their range. Build: O(n). Query/update: O(log n). Lazy propagation extends this to range updates, still O(log n) — see Advanced Techniques.
Fenwick Tree (Binary Indexed Tree / BIT)
Compact array for prefix-sum queries and point updates. O(log n) both. Simpler to implement than segment tree; restricted to invertible operations (sum, XOR, product mod p). Uses i & (−i) to identify the range each index is responsible for.
Heap (Binary Heap)
Complete binary tree stored as array. Min-heap: parent ≤ children; Max-heap: parent ≥ children.
- O(1) peek-min/max. O(log n) insert, extract-min/max.
- Build from array in O(n) via bottom-up heapify — not n separate inserts which cost O(n log n).
- Foundation of priority queues, Dijkstra, Prim, heap sort.
Cartesian Tree
Built from array: root = min (or max) element; left/right children are Cartesian trees of the left/right subarrays. Inorder traversal = original array. O(n) construction using a stack. Used in RMQ algorithms and as the structural basis for treap.
Treap
Randomized BST combining BST key ordering with random heap priorities. The combination of key and priority uniquely determines the structure. Expected O(log n) all operations. Supports split (split into two treaps by key) and merge (merge two treaps maintaining order) in O(log n) — making it a flexible persistent/functional data structure.
Traversals
DFS Order
| Traversal | Visit order | Canonical use |
|---|---|---|
| Preorder | Root → Left → Right | Serialize, copy, prefix expression evaluation |
| Inorder | Left → Root → Right | BST sorted output, validate BST |
| Postorder | Left → Right → Root | Delete tree, postfix expressions, all subtree DP (children before parent) |
Recursive skeleton: place process(node) before the two recursive calls (preorder), between them (inorder), or after (postorder).
Iterative inorder: push-left-until-null → pop → visit → move right. Direction at each step is determined structurally; no visited set needed.
Iterative preorder: push right child then left child before each pop, so left is processed first.
Iterative postorder (two-stack): Stack 1 processes root-right-left (push left then right before popping, collecting into Stack 2). Pop Stack 2 to get left-right-root = postorder.
Morris Traversal (inorder, O(1) space) — Temporarily thread the rightmost node of the left subtree back to current as a return pointer. On first visit to a node: if it has a left subtree, set the thread and go left. On second visit (thread triggered): undo thread, visit, go right. The tree is restored to original structure.
Morris Traversal (preorder) — Same threading mechanism, but visit the node on the first encounter (before threading), not the second.
BFS / Level-Order
Queue-based. Snapshot len(queue) before the inner loop — this is the current level size. Process exactly that many nodes in the inner loop, then append their children. Space: O(w) where w = max level width.
N-ary Tree Traversal
Preorder: visit root, then recurse into each child left-to-right.
Level-order: same BFS pattern but append all children (not just left/right) to the queue.
Postorder: recurse all children first, then visit root.
BST Operations
Search
Compare target with node.val. Go left if target < node.val, right if target > node.val, return node if equal. O(h).
Insert
Same traversal as search. When a null child is reached, insert the new node there. O(h). After insertion in a self-balancing BST, rebalance upward.
Delete (3 cases)
- Leaf: simply remove it.
- One child: replace node with its child.
- Two children: find the inorder successor (leftmost node of the right subtree). Copy its value to the current node. Delete the inorder successor (which has at most one right child — case 1 or 2). Alternatively use the inorder predecessor (rightmost of left subtree).
Floor and Ceil
Floor (largest value ≤ target): if node.val == target, return it. If node.val > target, floor is in left subtree. If node.val < target, floor is either node.val itself or something larger in the right subtree — recurse right, return node.val if right subtree yields nothing.
Ceil (smallest value ≥ target): mirror logic — if node.val < target, ceil is in right subtree; if node.val > target, ceil is either node.val or something smaller in left subtree.
Both are O(h).
Predecessor and Successor
Inorder predecessor (largest value < node): right child → find its leftmost node; else walk up until you take a right turn.
Inorder successor (smallest value > node): right child → find its leftmost node; else walk up until you take a left turn.
With parent pointers: O(h). Without: O(h) via BST search carrying the "last seen candidate."
Kth Smallest
Inorder traversal counts nodes visited. Stop at the kth. With augmented BST (size field): rank(node.left) + 1 compared to k, O(log n).
BST Iterator
Maintain a stack initialized with the leftmost spine. next(): pop from stack, record value, push leftmost spine of popped node's right subtree. hasNext(): stack is non-empty. Amortized O(1) per call, O(h) space.
Trim BST
Given range [lo, hi], remove all nodes with values outside range. If node.val < lo, return trim(node.right). If node.val > hi, return trim(node.left). Otherwise trim both children. O(n).
Convert Sorted Array to Balanced BST
Take mid = (lo + hi) // 2 as root, recurse on [lo, mid−1] and [mid+1, hi]. For sorted linked list: use slow/fast pointer to find midpoint, disconnect, recurse. O(n).
Count Distinct BSTs with n Keys
Catalan number: C(n) = C(2n, n) / (n + 1). Recurrence: C(n) = sum over i=1..n of C(i−1) * C(n−i) (choosing each key as root). C(0)=1, C(1)=1, C(2)=2, C(3)=5, C(4)=14. Appears in "unique BSTs" and structure-counting problems.
Key Algorithms
Height
Bottom-up DFS: height(node) = 1 + max(height(left), height(right)), base height(null) = 0.
Convention: stick to either node-based (null = 0) or edge-based (null = −1). Mixing causes off-by-one in balance checks and diameter.
Time: O(n), Space: O(h).
Diameter
Longest path between any two nodes; may not pass through root.
At each node: candidate = left_height + right_height. Update a global max. Return 1 + max(left_height, right_height) to parent.
The answer spans two subtrees so it cannot be returned up the call stack — must use a global/closure variable.
Time: O(n).
Distance Between Two Nodes
dist(u, v) = depth[u] + depth[v] − 2 * depth[LCA(u, v)]
Precompute depths in O(n), then use any LCA algorithm. Each query is O(1) after preprocessing with sparse table.
Path Between Two Arbitrary Nodes
Path from u to v = (path from u up to LCA) + (path from LCA down to v). Find LCA, then reconstruct each segment. With parent pointers and depth info, total O(depth[u] + depth[v] − 2·depth[LCA]).
Kth Ancestor
Using binary lifting: precompute up[v][k] = 2^k-th ancestor of v. To find the Kth ancestor: decompose K in binary, jump by each set bit. E.g., K=5 (binary 101) → jump 4 then 1.
Preprocessing: O(n log n). Query: O(log n).
Lowest Common Ancestor (LCA)
| Approach | Preprocessing | Query | Space | Best for |
|---|---|---|---|---|
| Naive recursive | — | O(n) | O(h) | Single or rare query |
| Binary lifting | O(n log n) | O(log n) | O(n log n) | Multiple queries |
| Euler tour + sparse table | O(n log n) | O(1) | O(n log n) | Read-heavy / maximum speed |
| Tarjan offline | O(n α(n)) | amortized O(1) | O(n) | All queries known upfront |
Binary lifting: precompute up[v][k] = 2^k-th ancestor. LCA(u,v): bring deeper node up to same depth using binary jumps, then jump both together until they meet.
Euler tour + RMQ: record (depth, node) at every DFS step including backtracks (2n−1 entries). LCA(u,v) = entry with minimum depth between first occurrences of u and v. Use sparse table for O(1) RMQ.
Sparse Table (for O(1) RMQ)
Build: st[i][j] = RMQ result over range [i, i+2^j−1].
st[i][j] = combine(st[i][j−1], st[i+2^(j−1)][j−1]).
Query [l, r]: let k = floor(log2(r−l+1)). Answer = combine(st[l][k], st[r−(1<<k)+1][k]).
Works for idempotent operations (min, max, GCD) where overlapping ranges don't double-count. Build: O(n log n). Query: O(1).
Counting Paths with Target Sum (Prefix Sum + DFS)
Maintain a running prefix sum from root. At each node: increment prefix_map[curr_sum], check if prefix_map[curr_sum − target] exists (those are path starts where the segment ending here sums to target), add to result. Recurse into children. On backtrack: decrement prefix_map[curr_sum] before returning. This undo step is critical — forgetting it causes cross-branch contamination.
Time: O(n), Space: O(h).
Tree Isomorphism via Hashing
Assign a canonical hash to each subtree bottom-up. For each node: sort children's hashes, serialize, hash the result. Two trees are isomorphic iff their roots have the same hash. Subtree matching (is T2 a subtree of T1?): compute hashes for all subtrees of T1 in O(n), check if T2's root hash exists in the set.
Time: O(n log n) due to sorting children hashes. Space: O(n).
Maximum Path Sum (Any Path)
At each node: left gain = max(0, dfs(left)), right gain = max(0, dfs(right)). Clamping to 0 means "don't extend path through a negative branch." Global max candidate = node.val + left_gain + right_gain. Return to parent: node.val + max(left_gain, right_gain) — path can only extend in one direction upward.
Time: O(n).
All Root-to-Leaf Paths
DFS carrying current path list; append a copy at every leaf. On backtrack: pop the last element.
Time: O(n·h), Space: O(h).
Serialize / Deserialize
Preorder with null markers ("#") uniquely encodes any binary tree. Decode by consuming tokens sequentially with a recursive DFS using an index or iterator.
Time: O(n).
Invert / Mirror
Swap left and right children at every node. Any traversal order works; postorder is most natural.
Time: O(n).
Flatten to Linked List (In-place Preorder)
Reverse-postorder (process right, then left, then root): maintain a prev pointer, set node.right = prev, node.left = None. Eliminates the O(h) rightmost-node search per node that naive approaches require.
Time: O(n), Space: O(h) recursive or O(1) iterative.
Count Complete Tree Nodes
Compare left-spine height vs right-spine height without full traversal.
- Equal → left subtree is perfect:
2^h − 1nodes, skip it and recurse into right subtree. - Unequal → right subtree is perfect one level shorter, skip it and recurse into left.
Time: O(log²n). Space: O(log n).
Complete Binary Tree Insertion
Insert at the next available position in level order. Convert the count of existing nodes + 1 to binary; the bits (excluding the leading 1) form a path from root to the insertion point (0 = go left, 1 = go right). Alternatively use BFS to find the first node with a missing child.
Time: O(log n).
Validate BST
Pass (lo, hi) bounds. Every node must satisfy lo < node.val < hi. Going left updates hi; going right updates lo.
Time: O(n).
Recover BST (Two Swapped Nodes)
Inorder of a valid BST is strictly increasing. A single swap creates one or two inversions. Track first, second, and prev across inorder traversal. First inversion (prev.val > curr.val): first = prev. Second inversion: second = curr. If only one inversion found, the swapped nodes are adjacent in inorder — second = curr from that single inversion. Swap values of first and second.
Time: O(n), Space: O(1) with Morris inorder.
Construct Tree from Traversals
- Preorder + Inorder: root = preorder[0]; locate root in inorder (O(1) with hashmap) → splits inorder into left and right subtrees; recursively build.
- Postorder + Inorder: root = postorder[−1]; same split.
- Preorder + Postorder: unique only if tree is full binary tree.
- Inorder alone or Preorder alone: not unique.
Time: O(n) with hashmap for inorder index.
Advanced Techniques
Rerooting (Two-Pass DFS)
Problem type: compute answer for every node as if it were the root (sum of distances to all nodes, max depth in any direction).
Pass 1 (bottom-up): compute subtree DP from the actual root — e.g., subtree sizes, subtree sums.
Pass 2 (top-down): for each child, re-derive its answer from the parent's answer using the formula "what changes when root shifts one edge." Reduces O(n²) to O(n).
Key: the re-derivation formula depends on the problem. For sum-of-distances: ans[child] = ans[parent] − size[child] + (n − size[child]).
Tree DP
Define DP state on subtrees. The state describes the best answer restricted to a subtree given a choice at the root of that subtree.
| Problem | State | Transition |
|---|---|---|
| Max independent set | dp[v][0/1] = max weight excl/incl v | dp[v][1] += dp[c][0]; dp[v][0] += max(dp[c][0], dp[c][1]) per child c |
| Min vertex cover | dp[v][0/1] | Complement of MIS |
| Tree knapsack | dp[v][k] = best value using k nodes | Convolution over children; O(n²) naive, O(nW) optimized |
| Diameter | Height returned, global updated | Postorder |
| Sum of subtree sizes | Pass subtree size up | size[v] = 1 + sum(size[c]) |
| Count nodes at depth k | Pass current depth down | Base: depth == k → count 1 |
Tree knapsack: when merging dp[v] with each child, convolve their DP arrays. Naive O(n²k). The tighter bound: across all merges, total pair comparisons = O(n·W) because each pair of nodes meets at exactly one LCA during merges.
3-State DP (Greedy on Trees) — For problems like minimum cameras: assign states 0/1/2 to each node (e.g., uncovered / covered-no-camera / has-camera). Process leaves first (postorder). Greedy rule: if any child is uncovered (state 0), parent must place a camera (state 2). If any child has a camera (state 2), parent is covered (state 1). If all children are covered-no-camera (state 1), parent stays uncovered (state 0) — let grandparent handle it. After processing all nodes, count uncovered roots separately.
Time: O(n). Generalizes to any "dominating set on tree" problem.
Heavy-Light Decomposition (HLD)
Partition each node's edge to the child with the largest subtree size as "heavy"; all others as "light." Any root-to-leaf path crosses ≤ O(log n) light edges. Map each maximal heavy chain to contiguous array indices, build a segment tree on the array.
Path query u→v: while u and v are on different chains, move the deeper chain-head up (O(log n) jumps), query each chain segment in O(log n) → O(log²n) total.
Use for: path sum/max/assign with updates, LCA in O(log n).
Centroid Decomposition
The centroid of a tree is the node whose removal produces all components of size ≤ n/2. It always exists and can be found in O(n). Recursively: find centroid, solve the problem for all paths passing through it, remove it, recurse on remaining components. Decomposition depth: O(log n); each node participates in O(log n) subproblems.
Use for: counting pairs/paths with a specific distance/sum property. Total work: O(n log n) with O(n)-per-level processing.
Euler Tour (DFS Timestamps)
Assign in[v] (DFS entry time) and out[v] (exit time). Subtree of v = contiguous range [in[v], out[v]]. Converts any subtree query into a 1D range query. For LCA via RMQ: record (depth, node) at every visit including backtracks → array of length 2n−1. LCA(u,v) = minimum-depth entry between first occurrences of u and v.
Lazy Propagation (Segment Tree)
Problem: range updates (e.g., add 5 to all elements in [l, r]) are O(n) naively. Lazy propagation defers updates.
Lazy tag: a pending operation stored at an internal node meaning "this operation has been applied to this node's range but not yet pushed to children."
Push-down: before accessing or querying children, push the tag down: apply the tag to both children (update their values and tags), then clear the current node's tag.
Range update: if current node's range ⊆ update range, apply tag directly and stop. Otherwise push-down, recurse into relevant children, pull-up (recompute current node's value from children).
Key rule: every operation that recurses into children must push-down first.
Time: O(log n) per update or query.
Segment Tree Merge
Two segment trees built on the same index range can be merged node-by-node: if only one child exists at a position, use it directly; if both exist, merge recursively and combine values. Total cost: O(number of nodes in the smaller tree). Useful when each tree node in a problem maintains its own segment tree (e.g., distinct values per subtree).
DSU on Tree (Sack)
Distinct from simple small-to-large merging. For each node: process all light children (and undo their contributions to the data structure), then process the heavy child keeping its data structure, then re-add light children's data without undoing.
Why it's O(n log n): light-child data is added and removed at most O(log n) times per element; heavy-child data is never removed.
Use for: queries on subtrees requiring a data structure that supports add, remove, and query (distinct count, mode, etc.).
Small-to-Large Merging
When merging two sets, always insert elements of the smaller set into the larger. Each element is moved O(log n) times in total across all merges → O(n log n) total. Works for any merge-friendly structure (set, map, sorted list).
Virtual Tree (Auxiliary Tree)
Given K key nodes, construct the minimal subtree spanning those nodes and all their pairwise LCAs. Size: O(K). Construction: sort key nodes by Euler tour order, compute adjacent LCAs, insert unique nodes into a stack to build the tree in O(K log n). Reduces an O(n)-per-query problem to O(K log n).
Augmentation
Extend each node with additional fields to support new query types without increasing time complexity. The field must be maintained through every structural operation (insert, delete, rotate, split, merge).
| Added field | Enables |
|---|---|
size | Kth smallest, order statistics, rank in O(log n) |
sum | Range sum on BST paths |
min / max | Range min/max; augmented heap delete-any |
lazy | Deferred range updates in segment tree |
hash | O(1) subtree equality check; tree isomorphism in O(n log n) |
max_prefix_sum | Maximum subarray queries on segment tree |
Persistence (Persistent Segment Tree)
Each update creates O(log n) new nodes pointing to O(log n) new children, sharing all unchanged nodes with the previous version. Supports: kth smallest in a range (using difference of two versions), historical queries, rollback. Space: O(n log n) for n updates. Build by inserting elements one by one and retaining each version's root.
Link-Cut Trees
Dynamic tree structure: supports path queries and tree modifications (link two trees by an edge, cut an existing edge) in O(log n) amortized. Internally uses splay trees on "preferred paths." Applicable to dynamic connectivity, online LCA with edge modifications, dynamic MST.
Problem Taxonomy
By Problem Category
| Category | What it asks | Key technique |
|---|---|---|
| Path problems | Sum / max / count along a path | DFS + global max; LCA + distance formula |
| Subtree problems | Property of a subtree | Bottom-up DFS return value |
| Ancestor problems | LCA, kth ancestor, depth | Binary lifting, Euler + sparse table |
| Construction | Build tree from traversals or sorted input | Inorder split + recursion, midpoint |
| Structural check | Balanced? Symmetric? Same tree? Valid BST? | DFS with bounds or mirrored recursion |
| Modification | Invert, flatten, prune, trim, insert, delete | In-place DFS, postorder preferred |
| Serialization | Encode/decode uniquely | Preorder + null markers |
| Counting | Count leaves, nodes, paths, structures | DFS accumulator; Catalan numbers |
| BST operations | Search, insert, delete, floor, ceil, rank | BST invariant traversal with bounds |
| Range queries on paths | Sum/min/max on path u→v with updates | HLD + segment tree |
| Path-pair counting | Count pairs/paths with given sum/distance | Centroid decomposition |
| Root-independent queries | Answer for every node as root | Rerooting (two-pass DFS) |
Problem Signal → Technique
| Signal in problem statement | Likely technique |
|---|---|
| "path sum", "root to leaf" | DFS with running sum / backtracking |
| "any path" (not just root-to-leaf) | Global-max DFS, LCA for arbitrary paths |
| "lowest common ancestor", "distance between nodes" | LCA (binary lifting or Euler tour) |
| "subtree of another tree", "identical subtree" | Tree hashing or serialization + matching |
| "serialize / deserialize" | Preorder + null markers |
| "complete binary tree", "count nodes efficiently" | Spine-height comparison, O(log²n) |
| "kth smallest", "floor / ceil in BST" | BST inorder, augmented BST with size |
| "range query on path u→v" | HLD + segment tree |
| "count paths with sum K" | Prefix sum + hashmap DFS with backtracking |
| "count pairs / paths with property" | Centroid decomposition |
| "sum of distances", "each node as root" | Rerooting (two-pass DFS) |
| "maximum independent set on tree" | Tree DP dp[v][0/1] |
| "minimum cameras / dominating set" | 3-state greedy tree DP |
| "recover BST", "two nodes swapped" | Inorder inversion detection |
| "dynamic tree" (link/cut) | Link-Cut trees |
| "distinct values in subtree" | DSU on tree or Euler tour + offline sorting |
Common Patterns
| Pattern | When it applies | Mechanism |
|---|---|---|
| Bottom-up DFS | Need subtree property used by parent | Return computed value (height, sum, bool) |
| Global variable in DFS | Answer spans multiple branches | nonlocal var updated at each node; cannot be returned |
| Top-down DFS | State accumulates from root downward | Pass state as parameter (path sum, bounds) |
| BFS level tracking | Level-specific output, min-depth, zigzag | Snapshot len(queue) before inner loop |
| Parent map + BFS | Distance between nodes, k-radius neighbors | Build {child: parent} in DFS, BFS outward |
| Inorder on BST | Sorted iteration, kth element, validation | Iterative with stack for O(h) space |
| Two passes (rerooting) | Answer for every node as root | Bottom-up then top-down with formula |
| Postorder aggregation | Children must finish before parent | Return tuple of multiple values |
| Divide at midpoint | Construct balanced BST from sorted input | mid = lo + (hi-lo)//2 as root, recurse |
| Prefix sum on root-path + backtrack | Count paths with given sum (any start) | HashMap {prefix_sum: count}; undo on backtrack |
| Two-pointer on BST | Two-sum, merge two BSTs | Simultaneous inorder iterators on both |
| 3-state postorder greedy | Cameras, dominating set | States 0/1/2 per node; greedy from leaves |
| DFS with state machine | Multiple node states affect parent's choice | Return discrete state (not just a value) from DFS |
| Serialize-and-hash subtrees | Subtree matching, isomorphism | Bottom-up canonical hash; set lookup |
Edge Cases & Pitfalls
Empty tree — Every function must check if not node: return first. Forgetting this on any single code path causes a crash.
Single node — Diameter = 0; height = 0 or 1 depending on convention; LCA(v, v) = v; no children to recurse into.
Skewed / degenerate tree — Recursion depth = n. For n = 10^5 in Python, this stack-overflows. Convert to iterative or set sys.setrecursionlimit(200000).
BST validation is non-local — Comparing only with the immediate parent misses violations deep in subtrees (e.g., a right-subtree descendant can be less than an ancestor). Always propagate (lo, hi) bounds.
Diameter doesn't pass through root — The widest path is often entirely in a subtree. Using left_height + right_height only at the root misses it entirely. Always maintain a global max across the full DFS.
Height convention mixing — Using height(null) = −1 in one place and = 0 in another silently corrupts balance checks and diameter computations. Pick one and use it everywhere.
Path sum with negative values — Don't initialize global max to 0; use float('-inf'). Clamping gains to max(0, gain) is correct when you want to not extend through negative branches.
Prefix sum path counting — missing backtrack — After recursing into children in the prefix_map DFS, decrement prefix_map[curr_sum] before returning. Omitting this causes cross-branch contamination.
BST duplicate values — Standard algorithms assume unique keys. With duplicates, the strict-inequality invariant breaks. Confirm with constraints; adjust to ≤ consistently if needed.
Modifying tree during traversal — Flatten/prune in-place can corrupt left/right pointers before they are visited. Use postorder (children processed first) or save the next pointer before overwriting.
Level-order level boundary — Snapshot len(queue) before the inner loop. Snapshotting after appending children of a level includes the next level.
LCA when u or v is ancestor of the other — naive recursive LCA handles this correctly (returns the ancestor immediately). Binary lifting also handles it, but depth-equalization must be done correctly.
Lazy propagation push-down timing — In a segment tree with lazy, every function that recurses into children must call push-down first. Missing a single push-down in query or update corrupts results.
Integer overflow — Path sums involving large node values overflow int in C++/Java; use long. Python is unbounded so this is a non-issue.
Implementation Notes
Python recursion limit — Default is 1000 frames. Set sys.setrecursionlimit(200000) for n up to ~10^5, or rewrite as iterative.
BIT is 1-indexed — Euler tour timestamps should start from 1. Off-by-one from 0-indexed timestamps causes silent errors in BIT queries.
Segment tree array size — Allocate 4 * n to safely accommodate all nodes without computing exact sizes.
Python heapq — Min-heap only. For max-heap, negate values. For custom objects, push (priority, item) tuples; the tuple comparison falls back to item if priorities tie.
BST midpoint — Use mid = lo + (hi - lo) // 2 to avoid signed-integer overflow in C++/Java and to stay consistent with Python's integer semantics.
Adjacency list for general trees — Use defaultdict(list). DFS needs a visited set or a parent parameter to avoid revisiting the node you came from.
Returning multiple values from DFS — Return a named tuple or a plain tuple (val1, val2) rather than using multiple global variables when a node needs to report several things to its parent (e.g., (is_balanced, height) for balance check).
Treap split/merge — Split by key divides treap into (≤ key) and (> key) parts; merge two treaps where all keys in left < all in right. Both are O(log n). Together they power interval operations and persistent treaps.
Cross-Topic Relations
| Related topic | Connection |
|---|---|
| Graph | Trees are connected acyclic undirected graphs. Many tree algorithms are restricted versions of graph algorithms. |
| DFS / BFS | The two traversal engines underlying virtually all tree algorithms. |
| Binary Search | Used in balanced BST operations, segment tree queries, sparse table index computation, complete-tree node count. |
| Dynamic Programming | Tree DP is a major sub-domain; subtree DP generalizes 1D DP to hierarchical decomposition. |
| Segment Tree | Layered on HLD chains for path queries; built on Euler tour arrays for subtree queries. |
| Binary Indexed Tree | Paired with Euler tour timestamps for O(log n) subtree sum/count queries. |
| Union-Find | Alternative for offline connectivity; used in Tarjan's offline LCA. |
| Heap | Is a complete binary tree; tree structural properties define all heap operations. |
| Trie | A tree where each edge is a character; all DFS traversal patterns apply directly. |
| Divide and Conquer | Centroid decomposition IS divide-and-conquer on the tree structure. |
| Hashing | Subtree hashing enables O(log n) subtree equality, tree isomorphism, and subtree deduplication. |
| Combinatorics | Catalan numbers count unlabeled binary trees with n nodes; Cayley's formula (n^(n−2)) counts labeled trees. |
| Greedy | Many tree optimization problems (cameras, matching, pruning) are solved greedily in postorder. |
Interview Reference
Clarification Questions to Ask
- Can node values be negative? (affects max path sum initialization and gain-clamping logic)
- Are BST values guaranteed distinct? (affects validation and floor/ceil edge cases)
- Does "path" mean root-to-leaf only, or any path between two nodes?
- What is max n? (determines acceptable complexity; O(n log²n) vs O(n))
- Is the tree guaranteed to be balanced? (affects recursion depth and O(log n) claims)
- Are queries online or offline? (determines if Tarjan LCA or persistent structures are needed)
Common Interview Mistakes
- Missing
if not node: returnbase case - Height convention inconsistency causing off-by-one in balance check or diameter
- Using a returned value instead of a global variable for diameter / max path sum
- Claiming O(log n) for BST without confirming the tree is balanced
- Validating BST by comparing only with immediate parent
- Forgetting to undo the path list or prefix-sum map when backtracking
- Assuming the root is given when the tree may arrive as an edge list
Typical Follow-Up Escalations
- "Now O(1) extra space" → Morris traversal
- "What if n = 10^6?" → Iterative DFS, watch recursion depth
- "Tree has dynamic inserts/deletes" → Self-balancing BST or Link-Cut Trees
- "Now support path range queries" → HLD + segment tree
- "LCA queries online" → Sparse table (static) or persistent structure (dynamic)
- "Without modifying the tree" → Iterative with explicit stack or parent map + BFS
- "Count distinct values in each subtree" → DSU on tree
Must-Know Problems by Category
Traversal: level-order, zigzag level-order, right side view, vertical order traversal
Path: path sum I/II/III, max path sum, sum root-to-leaf numbers, all paths
Structure: same tree, symmetric, balanced check, invert, flatten
BST: validate BST, kth smallest, recover BST, BST iterator, trim BST, convert sorted array to BST
LCA / Ancestor: LCA (general + BST versions), distance between nodes
Counting: count good nodes, count nodes in complete tree, count BST structures
Construction: build tree from preorder+inorder, from postorder+inorder
Serialization: serialize and deserialize binary tree, encode/decode N-ary
Advanced: binary tree cameras, sum of distances in tree, maximum width of binary tree
Read Next
Start reading to get personalized recommendations
Explore Unread
Great job! You've read all available articles