Binary Tree
Binary Tree is a Data Structure topic with 200+ LeetCode problems.
Related files: binary-search-tree.md · heap-priority-queue.md · depth-first-search.md · breadth-first-search.md
Core Concepts
Node — unit storing a value and references to at most two children (left, right).
Root — the single node with no parent; entry point to the tree.
Leaf — a node with no children (both children are null).
Internal node — a node with at least one child.
Height — max number of edges on any root-to-leaf path; empty tree has height −1, single node has height 0.
Depth — number of edges from the root to a given node.
Level — set of nodes at the same depth; root is level 0.
Size — total number of nodes; a tree with n nodes has exactly n−1 edges.
Full binary tree — every node has 0 or 2 children; no node has exactly 1 child.
Complete binary tree — all levels fully filled except possibly the last, which is filled left-to-right with no gaps. n nodes → height is ⌊log₂ n⌋.
Perfect binary tree — all internal nodes have exactly 2 children and all leaves are at the same level. n nodes requires n = 2^(h+1) − 1.
Balanced binary tree — height difference between left and right subtrees of every node is at most 1; height is O(log n).
Degenerate (skewed) tree — every node has at most one child; behaves like a linked list with O(n) height.
Subtree rooted at v — v itself plus all its descendants; any node in the tree is the root of a valid subtree.
Ancestor / descendant — u is an ancestor of v if u lies on the path from the root to v; v is a descendant of u.
Structural Invariants
| Invariant | What it means | What breaks if violated |
|---|---|---|
| At most 2 children per node | Binary constraint; left and right are distinct | Ceases to be a binary tree |
| Acyclic | No node reachable from itself by following child pointers | Traversals loop infinitely |
| Single root | Exactly one node has no parent | Two components or undefined entry point |
| Connected | Every node reachable from root | Subtrees become unreachable |
These invariants hold for all binary trees. BST, heap, and AVL trees add further invariants — see their dedicated files.
Internal Representation
Pointer-based (standard for LeetCode)
Each node is an object with val, left, right. The tree is navigated by pointer traversal. Space O(n), supports arbitrary shapes, constant-time child access.
Array-based (implicit indexing, used for complete/perfect trees)
Root at index 1 (1-indexed). For node at index i: left child = 2i, right child = 2i+1, parent = i//2. Enables O(1) parent access without storing parent pointers. Wasted space for non-complete trees (up to O(2^n) for degenerate).
Parent-pointer augmentation
Add a parent field to each node. Required for upward traversal (e.g., in-order successor without a stack, LCA with parent links). Doubles pointer overhead.
Serialized / flattened form A pre-order or level-order array with nulls for missing nodes (LeetCode's canonical representation). Used for deserialization problems.
Types / Variants
| Variant | Key property | Typical use |
|---|---|---|
| Generic binary tree | At most 2 children, no ordering | Structural/path problems |
| Binary Search Tree | Left subtree < node < right subtree | Ordered lookup, range queries |
| Min/Max Heap | Parent ≤ (min) or ≥ (max) all descendants; stored as array | Priority queues, heap sort |
| AVL Tree | Balance factor ∈ {−1, 0, 1} at every node | Strictly balanced BST |
| Segment Tree | Each node aggregates an index range; range queries in O(log n) | Range sum/min/max queries |
| Trie (Prefix Tree) | Each edge labeled by a character; children indexed by alphabet | String prefix search |
BST: see binary-search-tree.md Heap: see heap-priority-queue.md
Traversals & Access Patterns
Pre-order (Root → Left → Right)
Visit root before subtrees. Produces a sequence where the root always appears before its subtree — useful for serialization and copying. Recursive and iterative (explicit stack, push right child before left) are equivalent.
In-order (Left → Root → Right)
Visit root between subtrees. For BSTs, yields sorted order. For generic trees, useful when relative position (rank, predecessor) matters.
Post-order (Left → Right → Root)
Visit root after both subtrees. Every child result is available before the parent is processed — the natural order for bottom-up aggregation (delete tree, evaluate expressions, compute subtree heights).
Level-order (BFS)
Process nodes level by level using a queue. Use a deque; record queue length at the start of each level to track level boundaries. O(n) time and space.
Iterative in-order (key for follow-ups):
stack, cur = [], root
while stack or cur:
while cur: stack.append(cur); cur = cur.left
cur = stack.pop(); visit(cur); cur = cur.right
Morris Traversal (O(1) space in-order)
Thread the rightmost node of the left subtree to the current node. Traverse without a stack or recursion. Unthreads as it visits. Used when O(log n) stack space is unacceptable. Modifies the tree transiently but restores it.
Reverse level-order
BFS then reverse result, or use a deque with appendleft. Produces bottom-up level sequence.
Vertical order traversal
Assign horizontal distances (column indices) during DFS/BFS; group nodes by column. Requires sorting by (column, row, value).
Euler tour (DFS entry/exit timestamps)
Record entry time in[v] and exit time out[v] during DFS. Node u is an ancestor of v iff in[u] ≤ in[v] ≤ out[v]. Flattens tree structure into a linear array for range-query problems.
Key Algorithms
Lowest Common Ancestor (LCA)
Find the deepest node that is an ancestor of both p and q.
Key insight: In a single post-order DFS, if both p and q are found in a subtree, the current node is the LCA. Return a non-null node upward as soon as either target is found.
if not root or root == p or root == q: return root
left, right = lca(root.left, p, q), lca(root.right, p, q)
return root if left and right else left or right
Time O(n), space O(h).
Diameter
Longest path between any two nodes (measured in edges). The path may not pass through the root.
Key insight: For each node, the longest path through it = height(left) + height(right). Track a global maximum across all nodes in a post-order DFS that returns subtree height.
Time O(n), space O(h).
Maximum Path Sum
Maximum sum of values along any path between any two nodes (values may be negative).
Key insight: Same structure as diameter — each node contributes val + max(left_gain, 0) + max(right_gain, 0) to the path through it, but returns only val + max(single_arm_gain, 0) upward (a path can only turn once). Global maximum tracked via closure/nonlocal.
Time O(n), space O(h).
Tree Height / Depth
Post-order: height(node) = 1 + max(height(left), height(right)), base case −1 for null.
Check Balanced
Post-order returning (height, is_balanced): if either child is unbalanced or heights differ by > 1, propagate unbalanced. O(n) single pass.
Symmetric / Mirror Check
Two-pointer recursion: compare outer pair (left.left, right.right) and inner pair (left.right, right.left) simultaneously. Time O(n).
Invert Binary Tree
Post-order: swap left and right children after inverting each subtree. Or pre-order (swap first, then recurse).
Count Nodes in Complete Binary Tree
Binary search on levels: compute height of leftmost path (h). Then check if right subtree has height h−1 (left subtree is perfect, recurse right) or h (right subtree is perfect with one less level, recurse left). O(log² n).
Serialize / Deserialize
Pre-order with nulls: serialize with a sentinel (e.g., '#') for null nodes. Deserialize using an iterator that consumes tokens left-to-right in pre-order.
Level-order: BFS; append null tokens for missing children. Deserialize with BFS, attaching children in queue order. This is LeetCode's format.
Construct from Traversals
Pre-order + In-order: Pre-order[0] is the root; find it in in-order to split left/right sizes; recurse. Time O(n) with a value-to-index hashmap.
Post-order + In-order: Post-order[-1] is the root; same split strategy.
Pre-order alone (with null sentinels): sufficient for unique reconstruction — each '#' token ends a branch.
Path Sum (Root-to-Leaf)
DFS carrying a running sum; check remaining == leaf value at each leaf. Variant: collect all such paths by backtracking the current path list.
Flatten to Linked List
Morris-style right-threading in pre-order: for each node, find the rightmost node of the left subtree, point its right to root.right, move root.left to root.right, null out root.left. O(n) time, O(1) space.
Advanced Techniques
Post-order Return with Global State
Solves: Path/value problems where the optimal answer spans two subtrees (diameter, max path sum, longest univalue path).
Mechanism: The recursive function returns a value useful to the parent (single-arm result), while updating a global/nonlocal variable with the local best (full two-arm result). The two values differ — failing to separate them is the single most common hard-problem mistake.
When to use over simpler alternatives: When the answer can span multiple levels and requires combining results from both subtrees at an intermediate node. If the answer is always root-to-leaf, a simple DFS accumulator suffices.
Time O(n), space O(h).
Small-to-Large Merging on Trees
Solves: Collect/merge data structures (sets, frequency maps) stored at each node; answer queries about subtree aggregates.
Mechanism: Merge the smaller child's collection into the larger one. Because each element is moved at most O(log n) times, total work is O(n log n) even though naïve merging would be O(n²).
When to use: When you need a per-subtree data structure and the subtree sizes are unbalanced. If all subtrees are the same size (balanced tree), simple recursion already achieves O(n log n).
Binary Lifting for LCA
Solves: LCA queries in O(log n) per query after O(n log n) preprocessing; also k-th ancestor queries.
Mechanism: Precompute up[v][j] = 2^j-th ancestor of v using DFS + DP. LCA(u, v): equalize depths via binary lifting, then lift both until they meet.
When to use over single-query LCA: When there are multiple LCA queries (Q ≥ O(log n)) on the same tree. Single-query DFS LCA is O(n) and simpler; binary lifting only wins with many queries.
Space O(n log n).
Euler Tour + Range Queries
Solves: Subtree-aggregate queries (sum of values in a subtree), path queries, LCA via RMQ.
Mechanism: Flatten the tree into an array using DFS entry/exit times. Subtree of v corresponds to the contiguous range [in[v], out[v]]. Any range query data structure (prefix sums, segment tree, sparse table) can then answer subtree queries in O(1)–O(log n).
When to use: When you need both tree structure and fast range queries, or when repeated LCA queries require O(1) per query (sparse table on Euler tour). Overkill for single-pass problems.
Problem Taxonomy
By Problem Category
| Category | What it asks | Key technique |
|---|---|---|
| Path sum / value | Does a root-to-leaf / any path exist with sum k? Collect all such paths | DFS with running sum; backtrack for all-paths variant |
| Tree properties | Height, diameter, balance, count nodes | Post-order bottom-up aggregation |
| Tree comparison | Same tree, symmetric, subtree of another | Simultaneous two-pointer DFS |
| Tree construction | Build tree from traversal arrays, sorted array, or linked list | Split by root in in-order; recursion on subarrays |
| Tree transformation | Invert, flatten, prune, populate next pointers | Pre/post-order DFS; Morris threading |
| Ancestor queries | LCA, depth of node, k-th ancestor | Post-order LCA; binary lifting for repeated queries |
| Level / layer problems | Level averages, right side view, zigzag | BFS with level boundary tracking |
| Serialization | Encode/decode a tree to/from a string | Pre-order or BFS with null tokens |
| Subtree aggregates | Count/sum in subtrees, longest univalue path | Post-order with global max tracking |
Problem Signal → Technique
| Signal in problem statement | Likely technique |
|---|---|
| "root-to-leaf path" | DFS carrying accumulated sum/path |
| "any path" (not necessarily root-to-leaf) | Post-order return with global state |
| "lowest common ancestor" | Post-order LCA DFS; binary lifting if many queries |
| "level order", "by level", "row by row" | BFS with level boundary |
| "right side view", "leftmost/rightmost per level" | BFS last-element per level; or right-first DFS |
| "serialize", "encode", "decode" | Pre-order DFS with null sentinels |
| "construct from preorder and inorder" | Split by root index; recurse on subarrays |
| "check if balanced" | Post-order returning height; early-exit on imbalance |
| "diameter", "longest path" | Post-order: height(left) + height(right) |
| "flatten to linked list" | Morris right-threading in pre-order |
| "maximum path sum" | Post-order with separate single-arm return and global max |
| "next right pointers" | BFS level-order; or O(1) space using already-set next pointers |
| "count nodes" (complete binary tree) | Binary search on levels: O(log² n) |
| "subtree" problems | Match subtree root first, then DFS equality check |
Common Patterns
| Pattern | When it applies | Mechanism |
|---|---|---|
| Post-order aggregation | Combine child results before processing parent | Return (result, metadata) from each child; process at parent |
| Pass-down accumulator | Carry context from parent to children (running sum, depth, path) | DFS parameter carries the accumulated value |
| Global max via closure | Track answer that spans two subtrees | nonlocal res; update inside DFS without returning the answer |
| Two-subtree simultaneous DFS | Compare, mirror, or interleave two trees | dfs(a.left, b.right) and dfs(a.right, b.left) simultaneously |
| BFS with level sentinel | Process nodes level by level | Record n = len(queue) at level start; pop exactly n nodes |
| Backtracking path list | Collect all root-to-leaf paths | Append before recursion, pop after both children return |
| Precompute index map | Construct tree from traversal arrays in O(n) | {val: idx} for in-order to find root split in O(1) |
| Iterative DFS with explicit stack | Avoid Python recursion limit on deep trees | Stack stores (node, state) pairs; simulate call stack |
Edge Cases & Pitfalls
-
Null root: Most recursive functions return 0, False, or a sentinel for null input — failing to handle it causes
AttributeErroron.left/.right. Always checkif not root: return base_casefirst. -
Single-node tree: Height is 0, diameter is 0, it is both root and leaf. Functions checking for leaves (
not node.left and not node.right) correctly identify it. -
Skewed tree: Recursion depth equals n; Python's default limit (1000) is exceeded for n > ~1000. Either increase with
sys.setrecursionlimitor use iterative traversal. -
Negative values in path-sum problems: Cannot prune a branch just because its running sum is negative — a subsequent positive value may recover. Use
max(child_gain, 0)only when adding the arm to a two-arm path at the current node, not as a general pruning rule. -
"Any path" vs "root-to-leaf": Paths between arbitrary nodes can go up through an ancestor and back down — they are not top-down only. The post-order return pattern handles this; a simple accumulator does not.
-
LCA with p or q not guaranteed in tree: Standard LCA DFS returns None if a target isn't present. If the problem guarantees both are present, returning as soon as one is found is correct (the other must be in that subtree). Verify the guarantee before simplifying.
-
Constructing from pre-order + in-order with duplicate values: The split-by-index approach requires unique values to locate the root in in-order. With duplicates, construction is ambiguous — clarify the constraint before coding.
-
Modifying a tree during traversal: Flattening or pointer surgery (flatten-to-linked-list) can corrupt
node.rightwhile you still need it. Saveright = node.rightbefore re-pointing. -
Diameter vs path length: LeetCode diameter is in edges; some problems ask for nodes (= edges + 1). The formula changes from
left_h + right_htoleft_h + right_h + 1. -
Height convention: Some implementations return −1 for null (edge-count height), others return 0. Pick one and enforce it consistently across all helper functions in the same solution.
Implementation Notes
- Python's default recursion limit is 1000; trees with n > 900 nodes in a skewed shape will stack-overflow. Add
sys.setrecursionlimit(10**5)or use iterative DFS for production. - LeetCode's
TreeNodedefinition:class TreeNode: def __init__(self, val=0, left=None, right=None). collections.dequeis required for O(1)popleft()in BFS; using a list withpop(0)is O(n) per operation.- In-order with iterative DFS: the
while curinner loop and outerwhile stack or curmust both be present — dropping either silently skips nodes. - For serialization, Python's
','.join(...)and.split(',')pair well with aniter()wrapper for deserialize:nodes = iter(data.split(',')); node = next(nodes). - When returning multiple values from a recursive call (e.g., height + is_balanced), use a tuple — avoids the global-variable pattern for cleaner code when both values propagate upward.
- Morris traversal restores the tree but requires careful null-checking; the
while cur.right and cur.right != rootcondition for finding the rightmost predecessor is the error-prone step.
Cross-Topic Relations
| Related topic | Connection |
|---|---|
| Binary Search Tree | Binary tree + BST ordering invariant; in-order traversal yields sorted output. See binary-search-tree.md |
| Heap / Priority Queue | Complete binary tree + heap property; always array-represented. See heap-priority-queue.md |
| Depth-First Search | Pre/in/post-order traversals are DFS instances; all DFS patterns (visited sets, entry/exit timestamps) apply. See depth-first-search.md |
| Breadth-First Search | Level-order traversal is BFS; shortest path in unweighted trees. See breadth-first-search.md |
| Recursion | Tree structure naturally decomposes into smaller subproblems; most tree algorithms are recursive by design. See recursion.md |
| Divide and Conquer | Tree problems split at the root into independent left/right subproblems; post-order = combine after solving subproblems. See divide-and-conquer.md |
| Dynamic Programming | Tree DP: define state per node (subtree result), transition from children to parent. Applicable to counting, optimisation on trees. See dynamic-programming.md |
| Graph Theory | Trees are connected acyclic undirected graphs; tree problems that require parent-tracking become graph problems. See graph-theory.md |
Interview Reference
Must-Know Problems
Traversal & Structure
- Binary Tree Inorder Traversal (94) — iterative version required
- Binary Tree Level Order Traversal (102)
- Binary Tree Zigzag Level Order Traversal (103)
- Binary Tree Right Side View (199)
- Invert Binary Tree (226)
Properties & Validation
- Maximum Depth of Binary Tree (104)
- Balanced Binary Tree (110)
- Diameter of Binary Tree (543)
- Symmetric Tree (101)
- Same Tree (100)
- Count Complete Tree Nodes (222)
Path Problems
- Path Sum (112) — root-to-leaf
- Path Sum II (113) — collect all paths
- Binary Tree Maximum Path Sum (124) — any path, hard
- Sum Root to Leaf Numbers (129)
- Longest Univalue Path (687)
Construction & Serialization
- Construct Binary Tree from Preorder and Inorder Traversal (105)
- Construct Binary Tree from Inorder and Postorder Traversal (106)
- Serialize and Deserialize Binary Tree (297) — hard, essential
- Flatten Binary Tree to Linked List (114)
Ancestor & Subtree
- Lowest Common Ancestor of a Binary Tree (236)
- Subtree of Another Tree (572)
- Kth Smallest Element in a BST (230) — in-order counting
Advanced
- Binary Tree Cameras (968) — greedy post-order
- Recover Binary Search Tree (99) — Morris traversal
- Binary Tree Maximum Path Sum (124)
Additional LeetCode Coverage
- Populating Next Right Pointers in Each Node II (LC 117)
- Binary Tree Postorder Traversal (LC 145)
- Path Sum III (LC 437)
- Boundary of Binary Tree (LC 545)
- Average of Levels in Binary Tree (LC 637)
- Find Duplicate Subtrees (LC 652)
- All Nodes Distance K in Binary Tree (LC 863)
- Leaf-Similar Trees (LC 872)
- Vertical Order Traversal of a Binary Tree (LC 987)
- Maximum Difference Between Node and Ancestor (LC 1026)
- Maximum Level Sum of a Binary Tree (LC 1161)
- Deepest Leaves Sum (LC 1302)
- Longest ZigZag Path in a Binary Tree (LC 1372)
- Binary Tree Preorder Traversal (LC 144)
Clarification Questions to Ask
- Is the root guaranteed to be non-null, or can the tree be empty?
- Are node values guaranteed to be unique? (affects construction and LCA assumptions)
- What is the definition of "path" — root-to-leaf only, or any node-to-node path?
- Can node values be negative or zero? (affects sum pruning)
- What is the tree's height bound? (determines if recursion depth is a concern)
- For "subtree" problems: does a subtree include the node's full descendant structure, or just the immediate children?
- For serialization: is any output format acceptable, or must it match a specific encoding?
Common Interview Mistakes
- Using a simple DFS accumulator for "any path" sum — misses paths that go through an ancestor; requires the post-order global-state pattern.
- Returning only the two-arm path sum from the recursive call instead of the single-arm — corrupts the parent's calculation.
- Forgetting
max(child_gain, 0)in maximum path sum — allows negative arms to incorrectly drag down the result. - Implementing BFS without tracking level boundaries — produces all nodes in one list instead of a list-of-lists.
- Not saving
node.rightbefore re-threading in flatten — loses the right subtree. - Assuming LCA is correct when one of p or q might not be in the tree — need to verify the guarantee.
- Checking BST validity only against the immediate parent — non-local ordering constraints (all ancestors) require passing min/max bounds down the recursion.
- Off-by-one in height convention — mixing edge-count height (−1 for null) and node-count height (0 for null) within the same solution.
Typical Follow-Up Escalations
- "Now solve it iteratively (without recursion)." → Simulate the call stack explicitly; iterative in-order is the classic ask.
- "Now do it in O(1) extra space." → Morris traversal for in-order; right-threading for flatten.
- "The tree is very deep (n = 10^5)." → Must use iterative DFS or increase recursion limit; discuss stack overflow risk.
- "Now handle multiple LCA queries efficiently." → Binary lifting O(n log n) preprocessing, O(log n) per query.
- "Now answer subtree sum queries online." → Euler tour to linearize, then prefix sum or segment tree on the flattened array.
- "What if nodes also have parent pointers?" → LCA reduces to finding the merge point of two ancestor chains; O(h) with a visited set or depth-equalization.
Read Next
Start reading to get personalized recommendations
Explore Unread
Great job! You've read all available articles