Segment Tree
Core Concepts
Segment Tree — a binary tree where each node stores an aggregate (sum, min, max, gcd, etc.) over a contiguous subarray (segment) of the input array. Leaves represent single elements; every internal node covers the union of its children's ranges.
Node range — each node at the tree is responsible for the interval [l, r]. The root covers [0, n-1]; a node covering [l, r] splits at mid = (l+r)//2 into left child [l, mid] and right child [mid+1, r].
Aggregate function — the combining function applied when merging two children: sum, min, max, gcd, product, bitwise OR/AND, etc. Must be associative; need not be commutative.
Point update — change a single element a[i] and propagate the change upward through O(log n) ancestors.
Range query — compute the aggregate over [ql, qr] in O(log n) by decomposing the range into O(log n) canonical nodes whose union is exactly [ql, qr].
Lazy propagation — a technique to apply range updates in O(log n) by storing a deferred tag at a node and pushing it down to children only when the children must be visited. Without lazy, a range update touching k leaves costs O(k log n).
Lazy tag — the pending (not-yet-propagated) update stored at a node. The tag must be composable: applying tag B after tag A produces a combined tag representing "apply A then B".
Push-down — before accessing a node's children, flush the parent's lazy tag to both children and clear the parent's tag. This ensures children's stored values are always valid relative to what the parent has been told.
Push-up — after modifying a node's children, recompute the node's stored value as combine(left.val, right.val).
Build — initialise the tree from an array in O(n) by recursively building children then pushing up.
Structural Invariants
| Property | Rule | Consequence if broken |
|---|---|---|
| Range accuracy | Each node's stored value equals the aggregate of its exact index range, after all pending lazy tags on ancestors have been applied | Range query returns wrong answer; overlap logic can silently pick stale nodes |
| Lazy tag consistency | A node's stored value already reflects its own tag (the tag represents updates to be pushed to children, not the node itself) | Double-application of a tag; values become incorrect after push-down |
| Disjoint child ranges | Left child covers [l, mid], right child [mid+1, r] with no overlap and no gap | Queries over boundary indices miss elements or double-count them |
| Array size ≥ 4n | Array-based implementation needs at most 4n nodes for a tree over n elements | Index out of bounds during build or update when tree is deep |
| Tag identity | The identity tag (no-op) must be the zero element of tag composition so that a fresh node behaves correctly before any update | Spurious updates applied to leaves that were never explicitly updated |
Internal Representation
Array-based (4n allocation) — allocate a flat array tree[4n] and use 1-indexed node numbering: root at index 1, left child of node v at 2v, right child at 2v+1. No pointers needed; cache-friendly for competitive programming.
left_child = 2 * v
right_child = 2 * v + 1
parent = v // 2
Space: O(n) (constant factor ≤ 4). All build/query/update functions take the node index and its range as parameters. This is the standard approach for static-size segment trees.
Recursive with explicit nodes — each node is an object or a dict entry holding val, lazy, left, right. More flexible for sparse/dynamic trees but has pointer overhead and worse cache performance.
Iterative (bottom-up) — use a 2n array; leaves at indices n to 2n-1, internal nodes at 1 to n-1. Build bottom-up; point updates walk from leaf to root updating each ancestor. Supports only point updates and range queries (no lazy propagation in the standard form). Faster constant than recursive for sum/min/max with no range updates.
# iterative point update (0-indexed external, n-indexed leaf)
i += n
tree[i] = val
while i > 1:
i //= 2; tree[i] = tree[2*i] + tree[2*i+1]
Implicit / Dynamic Segment Tree — nodes are created on demand (only when updated). Uses a node pool with left/right child indices. Supports coordinate-compressed or very large index spaces (up to 10^9) without pre-allocating 4n memory. Essential when n is too large to allocate statically.
Persistent Segment Tree — each update creates a new root and only O(log n) new nodes (path copying), leaving previous versions intact. Used for offline k-th order statistics over ranges and historical queries.
Types / Variants
| Variant | What it is | When to prefer | Key property |
|---|---|---|---|
| Sum Segment Tree | Stores subarray sum at each node | Range sum queries with point/range updates | Combine: l + r; identity: 0 |
| Min / Max Segment Tree | Stores subarray min or max | Range min/max queries; sliding window offline | Combine: min(l,r) or max(l,r) |
| Lazy Segment Tree | Any tree with deferred range-update tags | Range assign, range add, then range query | Tag must compose and distribute over node values |
| Persistent Segment Tree | Segment tree with path-copying on update | K-th smallest in range; offline versioned queries | O(log n) new nodes per update; O(n log n) total space |
| Dynamic / Implicit Segment Tree | Nodes allocated on demand | Index space up to 10^9; sparse updates | O(q log U) nodes for q updates over universe U |
| Merge Sort Tree | Each node stores sorted list of its range's elements | Count elements in range satisfying a value predicate | O(n log n) space; O(log² n) query; no efficient updates |
| Segment Tree Beats (Ji Driver) | Stores max, second-max, max-count per node; applies "clamp max" in O(n log² n) | Range min/max assignment when value itself limits change propagation | Breaks when second-max ≥ new cap; otherwise amortised |
| 2D Segment Tree | Outer tree over rows, each node is an inner tree over columns | 2D range queries with updates | O(n² ) space; O(log² n) query; rarely needed vs. offline |
Topic-Specific Operations
Build
Recursively build left and right subtrees, then push-up: node.val = combine(left.val, right.val). At a leaf, node.val = a[i]. Time: O(n) (each of the 2n−1 nodes visited once).
Push-Up
Called after any modification to a node's children. Recomputes the node's aggregate from its children:
tree[v] = tree[2*v] + tree[2*v+1] # or min/max/gcd
This is the only operation needed when there is no lazy propagation.
Push-Down (Lazy Flush)
Called before recursing into children during an update or query. Applies the parent's tag to both children (updating their stored values and composing their tags), then clears the parent's tag:
def push_down(v):
if lazy[v] != 0:
apply(2*v, lazy[v]); apply(2*v+1, lazy[v]); lazy[v] = 0
The apply function both updates tree[child] (e.g., adds tag × range_length to sum) and composes the tag into lazy[child].
Range Query
Recursively decompose [ql, qr]: if the node range is fully inside, return tree[v]; if fully outside, return identity; otherwise push-down and recurse on both children, combining results. Time: O(log n).
Point Update
Walk from the root to the target leaf, then push-up on the way back. With array-based iterative: update the leaf directly, then walk toward root recomputing each parent. Time: O(log n).
Range Update (with Lazy)
Recursively descend; when the node range is fully covered by the update range, apply the tag to the node and record it in lazy[v] (do not recurse further); otherwise push-down first, recurse on both children, then push-up. Time: O(log n).
Key Algorithms
Build from Array
Recursively split at midpoint, build subtrees, push-up. Each element is visited once; total cost is O(n).
Key insight: bottom-up push-up passes correct aggregates upward without visiting any index twice — no re-computation after building.
Range Sum with Range Add (Lazy)
Tag represents "add delta to every element in range". Applying the tag to a node: tree[v] += delta * (r - l + 1) (the sum increases by delta times the range length). Composing tags: lazy[v] += delta (additive). Identity tag: 0.
Range Min/Max with Range Assign (Lazy)
Tag represents "set every element in range to val". Applying: tree[v] = val. Composing: tag overrides previous tag completely. Identity tag: sentinel (no-op marker such as None or +inf for assign).
Offline K-th Smallest in Range (Persistent Segment Tree)
Build a persistent segment tree over frequency of values. For each prefix index i, create a new version by inserting a[i] into the value-domain tree. To answer "k-th smallest in a[l..r]": use the difference of versions r and l-1; walk the value-domain tree from root, going left if left_count >= k, else subtract left_count from k and go right. Time: O(log n) per query after O(n log n) build.
Coordinate Compression + Segment Tree
When values span a large range (up to 10^9) but there are only n distinct values, compress to ranks 0..n-1 first. Then build a static segment tree over the compressed domain. Allows range frequency queries and order statistics in O(n log n) total.
Inversion Count via Segment Tree
Process array right-to-left (or left-to-right). For each element, query the tree for the count of elements already inserted that are smaller (or larger). Then insert the current element. Same as BIT approach but segment tree generalises to other predicates. Time: O(n log n).
For the simpler BIT-based inversion count, see binary-indexed-tree.md if it exists.
Advanced Techniques
Lazy Propagation with Composed Tags
Problem class: Range updates of mixed types applied in sequence (e.g., "add 5 to range, then multiply range by 3, then query sum"). Each tag is a linear function f(x) = a·x + b; composition is function composition: f∘g(x) = a·(c·x+d)+b = ac·x + (ad+b).
Mechanism: Store tag as (mul, add) pairs. When pushing down: child_new_tag = compose(parent_tag, child_old_tag); child_val = apply(parent_tag, child_val, range_len).
When to use over simpler alternatives: Only when the update types do not commute (e.g., add then multiply ≠ multiply then add). If all range updates are of the same type (e.g., all adds or all assigns), a simple single scalar tag suffices.
Complexity: O(log n) per operation; constant factor scales with tag composition cost.
Segment Tree Beats (Ji Driver Segmentation)
Problem class: "For each element in [l, r], set a[i] = min(a[i], v)" (range chmin), combined with range sum queries. Generalises to range chmax and tracking exact minimums.
Mechanism: Each node stores the segment maximum, second maximum, maximum count, and segment sum. When applying chmin(v): if v >= max, nothing changes; if second_max < v < max, only the maximum values change — decrease sum by (max - v) * max_count, update max to v; if v <= second_max, break the recursion and recurse into children.
When to use over alternatives: Only when the problem specifically involves clamping the maximum (or minimum) value in a range while maintaining a sum or count query. Standard lazy propagation cannot represent "apply only to elements equal to the current max". Do not use for straightforward range assign or range add.
Complexity: O(n log² n) amortised for n operations; each "break" case is bounded by a potential argument on the number of distinct maximums.
Persistent Segment Tree for Historical Queries
Problem class: K-th smallest/largest in a subarray a[l..r]; count elements in value range [v1, v2] over index range [l, r]; "what was the state of the structure at time t?"
Mechanism: Each update creates a new root and copies only the O(log n) nodes on the path to the modified leaf (path copying). Versions are stored in an array roots[0..n]. Querying a range [l, r] uses roots[r] and roots[l-1] simultaneously — their difference gives the frequency distribution over [l, r].
When to use over alternatives: When range order-statistic queries are offline and the value domain can be coordinate-compressed. For online queries with updates, a wavelet tree or merge sort tree may be needed. Simpler than a 2D structure when the "second dimension" is the query range.
Complexity: O(n log n) build, O(log n) per query, O(n log n) total space.
Dynamic / Implicit Segment Tree for Large Index Spaces
Problem class: Segment tree over index ranges up to 10^9 or 10^18 (e.g., coordinate without compression, timestamp ranges).
Mechanism: Allocate a node pool array. Start with only a root node. When an update or query needs to go left/right, allocate a new child node from the pool if it does not exist yet. Node v stores left_child_idx and right_child_idx (indices into pool), initialised to 0 (null).
When to use over alternatives: When index space is too large to allocate 4n statically and coordinate compression is not possible (e.g., online queries with unknown coordinates). If all coordinates are known upfront, coordinate compression + static segment tree is simpler and faster.
Complexity: O(q log U) nodes total for q operations over universe size U; each operation O(log U).
Problem Taxonomy
By Problem Category
| Category | What it asks | Key technique |
|---|---|---|
| Range aggregate query (static) | Sum/min/max over [l, r]; no updates | Sparse table (O(1) query) preferred for min/max; segment tree for sum or when updates exist |
| Range query with point updates | Sum/min/max over [l, r]; single-index updates | Standard segment tree or BIT (sum only) |
| Range query with range updates | Aggregate over [l, r] after range add/assign | Lazy segment tree |
| Order statistics in range | K-th smallest in a[l..r] | Persistent segment tree over value domain; or merge sort tree (offline) |
| Count in value range over index range | How many elements in a[l..r] lie in [v1, v2] | Persistent segment tree or merge sort tree |
| Inversion count / rank queries | Count elements less than x inserted so far | Segment tree or BIT over compressed values |
| Interval stabbing / coverage | How many intervals cover point p; max overlap depth | Segment tree with difference array or sweep line |
| Range clamping with sum query | Apply min/max cap to a range; query sum | Segment Tree Beats |
| Versioned / historical queries | Query the array as it was at time t | Persistent segment tree |
| Sparse / large-index range queries | Queries over 10^9-size domain with few updates | Dynamic segment tree |
Problem Signal → Technique
| Signal in problem | Likely technique |
|---|---|
| "range sum query" + "update" | Lazy segment tree (range update) or BIT (point update) |
| "range minimum / maximum query" | Segment tree with min/max aggregate; sparse table if static |
"k-th smallest in subarray [l, r]" | Persistent segment tree over value domain |
"count of elements in [v1, v2] over index range" | Persistent segment tree or merge sort tree |
| "range add then range sum" | Lazy segment tree with additive tag |
| "range assign then range sum/min/max" | Lazy segment tree with assign tag |
| "range chmin / chmax with sum" | Segment Tree Beats |
| "index up to 10^9", "no coordinate compression possible" | Dynamic segment tree |
| "for each prefix / at each time step" | Persistent segment tree |
| "number of inversions" | Segment tree or BIT over value-compressed domain |
| "maximum subarray sum over queries" | Segment tree with composite node: (sum, prefix_max, suffix_max, max_subarray) |
| "offline queries sorted by right endpoint" | Segment tree processing queries as right endpoint advances |
Common Patterns
| Pattern | When it applies | Mechanism |
|---|---|---|
| Canonical decomposition | Any range query | Recursively split query range; accumulate O(log n) non-overlapping node results |
| Push-down before recurse | Any operation with lazy tags | Flush parent's tag to children before accessing children; guarantees children hold correct values |
| Push-up after modify | Any operation that changes children | Recompute parent aggregate from children after returning from recursion |
| Tag composition | Mixed update types (add + multiply, etc.) | Represent pending update as a function; compose functions when stacking tags |
| Coordinate compression before build | Value domain too large; finite update set | Map distinct values to ranks 0..k-1; build segment tree over ranks |
| Offline query ordering by right endpoint | Queries where left endpoint is variable but right advances | Process queries at their right endpoint; update tree incrementally as right advances |
| Composite node struct | Queries requiring multiple aggregates per node (max subarray, palindrome count) | Store multiple fields per node; define a merge function that combines all fields correctly |
| Segment tree on values, not indices | Problems about frequencies or rank | Build tree over value domain; update frequency at val; query prefix sums for rank |
| Difference trick for range update without lazy | Range add with only point queries (no range queries needed after) | Apply +delta at l and -delta at r+1 using a BIT or segment tree; then point query = prefix sum |
Edge Cases & Pitfalls
-
4n array size: allocating exactly
nor2nelements causes index out of bounds for the last internal nodes of non-power-of-two arrays. Always allocate4 * nfor recursive implementations. -
Off-by-one in range midpoint: using
mid = (l + r + 1) // 2instead of(l + r) // 2shifts the split point and causes the right child to be one element shorter than expected, breaking queries at boundary indices. -
Forgetting push-down before recursion: reading a child's value before flushing the parent's lazy tag returns a stale aggregate. This is the most common bug in lazy segment trees — push-down must precede every access to children.
-
Tag not applied to the node itself: the convention is that a node's stored value already reflects the tag (the tag is "for children only"). Applying the tag again to the node on push-up double-counts the update.
-
Wrong range-length multiplier in sum tree: for a lazy "add delta" tag, the node's sum increases by
delta * (r - l + 1), not bydelta. Using the wrong multiplier gives incorrect sums for internal nodes. -
Non-composable tags: using a single scalar tag for mixed update types (add and assign) without proper composition logic causes the earlier update to be lost. When a node has both an assign and a subsequent add pending, the assign must subsume any prior add.
-
Persistent tree version indexing: version
roots[i]corresponds to the prefixa[0..i-1], so the range[l, r](1-indexed) usesroots[r]androots[l-1]. Off-by-one here silently shifts the query window. -
Dynamic tree null node: when a left or right child is null (unallocated), returning
0or identity is correct only if the identity value is genuinely the neutral element for the aggregate. For min queries, the null node must return+infinity, not0. -
Empty query range: if
ql > qr(e.g., after coordinate compression maps an empty range), the query should return the identity immediately rather than recursing. Not checking this causes incorrect results or infinite recursion. -
Integer overflow in sum: for large arrays with large values, the sum at the root can overflow 32-bit integers. Use 64-bit integers (Python is safe; in C++/Java, use
long).
Implementation Notes
- Python's recursion limit is 1000 by default; a segment tree over n = 10^5 has depth ~17, which is safe. For n up to 10^6, increase with
sys.setrecursionlimit(300000)before building. - The array-based segment tree uses 1-indexed nodes with root at index 1. Index 0 is unused; do not store the identity value there and rely on it — explicitly skip index 0.
- For Python, the iterative bottom-up variant (2n array, no lazy) is faster in practice due to lower function-call overhead. Use recursive lazy only when range updates are required.
sys.stdout.reconfigure(encoding="utf-8")must appear at the top of scripts on Windows; segment tree output with large sums may include characters outside ASCII.- When implementing a lazy segment tree with assign tags, use
None(not0) as the "no pending tag" sentinel; otherwise a legitimate assign-to-zero is indistinguishable from no tag. - For the merge sort tree (node stores sorted list), Python's
bisect.bisect_leftandbisect.bisect_righthandle the per-node binary search; build time is O(n log n) due to merging sorted lists up the tree. - Dynamic segment tree node pools: pre-allocate a fixed-size array of node objects (e.g., size
40 * nfor n queries) and use a global counter as the allocator. This avoids repeated object creation overhead in Python. - When running multiple test cases, reset the segment tree array and lazy array with
tree[:] = [0] * sizerather than reconstructing a new list — avoids repeated allocation.
Cross-Topic Relations
| Related topic | Specific connection |
|---|---|
| Binary Indexed Tree (Fenwick Tree) | BIT solves the same range-sum + point-update problem in O(log n) with a smaller constant and simpler code; segment tree is preferred when range updates, range min/max, or lazy propagation are needed. See binary-indexed-tree.md if it exists. |
| Prefix Sum | Static range sum queries (no updates) are solved in O(1) per query with a prefix sum array; segment tree is only needed when updates are present. See prefix-sum.md. |
| Divide and Conquer | The recursive build and query structure of a segment tree is a direct application of divide and conquer; the merge step is the aggregate function. |
| Sorting / Coordinate Compression | Persistent and static segment trees over value domains require coordinate compression to reduce the domain from 10^9 to n. Sorting the unique values is the first step. |
| Sparse Table | For static range min/max queries (no updates), a sparse table achieves O(1) query vs. O(log n) for a segment tree; segment tree is chosen only when updates are required. |
| Binary Search | Walking a segment tree to find the leftmost index satisfying a predicate (e.g., "prefix sum ≥ k") is equivalent to binary search on the tree structure; O(log n). |
| Union-Find | Both are used for interval/component problems; Union-Find handles connectivity, segment tree handles aggregate queries over ranges. |
| Sweep Line | Segment trees are frequently used as the underlying structure in sweep line algorithms to maintain interval coverage counts or active interval sets. |
| Dynamic Programming | Some DP optimisations use a segment tree to answer "minimum/maximum dp[j] for j in [l, r]" in O(log n), reducing transitions from O(n) to O(log n). |
Interview Reference
Must-Know Problems
Range Query Fundamentals
- Range Sum Query — Mutable (307) — point update + range sum; baseline segment tree
- Range Minimum Query (no LC number; concept) — min segment tree without lazy
- Count of Smaller Numbers After Self (315) — segment tree over compressed values; inversion-count pattern
Lazy Propagation
- Range Sum Query with Range Update (conceptual; practice on 307 with range-add variant)
- Falling Squares (699) — range assign + range max; lazy assign segment tree
- My Calendar III (732) — range add + range max; lazy add segment tree
- The Skyline Problem (218) — coordinate-compressed segment tree on intervals
Order Statistics / Persistent
- Count of Range Sum (327) — count subarrays with sum in range; merge sort tree or BIT
- Reverse Pairs (493) — modified merge sort or segment tree on compressed values
- K-th Smallest Element in a Sorted Matrix (378) — binary search + segment tree alternative
Interval / Stabbing Queries
- Number of Flowers in Full Bloom (2251) — offline sweep with BIT/segment tree
- Amount of New Area Painted Each Day (2158) — range assign with tracking of "first uncovered"
- Rectangle Area II (850) — coordinate-compressed segment tree measuring covered length
Composite / Hard
- Maximum Sum of Distinct Subarrays With Length K — segment tree with frequency tracking
- Sum of Subarray Minimums (907) — monotonic stack preferred; segment tree alternative
- Longest Increasing Subsequence variants with segment tree DP optimisation
Segment Tree Beats
- Falling Squares (699) — assign variant (warm-up)
- Range chmin with sum queries — classic Ji driver application (competitive programming staple)
Clarification Questions to Ask
- "Are queries online (must answer each before seeing the next) or offline (all given upfront)?" — offline enables persistent tree or sort-by-right-endpoint techniques.
- "What is the range of values and indices?" — determines whether coordinate compression is needed and whether a dynamic tree is required.
- "Are updates point updates or range updates?" — range updates require lazy propagation; point updates allow simpler code or a BIT.
- "Can values be negative?" — affects identity values for min/max trees and sum overflow checks.
- "What aggregate function?" — sum, min, max, gcd, or custom? Custom merges (max subarray) require a composite node.
- "Is the problem asking for value-domain queries (k-th smallest) or index-domain queries (sum over index range)?" — determines which axis the segment tree is built over.
- "After a range update, are subsequent queries guaranteed to be range queries, or only point queries?" — if only point queries follow range updates, a difference-array approach avoids segment tree complexity entirely.
Common Interview Mistakes
- Allocating a size-2n or size-n array for recursive segment tree — causes index out of bounds; always use 4n.
- Forgetting
push_downbefore recursing into children in a lazy tree — the most common correctness error; leads to queries reading unflush stale values. - Applying the lazy tag to the node and its children (double application) — the node's value already reflects its own tag; the tag exists to defer updates to children only.
- Using
0as the no-tag sentinel when0is a valid update value (e.g., "assign 0 to the range") — useNoneor a separate boolean flag. - Wrong range-length multiplier in sum updates — writing
tree[v] += lazy[v]instead oftree[v] += lazy[v] * (r - l + 1)for a range-add lazy sum tree. - Building a segment tree when a prefix sum array or sparse table would suffice — overkill for static-no-update problems, and reviewers will note it.
- Confusing the index domain tree (tree over array indices) with the value domain tree (tree over possible values) — they look identical in code but serve different purposes.
- Not handling the base case (single-element leaf) separately in composite-node segment trees — the merge function for max-subarray-sum fails at leaves if not initialised correctly.
Typical Follow-Up Escalations
- "Your solution is O(n log n) with a segment tree — can you do the range sum queries in O(1)?" → Only if there are no updates; use a prefix sum array then.
- "Now add range updates — not just point updates." → Add lazy propagation; walk through push-down and tag composition.
- "The array can now have 10^9 elements but only 10^5 updates." → Switch to a dynamic/implicit segment tree; allocate nodes on demand.
- "I want to query historical versions — what was the sum at time t?" → Persistent segment tree with path copying; O(n log n) space.
- "Can you find the k-th smallest element in a subarray
[l, r]?" → Persistent segment tree over value domain; walk both versions simultaneously. - "The updates are now 'clamp all elements to at most v' — can your segment tree still handle it?" → Segment Tree Beats (Ji driver); walk through the break condition and amortised argument.
- "Your recursive Python solution hits the recursion limit for n = 10^6." → Rewrite as iterative bottom-up segment tree or increase
sys.setrecursionlimit.
Read Next
Start reading to get personalized recommendations
Explore Unread
Great job! You've read all available articles