Prefix Sum
Core Concepts
Prefix sum (cumulative sum) — an auxiliary array P where P[i] = sum of all elements from index 0 to i inclusive. Enables O(1) range-sum queries after O(n) preprocessing.
Range sum query — the sum of arr[l..r] equals P[r] - P[l-1] (using 1-indexed prefix array) or P[r+1] - P[l] (using 0-indexed prefix array with a sentinel P[0] = 0). The sentinel avoids an edge case when l = 0.
Difference array (inverse prefix sum) — an auxiliary array D where D[i] = arr[i] - arr[i-1]. A range update [l, r] += val becomes two O(1) point writes: D[l] += val, D[r+1] -= val. Recover updated array with a prefix sum pass. Prefix sum and difference array are inverse operations.
2D prefix sum — P[i][j] = sum of rectangle (0,0) to (i,j). Rectangle query (r1,c1) to (r2,c2) = P[r2][c2] - P[r1-1][c2] - P[r2][c1-1] + P[r1-1][c1-1] (inclusion-exclusion). Build in O(mn), query in O(1).
Prefix XOR / prefix product — generalise prefix sum to any associative binary operation with an inverse. XOR prefix array enables O(1) range XOR queries. Prefix product enables O(1) range products when no zeros are present; zeros require special handling.
Hash-map prefix technique — store prefix values in a hash map to find previous prefixes that satisfy a target condition. Converts "find subarray with property X" into a lookup. Relies on the identity: sum(l..r) = P[r] - P[l-1].
Types / Variants
| Variant | What it is | When to use | Key property |
|---|---|---|---|
| 1D prefix sum | Cumulative sum over a linear array | Range sum / count queries, single dimension | Build O(n), query O(1) |
| Difference array | Inverse of prefix sum; supports range updates | Many range-increment updates, then single read-pass | Update O(1), recover O(n) |
| 2D prefix sum | Cumulative sum over a grid | Rectangle sum queries on a matrix | Build O(mn), query O(1) |
| Prefix XOR | Cumulative XOR | Range XOR queries, subarray XOR = target | XOR is its own inverse: P[r] XOR P[l-1] |
| Prefix product | Cumulative product | Range product when zeros are absent; "product except self" patterns | Zeros require separate tracking |
| Modular prefix sum | Prefix sum under modulo | "Number of subarrays with sum divisible by k" | P[r] % k == P[l-1] % k implies sum(l..r) divisible by k |
Key Algorithms
Build 1D prefix array
Create sentinel-padded array so P[0] = 0, P[i] = P[i-1] + arr[i-1]. Range sum [l, r] (0-indexed) = P[r+1] - P[l]. — O(n) build, O(1) query.
P = [0] * (len(arr) + 1)
for i, v in enumerate(arr): P[i+1] = P[i] + v
Subarray sum equals k (hash map prefix)
Maintain a running prefix sum and a count map {prefix: frequency}. For each position, check if prefix - k exists in the map. Handles negative values and zeros. — O(n) time, O(n) space.
count = defaultdict(int); count[0] = 1; res = s = 0
for v in arr: s += v; res += count[s - k]; count[s] += 1
The initialisation count[0] = 1 accounts for subarrays starting at index 0.
Difference array range update
To apply +val to range [l, r], write two entries in D then take one prefix-sum pass at the end. — O(1) per update, O(n) to reconstruct.
D[l] += val; D[r + 1] -= val # apply updates
arr = list(accumulate(D)) # reconstruct
2D prefix sum build and query
Build row-by-row: P[i][j] = arr[i][j] + P[i-1][j] + P[i][j-1] - P[i-1][j-1]. Query with inclusion-exclusion. — O(mn) build, O(1) query.
Modular prefix sum (subarray divisible by k)
sum(l..r) % k == 0 iff P[r] % k == P[l-1] % k. Count pairs of equal remainders in the prefix-mod array using a frequency map. Negative remainders require normalisation: ((s % k) + k) % k. — O(n) time.
Sliding window with prefix sum distinction
When the window is fixed-size, a sliding sum (subtract outgoing, add incoming) is simpler and achieves O(1) per step without building the full prefix array. Use prefix sum only when window size varies or queries are non-sequential.
Advanced Techniques
Prefix sum on boolean / indicator arrays
What it solves: count or locate elements satisfying a condition in a subrange without re-scanning.
Mechanism: build a prefix count array C where C[i] = 1 if condition(arr[i]) else 0, then cumulate. Range count = P[r+1] - P[l].
When over simpler alternatives: only worthwhile when the same array receives many range-count queries (>1); a single query is faster with a plain scan.
Complexity: O(n) build, O(1) per query.
Hash map with prefix sum for two-sum subarray variants
What it solves: "number of subarrays with sum = k", "longest subarray with sum ≤ k", "smallest subarray with sum ≥ k", "subarrays with equal 0s and 1s" (map 0→−1).
Mechanism: reduce to "find a previous prefix such that current_prefix OP prev_prefix = target". The hash map stores the first or count of each prefix seen.
When over simpler alternatives: use over brute-force O(n²) whenever the array can contain negatives (ruling out two-pointer); use over segment tree when offline queries are acceptable.
Complexity: O(n) time, O(n) space.
2D difference array
What it solves: multiple rectangle range-increment updates on a grid, then one read pass.
Mechanism: generalise the 1D difference array to 2D with four corner updates per rectangle. Recover with two prefix-sum passes (row-wise then column-wise).
When over simpler alternatives: preferable over 2D segment tree or BIT when all updates come before all reads (offline); simpler constant factor.
Complexity: O(1) per update, O(mn) to reconstruct.
Prefix sum on sorted order (offline range queries)
What it solves: range queries where "range" is on sorted value, not index (e.g., "how many values in subarray [l,r] fall in [a,b]").
Mechanism: sort elements, build prefix counts per value. For index-range + value-range queries, combine with a merge-sort tree or BIT — this is where prefix sum alone is insufficient.
When over simpler alternatives: plain prefix sum handles the value dimension; add a second index structure only when both dimensions are variable.
Complexity: O(n) for pure value-dimension; O(n log n) when combined with index-range BIT.
Problem Taxonomy
By Problem Category
| Category | What it asks | Key technique |
|---|---|---|
| Range sum query | Sum of a fixed subarray, possibly many queries | Build prefix array; query in O(1) |
| Subarray sum = target | Count or find subarrays with exact sum | Hash map prefix sum |
| Subarray sum divisible by k | Count subarrays where sum % k = 0 | Prefix mod + frequency map |
| Range update, point query | Apply many range increments, then read individual values | Difference array |
| Range update, range query | Both range updates and range sum queries | BIT or segment tree (prefix sum insufficient alone) |
| Subarray with equal counts | Equal count of two symbols (0/1, char A/char B) | Remap one symbol to −1, use hash map prefix sum |
| 2D rectangle sum | Sum of a submatrix, many queries | 2D prefix sum |
| Product of subarray | Product excluding self, or range product | Prefix and suffix product arrays |
| Longest / shortest subarray meeting sum condition | Length-extremal subarray | Prefix sum + hash map storing first/last occurrence |
Problem Signal → Technique
| Signal in problem | Likely technique |
|---|---|
| "sum of subarray", "range sum" | 1D prefix sum |
| "number of subarrays with sum = k" | Hash map prefix sum (count[s-k]) |
| "sum divisible by k" | Prefix sum mod k, frequency map |
| "equal number of 0s and 1s" | Map 0→−1, hash map prefix sum |
| "rectangle sum", "submatrix sum" | 2D prefix sum |
| "range increment … then query" | Difference array |
| "product of array except self" | Prefix product + suffix product |
| "XOR of subarray" | Prefix XOR |
| "minimum / maximum prefix sum" | Track running min/max of prefix array |
| "at most k distinct / less than k sum" | Prefix sum + two pointers (non-negative arrays only) |
Common Patterns
| Pattern | When it applies | Mechanism |
|---|---|---|
Sentinel prefix P[0] = 0 | Always | Eliminates boundary check for subarrays starting at index 0 |
Count map initialised with {0: 1} | Hash map prefix technique | Accounts for the prefix itself equalling the target |
| First-occurrence map | Longest subarray with property | Store {prefix: first_index} instead of count; take max(i - map[prefix]) |
| Remap binary symbols to ±1 | Equal-count subarray problems | Converts "equal 0s and 1s" into "subarray sum = 0" |
| Two prefix arrays (prefix + suffix) | Product, min/max excluding self | Compute left-to-right then right-to-left; combine at each index |
| Reconstruct from difference array | Batch range updates | Accumulate D once after all updates; never read intermediate state |
| Row-then-column prefix pass | 2D difference array recovery | Apply accumulate row-wise, then column-wise |
Edge Cases & Pitfalls
-
Off-by-one in prefix indexing. The two conventions (0-indexed with sentinel vs. 1-indexed) are not interchangeable mid-solution. Pick one and use it consistently throughout; mixing them produces silent wrong answers.
-
Forgetting
count[0] = 1in the hash map pattern. Without it, subarrays starting from index 0 that sum exactly tokare missed. The fix is initialising the map before the loop. -
Negative values breaking two-pointer assumptions. Prefix sum + two pointers is only valid when all values are non-negative (so the prefix array is monotone). For arrays with negatives, the hash map approach is required.
-
Integer overflow in prefix products. Products grow exponentially; even 20 elements of value 10 overflow a 64-bit integer. Check constraints; use modular arithmetic or logarithms if products are large.
-
Zero in prefix product arrays.
range_product = P[r] / P[l-1]fails when any prefix is zero. Track zero positions separately or use the "product except self" two-pass pattern instead. -
Modular prefix sum with negative numbers. Python's
%returns a non-negative result, but in other languages(-7) % 3can be−1rather than2. Always normalise:((s % k) + k) % k. -
2D prefix sum index arithmetic. The inclusion-exclusion formula has four terms; a missing or double-counted corner gives wrong results. Derive from a small example before coding.
-
Difference array out-of-bounds at
r+1. Whenr == n-1, writingD[n]requires the array to be of sizen+1. AllocateD = [0] * (n + 1)preemptively. -
Applying difference array then reading before reconstruction. Reading
D[i]directly gives a delta, not the updated value. Always reconstruct with a prefix pass before querying.
Implementation Notes
- Python's
itertools.accumulatebuilds a prefix sum in one line; passinitial=0(Python 3.8+) to get the sentinel-padded version automatically. defaultdict(int)for the hash map avoids key-existence checks and is faster thandict.get(key, 0).- For 2D prefix sums on large grids, allocate the prefix array with an extra row and column of zeros to avoid conditionals;
P = [[0]*(cols+1) for _ in range(rows+1)]. - Python lists have no overflow, but if porting to Java/C++ use
longfor prefix sums over arrays with large values or large n. - The difference array pattern requires the update range to be closed (
[l, r]inclusive); if the problem gives half-open intervals, adjust before writing toD.
Cross-Topic Relations
| Related topic | Connection |
|---|---|
| Binary Indexed Tree (Fenwick Tree) | BIT generalises prefix sum to support point updates in O(log n); use when the array is mutable between queries |
| Segment Tree | Generalises further to arbitrary range queries with range updates; use when both updates and queries are range-based |
| Sliding Window | Complements prefix sum for fixed-size windows or non-negative arrays; O(1) per step without a prefix array |
| Two Pointers | Used with prefix sum on non-negative arrays to find subarrays meeting a sum bound in O(n) |
| Hash Map | Required component of the hash map prefix technique; the sum lookup is O(1) amortised |
| Dynamic Programming | Many DP optimisations use a running prefix min/max of the DP array to reduce O(n²) transitions to O(n) |
| Sorting | Sorting + prefix count array enables offline "how many values in range [a,b]" queries |
Interview Reference
Must-Know Problems
Foundational
- Range Sum Query – Immutable (LC 303)
- Range Sum Query 2D – Immutable (LC 304)
- Running Sum of 1d Array (LC 1480)
Hash map prefix — subarray count/length
- Subarray Sum Equals K (LC 560)
- Continuous Subarray Sum (LC 523)
- Subarray Sums Divisible by K (LC 974)
- Contiguous Array (LC 525) — equal 0s and 1s
- Longest Subarray with Sum at Most K (use deque + prefix; LC 862 hard variant)
Product variants
- Product of Array Except Self (LC 238)
Difference array / range update
- Corporate Flight Bookings (LC 1109)
- Car Pooling (LC 1094)
- Shifting Letters II (LC 2381)
2D and advanced
- Matrix Block Sum (LC 1314)
- Number of Ways to Form a Target String Given a Dictionary (prefix DP, LC 1639)
- Count Number of Nice Subarrays (LC 1248) — remap odds to 1, use prefix count
Clarification Questions to Ask
- Are values positive only, or can they be negative/zero? (determines whether two-pointer is valid)
- Is the array mutable between queries? (determines whether a static prefix array suffices or a BIT/segment tree is needed)
- Are queries online (must answer each before seeing the next) or offline? (offline allows sorting-based approaches)
- What is the range of values and n? (check for overflow risk in prefix products or sums)
- Does "subarray" mean contiguous? (prefix sum only applies to contiguous subarrays)
Common Interview Mistakes
- Using prefix sum on a mutable array and forgetting to rebuild after updates — the correct structure is a BIT or segment tree.
- Omitting the
count[0] = 1sentinel, causing wrong counts for subarrays that start at index 0. - Attempting two-pointer + prefix sum on arrays with negative values, where the prefix array is not monotone.
- Confusing the difference array (for updates) with the prefix array (for queries) — they are inverses, not interchangeable.
- Reading
D[i]from a difference array before running the prefix-sum reconstruction pass. - Off-by-one between 0-indexed and 1-indexed conventions when switching between problems mid-contest.
Typical Follow-Up Escalations
- "Now support updates to the array." → Switch to Binary Indexed Tree or Segment Tree for O(log n) point updates with O(log n) queries.
- "Now support range updates too." → Segment tree with lazy propagation; difference array is insufficient for interleaved updates and queries.
- "The array has 10^9 possible values but only 10^5 elements." → Compress values to indices before building prefix structures.
- "Find the subarray with sum closest to k (not exactly k)." → Sort prefix sums with their indices; binary search for
P[j] - k; O(n log n). - "Extend to 3D." → 3D prefix sum with 8-corner inclusion-exclusion; build O(mnp), query O(1).
Read Next
Start reading to get personalized recommendations
Explore Unread
Great job! You've read all available articles