Memoization

Topic type: Technique-Based
LeetCode tag: Memoization (~120 problems)

See dynamic-programming.md for full DP theory: optimal substructure, subproblem DAGs, bottom-up formulation, and DP variant taxonomy.
See recursion.md for recursion fundamentals: base cases, recurrence structure, and stack mechanics.

Core Concepts

Memoization — an optimization where each unique set of arguments to a recursive function is computed exactly once; subsequent calls return the cached result immediately. Converts exponential tree recursion to polynomial time when subproblems overlap.

Overlapping subproblems — the same (args) tuple recurs across multiple call branches. Memoization is only beneficial when this happens; pure divide-and-conquer (disjoint subproblems, e.g., merge sort) gains nothing.

Cache key — the hashable representation of the complete function state. A correct key captures everything that determines the return value; any omission causes incorrect cache hits, any addition causes unnecessary misses.

Top-down evaluation — subproblems are solved lazily, driven by which states the recursion actually visits. Opposite of bottom-up (tabulation), which fills all states in dependency order.

Reachable state space — the set of (args) tuples actually visited during a run. Top-down only pays for reachable states; bottom-up pays for the full table. When the state space is sparse (e.g., digit DP, bitmask DP with pruning), top-down is cheaper.

Subproblem independence — each unique key's result must be a pure function of that key alone; if the return value depends on mutable global state outside the key, caching is incorrect.

Types / Variants

Variant	Mechanism	When to prefer
`@cache` / `@lru_cache(None)`	Python decorator wraps any function; key = all args as a tuple	Default choice in Python; zero boilerplate
Manual dict cache	`memo = {}` passed or closed over; `if key in memo: return memo[key]`	Need finer control (e.g., custom key, clearing, cross-function sharing)
Class-level cache	`self._memo` dict on an object	When memoised function is a method and object state is part of the key
Multi-level cache	Nested dicts or compound tuple keys	State has multiple independent dimensions (e.g., `(i, j, remaining)`)
Iterative + explicit stack	Convert recursion to iteration with a stack; fill cache as results become available	Recursion depth exceeds ~1000; Python stack limit is a hard constraint

Key Algorithms

Top-Down DP via Memoization

Translate the recurrence relation directly into a recursive function, then cache. The algorithm structure is: check cache → base case → recurse and compute → store in cache → return.

@cache
def dp(i, j):
    if base_case(i, j): return 0
    return min(dp(i-1, j), dp(i, j-1)) + cost[i][j]

Time equals O(unique states × branching factor); space equals O(unique states) for the cache plus O(depth) for the call stack.

Cache Key Construction for Complex State

When state includes a list or set, convert to a tuple or frozenset before using as a key. Hashing a mutable object by identity (not value) causes incorrect misses on logically identical states.

@cache
def solve(i, remaining: tuple):  # frozenset → frozenset, list → tuple
    ...

Interval DP (Top-Down)

Solve dp(l, r) by trying all split points k in [l, r). State space is O(n²); each state costs O(n) to compute → O(n³) total. Prefer top-down when not all (l, r) pairs are reachable.

Digit DP (Top-Down)

State: (position, tight, leading_zero, accumulated_count). The tight flag restricts digit choices to ≤ the corresponding digit of the limit. Memoization caches (position, tight, ...) tuples; the non-tight subtree is reused across multiple prefixes.

Bitmask DP (Top-Down)

State: (mask, last) where mask is a bitmask of visited items. Reachable states ≈ 2ⁿ × n; suits n ≤ 20. Top-down naturally prunes unreachable masks and is often simpler to write than bottom-up for irregular transitions.

Advanced Techniques

Broken-Profile / Rolling State Memoization

Solves: Grid/tiling problems where state is the partial configuration of a column boundary.
Mechanism: State encodes the "profile" (a bitmask or tuple) of how the current boundary is filled; recursion advances one cell at a time.
When over simpler alternatives: Standard 2D DP can't express non-rectangular dependency shapes; use this when the boundary state has O(2^width) configurations but n×m is large.
Complexity: O(nm · 2^width).

Memoization with Continuation Passing (Re-rooting)

Solves: Tree problems where the answer at each node depends on both subtree and parent information (e.g., sum of distances to all nodes).
Mechanism: First DFS memoises down-pass values; second DFS propagates up-pass values by inverting the recurrence.
When over simpler alternatives: Pure top-down from the root can't access ancestor info without re-rooting, which would be O(n²). Use when the answer needs the full tree context at every node.
Complexity: O(n).

Lazy Memoization with Sentinel Values

Solves: State spaces where computing "no solution" is expensive and you want to short-circuit.
Mechanism: Initialize cache to a sentinel (e.g., None or float('inf')); check sentinel in cache lookup to distinguish "not computed" from "computed as infeasible."
When over simpler alternatives: @cache can't distinguish a cached None return from "not yet computed" without a separate seen-set. Use manual dict when the function can legitimately return None.

Problem Taxonomy

By Problem Category

Category	What it asks	Key technique
Count paths / ways	How many ways to reach target	`dp(pos, remaining)` counting recurrence
Optimal cost / value	Min/max cost over choices	`dp(i, j)` with `min`/`max` over transitions
String matching / edit	Transform or align two strings	`dp(i, j)` over two string indices
Interval problems	Best partition / merge of a range	`dp(l, r)` with split-point enumeration
Subset / assignment	Assign items to groups	`dp(i, mask)` bitmask state
Game theory	Optimal play	`dp(state)` returning win/loss or score
Digit counting	Count numbers with property up to N	`dp(pos, tight, ...)` digit DP
Tree / graph DP	Optimal structure on a tree	`dp(node, parent)` or re-rooting

Problem Signal → Technique

Signal in problem	Likely technique
"Number of ways to …"	Counting recurrence; `dp(pos, ...)`
"Minimum/maximum cost to …"	Optimization recurrence; `dp(i, j)`
"Can you reach …"	Boolean memoization; return True/False
Two strings, comparing characters	2D `dp(i, j)` on string indices
Subarray / subrange, combine adjacent	Interval DP `dp(l, r)`
Subset of items, n ≤ 20	Bitmask DP `dp(mask)`
Digits of a number, "count integers ≤ N"	Digit DP with `tight` flag
Game, two players alternate	Game DP, minimax
"Return to any node in a tree" / sum-of-distances	Re-rooting DP
Recursion gives TLE / time limit	Add `@cache`; identify overlapping subproblems

Common Patterns

Pattern	When it applies	Mechanism
State compression via tuple	State includes variable-length sequence	Convert list/set to sorted tuple before caching
Freeze mutable argument	Input is a list passed recursively	`tuple(lst)` or `frozenset(s)` as key
Pass indices, not slices	Function receives a subarray	Use `(l, r)` indices into a global array; avoid slicing (copies break equality)
Explicit seen parameter	Visited set in graph traversal must be part of state	`frozenset(visited)` in key; only feasible for small graphs
Default-argument cache injection	Avoid global variable for cache	`def f(i, memo={}):` — but beware of shared state across test cases
`@cache` + `sys.setrecursionlimit`	Deep recursion in Python	Always pair these; set limit before calling the top-level function
Clear cache between test cases	Reusing the function in multiple test runs	`f.cache_clear()` after each test; or use a class with instance-level cache

Edge Cases & Pitfalls

Mutable key argument — passing a list directly to a @cache-decorated function raises TypeError: unhashable type: 'list'. Convert to tuple first; @cache requires all arguments to be hashable.
Missing state in key — if a variable affects the return value but isn't in the function signature (e.g., a global index used inside the function), caching by (i, j) alone returns stale results when that hidden variable changes. All varying state must be explicit in the key.
Shared cache across calls — @cache on a module-level function persists across multiple test-case invocations. Call .cache_clear() or define the function inside the problem method to get a fresh cache per call.
Recursion depth exceeded — Python's default limit is 1000. Deep memoization on inputs of size 500+ will crash. Add sys.setrecursionlimit(200000) before the call, or convert to iterative bottom-up.
Returning None as a valid value — if key in memo: return memo[key] is the safe check. Avoid if memo.get(key): which treats a cached 0 or False as a cache miss.
Off-by-one in base cases — base cases must be checked before any cache lookup to avoid storing incorrect results. Placing them after the cache check can cause a cache hit that skips the base case on re-entry.
Counting vs. existence conflation — a boolean memoization (True/False) is distinct from a count memoization (int). Using the wrong return type silently produces wrong answers; the cache hits will propagate the error.
Interval DP asymmetry — dp(l, r) with l > r must be handled explicitly (typically return 0 or the identity). Forgetting this base case causes infinite recursion when splits produce empty intervals.

Implementation Notes

functools.cache (Python 3.9+) is identical to lru_cache(maxsize=None) and is the preferred spelling.
@cache does not limit cache size; for memory-constrained environments use @lru_cache(maxsize=N) where N is tuned to the problem's state space.
Nested functions decorated with @cache create a new cache per outer-function call, which is usually desirable for isolation but adds overhead if the outer function is called in a tight loop.
When all arguments are integers bounded by the input size, a 2D/3D list (bottom-up table) is faster than a dict cache (hash overhead per lookup); switch to tabulation for tight time limits.
tuple(lst) is O(n); if key construction dominates runtime, pre-compute keys or redesign state to use integers.
In Python, @cache on a method caches with self in the key — every distinct object gets its own cache, which is usually correct but can leak memory if many objects are created.

Cross-Topic Relations

Related topic	Specific connection
dynamic-programming.md	Memoization is the top-down formulation of DP; every memoized solution has a bottom-up equivalent
recursion.md	Memoization requires a working recursive solution first; the recurrence structure drives state design
backtracking.md	Backtracking explores all paths and undoes state; memoization assumes subproblem results are reusable (no undo needed). Cannot memoize when state includes the path taken
divide-and-conquer.md	D&C has disjoint subproblems — memoization adds no benefit; overlap is the key differentiator
bitmask.md	Bitmask DP uses memoization with a bitmask dimension; frequently combined for subset/assignment problems
graph-theory.md	Memoization on a DAG is equivalent to bottom-up DP in topological order; cycle detection is required before memoizing on a general graph

Interview Reference

Must-Know Problems

1D state

Climbing Stairs (LeetCode 70) — classic overlapping subproblem introduction
House Robber (198) — adjacent constraint
Coin Change (322) — unbounded knapsack
Word Break (139) — string segmentation

2D state — two sequences

Longest Common Subsequence (1143)
Edit Distance (72)
Regular Expression Matching (10) — null/wildcard transitions are the non-obvious base cases

2D state — grid

Unique Paths (62) / Unique Paths II (63)
Minimum Path Sum (64)
Cherry Pickup II (1463) — two simultaneous walkers, one 3D state

Interval DP

Burst Balloons (312) — "what if I pop this last?" framing
Strange Printer (664)
Palindrome Partitioning II (132)

Bitmask DP

Shortest Path Visiting All Nodes (847)
Partition to K Equal Sum Subsets (698)

Digit DP

Non-negative Integers without Consecutive Ones (600)
Count Numbers with Unique Digits (357) — simpler entry point

Tree DP

House Robber III (337) — return two values per node (rob vs. skip)
Binary Tree Cameras (968)

Clarification Questions to Ask

"Are there negative values or cycles in the graph?" — determines whether memoization is safe (requires a DAG or bounded state).
"Can the same item / character be reused?" — distinguishes bounded from unbounded knapsack; changes state transitions.
"What is the range of inputs?" — determines whether a dict cache or array-based table is appropriate.
"Is the answer a count or a yes/no?" — return type affects how base cases and transitions are written.
"Does 'path' mean any route or a simple (non-revisiting) path?" — simple paths on a general graph cannot be memoized without tracking visited nodes, blowing up state space.

Common Interview Mistakes

Writing the recursion correctly but forgetting @cache, producing O(2ⁿ) time and TLE.
Putting mutable objects (lists, sets) into the cache key without converting to a hashable type.
Defining base cases inside a branch that's short-circuited by the cache check, so the base case never fires on re-entry.
Not clearing the cache between test cases when the decorated function is defined at module scope.
Conflating "memoization is always better" — for dense state spaces with all states reachable, bottom-up tabulation avoids recursion overhead and stack limits.

Typical Follow-Up Escalations

"Now optimize space" → identify which previous states are still needed; convert to rolling array / bottom-up with reduced dimensions.
"The values are too large for an array (n = 10⁹)" → switch to a dict-based cache; states are sparse so array indexing is infeasible.
"Can you do it iteratively?" → build the bottom-up table in topological order of the subproblem DAG.
"What if there are cycles in the graph?" → memoization alone is unsafe; need cycle detection (visited set) or reformulate as shortest-path (Bellman-Ford / Dijkstra).
"Handle multiple test cases efficiently" → restructure so the cache is built once and queried per test case (offline processing), rather than cleared and rebuilt each time.

Explore Unread

Great job! You've read all available articles

Memoization

Topic type: Technique-Based
LeetCode tag: Memoization (~120 problems)

See dynamic-programming.md for full DP theory: optimal substructure, subproblem DAGs, bottom-up formulation, and DP variant taxonomy.
See recursion.md for recursion fundamentals: base cases, recurrence structure, and stack mechanics.

Core Concepts

Top-down evaluation — subproblems are solved lazily, driven by which states the recursion actually visits. Opposite of bottom-up (tabulation), which fills all states in dependency order.

Subproblem independence — each unique key's result must be a pure function of that key alone; if the return value depends on mutable global state outside the key, caching is incorrect.

Types / Variants

Variant	Mechanism	When to prefer
`@cache` / `@lru_cache(None)`	Python decorator wraps any function; key = all args as a tuple	Default choice in Python; zero boilerplate
Manual dict cache	`memo = {}` passed or closed over; `if key in memo: return memo[key]`	Need finer control (e.g., custom key, clearing, cross-function sharing)
Class-level cache	`self._memo` dict on an object	When memoised function is a method and object state is part of the key
Multi-level cache	Nested dicts or compound tuple keys	State has multiple independent dimensions (e.g., `(i, j, remaining)`)
Iterative + explicit stack	Convert recursion to iteration with a stack; fill cache as results become available	Recursion depth exceeds ~1000; Python stack limit is a hard constraint

Key Algorithms

Top-Down DP via Memoization

Translate the recurrence relation directly into a recursive function, then cache. The algorithm structure is: check cache → base case → recurse and compute → store in cache → return.

@cache
def dp(i, j):
    if base_case(i, j): return 0
    return min(dp(i-1, j), dp(i, j-1)) + cost[i][j]

Time equals O(unique states × branching factor); space equals O(unique states) for the cache plus O(depth) for the call stack.

Cache Key Construction for Complex State

When state includes a list or set, convert to a tuple or frozenset before using as a key. Hashing a mutable object by identity (not value) causes incorrect misses on logically identical states.

@cache
def solve(i, remaining: tuple):  # frozenset → frozenset, list → tuple
    ...

Interval DP (Top-Down)

Solve dp(l, r) by trying all split points k in [l, r). State space is O(n²); each state costs O(n) to compute → O(n³) total. Prefer top-down when not all (l, r) pairs are reachable.

Digit DP (Top-Down)

Bitmask DP (Top-Down)

Advanced Techniques

Broken-Profile / Rolling State Memoization

Memoization with Continuation Passing (Re-rooting)

Lazy Memoization with Sentinel Values

Problem Taxonomy

By Problem Category

Category	What it asks	Key technique
Count paths / ways	How many ways to reach target	`dp(pos, remaining)` counting recurrence
Optimal cost / value	Min/max cost over choices	`dp(i, j)` with `min`/`max` over transitions
String matching / edit	Transform or align two strings	`dp(i, j)` over two string indices
Interval problems	Best partition / merge of a range	`dp(l, r)` with split-point enumeration
Subset / assignment	Assign items to groups	`dp(i, mask)` bitmask state
Game theory	Optimal play	`dp(state)` returning win/loss or score
Digit counting	Count numbers with property up to N	`dp(pos, tight, ...)` digit DP
Tree / graph DP	Optimal structure on a tree	`dp(node, parent)` or re-rooting

Problem Signal → Technique

Signal in problem	Likely technique
"Number of ways to …"	Counting recurrence; `dp(pos, ...)`
"Minimum/maximum cost to …"	Optimization recurrence; `dp(i, j)`
"Can you reach …"	Boolean memoization; return True/False
Two strings, comparing characters	2D `dp(i, j)` on string indices
Subarray / subrange, combine adjacent	Interval DP `dp(l, r)`
Subset of items, n ≤ 20	Bitmask DP `dp(mask)`
Digits of a number, "count integers ≤ N"	Digit DP with `tight` flag
Game, two players alternate	Game DP, minimax
"Return to any node in a tree" / sum-of-distances	Re-rooting DP
Recursion gives TLE / time limit	Add `@cache`; identify overlapping subproblems

Common Patterns

Pattern	When it applies	Mechanism
State compression via tuple	State includes variable-length sequence	Convert list/set to sorted tuple before caching
Freeze mutable argument	Input is a list passed recursively	`tuple(lst)` or `frozenset(s)` as key
Pass indices, not slices	Function receives a subarray	Use `(l, r)` indices into a global array; avoid slicing (copies break equality)
Explicit seen parameter	Visited set in graph traversal must be part of state	`frozenset(visited)` in key; only feasible for small graphs
Default-argument cache injection	Avoid global variable for cache	`def f(i, memo={}):` — but beware of shared state across test cases
`@cache` + `sys.setrecursionlimit`	Deep recursion in Python	Always pair these; set limit before calling the top-level function
Clear cache between test cases	Reusing the function in multiple test runs	`f.cache_clear()` after each test; or use a class with instance-level cache

Edge Cases & Pitfalls

Mutable key argument — passing a list directly to a @cache-decorated function raises TypeError: unhashable type: 'list'. Convert to tuple first; @cache requires all arguments to be hashable.
Missing state in key — if a variable affects the return value but isn't in the function signature (e.g., a global index used inside the function), caching by (i, j) alone returns stale results when that hidden variable changes. All varying state must be explicit in the key.
Shared cache across calls — @cache on a module-level function persists across multiple test-case invocations. Call .cache_clear() or define the function inside the problem method to get a fresh cache per call.
Recursion depth exceeded — Python's default limit is 1000. Deep memoization on inputs of size 500+ will crash. Add sys.setrecursionlimit(200000) before the call, or convert to iterative bottom-up.
Returning None as a valid value — if key in memo: return memo[key] is the safe check. Avoid if memo.get(key): which treats a cached 0 or False as a cache miss.
Off-by-one in base cases — base cases must be checked before any cache lookup to avoid storing incorrect results. Placing them after the cache check can cause a cache hit that skips the base case on re-entry.
Counting vs. existence conflation — a boolean memoization (True/False) is distinct from a count memoization (int). Using the wrong return type silently produces wrong answers; the cache hits will propagate the error.
Interval DP asymmetry — dp(l, r) with l > r must be handled explicitly (typically return 0 or the identity). Forgetting this base case causes infinite recursion when splits produce empty intervals.

Implementation Notes

functools.cache (Python 3.9+) is identical to lru_cache(maxsize=None) and is the preferred spelling.
@cache does not limit cache size; for memory-constrained environments use @lru_cache(maxsize=N) where N is tuned to the problem's state space.
Nested functions decorated with @cache create a new cache per outer-function call, which is usually desirable for isolation but adds overhead if the outer function is called in a tight loop.
When all arguments are integers bounded by the input size, a 2D/3D list (bottom-up table) is faster than a dict cache (hash overhead per lookup); switch to tabulation for tight time limits.
tuple(lst) is O(n); if key construction dominates runtime, pre-compute keys or redesign state to use integers.
In Python, @cache on a method caches with self in the key — every distinct object gets its own cache, which is usually correct but can leak memory if many objects are created.

Cross-Topic Relations

Related topic	Specific connection
dynamic-programming.md	Memoization is the top-down formulation of DP; every memoized solution has a bottom-up equivalent
recursion.md	Memoization requires a working recursive solution first; the recurrence structure drives state design
backtracking.md	Backtracking explores all paths and undoes state; memoization assumes subproblem results are reusable (no undo needed). Cannot memoize when state includes the path taken
divide-and-conquer.md	D&C has disjoint subproblems — memoization adds no benefit; overlap is the key differentiator
bitmask.md	Bitmask DP uses memoization with a bitmask dimension; frequently combined for subset/assignment problems
graph-theory.md	Memoization on a DAG is equivalent to bottom-up DP in topological order; cycle detection is required before memoizing on a general graph

Interview Reference

Must-Know Problems

1D state

Climbing Stairs (LeetCode 70) — classic overlapping subproblem introduction
House Robber (198) — adjacent constraint
Coin Change (322) — unbounded knapsack
Word Break (139) — string segmentation

2D state — two sequences

Longest Common Subsequence (1143)
Edit Distance (72)
Regular Expression Matching (10) — null/wildcard transitions are the non-obvious base cases

2D state — grid

Unique Paths (62) / Unique Paths II (63)
Minimum Path Sum (64)
Cherry Pickup II (1463) — two simultaneous walkers, one 3D state

Interval DP

Burst Balloons (312) — "what if I pop this last?" framing
Strange Printer (664)
Palindrome Partitioning II (132)

Bitmask DP

Shortest Path Visiting All Nodes (847)
Partition to K Equal Sum Subsets (698)

Digit DP

Non-negative Integers without Consecutive Ones (600)
Count Numbers with Unique Digits (357) — simpler entry point

Tree DP

House Robber III (337) — return two values per node (rob vs. skip)
Binary Tree Cameras (968)

Clarification Questions to Ask

"Are there negative values or cycles in the graph?" — determines whether memoization is safe (requires a DAG or bounded state).
"Can the same item / character be reused?" — distinguishes bounded from unbounded knapsack; changes state transitions.
"What is the range of inputs?" — determines whether a dict cache or array-based table is appropriate.
"Is the answer a count or a yes/no?" — return type affects how base cases and transitions are written.
"Does 'path' mean any route or a simple (non-revisiting) path?" — simple paths on a general graph cannot be memoized without tracking visited nodes, blowing up state space.

Common Interview Mistakes

Writing the recursion correctly but forgetting @cache, producing O(2ⁿ) time and TLE.
Putting mutable objects (lists, sets) into the cache key without converting to a hashable type.
Defining base cases inside a branch that's short-circuited by the cache check, so the base case never fires on re-entry.
Not clearing the cache between test cases when the decorated function is defined at module scope.
Conflating "memoization is always better" — for dense state spaces with all states reachable, bottom-up tabulation avoids recursion overhead and stack limits.

Typical Follow-Up Escalations

"Now optimize space" → identify which previous states are still needed; convert to rolling array / bottom-up with reduced dimensions.
"The values are too large for an array (n = 10⁹)" → switch to a dict-based cache; states are sparse so array indexing is infeasible.
"Can you do it iteratively?" → build the bottom-up table in topological order of the subproblem DAG.
"What if there are cycles in the graph?" → memoization alone is unsafe; need cycle detection (visited set) or reformulate as shortest-path (Bellman-Ford / Dijkstra).
"Handle multiple test cases efficiently" → restructure so the cache is built once and queried per test case (offline processing), rather than cleared and rebuilt each time.

Explore Unread

Great job! You've read all available articles

Memoization

Core Concepts

Types / Variants

Key Algorithms

Top-Down DP via Memoization

Cache Key Construction for Complex State

Interval DP (Top-Down)

Digit DP (Top-Down)

Bitmask DP (Top-Down)

Advanced Techniques

Broken-Profile / Rolling State Memoization

Memoization with Continuation Passing (Re-rooting)

Lazy Memoization with Sentinel Values

Problem Taxonomy

By Problem Category

Problem Signal → Technique

Common Patterns

Edge Cases & Pitfalls

Implementation Notes

Cross-Topic Relations

Interview Reference

Must-Know Problems

Clarification Questions to Ask

Common Interview Mistakes

Typical Follow-Up Escalations

Read Next

Explore Unread

Memoization

Core Concepts

Types / Variants

Key Algorithms

Top-Down DP via Memoization

Cache Key Construction for Complex State

Interval DP (Top-Down)

Digit DP (Top-Down)

Bitmask DP (Top-Down)

Advanced Techniques

Broken-Profile / Rolling State Memoization

Memoization with Continuation Passing (Re-rooting)

Lazy Memoization with Sentinel Values

Problem Taxonomy

By Problem Category

Problem Signal → Technique

Common Patterns

Edge Cases & Pitfalls

Implementation Notes

Cross-Topic Relations

Interview Reference

Must-Know Problems

Clarification Questions to Ask

Common Interview Mistakes

Typical Follow-Up Escalations

Read Next

Explore Unread