Complexity Analysis: A Comprehensive Guide
1. Foundational Understanding
1.1 What is Complexity Analysis?
Complexity analysis is the mathematical framework for analyzing and expressing how the runtime and memory requirements of an algorithm scale with input size. It provides a language-independent, machine-independent way to compare algorithms and predict their performance characteristics.
Primary Components:
- Time Complexity: How runtime scales with input size
- Space Complexity: How memory usage scales with input size
- Case Analysis: Best, average, and worst-case scenarios
- Amortized Analysis: Average cost over sequences of operations
- Asymptotic Notation: Mathematical framework for expressing growth rates
Core idea: Abstract away constant factors and hardware specifics, focusing on growth rates so that analyses remain valid across machines and time.
Why Complexity Analysis Exists:
- Abstraction: Focus on what matters (growth rate) versus what doesn't (constants, hardware)
- Scalability: Understand behavior as problem size grows
- Comparison: Evaluate algorithms independently of implementation
- Optimization: Identify which improvements have meaningful impact
- Theoretical limits: Establish what's computationally feasible
- Resource planning: Predict memory and CPU requirements
Why Space Complexity Analysis Specifically:
- Memory constraints: RAM is finite and expensive at scale
- Performance impact: Memory access patterns affect speed (cache misses, page faults)
- Scalability: Determines maximum problem size solvable
- Cost optimization: Cloud computing charges for memory usage
- Embedded systems: Strict memory budgets in IoT, automotive, aerospace domains
- Big data processing: Processing datasets larger than available RAM
1.2 Historical Context and Breakthroughs
Key Developments:
- Formalization of complexity classes (P, NP, etc.)
- Development of in-place algorithms (heap sort)
- Master theorem for divide-and-conquer recurrences
- Amortized analysis techniques (dynamic arrays, splay trees)
- Lower bound proofs (Ω(n log n) for comparison sorting)
- Space-efficient data structures (succinct data structures)
- Streaming algorithms for limited memory
1.3 Scope and Limitations
What Complexity Analysis Covers:
- Asymptotic behavior (n → ∞)
- Growth rates and scalability
- Comparative algorithm performance
- Resource requirements prediction
What It Doesn't Cover:
- Exact runtime in seconds/milliseconds
- Hardware-specific performance
- Cache effects and memory hierarchy
- Programming language overhead
- Constant factors and hidden coefficients
- Small input performance (n < 100)
1.4 Assumptions and Requirements
Analysis Assumptions:
For Time Complexity:
- Input size n approaches infinity (asymptotic behavior)
- All basic operations take constant time
- Random Access Memory (RAM) model of computation
- Standard arithmetic operations are O(1)
- Inputs are well-formed and valid
For Space Complexity:
- RAM model: Random access memory, constant-time access
- Word size: O(log n) bits to address n elements
- Garbage collection: Freed memory is immediately available (idealized)
- No memory fragmentation: Contiguous allocation when possible
- Character encoding: 1-4 bytes (encoding dependent)
Preconditions:
- Algorithm must be correct (complexity analysis doesn't validate correctness)
- Clear definition of "input size"
- Identification of the basic operation to count
- Understanding of language/system memory management
Postconditions:
- Complexity bound applies for all inputs of size n
- Bound is valid in the limit as n → ∞
- Guarantees hold only if assumptions are met
2. Asymptotic Notation and Growth Rates
2.1 Core Asymptotic Notations
Big O (O) - Upper Bound
Mathematical Definition:
f(n) = O(g(n)) if and only if there exist positive constants c and n₀ such that:
0 ≤ f(n) ≤ c·g(n) for all n ≥ n₀
Limit Definition:
f(n) = O(g(n)) ⟺ lim sup(n→∞) [f(n)/g(n)] < ∞
Meaning: f(n) grows no faster than g(n), up to constant factors
Big Omega (Ω) - Lower Bound
Mathematical Definition:
f(n) = Ω(g(n)) if and only if there exist positive constants c and n₀ such that:
0 ≤ c·g(n) ≤ f(n) for all n ≥ n₀
Limit Definition:
f(n) = Ω(g(n)) ⟺ lim inf(n→∞) [f(n)/g(n)] > 0
Meaning: f(n) grows at least as fast as g(n)
Big Theta (Θ) - Tight Bound
Mathematical Definition:
f(n) = Θ(g(n)) if and only if:
f(n) = O(g(n)) AND f(n) = Ω(g(n))
Limit Definition:
f(n) = Θ(g(n)) ⟺ 0 < lim(n→∞) [f(n)/g(n)] < ∞
Meaning: f(n) grows exactly at the same rate as g(n)
Little o (o) - Strict Upper Bound
Mathematical Definition:
f(n) = o(g(n)) if and only if:
For every positive constant c > 0, there exists n₀ such that:
0 ≤ f(n) < c·g(n) for all n ≥ n₀
Limit Definition:
f(n) = o(g(n)) ⟺ lim(n→∞) [f(n)/g(n)] = 0
Meaning: f(n) grows strictly slower than g(n)
Examples:
n = o(n²) ✓
n² ≠ o(n²) ✗
log n = o(n) ✓
Little omega (ω) - Strict Lower Bound
Mathematical Definition:
f(n) = ω(g(n)) if and only if:
For every positive constant c > 0, there exists n₀ such that:
0 ≤ c·g(n) < f(n) for all n ≥ n₀
Limit Definition:
f(n) = ω(g(n)) ⟺ lim(n→∞) [f(n)/g(n)] = ∞
Meaning: f(n) grows strictly faster than g(n)
Relationship: f(n) = ω(g(n)) ⟺ g(n) = o(f(n))
2.2 Properties and Axioms
Transitivity:
If f(n) = O(g(n)) and g(n) = O(h(n)), then f(n) = O(h(n))
Same for Ω, Θ, o, ω
Reflexivity:
f(n) = O(f(n))
f(n) = Ω(f(n))
f(n) = Θ(f(n))
Symmetry (Θ only):
If f(n) = Θ(g(n)), then g(n) = Θ(f(n))
Transpose Symmetry:
f(n) = O(g(n)) ⟺ g(n) = Ω(f(n))
f(n) = o(g(n)) ⟺ g(n) = ω(f(n))
Trichotomy (almost):
For any two functions f(n) and g(n), exactly one holds:
1. f(n) = o(g(n))
2. f(n) = Θ(g(n))
3. f(n) = ω(g(n))
Exception: Non-comparable functions (e.g., n and n^(1+sin n))
2.3 Asymptotic Arithmetic
Addition:
f₁(n) = O(g₁(n)) and f₂(n) = O(g₂(n))
⟹ f₁(n) + f₂(n) = O(max(g₁(n), g₂(n)))
Examples:
O(n²) + O(n) = O(n²)
Θ(n log n) + Θ(n) = Θ(n log n)
Multiplication:
f₁(n) = O(g₁(n)) and f₂(n) = O(g₂(n))
⟹ f₁(n) · f₂(n) = O(g₁(n) · g₂(n))
Examples:
O(n) · O(n) = O(n²)
Θ(log n) · Θ(n) = Θ(n log n)
Exponentiation (a common trap):
f(n) = O(g(n)) does NOT imply aᶠ⁽ⁿ⁾ = O(aᵍ⁽ⁿ⁾):
2n = O(n), yet 2²ⁿ = 4ⁿ ≠ O(2ⁿ)
The implication holds when f(n) ≤ g(n) + O(1)
Logarithm Base Conversion:
log_a(n) = Θ(log_b(n)) for all a, b > 1
Reason: log_a(n) = log_b(n) / log_b(a), where log_b(a) is constant
Implication: Base doesn't matter in asymptotic notation!
O(log₂ n) = O(log₁₀ n) = O(ln n)
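The base-conversion identity is easy to sanity-check numerically — a small sketch using the standard math module:

```python
import math

# log₂(n) / log₁₀(n) is the constant log₂(10) ≈ 3.3219, independent of n
for n in (10, 1_000, 1_000_000):
    ratio = math.log2(n) / math.log10(n)
    assert abs(ratio - math.log2(10)) < 1e-9
```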
2.4 Common Simplification Rules
Rule 1: Drop constant factors
5n² = Θ(n²)
(1/2)n log n = Θ(n log n)
1000000 = Θ(1)
Rule 2: Drop lower-order terms
n² + n = Θ(n²)
n³ + n² + n + 1 = Θ(n³)
n log n + n = Θ(n log n)
Rule 3: Logarithm properties
log(ab) = log a + log b
log(aᵇ) = b log a
log(n!) = Θ(n log n) (Stirling's approximation)
Rule 4: Exponential dominance
For any polynomial p(n):
p(n) = o(aⁿ) for any a > 1
n¹⁰⁰⁰ = o(1.001ⁿ)
Rule 5: Polynomial hierarchy
For constants a < b:
nᵃ = o(nᵇ)
n² = o(n³)
√n = o(n)
2.5 Growth Rate Hierarchy
Complete Hierarchy (Slow to Fast):
O(1) Constant
< O(α(n)) Inverse Ackermann
< O(log* n) Iterated logarithm
< O(log log n) Log-log
< O(log n) Logarithmic
< O((log n)²) Log-squared
< O(√n) Square root
< O(n) Linear
< O(n log log n) Linear log-log
< O(n log n) Linearithmic
< O(n (log n)²) Linear log-squared
< O(n^1.5) Superlinear
< O(n²) Quadratic
< O(n² log n) Quadratic log
< O(n³) Cubic
< O(n^k) Polynomial (constant k)
< O(2^n) Exponential
< O(n!) Factorial
< O(n^n) Super-exponential
Polynomial vs. Exponential:
Polynomial: O(n^k) for constant k
- n, n², n³, n¹⁰⁰
- Tractable (P complexity class)
Exponential: O(k^n) for constant k > 1
- 2^n, 3^n
- Intractable
Key: Any polynomial = o(exponential)
n¹⁰⁰⁰ = o(1.001^n)
2.6 Comparing Growth Rates
Using L'Hôpital's Rule:
Compare n² and 2ⁿ:
lim(n→∞) [n²/2ⁿ]
= lim(n→∞) [2n/(2ⁿ ln 2)] (L'Hôpital once)
= lim(n→∞) [2/(2ⁿ (ln 2)²)] (L'Hôpital again)
= 0
Therefore: n² = o(2ⁿ), or 2ⁿ = ω(n²)
Common Proofs:
Prove: log n = o(n)
lim(n→∞) [log n / n]
= lim(n→∞) [(1/n) / 1] (L'Hôpital)
= lim(n→∞) [1/n]
= 0 ✓
Prove: n = o(n log n)
lim(n→∞) [n / (n log n)] = lim(n→∞) [1 / log n] = 0 ✓
Prove: n log n = o(n²)
lim(n→∞) [(n log n) / n²] = lim(n→∞) [log n / n] = 0 ✓
2.7 Common Summation Formulas
Σᵢ₌₁ⁿ i = n(n+1)/2 = Θ(n²)
Σᵢ₌₁ⁿ i² = n(n+1)(2n+1)/6 = Θ(n³)
Σᵢ₌₀ⁿ 2^i = 2^(n+1) - 1 = Θ(2^n)
Σᵢ₌₁ⁿ 1/i = Θ(log n) (harmonic series)
Σᵢ₌₀^∞ r^i = 1/(1-r) for |r| < 1
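The finite closed forms above can be spot-checked in a few lines (the harmonic check uses the fact that Hₙ − ln n converges to the Euler–Mascheroni constant ≈ 0.577):

```python
import math

n = 100
assert sum(range(1, n + 1)) == n * (n + 1) // 2
assert sum(i * i for i in range(1, n + 1)) == n * (n + 1) * (2 * n + 1) // 6
assert sum(2 ** i for i in range(n + 1)) == 2 ** (n + 1) - 1

# Harmonic series: H_n ≈ ln(n) + 0.577, i.e. Θ(log n)
h_n = sum(1 / i for i in range(1, n + 1))
assert 0.5 < h_n - math.log(n) < 0.6
```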
2.8 Proof Techniques
Direct Proof Using Definition:
Example: Prove 3n² + 5n + 2 = O(n²)
Goal: Show ∃c, n₀ such that 3n² + 5n + 2 ≤ c·n² for all n ≥ n₀
Proof:
3n² + 5n + 2 ≤ 3n² + 5n² + 2n² (for n ≥ 1, since n ≤ n² and 1 ≤ n²)
= 10n²
= c·n² where c = 10
Choose c = 10 and n₀ = 1.
For all n ≥ 1: 3n² + 5n + 2 ≤ 10n²
Therefore, 3n² + 5n + 2 = O(n²) ✓
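The chosen constants c = 10, n₀ = 1 can be spot-checked numerically — a sanity check, not a substitute for the proof:

```python
# Verify 3n² + 5n + 2 ≤ 10n² for n = 1 .. 10000
assert all(3 * n * n + 5 * n + 2 <= 10 * n * n for n in range(1, 10_001))
```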
Limit Method:
To prove f(n) = O(g(n)):
Calculate lim(n→∞) [f(n)/g(n)]
If limit is finite, f(n) = O(g(n))
Example: Prove n log n = O(n²)
lim(n→∞) [n log n / n²]
= lim(n→∞) [log n / n]
= 0 (by L'Hôpital or known fact)
Therefore n log n = O(n²) ✓
Substitution Method for Recurrences:
1. Guess the form of the solution
2. Use mathematical induction to prove it
3. Verify base cases
Example: T(n) = 2T(n/2) + n
Guess: T(n) = O(n log n)
Inductive hypothesis: T(k) ≤ ck log k for k < n
Proof:
T(n) = 2T(n/2) + n
≤ 2·c(n/2)·log(n/2) + n
= cn·log(n/2) + n
= cn·(log n - log 2) + n
= cn log n - cn + n
≤ cn log n (if c ≥ 1)
Therefore T(n) = O(n log n) ✓
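The guess can also be supported by unrolling the recurrence numerically (assuming T(1) = 1 as the base case; for powers of two the exact solution is T(n) = n·log₂ n + n):

```python
import math
from functools import lru_cache

@lru_cache(maxsize=None)
def T(n):
    """Unroll T(n) = 2T(n/2) + n with T(1) = 1."""
    if n <= 1:
        return 1
    return 2 * T(n // 2) + n

# For n = 2^k: T(n) = n·k + n, comfortably within c·n·log₂(n) for c = 2
for k in (1, 4, 10, 20):
    n = 2 ** k
    assert T(n) == n * k + n
    assert T(n) <= 2 * n * math.log2(n)
```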
3. Time Complexity Analysis
3.1 Common Complexity Classes
O(1) - Constant Time
Examples:
- Array access: arr[5]
- Arithmetic operations: a + b
- Hash table operations (average): hash_map[key]
- Variable assignment
Characteristics:
- Independent of input size
- Same time regardless of n
Sublinear: o(n) - Faster than Linear
Grows slower than linear but faster than constant
Examples:
- O(log n): Binary search
- O(√n): Finding divisors up to square root
- O(log log n): Interpolation search on uniform data
- O(n^ε) for ε < 1: Fractional exponents
Characteristics:
- Doesn't examine all input elements
- Requires special structure (sorted, indexed, etc.)
- Critical for very large datasets
Applications:
- Streaming algorithms: Process data with limited memory
- Sketching algorithms: Approximate answers with small space
- Index structures: Fast lookups without full scan
O(log n) - Logarithmic
Examples:
- Binary search
- Balanced tree operations (BST, AVL, Red-Black)
- Finding number of digits: log₁₀(n)
Characteristics:
- Dividing problem in half repeatedly
- Very efficient: log₂(10⁹) ≈ 30
Code Pattern:
while i > 1:
    i = i // 2
O(n) - Linear
Examples:
- Array traversal
- Linear search
- Finding min/max
- Single loop
Code Pattern:
for item in arr:
    process(item)
O(n log n) - Linearithmic
Examples:
- Efficient sorting (merge sort, heap sort, quicksort average)
- Building heap from array
- Many divide-and-conquer algorithms
Characteristics:
- Optimal for comparison-based sorting
- Very practical complexity
Code Pattern:
def merge_sort(arr):
    if len(arr) <= 1:
        return arr
    mid = len(arr) // 2
    left = merge_sort(arr[:mid])    # T(n/2)
    right = merge_sort(arr[mid:])   # T(n/2)
    return merge(left, right)       # Θ(n)
O(n²) - Quadratic
Examples:
- Nested loops
- Bubble sort, insertion sort, selection sort
- Simple matrix operations
Code Pattern:
for i in range(n):
    for j in range(n):
        # O(1) operation
O(n³) - Cubic
Examples:
- Matrix multiplication (naive)
- Triple nested loops
- All triples enumeration
Acceptable for: n < 500 typically
O(2ⁿ) - Exponential
Examples:
- Subset generation: all 2ⁿ subsets
- Naive recursive Fibonacci
- Brute force search
Characteristics:
- Intractable for n > 30
- Doubles with each increment
- Often requires optimization (DP, memoization)
Example:
def fibonacci_naive(n):
    if n <= 1:
        return n
    return fibonacci_naive(n-1) + fibonacci_naive(n-2)
# T(n) = Θ(φⁿ) where φ ≈ 1.618
O(n!) - Factorial
Examples:
- Generating all permutations
- Traveling salesman (brute force)
Characteristics:
- Only feasible for n ≤ 10-12
- Extremely fast growth
- Requires heuristics or approximation
3.2 Analysis Process
Step-by-Step Guide:
1. Identify the basic operation
- Comparisons, assignments, arithmetic?
- What are we counting?
2. Define input size n
- Array: number of elements
- Graph: vertices (V) and/or edges (E)
- String: length
- Matrix: dimensions
3. Count operations as function of n
- How many times does operation execute?
4. Simplify using asymptotic rules
- Drop constant factors
- Drop lower-order terms
- Keep dominant term
5. Express in notation
- O for upper bound
- Θ for tight bound
- Ω for lower bound
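Steps 3–5 can be checked empirically by instrumenting code with an operation counter and doubling n — a rough sketch that confirms, rather than replaces, the math (bubble sort on a reverse-sorted input is used as the example):

```python
def bubble_comparisons(n):
    """Count comparisons of bubble sort on a reverse-sorted (worst-case) input."""
    arr = list(range(n, 0, -1))
    count = 0
    for i in range(len(arr)):
        for j in range(len(arr) - 1 - i):
            count += 1  # the chosen basic operation: one comparison
            if arr[j] > arr[j + 1]:
                arr[j], arr[j + 1] = arr[j + 1], arr[j]
    return count

# Matches the closed form n(n-1)/2, and doubling n roughly quadruples the
# count — the signature of Θ(n²)
assert bubble_comparisons(100) == 100 * 99 // 2
ratio = bubble_comparisons(200) / bubble_comparisons(100)
assert 3.9 < ratio < 4.1
```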
3.3 Pattern Recognition
Pattern 1: Single loop - O(n)
for i in range(n):
    # O(1) operation
# T(n) = n × O(1) = O(n)
Pattern 2: Nested loops (same size) - O(n²)
for i in range(n):
    for j in range(n):
        # O(1) operation
# T(n) = n × n × O(1) = O(n²)
Pattern 3: Nested loops (dependent) - O(n²)
for i in range(n):
    for j in range(i, n):  # Inner depends on i
        # O(1) operation
# T(n) = n + (n-1) + ... + 1 = n(n+1)/2 = O(n²)
Pattern 4: Halving input - O(log n)
i = n
while i > 1:
    i = i // 2
# Iterations: log₂(n)
Pattern 5: Divide and conquer - O(n log n) typically
def merge_sort(arr):
    if len(arr) <= 1:
        return arr
    mid = len(arr) // 2
    left = merge_sort(arr[:mid])    # T(n/2)
    right = merge_sort(arr[mid:])   # T(n/2)
    return merge(left, right)       # O(n)
# Recurrence: T(n) = 2T(n/2) + O(n)
# Solution: T(n) = O(n log n)
Pattern 6: Multiple independent operations - Take maximum
def process(arr):
    step1(arr)  # O(n)
    step2(arr)  # O(n log n)
    step3(arr)  # O(n)
# Total: O(n) + O(n log n) + O(n) = O(n log n)
3.4 Detailed Examples
Example 1: Dependent Loops
def example(arr):
    n = len(arr)
    count = 0
    for i in range(n):
        for j in range(i, n):  # j starts at i
            count += 1
    return count

# Analysis:
# i = 0: j runs from 0 to n-1 → n iterations
# i = 1: j runs from 1 to n-1 → n-1 iterations
# i = 2: j runs from 2 to n-1 → n-2 iterations
# Total: n + (n-1) + (n-2) + ... + 1
#      = n(n+1)/2
#      = Θ(n²)
Example 2: Logarithmic with Inner Loop
def example(n):
    count = 0
    i = 1
    while i < n:
        j = 1
        while j < n:
            count += 1
            j *= 2  # Inner: log n iterations
        i *= 2  # Outer: log n iterations
    return count

# Outer loop: Θ(log n) iterations
# Inner loop: Θ(log n) iterations per outer
# Total: Θ(log n) × Θ(log n) = Θ((log n)²)
Example 3: Harmonic Series
def example(n):
    total = 0
    for i in range(1, n + 1):
        for j in range(0, n, i):  # Inner loop runs ⌈n/i⌉ times
            total += 1
    return total

# For each i, the inner loop runs about n/i times:
# Total: Σᵢ₌₁ⁿ n/i = n · Σᵢ₌₁ⁿ 1/i = n · Hₙ = Θ(n log n)
# The harmonic series Hₙ = Θ(log n) drives the bound
# Despite nested loops, this is Θ(n log n), not Θ(n²)!
Example 4: Modified Binary Search
def modified_binary_search(arr, target):
    left, right = 0, len(arr) - 1
    while left <= right:
        mid = (left + right) // 2
        if condition(arr[mid]):  # O(n) operation
            return mid
        elif arr[mid] < target:
            left = mid + 1
        else:
            right = mid - 1
    return -1

# Analysis: O(log n) iterations × O(n) per iteration = O(n log n)
Example 5: Mystery Function
def mystery(n):
    i = 1
    s = 1
    while s <= n:
        i += 1
        s += i

# Analysis:
# s = 1 + 2 + 3 + ... + k = k(k+1)/2
# When s reaches n: k(k+1)/2 ≈ n
# k ≈ √(2n) = O(√n)
# Answer: O(√n)
3.5 Algorithm Complexity Reference
Sorting Algorithms:
| Algorithm      | Best       | Average    | Worst      | Space    | Stable |
| -------------- | ---------- | ---------- | ---------- | -------- | ------ |
| Bubble Sort    | Θ(n)       | Θ(n²)      | Θ(n²)      | O(1)     | Yes    |
| Insertion Sort | Θ(n)       | Θ(n²)      | Θ(n²)      | O(1)     | Yes    |
| Selection Sort | Θ(n²)      | Θ(n²)      | Θ(n²)      | O(1)     | No     |
| Merge Sort     | Θ(n log n) | Θ(n log n) | Θ(n log n) | O(n)     | Yes    |
| Quick Sort     | Θ(n log n) | Θ(n log n) | Θ(n²)      | O(log n) | No     |
| Heap Sort      | Θ(n log n) | Θ(n log n) | Θ(n log n) | O(1)     | No     |
| Counting Sort  | Θ(n+k)     | Θ(n+k)     | Θ(n+k)     | O(k)     | Yes    |
| Radix Sort     | Θ(n·k)     | Θ(n·k)     | Θ(n·k)     | O(n+k)   | Yes    |
Data Structure Operations:
| Structure          | Access    | Search    | Insert    | Delete    | Space |
| ------------------ | --------- | --------- | --------- | --------- | ----- |
| Array              | O(1)      | O(n)      | O(n)      | O(n)      | O(n)  |
| Dynamic Array      | O(1)      | O(n)      | O(1)*     | O(n)      | O(n)  |
| Linked List        | O(n)      | O(n)      | O(1)      | O(1)      | O(n)  |
| Stack              | O(n)      | O(n)      | O(1)      | O(1)      | O(n)  |
| Queue              | O(n)      | O(n)      | O(1)      | O(1)      | O(n)  |
| Hash Table         | N/A       | O(1)*     | O(1)*     | O(1)*     | O(n)  |
| Binary Search Tree | O(log n)* | O(log n)* | O(log n)* | O(log n)* | O(n)  |
| AVL Tree           | O(log n)  | O(log n)  | O(log n)  | O(log n)  | O(n)  |
| Red-Black Tree     | O(log n)  | O(log n)  | O(log n)  | O(log n)  | O(n)  |
| B-Tree             | O(log n)  | O(log n)  | O(log n)  | O(log n)  | O(n)  |
| Heap               | N/A       | O(n)      | O(log n)  | O(log n)  | O(n)  |
*Amortized or average case
Graph Algorithms:
| Algorithm                 | Time           | Space |
| ------------------------- | -------------- | ----- |
| BFS                       | O(V+E)         | O(V)  |
| DFS                       | O(V+E)         | O(V)  |
| Dijkstra (binary heap)    | O((V+E) log V) | O(V)  |
| Dijkstra (Fibonacci heap) | O(E + V log V) | O(V)  |
| Bellman-Ford              | O(VE)          | O(V)  |
| Floyd-Warshall            | O(V³)          | O(V²) |
| Kruskal's MST             | O(E log E)     | O(V)  |
| Prim's MST                | O(E log V)     | O(V)  |
4. Space Complexity Analysis
4.1 Fundamental Concepts
Space Complexity Components:
Total Space = Input Space + Auxiliary Space
- Input Space: Memory for input data (usually O(n))
- Auxiliary Space: Extra memory during execution (what we optimize)
What Counts Toward Space:
- Variables (primitives)
- Data structures (arrays, lists, hash tables, trees)
- Stack space (function calls, recursion depth)
- Heap allocations (dynamic memory)
- Return values (if significant)
What Doesn't Count:
- Instruction code
- Constants and literals
- Compiler/interpreter overhead
4.2 Analysis Process
Step-by-Step:
1. Identify input size: n elements, dimensions, graph size
2. Count variables: How many primitive variables?
3. Analyze data structures: What auxiliary structures created?
4. Examine recursion: Maximum call stack depth?
5. Calculate peak usage: Sum all simultaneously alive
6. Simplify to dominant term
7. Express in Big O notation
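Step 5 (peak usage) can be observed empirically with Python's tracemalloc module — a rough sketch, since interpreter overhead blurs exact numbers (reset_peak needs Python 3.9+):

```python
import tracemalloc

def copy_list(arr):        # Θ(n) auxiliary: builds a new list
    return [x for x in arr]

def reverse_inplace(arr):  # Θ(1) auxiliary: a few index variables
    i, j = 0, len(arr) - 1
    while i < j:
        arr[i], arr[j] = arr[j], arr[i]
        i, j = i + 1, j - 1

data = list(range(100_000))
tracemalloc.start()
copy_list(data)
copy_peak = tracemalloc.get_traced_memory()[1]   # peak bytes so far
tracemalloc.reset_peak()
reverse_inplace(data)
inplace_peak = tracemalloc.get_traced_memory()[1]
tracemalloc.stop()
assert copy_peak > 10 * inplace_peak  # the O(n) copy dwarfs the O(1) reversal
```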
4.3 Common Space Complexities
O(1) - Constant Space (In-Place)
# In-place swap
def swap(arr, i, j):
    arr[i], arr[j] = arr[j], arr[i]
# Only constant extra storage: O(1)

# In-place reversal
def reverse_inplace(arr):
    left, right = 0, len(arr) - 1
    while left < right:
        arr[left], arr[right] = arr[right], arr[left]
        left += 1
        right -= 1
# Variables: left, right → O(1) auxiliary
O(log n) - Logarithmic Space
# Iterative binary search: O(1) space
def binary_search_iterative(arr, target):
    left, right = 0, len(arr) - 1
    while left <= right:
        mid = (left + right) // 2
        if arr[mid] == target:
            return mid
        elif arr[mid] < target:
            left = mid + 1
        else:
            right = mid - 1
    return -1
# Space: O(1) - just variables

# Recursive binary search: O(log n) space
def binary_search_recursive(arr, target, left, right):
    if left > right:
        return -1
    mid = (left + right) // 2
    if arr[mid] == target:
        return mid
    elif arr[mid] < target:
        return binary_search_recursive(arr, target, mid + 1, right)
    else:
        return binary_search_recursive(arr, target, left, mid - 1)
# Stack depth: log₂(n) (halving each time)
# Space: O(log n)
O(n) - Linear Space
# Creating a copy
def copy_and_modify(arr):
    result = []
    for item in arr:
        result.append(item * 2)
    return result
# Space: O(n) for result

# Frequency counting
def count_frequency(arr):
    freq = {}
    for item in arr:
        freq[item] = freq.get(item, 0) + 1
    return freq
# Space: O(n) worst case (all unique)
# Space: O(1) best case (all same)
O(n log n) - Linearithmic Space
# Merge sort (naive - creating copies)
def merge_sort_naive(arr):
    if len(arr) <= 1:
        return arr
    mid = len(arr) // 2
    left = merge_sort_naive(arr[:mid])   # Creates copy
    right = merge_sort_naive(arr[mid:])  # Creates copy
    return merge(left, right)            # Creates new array
# Allocations total Θ(n) per level × Θ(log n) levels = Θ(n log n)
# (peak live memory is Θ(n) if freed slices are reclaimed promptly)

# Optimized merge sort: O(n) space
def merge_sort_optimized(arr):
    aux = [0] * len(arr)  # Reused across all merges
    merge_sort_helper(arr, aux, 0, len(arr) - 1)
# Space: O(n) for aux + O(log n) for recursion = O(n)
O(n²) - Quadratic Space
# Adjacency matrix for graph
def create_adjacency_matrix(n):
    matrix = [[0] * n for _ in range(n)]
    return matrix
# Space: O(n²)

# All pairs shortest paths
def floyd_warshall(n):
    dist = [[float('inf')] * n for _ in range(n)]
    # ... algorithm
    return dist
# Space: O(n²)
O(2ⁿ) - Exponential Space
# Generate all subsets
def all_subsets(arr):
    result = []
    n = len(arr)
    for i in range(2 ** n):
        subset = []
        for j in range(n):
            if i & (1 << j):
                subset.append(arr[j])
        result.append(subset)
    return result
# Number of subsets: 2ⁿ
# Space: O(n · 2ⁿ) to store all subsets
4.4 Recursion and Stack Space
Call Stack Analysis:
# Linear recursion: O(n) stack space
def sum_recursive(arr, index=0):
    if index >= len(arr):
        return 0
    return arr[index] + sum_recursive(arr, index + 1)
# Stack depth: n
# Each frame: O(1)
# Total: O(n)

# Binary recursion: O(log n) stack space
def binary_search(arr, target, left, right):
    if left > right:
        return -1
    mid = (left + right) // 2
    if arr[mid] == target:
        return mid
    elif arr[mid] < target:
        return binary_search(arr, target, mid + 1, right)
    else:
        return binary_search(arr, target, left, mid - 1)
# Stack depth: log n
# Each frame: O(1)
# Total: O(log n)

# Tree recursion: O(n) worst case
def fibonacci_naive(n):
    if n <= 1:
        return n
    return fibonacci_naive(n-1) + fibonacci_naive(n-2)
# Maximum stack depth: n (leftmost branch)
# Space: O(n) despite exponential time
Tail Recursion Optimization:
# Non-tail recursive: O(n) space
def factorial(n):
    if n <= 1:
        return 1
    return n * factorial(n - 1)

# Tail recursive: O(n) space in Python (no TCO)
# But O(1) in languages with tail call optimization
def factorial_tail(n, acc=1):
    if n <= 1:
        return acc
    return factorial_tail(n - 1, n * acc)
4.5 Space Complexity by Data Structure
Linear Structures:
| Structure      | Space | Notes                       |
| -------------- | ----- | --------------------------- |
| Array (static) | O(n)  | Contiguous memory           |
| Dynamic Array  | O(n)  | + amortized overhead        |
| Linked List    | O(n)  | + pointer overhead per node |
| Stack          | O(n)  | Max n elements              |
| Queue          | O(n)  | Max n elements              |
| Deque          | O(n)  | Double-ended queue          |
Tree Structures:
| Structure      | Space                    | Notes                     |
| -------------- | ------------------------ | ------------------------- |
| Binary Tree    | O(n)                     | n nodes, 2 pointers each  |
| BST            | O(n)                     | Same as binary tree       |
| AVL Tree       | O(n)                     | + balance factor per node |
| Red-Black Tree | O(n)                     | + color bit per node      |
| B-Tree         | O(n)                     | Higher branching factor   |
| Trie           | O(ALPHABET_SIZE × N × L) | N words, avg length L     |
Graph Structures:
| Representation   | Space    | Best For          |
| ---------------- | -------- | ----------------- |
| Adjacency Matrix | O(V²)    | Dense graphs      |
| Adjacency List   | O(V + E) | Sparse graphs     |
| Edge List        | O(E)     | Simple algorithms |
Hash Structures:
| Structure  | Space | Load Factor           |
| ---------- | ----- | --------------------- |
| Hash Table | O(n)  | Typically 0.7-0.75    |
| Hash Set   | O(n)  | Similar to hash table |
4.6 Space Optimization Techniques
Technique 1: In-Place Modification
# Before: O(n) space - creates new array
def transform_bad(arr):
    return [x * 2 for x in arr]

# After: O(1) space - modifies in place
def transform_good(arr):
    for i in range(len(arr)):
        arr[i] *= 2
    return arr
Technique 2: Space Reuse
# Before: O(n) space for each transformation
def transform_bad(arr):
    step1 = [x * 2 for x in arr]
    step2 = [x + 1 for x in step1]
    step3 = [x ** 2 for x in step2]
    return step3

# After: O(n) space total, reused
def transform_good(arr):
    result = [x * 2 for x in arr]
    for i in range(len(result)):
        result[i] += 1
    for i in range(len(result)):
        result[i] **= 2
    return result
Technique 3: Iterative Instead of Recursive
# Recursive: O(n) stack space
def factorial_recursive(n):
    if n <= 1:
        return 1
    return n * factorial_recursive(n - 1)

# Iterative: O(1) space
def factorial_iterative(n):
    result = 1
    for i in range(2, n + 1):
        result *= i
    return result
Technique 4: Sliding Window
# Before: O(k×n) space - store all windows
def all_windows_bad(arr, k):
    windows = []
    for i in range(len(arr) - k + 1):
        windows.append(arr[i:i+k])
    return windows

# After: O(k) space - process one window at a time
from collections import deque

def max_in_window_optimized(arr, k):
    max_values = []
    dq = deque()  # Store indices
    for i in range(len(arr)):
        # Remove elements outside window
        if dq and dq[0] <= i - k:
            dq.popleft()
        # Maintain decreasing order
        while dq and arr[dq[-1]] < arr[i]:
            dq.pop()
        dq.append(i)
        if i >= k - 1:
            max_values.append(arr[dq[0]])
    return max_values
Technique 5: Bit Manipulation
# Before: O(n) space for boolean array
def mark_primes_array(n):
    is_prime = [True] * (n + 1)  # O(n) full list slots
    # ... sieve algorithm
    return is_prime

# After: O(n/8) space using bits
def mark_primes_bits(n):
    # Each byte stores 8 boolean values
    num_bytes = (n + 1 + 7) // 8
    is_prime = bytearray(num_bytes)
    # ... sieve with bit operations
    return is_prime
Technique 6: Generator Functions
# Before: O(n) space - store all results
def squares_list(n):
    return [i ** 2 for i in range(n)]

# After: O(1) space - generate on demand
def squares_generator(n):
    for i in range(n):
        yield i ** 2

# Usage: doesn't store all at once
for square in squares_generator(1000000):
    process(square)
4.7 Space-Time Tradeoffs
Memoization vs. Recomputation:
# Space-optimized Fibonacci: O(1) space, O(n) time
def fib_space_optimized(n):
    if n <= 1:
        return n
    prev, curr = 0, 1
    for _ in range(n - 1):
        prev, curr = curr, prev + curr
    return curr

# Memoized Fibonacci: O(n) space, O(n) time
# (the mutable default memo={} persists across calls — intentional here,
# but a well-known Python pitfall)
def fib_time_optimized(n, memo={}):
    if n in memo:
        return memo[n]
    if n <= 1:
        return n
    memo[n] = fib_time_optimized(n-1, memo) + fib_time_optimized(n-2, memo)
    return memo[n]
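In idiomatic Python the memo table is usually attached with functools.lru_cache rather than hand-rolled, which keeps the O(n)-space/O(n)-time tradeoff without threading a dictionary through the calls:

```python
from functools import lru_cache

@lru_cache(maxsize=None)  # memo table lives on the wrapper: O(n) entries
def fib(n):
    if n <= 1:
        return n
    return fib(n - 1) + fib(n - 2)

assert fib(30) == 832040
assert fib(50) == 12586269025
```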
Precomputation vs. On-Demand:
# Precompute: O(n²) space, O(1) query
def precompute_distances(points):
    n = len(points)
    distances = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            distances[i][j] = compute_distance(points[i], points[j])
    return distances

# On-demand: O(n) space, O(1) per query
def compute_distance_on_demand(points, i, j):
    return compute_distance(points[i], points[j])
Memory Constraints by Domain:
Real-world space requirements vary dramatically by application domain:
| Domain         | Typical Memory | Space Requirement             |
| -------------- | -------------- | ----------------------------- |
| Embedded IoT   | 2-512 KB       | O(1) or O(log n) only         |
| Mobile apps    | 100 MB - 2 GB  | O(n) acceptable for small n   |
| Web servers    | 2-128 GB       | O(n) for n = millions         |
| Big data       | 64 GB - 1 TB   | O(√n) or streaming algorithms |
| Supercomputers | 1-100 TB       | O(n) for n = billions         |
4.8 Platform-Specific Considerations
Language Differences:
# Python: everything is an object (high overhead)
x = 5  # not just 4 bytes — a small int object is ~28 bytes in CPython

// C: exact size
int x = 5;  // exactly 4 bytes on most systems
32-bit vs 64-bit:
- Pointers: 4 bytes vs. 8 bytes
- Affects memory usage of linked structures significantly
Memory Alignment:
struct Example {
    char a;  // 1 byte
    int b;   // 4 bytes
    char c;  // 1 byte
};
// Actual size: 12 bytes (due to padding), not 6 bytes
Why Alignment Exists: Memory alignment ensures data is stored at addresses that are multiples of their size, improving CPU access efficiency but potentially wasting space. CPUs can access aligned data faster because:
- Aligned reads complete in a single memory cycle
- Misaligned access takes multiple cycles, or faults on some architectures
Memory Fragmentation:
Fragmentation assumption in analysis: Contiguous allocation when possible
Reality:
- Internal fragmentation: Wasted space within allocated blocks
- External fragmentation: Free space scattered across memory
- Can increase actual space usage beyond theoretical O(n)
4.9 Streaming Algorithms
Goal: Process data streams with sublinear space (o(n))
Scenario: Data too large to store entirely, must process in one pass
# Example: Count distinct elements in a stream
# Naive: O(n) space - store all seen elements
def count_distinct_naive(stream):
    seen = set()
    for item in stream:
        seen.add(item)
    return len(seen)

# HyperLogLog: probabilistic counting with hash functions (sketch;
# count_leading_zeros, alpha_m, and apply_corrections are elided helpers)
def hyperloglog_estimate(stream, precision=14):
    m = 2 ** precision  # Number of registers
    registers = [0] * m
    for item in stream:
        h = hash(item)
        # Use first 'precision' bits to determine register
        j = h & (m - 1)
        # Count leading zeros in remaining bits
        w = count_leading_zeros(h >> precision)
        registers[j] = max(registers[j], w)
    # Estimate from harmonic mean
    raw_estimate = alpha_m(m) * m * m / sum(2 ** (-x) for x in registers)
    return apply_corrections(raw_estimate, m)
# Space: m registers of O(log log n) bits each — effectively constant for a
# fixed precision, versus the O(n) of the naive set
Common Streaming Algorithms:
| Problem         | Naive Space | Streaming Space | Algorithm        |
| --------------- | ----------- | --------------- | ---------------- |
| Count distinct  | O(n)        | O(log log n)    | HyperLogLog      |
| Frequent items  | O(n)        | O(k log n)      | Count-Min Sketch |
| Membership test | O(n)        | O(m)            | Bloom Filter     |
| Quantiles       | O(n)        | O(log n)        | q-digest         |
| Heavy hitters   | O(n)        | O(k)            | Misra-Gries      |
Applications:
- Counting unique visitors: HyperLogLog for web analytics
- Finding frequent items: Count-Min Sketch for trending topics
- Membership testing: Bloom Filter for cache checks
- Network monitoring: Heavy hitter detection for traffic analysis
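As a concrete instance of the membership-testing row, here is a minimal Bloom filter sketch; the class name, sizes, and salted-SHA-256 hashing are illustrative choices, not any particular library's API:

```python
import hashlib

class BloomFilter:
    """m-bit filter with k hash functions: no false negatives, tunable false positives."""

    def __init__(self, m_bits=1024, k_hashes=3):
        self.m, self.k = m_bits, k_hashes
        self.bits = bytearray((m_bits + 7) // 8)

    def _positions(self, item):
        # Derive k bit positions by salting a cryptographic hash (simple, not fastest)
        for salt in range(self.k):
            digest = hashlib.sha256(f"{salt}:{item}".encode()).hexdigest()
            yield int(digest, 16) % self.m

    def add(self, item):
        for p in self._positions(item):
            self.bits[p // 8] |= 1 << (p % 8)

    def might_contain(self, item):
        return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(item))

bf = BloomFilter()
for word in ("alpha", "beta", "gamma"):
    bf.add(word)
assert all(bf.might_contain(w) for w in ("alpha", "beta", "gamma"))  # never misses members
```

With m = 1024 bits and three items the false-positive rate is far below 1%; real deployments size m and k from the expected item count.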
4.10 External Memory Algorithms
Scenario: Data doesn't fit in RAM, must use disk/external storage
I/O Complexity Model:
Instead of counting operations, count disk accesses:
- RAM size: M words
- Block size: B words
- Minimize I/O operations (disk reads/writes)
Goal: Design algorithms minimizing disk I/O
Example: External Merge Sort
def external_merge_sort(input_file, output_file, M, B):
    """
    Sort data larger than memory using disk (sketch: read_chunk,
    write_run_to_disk, and k_way_merge are elided helpers).
    M: Memory size (in elements)
    B: Block size (elements per disk read/write)
    """
    # Phase 1: Create sorted runs
    runs = []
    while data_remaining:
        # Read M elements into memory
        chunk = read_chunk(input_file, M)
        # Sort in memory: O(M log M) comparisons
        chunk.sort()
        # Write sorted run to disk: O(M/B) I/O operations
        run_file = write_run_to_disk(chunk)
        runs.append(run_file)
    # Phase 2: K-way merge (K = M/B runs at a time)
    while len(runs) > 1:
        # Merge up to K runs at once
        K = M // B
        merged_runs = []
        for i in range(0, len(runs), K):
            batch = runs[i:i+K]
            # Merge K runs using M/K buffer per run
            merged = k_way_merge(batch, buffer_size=M//K)
            merged_runs.append(merged)
        runs = merged_runs
    return runs[0]

# I/O Complexity: O((N/B) · log_{M/B}(N/M))
# Where N = total data size
# Much better than naive O(N²/B) for sorting on disk
Cache-Oblivious Algorithms:
Design algorithms optimal for ALL memory hierarchies without knowing M or B
Example: Cache-oblivious matrix transpose
- Recursive divide-and-conquer
- Automatically adapts to any cache size
- Optimal for all levels: L1, L2, L3, RAM, disk
I/O complexity: O(N/B) transfers (optimal)
No need to tune for specific hardware
Succinct Data Structures:
Goal: Achieve space close to information-theoretic lower bound
Example: Storing a binary tree with n nodes
- Naive: O(n log n) bits (store all pointers: 2 pointers × log n bits each)
- Information-theoretic minimum: O(n) bits (there are C_n = Catalan(n) possible trees)
- Succinct: 2n + o(n) bits (close to minimum)
Techniques:
- Rank and select operations on bit vectors
- Compressed representations
- Implicit data structures (no pointers)
Example: Succinct binary tree
- Store structure: 2n bits (balanced parentheses or LOUDS encoding)
- Store values: n log σ bits (σ = alphabet size)
- Total: 2n + n log σ + o(n) bits
- Still supports O(1) navigation!
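The "implicit, no pointers" idea is easiest to see in the array layout used by binary heaps, where tree structure costs zero extra bits beyond the values — a toy illustration, not a full succinct tree:

```python
# Complete binary tree stored implicitly: children of index i sit at 2i+1, 2i+2
tree = [10, 6, 8, 1, 4]  # root 10; children 6, 8; grandchildren 1, 4 under 6

def left(i):   return 2 * i + 1
def right(i):  return 2 * i + 2
def parent(i): return (i - 1) // 2

assert tree[left(0)] == 6 and tree[right(0)] == 8
assert tree[left(1)] == 1 and tree[right(1)] == 4
assert tree[parent(4)] == 6  # value 4 hangs under value 6
```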
5. Case Analysis: Best, Average, and Worst Case
5.1 Core Concepts
Best Case Analysis:
- Identifies optimal input configuration
- Provides lower bound on performance
- Often uses Omega (Ω) notation
- May be achieved with specific input ordering
Worst Case Analysis:
- Identifies pathological input configuration
- Provides upper bound guarantee
- Uses Big O notation
- Critical for safety-critical systems
Average Case Analysis:
- Calculates expected performance over all inputs
- Requires probability distribution
- Uses Theta (Θ) typically
- Most representative of typical usage
Mathematical Definitions:
Best Case:
T_best(n) = min{T(I) : |I| = n}
Worst Case:
T_worst(n) = max{T(I) : |I| = n}
Average Case:
T_avg(n) = Σ P(I) × T(I)
where P(I) is probability of input I
Relationship:
T_best(n) ≤ T_avg(n) ≤ T_worst(n)
Asymptotically: if T_best(n) = Ω(f(n)) and T_worst(n) = O(h(n)), the running time on
any input of size n lies between f and h (up to constant factors). When f and h
coincide, the algorithm runs in Θ of that common bound on every input.
5.2 Detailed Examples
Example 1: Insertion Sort
def insertion_sort(arr):
for i in range(1, len(arr)):
key = arr[i]
j = i - 1
while j >= 0 and arr[j] > key: # Inner loop
arr[j + 1] = arr[j]
j -= 1
arr[j + 1] = key
Analysis:
Best Case: Θ(n)
- Input: Already sorted [1, 2, 3, 4, 5]
- Inner loop runs 0 times each iteration
- Total comparisons: n-1
- Total shifts: 0
Worst Case: Θ(n²)
- Input: Reverse sorted [5, 4, 3, 2, 1]
- Inner loop runs maximum times
- Total comparisons: 1 + 2 + ... + (n-1) = n(n-1)/2
- Total shifts: n(n-1)/2
Average Case: Θ(n²)
- Random input, each element moves halfway back on average
- Expected comparisons: (n-1)n/4
- Same asymptotic complexity as worst
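These best- and worst-case counts can be checked directly with an instrumented variant (a sketch; the comparison counter is added for illustration and changes nothing about the algorithm):

```python
def insertion_sort_counting(arr):
    """Insertion sort that returns the number of key comparisons performed."""
    comparisons = 0
    for i in range(1, len(arr)):
        key = arr[i]
        j = i - 1
        while j >= 0:
            comparisons += 1          # compare arr[j] with key
            if arr[j] <= key:
                break                  # key is in place: stop shifting
            arr[j + 1] = arr[j]
            j -= 1
        arr[j + 1] = key
    return comparisons

n = 8
best = insertion_sort_counting(list(range(n)))          # sorted: n - 1 = 7
worst = insertion_sort_counting(list(range(n, 0, -1)))  # reversed: n(n-1)/2 = 28
```

The sorted input triggers exactly one comparison per outer iteration (n−1 total), while the reversed input pays the full 1 + 2 + … + (n−1) = n(n−1)/2.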
Example 2: Quicksort
def quicksort(arr, low, high):
if low < high:
pivot_idx = partition(arr, low, high)
quicksort(arr, low, pivot_idx - 1)
quicksort(arr, pivot_idx + 1, high)
Analysis:
Best Case: Θ(n log n)
- Input: Pivot always splits evenly
- Recurrence: T(n) = 2T(n/2) + Θ(n)
- Balanced partitions
Worst Case: Θ(n²)
- Input: Already sorted (without randomization)
- Pivot is always min/max
- Recurrence: T(n) = T(n-1) + T(0) + Θ(n)
- Unbalanced partitions: one side has n-1, other has 0
Average Case: Θ(n log n)
- Random pivots
- Expected balanced partitions
- Even if only 25% of partitions are "good", achieves O(n log n)
Example 3: Linear Search
def linear_search(arr, target):
for i in range(len(arr)):
if arr[i] == target:
return i
return -1
Analysis:
Best Case: Θ(1)
- Input: target at arr[0]
- Operations: 1 comparison
Average Case: Θ(n)
- Assume target present with probability p, equally likely at any position
- Expected comparisons: (n+1)/2 if always present
- Θ(n) in any case
Detailed Average Case Derivation:
# Assume:
# - Target is in array with probability p
# - If present, equally likely at any position
# - If absent, must check all n elements
# P(target at position i) = p/n for i = 1, 2, ..., n
# P(target absent) = 1 - p
T_avg(n) = Σ(i=1 to n) [p/n × i] + (1-p) × n
= (p/n) × [1 + 2 + ... + n] + (1-p) × n
= (p/n) × [n(n+1)/2] + (1-p) × n
= p(n+1)/2 + (1-p)n
= p(n+1)/2 + n - pn
= n + p/2 - pn/2
= n(1 - p/2) + p/2
# If p = 1 (always present):
T_avg(n) = n/2 + 1/2 = Θ(n)
# If p = 0 (never present):
T_avg(n) = n = Θ(n)
# In all cases: T_avg(n) = Θ(n) ∎
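The p = 1 case of this derivation can be verified empirically: averaging the comparison count over every possible target position gives exactly (n+1)/2 (a sketch; the counter-returning variant of linear search is added for illustration):

```python
def count_comparisons(arr, target):
    """Linear search that returns how many comparisons it made."""
    for i in range(len(arr)):
        if arr[i] == target:
            return i + 1          # found after i + 1 comparisons
    return len(arr)               # absent: all n elements checked

n = 100
arr = list(range(n))
# Target always present (p = 1), equally likely at each position
avg = sum(count_comparisons(arr, t) for t in arr) / n
# avg == (n + 1) / 2, matching the derivation above
```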
Worst Case: Θ(n)
- Input: target absent or at end
- Operations: n comparisons
Range: Θ(1) to Θ(n)
Example 4: Binary Search Tree Search
def bst_search(root, target):
if root is None:
return None
if root.val == target:
return root
elif target < root.val:
return bst_search(root.left, target)
else:
return bst_search(root.right, target)
Analysis:
Best Case: Θ(1)
- Input: target is root
- Operations: 1 comparison
Average Case: Θ(log n) for balanced tree
- Random BST
- Expected height: O(log n)
- Operations: O(height) = O(log n)
Worst Case: Θ(n)
- Input: Degenerate tree (linked list)
- Height: n
- Operations: O(n)
Range: Θ(1) to Θ(n), avg Θ(log n) if balanced
5.3 When to Use Each
Best Case Analysis:
Use when:
- Algorithm shows dramatically better performance on certain inputs
- Adaptive algorithms
- Early termination conditions
Examples:
- Insertion sort: O(n) on sorted input
- Linear search: O(1) if target is first
Worst Case Analysis:
Use when:
- Need performance guarantees
- Safety-critical systems
- Real-time systems
- SLA guarantees
Examples:
- Heap sort: Always O(n log n)
- AVL tree: Guaranteed O(log n)
- Binary search: Always O(log n)
Average Case Analysis:
Use when:
- Typical performance matters
- Inputs are varied
- General-purpose systems
Examples:
- Quicksort: O(n log n) average
- Hash tables: O(1) average lookups
5.4 Adaptive Algorithms
Timsort-Style Optimization:
def adaptive_sort(arr):
if len(arr) <= 64:
insertion_sort(arr) # Fast for small arrays
return
if is_sorted(arr):
return # Already sorted: Θ(n)
if is_reverse_sorted(arr):
arr.reverse() # Θ(n) to fix
return
# Fall back to general-purpose
merge_sort(arr) # Θ(n log n) guaranteed
Randomization to Eliminate Worst Case:
import random
def randomized_quicksort(arr, low, high):
if low < high:
# Random pivot selection
pivot_idx = random.randint(low, high)
arr[pivot_idx], arr[high] = arr[high], arr[pivot_idx]
pivot_idx = partition(arr, low, high)
randomized_quicksort(arr, low, pivot_idx - 1)
randomized_quicksort(arr, pivot_idx + 1, high)
# Expected time: Θ(n log n) for ANY input
# Worst case still Θ(n²), but with random pivots its probability is exponentially small (negligible)
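The sketch above assumes a `partition` helper; one self-contained way to fill it in uses Lomuto partitioning (an illustrative choice, not the only one):

```python
import random

def partition(arr, low, high):
    """Lomuto partition: arr[high] is the pivot; returns its final index."""
    pivot = arr[high]
    i = low - 1
    for j in range(low, high):
        if arr[j] <= pivot:
            i += 1
            arr[i], arr[j] = arr[j], arr[i]
    arr[i + 1], arr[high] = arr[high], arr[i + 1]
    return i + 1

def randomized_quicksort(arr, low=0, high=None):
    if high is None:
        high = len(arr) - 1
    if low < high:
        # Random pivot: swap it to the end, then partition as usual
        pivot_idx = random.randint(low, high)
        arr[pivot_idx], arr[high] = arr[high], arr[pivot_idx]
        p = partition(arr, low, high)
        randomized_quicksort(arr, low, p - 1)
        randomized_quicksort(arr, p + 1, high)

data = [5, 2, 9, 1, 5, 6]
randomized_quicksort(data)
```

Sorted and reverse-sorted inputs, the classic killers of fixed-pivot quicksort, are no worse than any other input here because the pivot choice ignores input order.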
5.5 Comparative Analysis
Sorting Algorithms:
| Algorithm     | Best       | Average    | Worst      | Space    | Notes                        |
| ------------- | ---------- | ---------- | ---------- | -------- | ---------------------------- |
| Bubble        | Θ(n)       | Θ(n²)      | Θ(n²)      | O(1)     | Rarely used                  |
| Insertion     | Θ(n)       | Θ(n²)      | Θ(n²)      | O(1)     | Good for small/nearly sorted |
| Selection     | Θ(n²)      | Θ(n²)      | Θ(n²)      | O(1)     | Min memory writes            |
| Merge         | Θ(n log n) | Θ(n log n) | Θ(n log n) | O(n)     | Stable, guaranteed           |
| Quick (basic) | Θ(n log n) | Θ(n log n) | Θ(n²)      | O(log n) | Fast average                 |
| Quick (rand)  | Θ(n log n) | Θ(n log n) | Θ(n²)*     | O(log n) | General purpose              |
| Heap          | Θ(n log n) | Θ(n log n) | Θ(n log n) | O(1)     | In-place, guaranteed         |
| Tim           | Θ(n)       | Θ(n log n) | Θ(n log n) | O(n)     | Adaptive, real-world         |
*With randomization, worst case is probabilistically negligible
5.6 Competitive Analysis and Instance Optimality
Competitive Ratio:
For online algorithms (decisions without future knowledge):
Competitive ratio = max_input (ALG_cost / OPT_cost)
Example: Paging algorithms
- FIFO, LRU: k-competitive (k = cache size)
- Optimal offline: requires future knowledge
An algorithm is c-competitive if:
ALG(I) ≤ c × OPT(I) + α for all inputs I
Instance Optimality:
An algorithm is instance-optimal if for every input I:
ALG(I) ≤ c × ANY_ALG(I) + α
Stronger than competitive ratio (compares to ALL algorithms, not just optimal)
Example: Quicksort with median-of-3 pivot
- Not worst-case optimal: Θ(n²)
- Instance-optimal constant c ≈ 2 in practice
- Performs within constant factor of best possible for that specific input
6. Amortized Analysis
6.1 Core Concepts
Definition: Amortized analysis analyzes the average cost per operation over a sequence of operations, not individual operations. It provides realistic bounds when occasional expensive operations are "paid for" by many cheap ones.
Key Distinction:
Worst-case (single operation): Maximum cost of any one operation
Amortized (sequence): Average cost per operation over many operations
Example: Dynamic array insertion
- Worst-case single: O(n) when doubling
- Amortized: O(1) averaged over n insertions
Three Analysis Methods:
- Aggregate Method: Calculate total cost for n operations, divide by n
- Accounting Method: Assign credits to operations, bank for future expensive ops
- Potential Method: Define potential function tracking data structure state
6.2 Method 1: Aggregate Analysis
Dynamic Array Example:
class DynamicArray:
def __init__(self):
self.capacity = 1
self.size = 0
self.array = [None] * self.capacity
def append(self, item):
if self.size == self.capacity:
# Resize: double capacity
new_capacity = 2 * self.capacity
new_array = [None] * new_capacity
for i in range(self.size):
new_array[i] = self.array[i] # Copy: O(size)
self.array = new_array
self.capacity = new_capacity
self.array[self.size] = item
self.size += 1
Analysis:
Analyze n consecutive appends starting from empty:
Resize costs occur after insertions: 1, 2, 4, 8, 16, ..., 2^k where 2^k ≤ n
Total cost:
T(n) = n (normal inserts) + Σ 2^i (resize copies)
= n + (1 + 2 + 4 + ... + 2^floor(log n))
= n + (2^(floor(log n) + 1) - 1)
< n + 2n
= 3n
= O(n)
Amortized per insertion = T(n) / n = 3n / n = 3 = O(1) ✓
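Instrumenting the class with a copy counter confirms the 3n bound empirically (a sketch; the `copies` field is added purely for measurement):

```python
class CountingDynamicArray:
    """Dynamic array that tallies element copies performed during resizes."""
    def __init__(self):
        self.capacity = 1
        self.size = 0
        self.array = [None] * self.capacity
        self.copies = 0

    def append(self, item):
        if self.size == self.capacity:
            self.capacity *= 2
            new_array = [None] * self.capacity
            for i in range(self.size):
                new_array[i] = self.array[i]
                self.copies += 1       # one copy per moved element
            self.array = new_array
        self.array[self.size] = item
        self.size += 1

n = 1000
arr = CountingDynamicArray()
for x in range(n):
    arr.append(x)
# Resize copies: 1 + 2 + 4 + ... + 512 = 1023
total_cost = n + arr.copies   # normal writes + resize copies
# total_cost = 2023 < 3n, so amortized cost per append is under 3
```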
Binary Counter Example:
def increment_binary_counter(counter):
"""Increment binary counter represented as bit array."""
i = 0
while i < len(counter) and counter[i] == 1:
counter[i] = 0 # Flip 1 to 0
i += 1
if i < len(counter):
counter[i] = 1 # Flip 0 to 1
Analysis:
How often does each bit flip in n increments?
- Bit 0: flips every increment = n times
- Bit 1: flips every 2 increments = n/2 times
- Bit 2: flips every 4 increments = n/4 times
- Bit k: flips every 2^k increments = n/2^k times
Total flips:
T(n) = Σ_{i=0}^{log n} floor(n / 2^i)
     < n × Σ_{i=0}^{∞} (1 / 2^i)
     = n × 2
     = 2n
Amortized per increment = 2n / n = 2 = O(1) ✓
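Counting total flips over n increments confirms the 2n bound (a sketch; the fixed 16-bit width is an illustrative choice, wide enough for 1000 increments):

```python
def increment(counter):
    """Increment a little-endian bit array in place; return bits flipped."""
    flips = 0
    i = 0
    while i < len(counter) and counter[i] == 1:
        counter[i] = 0        # flip trailing 1s to 0
        flips += 1
        i += 1
    if i < len(counter):
        counter[i] = 1        # flip the first 0 to 1
        flips += 1
    return flips

n = 1000
counter = [0] * 16            # 16 bits is plenty for 1000 increments
total_flips = sum(increment(counter) for _ in range(n))
# total_flips = Σ floor(n / 2^i) = 1994 < 2n
```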
Queue Using Two Stacks Example:
class QueueUsingStacks:
    def __init__(self):
        self.in_stack = []
        self.out_stack = []
    def enqueue(self, item):
        self.in_stack.append(item)  # O(1)
    def dequeue(self):
        if not self.out_stack:
            # Transfer reverses order: oldest element ends up on top
            while self.in_stack:
                self.out_stack.append(self.in_stack.pop())
        return self.out_stack.pop() if self.out_stack else None
Analysis:
Consider a sequence of n operations (mix of enqueues and dequeues).
Enqueue operations: Always O(1) each
Dequeue operations: Occasionally move all of in_stack to out_stack
Key insight: Each element moves from in_stack to out_stack at most once in its lifetime
Analysis for n operations with p enqueues and (n-p) dequeues:
- Total enqueue cost: p × O(1) = O(p)
- Total transfer cost: each enqueued element is transferred at most once = O(p)
- Total pop cost: one O(1) pop per dequeue = O(n - p)
- Total cost: O(p) + O(p) + O(n - p) = O(n)
Amortized cost per operation: O(n) / n = O(1) ✓
6.3 Method 2: Accounting Method
Dynamic Array with Accounting:
Assign amortized cost: ĉ = 3 per append
Credit allocation:
- 1 credit pays for actual insertion
- 2 credits saved for future copying
Credit invariant: Each element has 2 credits
Proof:
- When resizing from k to 2k:
- Need k credits to copy k elements
- k elements × 2 credits = 2k available
- Use k credits for copying
- k credits remain
- Credits never go negative ✓
Amortized cost = 3 = O(1) ✓
Binary Counter with Accounting:
Assign amortized cost: ĉ = 2 per increment
Credit allocation:
- 1 credit for setting bit to 1
- 1 credit saved on each 1-bit (for future reset)
Invariant: Each 1-bit has 1 credit
Proof:
- When setting bit i to 1: use 1 credit, bank 1 on the bit
- When flipping bit i from 1 to 0: use its banked credit
- Credits never go negative ✓
Amortized cost = 2 = O(1) ✓
6.4 Method 3: Potential Method
Dynamic Array with Potential:
Define potential function:
Φ(D) = 2 × size - capacity
Initially: Φ(D₀) = 2 × 0 - 1 = -1
Analysis:
Case 1: Normal insert (no resize)
- Actual cost: cᵢ = 1
- size increases by 1, capacity unchanged
- ΔΦ = [2(i) - cap] - [2(i-1) - cap] = 2
- Amortized: ĉᵢ = cᵢ + ΔΦ = 1 + 2 = 3
Case 2: Insert with resize
- Actual cost: cᵢ = k + 1 (copy k + insert)
- Before: size = k, capacity = k, Φ = 2k - k = k
- After: size = k+1, capacity = 2k, Φ = 2(k+1) - 2k = 2
- ΔΦ = 2 - k
- Amortized: ĉᵢ = (k + 1) + (2 - k) = 3
Both cases: ĉᵢ = 3 = O(1) ✓
Splay Tree (Detailed):
class SplayTree:
def search(self, key):
"""Search and splay (move to root)."""
node = self._find(key)
if node:
self._splay(node) # Rotate node to root
return node
def _splay(self, node):
"""Rotate node to root using zig, zig-zig, zig-zag."""
while node.parent is not None:
if node.parent.parent is None:
# Zig: single rotation
self._rotate(node)
elif self._same_direction(node, node.parent):
# Zig-zig: rotate parent first, then node
self._rotate(node.parent)
self._rotate(node)
else:
# Zig-zag: rotate node twice
self._rotate(node)
self._rotate(node)
Potential Method Analysis:
Define potential function:
Φ(T) = Σ log(size(node)) for all nodes
= Σ rank(node) where rank(node) = log(size(subtree rooted at node))
Analysis sketch (simplified):
- Each zig-zig or zig-zag operation costs O(1) actual time
- Potential change carefully designed so:
- Potential increases for nodes that move up
- Potential decreases for nodes that move down
- Net potential change is O(log n) for entire splay
Result: Amortized O(log n) per splay operation
Detailed proof requires:
- Static optimality (access) theorem: total time for m accesses is O(m + Σᵢ fᵢ log(m/fᵢ)),
where fᵢ is the number of accesses to item i
- Balance theorem: After m operations starting from empty tree,
depth of any node is at most O(log n)
- Working set theorem: Access frequently-used items faster
- Dynamic optimality conjecture (open problem): Splay trees are optimal
within constant factor for any access sequence
All operations (insert, delete, search) amortized O(log n) ✓
6.5 Common Amortized Patterns
| Data Structure   | Operation    | Worst Single | Amortized | Reason           |
| ---------------- | ------------ | ------------ | --------- | ---------------- |
| Dynamic Array    | Insert       | O(n)         | O(1)      | Rare doubling    |
| Queue (2 stacks) | Dequeue      | O(n)         | O(1)      | Rare transfer    |
| Splay Tree       | Search       | O(n)         | O(log n)  | Self-adjusting   |
| Union-Find       | Union/Find   | O(log n)     | O(α(n))*  | Path compression |
| Fibonacci Heap   | Decrease-key | O(log n)     | O(1)      | Lazy deletion    |
*α(n) is inverse Ackermann function, effectively constant
6.6 When to Use Amortized Analysis
Use When:
- Data structure has occasional expensive operations
- Want realistic performance bounds for sequences
- Worst-case per-operation is too pessimistic
- Operations can "save up" work for later
Avoid When:
- Hard real-time deadlines (each operation must be fast)
- Worst-case critical (cannot tolerate any slow operation)
- Security-sensitive (adversarial inputs trigger worst-case)
- Need predictable per-operation time
Hybrid Approach (Introsort):
import math
def introsort(arr):
    if len(arr) < 2:
        return
    depth_limit = 2 * math.floor(math.log2(len(arr)))
    _introsort_helper(arr, 0, len(arr) - 1, depth_limit)
def _introsort_helper(arr, low, high, depth_limit):
    if low >= high:
        return  # Base case: subarray of size 0 or 1
    if depth_limit == 0:
        heapsort(arr, low, high)  # Guaranteed O(n log n)
        return
    pivot = partition(arr, low, high)
    _introsort_helper(arr, low, pivot - 1, depth_limit - 1)
    _introsort_helper(arr, pivot + 1, high, depth_limit - 1)
# Combines quicksort's average O(n log n) with heapsort's worst-case guarantee
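The sketch assumes `partition` and `heapsort` helpers; one possible filling-in uses a Lomuto partition and a sift-down heapsort restricted to a subarray (illustrative choices, not how any particular library implements introsort):

```python
import math

def _partition(arr, low, high):
    """Lomuto partition around arr[high]; returns the pivot's final index."""
    pivot = arr[high]
    i = low - 1
    for j in range(low, high):
        if arr[j] <= pivot:
            i += 1
            arr[i], arr[j] = arr[j], arr[i]
    arr[i + 1], arr[high] = arr[high], arr[i + 1]
    return i + 1

def _heapsort(arr, low, high):
    """In-place heapsort on arr[low..high] using a max-heap."""
    n = high - low + 1
    def sift_down(root, end):
        while 2 * root + 1 <= end:
            child = 2 * root + 1
            if child + 1 <= end and arr[low + child] < arr[low + child + 1]:
                child += 1            # pick the larger child
            if arr[low + root] < arr[low + child]:
                arr[low + root], arr[low + child] = arr[low + child], arr[low + root]
                root = child
            else:
                return
    for start in range(n // 2 - 1, -1, -1):
        sift_down(start, n - 1)       # build max-heap
    for end in range(n - 1, 0, -1):
        arr[low], arr[low + end] = arr[low + end], arr[low]
        sift_down(0, end - 1)         # restore heap on the prefix

def introsort(arr):
    if len(arr) < 2:
        return
    depth_limit = 2 * math.floor(math.log2(len(arr)))
    _helper(arr, 0, len(arr) - 1, depth_limit)

def _helper(arr, low, high, depth_limit):
    if low >= high:
        return
    if depth_limit == 0:
        _heapsort(arr, low, high)     # depth exceeded: fall back
        return
    p = _partition(arr, low, high)
    _helper(arr, low, p - 1, depth_limit - 1)
    _helper(arr, p + 1, high, depth_limit - 1)

data = [7, 3, 9, 1, 4, 1, 8]
introsort(data)
```

On adversarial inputs that drive the Lomuto partition into unbalanced splits, the depth limit trips after 2·log₂(n) levels and heapsort finishes the range, preserving the O(n log n) worst case.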
6.7 Competitive Analysis and Online Algorithms
Amortized Analysis vs. Competitive Analysis:
Amortized Analysis:
- Offline: analyze sequence after the fact
- Compare operations to each other in sequence
- Measure: total cost / n operations
- Goal: average cost per operation over worst sequence
Competitive Analysis:
- Online: make decisions without future knowledge of future inputs
- Compare to optimal offline algorithm that knows all inputs
- Measure: online_cost / optimal_cost
- Goal: ratio between online and optimal offline performance
Example: Ski Rental Problem
Problem:
- Don't know how many days you will ski
- Buy skis ($100) vs. rent ($10/day)?
- Decision must be made without knowing future
Online algorithm (2-competitive):
- Rent until total rental cost would equal purchase price
- After 10 days of renting ($100 spent), buy the skis
- Cost ≤ 2 × OPT in all cases
Analysis:
- If ski ≤ 10 days: pay ≤ $100 (vs. OPT = rental cost ≤ $100)
- If ski > 10 days: pay $200 (vs. OPT = $100 if buy immediately)
- Competitive ratio = 2
No deterministic online algorithm can do better than 2-competitive for this problem (randomized strategies improve the ratio to e/(e−1) ≈ 1.58)
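The 2-competitive bound can be checked by simulating every possible horizon (a sketch using the prices from the example):

```python
BUY, RENT = 100, 10

def online_cost(days):
    """Rent until rentals paid equal the purchase price, then buy."""
    threshold = BUY // RENT           # 10 days of renting = $100
    if days <= threshold:
        return days * RENT
    return threshold * RENT + BUY     # rented 10 days, then bought

def opt_cost(days):
    """Offline optimum: knows the horizon in advance."""
    return min(days * RENT, BUY)

worst_ratio = max(online_cost(d) / opt_cost(d) for d in range(1, 1000))
# worst_ratio == 2.0, hit just past the break-even point (day 11)
```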
Connection to Amortized Analysis:
Both techniques analyze sequences of operations, but:
- Amortized: knows full sequence, bounds average cost
- Competitive: doesn't know future, bounds ratio to optimal
- Many self-adjusting data structures (splay trees, move-to-front lists) can be analyzed using both frameworks
7. Practical Problem-Solving
7.1 Algorithm Selection Guide
Choosing Complexity Requirements:
1 second ≈ 10⁸ operations (rough guideline)
If n = 100,000 and time limit is 1 second:
- O(n): 10⁵ operations ✓
- O(n log n): ≈ 1.7×10⁶ operations ✓
- O(n²): 10¹⁰ operations ✗ (too slow)
Guidelines:
- n ≤ 20: Even O(2ⁿ) might work; O(n!) only up to n ≈ 11
- n ≤ 100: O(n³) acceptable
- n ≤ 10⁵: Need O(n log n) or better
- n ≤ 10⁸: Need O(n) or better
- n ≤ 10⁹: Need O(log n) or O(1)
For Reliability:
Choose algorithms with guaranteed worst-case:
- Merge sort over quicksort
- AVL/Red-Black tree over BST
- Heap sort over quicksort
For Average Performance:
Optimize for typical case:
- Quicksort (randomized) for sorting
- Hash tables for lookups
- Adaptive algorithms (Timsort)
For Space Constraints:
Prioritize in-place algorithms:
- Heap sort over merge sort
- Iterative over recursive
- Bit manipulation for boolean arrays
7.2 Optimization Strategies
Strategy 1: Change Growth Rate
# O(n²) → O(n log n) → O(n)
# Bad: O(n²) - check all pairs
def has_duplicates_bad(arr):
for i in range(len(arr)):
for j in range(i+1, len(arr)):
if arr[i] == arr[j]:
return True
return False
# Better: O(n log n) - sort first
def has_duplicates_better(arr):
sorted_arr = sorted(arr)
for i in range(len(sorted_arr) - 1):
if sorted_arr[i] == sorted_arr[i+1]:
return True
return False
# Best: O(n) - use hash set
def has_duplicates_best(arr):
seen = set()
for item in arr:
if item in seen:
return True
seen.add(item)
return False
Strategy 2: Better Data Structures
# O(n) lookup → O(log n) → O(1)
# Array: O(n) search
items_list = [1, 2, 3, ..., n]
target in items_list # O(n)
# Sorted array + binary search: O(log n)
sorted_items = sorted(items_list)
binary_search(sorted_items, target) # O(log n)
# Hash set: O(1) average
items_set = set(items_list)
target in items_set # O(1) average
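In Python the three options can be sketched with the standard library; `bisect` supplies the binary search (the list contents and target are illustrative):

```python
import bisect

items_list = list(range(0, 1000, 3))
target = 999

# O(n): linear scan over the raw list
found_linear = target in items_list

# O(log n): binary search on a sorted copy
sorted_items = sorted(items_list)
i = bisect.bisect_left(sorted_items, target)
found_binary = i < len(sorted_items) and sorted_items[i] == target

# O(1) average: hash-set membership
items_set = set(items_list)
found_hash = target in items_set

# All three agree on the answer; they differ only in cost
```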
Strategy 3: Preprocessing
# Query pattern: many lookups in same data
# Bad: O(q·n) for q queries
def multiple_searches(arr, queries):
results = []
for query in queries:
results.append(query in arr) # O(n) each
return results
# Better: O(n + q) - build hash set once
def multiple_searches_optimized(arr, queries):
lookup = set(arr) # O(n) once
results = []
for query in queries:
results.append(query in lookup) # O(1) each
return results
Strategy 4: Early Termination
# Avoid unnecessary work
def find_first_match(arr, condition):
for item in arr:
if condition(item):
return item # Exit early
return None
# vs continuing after found
def find_all_then_return_first(arr, condition):
matches = [item for item in arr if condition(item)]
return matches[0] if matches else None
7.3 Common Pitfalls
Pitfall 1: Confusing Big O with exact runtime
Bad: "O(n) is always faster than O(n log n)"
Reality: O(100000n) can be slower than O(n log n) for practical n
Pitfall 2: Ignoring hidden operations
# This is NOT O(n)!
def process(arr):
for item in arr:
arr.sort() # O(n log n) inside loop!
# Actual: O(n² log n)
Pitfall 3: Assuming best case is typical
Quicksort is O(n log n)... in average case
Worst case is O(n²) - don't ignore it!
Pitfall 4: Forgetting space complexity
# Looks efficient in time...
def generate_all_subsets(arr):
subsets = []
for i in range(2 ** len(arr)):
subset = [arr[j] for j in range(len(arr)) if i & (1 << j)]
subsets.append(subset)
return subsets
# But space is O(n·2ⁿ) - exponential!
Pitfall 5: Confusing amortized with average
Wrong: "O(1) average means O(1) amortized"
Right: "Average requires probability, amortized is deterministic"
7.4 Real-World Applications
Database Query Optimization:
-- SELECT * FROM users WHERE id = 12345
Best case: O(1) - indexed, in cache
Average: O(log n) - B-tree index
Worst: O(n) - full table scan
Optimizer chooses based on:
- Index availability
- Table size
- Expected selectivity
Web Search (Google-scale):
n = 10¹¹+ pages
Inverted index lookup: O(1) to O(log n)
- Hash table: O(1) average
- B-tree: O(log n) guaranteed
Ranking: O(k log k) where k = matching documents
Total query: O(log n + k log k)
- log₂(10¹¹) ≈ 37 (very fast!)
Machine Learning Training:
Complexity per epoch:
- n = training examples
- d = features
- h = hidden units
- L = layers
- k = iterations to converge
Linear model: O(n × d × k)
Neural network: O(n × d × h × L × k)
Mini-batch (size b): O((n/b) × d × h × L × k)
Growth rate analysis guides architecture and batch size
Network Applications:
Google PageRank:
PageRank algorithm complexity:
- Graph: V vertices (web pages), E edges (links)
- Iterative computation: O(k·E) where k = iterations
- Typically k ≈ 50-100 iterations to converge
- Sparse graph: E ≈ O(V), so O(k·V)
Social Networks:
Friend suggestions: O(V + E) - graph traversal
Newsfeed ranking: O(n log k) - top-k selection
Shortest path between users: O(V + E) - BFS
Database Systems:
Index lookup: O(log n) - B-tree
Full table scan: O(n)
Join operations: O(n·m) to O(n + m) depending on algorithm
8. Advanced Topics
8.1 Master Theorem
For recurrences of the form:
T(n) = aT(n/b) + f(n)
where a ≥ 1, b > 1, and f(n) asymptotically positive
Three Cases:
Case 1: If f(n) = O(n^(log_b(a) - ε)) for some ε > 0
Then T(n) = Θ(n^(log_b(a)))
Case 2: If f(n) = Θ(n^(log_b(a)))
Then T(n) = Θ(n^(log_b(a)) log n)
Case 3: If f(n) = Ω(n^(log_b(a) + ε)) for some ε > 0
and a·f(n/b) ≤ c·f(n) for some c < 1
Then T(n) = Θ(f(n))
Examples:
T(n) = 2T(n/2) + Θ(n) [Merge sort]
a=2, b=2, f(n)=n
log_b(a) = log_2(2) = 1
f(n) = Θ(n^1) → Case 2
T(n) = Θ(n log n) ✓
T(n) = T(n/2) + Θ(1) [Binary search]
a=1, b=2, f(n)=1
log_b(a) = log_2(1) = 0
f(n) = Θ(n^0) → Case 2
T(n) = Θ(log n) ✓
T(n) = 3T(n/4) + Θ(n²)
a=3, b=4, f(n)=n²
log_b(a) = log_4(3) ≈ 0.79
f(n) = n² = Ω(n^(0.79+ε)) → Case 3
T(n) = Θ(n²) ✓
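A quick numeric check of the merge-sort recurrence backs up Case 2 (a sketch; the base case T(1) = 1 and powers-of-two inputs are chosen for convenience — for n = 2^k the exact solution is T(n) = n log₂ n + n):

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def T(n):
    """Exact solution of T(n) = 2T(n/2) + n with T(1) = 1, n a power of two."""
    if n == 1:
        return 1
    return 2 * T(n // 2) + n

# T(2^k) / (2^k * k) = (k + 1) / k, which tends to 1: T(n) = Θ(n log n)
ratios = [T(2 ** k) / (2 ** k * k) for k in (10, 15, 20)]
```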
8.2 Complexity Classes
P (Polynomial Time):
P = ∪ₖ O(n^k)
Problems solvable in polynomial time:
- Sorting: O(n log n)
- Shortest path: O(V²) or O(E + V log V)
- Matching: O(VE)
Considered "efficient" or "tractable"
NP (Nondeterministic Polynomial):
Problems verifiable in polynomial time
Examples:
- Boolean satisfiability (SAT)
- Hamiltonian cycle
- Subset sum
- Graph coloring
NP-Complete:
Hardest problems in NP
If any NP-complete problem has polynomial solution, P = NP
Examples:
- SAT (first proven NP-complete)
- 3-SAT
- Traveling salesman (decision version)
- Knapsack
- Vertex cover
P vs. NP:
Million-dollar question: P = NP?
Known: P ⊆ NP
Unknown: P = NP or P ≠ NP
If P = NP: Major implications for cryptography, optimization
If P ≠ NP: Many problems remain fundamentally hard
8.3 Special Growth Functions
Inverse Ackermann α(n):
Grows incredibly slowly:
α(2) = 1
α(4) = 2
α(16) = 3
α(65536) = 4
α(2^65536) = 5
For all practical values of n in computer science (even n = number of atoms in universe),
α(n) ≤ 5, making it effectively constant.
Used in:
- Union-Find with path compression: O(α(n)) per operation
- Nearly O(1) for all realistic inputs
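A standard implementation combining path compression with union by rank might look like this (a sketch; the class name is illustrative, and `find` uses the path-halving variant of compression):

```python
class UnionFind:
    def __init__(self, n):
        self.parent = list(range(n))
        self.rank = [0] * n

    def find(self, x):
        # Path halving: every node on the path skips to its grandparent
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return False               # already in the same set
        # Union by rank: attach the shorter tree under the taller one
        if self.rank[ra] < self.rank[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra
        if self.rank[ra] == self.rank[rb]:
            self.rank[ra] += 1
        return True

uf = UnionFind(10)
uf.union(1, 2)
uf.union(2, 3)
connected = uf.find(1) == uf.find(3)   # True: 1 and 3 share a root
```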
Iterated Logarithm log* n:
Number of times log must be applied to reach ≤ 1:
log* 2 = 1
log* 4 = 2
log* 16 = 3
log* 65536 = 4
log* 2^65536 = 5
Grows even slower than log log n
Stirling's Approximation:
n! ≈ √(2πn) × (n/e)^n
Used to show:
log(n!) = Θ(n log n)
Useful for analyzing:
- Permutation algorithms
- Combinatorial problems
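Stirling's accuracy is easy to check against the standard library, since ln(n!) = `math.lgamma(n + 1)`:

```python
import math

def ln_factorial_stirling(n):
    """Stirling's approximation to ln(n!)."""
    return 0.5 * math.log(2 * math.pi * n) + n * (math.log(n) - 1)

n = 100
exact = math.lgamma(n + 1)            # ln(100!)
approx = ln_factorial_stirling(n)
rel_error = abs(exact - approx) / exact
# Relative error is already tiny at n = 100, confirming log(n!) = Θ(n log n)
```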
8.4 Parameterized Complexity
Beyond Single Parameter:
Instead of T(n), analyze T(n, k):
- n: input size
- k: parameter (solution size, tree width, etc.)
Example: Vertex Cover
- Traditional: NP-complete, no known polynomial
- Parameterized: O(2^k × n) where k = cover size
- Fixed-parameter tractable (FPT) when k is small
If k = 10: 2^10 × n = 1024n (practical!)
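The O(2^k × n) bound comes from a simple branching algorithm: every edge must have an endpoint in the cover, so branch on which endpoint it is (a sketch; the edge-list representation and function name are illustrative):

```python
def has_vertex_cover(edges, k):
    """Return True if some set of <= k vertices covers every edge.

    Branching: pick an uncovered edge (u, v); either u or v is in the
    cover.  Recursion depth <= k with two branches per level gives
    O(2^k) nodes, each doing O(len(edges)) filtering work.
    """
    if not edges:
        return True           # nothing left to cover
    if k == 0:
        return False          # edges remain but budget is exhausted
    u, v = edges[0]
    # Branch 1: put u in the cover, drop all edges touching u
    rest_u = [e for e in edges if u not in e]
    # Branch 2: put v in the cover, drop all edges touching v
    rest_v = [e for e in edges if v not in e]
    return has_vertex_cover(rest_u, k - 1) or has_vertex_cover(rest_v, k - 1)

# A triangle needs 2 vertices; 1 is not enough
triangle = [(0, 1), (1, 2), (0, 2)]
```

The running time depends exponentially only on the budget k, not on the graph size, which is exactly what fixed-parameter tractability means.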
8.5 Advanced Data Structures
Fibonacci Heap Details:
Operations and amortized costs:
- Insert: O(1)
- Find-min: O(1)
- Union: O(1)
- Decrease-key: O(1)
- Delete-min: O(log n)
Potential function:
Φ(H) = t(H) + 2m(H)
where t = number of trees, m = number of marked nodes
Key techniques:
- Lazy deletion: mark instead of immediate deletion
- Cascading cuts: maintain degree bound
- Consolidation: clean up during delete-min
Applications:
- Dijkstra's shortest path: O(E + V log V) with Fibonacci heap
vs. O(E log V) with binary heap
- Prim's MST: O(E + V log V)
- Improved performance when decrease-key operations are frequent
Union-Find with Path Compression:
With path compression and union by rank, all operations run in O(α(n)) amortized time,
where α is the inverse Ackermann function described in Section 8.3.
Nearly O(1) for all realistic inputs.
Amortized Space Complexity:
class DynamicArray:
def __init__(self):
self.capacity = 1
self.size = 0
self.array = [None] * self.capacity
def append(self, item):
if self.size == self.capacity:
# Resize: O(n) space temporarily during copy
self.capacity *= 2
new_array = [None] * self.capacity
for i in range(self.size):
new_array[i] = self.array[i]
self.array = new_array
self.array[self.size] = item
self.size += 1
# Peak space during resize: old + new array = n + 2n = 3n = O(n)
# Space at any moment (counting wasted slots) varies between n and 2n
# So the overhead from unused capacity is at most a factor of 2: still O(n)
Persistent Data Structures:
Persistent data structures maintain all previous versions:
Example: Persistent balanced tree
- Regular BST: O(n) space
- Persistent BST: O(n + m log n) space for m operations
- Each update creates O(log n) new nodes (path to root)
Applications:
- Version control systems
- Functional programming languages
- Undo/redo functionality
Retroactive Data Structures:
Support operations in the past without rebuilding:
Example: Retroactive priority queue
- Insert(t, x): Insert x at time t in the past
- Delete(t): Delete operation that occurred at time t
- Query: What's the state now after retroactive changes?
Complexity trade-off:
- Regular operations: O(log n)
- Retroactive operations: O(log n) to O(√n) depending on structure
8.6 Lower Bounds
Comparison-Based Sorting:
Theorem: Any comparison-based sorting requires Ω(n log n) comparisons
Proof: Decision tree argument
- n! possible permutations
- Binary tree height: log₂(n!) = Θ(n log n)
Cannot do better without additional assumptions!
Searching in Unsorted Array:
Lower bound: Ω(n)
Must examine all elements potentially
Cannot do better without additional structure
8.7 When Asymptotic Analysis Fails
Non-Comparable Functions:
# f(n) = n^(1 + sin(n))
# Oscillates between:
# n^0 = 1 (when sin(n) = -1)
# n^2 (when sin(n) = 1)
# Cannot be characterized by a single tight bound!
# f = O(n²) and f = Ω(1), but f ≠ Θ(n^k) for any constant k
# Tight asymptotic bounds need eventually comparable growth, which oscillation defeats
Cache Effects:
# Algorithm A: O(n), poor cache locality
import random
def algorithm_a(arr):
    # Random access pattern - cache misses
    for i in random.sample(range(len(arr)), len(arr)):
        process(arr[i])
# Algorithm B: O(n), good cache locality
def algorithm_b(arr):
# Sequential access - cache hits
for item in arr:
process(item)
# Same Big O, but B can be 10x faster due to caching!
# Asymptotic notation doesn't capture memory hierarchy
Parallel Algorithms:
# Sequential: O(n)
def sequential_sum(arr):
return sum(arr)
# Parallel: O(n/p) where p = processors
# But asymptotically still O(n)
def parallel_sum(arr, p):
# Divide among p processors
# Each processes n/p elements
# Combine results: O(log p)
pass
# Asymptotic notation doesn't capture parallelism well
# Need parallel complexity classes: NC, P-complete
Real-Time Constraints:
# O(1) amortized, but occasional O(n) spike
class DynamicArray:
def append(self, item):
# Usually O(1)
# Occasionally O(n) during resize
# Amortized O(1) doesn't help with hard real-time deadlines!
# Asymptotic analysis doesn't capture variance or worst-case latency
9. Summary and Quick Reference
9.1 Asymptotic Notation Summary
| Notation | Meaning      | Limit Form      | Use                     |
| -------- | ------------ | --------------- | ----------------------- |
| f = O(g) | Upper bound  | lim sup f/g < ∞ | Worst-case guarantees   |
| f = Ω(g) | Lower bound  | lim inf f/g > 0 | Best-case, lower bounds |
| f = Θ(g) | Tight bound  | 0 < lim f/g < ∞ | Exact growth rate       |
| f = o(g) | Strict upper | lim f/g = 0     | Strictly slower         |
| f = ω(g) | Strict lower | lim f/g = ∞     | Strictly faster         |
9.2 Common Complexities
Time:
- O(1): Constant - array access, arithmetic
- O(log n): Logarithmic - binary search, balanced trees
- O(n): Linear - array traversal
- O(n log n): Linearithmic - efficient sorting
- O(n²): Quadratic - nested loops
- O(2ⁿ): Exponential - subsets, naive recursion
- O(n!): Factorial - permutations
Space:
- O(1): Constant - in-place algorithms
- O(log n): Logarithmic - recursive binary search
- O(n): Linear - copies, hash tables
- O(n²): Quadratic - adjacency matrices
9.3 Analysis Checklist
Time Complexity:
☐ Identify basic operation
☐ Count as function of n
☐ Analyze best/average/worst cases
☐ Simplify to dominant term
☐ Express in Big O/Θ/Ω
Space Complexity:
☐ Count primitive variables
☐ Identify data structures
☐ Determine recursion depth
☐ Calculate peak usage
☐ Simplify to dominant term
Amortized Analysis:
☐ Identify expensive operations
☐ Choose method (aggregate/accounting/potential)
☐ Prove credit invariant or potential ≥ 0
☐ Calculate amortized bound
9.4 Decision Guide
Choose Algorithm Based On:
If reliability critical → Guarantee worst-case (Merge sort, AVL tree)
If average performance matters → Optimize typical case (Quicksort, Hash table)
If inputs often structured → Use adaptive (Timsort)
If space limited → In-place algorithms (Heap sort)
If real-time deadlines → Avoid amortized (no dynamic arrays in critical path)
9.5 Common Mistakes to Avoid
- Big O ≠ worst-case (it's about growth rate, not case analysis)
- Constants matter for practical performance (but not asymptotically)
- Hidden operations (sort inside loop multiplies complexity)
- Space forgotten (exponential space makes algorithm impractical)
- Amortized ≠ average (deterministic vs. probabilistic)
- Best case ≠ typical (don't over-optimize for rare inputs)
9.6 Essential Rules
For Complexity Analysis:
- Always consider both time AND space
- Analyze best, average, AND worst cases
- Don't ignore constant factors when choosing between same Big O
- Consider input characteristics (sorted, random, adversarial)
- Amortized analysis for sequences, not single operations
- Real-world: cache effects, parallelism matter beyond Big O
For Algorithm Selection:
- Match complexity to problem scale
- Worst-case guarantees for critical systems
- Average-case for general-purpose applications
- Consider space constraints (embedded systems, large data)
- Library implementations often beat textbook versions
- Benchmark when asymptotic analysis unclear
10. Conclusion
Complexity analysis provides the fundamental framework for understanding, comparing, and optimizing algorithms. By abstracting away hardware-specific details and focusing on growth rates, it enables us to:
- Predict scalability of solutions
- Compare algorithms objectively
- Identify optimization opportunities
- Understand theoretical limits
- Make informed design decisions
Key Principles:
- Growth rate dominates for large inputs
- Different cases (best/average/worst) reveal different insights
- Amortized analysis provides realistic bounds for sequences
- Space-time tradeoffs are fundamental
- Asymptotic notation is precise mathematical language
Practical Application:
- Use O for guarantees and communication
- Use Θ for precise characterization
- Consider all cases for complete picture
- Apply amortized analysis for dynamic structures
- Always consider both time and space
Mastery of complexity analysis is essential for any computer scientist or software engineer, forming the foundation for algorithm design, system architecture, and performance optimization.