Randomized

Topic type: Algorithm Paradigm (randomized strategies; skip list is the one data-structure variant) LeetCode problem count: ~25–40 problems — core algorithms, main patterns, key complexities covered.

1. Core Concepts

Randomized algorithm — an algorithm that makes random choices during execution, trading deterministic guarantees for expected-case performance or simplicity. Two classes: Las Vegas (always correct, random runtime) and Monte Carlo (random correctness, bounded runtime).

Las Vegas algorithm — always produces the correct answer; randomness only affects how long it takes. Expected runtime is well-defined. Randomized quickselect is Las Vegas.

Monte Carlo algorithm — runs in bounded time but may return an incorrect answer with some probability. Repeated trials reduce error exponentially. Primality testing (Miller-Rabin) is Monte Carlo.

Expected time complexity — average cost over all random choices the algorithm could make, independent of input. Distinct from average-case complexity (which averages over inputs).

Reservoir sampling — technique to select k items uniformly at random from a stream of unknown or very large size n using O(k) memory. Each element seen so far has equal probability k/i of being in the reservoir after i elements.

Weighted random sampling — select an index i with probability proportional to weight[i] / sum(weights). Reduces to a prefix-sum binary-search problem or the alias method.

Fisher-Yates shuffle — in-place algorithm producing a uniformly random permutation of n elements in O(n) time. Each of the n! permutations is equally likely.

Quickselect — randomized algorithm to find the kth smallest element in expected O(n) time by partitioning around a random pivot, recursing only into the partition that contains rank k.

Rejection sampling — generate random samples from a target distribution by drawing from an easy distribution and accepting with a probability proportional to the target density. Used to sample uniformly from non-rectangular regions (e.g., a circle).

Skip list — probabilistic linked-list variant with O(log n) expected search, insert, and delete. Each node is promoted to the next level with probability p (typically 0.5), creating a multi-level index. Provides sorted-order operations like a BST but simpler to implement correctly.

Alias method — preprocessing technique enabling O(1) weighted random sampling after O(n) setup, by decomposing the weight distribution into n pairs (alias, probability) so each draw requires only one random integer and one random float.

2. Types / Variants

Variant	What it is	When to prefer it
Reservoir sampling (k=1)	Keep current item with prob 1/i at step i	Streaming, unknown length, single random pick
Reservoir sampling (k>1)	Maintain reservoir of size k; replace with prob k/i	Streaming k-sample; memory-bounded
Randomized quickselect	Partition around random pivot; recurse on one side	kth-order statistics in O(n) expected without sorting
Weighted random sampling (prefix sum)	Binary search on cumulative weights array	Static weights, infrequent draws, O(log n) per draw
Weighted random sampling (alias method)	O(n) preprocess; O(1) per draw	Static weights, very frequent draws
Fisher-Yates shuffle	Swap each position i with a random position j ≥ i	Full permutation needed in O(n)
Rejection sampling	Draw from bounding shape; accept if inside target	Non-rectangular distributions (e.g., circle)
Skip list	Probabilistic multi-level linked list	Sorted order + dynamic inserts + simple implementation

3. Key Algorithms

Reservoir Sampling (k = 1)

Selects one item uniformly at random from a stream of unknown length. Key insight: after seeing i items, the current item replaces the reservoir with probability 1/i, maintaining uniform distribution by induction.

import random
reservoir = None
for i, item in enumerate(stream, 1):
    if random.randint(1, i) == 1:
        reservoir = item

Time O(n), space O(1).

Reservoir Sampling (k > 1)

Fill reservoir with first k items, then for each subsequent item i (i > k), swap it into a random position of the reservoir with probability k/i.

reservoir = stream[:k]
for i in range(k, len(stream)):
    j = random.randint(0, i)
    if j < k:
        reservoir[j] = stream[i]

Time O(n), space O(k). Each item has exactly k/n probability of being in the final reservoir.

Randomized Quickselect

Finds the kth smallest element (0-indexed) in expected O(n) time. Partition around a random pivot; if pivot lands at rank k, done. Otherwise recurse into the correct half only — this is what separates it from full quicksort.

Key insight: random pivot avoids adversarial O(n²) worst case. Expected recurrence: T(n) = T(3n/4) + O(n) in expectation → O(n).

def quickselect(nums, k):
    pivot = random.choice(nums)
    lo = [x for x in nums if x < pivot]
    mid = [x for x in nums if x == pivot]
    hi = [x for x in nums if x > pivot]
    if k < len(lo): return quickselect(lo, k)
    elif k < len(lo) + len(mid): return pivot
    else: return quickselect(hi, k - len(lo) - len(mid))

Time O(n) expected, O(n²) worst case; space O(n) due to list creation (in-place variant uses O(log n) stack).

Fisher-Yates Shuffle

Produces a uniformly random permutation in-place. At step i, swap element i with a randomly chosen element from positions i..n-1. Key invariant: after processing position i, the first i+1 positions hold a uniform random sub-permutation.

for i in range(len(nums) - 1, 0, -1):
    j = random.randint(0, i)
    nums[i], nums[j] = nums[j], nums[i]

Time O(n), space O(1). The naive alternative (swap with any random position) is not uniform — it produces 3^n outcomes for n! permutations and is biased.

Weighted Random Sampling via Prefix Sum

Precompute cumulative weights; a uniform random float in [0, total] maps to an index via binary search (bisect).

import bisect, random
prefix = list(itertools.accumulate(weights))
val = random.random() * prefix[-1]
return bisect.bisect_left(prefix, val)

Setup O(n), per-draw O(log n). Correct for non-integer weights. Use when draws are infrequent relative to setup cost.

Rejection Sampling

To sample uniformly inside a circle of radius r: draw (x, y) uniformly from the bounding square [-r, r]²; reject and redraw if x²+y² > r². Expected draws per accepted sample = area(square)/area(circle) = 4/π ≈ 1.27.

Key insight: the acceptance probability must equal (target density) / (proposal density) × constant. If the ratio exceeds 1 anywhere, the proposal distribution is invalid.

Time O(1) expected per sample (geometric distribution on acceptance), space O(1).

Alias Method (O(1) Weighted Sampling)

Normalize weights to sum to n. Use a two-stack (small/large) algorithm to pair each "underfull" bucket with one "overfull" bucket, splitting the overfull bucket's excess. Each bucket stores (probability, alias). A draw picks a bucket uniformly then flips a biased coin.

Setup O(n), per-draw O(1). Prefer over prefix-sum binary search when draws vastly outnumber setup calls. Not worth implementing unless draw count >> n.

4. Advanced Techniques

Skip List Construction and Query

Solves: sorted dynamic sets requiring O(log n) expected insert, delete, search without tree rotations.

Mechanism: each node independently promoted to level l with probability p^l (geometric distribution). This creates a tower; higher levels act as express lanes. Search descends level by level, moving forward until the next node exceeds the target, then drops down.

When to prefer over BST/sorted list: when implementation simplicity matters and worst-case guarantees are not required. A sorted array with binary search is O(log n) search but O(n) insert; a skip list achieves O(log n) expected for both. Prefer a balanced BST (or Python sortedcontainers.SortedList) if worst-case bounds are needed.

Expected O(log n) per operation, space O(n log n) expected.

Randomized Pivot in Partition (Introsort fallback)

Solves: avoiding worst-case O(n²) on sorted or adversarial inputs in quicksort/quickselect.

Mechanism: choose pivot as random element (or median of three random elements). Adversary cannot construct a worst-case input without knowing the RNG state.

When to prefer over deterministic pivot: always prefer for competitive-programming inputs where sorted/reverse-sorted arrays are common adversarial cases. Median-of-3 is often sufficient; true random pivot is theoretically cleaner.

Expected O(n log n) sort, O(n) select.

5. Problem Taxonomy

By Problem Category

Problem Type	What it asks	Key Technique
Stream sampling	Pick k items uniformly from a stream	Reservoir sampling
Order statistics	Find kth largest/smallest efficiently	Randomized quickselect
Uniform permutation	Shuffle array so all permutations equally likely	Fisher-Yates
Weighted random index	Pick index proportional to weights	Prefix sum + binary search or alias method
Uniform point in shape	Generate random point inside circle/rectangle	Rejection sampling or polar coordinates
Sorted dynamic set	Insert/delete/search in sorted order	Skip list
Random node in linked list	Pick uniformly without knowing length	Reservoir sampling (k=1)

Problem Signal → Technique

Signal in problem statement	Likely technique
"unknown length" / "stream" / "cannot store all"	Reservoir sampling
"pick uniformly at random" from stream or large dataset	Reservoir sampling
"kth largest" / "kth smallest" in unsorted array	Quickselect (or heap; quickselect for expected O(n))
"shuffle" / "random permutation"	Fisher-Yates
"random pick" with given weight array	Prefix sum + bisect
"random point in circle" / "uniform inside shape"	Rejection sampling
"design a data structure" with random pick in O(1)	Hash map + array (see design.md)
"at least equal probability" / "each element equally likely"	Verify uniform distribution invariant

6. Common Patterns

Pattern	When it applies	Mechanism
Induction on stream position	Reservoir sampling correctness proof	Show P(item i in reservoir) = k/i; verify step i+1 maintains it
Partition-and-recurse on one side	Quickselect, binary search on random pivot	After partition, rank of pivot known; recurse only into relevant half
Cumulative weight + bisect	Weighted sampling, discrete CDFs	`bisect_left(prefix, random * total)` maps uniform float to index
Bounding-box accept/reject	Sampling from irregular regions	Draw from easy enclosing shape; accept if inside target
Two-pointer partition (in-place)	In-place quickselect, Dutch National Flag	Lomuto or Hoare partition; avoids O(n) extra space per level
Level promotion by coin flip	Skip list node height	Flip coin until tails; node height = number of heads

7. Edge Cases & Pitfalls

Biased naive shuffle — swapping each position with any random position (not just positions ≥ i) produces a non-uniform distribution. The Fisher-Yates swap must restrict j to [i, n-1] (or equivalently [0, i] when iterating backward). This is the most common shuffle bug.
Off-by-one in reservoir sampling — using random.randint(0, i) vs random.randint(1, i+1) for 1-indexed streams. The inclusion probability must be exactly 1/i; any other denominator breaks uniformity. Verify the denominator matches the count of elements seen so far.
Integer overflow in prefix sums — if weights are large integers, cumulative sum may overflow in languages with fixed-width integers. In Python this is not an issue; in C++/Java use long.
Float precision in weighted sampling — random.random() * total can produce values at floating-point boundaries. Using bisect_left rather than manual comparison avoids fence-post errors.
Quickselect worst case on sorted input — a fixed pivot (e.g., always first element) degrades to O(n²) on sorted arrays. Always randomize the pivot before partitioning.
Rejection sampling infinite loop — if the target region has zero area relative to the bounding shape (or the acceptance condition is wrong), the loop never terminates. Verify acceptance probability > 0 and that the condition correctly identifies the target region (use x*x + y*y <= r*r, not < r).
Reservoir size < k — if the stream has fewer than k elements, the reservoir holds all of them; this is valid and correct. Ensure the caller handles a return value smaller than k.
Modifying stream during reservoir sampling — reservoir sampling assumes the stream is read-only. Swapping elements in the original array while building the reservoir corrupts the algorithm.
Skip list level cap — without an explicit max-level cap, the geometric distribution can generate arbitrarily tall nodes on unlucky coin flips. Cap at log_{1/p}(n) + constant to keep space O(n log n) in expectation.

8. Implementation Notes

Python's random.randint(a, b) is inclusive on both ends — random.randint(0, i) has i+1 outcomes, not i. Double-check the range when implementing reservoir sampling.
random.choice(seq) requires a sequence with known length; it cannot draw from a generator. For streaming data, maintain the reservoir explicitly.
bisect.bisect_left finds the leftmost index where the value could be inserted; for prefix-sum weighted sampling this correctly handles ties in cumulative weights.
random.shuffle(lst) in Python implements Fisher-Yates internally and is uniform; prefer it over manual implementation unless the problem requires implementing the shuffle itself.
Python's random module uses Mersenne Twister — not cryptographically secure. For interview problems this is irrelevant; note it only if the problem mentions security.
When implementing quickselect in-place, the Lomuto partition scheme is simpler but slightly slower than Hoare; both are O(n) expected per level.
itertools.accumulate(weights) produces prefix sums lazily; convert to a list before calling bisect since bisect requires random access.

9. Cross-Topic Relations

Related topic	Connection
`probability-and-statistics.md`	Theoretical foundation: expected value, variance, geometric distribution underlie reservoir sampling and skip list analysis
`binary-search.md`	Prefix-sum weighted sampling uses bisect (binary search on cumulative weights array)
`sorting.md`	Randomized quicksort is the randomized cousin of quicksort; Fisher-Yates is a prerequisite for understanding random permutation
`heap-priority-queue.md`	Alternative to quickselect for kth-order statistics: heap gives O(n log k) worst-case guaranteed vs. O(n) expected for quickselect
`data-stream.md`	Reservoir sampling is the primary algorithm for random sampling from data streams; the two topics appear together frequently
`design.md`	"Random Pick Index" and "Insert Delete GetRandom O(1)" combine randomized selection with hash map + array design
`divide-and-conquer.md`	Randomized quickselect is a divide-and-conquer algorithm where the subproblem size is random rather than deterministic

10. Interview Reference

Must-Know Problems

Reservoir Sampling

LeetCode 382 — Linked List Random Node (k=1 reservoir sampling over a linked list)
LeetCode 398 — Random Pick Index (reservoir sampling for duplicate indices)

Weighted Random Sampling

LeetCode 528 — Random Pick with Weight (prefix sum + bisect; core weighted sampling)

Shuffle

LeetCode 384 — Shuffle an Array (Fisher-Yates; must reset to original)

Random Geometry

LeetCode 478 — Generate Random Point in a Circle (rejection sampling or polar method)
LeetCode 497 — Random Point in Non-overlapping Rectangles (weighted area sampling + uniform point in chosen rectangle)
LeetCode 519 — Random Flip Matrix (sparse random flip without resample; hash map trick)

Order Statistics

LeetCode 215 — Kth Largest Element in an Array (quickselect vs. heap — know both)
LeetCode 973 — K Closest Points to Origin (quickselect variant on distance)

Skip List

LeetCode 1206 — Design Skiplist (implement insert, erase, search with probabilistic levels)

Clarification Questions to Ask

Is the input a stream (unknown/very large n) or an array loaded into memory? — determines whether reservoir sampling or simpler random.choice applies.
Are weights integers or floats? Can weights be zero? — zero weights require skipping in prefix-sum approach or handling in alias method.
Must the implementation be reproducible (fixed seed) or truly random? — affects whether to expose a seed parameter.
For shuffle problems: must the original array be restorable, or is in-place modification acceptable?
For geometry problems: is the boundary included (closed region)? — affects the inequality (<= vs <).
What is the call frequency ratio of setup vs. draw operations? — determines alias method vs. prefix-sum bisect.

Common Interview Mistakes

Implementing biased shuffle (swapping with full range [0, n-1]) instead of Fisher-Yates; interviewers specifically watch for this.
Using a fixed pivot in quickselect without randomization, then being unable to defend against worst-case O(n²) on sorted input.
Forgetting that random.randint(a, b) in Python is inclusive — producing off-by-one probability errors in reservoir sampling.
Implementing weighted sampling by drawing a random integer in [0, total_weight-1] instead of a float in [0, total_weight), which is biased when weights are not all equal and total is not a power of two.
Not handling the case where all weights are equal separately — the generic code handles it, but interviewers may ask for the O(1) special case.
Claiming O(1) per draw for prefix-sum weighted sampling (it is O(log n)); confusing it with the alias method.
For reservoir sampling: not explaining why the algorithm is uniform — be ready to prove it by induction.

Typical Follow-Up Escalations

"Now make it work for a stream where you can't go back" → reservoir sampling (already streaming, but emphasize single-pass constraint).
"What if weights change dynamically?" → prefix-sum approach breaks; use a Binary Indexed Tree (Fenwick tree) for O(log n) update and O(log n) draw. See binary-indexed-tree.md.
"Can you do weighted sampling in O(1) per draw?" → alias method; explain O(n) preprocessing trade-off.
"What's the worst case for quickselect, and how do you avoid it?" → O(n²) with adversarial pivot; use random pivot or median-of-medians for O(n) worst case (at higher constant).
"Implement shuffle without extra space" → Fisher-Yates is already in-place; clarify the constraint if they push further.
"What if the circle problem uses a very small radius inside a large bounding box?" → rejection sampling becomes slow; polar coordinate method (r = R·√(uniform), θ = 2π·uniform) avoids rejection entirely.

Explore Unread

Great job! You've read all available articles

Randomized

1. Core Concepts

Las Vegas algorithm — always produces the correct answer; randomness only affects how long it takes. Expected runtime is well-defined. Randomized quickselect is Las Vegas.

Expected time complexity — average cost over all random choices the algorithm could make, independent of input. Distinct from average-case complexity (which averages over inputs).

Weighted random sampling — select an index i with probability proportional to weight[i] / sum(weights). Reduces to a prefix-sum binary-search problem or the alias method.

Fisher-Yates shuffle — in-place algorithm producing a uniformly random permutation of n elements in O(n) time. Each of the n! permutations is equally likely.

Quickselect — randomized algorithm to find the kth smallest element in expected O(n) time by partitioning around a random pivot, recursing only into the partition that contains rank k.

2. Types / Variants

Variant	What it is	When to prefer it
Reservoir sampling (k=1)	Keep current item with prob 1/i at step i	Streaming, unknown length, single random pick
Reservoir sampling (k>1)	Maintain reservoir of size k; replace with prob k/i	Streaming k-sample; memory-bounded
Randomized quickselect	Partition around random pivot; recurse on one side	kth-order statistics in O(n) expected without sorting
Weighted random sampling (prefix sum)	Binary search on cumulative weights array	Static weights, infrequent draws, O(log n) per draw
Weighted random sampling (alias method)	O(n) preprocess; O(1) per draw	Static weights, very frequent draws
Fisher-Yates shuffle	Swap each position i with a random position j ≥ i	Full permutation needed in O(n)
Rejection sampling	Draw from bounding shape; accept if inside target	Non-rectangular distributions (e.g., circle)
Skip list	Probabilistic multi-level linked list	Sorted order + dynamic inserts + simple implementation

3. Key Algorithms

Reservoir Sampling (k = 1)

import random
reservoir = None
for i, item in enumerate(stream, 1):
    if random.randint(1, i) == 1:
        reservoir = item

Time O(n), space O(1).

Reservoir Sampling (k > 1)

Fill reservoir with first k items, then for each subsequent item i (i > k), swap it into a random position of the reservoir with probability k/i.

reservoir = stream[:k]
for i in range(k, len(stream)):
    j = random.randint(0, i)
    if j < k:
        reservoir[j] = stream[i]

Time O(n), space O(k). Each item has exactly k/n probability of being in the final reservoir.

Randomized Quickselect

Key insight: random pivot avoids adversarial O(n²) worst case. Expected recurrence: T(n) = T(3n/4) + O(n) in expectation → O(n).

def quickselect(nums, k):
    pivot = random.choice(nums)
    lo = [x for x in nums if x < pivot]
    mid = [x for x in nums if x == pivot]
    hi = [x for x in nums if x > pivot]
    if k < len(lo): return quickselect(lo, k)
    elif k < len(lo) + len(mid): return pivot
    else: return quickselect(hi, k - len(lo) - len(mid))

Time O(n) expected, O(n²) worst case; space O(n) due to list creation (in-place variant uses O(log n) stack).

Fisher-Yates Shuffle

for i in range(len(nums) - 1, 0, -1):
    j = random.randint(0, i)
    nums[i], nums[j] = nums[j], nums[i]

Time O(n), space O(1). The naive alternative (swap with any random position) is not uniform — it produces 3^n outcomes for n! permutations and is biased.

Weighted Random Sampling via Prefix Sum

Precompute cumulative weights; a uniform random float in [0, total] maps to an index via binary search (bisect).

import bisect, random
prefix = list(itertools.accumulate(weights))
val = random.random() * prefix[-1]
return bisect.bisect_left(prefix, val)

Setup O(n), per-draw O(log n). Correct for non-integer weights. Use when draws are infrequent relative to setup cost.

Rejection Sampling

Key insight: the acceptance probability must equal (target density) / (proposal density) × constant. If the ratio exceeds 1 anywhere, the proposal distribution is invalid.

Time O(1) expected per sample (geometric distribution on acceptance), space O(1).

Alias Method (O(1) Weighted Sampling)

Setup O(n), per-draw O(1). Prefer over prefix-sum binary search when draws vastly outnumber setup calls. Not worth implementing unless draw count >> n.

4. Advanced Techniques

Skip List Construction and Query

Solves: sorted dynamic sets requiring O(log n) expected insert, delete, search without tree rotations.

Expected O(log n) per operation, space O(n log n) expected.

Randomized Pivot in Partition (Introsort fallback)

Solves: avoiding worst-case O(n²) on sorted or adversarial inputs in quicksort/quickselect.

Mechanism: choose pivot as random element (or median of three random elements). Adversary cannot construct a worst-case input without knowing the RNG state.

Expected O(n log n) sort, O(n) select.

5. Problem Taxonomy

By Problem Category

Problem Type	What it asks	Key Technique
Stream sampling	Pick k items uniformly from a stream	Reservoir sampling
Order statistics	Find kth largest/smallest efficiently	Randomized quickselect
Uniform permutation	Shuffle array so all permutations equally likely	Fisher-Yates
Weighted random index	Pick index proportional to weights	Prefix sum + binary search or alias method
Uniform point in shape	Generate random point inside circle/rectangle	Rejection sampling or polar coordinates
Sorted dynamic set	Insert/delete/search in sorted order	Skip list
Random node in linked list	Pick uniformly without knowing length	Reservoir sampling (k=1)

Problem Signal → Technique

Signal in problem statement	Likely technique
"unknown length" / "stream" / "cannot store all"	Reservoir sampling
"pick uniformly at random" from stream or large dataset	Reservoir sampling
"kth largest" / "kth smallest" in unsorted array	Quickselect (or heap; quickselect for expected O(n))
"shuffle" / "random permutation"	Fisher-Yates
"random pick" with given weight array	Prefix sum + bisect
"random point in circle" / "uniform inside shape"	Rejection sampling
"design a data structure" with random pick in O(1)	Hash map + array (see design.md)
"at least equal probability" / "each element equally likely"	Verify uniform distribution invariant

6. Common Patterns

Pattern	When it applies	Mechanism
Induction on stream position	Reservoir sampling correctness proof	Show P(item i in reservoir) = k/i; verify step i+1 maintains it
Partition-and-recurse on one side	Quickselect, binary search on random pivot	After partition, rank of pivot known; recurse only into relevant half
Cumulative weight + bisect	Weighted sampling, discrete CDFs	`bisect_left(prefix, random * total)` maps uniform float to index
Bounding-box accept/reject	Sampling from irregular regions	Draw from easy enclosing shape; accept if inside target
Two-pointer partition (in-place)	In-place quickselect, Dutch National Flag	Lomuto or Hoare partition; avoids O(n) extra space per level
Level promotion by coin flip	Skip list node height	Flip coin until tails; node height = number of heads

7. Edge Cases & Pitfalls

Biased naive shuffle — swapping each position with any random position (not just positions ≥ i) produces a non-uniform distribution. The Fisher-Yates swap must restrict j to [i, n-1] (or equivalently [0, i] when iterating backward). This is the most common shuffle bug.
Off-by-one in reservoir sampling — using random.randint(0, i) vs random.randint(1, i+1) for 1-indexed streams. The inclusion probability must be exactly 1/i; any other denominator breaks uniformity. Verify the denominator matches the count of elements seen so far.
Integer overflow in prefix sums — if weights are large integers, cumulative sum may overflow in languages with fixed-width integers. In Python this is not an issue; in C++/Java use long.
Float precision in weighted sampling — random.random() * total can produce values at floating-point boundaries. Using bisect_left rather than manual comparison avoids fence-post errors.
Quickselect worst case on sorted input — a fixed pivot (e.g., always first element) degrades to O(n²) on sorted arrays. Always randomize the pivot before partitioning.
Rejection sampling infinite loop — if the target region has zero area relative to the bounding shape (or the acceptance condition is wrong), the loop never terminates. Verify acceptance probability > 0 and that the condition correctly identifies the target region (use x*x + y*y <= r*r, not < r).
Reservoir size < k — if the stream has fewer than k elements, the reservoir holds all of them; this is valid and correct. Ensure the caller handles a return value smaller than k.
Modifying stream during reservoir sampling — reservoir sampling assumes the stream is read-only. Swapping elements in the original array while building the reservoir corrupts the algorithm.
Skip list level cap — without an explicit max-level cap, the geometric distribution can generate arbitrarily tall nodes on unlucky coin flips. Cap at log_{1/p}(n) + constant to keep space O(n log n) in expectation.

8. Implementation Notes

Python's random.randint(a, b) is inclusive on both ends — random.randint(0, i) has i+1 outcomes, not i. Double-check the range when implementing reservoir sampling.
random.choice(seq) requires a sequence with known length; it cannot draw from a generator. For streaming data, maintain the reservoir explicitly.
bisect.bisect_left finds the leftmost index where the value could be inserted; for prefix-sum weighted sampling this correctly handles ties in cumulative weights.
random.shuffle(lst) in Python implements Fisher-Yates internally and is uniform; prefer it over manual implementation unless the problem requires implementing the shuffle itself.
Python's random module uses Mersenne Twister — not cryptographically secure. For interview problems this is irrelevant; note it only if the problem mentions security.
When implementing quickselect in-place, the Lomuto partition scheme is simpler but slightly slower than Hoare; both are O(n) expected per level.
itertools.accumulate(weights) produces prefix sums lazily; convert to a list before calling bisect since bisect requires random access.

9. Cross-Topic Relations

Related topic	Connection
`probability-and-statistics.md`	Theoretical foundation: expected value, variance, geometric distribution underlie reservoir sampling and skip list analysis
`binary-search.md`	Prefix-sum weighted sampling uses bisect (binary search on cumulative weights array)
`sorting.md`	Randomized quicksort is the randomized cousin of quicksort; Fisher-Yates is a prerequisite for understanding random permutation
`heap-priority-queue.md`	Alternative to quickselect for kth-order statistics: heap gives O(n log k) worst-case guaranteed vs. O(n) expected for quickselect
`data-stream.md`	Reservoir sampling is the primary algorithm for random sampling from data streams; the two topics appear together frequently
`design.md`	"Random Pick Index" and "Insert Delete GetRandom O(1)" combine randomized selection with hash map + array design
`divide-and-conquer.md`	Randomized quickselect is a divide-and-conquer algorithm where the subproblem size is random rather than deterministic

10. Interview Reference

Must-Know Problems

Reservoir Sampling

LeetCode 382 — Linked List Random Node (k=1 reservoir sampling over a linked list)
LeetCode 398 — Random Pick Index (reservoir sampling for duplicate indices)

Weighted Random Sampling

LeetCode 528 — Random Pick with Weight (prefix sum + bisect; core weighted sampling)

Shuffle

LeetCode 384 — Shuffle an Array (Fisher-Yates; must reset to original)

Random Geometry

LeetCode 478 — Generate Random Point in a Circle (rejection sampling or polar method)
LeetCode 497 — Random Point in Non-overlapping Rectangles (weighted area sampling + uniform point in chosen rectangle)
LeetCode 519 — Random Flip Matrix (sparse random flip without resample; hash map trick)

Order Statistics

LeetCode 215 — Kth Largest Element in an Array (quickselect vs. heap — know both)
LeetCode 973 — K Closest Points to Origin (quickselect variant on distance)

Skip List

LeetCode 1206 — Design Skiplist (implement insert, erase, search with probabilistic levels)

Clarification Questions to Ask

Is the input a stream (unknown/very large n) or an array loaded into memory? — determines whether reservoir sampling or simpler random.choice applies.
Are weights integers or floats? Can weights be zero? — zero weights require skipping in prefix-sum approach or handling in alias method.
Must the implementation be reproducible (fixed seed) or truly random? — affects whether to expose a seed parameter.
For shuffle problems: must the original array be restorable, or is in-place modification acceptable?
For geometry problems: is the boundary included (closed region)? — affects the inequality (<= vs <).
What is the call frequency ratio of setup vs. draw operations? — determines alias method vs. prefix-sum bisect.

Common Interview Mistakes

Implementing biased shuffle (swapping with full range [0, n-1]) instead of Fisher-Yates; interviewers specifically watch for this.
Using a fixed pivot in quickselect without randomization, then being unable to defend against worst-case O(n²) on sorted input.
Forgetting that random.randint(a, b) in Python is inclusive — producing off-by-one probability errors in reservoir sampling.
Implementing weighted sampling by drawing a random integer in [0, total_weight-1] instead of a float in [0, total_weight), which is biased when weights are not all equal and total is not a power of two.
Not handling the case where all weights are equal separately — the generic code handles it, but interviewers may ask for the O(1) special case.
Claiming O(1) per draw for prefix-sum weighted sampling (it is O(log n)); confusing it with the alias method.
For reservoir sampling: not explaining why the algorithm is uniform — be ready to prove it by induction.

Typical Follow-Up Escalations

"Now make it work for a stream where you can't go back" → reservoir sampling (already streaming, but emphasize single-pass constraint).
"What if weights change dynamically?" → prefix-sum approach breaks; use a Binary Indexed Tree (Fenwick tree) for O(log n) update and O(log n) draw. See binary-indexed-tree.md.
"Can you do weighted sampling in O(1) per draw?" → alias method; explain O(n) preprocessing trade-off.
"What's the worst case for quickselect, and how do you avoid it?" → O(n²) with adversarial pivot; use random pivot or median-of-medians for O(n) worst case (at higher constant).
"Implement shuffle without extra space" → Fisher-Yates is already in-place; clarify the constraint if they push further.
"What if the circle problem uses a very small radius inside a large bounding box?" → rejection sampling becomes slow; polar coordinate method (r = R·√(uniform), θ = 2π·uniform) avoids rejection entirely.

Explore Unread

Great job! You've read all available articles

Randomized

1. Core Concepts

2. Types / Variants

3. Key Algorithms

Reservoir Sampling (k = 1)

Reservoir Sampling (k > 1)

Randomized Quickselect

Fisher-Yates Shuffle

Weighted Random Sampling via Prefix Sum

Rejection Sampling

Alias Method (O(1) Weighted Sampling)

4. Advanced Techniques

Skip List Construction and Query

Randomized Pivot in Partition (Introsort fallback)

5. Problem Taxonomy

By Problem Category

Problem Signal → Technique

6. Common Patterns

7. Edge Cases & Pitfalls

8. Implementation Notes

9. Cross-Topic Relations

10. Interview Reference

Must-Know Problems

Clarification Questions to Ask

Common Interview Mistakes

Typical Follow-Up Escalations

Read Next

Explore Unread

Randomized

1. Core Concepts

2. Types / Variants

3. Key Algorithms

Reservoir Sampling (k = 1)

Reservoir Sampling (k > 1)

Randomized Quickselect

Fisher-Yates Shuffle

Weighted Random Sampling via Prefix Sum

Rejection Sampling

Alias Method (O(1) Weighted Sampling)

4. Advanced Techniques

Skip List Construction and Query

Randomized Pivot in Partition (Introsort fallback)

5. Problem Taxonomy

By Problem Category

Problem Signal → Technique

6. Common Patterns

7. Edge Cases & Pitfalls

8. Implementation Notes

9. Cross-Topic Relations

10. Interview Reference

Must-Know Problems

Clarification Questions to Ask

Common Interview Mistakes

Typical Follow-Up Escalations

Read Next

Explore Unread