Probability and Statistics

Type: Math / Theory LeetCode tag size: < 50 problems — core concepts, key algorithms, applicability.

Core Concepts

Sample space — the set of all possible outcomes of a random experiment.

Event — a subset of the sample space. Probability of an event is P(E) ∈ [0, 1].

Uniform distribution — every outcome in the sample space is equally likely. For n outcomes, each has probability 1/n.

Conditional probability — P(A | B) = P(A ∩ B) / P(B). The probability of A given that B has already occurred.

Independence — events A and B are independent iff P(A ∩ B) = P(A) · P(B). Knowing B gives no information about A.

Expected value — E[X] = Σ x · P(X = x) over all outcomes. The probability-weighted average of all possible values.

Linearity of expectation — E[X + Y] = E[X] + E[Y] always, regardless of whether X and Y are independent. Used to decompose complex expectations into sums of simple ones.

Geometric probability — probability over a continuous space (area, length, volume) rather than discrete outcomes. P(event) = measure(event) / measure(total space).

Weighted probability — outcome i is selected with probability w_i / Σw. The "weight" replaces uniform likelihood.

Types / Variants

Form	What it is	When to use
Discrete uniform	Each of n values equally likely	Shuffling, random index selection
Weighted discrete	Each outcome has an explicit weight	Random pick with non-uniform distribution
Continuous uniform	Random point in a geometric region	Random point in circle, rectangle
Rejection sampling	Generate target distribution by filtering samples from an easy distribution	When direct sampling is impossible but you can evaluate whether a sample is valid

Key Algorithms

Fisher-Yates Shuffle

Produces a uniformly random permutation in O(n) time, O(1) extra space.

Key insight: at step i (from n−1 down to 0), swap a[i] with a uniformly random element from a[0..i]. This ensures every permutation is equally likely by a simple counting argument (each step chooses one of i+1 equally likely positions).

j = random.randint(0, i)
a[i], a[j] = a[j], a[i]

Time O(n), space O(1) in-place. Use when: uniform shuffle of a fixed array.

Reservoir Sampling

Select k items uniformly at random from a stream of unknown or very large length n, using O(k) space.

Key insight: when processing the i-th element (1-indexed), include it in the reservoir with probability k/i. If selected, replace a random element in the existing reservoir. After all n elements, each element has exactly k/n probability of being in the reservoir.

For k = 1: keep the current element with probability 1/i, replacing the current pick.

if random.randint(1, i) == 1:
    result = val

Time O(n), space O(k). Use when: stream has unknown length, or the dataset is too large to load in memory.

Weighted Random Sampling (prefix sum + binary search)

Select index i with probability proportional to weight w_i. Build a prefix-sum array, then binary-search for a uniform random value in [0, total_weight).

target = random.uniform(0, prefix[-1])
bisect.bisect_left(prefix, target)

Time O(n) build, O(log n) per query. Use when: weights are static and queries are repeated.

Rejection Sampling

Generate samples from distribution B using a sampler for distribution A, by accepting sample x with probability P_B(x) / (c · P_A(x)) for some constant c.

For integer transformation (e.g., Rand10 from Rand7): generate a combined value covering a range that is an integer multiple of the target range; reject values outside that multiple.

Key insight: accepted samples are unbiased because the acceptance criterion is derived from the target probability. Expected iterations = c, which is a constant.

Use when: the target distribution is hard to sample directly but easy to evaluate; simpler than transforming distributions algebraically.

Monte Carlo Estimation

Estimate a quantity by repeated random sampling. For geometric probability: uniformly sample points in a bounding region, count how many fall inside the target region, estimate P(inside) = count / total.

Use when: an exact formula is unavailable or complex, and an approximate answer is acceptable. Accuracy improves as O(1/√n).

Problem Taxonomy

By Problem Category

Problem type	What it asks	Key technique
Uniform shuffle	Rearrange array so all permutations equally likely	Fisher-Yates
Random pick from stream	Return random element from stream seen so far	Reservoir sampling
Weighted random pick	Return index proportional to given weights	Prefix sum + binary search
Random point in region	Sample uniformly from a geometric region	Geometric probability; rejection sampling for circles
Distribution transformation	Build one RNG from another	Rejection sampling
Random pick with exclusions	Sample uniformly, skipping blacklisted values	Whitelist mapping or rejection

Problem Signal → Technique

Signal in problem	Likely technique
"random pick", "equal probability" from fixed array	Fisher-Yates or direct `random.randint`
"stream", "unknown length", "linked list random"	Reservoir sampling
"weight", "probability proportional to"	Prefix sum + `bisect`
"random point in circle / rectangle"	Geometric probability; rejection for circles
"implement RandX using RandY"	Rejection sampling with combined range
"blacklist", "avoid certain indices"	Map valid indices to compact range

Common Patterns

Pattern	When it applies	Mechanism
Prefix sum for weighted sampling	Weights are static; multiple queries	Build prefix sums once; `bisect_left(prefix, random * total)`
Two-pass → one-pass with reservoir	Can't know n in advance	Maintain reservoir of size k; update with probability k/i
Bounding box → rejection	Sampling inside a non-rectangular region	Sample in enclosing rectangle; reject if outside target
Combine independent RNGs	Construct range that is a multiple of target	`(r1 - 1) * base + r2` gives uniform range [1, base²]
Area ratio for geometric probability	Probability of point landing in sub-region	P = area(target) / area(total)

Edge Cases & Pitfalls

Off-by-one in Fisher-Yates: the random index must be drawn from [0, i] inclusive, not [0, i). Excluding i itself biases the result by preventing elements from staying in place.
Reservoir sampling k > 1: when replacing an element in the reservoir, the replacement index must itself be chosen uniformly from [0, k). Replacing always at index 0 is biased.
Weighted sampling with zero weights: zero-weight entries must be excluded or their prefix-sum contribution is 0, making them unreachable — this is correct behavior, but bisect on a prefix sum with no increment still works as long as the random target is in (prefix[i-1], prefix[i]] logic is consistent.
Random point in circle: naively picking radius r = sqrt(random()) * R is needed because area scales as r². Picking r = random() * R concentrates points near the center.
Rejection sampling termination: if the combined range is not divisible by the target, ensure the rejection region is correctly identified. An off-by-one in the rejection boundary makes the distribution non-uniform.
Integer overflow in combined RNG: (r1 - 1) * base + r2 can exceed int range for large bases. Use 64-bit integers or reduce early.
random.randint(a, b) is inclusive on both ends in Python, unlike most languages where the upper bound is exclusive. Mixing conventions causes subtle biases.

Implementation Notes

Python random.randint(a, b) returns a value in [a, b] inclusive; random.random() returns [0.0, 1.0).
bisect.bisect_left(prefix, target) finds the leftmost index where prefix[i] >= target. For weighted sampling, use random.random() * prefix[-1] as the target.
Python random.shuffle(a) is Fisher-Yates in-place — use it directly unless the problem forbids library calls.
For geometric sampling, math.sqrt and math.atan2 introduce floating-point error; comparing against exact integer squares avoids it where possible.
Reservoir sampling requires a stateful object (current count + current pick); a closure or class is cleaner than global state.

Cross-Topic Relations

Related topic	Connection
Math / Number Theory	Modular arithmetic for combined RNG range calculations
Binary Search	Prefix sum + bisect is the standard weighted-sampling query
Prefix Sum	Foundation for O(log n) weighted random pick
Sorting	Needed if building a weighted-pick structure from unsorted weights

Interview Reference

Must-Know Problems

Shuffling / Uniform Sampling

Shuffle an Array (384) — implement Fisher-Yates; know why naive swap is wrong
Linked List Random Node (382) — reservoir sampling on a linked list

Weighted Sampling

Random Pick with Weight (528) — prefix sum + binary search
Random Pick Index (398) — reservoir sampling variant; multiple indices with same value

Distribution Transformation

Implement Rand10() Using Rand7() (470) — rejection sampling; derive the combined range

Geometric / Continuous

Random Point in Non-overlapping Rectangles (497) — weight rectangles by area; sample within chosen rectangle
Generate Random Point in a Circle (478) — rejection sampling or polar-coordinate trick

Sampling with Exclusions

Random Pick with Blacklist (710) — remap valid indices into compact range

Additional LeetCode Coverage

Calculate Maximum Information Gain (LC 10001)

Clarification Questions to Ask

Is the input static (build once, query many) or streaming (unknown length)?
Are weights integers or floats? Can they be zero?
Is the result required to be exactly uniform, or is an approximation acceptable?
For geometric problems: is the answer a float or should it be a grid point?
Does "random" mean a specific seed is required (reproducible) or truly pseudorandom?

Common Interview Mistakes

Implementing Fisher-Yates with the wrong range (randint(0, n-1) for all i instead of randint(0, i)) — produces a biased permutation.
Forgetting that reservoir sampling only produces unbiased results if the replacement index inside the reservoir is also random (for k > 1).
Using r = random() * R for a point in a circle instead of r = sqrt(random()) * R — clusters samples near the center.
Building a rejection sampler with an incorrect acceptance range — off-by-one means some outcomes appear twice as often.
Returning random.random() * total as an index directly instead of using bisect for weighted pick — fractional index is wrong.

Typical Follow-Up Escalations

"Now do it for k samples, not 1" → extend reservoir to size k; replace random index in [0, k) on acceptance.
"Now the weights change dynamically" → prefix sum + binary search no longer works; use a Fenwick tree for O(log n) update and query.
"What if the stream is infinite?" → reservoir sampling still works; the probability of any element being selected at step i is k/i and converges.
"Can you do the circle sampling without rejection?" → polar coordinates: r = R * sqrt(uniform()), θ = 2π * uniform(), convert to Cartesian.

Explore Unread

Great job! You've read all available articles