Bucket Sort

Core Concepts

Bucket — a container (list) that collects elements whose value or key falls within a designated range or equals a designated index. Buckets are indexed so that iterating them in order produces sorted output.

Bucket size — the width of the value range each bucket covers, computed as (max_val - min_val) / n for n buckets over a known range. Choosing size too large collapses all elements into one bucket; too small wastes memory without gain.

Frequency bucket — a bucket indexed by frequency rather than value. Element with frequency f goes into buckets[f]. Enables O(n) retrieval of the k most or least frequent elements because frequency is bounded by n.

Pigeonhole principle — if n items are distributed into fewer than n containers, at least one container holds more than one item. In maximum gap: if n numbers span range R, and we use n−1 buckets of size ceil(R/(n−1)), the maximum gap cannot be within a single bucket — it must cross a bucket boundary (or be between consecutive non-empty buckets).

Range partitioning — mapping a value v to a bucket index via (v - min_val) // bucket_size. This is the fundamental operation; its correctness requires that bucket_size > 0 (at least two distinct values).

Uniform distribution assumption — standard bucket sort achieves O(n) average time only when inputs are uniformly distributed across the range. Adversarial or skewed inputs degrade inner-bucket sorts to O(n²) worst case.

Counting sort — a degenerate bucket sort where bucket size = 1, each bucket holds all occurrences of one value, and no inner sort is needed. Requires integer keys in a small known range.

See counting-sort.md if that file exists; otherwise treat counting sort as a special case here.

Structural Invariants

Invariant	What it guarantees	What breaks if violated
Bucket index order matches value order	Concatenating buckets 0, 1, … k in order produces a sorted sequence	If bucket assignment is non-monotone, the output is unsorted even if each bucket is sorted internally
`min_val` placed in bucket 0, `max_val` placed in bucket `n−1`	Range boundaries are inclusive; no value falls outside `[0, n−1]`	`index = (max_val - min_val) // bucket_size` may equal n; must clamp with `min(index, n-1)`
Bucket size > 0	Division is defined; at least two distinct values exist	Zero bucket size causes ZeroDivisionError; must special-case when all values are equal
Frequency bucket index ≤ len(array)	Frequency of any element is at most n	An index out of range n+1 corrupts the bucket array; allocate `buckets` of size `n+1`

Types / Variants

Variant	What it is	When to use	Key property
Standard bucket sort	Partition floats/ints into equal-width buckets; sort each; concatenate	Input is roughly uniform over a known range; want O(n) average	O(n) average, O(n²) worst; requires known range
Frequency bucket	Index buckets by element frequency; collect top-k by iterating from high to low index	"Top k frequent elements", "sort by frequency" problems	O(n) always; frequency ≤ n bounds the index
Pigeonhole / maximum gap	Use n−1 buckets to prove the max gap straddles bucket boundaries; track only min/max per bucket	"Maximum gap" variant; need the largest difference between consecutive sorted elements	O(n) time and space; avoids full sort
Counting sort (degenerate)	One bucket per distinct integer value; no inner sort needed	Small integer range (e.g., 0–100), values are keys directly	O(n + k) where k = range size; unstable or stable depending on scatter pass

Key Algorithms

Standard Bucket Sort

Distribute n elements into n equal-width buckets over the known range, sort each bucket with a comparison sort, then concatenate.

Key insight: if elements are uniformly distributed, each bucket holds O(1) elements on average, making the inner sort O(1) per bucket and O(n) overall.

Time O(n) average, O(n²) worst; Space O(n).

bucket_size = (max_val - min_val) / n
idx = min(int((v - min_val) / bucket_size), n - 1)

The min(..., n-1) clamp handles max_val exactly hitting a bucket boundary.

Frequency Bucket (Top-K Frequent Elements)

Count element frequencies, then place each element into buckets[freq]. Iterate buckets from index n down to 1, collecting elements until k are gathered.

Key insight: frequency is bounded by n, so the bucket array has fixed size n+1; no comparison sort needed at all.

Time O(n); Space O(n).

freq = Counter(nums)
buckets = [[] for _ in range(len(nums) + 1)]
for val, f in freq.items():
    buckets[f].append(val)

Collect by iterating for b in reversed(buckets): result.extend(b) until len(result) == k.

Maximum Gap (Pigeonhole Bucket)

Given n unsorted numbers, find the maximum difference between consecutive elements in their sorted order — in O(n) time without a full sort.

Key insight: with n−1 buckets of width ceil((max_val - min_val) / (n−1)), the maximum gap must be between elements in different buckets (gaps within a bucket are at most bucket_size). Store only (bucket_min, bucket_max) per bucket; scan adjacent non-empty buckets for the largest bucket_min[i] − bucket_max[i−1].

Time O(n); Space O(n).

size = max(1, (hi - lo) // (n - 1))
idx = (v - lo) // size          # maps v into bucket index

Edge case: n < 2 → return 0. All equal values → return 0 (hi == lo, handled by max(1, ...)).

Sort Characters by Frequency

Count character frequencies, place characters into buckets[freq], then build the result string by iterating from highest-frequency bucket downward.

Key insight: identical to frequency bucket but the "values" are characters; bucket index is the character's count. Output is char * freq repeated for each bucket entry.

Time O(n); Space O(n) where n = string length.

H-Index via Counting Sort

Given citation counts, find the largest h such that the researcher has at least h papers with ≥ h citations.

Key insight: treat it as a counting sort where bucket[c] = number of papers with exactly c citations; clamp citations > n into bucket n. A suffix-sum scan from high to low finds the first index where cumulative count ≥ index.

Time O(n); Space O(n).

buckets = [0] * (n + 1)
for c in citations:
    buckets[min(c, n)] += 1

Problem Taxonomy

By Problem Category

Category	What it asks	Key technique
Top-k by frequency	Return k elements (or characters) with highest occurrence	Frequency bucket; iterate from index n downward
Maximum consecutive gap	Largest gap between sorted-order neighbors	Pigeonhole bucket; store only min/max per bucket
Sort by derived key	Sort by frequency, then lexicographically, etc.	Frequency bucket with tie-breaking on concatenation
H-index / threshold count	Largest h satisfying a count-threshold condition	Counting bucket with suffix sum
Rank / percentile in stream	Given a value, find its rank among all seen values	Counting sort bucket; prefix sum for rank query

Problem Signal → Technique

Signal in problem statement	Likely technique
"top k frequent" / "k most frequent"	Frequency bucket (O(n), avoids heap)
"maximum gap" / "largest difference between consecutive sorted"	Pigeonhole bucket
"sort by frequency"	Frequency bucket; iterate high to low
"h-index" / "at least h papers with ≥ h citations"	Counting bucket + suffix sum
Values in range [0, n] or [0, k] with k small	Counting sort (degenerate bucket)
"O(n) sort" / "linear time sort" with bounded range	Bucket or counting sort
"sort colors" / three-way partition	Dutch national flag (not bucket sort; see two-pointers)

Common Patterns

Pattern	When it applies	Mechanism
Frequency-then-bucket	Need elements ordered by how often they appear	Build `Counter`, allocate `buckets[0..n]`, scatter by freq, collect in reverse
Clamp-to-last-bucket	`max_val` maps to index n but only n−1 or n buckets exist	`idx = min(computed_idx, num_buckets - 1)` after division
Min-max-only bucket	Only the extremes within each bucket matter (max gap)	Store `(lo, hi)` tuple per bucket instead of a list; O(1) per insertion
Suffix-sum over count array	Count occurrences first, then accumulate from the right	`total += counts[i]; if total >= i: return i` (H-index pattern)
Two-pass count-then-place	Stable counting sort output order	First pass fills counts; second pass computes start positions; third pass places elements

Edge Cases & Pitfalls

All values identical: max_val == min_val → bucket_size = 0 → ZeroDivisionError. Fix: check before computing bucket size and return the array as-is (already sorted).
Single element: trivially sorted; bucket size computation may still divide by zero if not guarded. Always check n < 2 early.
max_val lands exactly on a computed boundary: (max_val - min_val) / bucket_size equals an integer, which may be exactly num_buckets, causing an out-of-bounds write. Fix: clamp with min(index, num_buckets - 1).
Frequency bucket size off-by-one: if the input has n elements, a single element can appear n times, so buckets must be of size n + 1 (indices 0 through n). Allocating size n causes an IndexError on the highest-frequency element.
Negative values in range-based bucket: (v - min_val) // bucket_size is always non-negative as long as min_val is the true minimum. If min_val is not updated correctly (e.g., empty input), the formula produces negative indices. Always recompute min_val/max_val from the actual input.
Maximum gap with n < 2: there are no consecutive pairs; return 0 immediately. Attempting to compute bucket size on a single element divides by zero.
Floating-point bucket assignment drift: using / (true division) instead of // (floor division) for bucket index may place a value in the wrong bucket due to float rounding. Prefer integer arithmetic when input values are integers.
Assuming uniform distribution for correctness: bucket sort's O(n) average time requires roughly uniform distribution. For correctness, all elements must still be sorted within each bucket; skipping the inner sort breaks the algorithm, not just the complexity.

Implementation Notes

Python's Counter(iterable) from collections returns a dict of {element: frequency}; .most_common(k) returns the top-k pairs already sorted — use it when a heap-based O(n log k) solution is acceptable, but prefer frequency bucket for strict O(n).
buckets = [[] for _ in range(n + 1)] allocates n+1 separate lists; do not use [[]] * (n+1) — that creates n+1 references to the same list, so appending to one appends to all.
Integer division // in Python floors toward negative infinity, not toward zero. For non-negative inputs this matches C-style truncation, but for mixed-sign ranges always ensure v - min_val ≥ 0 before dividing.
Python has no built-in bucket sort; the pattern is always manually constructed. sorted() uses Timsort (O(n log n)); use it for inner-bucket sorts.
Recursion depth is not a concern for bucket sort (it is iterative), but very large counting arrays (range k >> n) waste memory — prefer a hash-based approach for sparse integer ranges.

Cross-Topic Relations

Related topic	Specific connection
Counting Sort	Degenerate case of bucket sort with bucket size = 1; same O(n + k) complexity; counting sort is a prerequisite concept
Radix Sort	Applies bucket sort digit-by-digit from least to most significant; bucket sort is the inner pass of radix sort
Hash Table	Frequency buckets use a Counter (hash map) to build the frequency index before scattering into the bucket array
Heap / Priority Queue	Alternative for top-k frequent problems: heap gives O(n log k) vs. bucket's O(n); bucket sort is preferred when k is close to n
Two Pointers / Dutch National Flag	Sort Colors problem looks like 3-bucket sort but is solved optimally with three-way partition (in-place); bucket sort would require extra space
Prefix Sum	H-index and rank queries use suffix/prefix sums over the count array produced by bucket sort
Sorting (general)	Bucket sort is a non-comparison sort; its O(n) average breaks the Ω(n log n) comparison-sort lower bound by exploiting value distribution

Interview Reference

Must-Know Problems

Frequency Bucket (Top-K)

Top K Frequent Elements (LeetCode 347) — canonical frequency bucket problem
Sort Characters By Frequency (LeetCode 451) — same pattern with character output
Top K Frequent Words (LeetCode 692) — frequency bucket with lexicographic tie-breaking

Maximum Gap

Maximum Gap (LeetCode 164) — canonical pigeonhole bucket problem; O(n) required

H-Index / Threshold Count

H-Index (LeetCode 274) — counting bucket with suffix sum
H-Index II (LeetCode 275) — sorted input variant (binary search preferred, but bucket solution instructive)

Counting Sort Variants

Sort Colors (LeetCode 75) — three-value sort; Dutch national flag is optimal but counting sort works
Relative Sort Array (LeetCode 1122) — counting sort with custom order for elements outside the order list

Clarification Questions to Ask

"Are values integers, or can they be floats?" (float range partitioning requires care with division)
"What is the value range — is it bounded or arbitrarily large?" (unbounded range makes counting sort infeasible; use comparison-based instead)
"Are there negative values?" (bucket index formula assumes v − min_val ≥ 0; confirm min)
"Is stability required?" (standard bucket sort is stable if inner sort is stable and elements are placed left-to-right; confirm if output order among equal-frequency elements matters)
"What defines 'frequent'?" (for top-k: is k guaranteed < n? can there be ties at the k-th frequency?)

Common Interview Mistakes

Allocating [[]] * n instead of [[] for _ in range(n)] — all buckets share the same list object; every append corrupts all buckets simultaneously.
Forgetting to handle max_val == min_val before computing bucket size — causes ZeroDivisionError on uniform inputs.
Off-by-one on frequency bucket size: allocating n slots instead of n + 1, missing the case where one element appears n times.
Using sorted() on the full array instead of building frequency buckets — gets the right answer but is O(n log n), failing an explicit O(n) requirement.
In maximum gap: storing all elements per bucket (O(n) per bucket in worst case) instead of just (min, max) — conceptually correct but misses the O(n) space insight.
Applying bucket sort when the range is much larger than n (e.g., values up to 10^9 with n = 100) — the bucket array itself would require 10^9 slots; use a comparison sort instead.

Typical Follow-Up Escalations

"Now solve it in O(1) extra space" → Dutch national flag / counting in-place for sort colors; for top-k this is generally not possible without relaxing constraints.
"What if k can equal n?" → frequency bucket handles this naturally; heap-based solution degrades to O(n log n).
"What if there are ties at the k-th frequency?" → clarify whether any valid subset is acceptable or all tied elements must be included; adjust stopping condition in bucket iteration.
"Can you do maximum gap without extra space?" → not in O(n) time while preserving input; the pigeonhole argument requires storing per-bucket extremes.
"What if the input is a stream and you must answer online?" → bucket sort is offline (requires knowing min/max first); switch to a sorted structure or segment tree for online range queries.

Explore Unread

Great job! You've read all available articles

Bucket Sort

Core Concepts

Counting sort — a degenerate bucket sort where bucket size = 1, each bucket holds all occurrences of one value, and no inner sort is needed. Requires integer keys in a small known range.

See counting-sort.md if that file exists; otherwise treat counting sort as a special case here.

Structural Invariants

Invariant	What it guarantees	What breaks if violated
Bucket index order matches value order	Concatenating buckets 0, 1, … k in order produces a sorted sequence	If bucket assignment is non-monotone, the output is unsorted even if each bucket is sorted internally
`min_val` placed in bucket 0, `max_val` placed in bucket `n−1`	Range boundaries are inclusive; no value falls outside `[0, n−1]`	`index = (max_val - min_val) // bucket_size` may equal n; must clamp with `min(index, n-1)`
Bucket size > 0	Division is defined; at least two distinct values exist	Zero bucket size causes ZeroDivisionError; must special-case when all values are equal
Frequency bucket index ≤ len(array)	Frequency of any element is at most n	An index out of range n+1 corrupts the bucket array; allocate `buckets` of size `n+1`

Types / Variants

Variant	What it is	When to use	Key property
Standard bucket sort	Partition floats/ints into equal-width buckets; sort each; concatenate	Input is roughly uniform over a known range; want O(n) average	O(n) average, O(n²) worst; requires known range
Frequency bucket	Index buckets by element frequency; collect top-k by iterating from high to low index	"Top k frequent elements", "sort by frequency" problems	O(n) always; frequency ≤ n bounds the index
Pigeonhole / maximum gap	Use n−1 buckets to prove the max gap straddles bucket boundaries; track only min/max per bucket	"Maximum gap" variant; need the largest difference between consecutive sorted elements	O(n) time and space; avoids full sort
Counting sort (degenerate)	One bucket per distinct integer value; no inner sort needed	Small integer range (e.g., 0–100), values are keys directly	O(n + k) where k = range size; unstable or stable depending on scatter pass

Key Algorithms

Standard Bucket Sort

Distribute n elements into n equal-width buckets over the known range, sort each bucket with a comparison sort, then concatenate.

Key insight: if elements are uniformly distributed, each bucket holds O(1) elements on average, making the inner sort O(1) per bucket and O(n) overall.

Time O(n) average, O(n²) worst; Space O(n).

bucket_size = (max_val - min_val) / n
idx = min(int((v - min_val) / bucket_size), n - 1)

The min(..., n-1) clamp handles max_val exactly hitting a bucket boundary.

Frequency Bucket (Top-K Frequent Elements)

Count element frequencies, then place each element into buckets[freq]. Iterate buckets from index n down to 1, collecting elements until k are gathered.

Key insight: frequency is bounded by n, so the bucket array has fixed size n+1; no comparison sort needed at all.

Time O(n); Space O(n).

freq = Counter(nums)
buckets = [[] for _ in range(len(nums) + 1)]
for val, f in freq.items():
    buckets[f].append(val)

Collect by iterating for b in reversed(buckets): result.extend(b) until len(result) == k.

Maximum Gap (Pigeonhole Bucket)

Given n unsorted numbers, find the maximum difference between consecutive elements in their sorted order — in O(n) time without a full sort.

Time O(n); Space O(n).

size = max(1, (hi - lo) // (n - 1))
idx = (v - lo) // size          # maps v into bucket index

Edge case: n < 2 → return 0. All equal values → return 0 (hi == lo, handled by max(1, ...)).

Sort Characters by Frequency

Count character frequencies, place characters into buckets[freq], then build the result string by iterating from highest-frequency bucket downward.

Key insight: identical to frequency bucket but the "values" are characters; bucket index is the character's count. Output is char * freq repeated for each bucket entry.

Time O(n); Space O(n) where n = string length.

H-Index via Counting Sort

Given citation counts, find the largest h such that the researcher has at least h papers with ≥ h citations.

Time O(n); Space O(n).

buckets = [0] * (n + 1)
for c in citations:
    buckets[min(c, n)] += 1

Problem Taxonomy

By Problem Category

Category	What it asks	Key technique
Top-k by frequency	Return k elements (or characters) with highest occurrence	Frequency bucket; iterate from index n downward
Maximum consecutive gap	Largest gap between sorted-order neighbors	Pigeonhole bucket; store only min/max per bucket
Sort by derived key	Sort by frequency, then lexicographically, etc.	Frequency bucket with tie-breaking on concatenation
H-index / threshold count	Largest h satisfying a count-threshold condition	Counting bucket with suffix sum
Rank / percentile in stream	Given a value, find its rank among all seen values	Counting sort bucket; prefix sum for rank query

Problem Signal → Technique

Signal in problem statement	Likely technique
"top k frequent" / "k most frequent"	Frequency bucket (O(n), avoids heap)
"maximum gap" / "largest difference between consecutive sorted"	Pigeonhole bucket
"sort by frequency"	Frequency bucket; iterate high to low
"h-index" / "at least h papers with ≥ h citations"	Counting bucket + suffix sum
Values in range [0, n] or [0, k] with k small	Counting sort (degenerate bucket)
"O(n) sort" / "linear time sort" with bounded range	Bucket or counting sort
"sort colors" / three-way partition	Dutch national flag (not bucket sort; see two-pointers)

Common Patterns

Pattern	When it applies	Mechanism
Frequency-then-bucket	Need elements ordered by how often they appear	Build `Counter`, allocate `buckets[0..n]`, scatter by freq, collect in reverse
Clamp-to-last-bucket	`max_val` maps to index n but only n−1 or n buckets exist	`idx = min(computed_idx, num_buckets - 1)` after division
Min-max-only bucket	Only the extremes within each bucket matter (max gap)	Store `(lo, hi)` tuple per bucket instead of a list; O(1) per insertion
Suffix-sum over count array	Count occurrences first, then accumulate from the right	`total += counts[i]; if total >= i: return i` (H-index pattern)
Two-pass count-then-place	Stable counting sort output order	First pass fills counts; second pass computes start positions; third pass places elements

Edge Cases & Pitfalls

All values identical: max_val == min_val → bucket_size = 0 → ZeroDivisionError. Fix: check before computing bucket size and return the array as-is (already sorted).
Single element: trivially sorted; bucket size computation may still divide by zero if not guarded. Always check n < 2 early.
max_val lands exactly on a computed boundary: (max_val - min_val) / bucket_size equals an integer, which may be exactly num_buckets, causing an out-of-bounds write. Fix: clamp with min(index, num_buckets - 1).
Frequency bucket size off-by-one: if the input has n elements, a single element can appear n times, so buckets must be of size n + 1 (indices 0 through n). Allocating size n causes an IndexError on the highest-frequency element.
Negative values in range-based bucket: (v - min_val) // bucket_size is always non-negative as long as min_val is the true minimum. If min_val is not updated correctly (e.g., empty input), the formula produces negative indices. Always recompute min_val/max_val from the actual input.
Maximum gap with n < 2: there are no consecutive pairs; return 0 immediately. Attempting to compute bucket size on a single element divides by zero.
Floating-point bucket assignment drift: using / (true division) instead of // (floor division) for bucket index may place a value in the wrong bucket due to float rounding. Prefer integer arithmetic when input values are integers.
Assuming uniform distribution for correctness: bucket sort's O(n) average time requires roughly uniform distribution. For correctness, all elements must still be sorted within each bucket; skipping the inner sort breaks the algorithm, not just the complexity.

Implementation Notes

Python's Counter(iterable) from collections returns a dict of {element: frequency}; .most_common(k) returns the top-k pairs already sorted — use it when a heap-based O(n log k) solution is acceptable, but prefer frequency bucket for strict O(n).
buckets = [[] for _ in range(n + 1)] allocates n+1 separate lists; do not use [[]] * (n+1) — that creates n+1 references to the same list, so appending to one appends to all.
Integer division // in Python floors toward negative infinity, not toward zero. For non-negative inputs this matches C-style truncation, but for mixed-sign ranges always ensure v - min_val ≥ 0 before dividing.
Python has no built-in bucket sort; the pattern is always manually constructed. sorted() uses Timsort (O(n log n)); use it for inner-bucket sorts.
Recursion depth is not a concern for bucket sort (it is iterative), but very large counting arrays (range k >> n) waste memory — prefer a hash-based approach for sparse integer ranges.

Cross-Topic Relations

Related topic	Specific connection
Counting Sort	Degenerate case of bucket sort with bucket size = 1; same O(n + k) complexity; counting sort is a prerequisite concept
Radix Sort	Applies bucket sort digit-by-digit from least to most significant; bucket sort is the inner pass of radix sort
Hash Table	Frequency buckets use a Counter (hash map) to build the frequency index before scattering into the bucket array
Heap / Priority Queue	Alternative for top-k frequent problems: heap gives O(n log k) vs. bucket's O(n); bucket sort is preferred when k is close to n
Two Pointers / Dutch National Flag	Sort Colors problem looks like 3-bucket sort but is solved optimally with three-way partition (in-place); bucket sort would require extra space
Prefix Sum	H-index and rank queries use suffix/prefix sums over the count array produced by bucket sort
Sorting (general)	Bucket sort is a non-comparison sort; its O(n) average breaks the Ω(n log n) comparison-sort lower bound by exploiting value distribution

Interview Reference

Must-Know Problems

Frequency Bucket (Top-K)

Top K Frequent Elements (LeetCode 347) — canonical frequency bucket problem
Sort Characters By Frequency (LeetCode 451) — same pattern with character output
Top K Frequent Words (LeetCode 692) — frequency bucket with lexicographic tie-breaking

Maximum Gap

Maximum Gap (LeetCode 164) — canonical pigeonhole bucket problem; O(n) required

H-Index / Threshold Count

H-Index (LeetCode 274) — counting bucket with suffix sum
H-Index II (LeetCode 275) — sorted input variant (binary search preferred, but bucket solution instructive)

Counting Sort Variants

Sort Colors (LeetCode 75) — three-value sort; Dutch national flag is optimal but counting sort works
Relative Sort Array (LeetCode 1122) — counting sort with custom order for elements outside the order list

Clarification Questions to Ask

"Are values integers, or can they be floats?" (float range partitioning requires care with division)
"What is the value range — is it bounded or arbitrarily large?" (unbounded range makes counting sort infeasible; use comparison-based instead)
"Are there negative values?" (bucket index formula assumes v − min_val ≥ 0; confirm min)
"Is stability required?" (standard bucket sort is stable if inner sort is stable and elements are placed left-to-right; confirm if output order among equal-frequency elements matters)
"What defines 'frequent'?" (for top-k: is k guaranteed < n? can there be ties at the k-th frequency?)

Common Interview Mistakes

Allocating [[]] * n instead of [[] for _ in range(n)] — all buckets share the same list object; every append corrupts all buckets simultaneously.
Forgetting to handle max_val == min_val before computing bucket size — causes ZeroDivisionError on uniform inputs.
Off-by-one on frequency bucket size: allocating n slots instead of n + 1, missing the case where one element appears n times.
Using sorted() on the full array instead of building frequency buckets — gets the right answer but is O(n log n), failing an explicit O(n) requirement.
In maximum gap: storing all elements per bucket (O(n) per bucket in worst case) instead of just (min, max) — conceptually correct but misses the O(n) space insight.
Applying bucket sort when the range is much larger than n (e.g., values up to 10^9 with n = 100) — the bucket array itself would require 10^9 slots; use a comparison sort instead.

Typical Follow-Up Escalations

"Now solve it in O(1) extra space" → Dutch national flag / counting in-place for sort colors; for top-k this is generally not possible without relaxing constraints.
"What if k can equal n?" → frequency bucket handles this naturally; heap-based solution degrades to O(n log n).
"What if there are ties at the k-th frequency?" → clarify whether any valid subset is acceptable or all tied elements must be included; adjust stopping condition in bucket iteration.
"Can you do maximum gap without extra space?" → not in O(n) time while preserving input; the pigeonhole argument requires storing per-bucket extremes.
"What if the input is a stream and you must answer online?" → bucket sort is offline (requires knowing min/max first); switch to a sorted structure or segment tree for online range queries.

Explore Unread

Great job! You've read all available articles

Bucket Sort

Core Concepts

Structural Invariants

Types / Variants

Key Algorithms

Standard Bucket Sort

Frequency Bucket (Top-K Frequent Elements)

Maximum Gap (Pigeonhole Bucket)

Sort Characters by Frequency

H-Index via Counting Sort

Problem Taxonomy

By Problem Category

Problem Signal → Technique

Common Patterns

Edge Cases & Pitfalls

Implementation Notes

Cross-Topic Relations

Interview Reference

Must-Know Problems

Clarification Questions to Ask

Common Interview Mistakes

Typical Follow-Up Escalations

Read Next

Explore Unread

Bucket Sort

Core Concepts

Structural Invariants

Types / Variants

Key Algorithms

Standard Bucket Sort

Frequency Bucket (Top-K Frequent Elements)

Maximum Gap (Pigeonhole Bucket)

Sort Characters by Frequency

H-Index via Counting Sort

Problem Taxonomy

By Problem Category

Problem Signal → Technique

Common Patterns

Edge Cases & Pitfalls

Implementation Notes

Cross-Topic Relations

Interview Reference

Must-Know Problems

Clarification Questions to Ask

Common Interview Mistakes

Typical Follow-Up Escalations

Read Next

Explore Unread