Radix Sort
Topic type: Technique-Based LeetCode tag: Sorting / Radix Sort
Core Concepts
Radix — the base b used to decompose a key into digits; common choices are 2 (bit-level), 10 (decimal), 256 (byte-level), or the alphabet size for strings.
Non-comparison sort — radix sort sidesteps the Ω(n log n) comparison lower bound by treating keys as structured sequences of digits and sorting digit-by-digit using a linear stable subroutine (counting sort).
Counting sort (subroutine) — for each digit position, tally frequencies in an array of size b, prefix-sum to get positions, then scatter elements into output in reverse order to preserve stability. Time O(n + b), space O(n + b).
Stability requirement — each digit-pass must be stable so that the ordering established by less-significant digits is preserved when more-significant digits are equal. Instability in any pass corrupts the final sort.
d passes — if keys have at most d digits in base b, the total time is O(d(n + b)) and space is O(n + b). Choosing b = n gives O(d·n); choosing b = 2 gives O(32n) for 32-bit integers.
LSD (Least-Significant Digit first) — sort from the rightmost digit to the leftmost. Natural for fixed-width keys (integers, fixed-length strings). Result is fully sorted after d passes.
MSD (Most-Significant Digit first) — sort on the leading digit, then recursively sort each bucket on the next digit. Natural for variable-length strings, enables early termination and in-place variants.
Digit extraction — isolating digit i of key x in base b: (x // b**i) % b. For base-2: (x >> i) & 1. For byte-level (b=256): (x >> (8*i)) & 0xFF.
Types / Variants
| Variant | What it is | When to prefer | Key property |
|---|---|---|---|
| LSD radix sort | Sort passes from right (LSB) to left (MSB) | Fixed-width keys; integers; simple implementation | Iterative; d stable counting-sort passes |
| MSD radix sort | Recursive sort from left (MSB) to right (LSB) | Variable-length strings; lexicographic order; partial sort | Enables early exit; naturally produces lexicographic groups |
| Binary MSD (Bitsort) | MSD with b=2, partitions by bits high-to-low | In-place O(1) extra space; maximum XOR queries | Each pass is a stable partition on one bit |
| 3-way radix quicksort | MSD variant that handles equal-digit buckets in-place | Strings with long common prefixes; avoids O(b) overhead per node | Combines MSD with 3-way partitioning; O(n log n) average |
| Counting sort | Single-digit-pass stable linear sort | When key range is small (≤ n); used as subroutine | O(n + k) time; not a full sort on its own for large keys |
Key Algorithms
LSD Radix Sort
Sorts n integers with values in [0, b^d) by making d left-sweeping stable passes, one per digit position.
Key insight: stability after each pass means "equal on digit i → preserve order from passes 0…i−1", so after all d passes the sequence is fully sorted.
def counting_pass(arr, exp, b=10):
cnt = [0] * b
for x in arr: cnt[(x // exp) % b] += 1
for i in range(1, b): cnt[i] += cnt[i-1]
out = [0] * len(arr)
for x in reversed(arr): cnt[(x // exp) % b] -= 1; out[cnt[(x // exp) % b]] = x
return out
Time O(d(n + b)), space O(n + b). Optimal when d is small and b ≈ n.
MSD Radix Sort (String Lexicographic)
Recursively partitions strings by character at position pos, then recurses within each character-group.
Key insight: grouping by the leading character first means any two strings that differ at position 0 are correctly ordered without ever inspecting later characters.
Time O(n · L) worst case for strings of length L; O(n log n) average with short common prefixes. Space O(n + b + L) for recursion stack depth L.
Counting Sort (Standalone / Subroutine)
Sorts n elements with keys in [0, k) in O(n + k) time and O(n + k) space. Use when k is small relative to n (k ≤ O(n)).
cnt = [0] * (k + 1)
for x in arr: cnt[x] += 1
for i in range(1, k+1): cnt[i] += cnt[i-1]
out = [0]*len(arr)
for x in reversed(arr): cnt[x] -= 1; out[cnt[x]] = x
The reversed iteration is what makes it stable — elements with equal keys appear in their original relative order.
Suffix Array via Radix Sort (DC3 / Skew Algorithm)
Constructs the suffix array of a string in O(n) time using radix sort on triples of characters.
Key insight: radix-sort all suffixes on (char[i], char[i+1], char[i+2]) triples at positions i ≡ 1, 2 mod 3, assign ranks, recurse on the reduced string of ranks, then merge with positions i ≡ 0 mod 3 in a final O(n) pass.
Time O(n), space O(n). Preferred over O(n log² n) prefix-doubling when n > 10^5.
See suffix-array.md for full coverage of suffix array construction and applications.
Prefix-Doubling Suffix Array (Radix-accelerated)
Builds suffix array by iteratively sorting on doubled-length prefixes. Each doubling round uses two counting-sort passes (one per half of the 2k-length key).
Key insight: after round r, suffixes are ranked by their first 2^r characters; two radix-sort passes per round give O(n log n) total.
Time O(n log n), space O(n). Simpler to implement than DC3; preferred for n ≤ 10^5.
Maximum XOR via Binary MSD Trie
For each number, insert bits MSB-to-LSB into a binary trie. To maximize XOR with a query, greedily take the opposite bit at each level.
Key insight: this is MSD binary radix sort applied incrementally — the trie is the MSD tree of the bit-sorted set, and the greedy walk is a trie query.
Time O(n · W) build, O(W) per query, space O(n · W), where W = bit-width (32 or 64).
See trie.md for full trie coverage including XOR trie patterns.
Advanced Techniques
Radix Sort on (key, index) Pairs for Stable Rank Assignment
Problem class: Problems requiring stable secondary-sort ordering (suffix arrays, frequency sorts, coordinate compression with tie-breaking).
Mechanism: Encode the secondary sort key into the low bits and the primary key into the high bits of a single integer, then apply one LSD radix sort pass. Alternatively, do two passes: first sort by secondary key, then stable-sort by primary key.
When to use over alternatives: Use when you need O(n) total sort (not O(n log n)) AND the secondary key is bounded. Python's sorted(..., key=..., stable=True) achieves the same but in O(n log n); the radix version wins only when the constant matters.
Complexity: O(n + b) per pass, O(n) total with two passes.
Bucket-Width Pigeonhole (Maximum Gap)
Problem class: Finding the maximum gap between consecutive elements in a sorted sequence without actually sorting — LeetCode 164 (Maximum Gap).
Mechanism: With n elements in [min, max], create n−1 buckets each of width ⌈(max−min)/(n−1)⌉. By pigeonhole, the maximum gap must span at least one empty bucket, so only track each bucket's min and max. Scan adjacent buckets for the maximum cross-bucket gap.
When to use over alternatives: Use when the problem guarantees O(n) time is required and you want to avoid full radix sort. Simpler than LSD radix sort for this specific problem. Does NOT sort the array; only finds the max gap.
Complexity: O(n) time, O(n) space.
Negative Numbers and Floats Encoding
Problem class: Applying radix/counting sort to inputs that include negative integers or IEEE floats.
Mechanism for negatives: Shift all values by subtracting the minimum, sort, then shift back. Alternatively, sort positive and negative halves separately, reverse the negative half, and concatenate.
Mechanism for floats: Reinterpret the 32-bit/64-bit float bit pattern as an unsigned integer. IEEE 754 ordering of non-negative floats matches integer ordering of their bit patterns. For negatives: flip all bits (not just the sign bit) to restore ordering.
When to use over alternatives: Encoding trick is only worthwhile if you need O(n) sort. For most LeetCode problems, Python's sorted() suffices.
Problem Taxonomy
By Problem Category
| Category | What it asks | Key technique |
|---|---|---|
| Sort integers in O(n) | Sort array with bounded values | LSD radix sort or counting sort |
| Maximum gap | Largest difference between consecutive sorted elements | Bucket pigeonhole (O(n)) or radix sort |
| Lexicographic string sort | Sort strings by dictionary order | MSD radix sort or Python sorted() |
| Suffix array construction | All suffixes sorted lexicographically | Prefix-doubling + radix sort, or DC3 |
| Maximum XOR pair | Find pair (a, b) maximizing a XOR b | Binary MSD trie (see trie.md) |
| Frequency sort | Sort by frequency, then lexicographically | Two-pass counting + stable radix pass |
| Rank / coordinate compression | Assign dense ranks preserving order | Counting sort for bounded keys |
| Sort by multiple keys | Primary + secondary sort in O(n) | Two-pass LSD (secondary first, then primary) |
Problem Signal → Technique
| Signal in problem statement | Likely technique |
|---|---|
| "O(n) time" or "linear time sort" | Counting sort or LSD radix sort |
| "maximum gap between consecutive elements" | Bucket pigeonhole (Maximum Gap pattern) |
| "sort strings lexicographically" at scale | MSD radix sort |
| "suffix array" or "all suffixes sorted" | Prefix-doubling with radix sort |
| "maximum XOR" of subset or pair | Binary MSD trie |
| "sort by frequency" | Counting sort on frequencies + stable pass |
| "integers in range [0, k]" | Counting sort (k ≤ O(n)) |
| "non-negative integers, range ≤ 10^6" | Counting sort or LSD radix sort (b=10, d=7) |
| "stable sort required" | Any LSD-based approach with reversed scatter |
Common Patterns
| Pattern | When it applies | Mechanism |
|---|---|---|
| Reverse-scatter stable pass | Any LSD counting sort pass | Iterate reversed(arr) during scatter to preserve input order for equal keys |
| Prefix-sum for position mapping | Every counting sort pass | cnt[i] += cnt[i-1] converts frequency array to start/end positions |
| Two-pass LSD for compound keys | Sorting by (primary, secondary) in O(n) | Pass 1: stable sort by secondary; Pass 2: stable sort by primary |
| Bucketize then scan adjacent | Maximum gap, nearest-neighbor in 1D | Assign each element to a bucket; only adjacent buckets can hold the answer |
| Rank renaming between rounds | Suffix array prefix-doubling | After each radix sort round, reassign integer ranks to enable next-round sort on rank pairs |
| Bit-level partition (bitsort) | In-place MSD with b=2 | Partition array in-place by the current bit (high to low); no extra array needed |
Edge Cases & Pitfalls
-
Instability in the counting-sort pass — iterating
arrforward during scatter instead ofreversed(arr)breaks stability. The consequence is that LSD radix sort produces wrong results even though individual passes look correct. Fix: always scatter in reverse input order. -
Off-by-one in digit count — computing d as
len(str(max(arr)))fails forarr = [0](returns 1, correct) but fails if all elements are 0 and you compute digits from a non-zero max. Guard withmax(arr) or 1. -
Negative values without encoding — digit extraction
(x // exp) % bproduces negative remainders in Python for negativexif using//(Python floors toward −∞, so it actually works for Python, but not in C/Java). In other languages, shift all values to non-negative before sorting. -
Variable-length strings in LSD — LSD requires all keys to have the same length. Pad shorter strings with a sentinel character (e.g., null byte, value 0) that sorts before all real characters, OR use MSD which handles variable lengths natively.
-
Base choice too large — using b = 10^9 for a single-pass radix sort allocates a 10^9-entry count array, blowing memory. Keep b ≤ n or use b = 256/1024 for byte-level sorts.
-
Counting sort with mutable key during scatter — if you modify elements in the array while scattering (in-place variants), you corrupt the count array's position tracking. Always write to a separate output array.
-
Suffix array rank ties — in prefix-doubling, if two suffixes have the same rank pair after round r, they must receive the same rank (not distinct arbitrary ranks). Assigning distinct arbitrary ranks produces a wrong suffix array silently.
-
Maximum Gap with n < 2 — bucket computation divides by (n−1); guard against n=1 (gap is undefined) and n=0.
-
All elements identical — radix sort handles this correctly, but counting sort's prefix-sum produces cnt[x] = n and everything scatters to the last n positions. Confirm output array size is exactly n.
Implementation Notes
- Python's built-in
sorted()uses Timsort (O(n log n), stable); reach for radix sort only when a problem explicitly requires O(n) or when implementing a suffix array. (x // b**i) % buses Python's arbitrary-precision integers safely; for large b or i, precomputeexp = b**ioutside the inner loop.- Python list comprehensions for the scatter loop are faster than explicit for-loops at n > 10^4 due to bytecode overhead.
- The
reversed()call in the scatter pass has zero extra memory cost in Python (returns an iterator, not a copy). - For suffix arrays, use the prefix-doubling approach for n ≤ 10^5 (simpler code); switch to DC3/SA-IS for n > 10^5.
collections.Counterfollowed bysorted(counter.items())is effectively counting sort + output rebuild in two lines and is idiomatic Python for frequency-sort problems.- When b = 256 (byte-level), use
(x >> (8*pass)) & 0xFFfor digit extraction — it is faster than integer division.
Cross-Topic Relations
| Related topic | Specific connection |
|---|---|
| Counting Sort | Direct subroutine of every LSD/MSD radix sort pass; radix sort is counting sort applied d times |
| Trie | MSD radix sort tree is structurally identical to a trie; XOR-trie problems use binary MSD logic |
| Suffix Array | Efficient suffix array construction (O(n log n) and O(n)) depends entirely on radix sort for the ranking passes |
| Sorting (general) | Radix sort is the only O(n) general-purpose sort for integers; compare with merge/heap/quick for comparison-based lower bound |
| Bit Manipulation | Binary (b=2) radix sort is a sorting application of bit-level operations; XOR maximization links to bit manipulation patterns |
| Prefix Sum | The prefix-sum step inside counting sort is a direct application of the prefix-sum technique |
| Bucket Sort | Bucket sort and radix sort both distribute elements into bins; radix sort is bucket sort applied recursively on digit significance |
| Divide and Conquer | MSD radix sort is a divide-and-conquer algorithm: split by leading digit, recurse on each group |
Interview Reference
Must-Know Problems
Counting Sort / Basic Radix
- Sort Colors (LC 75) — Dutch National Flag; counting sort with 3 values
- Sort Array by Parity (LC 905) — two-bucket partition
- Relative Sort Array (LC 1122) — counting sort with custom order
- Sort Characters By Frequency (LC 451) — counting sort on char frequencies
Maximum Gap / Bucket Pigeonhole
- Maximum Gap (LC 164) — O(n) bucket approach; classic radix sort or pigeonhole
Suffix Array (Radix-Accelerated)
- Longest Duplicate Substring (LC 1044) — binary search + rolling hash OR suffix array
- Last Substring in Lexicographical Order (LC 1163) — suffix array or two-pointer
- Number of Distinct Substrings — suffix array + LCP array
Maximum XOR (Binary MSD)
- Maximum XOR of Two Numbers in an Array (LC 421) — binary trie or bit-by-bit greedy
- Maximum XOR With an Element From Array (LC 1707) — offline queries on XOR trie
Hard / Multi-Technique
- Count of Smaller Numbers After Self (LC 315) — merge sort or BIT; radix sort variant possible
- Minimum Number of Increments on Subarrays (LC 1526) — counting sort on heights
- Smallest Range Covering Elements from K Lists (LC 632) — sort + sliding window
Clarification Questions to Ask
- "Are values guaranteed non-negative, or can they be negative?" — determines whether encoding is needed.
- "What is the range of values? Is k ≤ O(n)?" — determines if counting sort is applicable at all.
- "Are strings fixed-length or variable-length?" — LSD requires fixed length or padding; MSD handles variable.
- "Is a stable sort required, or only correct final ordering?" — stability matters for multi-key sorts but not for single-key integer sort.
- "Is O(n log n) acceptable or is O(n) required?" — determines whether to use radix sort vs. Python's built-in.
- "Are there duplicate keys?" — affects suffix array rank assignment and must be handled explicitly.
Common Interview Mistakes
- Implementing counting sort as unstable (forward scatter instead of reverse scatter), then claiming the LSD sort is correct — wrong output on ties.
- Choosing base b equal to the max value (e.g., b = 10^9) without checking memory — allocates a gigabyte-sized array.
- Forgetting to re-copy the output array back into the original array between passes in an in-place LSD implementation — each pass overwrites but the next pass reads from the original.
- Using MSD radix sort on fixed-width integers and not unifying the recursion base case — infinite recursion when all remaining digits are 0.
- Confusing "sort is stable" with "output is correct" — a sort on a single integer key doesn't need stability; stability only matters when there are secondary keys that were established by previous passes.
- In suffix array prefix-doubling, assigning distinct ranks to equal rank-pairs — silently produces a wrong suffix array that passes most tests but fails on strings with many repeated characters.
Typical Follow-Up Escalations
- "Now do it in O(n) space instead of O(n + b)" → In-place MSD bitsort (b=2) using two-pointer partition on each bit; O(n·W) time, O(W) stack space.
- "What if numbers can be negative?" → Shift by minimum, sort, shift back; or sort positive/negative halves separately.
- "Can you sort strings instead of integers?" → Switch to MSD radix sort with alphabet size as base; handle variable lengths by treating end-of-string as a sentinel smaller than all characters.
- "What's the maximum XOR of any subarray (not just pair)?" → The answer is always achievable by a prefix XOR pair; build a trie of prefix XORs and query each (reduces to LC 421 variant).
- "Now find the k-th largest element in O(n)" → Radix selection: apply LSD passes only until the bucket containing the k-th element is isolated; O(d·n) worst case, O(n) practical.
- "What if the values are floats?" → Reinterpret IEEE 754 bit patterns as unsigned integers (with sign-bit correction for negatives), apply integer radix sort, reinterpret back.
Read Next
Start reading to get personalized recommendations
Explore Unread
Great job! You've read all available articles