Sorting

Topic type: Technique-Based — a specific coding technique applied to data; sort is both a standalone problem and a preprocessing step.

LeetCode has 200+ problems tagged Sorting. All sections are included at full depth.

Core Concepts

Total order — a relation ≤ on a set satisfying reflexivity, antisymmetry, transitivity, and totality (every pair is comparable). A sort produces a sequence respecting a total order.

Strict weak ordering — what a comparator function must implement: irreflexivity, asymmetry, transitivity, and transitivity of equivalence. Violating any property causes undefined behavior in std::sort and incorrect results everywhere.

Stable sort — equal elements retain their original relative order. Required when secondary sort key is implicit (e.g., the position in the original array).

In-place sort — uses O(1) extra space (or O(log n) for stack frames). Not synonymous with unstable; merge sort is stable but not in-place.

Comparison-based sort — only uses a[i] < a[j] queries. Lower bound: Ω(n log n) comparisons for any comparison sort in the worst case.

Non-comparison sort — exploits the fact that values come from a bounded domain. Can achieve O(n) time by distributing rather than comparing.

Inversion — a pair (i, j) with i < j but a[i] > a[j]. A fully sorted array has 0 inversions; a reverse-sorted array has n(n−1)/2. Inversion count measures "how unsorted" an array is.

Comparator — a function cmp(a, b) returning negative/zero/positive (or a key(x) mapping each element to a comparable value). Key functions are faster in Python because they call the function once per element, not once per comparison.

Structural Invariants

Invariant	What maintains it	What breaks if violated
For all i < j in output: key(a[i]) ≤ key(a[j])	Every sort algorithm's postcondition	Downstream algorithms (binary search, two pointers) produce wrong answers
Comparator is a strict weak ordering	Caller's responsibility	`sort` may loop, crash, or produce non-deterministic output
Stable sort: equal elements preserve input order	Only guaranteed by stable algorithms	Implicit secondary sort key is silently broken
In-place: no allocation beyond O(log n) stack	Algorithm choice	OOM for large inputs when O(n) space assumed to be free

Types / Variants

Algorithm	Time (avg / worst)	Space	Stable	Notes
Merge Sort	O(n log n) / O(n log n)	O(n)	Yes	Divide-and-conquer; preferred when stability required
Quick Sort	O(n log n) / O(n²)	O(log n)	No	In-place; worst case on sorted input without randomization
Heap Sort	O(n log n) / O(n log n)	O(1)	No	In-place; poor cache performance; rarely used in practice
Insertion Sort	O(n) best / O(n²)	O(1)	Yes	Best for small (< 32) or nearly-sorted arrays; used inside TimSort
Tim Sort	O(n) best / O(n log n)	O(n)	Yes	Hybrid merge+insertion; Python's `list.sort()` and Java's `Arrays.sort`
Counting Sort	O(n + k)	O(k)	Yes	Non-comparison; k = value range; impractical for large k
Radix Sort	O(nk) / O(nk)	O(n + k)	Yes	Non-comparison; k = number of digits; outperforms comparison sorts when k is small
Bucket Sort	O(n) avg / O(n²)	O(n)	Yes	Non-comparison; requires values distributed uniformly over a range
Shell Sort	O(n log² n) avg	O(1)	No	Generalized insertion sort; used in practice for medium-sized arrays

When to use each:

Default: Python/Java built-in (Tim Sort) — stable, fast in practice, handles nearly-sorted inputs well.
Need in-place + no stability: Quick Sort with randomization.
Need O(n): values are integers in a small range → Counting Sort; many digits but small digit range → Radix Sort; uniform floats in [0,1] → Bucket Sort.
Small subarray (< 32 elements): Insertion Sort.
Custom objects with multi-key ordering: Tim Sort with a key= function.

Key Algorithms

Merge Sort

Recursively split array into halves, sort each half, merge sorted halves in O(n). Merge maintains two pointers advancing through each half and appending the smaller element.

Key insight: merge of two sorted sequences of length m and n takes O(m+n) and produces a sorted sequence; the recursion depth is O(log n).

Time O(n log n), Space O(n) for the merge buffer. Parallelises naturally.

Quick Sort

Choose a pivot, partition array into elements ≤ pivot and > pivot, recurse on each side.

Lomuto partition: pivot = last element; single pointer scans left-to-right swapping elements ≤ pivot left. Simple but performs poorly on duplicates.

Hoare partition: two pointers converge from both ends, swapping when a[left] > pivot and a[right] < pivot. Fewer swaps; handles duplicates better.

3-way (Dutch flag) partition: three regions [< pivot | == pivot | > pivot]. Reduces O(n²) to O(n) when all elements equal.

Key insight: after partitioning, the pivot is in its final position. Expected recursion depth O(log n) with random pivot; worst-case O(n) without randomization.

Time O(n log n) avg / O(n²) worst, Space O(log n) stack.

import random
def quicksort(a, lo, hi):
    if lo < hi:
        p = random.randint(lo, hi); a[p], a[hi] = a[hi], a[p]
        # Lomuto: returns pivot index after partition

Quick Select (k-th smallest)

Same partition as Quick Sort, but recurse only into the side containing index k. Expected O(n) time.

Key insight: after each partition, the pivot is placed at its final position. If pivot index == k, done.

Time O(n) avg / O(n²) worst, Space O(1) iterative.

Heap Sort

Build a max-heap in O(n) via bottom-up heapify, then repeatedly extract the max (swap root with last, sift-down, shrink heap).

Key insight: after i extractions, the last i positions hold the i largest elements in sorted order.

Time O(n log n), Space O(1). See heap.md for heapify details.

Counting Sort

For integers in [0, k): count occurrences, compute prefix sums (cumulative counts = positions), scatter elements into output array right-to-left for stability.

Key insight: prefix sum at position v gives the starting output index for all elements with value v.

Time O(n + k), Space O(k). Breaks when k >> n.

Radix Sort (LSD)

Sort by least-significant digit first, then next digit, ..., most-significant digit. Use stable sort (counting sort) per digit.

Key insight: stable sorting by each digit in order of significance produces a fully sorted result because later passes preserve relative order from earlier passes.

Time O(nk) where k = number of digits, Space O(n + b) where b = base.

Bucket Sort

Partition input range into n buckets of equal width, distribute elements, sort each bucket (usually insertion sort), concatenate.

Key insight: if input is uniformly distributed, expected elements per bucket is O(1), giving O(n) total.

Time O(n) avg / O(n²) worst, Space O(n).

Counting Inversions via Merge Sort

Augment merge: when an element from the right half is placed before elements from the left half, count how many left-half elements remain (all form inversions with this right element).

Key insight: during merge of two sorted halves, each "right wins over left" event contributes exactly mid - left_pointer inversions.

Time O(n log n), Space O(n).

Advanced Techniques

Custom Comparator with Transitive Closure

Solves: Problems where natural ordering doesn't apply but a relative preference exists (e.g., "Largest Number": should "3" come before "30"? Compare "330" vs. "303").

Mechanism: define cmp(a, b) that returns negative if a should come first. Convert to key via functools.cmp_to_key.

When to use over key=: only when the ordering is inherently relational (depends on both elements) and cannot be expressed as a mapping of individual elements to a comparable value. key= is always faster when applicable.

Time O(n log n), but each comparison may be O(k) where k is element size.

Sort + Two/Three Pointers

Solves: pair/triplet/subarray problems where you need to find combinations summing to a target, or minimize/maximize a gap.

Mechanism: sorting establishes monotone structure — moving one pointer in one direction has predictable effect. After sorting, two pointers can scan from both ends in O(n) rather than O(n²) brute force.

When to use: after sorting doesn't destroy required information (indices don't matter, duplicates can be handled). Do NOT use when original indices are part of the answer and cannot be recovered.

Time O(n log n + n) = O(n log n).

Coordinate Compression

Solves: problems where values span a huge range (up to 10^9) but only relative order matters, and you need to use the values as indices (e.g., BIT/segment tree indexed by value).

Mechanism: sort unique values, assign rank 0..m−1, replace all occurrences with their rank. Reduces value range from 10^9 to n.

When to use over direct indexing: when k (value range) >> n (number of elements), making O(k) space/time algorithms impractical.

Time O(n log n) for sort + O(log n) per lookup via binary search.

sorted_vals = sorted(set(a))
rank = {v: i for i, v in enumerate(sorted_vals)}

Partial Sort / Top-K Selection

Solves: finding k smallest/largest without fully sorting.

Mechanism: QuickSelect for exact k-th element in O(n) avg; heap of size k for streaming top-k in O(n log k).

When to use over full sort: when k << n and you don't need a sorted order of all k elements. If you need them sorted, heap gives O(n log k) vs. sort's O(n log n); heap wins when k << n.

Indirect Sort (sort indices, not values)

Solves: problems needing the sorted permutation rather than sorted values (argsort); preserving original indices while sorting.

Mechanism: sort list(range(n)) by key lambda i: a[i]. Original array untouched.

When to use: when you need to answer queries in sorted order but also need original indices, or when sorting multiple arrays by the same key.

order = sorted(range(n), key=lambda i: a[i])

Problem Taxonomy

By Problem Category

Category	What it asks	Key technique
Sort and scan	Process elements in sorted order to find optimal grouping/pairing	Sort + single pass
Merge intervals	Given intervals, find overlapping/non-overlapping groups	Sort by start, greedy merge
Custom ordering	Arrange elements by a non-standard rule	Custom comparator / `cmp_to_key`
Count inversions	How many pairs are out of order	Merge sort augmentation
Top-K elements	K largest/smallest/frequent	Partial sort, heap, QuickSelect
K-th element	Exact rank query	QuickSelect O(n), heap O(n log k)
Sort as preprocessing	Sorting makes a different algorithm (two pointers, binary search, greedy) feasible	Sort + apply algorithm
Wiggle / alternate sort	Arrange so a[0] ≤ a[1] ≥ a[2] ≤ ...	Sort + interleave, or in-place
Nearly sorted	Input has small inversions or is k-sorted	Insertion sort O(nk), heap O(n log k)

Problem Signal → Technique

Signal in problem	Likely technique
"sort" / "sorted order" / "ascending"	Direct sort
"largest number formed by"	`cmp_to_key` string concatenation comparator
"number of inversions" / "how many swaps"	Merge sort inversion count
"k-th smallest/largest"	QuickSelect or heap of size k
"top K frequent"	Count + partial sort / bucket sort
"meeting rooms" / "intervals overlap"	Sort by start time
"minimum difference between any two"	Sort + adjacent difference scan
"two sum / three sum" after sort	Sort + two pointers
"median of stream"	Two heaps; for static array: sort
values up to 10^9 but n ≤ 10^5	Coordinate compression
"relative order preserved for equal"	Stable sort required

Common Patterns

Pattern	When it applies	Mechanism
Sort + adjacent scan	Find minimum difference, detect duplicates, group equal elements	Sort, then iterate; consecutive elements are closest under sort order
Sort + binary search	Find if target exists, count elements in range, find insertion point	Sort once O(n log n), then O(log n) per query
Sort + greedy sweep	Interval scheduling, task assignment, meeting rooms	Sort by start/end/deadline; greedy decision at each step is correct because sorted order eliminates need to look back
Sort + two pointers	Pair/triplet sum, container with most water variant	Monotone structure after sort lets two pointers eliminate O(n) candidates per step
Sort + sliding window	Minimum difference between elements from k arrays	Sort all, use window to track elements across sources
Sort by frequency	Group anagrams, top-K frequent, reorganize string	Count + sort by count; bucket sort for O(n)
Indirect sort (argsort)	Answer multiple queries in value order while retaining indices	Sort index array by value; process in sorted order
Dutch flag (3-way partition)	Sort array with 3 distinct values (colors), or partition around pivot with duplicates	Three-pointer scan: lo, mid, hi maintaining three regions

Edge Cases & Pitfalls

Comparator violates strict weak ordering: if cmp(a,b) returns 0 for unequal elements inconsistently, sort produces non-deterministic output. Always verify transitivity: if cmp(a,b) < 0 and cmp(b,c) < 0 then cmp(a,c) < 0 must hold.
Integer overflow in comparator: return a - b overflows when a is large positive and b is large negative (or vice versa). Use return (a > b) - (a < b) or return 1 if a > b else -1 if a < b else 0.
Counting sort with negative values: range is [min, max], not [0, max]. Offset all values by -min before counting; forget the offset and you access index -1.
Counting sort when k >> n: k = 10^9 is unacceptable. Recognize when non-comparison sort is impractical; fall back to O(n log n).
Quick sort worst case on sorted/reverse-sorted input: without randomization, pivot = first or last element causes O(n²) behavior. Always randomize pivot or use median-of-three.
Stability assumption with unstable sort: using heap sort or quick sort when equal elements must maintain original order silently corrupts the result. Check whether stability is required before choosing algorithm.
Mutating array during sort: Python's list.sort() is undefined if the list changes during sorting (e.g., via a comparator with side effects). Never mutate the array being sorted.
sort() vs sorted(): list.sort() returns None, not the sorted list. x = a.sort() then using x is a common bug.
Dutch flag off-by-one: the mid pointer processes element at mid before incrementing. Swapping a[mid] and a[hi] does NOT increment mid because the swapped-in element is unclassified.
Merge sort extra allocations: creating a new list per merge call in Python leads to O(n log n) allocations. Allocate a single temporary buffer at the top level and pass it down.

Implementation Notes

Python's list.sort() and sorted() are Tim Sort — stable, O(n log n) worst case, O(n) for nearly-sorted input. Prefer key= over cmp_to_key; key function is called once per element vs. once per comparison.
functools.cmp_to_key wraps a comparison function for use with sorted. Required when ordering is inherently relational (two elements must both be visible to determine order).
heapq is a min-heap; negate values for max-heap. For top-k largest, use heapq.nlargest(k, iterable) which is O(n log k).
Python recursion limit (default 1000) breaks naive recursive merge sort for n > ~500. Use iterative bottom-up merge or increase with sys.setrecursionlimit.
For counting sort, collections.Counter builds frequency dict in O(n) but iterating in sorted key order is O(k log k) — use range(min_val, max_val+1) for O(k) scatter.
bisect.bisect_left(a, x) requires a to already be sorted; does not sort for you.
Comparing strings with < in Python is lexicographic (not length-based). For "Largest Number", the comparator str(a)+str(b) > str(b)+str(a) is correct but must be wrapped with cmp_to_key.

Cross-Topic Relations

Related topic	Connection
Two Pointers	Sorting is the standard preprocessing step that makes two-pointer O(n) scans correct
Binary Search	Binary search requires sorted input; sort once, query many times
Heap / Priority Queue	Heap sort, partial sort, top-K, and k-th element problems; QuickSelect vs. heap tradeoff
Greedy	Most greedy interval/scheduling algorithms require sorting by start/end/deadline first
Merge Intervals	Interval problems almost always begin with sort by start time
Sliding Window	Sometimes combined: sort, then apply sliding window on sorted array
Divide and Conquer	Merge sort is the canonical divide-and-conquer sort; inversion count uses the same structure
Counting / Prefix Sum	Counting sort and radix sort; coordinate compression enables prefix-sum on compressed domain

Interview Reference

Must-Know Problems

Sorting mechanics and partitioning

Sort Colors (Dutch flag, 3-way partition) — Easy
Kth Largest Element in an Array (QuickSelect or heap) — Medium
Find K Closest Elements — Medium

Sort as preprocessing

Merge Intervals — Medium
Non-overlapping Intervals — Medium
Meeting Rooms II — Medium
Minimum Number of Arrows to Burst Balloons — Medium
Two Sum II (sorted array) — Medium
3Sum — Medium

Custom ordering

Largest Number — Medium
Reorder Data in Log Files — Medium
Sort Characters By Frequency — Medium
Wiggle Sort II — Medium

Advanced / Hard

Count of Smaller Numbers After Self (inversion count via merge sort) — Hard
Reverse Pairs (modified merge sort) — Hard
Count of Range Sum (merge sort on prefix sums) — Hard
Maximum Gap (bucket sort, pigeonhole) — Hard

Additional LeetCode Coverage

Third Maximum Number (LC 414)
Largest Number At Least Twice of Others (LC 747)
Largest Perimeter Triangle (LC 976)
Minimum Absolute Difference (LC 1200)
Average Salary Excluding the Minimum and Maximum Salary (LC 1491)
Can Make Arithmetic Progression From Sequence (LC 1502)
Maximum Units on a Truck (LC 1710)
Find Players With Zero or One Losses (LC 2225)

Clarification Questions to Ask

Are values integers? What is the range? (Determines feasibility of counting/radix sort.)
Are there duplicates? Does their relative order matter? (Stability requirement.)
Is the array nearly sorted? (Insertion sort / Tim Sort O(n) best case.)
Can I modify the input array, or must I return a new sorted array?
What does "k-th" mean — 0-indexed or 1-indexed?
For custom ordering problems: is the comparator well-defined for all pairs? (Transitive closure.)

Common Interview Mistakes

Choosing counting sort without checking value range, leading to O(10^9) space.
Writing return a - b as a comparator and not considering integer overflow.
Forgetting to randomize the pivot in Quick Sort when interviewer inputs a sorted array as a test case.
Using unstable sort when equal elements must preserve relative order (e.g., sorting by one key then expecting secondary key from original order to be preserved).
Confusing list.sort() (in-place, returns None) with sorted() (returns new list) and using the return value of sort().
Applying Dutch flag three-way partition but incrementing mid after swapping with hi, processing an unclassified element.
Using O(n log n) sort when QuickSelect achieves the same k-th element query in O(n).

Typical Follow-Up Escalations

"Now do it in O(n) time." → Recognize values are bounded integers → counting/radix sort; or k-th element → QuickSelect.
"Now the array is a stream — you can't store all elements." → Heap of size k for top-k; reservoir sampling for uniform random k-th; two heaps for median.
"Now find the k-th smallest in a sorted matrix." → Binary search on value, not index. Sort-based approach is O(n² log n) — wrong.
"Now count inversions." → Augmented merge sort; explain why naive O(n²) is insufficient.
"Your solution uses O(n) space; make it O(1)." → Heap sort (in-place, unstable); or in-place merge sort (complex, rarely expected).
"Now do it without a comparator — integers only." → Counting sort or radix sort; discuss when each is appropriate.

Explore Unread

Great job! You've read all available articles

Sorting

Topic type: Technique-Based — a specific coding technique applied to data; sort is both a standalone problem and a preprocessing step.

LeetCode has 200+ problems tagged Sorting. All sections are included at full depth.

Core Concepts

Total order — a relation ≤ on a set satisfying reflexivity, antisymmetry, transitivity, and totality (every pair is comparable). A sort produces a sequence respecting a total order.

Stable sort — equal elements retain their original relative order. Required when secondary sort key is implicit (e.g., the position in the original array).

In-place sort — uses O(1) extra space (or O(log n) for stack frames). Not synonymous with unstable; merge sort is stable but not in-place.

Comparison-based sort — only uses a[i] < a[j] queries. Lower bound: Ω(n log n) comparisons for any comparison sort in the worst case.

Non-comparison sort — exploits the fact that values come from a bounded domain. Can achieve O(n) time by distributing rather than comparing.

Inversion — a pair (i, j) with i < j but a[i] > a[j]. A fully sorted array has 0 inversions; a reverse-sorted array has n(n−1)/2. Inversion count measures "how unsorted" an array is.

Structural Invariants

Invariant	What maintains it	What breaks if violated
For all i < j in output: key(a[i]) ≤ key(a[j])	Every sort algorithm's postcondition	Downstream algorithms (binary search, two pointers) produce wrong answers
Comparator is a strict weak ordering	Caller's responsibility	`sort` may loop, crash, or produce non-deterministic output
Stable sort: equal elements preserve input order	Only guaranteed by stable algorithms	Implicit secondary sort key is silently broken
In-place: no allocation beyond O(log n) stack	Algorithm choice	OOM for large inputs when O(n) space assumed to be free

Types / Variants

Algorithm	Time (avg / worst)	Space	Stable	Notes
Merge Sort	O(n log n) / O(n log n)	O(n)	Yes	Divide-and-conquer; preferred when stability required
Quick Sort	O(n log n) / O(n²)	O(log n)	No	In-place; worst case on sorted input without randomization
Heap Sort	O(n log n) / O(n log n)	O(1)	No	In-place; poor cache performance; rarely used in practice
Insertion Sort	O(n) best / O(n²)	O(1)	Yes	Best for small (< 32) or nearly-sorted arrays; used inside TimSort
Tim Sort	O(n) best / O(n log n)	O(n)	Yes	Hybrid merge+insertion; Python's `list.sort()` and Java's `Arrays.sort`
Counting Sort	O(n + k)	O(k)	Yes	Non-comparison; k = value range; impractical for large k
Radix Sort	O(nk) / O(nk)	O(n + k)	Yes	Non-comparison; k = number of digits; outperforms comparison sorts when k is small
Bucket Sort	O(n) avg / O(n²)	O(n)	Yes	Non-comparison; requires values distributed uniformly over a range
Shell Sort	O(n log² n) avg	O(1)	No	Generalized insertion sort; used in practice for medium-sized arrays

When to use each:

Default: Python/Java built-in (Tim Sort) — stable, fast in practice, handles nearly-sorted inputs well.
Need in-place + no stability: Quick Sort with randomization.
Need O(n): values are integers in a small range → Counting Sort; many digits but small digit range → Radix Sort; uniform floats in [0,1] → Bucket Sort.
Small subarray (< 32 elements): Insertion Sort.
Custom objects with multi-key ordering: Tim Sort with a key= function.

Key Algorithms

Merge Sort

Recursively split array into halves, sort each half, merge sorted halves in O(n). Merge maintains two pointers advancing through each half and appending the smaller element.

Key insight: merge of two sorted sequences of length m and n takes O(m+n) and produces a sorted sequence; the recursion depth is O(log n).

Time O(n log n), Space O(n) for the merge buffer. Parallelises naturally.

Quick Sort

Choose a pivot, partition array into elements ≤ pivot and > pivot, recurse on each side.

Lomuto partition: pivot = last element; single pointer scans left-to-right swapping elements ≤ pivot left. Simple but performs poorly on duplicates.

Hoare partition: two pointers converge from both ends, swapping when a[left] > pivot and a[right] < pivot. Fewer swaps; handles duplicates better.

3-way (Dutch flag) partition: three regions [< pivot | == pivot | > pivot]. Reduces O(n²) to O(n) when all elements equal.

Key insight: after partitioning, the pivot is in its final position. Expected recursion depth O(log n) with random pivot; worst-case O(n) without randomization.

Time O(n log n) avg / O(n²) worst, Space O(log n) stack.

import random
def quicksort(a, lo, hi):
    if lo < hi:
        p = random.randint(lo, hi); a[p], a[hi] = a[hi], a[p]
        # Lomuto: returns pivot index after partition

Quick Select (k-th smallest)

Same partition as Quick Sort, but recurse only into the side containing index k. Expected O(n) time.

Key insight: after each partition, the pivot is placed at its final position. If pivot index == k, done.

Time O(n) avg / O(n²) worst, Space O(1) iterative.

Heap Sort

Build a max-heap in O(n) via bottom-up heapify, then repeatedly extract the max (swap root with last, sift-down, shrink heap).

Key insight: after i extractions, the last i positions hold the i largest elements in sorted order.

Time O(n log n), Space O(1). See heap.md for heapify details.

Counting Sort

For integers in [0, k): count occurrences, compute prefix sums (cumulative counts = positions), scatter elements into output array right-to-left for stability.

Key insight: prefix sum at position v gives the starting output index for all elements with value v.

Time O(n + k), Space O(k). Breaks when k >> n.

Radix Sort (LSD)

Sort by least-significant digit first, then next digit, ..., most-significant digit. Use stable sort (counting sort) per digit.

Key insight: stable sorting by each digit in order of significance produces a fully sorted result because later passes preserve relative order from earlier passes.

Time O(nk) where k = number of digits, Space O(n + b) where b = base.

Bucket Sort

Partition input range into n buckets of equal width, distribute elements, sort each bucket (usually insertion sort), concatenate.

Key insight: if input is uniformly distributed, expected elements per bucket is O(1), giving O(n) total.

Time O(n) avg / O(n²) worst, Space O(n).

Counting Inversions via Merge Sort

Augment merge: when an element from the right half is placed before elements from the left half, count how many left-half elements remain (all form inversions with this right element).

Key insight: during merge of two sorted halves, each "right wins over left" event contributes exactly mid - left_pointer inversions.

Time O(n log n), Space O(n).

Advanced Techniques

Custom Comparator with Transitive Closure

Solves: Problems where natural ordering doesn't apply but a relative preference exists (e.g., "Largest Number": should "3" come before "30"? Compare "330" vs. "303").

Mechanism: define cmp(a, b) that returns negative if a should come first. Convert to key via functools.cmp_to_key.

Time O(n log n), but each comparison may be O(k) where k is element size.

Sort + Two/Three Pointers

Solves: pair/triplet/subarray problems where you need to find combinations summing to a target, or minimize/maximize a gap.

When to use: after sorting doesn't destroy required information (indices don't matter, duplicates can be handled). Do NOT use when original indices are part of the answer and cannot be recovered.

Time O(n log n + n) = O(n log n).

Coordinate Compression

Solves: problems where values span a huge range (up to 10^9) but only relative order matters, and you need to use the values as indices (e.g., BIT/segment tree indexed by value).

Mechanism: sort unique values, assign rank 0..m−1, replace all occurrences with their rank. Reduces value range from 10^9 to n.

When to use over direct indexing: when k (value range) >> n (number of elements), making O(k) space/time algorithms impractical.

Time O(n log n) for sort + O(log n) per lookup via binary search.

sorted_vals = sorted(set(a))
rank = {v: i for i, v in enumerate(sorted_vals)}

Partial Sort / Top-K Selection

Solves: finding k smallest/largest without fully sorting.

Mechanism: QuickSelect for exact k-th element in O(n) avg; heap of size k for streaming top-k in O(n log k).

When to use over full sort: when k << n and you don't need a sorted order of all k elements. If you need them sorted, heap gives O(n log k) vs. sort's O(n log n); heap wins when k << n.

Indirect Sort (sort indices, not values)

Solves: problems needing the sorted permutation rather than sorted values (argsort); preserving original indices while sorting.

Mechanism: sort list(range(n)) by key lambda i: a[i]. Original array untouched.

When to use: when you need to answer queries in sorted order but also need original indices, or when sorting multiple arrays by the same key.

order = sorted(range(n), key=lambda i: a[i])

Problem Taxonomy

By Problem Category

Category	What it asks	Key technique
Sort and scan	Process elements in sorted order to find optimal grouping/pairing	Sort + single pass
Merge intervals	Given intervals, find overlapping/non-overlapping groups	Sort by start, greedy merge
Custom ordering	Arrange elements by a non-standard rule	Custom comparator / `cmp_to_key`
Count inversions	How many pairs are out of order	Merge sort augmentation
Top-K elements	K largest/smallest/frequent	Partial sort, heap, QuickSelect
K-th element	Exact rank query	QuickSelect O(n), heap O(n log k)
Sort as preprocessing	Sorting makes a different algorithm (two pointers, binary search, greedy) feasible	Sort + apply algorithm
Wiggle / alternate sort	Arrange so a[0] ≤ a[1] ≥ a[2] ≤ ...	Sort + interleave, or in-place
Nearly sorted	Input has small inversions or is k-sorted	Insertion sort O(nk), heap O(n log k)

Problem Signal → Technique

Signal in problem	Likely technique
"sort" / "sorted order" / "ascending"	Direct sort
"largest number formed by"	`cmp_to_key` string concatenation comparator
"number of inversions" / "how many swaps"	Merge sort inversion count
"k-th smallest/largest"	QuickSelect or heap of size k
"top K frequent"	Count + partial sort / bucket sort
"meeting rooms" / "intervals overlap"	Sort by start time
"minimum difference between any two"	Sort + adjacent difference scan
"two sum / three sum" after sort	Sort + two pointers
"median of stream"	Two heaps; for static array: sort
values up to 10^9 but n ≤ 10^5	Coordinate compression
"relative order preserved for equal"	Stable sort required

Common Patterns

Pattern	When it applies	Mechanism
Sort + adjacent scan	Find minimum difference, detect duplicates, group equal elements	Sort, then iterate; consecutive elements are closest under sort order
Sort + binary search	Find if target exists, count elements in range, find insertion point	Sort once O(n log n), then O(log n) per query
Sort + greedy sweep	Interval scheduling, task assignment, meeting rooms	Sort by start/end/deadline; greedy decision at each step is correct because sorted order eliminates need to look back
Sort + two pointers	Pair/triplet sum, container with most water variant	Monotone structure after sort lets two pointers eliminate O(n) candidates per step
Sort + sliding window	Minimum difference between elements from k arrays	Sort all, use window to track elements across sources
Sort by frequency	Group anagrams, top-K frequent, reorganize string	Count + sort by count; bucket sort for O(n)
Indirect sort (argsort)	Answer multiple queries in value order while retaining indices	Sort index array by value; process in sorted order
Dutch flag (3-way partition)	Sort array with 3 distinct values (colors), or partition around pivot with duplicates	Three-pointer scan: lo, mid, hi maintaining three regions

Edge Cases & Pitfalls

Comparator violates strict weak ordering: if cmp(a,b) returns 0 for unequal elements inconsistently, sort produces non-deterministic output. Always verify transitivity: if cmp(a,b) < 0 and cmp(b,c) < 0 then cmp(a,c) < 0 must hold.
Integer overflow in comparator: return a - b overflows when a is large positive and b is large negative (or vice versa). Use return (a > b) - (a < b) or return 1 if a > b else -1 if a < b else 0.
Counting sort with negative values: range is [min, max], not [0, max]. Offset all values by -min before counting; forget the offset and you access index -1.
Counting sort when k >> n: k = 10^9 is unacceptable. Recognize when non-comparison sort is impractical; fall back to O(n log n).
Quick sort worst case on sorted/reverse-sorted input: without randomization, pivot = first or last element causes O(n²) behavior. Always randomize pivot or use median-of-three.
Stability assumption with unstable sort: using heap sort or quick sort when equal elements must maintain original order silently corrupts the result. Check whether stability is required before choosing algorithm.
Mutating array during sort: Python's list.sort() is undefined if the list changes during sorting (e.g., via a comparator with side effects). Never mutate the array being sorted.
sort() vs sorted(): list.sort() returns None, not the sorted list. x = a.sort() then using x is a common bug.
Dutch flag off-by-one: the mid pointer processes element at mid before incrementing. Swapping a[mid] and a[hi] does NOT increment mid because the swapped-in element is unclassified.
Merge sort extra allocations: creating a new list per merge call in Python leads to O(n log n) allocations. Allocate a single temporary buffer at the top level and pass it down.

Implementation Notes

Python's list.sort() and sorted() are Tim Sort — stable, O(n log n) worst case, O(n) for nearly-sorted input. Prefer key= over cmp_to_key; key function is called once per element vs. once per comparison.
functools.cmp_to_key wraps a comparison function for use with sorted. Required when ordering is inherently relational (two elements must both be visible to determine order).
heapq is a min-heap; negate values for max-heap. For top-k largest, use heapq.nlargest(k, iterable) which is O(n log k).
Python recursion limit (default 1000) breaks naive recursive merge sort for n > ~500. Use iterative bottom-up merge or increase with sys.setrecursionlimit.
For counting sort, collections.Counter builds frequency dict in O(n) but iterating in sorted key order is O(k log k) — use range(min_val, max_val+1) for O(k) scatter.
bisect.bisect_left(a, x) requires a to already be sorted; does not sort for you.
Comparing strings with < in Python is lexicographic (not length-based). For "Largest Number", the comparator str(a)+str(b) > str(b)+str(a) is correct but must be wrapped with cmp_to_key.

Cross-Topic Relations

Related topic	Connection
Two Pointers	Sorting is the standard preprocessing step that makes two-pointer O(n) scans correct
Binary Search	Binary search requires sorted input; sort once, query many times
Heap / Priority Queue	Heap sort, partial sort, top-K, and k-th element problems; QuickSelect vs. heap tradeoff
Greedy	Most greedy interval/scheduling algorithms require sorting by start/end/deadline first
Merge Intervals	Interval problems almost always begin with sort by start time
Sliding Window	Sometimes combined: sort, then apply sliding window on sorted array
Divide and Conquer	Merge sort is the canonical divide-and-conquer sort; inversion count uses the same structure
Counting / Prefix Sum	Counting sort and radix sort; coordinate compression enables prefix-sum on compressed domain

Interview Reference

Must-Know Problems

Sorting mechanics and partitioning

Sort Colors (Dutch flag, 3-way partition) — Easy
Kth Largest Element in an Array (QuickSelect or heap) — Medium
Find K Closest Elements — Medium

Sort as preprocessing

Merge Intervals — Medium
Non-overlapping Intervals — Medium
Meeting Rooms II — Medium
Minimum Number of Arrows to Burst Balloons — Medium
Two Sum II (sorted array) — Medium
3Sum — Medium

Custom ordering

Largest Number — Medium
Reorder Data in Log Files — Medium
Sort Characters By Frequency — Medium
Wiggle Sort II — Medium

Advanced / Hard

Count of Smaller Numbers After Self (inversion count via merge sort) — Hard
Reverse Pairs (modified merge sort) — Hard
Count of Range Sum (merge sort on prefix sums) — Hard
Maximum Gap (bucket sort, pigeonhole) — Hard

Additional LeetCode Coverage

Third Maximum Number (LC 414)
Largest Number At Least Twice of Others (LC 747)
Largest Perimeter Triangle (LC 976)
Minimum Absolute Difference (LC 1200)
Average Salary Excluding the Minimum and Maximum Salary (LC 1491)
Can Make Arithmetic Progression From Sequence (LC 1502)
Maximum Units on a Truck (LC 1710)
Find Players With Zero or One Losses (LC 2225)

Clarification Questions to Ask

Are values integers? What is the range? (Determines feasibility of counting/radix sort.)
Are there duplicates? Does their relative order matter? (Stability requirement.)
Is the array nearly sorted? (Insertion sort / Tim Sort O(n) best case.)
Can I modify the input array, or must I return a new sorted array?
What does "k-th" mean — 0-indexed or 1-indexed?
For custom ordering problems: is the comparator well-defined for all pairs? (Transitive closure.)

Common Interview Mistakes

Choosing counting sort without checking value range, leading to O(10^9) space.
Writing return a - b as a comparator and not considering integer overflow.
Forgetting to randomize the pivot in Quick Sort when interviewer inputs a sorted array as a test case.
Using unstable sort when equal elements must preserve relative order (e.g., sorting by one key then expecting secondary key from original order to be preserved).
Confusing list.sort() (in-place, returns None) with sorted() (returns new list) and using the return value of sort().
Applying Dutch flag three-way partition but incrementing mid after swapping with hi, processing an unclassified element.
Using O(n log n) sort when QuickSelect achieves the same k-th element query in O(n).

Typical Follow-Up Escalations

"Now do it in O(n) time." → Recognize values are bounded integers → counting/radix sort; or k-th element → QuickSelect.
"Now the array is a stream — you can't store all elements." → Heap of size k for top-k; reservoir sampling for uniform random k-th; two heaps for median.
"Now find the k-th smallest in a sorted matrix." → Binary search on value, not index. Sort-based approach is O(n² log n) — wrong.
"Now count inversions." → Augmented merge sort; explain why naive O(n²) is insufficient.
"Your solution uses O(n) space; make it O(1)." → Heap sort (in-place, unstable); or in-place merge sort (complex, rarely expected).
"Now do it without a comparator — integers only." → Counting sort or radix sort; discuss when each is appropriate.

Explore Unread

Great job! You've read all available articles

Sorting

Core Concepts

Structural Invariants

Types / Variants

Key Algorithms

Merge Sort

Quick Sort

Quick Select (k-th smallest)

Heap Sort

Counting Sort

Radix Sort (LSD)

Bucket Sort

Counting Inversions via Merge Sort

Advanced Techniques

Custom Comparator with Transitive Closure

Sort + Two/Three Pointers

Coordinate Compression

Partial Sort / Top-K Selection

Indirect Sort (sort indices, not values)

Problem Taxonomy

By Problem Category

Problem Signal → Technique

Common Patterns

Edge Cases & Pitfalls

Implementation Notes

Cross-Topic Relations

Interview Reference

Must-Know Problems

Clarification Questions to Ask

Common Interview Mistakes

Typical Follow-Up Escalations

Read Next

Explore Unread

Sorting

Core Concepts

Structural Invariants

Types / Variants

Key Algorithms

Merge Sort

Quick Sort

Quick Select (k-th smallest)

Heap Sort

Counting Sort

Radix Sort (LSD)

Bucket Sort

Counting Inversions via Merge Sort

Advanced Techniques

Custom Comparator with Transitive Closure

Sort + Two/Three Pointers

Coordinate Compression

Partial Sort / Top-K Selection

Indirect Sort (sort indices, not values)

Problem Taxonomy

By Problem Category

Problem Signal → Technique

Common Patterns

Edge Cases & Pitfalls

Implementation Notes

Cross-Topic Relations

Interview Reference

Must-Know Problems

Clarification Questions to Ask

Common Interview Mistakes

Typical Follow-Up Escalations

Read Next

Explore Unread