Quickselect

Quickselect is a Technique-Based topic: a selection algorithm that finds the k-th order statistic (k-th smallest or largest element) in an unsorted array. LeetCode problem count: ~50–100 problems. Depth level: core algorithms, main patterns, key complexities.

Core Concepts

Order statistic — the k-th smallest element in a collection when sorted; the 1st order statistic is the minimum, the n-th is the maximum. "k-th largest" maps to the (n − k + 1)-th smallest.

Selection problem — find the k-th order statistic without fully sorting; solvable in expected O(n) time vs. O(n log n) for sort.

Partition — rearranges an array subrange so all elements less than a pivot come before it and all elements greater come after it; returns the pivot's final index. The pivot is now in its sorted position permanently.

k-th order statistic (0-indexed) — after partitioning, if the pivot lands at index p: if p == k, return pivot; if p > k, recurse left; if p < k, recurse right. Only one branch is ever explored.

Expected O(n) time — with random pivot selection, expected work is n + n/2 + n/4 + … = 2n. Worst case O(n²) when pivots are always extreme (already-sorted input with first/last pivot).

Partial sort — returning the first k elements in sorted order; quickselect finds position k, then the left side is unsorted but all ≤ pivot; a second sort on k elements costs O(k log k). Total: O(n + k log k).

In-place selection — quickselect mutates the input array; the result is the value at index k after the algorithm finishes, not a separate output structure.

Introselect — hybrid: quickselect with random pivot until recursion depth exceeds a threshold, then falls back to median-of-medians to guarantee O(n) worst case. Used in C++ std::nth_element.

Types / Variants

Variant	What it is	When to use	Key property
Lomuto partition	Pivot = last element; single forward scan with swap counter	Simple implementation; teaching / interviews	Unstable; O(n²) on sorted input with fixed pivot
Hoare partition	Two pointers moving inward; swaps crossing pairs	Slightly fewer swaps on average than Lomuto	Pivot ends up in the middle zone, not at exact final index
3-way (DNF) partition	Splits into <, =, > regions using Dutch National Flag	Arrays with many duplicate values	Degrades to O(n) on all-equal inputs; avoids O(n²) duplicates trap
Randomised pivot	Random index selection before partitioning	Default choice for interviews and production	Expected O(n); defeats adversarial sorted inputs
Median-of-medians	Guaranteed O(n) worst case via recursive pivot selection	When worst-case guarantee is required (rare in interviews)	High constant factor; rarely faster in practice than randomised
Heap-based selection	Push all into a min/max heap or maintain a size-k heap	Online stream, repeated queries, or when input must not be mutated	O(n log k); simpler code; preferred when k << n and mutation is forbidden

Key Algorithms

Lomuto Partition

Scans left to right; maintains a boundary where everything left of it is ≤ pivot (pivot = last element). Returns final pivot index.

Time: O(n) per call. Space: O(1).

Key insight: the scan variable j walks forward unconditionally; i advances only on a swap, tracking the boundary.

# pivot = arr[hi]; i tracks boundary of ≤ pivot region
if arr[j] <= pivot: i += 1; arr[i], arr[j] = arr[j], arr[i]
arr[i+1], arr[hi] = arr[hi], arr[i+1]; return i + 1

Hoare Partition

Two pointers start at opposite ends and move inward, swapping elements that are on the wrong side. Does not place pivot at its final index — returns a split point.

Time: O(n) per call. Space: O(1).

Key insight: the return value p means arr[lo..p] are all ≤ pivot and arr[p+1..hi] are all ≥ pivot, but p is not necessarily the pivot's sorted index. Recurse on [lo, p] and [p+1, hi], not [lo, p-1] and [p+1, hi].

3-Way Partition (Dutch National Flag)

Partitions into three contiguous regions: < pivot, == pivot, > pivot. Uses three pointers: lo (< boundary), mid (current), hi (> boundary).

Time: O(n) per call. Space: O(1).

Key insight: when all elements equal the pivot, the loop terminates in one pass with the entire array in the == region — O(n) total instead of O(n²).

lo, mid, hi = left, left, right
while mid <= hi:
    if arr[mid] < pivot: arr[lo],arr[mid]=arr[mid],arr[lo]; lo+=1; mid+=1
    elif arr[mid] > pivot: arr[mid],arr[hi]=arr[hi],arr[mid]; hi-=1
    else: mid+=1

Quickselect (Core Loop)

Repeatedly partitions and recurses into only the subarray containing index k.

Time: expected O(n) with random pivot; O(n²) worst case. Space: O(log n) call stack expected, O(n) worst.

Key insight: unlike quicksort, only one branch is ever explored after each partition, giving the geometric series convergence to O(n).

if lo == hi: return arr[lo]
p = partition(arr, lo, hi)  # random pivot swap before partitioning
if p == k: return arr[p]
elif p > k: return quickselect(arr, lo, p-1, k)
else: return quickselect(arr, p+1, hi, k)

Median-of-Medians

Guarantees O(n) worst-case by choosing a pivot that is at least in the 30th–70th percentile.

Time: O(n) worst case. Space: O(n) due to auxiliary group arrays (or O(log n) with in-place grouping).

Key insight: split into groups of 5, find each group's median by sorting (O(1) per group since size is constant), then recursively find the median of those medians. This pivot guarantees at least 3n/10 elements on each side.

The high constant factor (~10×) makes it slower than randomised quickselect in practice; use only when adversarial inputs or hard worst-case guarantees are required.

Heap-Based Selection

Build a max-heap of the first k elements, then scan the remaining n − k elements, replacing the heap root when a smaller element is found. Final heap root = k-th smallest.

Time: O(n log k). Space: O(k).

Key insight: maintaining a bounded heap is simpler, never mutates the input, and is O(n log k) — preferable when k is small (log k ≈ constant) or queries are online.

import heapq
heap = [-x for x in arr[:k]]
heapq.heapify(heap)  # max-heap of size k via negation
for x in arr[k:]:
    if x < -heap[0]: heapq.heapreplace(heap, -x)
return -heap[0]

See heap-priority-queue.md for full heap coverage.

nth_element (Library Shortcut)

Python does not have std::nth_element. Use sorted() or implement quickselect manually. In interviews, always implement quickselect from scratch unless told otherwise.

Advanced Techniques

Weighted k-th Smallest

Problem class: each element has a weight (frequency/count); find the element at cumulative weight position k.

Core mechanism: binary search on the value range or sorted candidates; for each candidate value v, count total weight of elements ≤ v in O(n); binary search over this monotone function.

When to prefer over quickselect: when the input is given as a frequency table or the "array" is implicitly defined (matrix, range), and explicit expansion is prohibitive.

Complexity: O(n log(max − min)) with binary search on value range.

Selection in a Row-and-Column Sorted Matrix

Problem class: find k-th smallest in an n × n matrix where rows and columns are sorted (LeetCode 378).

Core mechanism: binary search on value range [matrix[0][0], matrix[n-1][n-1]]; count elements ≤ mid using a staircase walk in O(n); narrow range until lo == hi.

When to prefer over quickselect: input is not a flat array; extracting to a flat array costs O(n²) space.

Complexity: O(n log(max − min)).

See binary-search.md for binary-search-on-answer pattern.

Introselect / Fallback Strategy

Problem class: production-grade selection where adversarial inputs exist.

Core mechanism: run randomised quickselect; track recursion depth; if depth exceeds 2 log n, switch to median-of-medians for the remaining subproblem.

When to prefer over pure quickselect: the input may be adversarial (e.g., competitive programming, security-sensitive contexts). In interviews, random pivot alone suffices.

Complexity: O(n) worst case.

Problem Taxonomy

By Problem Category

Problem Type	What it asks	Key technique
k-th smallest/largest in unsorted array	Single selection query, mutation allowed	Quickselect with random pivot
k-th smallest/largest, no mutation	Same, but input must stay intact	Copy + quickselect, or heap
k closest points to origin	Find k elements minimising some metric	Quickselect on metric values (partial sort)
Top-k frequent elements	k elements with highest frequency	Build frequency map, then quickselect on frequencies
Median of unsorted array	50th percentile statistic	Quickselect for index n//2 (and n//2−1 for even n)
k-th largest in a stream	Online / append-only input	Min-heap of size k; O(log k) per insert
k-th smallest in sorted matrix	Grid with row/column sorted order	Binary search on value range with count
Partial sort (return sorted top-k)	k smallest/largest in sorted order	Quickselect to position k, then sort left partition
k-th smallest pair sum / distance	Selection over implicit sorted list	Binary search on answer + counting function

Problem Signal → Technique

Signal in problem statement	Likely technique
"find k-th largest / smallest"	Quickselect or min-heap of size k
"return k closest / k nearest"	Quickselect on distance metric
"top k frequent"	Frequency map + quickselect or bucket sort
"median of unsorted"	Quickselect at index n//2
"stream" or "online" with k	Heap of size k, not quickselect
"no extra space"	In-place quickselect
"k-th smallest in matrix"	Binary search on value + row staircase count
"without sorting"	Quickselect (expected O(n))
"worst-case O(n)"	Median-of-medians or heap

Common Patterns

Pattern	When it applies	Mechanism
Random pivot swap before partition	All quickselect implementations	`swap(arr, random(lo, hi), hi)` before Lomuto; eliminates sorted-input worst case
k-th largest → k-th smallest mapping	Problem gives "k-th largest"	Use index `n - k` (0-indexed) for k-th smallest equivalent
Partial sort via quickselect	Need sorted top-k, not just the boundary	Quickselect to find position k, then sort `arr[:k]`
Heap of size k for streaming	Data arrives one element at a time	Maintain max-heap (k-th smallest) or min-heap (k-th largest) of exactly k elements
Binary search on answer with count	Implicit or non-flat input	`lo, hi = min_val, max_val`; count elements ≤ mid; shrink range
3-way partition guard	Input has many duplicates	Replace standard 2-way partition; prevents O(n²) on all-equal subarrays
Iterative quickselect	Avoid recursion stack overflow	Replace tail recursion: update `lo` or `hi` based on pivot position, loop

Edge Cases & Pitfalls

k out of range. If k < 1 or k > n, the algorithm accesses an invalid index or returns a wrong element. Always validate k before entering the selection loop.
All identical elements with 2-way partition. Lomuto partition places all equal elements on one side, causing O(n²) behaviour on uniform arrays. Fix: use 3-way (DNF) partition when duplicates are common, or at minimum check if all elements are equal before partitioning.
k-th largest vs k-th smallest index confusion. "k-th largest" in a 1-indexed problem maps to index n - k in 0-indexed quickselect. Off-by-one here returns the wrong element entirely. Confirm the convention at the top of implementation.
Hoare partition recurse bounds. After Hoare partition returns split point p, the left subproblem must include p: recurse on [lo, p] not [lo, p-1]. Using p-1 can skip the pivot-side element and produce an infinite loop or wrong answer.
Mutating caller's array. Quickselect rearranges the input in-place. If the caller expects the original order preserved, copy before calling. Forgetting this introduces subtle bugs in problems that reuse the array.
Random pivot with fixed seed. Using random.randint without seeding is fine for most LeetCode; but in tests that check determinism, the result is correct but undebuggable. Keep the randomisation but add a fixed seed during debugging.
Single-element subarray infinite recursion. When lo == hi, return immediately. Without this base case, partition is called on a 1-element range and may loop or produce an out-of-bounds index.
Median of even-length array. For even n, the median is typically the average of the two middle values (indices n//2 - 1 and n//2). Quickselect must be called twice (or once with 3-way partition that identifies the exact range). Returning only one index produces a wrong median.
Integer overflow when computing average for median. (a + b) // 2 can overflow in other languages; not an issue in Python (arbitrary precision integers), but the two-call quickselect approach avoids needing division at all.

Implementation Notes

Python's random.randint(lo, hi) is inclusive on both ends; use it to pick a random pivot index within [lo, hi].
Python has no built-in nth_element; heapq.nsmallest(k, arr) returns sorted top-k in O(n log k) and is acceptable when k is small, but not when O(n) is required.
heapq is a min-heap only; simulate a max-heap by negating values.
Python's default recursion limit is 1000; for arrays of size > 500 with adversarial inputs, the call stack may overflow. Use iterative quickselect or sys.setrecursionlimit.
random.shuffle before calling a non-randomised quickselect is equivalent to random pivot but shuffles the entire array — it's O(n) and acceptable for small n but changes the array order globally.
When using Lomuto partition, the pivot swap must happen before entering the scan loop, not after — placing it after the loop corrupts the scan.

Cross-Topic Relations

Related topic	Connection
sorting.md	Quickselect derives from quicksort; same partition logic, but recurses into only one branch. Partial sort uses quickselect + sort on the smaller partition.
heap-priority-queue.md	Heap-based selection (size-k heap) is the main alternative; preferred for streaming, small k, or when mutation is forbidden.
divide-and-conquer.md	Quickselect is a divide-and-conquer algorithm; median-of-medians applies recursive sub-problem decomposition.
binary-search.md	Binary search on value range replaces quickselect when the input is non-flat (sorted matrix, frequency table).
arrays.md	All quickselect variants operate on arrays; in-place mutation and index arithmetic are array-specific concerns.
two-pointers.md	3-way (DNF) partition uses a two-pointer / three-pointer technique internally.

Interview Reference

Must-Know Problems

Core Selection

Kth Largest Element in an Array (LC 215) — canonical quickselect problem; implement both quickselect and heap approaches
Kth Largest Element in a Stream (LC 703) — heap of size k; online variant
Find the Kth Smallest Element in a Sorted Matrix (LC 378) — binary search on value range
K Closest Points to Origin (LC 973) — quickselect on squared distance

Top-K Variants

Top K Frequent Elements (LC 347) — frequency map + quickselect or bucket sort
Top K Frequent Words (LC 692) — same pattern with lexicographic tiebreak
Sort Colors (LC 75) — 3-way DNF partition; foundational for understanding duplicate handling

Median / Percentile

Find Median from Data Stream (LC 295) — two-heap approach; not quickselect but tests same intuition
Wiggle Sort II (LC 324) — median via quickselect then 3-way partition for placement

Harder Applications

K-th Smallest Prime Fraction (LC 786) — binary search on answer with counting
Find K-th Smallest Pair Distance (LC 719) — binary search on answer + sliding window count
K-th Smallest in Multiplication Table (LC 668) — binary search on value + row-wise count

Clarification Questions to Ask

"Is the array 0-indexed or 1-indexed for k?" — determines the index used in quickselect.
"Can the array be modified?" — determines whether to copy before running in-place quickselect.
"Are there duplicate values?" — if yes, confirm the partition strategy handles them (3-way partition).
"Is input given as a stream or all at once?" — stream → heap; batch → quickselect.
"For 'median', what is the definition when n is even?" — average of two middles, lower middle, or upper middle.
"What are the constraints on k?" — 1 ≤ k ≤ n is typical but must be confirmed.

Common Interview Mistakes

Implementing quicksort instead of quickselect — recursing into both branches instead of one, giving O(n log n) instead of O(n).
Using a fixed pivot (first or last element) without randomisation — interviewers will ask about sorted input performance; answer must address O(n²) worst case.
Forgetting the base case if lo == hi: return arr[lo] — causes infinite recursion or index-out-of-range on single-element subarrays.
Off-by-one in k-th largest to k-th smallest conversion — using n - k vs n - k + 1 depending on 0- vs 1-indexing of k; derive it explicitly rather than guessing.
Assuming 2-way partition is sufficient for all inputs — failing to mention 3-way partition when asked about duplicate-heavy inputs shows incomplete understanding.
Claiming O(n) without qualifying "expected" — median-of-medians is O(n) worst case; standard quickselect is O(n) expected. Conflating these is a red flag.
Not knowing the heap alternative — interviewers frequently ask "what if you cannot modify the input?" or "what if data arrives in a stream?"; missing the heap answer loses points.

Typical Follow-Up Escalations

"Can you do it in O(n) worst case?" → Switch to median-of-medians; explain the group-of-5 pivot construction and the guaranteed 30/70 split.
"What if the array arrives as a stream?" → Use a min-heap of size k; insert each new element and pop root if heap exceeds k; O(log k) per element.
"What if there are many duplicates?" → Replace 2-way partition with 3-way (DNF) partition; the == region collapses all equal-to-pivot elements in one pass.
"Can you avoid modifying the input?" → Either copy the array first (O(n) space) or use the heap approach (O(k) space).
"Now return the k smallest in sorted order." → After quickselect places the boundary at index k, sort arr[:k]; total O(n + k log k).
"What if the input is a matrix?" → Binary search on value range [min, max]; count elements ≤ mid using a staircase walk; O(n log(max − min)).

Explore Unread

Great job! You've read all available articles

Quickselect

Core Concepts

Order statistic — the k-th smallest element in a collection when sorted; the 1st order statistic is the minimum, the n-th is the maximum. "k-th largest" maps to the (n − k + 1)-th smallest.

Selection problem — find the k-th order statistic without fully sorting; solvable in expected O(n) time vs. O(n log n) for sort.

k-th order statistic (0-indexed) — after partitioning, if the pivot lands at index p: if p == k, return pivot; if p > k, recurse left; if p < k, recurse right. Only one branch is ever explored.

Expected O(n) time — with random pivot selection, expected work is n + n/2 + n/4 + … = 2n. Worst case O(n²) when pivots are always extreme (already-sorted input with first/last pivot).

In-place selection — quickselect mutates the input array; the result is the value at index k after the algorithm finishes, not a separate output structure.

Introselect — hybrid: quickselect with random pivot until recursion depth exceeds a threshold, then falls back to median-of-medians to guarantee O(n) worst case. Used in C++ std::nth_element.

Types / Variants

Variant	What it is	When to use	Key property
Lomuto partition	Pivot = last element; single forward scan with swap counter	Simple implementation; teaching / interviews	Unstable; O(n²) on sorted input with fixed pivot
Hoare partition	Two pointers moving inward; swaps crossing pairs	Slightly fewer swaps on average than Lomuto	Pivot ends up in the middle zone, not at exact final index
3-way (DNF) partition	Splits into <, =, > regions using Dutch National Flag	Arrays with many duplicate values	Degrades to O(n) on all-equal inputs; avoids O(n²) duplicates trap
Randomised pivot	Random index selection before partitioning	Default choice for interviews and production	Expected O(n); defeats adversarial sorted inputs
Median-of-medians	Guaranteed O(n) worst case via recursive pivot selection	When worst-case guarantee is required (rare in interviews)	High constant factor; rarely faster in practice than randomised
Heap-based selection	Push all into a min/max heap or maintain a size-k heap	Online stream, repeated queries, or when input must not be mutated	O(n log k); simpler code; preferred when k << n and mutation is forbidden

Key Algorithms

Lomuto Partition

Scans left to right; maintains a boundary where everything left of it is ≤ pivot (pivot = last element). Returns final pivot index.

Time: O(n) per call. Space: O(1).

Key insight: the scan variable j walks forward unconditionally; i advances only on a swap, tracking the boundary.

# pivot = arr[hi]; i tracks boundary of ≤ pivot region
if arr[j] <= pivot: i += 1; arr[i], arr[j] = arr[j], arr[i]
arr[i+1], arr[hi] = arr[hi], arr[i+1]; return i + 1

Hoare Partition

Two pointers start at opposite ends and move inward, swapping elements that are on the wrong side. Does not place pivot at its final index — returns a split point.

Time: O(n) per call. Space: O(1).

3-Way Partition (Dutch National Flag)

Partitions into three contiguous regions: < pivot, == pivot, > pivot. Uses three pointers: lo (< boundary), mid (current), hi (> boundary).

Time: O(n) per call. Space: O(1).

Key insight: when all elements equal the pivot, the loop terminates in one pass with the entire array in the == region — O(n) total instead of O(n²).

lo, mid, hi = left, left, right
while mid <= hi:
    if arr[mid] < pivot: arr[lo],arr[mid]=arr[mid],arr[lo]; lo+=1; mid+=1
    elif arr[mid] > pivot: arr[mid],arr[hi]=arr[hi],arr[mid]; hi-=1
    else: mid+=1

Quickselect (Core Loop)

Repeatedly partitions and recurses into only the subarray containing index k.

Time: expected O(n) with random pivot; O(n²) worst case. Space: O(log n) call stack expected, O(n) worst.

Key insight: unlike quicksort, only one branch is ever explored after each partition, giving the geometric series convergence to O(n).

if lo == hi: return arr[lo]
p = partition(arr, lo, hi)  # random pivot swap before partitioning
if p == k: return arr[p]
elif p > k: return quickselect(arr, lo, p-1, k)
else: return quickselect(arr, p+1, hi, k)

Median-of-Medians

Guarantees O(n) worst-case by choosing a pivot that is at least in the 30th–70th percentile.

Time: O(n) worst case. Space: O(n) due to auxiliary group arrays (or O(log n) with in-place grouping).

The high constant factor (~10×) makes it slower than randomised quickselect in practice; use only when adversarial inputs or hard worst-case guarantees are required.

Heap-Based Selection

Build a max-heap of the first k elements, then scan the remaining n − k elements, replacing the heap root when a smaller element is found. Final heap root = k-th smallest.

Time: O(n log k). Space: O(k).

Key insight: maintaining a bounded heap is simpler, never mutates the input, and is O(n log k) — preferable when k is small (log k ≈ constant) or queries are online.

import heapq
heap = [-x for x in arr[:k]]
heapq.heapify(heap)  # max-heap of size k via negation
for x in arr[k:]:
    if x < -heap[0]: heapq.heapreplace(heap, -x)
return -heap[0]

See heap-priority-queue.md for full heap coverage.

nth_element (Library Shortcut)

Python does not have std::nth_element. Use sorted() or implement quickselect manually. In interviews, always implement quickselect from scratch unless told otherwise.

Advanced Techniques

Weighted k-th Smallest

Problem class: each element has a weight (frequency/count); find the element at cumulative weight position k.

Core mechanism: binary search on the value range or sorted candidates; for each candidate value v, count total weight of elements ≤ v in O(n); binary search over this monotone function.

When to prefer over quickselect: when the input is given as a frequency table or the "array" is implicitly defined (matrix, range), and explicit expansion is prohibitive.

Complexity: O(n log(max − min)) with binary search on value range.

Selection in a Row-and-Column Sorted Matrix

Problem class: find k-th smallest in an n × n matrix where rows and columns are sorted (LeetCode 378).

Core mechanism: binary search on value range [matrix[0][0], matrix[n-1][n-1]]; count elements ≤ mid using a staircase walk in O(n); narrow range until lo == hi.

When to prefer over quickselect: input is not a flat array; extracting to a flat array costs O(n²) space.

Complexity: O(n log(max − min)).

See binary-search.md for binary-search-on-answer pattern.

Introselect / Fallback Strategy

Problem class: production-grade selection where adversarial inputs exist.

Core mechanism: run randomised quickselect; track recursion depth; if depth exceeds 2 log n, switch to median-of-medians for the remaining subproblem.

When to prefer over pure quickselect: the input may be adversarial (e.g., competitive programming, security-sensitive contexts). In interviews, random pivot alone suffices.

Complexity: O(n) worst case.

Problem Taxonomy

By Problem Category

Problem Type	What it asks	Key technique
k-th smallest/largest in unsorted array	Single selection query, mutation allowed	Quickselect with random pivot
k-th smallest/largest, no mutation	Same, but input must stay intact	Copy + quickselect, or heap
k closest points to origin	Find k elements minimising some metric	Quickselect on metric values (partial sort)
Top-k frequent elements	k elements with highest frequency	Build frequency map, then quickselect on frequencies
Median of unsorted array	50th percentile statistic	Quickselect for index n//2 (and n//2−1 for even n)
k-th largest in a stream	Online / append-only input	Min-heap of size k; O(log k) per insert
k-th smallest in sorted matrix	Grid with row/column sorted order	Binary search on value range with count
Partial sort (return sorted top-k)	k smallest/largest in sorted order	Quickselect to position k, then sort left partition
k-th smallest pair sum / distance	Selection over implicit sorted list	Binary search on answer + counting function

Problem Signal → Technique

Signal in problem statement	Likely technique
"find k-th largest / smallest"	Quickselect or min-heap of size k
"return k closest / k nearest"	Quickselect on distance metric
"top k frequent"	Frequency map + quickselect or bucket sort
"median of unsorted"	Quickselect at index n//2
"stream" or "online" with k	Heap of size k, not quickselect
"no extra space"	In-place quickselect
"k-th smallest in matrix"	Binary search on value + row staircase count
"without sorting"	Quickselect (expected O(n))
"worst-case O(n)"	Median-of-medians or heap

Common Patterns

Pattern	When it applies	Mechanism
Random pivot swap before partition	All quickselect implementations	`swap(arr, random(lo, hi), hi)` before Lomuto; eliminates sorted-input worst case
k-th largest → k-th smallest mapping	Problem gives "k-th largest"	Use index `n - k` (0-indexed) for k-th smallest equivalent
Partial sort via quickselect	Need sorted top-k, not just the boundary	Quickselect to find position k, then sort `arr[:k]`
Heap of size k for streaming	Data arrives one element at a time	Maintain max-heap (k-th smallest) or min-heap (k-th largest) of exactly k elements
Binary search on answer with count	Implicit or non-flat input	`lo, hi = min_val, max_val`; count elements ≤ mid; shrink range
3-way partition guard	Input has many duplicates	Replace standard 2-way partition; prevents O(n²) on all-equal subarrays
Iterative quickselect	Avoid recursion stack overflow	Replace tail recursion: update `lo` or `hi` based on pivot position, loop

Edge Cases & Pitfalls

k out of range. If k < 1 or k > n, the algorithm accesses an invalid index or returns a wrong element. Always validate k before entering the selection loop.
All identical elements with 2-way partition. Lomuto partition places all equal elements on one side, causing O(n²) behaviour on uniform arrays. Fix: use 3-way (DNF) partition when duplicates are common, or at minimum check if all elements are equal before partitioning.
k-th largest vs k-th smallest index confusion. "k-th largest" in a 1-indexed problem maps to index n - k in 0-indexed quickselect. Off-by-one here returns the wrong element entirely. Confirm the convention at the top of implementation.
Hoare partition recurse bounds. After Hoare partition returns split point p, the left subproblem must include p: recurse on [lo, p] not [lo, p-1]. Using p-1 can skip the pivot-side element and produce an infinite loop or wrong answer.
Mutating caller's array. Quickselect rearranges the input in-place. If the caller expects the original order preserved, copy before calling. Forgetting this introduces subtle bugs in problems that reuse the array.
Random pivot with fixed seed. Using random.randint without seeding is fine for most LeetCode; but in tests that check determinism, the result is correct but undebuggable. Keep the randomisation but add a fixed seed during debugging.
Single-element subarray infinite recursion. When lo == hi, return immediately. Without this base case, partition is called on a 1-element range and may loop or produce an out-of-bounds index.
Median of even-length array. For even n, the median is typically the average of the two middle values (indices n//2 - 1 and n//2). Quickselect must be called twice (or once with 3-way partition that identifies the exact range). Returning only one index produces a wrong median.
Integer overflow when computing average for median. (a + b) // 2 can overflow in other languages; not an issue in Python (arbitrary precision integers), but the two-call quickselect approach avoids needing division at all.

Implementation Notes

Python's random.randint(lo, hi) is inclusive on both ends; use it to pick a random pivot index within [lo, hi].
Python has no built-in nth_element; heapq.nsmallest(k, arr) returns sorted top-k in O(n log k) and is acceptable when k is small, but not when O(n) is required.
heapq is a min-heap only; simulate a max-heap by negating values.
Python's default recursion limit is 1000; for arrays of size > 500 with adversarial inputs, the call stack may overflow. Use iterative quickselect or sys.setrecursionlimit.
random.shuffle before calling a non-randomised quickselect is equivalent to random pivot but shuffles the entire array — it's O(n) and acceptable for small n but changes the array order globally.
When using Lomuto partition, the pivot swap must happen before entering the scan loop, not after — placing it after the loop corrupts the scan.

Cross-Topic Relations

Related topic	Connection
sorting.md	Quickselect derives from quicksort; same partition logic, but recurses into only one branch. Partial sort uses quickselect + sort on the smaller partition.
heap-priority-queue.md	Heap-based selection (size-k heap) is the main alternative; preferred for streaming, small k, or when mutation is forbidden.
divide-and-conquer.md	Quickselect is a divide-and-conquer algorithm; median-of-medians applies recursive sub-problem decomposition.
binary-search.md	Binary search on value range replaces quickselect when the input is non-flat (sorted matrix, frequency table).
arrays.md	All quickselect variants operate on arrays; in-place mutation and index arithmetic are array-specific concerns.
two-pointers.md	3-way (DNF) partition uses a two-pointer / three-pointer technique internally.

Interview Reference

Must-Know Problems

Core Selection

Kth Largest Element in an Array (LC 215) — canonical quickselect problem; implement both quickselect and heap approaches
Kth Largest Element in a Stream (LC 703) — heap of size k; online variant
Find the Kth Smallest Element in a Sorted Matrix (LC 378) — binary search on value range
K Closest Points to Origin (LC 973) — quickselect on squared distance

Top-K Variants

Top K Frequent Elements (LC 347) — frequency map + quickselect or bucket sort
Top K Frequent Words (LC 692) — same pattern with lexicographic tiebreak
Sort Colors (LC 75) — 3-way DNF partition; foundational for understanding duplicate handling

Median / Percentile

Find Median from Data Stream (LC 295) — two-heap approach; not quickselect but tests same intuition
Wiggle Sort II (LC 324) — median via quickselect then 3-way partition for placement

Harder Applications

K-th Smallest Prime Fraction (LC 786) — binary search on answer with counting
Find K-th Smallest Pair Distance (LC 719) — binary search on answer + sliding window count
K-th Smallest in Multiplication Table (LC 668) — binary search on value + row-wise count

Clarification Questions to Ask

"Is the array 0-indexed or 1-indexed for k?" — determines the index used in quickselect.
"Can the array be modified?" — determines whether to copy before running in-place quickselect.
"Are there duplicate values?" — if yes, confirm the partition strategy handles them (3-way partition).
"Is input given as a stream or all at once?" — stream → heap; batch → quickselect.
"For 'median', what is the definition when n is even?" — average of two middles, lower middle, or upper middle.
"What are the constraints on k?" — 1 ≤ k ≤ n is typical but must be confirmed.

Common Interview Mistakes

Implementing quicksort instead of quickselect — recursing into both branches instead of one, giving O(n log n) instead of O(n).
Using a fixed pivot (first or last element) without randomisation — interviewers will ask about sorted input performance; answer must address O(n²) worst case.
Forgetting the base case if lo == hi: return arr[lo] — causes infinite recursion or index-out-of-range on single-element subarrays.
Off-by-one in k-th largest to k-th smallest conversion — using n - k vs n - k + 1 depending on 0- vs 1-indexing of k; derive it explicitly rather than guessing.
Assuming 2-way partition is sufficient for all inputs — failing to mention 3-way partition when asked about duplicate-heavy inputs shows incomplete understanding.
Claiming O(n) without qualifying "expected" — median-of-medians is O(n) worst case; standard quickselect is O(n) expected. Conflating these is a red flag.
Not knowing the heap alternative — interviewers frequently ask "what if you cannot modify the input?" or "what if data arrives in a stream?"; missing the heap answer loses points.

Typical Follow-Up Escalations

"Can you do it in O(n) worst case?" → Switch to median-of-medians; explain the group-of-5 pivot construction and the guaranteed 30/70 split.
"What if the array arrives as a stream?" → Use a min-heap of size k; insert each new element and pop root if heap exceeds k; O(log k) per element.
"What if there are many duplicates?" → Replace 2-way partition with 3-way (DNF) partition; the == region collapses all equal-to-pivot elements in one pass.
"Can you avoid modifying the input?" → Either copy the array first (O(n) space) or use the heap approach (O(k) space).
"Now return the k smallest in sorted order." → After quickselect places the boundary at index k, sort arr[:k]; total O(n + k log k).
"What if the input is a matrix?" → Binary search on value range [min, max]; count elements ≤ mid using a staircase walk; O(n log(max − min)).

Explore Unread

Great job! You've read all available articles

Quickselect

Core Concepts

Types / Variants

Key Algorithms

Lomuto Partition

Hoare Partition

3-Way Partition (Dutch National Flag)

Quickselect (Core Loop)

Median-of-Medians

Heap-Based Selection

nth_element (Library Shortcut)

Advanced Techniques

Weighted k-th Smallest

Selection in a Row-and-Column Sorted Matrix

Introselect / Fallback Strategy

Problem Taxonomy

By Problem Category

Problem Signal → Technique

Common Patterns

Edge Cases & Pitfalls

Implementation Notes

Cross-Topic Relations

Interview Reference

Must-Know Problems

Clarification Questions to Ask

Common Interview Mistakes

Typical Follow-Up Escalations

Read Next

Explore Unread

Quickselect

Core Concepts

Types / Variants

Key Algorithms

Lomuto Partition

Hoare Partition

3-Way Partition (Dutch National Flag)

Quickselect (Core Loop)

Median-of-Medians

Heap-Based Selection

nth_element (Library Shortcut)

Advanced Techniques

Weighted k-th Smallest

Selection in a Row-and-Column Sorted Matrix

Introselect / Fallback Strategy

Problem Taxonomy

By Problem Category

Problem Signal → Technique

Common Patterns

Edge Cases & Pitfalls

Implementation Notes

Cross-Topic Relations

Interview Reference

Must-Know Problems

Clarification Questions to Ask

Common Interview Mistakes

Typical Follow-Up Escalations

Read Next

Explore Unread