Merge Sort

Topic type: Algorithm Paradigm — a divide-and-conquer strategy centered on recursively splitting input, sorting or processing each half independently, and merging the results. On LeetCode the tag covers both the sorting algorithm and any problem where the merge-step structure is the core technique (inversion counting, cross-half pair counting, k-way merge).

LeetCode has approximately 60–80 problems tagged Merge Sort. Core algorithms, main patterns, and key complexities are covered at full depth; the counting-pair family and k-way merge are the techniques that unlock the harder problems.

Core Concepts

Divide — split the input at the midpoint into two halves of equal (or near-equal) size. The split itself carries no cost beyond computing a midpoint; no data is moved.

Conquer — recursively sort (or process) each half until a base case (size ≤ 1) is reached. Subproblems are fully independent: the left half and right half share no state.

Merge — combine two sorted sequences of length m and n into one sorted sequence of length m+n in O(m+n) time using two pointers. This is the step where all hard work is done; the divide step is trivial.

Recurrence — T(n) = 2T(n/2) + O(n). By the Master Theorem (a=2, b=2, d=1 → d = log_b a = 1), this resolves to O(n log n) time.

Merge buffer — auxiliary array of size n required to hold merged results before writing back; merge sort is not in-place. Space complexity is O(n).

Stability — equal elements from the left half are placed before equal elements from the right half during merge, preserving original relative order. Merge sort is a stable sort.

Inversion — a pair (i, j) with i < j but a[i] > a[j]. The merge step can count inversions in O(n log n) by observing that when a right-half element is placed before remaining left-half elements, each of those left-half elements forms an inversion with it.

Cross-half pairs — the general class: pairs (i, j) where i is in the left half and j is in the right half, and the pair satisfies some condition (e.g., a[i] > a[j], a[i] > 2·a[j], a[i] + a[j] > target). The sorted order of each half allows counting such pairs in O(n) per merge level rather than O(n²) brute force.

k-way merge — generalisation: merge k sorted sequences simultaneously using a min-heap. Used when input arrives as k sorted lists or when external sorting is needed. Time O(n log k) where n is total elements.

Types / Variants

Variant	What it does	When to use over alternatives
Top-down (recursive)	Split at midpoint, recurse, merge	Default; clear structure, easy to augment with counters
Bottom-up (iterative)	Merge windows of size 1, then 2, then 4, ...	When recursion depth is a concern (n > 10^4 in Python); avoids stack frames
Natural merge sort	Detects existing sorted runs instead of splitting at midpoint	Input is nearly sorted; O(n) best case, O(n log n) worst
In-place merge sort	Merges in-place using block rearrangement	Only when O(1) extra space is required; O(n log² n) or O(n log n) with complex implementation; rarely expected in interviews
k-way merge	Simultaneously merges k sorted sequences via a min-heap	Output of k sorted sources (e.g., k sorted arrays, external sort); reduces to pairwise merge only when k is small
Augmented merge (counting)	Standard top-down merge with a counter updated during the merge step	Inversion count, cross-half pair count; the augmentation does not increase asymptotic complexity

Key Algorithms

Standard Merge Sort (Top-Down)

Recursively split at the midpoint, sort each half, merge the two sorted halves using two pointers.

Key insight: merging two sorted arrays of sizes m and n takes exactly m+n steps; balanced splitting ensures log n levels, giving O(n log n) total work.

Time O(n log n) worst/average/best (unlike quicksort, no adversarial input exists). Space O(n) for the merge buffer plus O(log n) stack frames.

def merge(arr, tmp, lo, mid, hi):
    tmp[lo:hi+1] = arr[lo:hi+1]
    l, r = lo, mid + 1
    for k in range(lo, hi + 1):
        if l > mid: arr[k] = tmp[r]; r += 1
        elif r > hi or tmp[l] <= tmp[r]: arr[k] = tmp[l]; l += 1
        else: arr[k] = tmp[r]; r += 1

Allocate tmp once at the top level and pass it down; this avoids O(n log n) separate allocations.

Bottom-Up Merge Sort (Iterative)

Treat each element as a sorted run of size 1. Merge adjacent pairs to get runs of size 2, then 4, then 8, ..., until one run covers the whole array. No recursion; same O(n log n) time and O(n) space.

size = 1
while size < n:
    for lo in range(0, n, 2 * size):
        mid = min(lo + size - 1, n - 1)
        hi = min(lo + 2 * size - 1, n - 1)
        merge(arr, tmp, lo, mid, hi)
    size *= 2

Use when Python's recursion limit (1000 by default) would be hit for n > ~500.

Counting Inversions (Augmented Merge)

Augment the merge step: when a right-half element tmp[r] is placed before elements remaining in the left half (l through mid), exactly mid - l + 1 inversions are contributed — each remaining left element is larger than this right element and precedes it in the original array.

Key insight: both halves are sorted, so all remaining left elements are ≥ tmp[l] > tmp[r]; counting once per right-wins event is sufficient.

Time O(n log n), Space O(n). Inversion counts can reach n(n−1)/2 ≈ 5×10^9 for n = 10^5; use a 64-bit integer (Python ints are arbitrary precision).

elif tmp[l] <= tmp[r]: arr[k] = tmp[l]; l += 1
else: arr[k] = tmp[r]; r += 1; count += (mid - l + 1)

Counting Reverse Pairs (Modified Merge)

Count pairs (i, j) with i < j and a[i] > 2·a[j] (LeetCode 493). Cannot use the same loop as the standard merge because the condition (> 2·a[j]) is not the same condition used to decide merge order (> a[j]).

Key insight: use a separate pointer t to count valid right-half partners before the merge loop, then perform the merge separately. Both sub-passes are O(n) per level.

t = mid + 1
for i in range(lo, mid + 1):
    while t <= hi and arr[i] > 2 * arr[t]: t += 1
    count += t - (mid + 1)
# then perform normal merge

Time O(n log n), Space O(n).

Count of Smaller Numbers After Self (Merge Sort + Index Tracking)

For each element, count how many elements to its right are strictly smaller (LeetCode 315). Augment merge sort to operate on index-value pairs. When a right-element overtakes left elements during merge, each such left element gains (right pointer's offset) to its count.

Key insight: track original indices alongside values; "smaller number after self" is exactly an inversion from the perspective of the right element.

Time O(n log n), Space O(n).

Count of Range Sum (Merge Sort on Prefix Sums)

Count pairs (i, j) with lower ≤ prefix[j] − prefix[i] ≤ upper (LeetCode 327). Compute prefix sums, then on the sorted prefix array use two extra pointers (lo_ptr, hi_ptr) that slide monotonically as i advances through left-half elements to find the window of right-half elements satisfying the range condition.

Key insight: for a fixed left-half prefix[i], valid prefix[j] values form a contiguous window in the sorted right half; sorting both halves lets the window endpoints advance monotonically across all i, giving O(n) per level.

Time O(n log n), Space O(n).

k-Way Merge

Merge k sorted lists into one sorted output using a min-heap of size k. Push the first element of each list onto the heap. Repeatedly extract the minimum, append to output, and push the next element from the same list.

Key insight: the heap always holds the current minimum candidate from each live list; each element is pushed and popped exactly once.

Time O(n log k) where n = total elements across all lists. Space O(k) for the heap, O(n) for output.

import heapq
heap = [(lst[0], i, 0) for i, lst in enumerate(lists) if lst]
heapq.heapify(heap)

External Sort (Merge Phase)

When data exceeds memory: split into chunks that fit in RAM, sort each chunk independently, then use k-way merge to produce final sorted output. The merge phase runs in O(n log k) with O(k) heap memory regardless of total n.

Advanced Techniques

Merge-Step Cross-Half Counting (General Pattern)

Solves: counting pairs (i, j) with i in the left half and j in the right half satisfying an order-based condition that can be evaluated in O(n) per level given that both halves are sorted.

Core mechanism: after recursively processing both halves (which sorts them), use one or two extra monotone pointers that sweep the right half for each left element. Because both halves are sorted, pointer movement is monotone across left elements: the valid window only moves forward. Total pointer movement per level is O(n), giving O(n log n) overall.

When to use over brute force: when the pair condition is order-preserving — if a[i] satisfies the condition against some a[j], it either does or doesn't against all a[j'] beyond j (monotone in j). For conditions that are not monotone in the sorted order, this technique does not apply; use a hash/sorted-container approach.

When to use over BIT/segment tree + coordinate compression: both solve the same family. Merge-step counting is O(n log n) time/space and avoids coordinate compression; BIT approach is also O(n log n) but requires coordinate compression setup and handles online queries. For offline batch counting, either works; merge sort is simpler to implement correctly without extra data structures.

Time O(n log n), Space O(n).

Linked List Merge Sort

Solves: sorting a singly linked list in O(n log n) with O(log n) space (only stack frames; no array buffer needed).

Core mechanism: find the midpoint using slow/fast pointers, split the list by setting slow.next = None, recurse on both halves, merge by relinking nodes. The merge buffer trick for arrays is unnecessary because linking nodes doesn't require extra space.

When to use over array merge sort: when input is a linked list and O(1) extra space (beyond stack) is required. For O(1) total space, use bottom-up iterative linked-list merge sort to eliminate the O(log n) stack.

Time O(n log n), Space O(log n) recursive / O(1) iterative.

Problem Taxonomy

By Problem Category

Category	What it asks	Key technique
Sort a collection	Produce sorted output from unsorted input	Standard merge sort
Count inversions	How many pairs (i<j) are out of order	Augmented merge: count on right-wins
Count reverse pairs	Pairs where a[i] > k·a[j], j > i	Separate count pass + merge
Count range sums	Pairs of prefix sums whose difference lies in [lo, hi]	Merge sort on prefix array with sliding window pointers
Count smaller after self	For each element, count elements to its right that are smaller	Index-tracked merge sort
Sort linked list	Sort a linked list in O(n log n)	Linked list merge sort with slow/fast pointer split
Merge k sorted inputs	Combine k sorted lists/arrays into one	k-way merge with min-heap
Median of two sorted arrays	Find the median of merged sorted inputs without full merge	Binary search on shorter array (not merge sort, but related to merge intuition)

Problem Signal → Technique

Signal in problem	Likely technique
"count inversions" / "minimum swaps to sort"	Augmented merge sort
"reverse pairs" / "important reverse pairs" / a[i] > k·a[j]	Separate-count augmented merge
"count range sum" / prefix sum pairs in range	Merge sort on prefix array
"count of smaller numbers after self"	Index-tracked merge sort
"sort a linked list"	Linked list merge sort
"merge k sorted lists / arrays"	k-way merge with min-heap
"kth smallest in k sorted arrays"	k-way merge or binary search on answer
sorted input arriving in chunks / streams	k-way merge / external sort pattern

Common Patterns

Pattern	When it applies	Mechanism
Allocate one buffer at top level	Any recursive merge sort	Create `tmp = arr[:]` once; pass as parameter to all recursive calls to avoid O(n log n) allocations
Index-value pair sorting	Counting problems that need original positions	Wrap elements as `(value, original_index)` tuples; sort pairs; update a result array by index
Separate count pass before merge	Pair conditions that differ from merge order (e.g., a[i] > 2·a[j])	Count valid j for each i using a separate monotone pointer; then perform normal merge for sorting
Two-pointer window on right half	Count pairs where condition is a range (e.g., lower ≤ diff ≤ upper)	Maintain lo_ptr and hi_ptr into right half; advance monotonically as left pointer moves
Slow/fast pointer split for linked list	Finding midpoint of linked list before merge	Advance slow by 1 and fast by 2; when fast reaches end, slow is at midpoint
Bottom-up size doubling	Iterative merge sort to avoid recursion	Outer loop over sizes 1,2,4,8,...; inner loop over windows of size 2·size

Edge Cases & Pitfalls

Array slicing in recursive calls: arr[:mid] creates an O(n) copy per call, making merge sort O(n² log n). Always pass (lo, mid, hi) indices into the same array; use a pre-allocated buffer for the merge.
Inversion count overflow: for n = 10^5, the maximum inversion count is n(n−1)/2 ≈ 5×10^9, exceeding 32-bit integer range. Python handles this automatically; in Java/C++ use long.
Off-by-one in midpoint: use mid = lo + (hi - lo) // 2 to prevent overflow in fixed-width languages. (lo + hi) // 2 is safe in Python but a habit to avoid.
Counting right-wins at the wrong point: the inversion count mid - l + 1 must be added when tmp[l] > tmp[r], not when tmp[l] >= tmp[r]. Adding it for equal elements over-counts; equal elements are not inversions.
Reverse pairs: using same loop for count and merge: the condition for counting (a[i] > 2·a[j]) is different from the merge condition (a[i] > a[j]). Sharing one loop produces wrong counts. Always separate the count pass from the merge pass.
k-way merge: heap tuple ordering: when values are equal, the heap will try to compare the second tuple element. Store (value, list_index, element_index) as the heap entry; list_index as a tiebreaker prevents comparison errors on non-comparable value types.
Linked list split: missing null termination: after finding the midpoint with slow/fast pointers, slow.next must be set to None before recursing. Without this, the left sublist still points into the right sublist, causing infinite recursion.
Bottom-up: window boundary clamping: hi = min(lo + 2*size - 1, n - 1) and mid = min(lo + size - 1, n - 1) are required; without clamping, the last window may exceed the array bounds or have mid >= hi (empty right half), causing an incorrect merge.
Natural merge sort: not detecting all runs: runs must be detected as maximal non-decreasing sequences. Treating a strictly increasing condition as the run boundary misses equal elements and produces O(n log n) behavior even on already-sorted input.
Stability in augmented sorts: when wrapping elements as (value, index) pairs, the comparator must use <= (not <) for the value comparison in the merge step to preserve stability and correctness of the inversion count.

Implementation Notes

Python's default recursion limit is 1000; naive top-down merge sort fails for n > ~500 without sys.setrecursionlimit. Use bottom-up iterative merge sort for large inputs.
Python's list.sort() and sorted() are Timsort — stable, O(n log n) worst case, O(n) for nearly-sorted. Always prefer built-in sort for pure sorting; implement merge sort only when the merge step carries additional logic.
heapq in Python is a min-heap; for k-way merge of descending-sorted lists, negate values or use a key wrapper. heapq.merge(*iterables) is a lazy k-way merge iterator (O(1) extra space, O(log k) per step) but requires already-sorted iterables.
When sorting index-value pairs, tuples compare lexicographically: (val, idx) pairs sort by value first, then by index as a stable tiebreaker. This tiebreaker matters if val values are equal and downstream logic depends on original order.
In the k-way merge heap entry (val, list_idx, elem_idx), list_idx must be an integer tiebreaker if val is a type that does not support comparison (e.g., custom objects without __lt__); without a tiebreaker the heap will raise a TypeError.
Merge sort on a linked list avoids the O(n) buffer required for array merge sort; the only extra space is O(log n) stack frames (or O(1) with bottom-up iterative).

Cross-Topic Relations

Related topic	Connection
divide-and-conquer.md	Merge sort is the canonical divide-and-conquer sorting algorithm; the merge-step counting technique generalises to all cross-half pair counting problems covered there
sorting.md	Covers merge sort in the context of all sorting algorithms; quicksort, heap sort, counting sort, and stability tradeoffs
heap-priority-queue.md	k-way merge uses a min-heap as its core data structure; heap sort is an alternative O(n log n) in-place sort
linked-list.md	Linked list merge sort requires slow/fast pointer midpoint finding and pointer relinking instead of index arithmetic
prefix-sum.md	Count of Range Sum (LeetCode 327) applies merge sort to a prefix sum array; prefix sums are the preprocessing step
binary-search.md	Median of Two Sorted Arrays solves a merge-related problem more efficiently with binary search; understanding why merge is O(n) motivates the O(log n) binary search approach
two-pointers.md	The merge step itself is a two-pointer technique; cross-half counting windows are two-pointer scans on sorted halves

Interview Reference

Must-Know Problems

Core sorting

Sort an Array — LeetCode 912 (implement merge sort from scratch)
Sort List — LeetCode 148 (linked list merge sort)

Inversion / cross-half counting

Count of Smaller Numbers After Self — LeetCode 315 (Medium/Hard)
Reverse Pairs — LeetCode 493 (Hard)
Count of Range Sum — LeetCode 327 (Hard)
Count Inversions (classic; also on GFG / Striver A2Z sheet)

k-way merge

Merge K Sorted Lists — LeetCode 23 (Hard)
Kth Smallest Element in a Sorted Matrix — LeetCode 378 (Medium)
Find K Pairs with Smallest Sums — LeetCode 373 (Medium)
Smallest Range Covering Elements from K Lists — LeetCode 632 (Hard)

Related merge applications

Merge Sorted Array — LeetCode 88 (Easy; core merge step in isolation)
Find Median from Data Stream — LeetCode 295 (uses merge intuition; solved with two heaps)

Clarification Questions to Ask

Are there duplicate values? Do duplicates count as inversions or as separate pairs? (Affects the <= vs < boundary in the merge condition.)
What is the value range? (Determines if inversion count can overflow 32-bit int.)
For pair-counting problems: are pairs (i, j) ordered (i < j required) or unordered?
For k-way merge: are input lists guaranteed sorted? Can lists be empty?
For linked list sort: is the list singly or doubly linked? (Affects the split and merge implementation.)
Is in-place sorting required, or is O(n) extra space acceptable?

Common Interview Mistakes

Slicing the array in recursive calls (arr[:mid], arr[mid:]) instead of passing indices, accidentally making the solution O(n² log n).
Using the same loop for counting and merging in the reverse pairs variant, producing an incorrect count when the count condition differs from the merge condition.
Forgetting slow.next = None when splitting a linked list, causing infinite recursion because the left sublist still links into the right.
Adding mid - l + 1 to the inversion count when elements are equal (tmp[l] == tmp[r]), over-counting; equal elements are not inversions.
Not accounting for heap tuple comparison errors in k-way merge when values are equal; omitting an integer tiebreaker causes TypeError on custom types or non-deterministic ordering.
Forgetting boundary clamping (min(..., n-1)) in the bottom-up iterative variant, leading to index-out-of-bounds or merging an empty right half.
Claiming O(n) space is avoidable in merge sort without qualification; in-place merge sort is O(n log n) or O(n log² n) time and complex; the interviewer almost certainly wants O(n) space acknowledged upfront.

Typical Follow-Up Escalations

"Implement merge sort" → "Now count the number of inversions while sorting" → "Now count reverse pairs where arr[i] > 2·arr[j]"
"Sort this linked list" → "Now do it without recursion (O(1) extra space)" → "Bottom-up iterative linked list merge sort"
"Merge k sorted lists" → "Now find the k-th smallest element across k sorted arrays" → "Now find the smallest range that includes one element from each of k sorted lists"
"Count of Smaller Numbers After Self" → "How would you handle it if the array were streaming (online)?" → Sorted container (SortedList from sortedcontainers) with binary search for O(log n) per element
"Your merge sort uses O(n) extra space — can you do O(1)?" → Acknowledge in-place merge sort exists but is O(n log n) or O(n log² n) with high constant; in practice, O(n) is accepted for merge sort

Explore Unread

Great job! You've read all available articles

Merge Sort

Core Concepts

Divide — split the input at the midpoint into two halves of equal (or near-equal) size. The split itself carries no cost beyond computing a midpoint; no data is moved.

Conquer — recursively sort (or process) each half until a base case (size ≤ 1) is reached. Subproblems are fully independent: the left half and right half share no state.

Recurrence — T(n) = 2T(n/2) + O(n). By the Master Theorem (a=2, b=2, d=1 → d = log_b a = 1), this resolves to O(n log n) time.

Merge buffer — auxiliary array of size n required to hold merged results before writing back; merge sort is not in-place. Space complexity is O(n).

Stability — equal elements from the left half are placed before equal elements from the right half during merge, preserving original relative order. Merge sort is a stable sort.

Types / Variants

Variant	What it does	When to use over alternatives
Top-down (recursive)	Split at midpoint, recurse, merge	Default; clear structure, easy to augment with counters
Bottom-up (iterative)	Merge windows of size 1, then 2, then 4, ...	When recursion depth is a concern (n > 10^4 in Python); avoids stack frames
Natural merge sort	Detects existing sorted runs instead of splitting at midpoint	Input is nearly sorted; O(n) best case, O(n log n) worst
In-place merge sort	Merges in-place using block rearrangement	Only when O(1) extra space is required; O(n log² n) or O(n log n) with complex implementation; rarely expected in interviews
k-way merge	Simultaneously merges k sorted sequences via a min-heap	Output of k sorted sources (e.g., k sorted arrays, external sort); reduces to pairwise merge only when k is small
Augmented merge (counting)	Standard top-down merge with a counter updated during the merge step	Inversion count, cross-half pair count; the augmentation does not increase asymptotic complexity

Key Algorithms

Standard Merge Sort (Top-Down)

Recursively split at the midpoint, sort each half, merge the two sorted halves using two pointers.

Key insight: merging two sorted arrays of sizes m and n takes exactly m+n steps; balanced splitting ensures log n levels, giving O(n log n) total work.

Time O(n log n) worst/average/best (unlike quicksort, no adversarial input exists). Space O(n) for the merge buffer plus O(log n) stack frames.

def merge(arr, tmp, lo, mid, hi):
    tmp[lo:hi+1] = arr[lo:hi+1]
    l, r = lo, mid + 1
    for k in range(lo, hi + 1):
        if l > mid: arr[k] = tmp[r]; r += 1
        elif r > hi or tmp[l] <= tmp[r]: arr[k] = tmp[l]; l += 1
        else: arr[k] = tmp[r]; r += 1

Allocate tmp once at the top level and pass it down; this avoids O(n log n) separate allocations.

Bottom-Up Merge Sort (Iterative)

Treat each element as a sorted run of size 1. Merge adjacent pairs to get runs of size 2, then 4, then 8, ..., until one run covers the whole array. No recursion; same O(n log n) time and O(n) space.

size = 1
while size < n:
    for lo in range(0, n, 2 * size):
        mid = min(lo + size - 1, n - 1)
        hi = min(lo + 2 * size - 1, n - 1)
        merge(arr, tmp, lo, mid, hi)
    size *= 2

Use when Python's recursion limit (1000 by default) would be hit for n > ~500.

Counting Inversions (Augmented Merge)

Key insight: both halves are sorted, so all remaining left elements are ≥ tmp[l] > tmp[r]; counting once per right-wins event is sufficient.

Time O(n log n), Space O(n). Inversion counts can reach n(n−1)/2 ≈ 5×10^9 for n = 10^5; use a 64-bit integer (Python ints are arbitrary precision).

elif tmp[l] <= tmp[r]: arr[k] = tmp[l]; l += 1
else: arr[k] = tmp[r]; r += 1; count += (mid - l + 1)

Counting Reverse Pairs (Modified Merge)

Key insight: use a separate pointer t to count valid right-half partners before the merge loop, then perform the merge separately. Both sub-passes are O(n) per level.

t = mid + 1
for i in range(lo, mid + 1):
    while t <= hi and arr[i] > 2 * arr[t]: t += 1
    count += t - (mid + 1)
# then perform normal merge

Time O(n log n), Space O(n).

Count of Smaller Numbers After Self (Merge Sort + Index Tracking)

Key insight: track original indices alongside values; "smaller number after self" is exactly an inversion from the perspective of the right element.

Time O(n log n), Space O(n).

Count of Range Sum (Merge Sort on Prefix Sums)

Time O(n log n), Space O(n).

k-Way Merge

Key insight: the heap always holds the current minimum candidate from each live list; each element is pushed and popped exactly once.

Time O(n log k) where n = total elements across all lists. Space O(k) for the heap, O(n) for output.

import heapq
heap = [(lst[0], i, 0) for i, lst in enumerate(lists) if lst]
heapq.heapify(heap)

External Sort (Merge Phase)

Advanced Techniques

Merge-Step Cross-Half Counting (General Pattern)

Solves: counting pairs (i, j) with i in the left half and j in the right half satisfying an order-based condition that can be evaluated in O(n) per level given that both halves are sorted.

Time O(n log n), Space O(n).

Linked List Merge Sort

Solves: sorting a singly linked list in O(n log n) with O(log n) space (only stack frames; no array buffer needed).

Time O(n log n), Space O(log n) recursive / O(1) iterative.

Problem Taxonomy

By Problem Category

Category	What it asks	Key technique
Sort a collection	Produce sorted output from unsorted input	Standard merge sort
Count inversions	How many pairs (i<j) are out of order	Augmented merge: count on right-wins
Count reverse pairs	Pairs where a[i] > k·a[j], j > i	Separate count pass + merge
Count range sums	Pairs of prefix sums whose difference lies in [lo, hi]	Merge sort on prefix array with sliding window pointers
Count smaller after self	For each element, count elements to its right that are smaller	Index-tracked merge sort
Sort linked list	Sort a linked list in O(n log n)	Linked list merge sort with slow/fast pointer split
Merge k sorted inputs	Combine k sorted lists/arrays into one	k-way merge with min-heap
Median of two sorted arrays	Find the median of merged sorted inputs without full merge	Binary search on shorter array (not merge sort, but related to merge intuition)

Problem Signal → Technique

Signal in problem	Likely technique
"count inversions" / "minimum swaps to sort"	Augmented merge sort
"reverse pairs" / "important reverse pairs" / a[i] > k·a[j]	Separate-count augmented merge
"count range sum" / prefix sum pairs in range	Merge sort on prefix array
"count of smaller numbers after self"	Index-tracked merge sort
"sort a linked list"	Linked list merge sort
"merge k sorted lists / arrays"	k-way merge with min-heap
"kth smallest in k sorted arrays"	k-way merge or binary search on answer
sorted input arriving in chunks / streams	k-way merge / external sort pattern

Common Patterns

Pattern	When it applies	Mechanism
Allocate one buffer at top level	Any recursive merge sort	Create `tmp = arr[:]` once; pass as parameter to all recursive calls to avoid O(n log n) allocations
Index-value pair sorting	Counting problems that need original positions	Wrap elements as `(value, original_index)` tuples; sort pairs; update a result array by index
Separate count pass before merge	Pair conditions that differ from merge order (e.g., a[i] > 2·a[j])	Count valid j for each i using a separate monotone pointer; then perform normal merge for sorting
Two-pointer window on right half	Count pairs where condition is a range (e.g., lower ≤ diff ≤ upper)	Maintain lo_ptr and hi_ptr into right half; advance monotonically as left pointer moves
Slow/fast pointer split for linked list	Finding midpoint of linked list before merge	Advance slow by 1 and fast by 2; when fast reaches end, slow is at midpoint
Bottom-up size doubling	Iterative merge sort to avoid recursion	Outer loop over sizes 1,2,4,8,...; inner loop over windows of size 2·size

Edge Cases & Pitfalls

Array slicing in recursive calls: arr[:mid] creates an O(n) copy per call, making merge sort O(n² log n). Always pass (lo, mid, hi) indices into the same array; use a pre-allocated buffer for the merge.
Inversion count overflow: for n = 10^5, the maximum inversion count is n(n−1)/2 ≈ 5×10^9, exceeding 32-bit integer range. Python handles this automatically; in Java/C++ use long.
Off-by-one in midpoint: use mid = lo + (hi - lo) // 2 to prevent overflow in fixed-width languages. (lo + hi) // 2 is safe in Python but a habit to avoid.
Counting right-wins at the wrong point: the inversion count mid - l + 1 must be added when tmp[l] > tmp[r], not when tmp[l] >= tmp[r]. Adding it for equal elements over-counts; equal elements are not inversions.
Reverse pairs: using same loop for count and merge: the condition for counting (a[i] > 2·a[j]) is different from the merge condition (a[i] > a[j]). Sharing one loop produces wrong counts. Always separate the count pass from the merge pass.
k-way merge: heap tuple ordering: when values are equal, the heap will try to compare the second tuple element. Store (value, list_index, element_index) as the heap entry; list_index as a tiebreaker prevents comparison errors on non-comparable value types.
Linked list split: missing null termination: after finding the midpoint with slow/fast pointers, slow.next must be set to None before recursing. Without this, the left sublist still points into the right sublist, causing infinite recursion.
Bottom-up: window boundary clamping: hi = min(lo + 2*size - 1, n - 1) and mid = min(lo + size - 1, n - 1) are required; without clamping, the last window may exceed the array bounds or have mid >= hi (empty right half), causing an incorrect merge.
Natural merge sort: not detecting all runs: runs must be detected as maximal non-decreasing sequences. Treating a strictly increasing condition as the run boundary misses equal elements and produces O(n log n) behavior even on already-sorted input.
Stability in augmented sorts: when wrapping elements as (value, index) pairs, the comparator must use <= (not <) for the value comparison in the merge step to preserve stability and correctness of the inversion count.

Implementation Notes

Python's default recursion limit is 1000; naive top-down merge sort fails for n > ~500 without sys.setrecursionlimit. Use bottom-up iterative merge sort for large inputs.
Python's list.sort() and sorted() are Timsort — stable, O(n log n) worst case, O(n) for nearly-sorted. Always prefer built-in sort for pure sorting; implement merge sort only when the merge step carries additional logic.
heapq in Python is a min-heap; for k-way merge of descending-sorted lists, negate values or use a key wrapper. heapq.merge(*iterables) is a lazy k-way merge iterator (O(1) extra space, O(log k) per step) but requires already-sorted iterables.
When sorting index-value pairs, tuples compare lexicographically: (val, idx) pairs sort by value first, then by index as a stable tiebreaker. This tiebreaker matters if val values are equal and downstream logic depends on original order.
In the k-way merge heap entry (val, list_idx, elem_idx), list_idx must be an integer tiebreaker if val is a type that does not support comparison (e.g., custom objects without __lt__); without a tiebreaker the heap will raise a TypeError.
Merge sort on a linked list avoids the O(n) buffer required for array merge sort; the only extra space is O(log n) stack frames (or O(1) with bottom-up iterative).

Cross-Topic Relations

Related topic	Connection
divide-and-conquer.md	Merge sort is the canonical divide-and-conquer sorting algorithm; the merge-step counting technique generalises to all cross-half pair counting problems covered there
sorting.md	Covers merge sort in the context of all sorting algorithms; quicksort, heap sort, counting sort, and stability tradeoffs
heap-priority-queue.md	k-way merge uses a min-heap as its core data structure; heap sort is an alternative O(n log n) in-place sort
linked-list.md	Linked list merge sort requires slow/fast pointer midpoint finding and pointer relinking instead of index arithmetic
prefix-sum.md	Count of Range Sum (LeetCode 327) applies merge sort to a prefix sum array; prefix sums are the preprocessing step
binary-search.md	Median of Two Sorted Arrays solves a merge-related problem more efficiently with binary search; understanding why merge is O(n) motivates the O(log n) binary search approach
two-pointers.md	The merge step itself is a two-pointer technique; cross-half counting windows are two-pointer scans on sorted halves

Interview Reference

Must-Know Problems

Core sorting

Sort an Array — LeetCode 912 (implement merge sort from scratch)
Sort List — LeetCode 148 (linked list merge sort)

Inversion / cross-half counting

Count of Smaller Numbers After Self — LeetCode 315 (Medium/Hard)
Reverse Pairs — LeetCode 493 (Hard)
Count of Range Sum — LeetCode 327 (Hard)
Count Inversions (classic; also on GFG / Striver A2Z sheet)

k-way merge

Merge K Sorted Lists — LeetCode 23 (Hard)
Kth Smallest Element in a Sorted Matrix — LeetCode 378 (Medium)
Find K Pairs with Smallest Sums — LeetCode 373 (Medium)
Smallest Range Covering Elements from K Lists — LeetCode 632 (Hard)

Related merge applications

Merge Sorted Array — LeetCode 88 (Easy; core merge step in isolation)
Find Median from Data Stream — LeetCode 295 (uses merge intuition; solved with two heaps)

Clarification Questions to Ask

Are there duplicate values? Do duplicates count as inversions or as separate pairs? (Affects the <= vs < boundary in the merge condition.)
What is the value range? (Determines if inversion count can overflow 32-bit int.)
For pair-counting problems: are pairs (i, j) ordered (i < j required) or unordered?
For k-way merge: are input lists guaranteed sorted? Can lists be empty?
For linked list sort: is the list singly or doubly linked? (Affects the split and merge implementation.)
Is in-place sorting required, or is O(n) extra space acceptable?

Common Interview Mistakes

Slicing the array in recursive calls (arr[:mid], arr[mid:]) instead of passing indices, accidentally making the solution O(n² log n).
Using the same loop for counting and merging in the reverse pairs variant, producing an incorrect count when the count condition differs from the merge condition.
Forgetting slow.next = None when splitting a linked list, causing infinite recursion because the left sublist still links into the right.
Adding mid - l + 1 to the inversion count when elements are equal (tmp[l] == tmp[r]), over-counting; equal elements are not inversions.
Not accounting for heap tuple comparison errors in k-way merge when values are equal; omitting an integer tiebreaker causes TypeError on custom types or non-deterministic ordering.
Forgetting boundary clamping (min(..., n-1)) in the bottom-up iterative variant, leading to index-out-of-bounds or merging an empty right half.
Claiming O(n) space is avoidable in merge sort without qualification; in-place merge sort is O(n log n) or O(n log² n) time and complex; the interviewer almost certainly wants O(n) space acknowledged upfront.

Typical Follow-Up Escalations

"Implement merge sort" → "Now count the number of inversions while sorting" → "Now count reverse pairs where arr[i] > 2·arr[j]"
"Sort this linked list" → "Now do it without recursion (O(1) extra space)" → "Bottom-up iterative linked list merge sort"
"Merge k sorted lists" → "Now find the k-th smallest element across k sorted arrays" → "Now find the smallest range that includes one element from each of k sorted lists"
"Count of Smaller Numbers After Self" → "How would you handle it if the array were streaming (online)?" → Sorted container (SortedList from sortedcontainers) with binary search for O(log n) per element
"Your merge sort uses O(n) extra space — can you do O(1)?" → Acknowledge in-place merge sort exists but is O(n log n) or O(n log² n) with high constant; in practice, O(n) is accepted for merge sort

Explore Unread

Great job! You've read all available articles

Merge Sort

Core Concepts

Types / Variants

Key Algorithms

Standard Merge Sort (Top-Down)

Bottom-Up Merge Sort (Iterative)

Counting Inversions (Augmented Merge)

Counting Reverse Pairs (Modified Merge)

Count of Smaller Numbers After Self (Merge Sort + Index Tracking)

Count of Range Sum (Merge Sort on Prefix Sums)

k-Way Merge

External Sort (Merge Phase)

Advanced Techniques

Merge-Step Cross-Half Counting (General Pattern)

Linked List Merge Sort

Problem Taxonomy

By Problem Category

Problem Signal → Technique

Common Patterns

Edge Cases & Pitfalls

Implementation Notes

Cross-Topic Relations

Interview Reference

Must-Know Problems

Clarification Questions to Ask

Common Interview Mistakes

Typical Follow-Up Escalations

Read Next

Explore Unread

Merge Sort

Core Concepts

Types / Variants

Key Algorithms

Standard Merge Sort (Top-Down)

Bottom-Up Merge Sort (Iterative)

Counting Inversions (Augmented Merge)

Counting Reverse Pairs (Modified Merge)

Count of Smaller Numbers After Self (Merge Sort + Index Tracking)

Count of Range Sum (Merge Sort on Prefix Sums)

k-Way Merge

External Sort (Merge Phase)

Advanced Techniques

Merge-Step Cross-Half Counting (General Pattern)

Linked List Merge Sort

Problem Taxonomy

By Problem Category

Problem Signal → Technique

Common Patterns

Edge Cases & Pitfalls

Implementation Notes

Cross-Topic Relations

Interview Reference

Must-Know Problems

Clarification Questions to Ask

Common Interview Mistakes

Typical Follow-Up Escalations

Read Next

Explore Unread