Bitmask
This file covers bitmask as a subset representation and state-compression technique. For bitwise operations (XOR tricks, popcount, linear basis), see bit-manipulation.md.
Core Concepts
Bitmask — an integer whose k-th bit encodes membership of element k in a set; a compact O(1)-space representation of any subset of an n-element universe (n ≤ 30 typical, 64-bit for n ≤ 60).
Universe — the set of n distinct elements, indexed 0 to n−1; all masks are integers in [0, 2^n − 1].
Set operations via bitwise:
- Union:
A | B - Intersection:
A & B - Difference (A \ B):
A & ~B; in Python mask with& ((1 << n) - 1)after NOT - Complement:
FULL ^ AwhereFULL = (1 << n) - 1 - Membership of element i:
(mask >> i) & 1 - Insert element i:
mask | (1 << i) - Remove element i:
mask & ~(1 << i) - Cardinality:
bin(mask).count('1')
Subset ordering — mask A ⊆ mask B iff (A & B) == A; equivalently A | B == B. The subset lattice has 2^n nodes and 3^n edges (each element is in A only, B only, or both).
State compression — encoding a combinatorial state (e.g., "which cities have been visited") as a single integer, enabling it to serve as a DP table index or a graph node.
Full mask — (1 << n) - 1; represents the full universe; checking mask == FULL tests whether all elements are included.
Lowest set bit (LSB) — the rightmost 1-bit; mask & -mask; useful for iterating elements one at a time: pop with mask ^ (mask & -mask) or mask & (mask - 1).
Structural Invariants
| Invariant | Why it matters |
|---|---|
| All masks ∈ [0, 2^n − 1] | Accessing dp[mask] out of this range corrupts memory; always pre-check n |
If A ⊆ B then (A & B) == A | The subset containment test; breaking this means the mask doesn't encode a true subset |
Adding an element already in the mask is idempotent (mask | (1<<i)) | Unlike a list, double-insertion has no effect; safe to OR without checking membership |
| Removing an element not in the mask is safe | mask & ~(1<<i) is always safe; no bounds issue |
| SOS DP requires processing dimension by dimension | Switching loop order corrupts the prefix-subset accumulation |
Types / Variants
| Variant | What it is | When to use |
|---|---|---|
| Bitmask DP | DP state includes a bitmask; dp[mask] or dp[mask][i] | n ≤ 20; assignment, matching, TSP, covering problems |
| SOS DP (Sum over Subsets) | For each mask, aggregate f[sub] over all subsets sub ⊆ mask | Counting/summing properties of all subsets in O(n · 2^n) instead of O(3^n) |
| Profile DP | Bitmask represents the "frontier" between processed and unprocessed rows of a grid | Tiling, grid-filling, broken-profile DP |
| Bitmask BFS / Dijkstra | Graph node = (position, visited_set); bitmask encodes visited set | "Visit all nodes" shortest path (Hamiltonian path variants) |
| Meet-in-the-middle bitmask | Split n items into two halves; enumerate 2^(n/2) each | n up to 40; when full 2^n is infeasible |
Key Algorithms
Enumerate All Subsets of a Mask
sub = mask
while sub:
# process sub
sub = (sub - 1) & mask
Iterates all non-empty subsets of mask in decreasing order. The empty subset (sub = 0) is not processed in the loop; handle separately if needed. Total work across all masks of n bits: O(3^n).
Iterate Elements of a Mask One at a Time
while mask:
lsb = mask & -mask # isolate lowest set bit
i = lsb.bit_length() - 1 # index of that bit
mask ^= lsb # remove it
O(popcount) per mask. Use when transitions process one new element at a time.
Enumerate All Subsets of Size k (Gosper's Hack)
mask = (1 << k) - 1
while mask < (1 << n):
# process mask
c = mask & -mask
r = mask + c
mask = (((r ^ mask) >> 2) // c) | r
Produces every n-bit integer with exactly k set bits in ascending order. O(1) per step; C(n,k) total iterations.
Bitmask DP (Basic Template — Assignment / TSP)
State dp[mask][i] = optimal cost when the subset mask has been processed and the last element handled was i.
Transition: for each mask and last element i ∈ mask, try extending to element j ∉ mask:
dp[mask | (1 << j)][j] = min(dp[mask | (1 << j)][j], dp[mask][i] + cost[i][j])
Base: dp[1 << i][i] = start_cost[i] for each starting element i.
Answer: min(dp[FULL][i] + return_cost[i] for i in range(n)).
Complexity: O(2^n · n²) time, O(2^n · n) space.
Bitmask DP (Subset Coverage — Assign Items to Groups)
State dp[mask] = minimum resources to cover exactly the subset mask.
Transition: pick the lowest unset element j not in mask; try every valid group g that covers j; extend with dp[mask | g].
j = lowest unset bit of ~mask (within universe)
for each valid group g containing j:
dp[mask | g] = min(dp[mask | g], dp[mask] + 1)
Complexity: O(2^n · 2^n) worst case; in practice O(2^n · number_of_groups).
SOS DP (Sum over Subsets)
Compute F[mask] = sum of f[sub] for all sub ⊆ mask in O(n · 2^n):
F = f[:] # copy base values
for i in range(n):
for mask in range(1 << n):
if mask >> i & 1:
F[mask] += F[mask ^ (1 << i)]
Each dimension i relaxes the constraint "bit i must be set" one step at a time. After processing all n dimensions, F[mask] is the sum over all subsets. The same skeleton works for min, max, OR, AND by changing the aggregation operator.
When to use over brute-force subset enumeration: when you need F[mask] for every mask simultaneously; brute force is O(3^n), SOS is O(n · 2^n). For a single mask, the sub = (sub-1) & mask loop is simpler.
Profile DP (Grid Tiling / Broken Profile)
For m×n grid problems: process cell by cell, left-to-right top-to-bottom. State = a bitmask of n bits representing which cells in the current column boundary are already filled. Transition: given the current profile, decide how the current tile placement changes the profile. Complexity: O(m · n · 2^n) for n-wide grids; only feasible for small n (≤ 20).
Bitmask BFS (Shortest Path Visiting All Nodes)
Node = (position, visited_mask). Start: (start, 1 << start). Goal: visited_mask == FULL. Use BFS (unweighted) or Dijkstra (weighted). State space: O(V · 2^V); only feasible for V ≤ 20.
dist[node][mask] = shortest distance to reach node with visited set mask
# transition: relax dist[nbr][mask | (1 << nbr)] via edge (node, nbr)
Advanced Techniques
SOS DP for AND/OR/XOR Convolution
Solves: For each mask, compute aggregate over all masks that are subsets (or supersets) of it. Also enables the subset-sum convolution: h[S] = sum_{T ⊆ S} f[T] * g[S \ T].
Mechanism: Three passes — (1) zeta transform (SOS forward), (2) pointwise multiply, (3) Möbius inversion (SOS inverse). Each pass is O(n · 2^n). The Möbius inverse uses subtraction instead of addition in the SOS loop.
When to prefer over subset enumeration: when you need all F[mask] values simultaneously (n·2^n vs 3^n); and for subset convolution where enumerating pairs of subsets per mask is O(4^n).
Complexity: O(n · 2^n) per transform.
Meet in the Middle (Bitmask)
Solves: Enumerate all subsets of n ≤ 40 elements when 2^n is too large; e.g., subset sum, subset XOR target, balanced partition.
Mechanism: Split elements into halves L and R. Enumerate all 2^(n/2) subsets of each half, compute their values, then for each value in L search for a complementary value in sorted R using binary search or hash set.
When to prefer over bitmask DP: n is 30–40 (full bitmask DP infeasible, but 2 × 2^(n/2) ≈ 2^21 is fine).
Complexity: O(2^(n/2) · n/2) time, O(2^(n/2)) space.
Bitmask DP with Subset Transitions (Covering / Partition)
Solves: Partition a set into groups satisfying constraints (set cover, exact cover, coloring); assign items to bins where each bin must be a valid sub-pattern.
Mechanism: Precompute all valid sub-masks (groups); then iterate masks in increasing order of popcount. For each mask, try all valid groups g ⊆ mask that include the lowest-index uncovered element, and transition to dp[mask ^ g].
When to prefer over per-element transitions: when a valid "move" assigns multiple elements at once (e.g., a row in a grid tiling, a team assignment, a set cover step).
Complexity: O(3^n) worst case (every subset is a valid group), but in practice much less due to constraint pruning.
Problem Taxonomy
By Problem Category
| Category | What it asks | Key technique |
|---|---|---|
| Minimum cost assignment | Assign n workers to n tasks; minimize total cost | Bitmask DP, dp[mask][last] |
| TSP / Hamiltonian path | Visit all nodes exactly once; minimize path length | Bitmask DP + Dijkstra |
| Set cover / exact cover | Cover all elements using minimum number of groups | Bitmask DP with subset transitions |
| Subset with property | Count or find subsets satisfying a constraint | Bitmask enumeration or SOS DP |
| Grid tiling | Tile an m×n grid with given shapes | Profile DP with row bitmask |
| "Visit all" shortest path | Shortest path visiting all k targets | Bitmask BFS/Dijkstra, state = (pos, visited) |
| Counting subsets by aggregate | For each mask, sum/count over all its subsets | SOS DP |
| Partition into valid groups | Partition set into groups each satisfying a rule | Bitmask DP with precomputed valid masks |
| Scheduling with conflicts | Assign tasks to slots; bitmask encodes conflicts | Bitmask DP or matching |
Problem Signal → Technique
| Signal | Technique |
|---|---|
| n ≤ 20 and "visit all" / "assign all" / "cover all" | Bitmask DP |
| "minimum cost to assign workers to tasks" | Bitmask DP dp[mask][last] |
| "shortest path visiting all nodes" | Bitmask Dijkstra/BFS |
| "count subsets where sum/XOR/AND of subset equals k" for all masks | SOS DP |
| "partition into groups", each group must satisfy property | Bitmask DP with subset transitions |
| n ≤ 40, "subset sum", "balanced split" | Meet in the middle |
| Grid with n ≤ 15 columns, fill/tile row by row | Profile DP with n-bit bitmask |
| "how many masks have all elements of another mask" | SOS DP (superset sum) |
| "assign each item to exactly one of k categories" | Bitmask DP, state encodes filled categories |
Common Patterns
| Pattern | When it applies | Mechanism |
|---|---|---|
| Fix lowest unset element | Avoid counting permutations as distinct subsets | At each DP state, always process the lowest unset bit; skip states where that bit is already covered |
| Precompute valid sub-masks | Subset transitions with complex validity checks | Before DP, enumerate all masks that are valid groups; store in a list indexed by which element they contain |
| Iterate subsets in popcount order | DP where state depends only on smaller subsets | for mask in sorted(range(1<<n), key=bin(mask).count) or BFS by popcount |
| Bitmask as adjacency | Represent graph neighbors as a bitmask | neighbors[u] is a bitmask; fast intersection / union for multi-hop reachability |
| State = (position, bitmask) | Graph search where history matters | Dijkstra or BFS with composite state; 2D distance array dist[node][mask] |
| SOS forward + pointwise + SOS inverse | Subset convolution | Zeta transform → multiply → Möbius inversion; all O(n·2^n) |
| Complement mask for superset queries | Sum over all supersets of a mask | Run SOS in reverse bit order (or flip all bits before/after) |
Edge Cases & Pitfalls
-
Forgetting the empty-subset case in
sub = (sub-1) & mask: the loop exits when sub reaches 0, sosub = 0is never processed inside the loop. If the empty subset is a valid DP state (e.g.,dp[0]is the base), initialise it before the loop, not inside it. -
Off-by-one on universe size:
FULL = (1 << n) - 1, not1 << n. Accessingdp[1 << n]is an out-of-bounds write if the array has1 << nentries (0-indexed 0 to(1<<n)-1). -
Allocating 2^n arrays when n = 25+: 2^25 ints ≈ 128 MB of 32-bit ints; Python lists of that size use ~1 GB. Always check n ≤ 20 (safe), n ≤ 23 (marginal), n ≥ 25 (likely infeasible) before allocating.
-
Counting assignments vs. counting subsets: bitmask DP counts orderings unless you fix a canonical element. To count unordered subsets or assignments, always process "the lowest unset element" at each step to avoid counting the same subset multiple times in different visit orders.
-
Transition direction in SOS DP: the inner loop must process
maskin increasing order for the forward (subset sum) transform, and decreasing order for the reverse (superset sum). Swapping the order silently gives wrong answers. -
Complement in Python:
~maskgives-(mask+1), not the n-bit complement. For n-bit complement usemask ^ FULLwhereFULL = (1 << n) - 1. -
Profile DP row-order dependency: when processing a grid cell by cell (broken profile), the transition must handle the exact boundary between processed and unprocessed regions consistently. Switching to a column-first vs. row-first sweep mid-problem corrupts the profile meaning.
-
Meet-in-the-middle: not sorting the second half: for binary-search lookup, the second half's values must be sorted. Forgetting to sort gives O(2^n) lookups instead of O(2^(n/2) log).
-
Bitmask BFS: not checking visited
(node, mask)pairs: it's easy to only trackvisited[node]and miss that the same node can be reached with different visited sets, leading to infinite loops or wrong shortest paths.
Implementation Notes
(1 << n) - 1for the full mask; precompute once and reuse asFULL.mask.bit_count()(Python 3.10+) orbin(mask).count('1')for popcount; avoid a manual loop.- For Dijkstra with bitmask state, use a 2D array
dist[node][mask]initialized to infinity; heap entries are(cost, node, mask). - SOS DP in Python: iterating
range(1 << n)is fast; the bottleneck is n levels × 2^n iterations; for n = 20 this is ~20M iterations, acceptable. - Bitmask DP arrays in Python:
[[inf] * n for _ in range(1 << n)]— the outer dimension is the mask, inner is the last element. Do not transpose, because the mask dimension grows as 2^n and is always the outer loop. - When precomputing valid sub-masks, store them as a
list[list[int]]keyed by the lowest element they cover:valid[j]= all masks containing element j and satisfying the constraint. This enables the "fix lowest element" optimization. int.bit_length() - 1gives the index of the highest set bit;(mask & -mask).bit_length() - 1gives the index of the lowest set bit.- For meet-in-the-middle,
itertools.combinationscan enumerate subsets up to a given size, but the bitmask enumerationfor mask in range(1 << half)is faster for all subsets.
Cross-Topic Relations
| Related topic | Connection |
|---|---|
| Bit Manipulation | Provides the primitive operations (AND, OR, XOR, shifts) that bitmask technique is built on; see bit-manipulation.md |
| Dynamic Programming | Bitmask DP is a specialisation of DP where the state space is the power set; all standard DP principles apply |
| Backtracking | Bitmask DP is often the memoised equivalent of a backtracking search over subsets; backtracking explores the same O(2^n · n) space |
| Graph Theory | Bitmask BFS/Dijkstra creates a layered graph where nodes are (vertex, visited_set) pairs; see graph-theory.md |
| Shortest Path | Bitmask Dijkstra is the canonical approach for "minimum cost Hamiltonian path"; see shortest-path.md |
| Combinatorics | Subset counting, SOS DP, and meet-in-the-middle all rest on combinatorial identities; 3^n = sum over k of C(n,k)·2^k |
Interview Reference
Must-Know Problems
Basic bitmask DP (assignment)
- Assign Cookies (455) — warm-up; not bitmask but motivates assignment thinking
- Maximum Students Taking Exam (1349) — profile DP on grid rows
- Number of Ways to Wear Different Hats to Each Other (1434)
- Minimum Cost to Connect Two Groups of Points (1595)
TSP / visit-all-nodes
- Shortest Path Visiting All Nodes (847) — bitmask BFS on graph
- Find the Shortest Superstring (943) — TSP with string overlap costs
- Minimum Cost to Reach Destination in Time (2959)
Set cover / partition
- Partition to K Equal Sum Subsets (698)
- Maximum AND Sum of Array (2172)
- Minimum Number of Work Sessions to Finish the Tasks (1986)
- Fair Distribution of Cookies (2305)
Profile DP (grid)
- Tiling a Rectangle with the Fewest Squares — profile DP variant
- Number of Ways to Place Houses (2320) — simpler row-based bitmask
SOS DP and subset aggregation
- Sum of All Subset XOR Totals (1863) — entry-level
- Maximum OR (2680) — greedy + bitmask
- Count Number of Maximum Bitwise-OR Subsets (2044)
- Minimum XOR Sum of Two Arrays (1879) — bitmask DP assignment
Meet in the middle
- Partition Array Into Two Arrays to Minimize Sum Difference (2035)
- Closest Subsequence Sum (1755)
Clarification Questions to Ask
- What is n exactly? Confirm n ≤ 20 before committing to bitmask DP; n > 20 requires a different approach or meet-in-the-middle.
- Are elements distinct? If elements repeat, the bitmask indexes positions, not values — clarify whether two identical elements are interchangeable.
- Is the assignment one-to-one (each element used exactly once) or can elements repeat? This affects whether
dp[FULL]is the answer or whether subsets of size k are needed. - For grid problems: what are the exact grid dimensions? Profile DP is feasible only if min(rows, cols) ≤ 20.
- For "visit all nodes": is the graph directed or undirected? Does the path need to return to start (Hamiltonian cycle) or not (path)?
Common Interview Mistakes
- Choosing bitmask DP when n = 25 and then hitting memory limits — always verify n ≤ 20 before coding.
- Iterating
dp[mask]without fixing the canonical "next unset element", which causes the same unordered assignment to be counted multiple times as different permutations. - Confusing the
sub = (sub-1) & maskloop (processes non-empty subsets only) with a loop that includes the empty subset — the empty subset requires a special case. - In profile DP, defining the bitmask profile inconsistently (sometimes meaning "already filled", sometimes "to be filled") across the transition code.
- Forgetting to initialise unreachable states to
inf(or−inffor max problems) — defaulting to 0 silently produces wrong optimal values. - Using
dp[mask ^ (1 << i)]instead ofdp[mask | (1 << i)]when extending a state — XOR toggles a bit (could clear it), OR always adds; use OR for "add element to visited set". - In SOS DP, iterating the outer loop over bits and the inner over masks in the wrong order, corrupting the accumulation.
Typical Follow-Up Escalations
- "Now minimise cost instead of just checking feasibility" → change the DP from boolean to min-cost; same state transitions.
- "Now n = 30, same problem" → meet in the middle: split into two halves of 15, enumerate 2^15 each, combine.
- "Now find the number of valid assignments, not just the minimum cost" → change DP value from cost to count; use modular arithmetic.
- "Now the grid is 4×n instead of fixed n×n" → profile DP: iterate columns, state = 4-bit mask of the current column boundary.
- "Now queries are online — for each query mask, give the sum over all its subsets" → precompute SOS DP once; answer each query in O(1).
Read Next
Start reading to get personalized recommendations
Explore Unread
Great job! You've read all available articles