Trie
Type: Data Structure — 200+ LeetCode problems
Core Concepts
Trie (prefix tree / digital tree) — an ordered tree where each path from root to a node represents a unique string prefix; the root represents the empty prefix.
TrieNode — a node storing a mapping from characters to child nodes and an end-of-word flag; a node at depth d represents a prefix of exactly d characters.
is_end — a boolean on each node marking that the prefix spelled out from root to this node is a complete inserted word; a node with is_end=True can still have children (when the word is a prefix of another inserted word).
Prefix — any leading substring of a string; every ancestor of a node (including the node itself) is a prefix of the strings in its subtree.
Common prefix — the shared leading substring of two or more strings; represented by the shared path from the root in the trie.
Alphabet size σ — the number of distinct characters (26 for lowercase letters); determines the branching factor and space per node.
Depth — the number of edges from root to a node; equals the length of the prefix that node represents.
Space — n strings of average length m → O(n · m) nodes worst case with no sharing; best case O(σ · n) when strings share long common prefixes.
Structural Invariants
| Invariant | What breaks if violated |
|---|---|
Every root-to-is_end path spells exactly one inserted word | search returns true for strings that were never inserted |
| No two children of the same node share the same character key | Ambiguous traversal; a prefix maps to multiple subtrees |
| A node at depth d represents a prefix of exactly d characters | Length-based computations (word break, depth pruning) produce wrong results |
A node with is_end=True may have children | Deleting an is_end node by removing the node orphans all words that extend it |
| Every inserted word's characters form an unbroken path from root | startsWith returning true for a prefix that was never a prefix of any inserted string |
Internal Representation
Array of σ children (children = [None] * 26) — fixed-size array indexed by ord(c) - ord('a'). O(1) access, O(26 · nodes) space. Fast in practice for dense lowercase-only inputs; wastes memory when the alphabet is large or sparse.
HashMap children (children = {}) — keys are characters, values are child nodes. O(1) average access, space proportional to actual branching. Preferred for unicode, mixed-case, or alphanumeric alphabets. Slightly slower due to hashing overhead.
Compressed / Patricia / Radix Trie — chains of single-child nodes are collapsed into edge labels (strings rather than single characters). O(n) nodes for n strings instead of O(n · m). Insert and search are more complex (must split edges on mismatch). Use when memory is critical and strings share few short prefixes.
Implicit array (for fixed short strings) — the trie can be flattened into a single array where node IDs are indices. Used in competitive programming when node count is known in advance.
Types / Variants
| Variant | What it is | When to prefer |
|---|---|---|
| Standard Trie | Array or hashmap children, is_end per node | General prefix operations on known alphabet |
| Compressed / Radix Trie | Edges carry substrings; single-child chains collapsed | Space-critical with sparse branching |
| XOR (Binary) Trie | Each node has exactly 2 children (bit 0, bit 1); built over integer bit representations | Any problem asking for maximum/minimum XOR of integers |
| Suffix Trie | All suffixes of a single string inserted | Single-string pattern matching — but almost always replaced by Suffix Array or Suffix Automaton in practice |
| Ternary Search Trie (TST) | Three children per node (less, equal, greater) | Variable/large alphabets; space-time middle ground between sorted list and full trie |
Traversals & Access Patterns
DFS (recursive or iterative) — visits every node in the trie; used to enumerate all stored words, collect all words under a prefix, or validate structure. Iterative DFS uses an explicit stack of (node, prefix_so_far) pairs.
Prefix-first then DFS — walk the trie character by character to reach the node for a given prefix, then DFS the subtree to collect all words rooted there. The walk is O(prefix_len); the DFS is O(output). This is the auto-complete pattern.
BFS (level-order) — visits nodes depth by depth; useful when collecting words of increasing length or when the first match at the shallowest depth is needed (e.g., "shortest root" problems). Use a deque of (node, prefix_so_far).
Pruning during traversal — when the current trie node has no child matching the next character, immediately return. This makes trie-backed grid DFS orders of magnitude faster than checking all words at every cell.
Topic-Specific Operations
insert(word) — walk character by character from root; create a new node whenever a child is missing; set is_end = True on the final node. O(m) time, O(m) space worst case (no shared prefix).
search(word) — walk character by character; return False on any missing child; return node.is_end at the final character. O(m). The critical difference from startsWith: checking is_end, not just reachability.
startsWith(prefix) — identical to search but return True on reaching the final character regardless of is_end. O(m).
delete(word) — three cases, handled recursively or with a stack:
- Word doesn't exist in trie → no-op (return without changes).
- Word is a prefix of another inserted word → set
is_end = Falseon the word's final node; do not remove the node (it has children). - Word's suffix path has no other words branching from it → set
is_end = Falseand prune dead nodes bottom-up (a node is prunable ifis_end=Falseandchildrenis empty/all-None).
O(m) time.
count words with prefix(prefix) — augment every node with a count_passing counter incremented during each insert traversal; decrement on delete. Walk to the prefix node in O(prefix_len) and read count_passing. Without augmentation this requires a full DFS of the subtree.
find (shortest root / first match) — walk the word character by character; return the current prefix as soon as a node with is_end=True is encountered. Used in "Replace Words" style problems.
Key Algorithms
Longest Common Prefix of a Set of Strings
Insert all strings, then walk from the root as long as the current node has exactly one child and is_end=False. Stop at the first branch or is_end. O(total characters) insert, O(LCP length) walk.
The key insight: a branching node (more than one child) is exactly where two strings first differ; an is_end node marks where one string ends and is itself the LCP if others extend it.
Auto-Complete / Prefix Suggestions
Search to the prefix endpoint node (O(prefix_len)); then DFS the subtree collecting every is_end node's accumulated prefix. O(prefix_len + output_size). Works online per query; build the trie once over all words.
Replace Words (Shortest Root)
Build a trie from all roots. For each word in the sentence, walk the trie character by character and return the prefix at the first is_end encountered. O(total_chars_in_sentence + total_chars_in_roots).
Word Search II (Grid + Trie)
Build a trie from all target words. DFS from every grid cell; at each step check whether the current character is a child of the current trie node — if not, prune the DFS immediately. When a trie node with is_end=True is reached, record the word and set is_end=False to prevent duplicates. O(rows · cols · 4^max_word_len) worst case, but trie pruning makes the average far better.
The key insight: a single DFS pass checks all words simultaneously via the shared trie prefix structure, instead of running a separate DFS per word.
Maximum XOR of Two Numbers in an Array (XOR Trie)
Build a binary trie inserting all integers MSB-first (bit 31 down to bit 0). For each number, query by greedily choosing the opposite bit at each level (if the opposite child exists, go there). O(32 · n) = O(n) time and space.
bit = (num >> i) & 1
node = node.children[1 - bit] if node.children[1 - bit] else node.children[bit]
Word Break via Trie
Build a trie from the word dictionary. DP where dp[i] is True if s[:i] can be segmented: for each position i where dp[i] is True, walk the trie forward from s[i] and mark dp[i + len] True at every is_end encountered. O(n² + dictionary_total_chars).
Advanced Techniques
XOR Binary Trie for Bit-Level Optimization
Solves: Maximum or minimum XOR between any two elements, XOR queries with constraints (online or offline).
Mechanism: Treat each integer as a fixed-length binary string (32 or 64 bits, MSB first). Build a standard trie with exactly 2 children per node (bits 0 and 1). For a max-XOR query, greedily navigate to the opposite bit at each level (prefer 1 when query bit is 0).
When to use over alternatives: Use instead of brute-force O(n²) XOR enumeration whenever n > a few thousand. For max XOR of a pair in an array, this is the canonical solution. Do NOT use when XOR constraints involve complex offline ordering — see the offline variant (sort by constraint, insert incrementally).
Complexity: O(32 · n) time and space.
Aho-Corasick (Trie + Failure Links)
Solves: Multi-pattern matching — find all occurrences of k patterns in a text of length T simultaneously.
Mechanism: Build a standard trie from all patterns. Then compute failure links (BFS, like KMP's failure function but on the trie): each node's failure link points to the longest proper suffix of its prefix that is also a prefix of some pattern. During text scanning, follow the failure link on mismatch instead of restarting from root. Output links collect all patterns ending at each position.
When to use over alternatives: Use when you need to match k ≥ 2 patterns in a single text pass. For k = 1, KMP or Z-algorithm suffice and are simpler. For static pattern sets with many text queries, Aho-Corasick is the standard approach. O(sum_of_pattern_lengths + text_length + matches).
Count-Augmented Trie
Solves: Prefix frequency queries, rank of a string among all inserted strings, k-th lexicographically smallest string.
Mechanism: Augment each node with count_ending (words ending here) and count_passing (words passing through, i.e., using this node). On insert, increment count_passing at every node along the path and count_ending at the final node; mirror on delete. For rank query, walk the target string and accumulate counts of subtrees not taken. For k-th string, walk greedily: subtract subtree counts until k falls in the current branch.
When to use over alternatives: Use over sorting + binary search when insertions and deletions are interleaved with rank/k-th queries. Static string sets can just sort once. O(m) per insert/delete/query.
Problem Taxonomy
By Problem Category
| Category | What it asks | Key technique |
|---|---|---|
| Prefix existence / match | Does this word or prefix exist in a dynamic set? | Standard trie insert + search |
| Prefix enumeration | Return all words matching a given prefix | Prefix walk + DFS subtree collect |
| Shortest matching root | Replace each word with its shortest prefix in a dictionary | Trie walk returning first is_end |
| Longest common prefix | Shared prefix of a given set of strings | Single-child chain walk |
| Word segmentation | Can a string be partitioned into dictionary words? | Trie + DP |
| Grid word search | Find which dictionary words appear in a 2D character grid | Trie + DFS backtracking |
| Maximum XOR | Find the pair (or constrained element) maximizing XOR | XOR binary trie |
| Multi-pattern text matching | Find all occurrences of k patterns in a text | Aho-Corasick |
| String ranking / k-th order | Rank a string or find k-th among all inserted strings | Count-augmented trie |
Problem Signal → Technique
| Signal in problem statement | Technique |
|---|---|
| "starts with", "prefix", "hasPrefix" | Standard trie |
| "replace each word with its root / shortest prefix" | Trie returning first is_end on traversal |
| "maximum XOR", "bitwise maximum of any two" | XOR binary trie |
| "find all words hidden in a board" | Trie + DFS backtracking on grid |
| "multiple pattern strings", "find all patterns in a text" | Aho-Corasick |
| "k-th lexicographically smallest", "rank of a string" | Count-augmented trie |
| "add a word, find with wildcard '.' " | Trie + DFS across all children on '.' |
| "word list" + "partition a string into dictionary words" | Trie + DP |
| "longest word where every prefix is also a word" | Trie: words where every ancestor node has is_end=True |
Common Patterns
| Pattern | When it applies | Mechanism |
|---|---|---|
| Build-then-query | Static word set, many queries | Insert all words once; answer all queries against the built trie |
| Trie + DFS backtracking | Grid word search | At each cell, advance the trie node; recurse to neighbors; restore cell and trie state after recursion |
Soft delete (is_end toggle) | Delete + re-insert without structural changes | Flip is_end rather than pruning nodes; correct when words may be re-inserted later |
| Prune-on-find | Word Search II to avoid duplicate results | After recording a found word, immediately set is_end = False on its trie node |
| Count-augmented node | Prefix frequency or rank queries with updates | Maintain count_passing at every node; read it in O(prefix_len) |
| Binary trie on integers | Bit-level max/min XOR, prefix-sum on bits | Insert integers MSB-first as 32/64-bit strings |
| Trie + DP | Word break / concatenated words | dp[i] = True if reachable; walk trie from position i, mark dp at every is_end position |
| Incremental trie build | Offline XOR with a constraint (sorted order) | Sort queries and elements by constraint; insert elements into XOR trie as they become valid |
Edge Cases & Pitfalls
-
Node existence ≠ word existence. A node exists whenever any inserted word passes through it. Checking
node is not Noneinstead ofnode.is_endreturnsTruefor all prefixes, even non-words. Always checkis_endinsearch. -
Empty string
"". Inserting""setsis_end=Trueon the root itself.startsWith("")must returnTruealways (every string has the empty prefix).search("")returnsTrueonly if""was explicitly inserted. Forgetting this causes wrong results when the input word list contains"". -
is_end=Truenode with children (word is prefix of another). Deleting this word by removing the node orphans all words extending it. Delete means toggleis_end=False, not remove the node. -
Deleting a word with no branching suffix. Nodes that are no longer on any word's path (no children,
is_end=False) waste memory if not pruned. Pruning requires checking children emptiness bottom-up after settingis_end=False. -
Case sensitivity. Mixing lowercase and uppercase input without normalizing corrupts the array index or hashmap key. Always lowercase (or apply a consistent transform) before inserting and querying.
-
Non-lowercase characters with array-based children.
ord(c) - ord('a')on digits, uppercase, or unicode produces negative indices or indices ≥ 26, causing anIndexError. Use hashmap children for any non-pure-lowercase alphabet. -
Grid word search: not restoring visited cells. Marking a cell visited (e.g., setting it to
'#') and then not restoring it after recursion permanently removes that cell from future DFS paths. Always restore the original character after the recursive call returns. -
Word Search II: duplicate results. When multiple paths in the grid lead to the same word, not removing
is_endfrom the trie node after the first find causes the word to be added to results multiple times. Setnode.is_end = Falseimmediately on finding a word. -
XOR trie: bit iteration order. Inserting integers LSB-first and then greedily choosing the opposite bit gives wrong results because the greedy choice must be made from the most significant bit first. Always iterate
range(31, -1, -1)for 32-bit integers. -
XOR trie: negative integers. Python integers have arbitrary precision; treating negative numbers as 32-bit requires masking:
num & 0xFFFFFFFF. Forgetting this causes unbounded recursion or wrong prefix paths. -
Rebuilding the trie per query. Constructing a new trie inside a query function called n times is O(n · m · queries); build once outside the query loop.
Implementation Notes
- Python's default recursion limit is 1000; tries over strings longer than ~900 characters require
sys.setrecursionlimitor an iterative implementation. - Array-based children should be allocated as
[None] * 26on node creation, not lazily — lazy allocation causesAttributeErrorwhen checkingnode.children[i]on a node created without the attribute. - Hashmap-based children: use
node.children.get(c)instead ofnode.children[c]to avoidKeyErroron missing keys. - In grid DFS, use a sentinel value (e.g.,
'#') to mark the current cell visited, then restore the original value after the recursive call — this avoids allocating a separatevisitedmatrix. - For XOR trie over 32-bit integers, extract each bit with
(num >> i) & 1whereigoes from 31 down to 0. - In Word Search II, removing a word from the trie after finding it (
is_end = Falseand optionally unlinking dead nodes) prevents duplicates and speeds up subsequent searches. __slots__ = ('children', 'is_end')on the TrieNode class significantly reduces per-node memory overhead in Python when storing millions of nodes.- Count fields (
count_passing,count_ending) must be decremented on delete exactly as they were incremented on insert; a mismatch causes permanently incorrect rank and frequency queries.
Cross-Topic Relations
| Related Topic | Connection |
|---|---|
| Depth-First Search | Mechanism for traversing trie subtrees and enumerating words; also used in grid word search |
| Backtracking | State restoration (grid cell, trie pointer) after recursive DFS in Word Search II |
| Dynamic Programming | Trie + DP for word break / concatenated word problems; trie replaces set lookup with efficient prefix enumeration |
| Bit Manipulation | XOR binary trie operates on integer bit representations; trie structure mirrors the binary decision tree |
| Hash Table | Alternative for O(1) prefix existence checks, but cannot efficiently enumerate all words under a prefix |
| Graph Theory | Aho-Corasick failure links form a finite automaton (BFS-built DAG) on top of the trie |
| Suffix Array / Suffix Automaton | Replace the suffix trie for single-string pattern matching — same logical problem, far more space-efficient |
| String Hashing | Alternative for multi-pattern problems when exact match (not prefix match) is sufficient |
Interview Reference
Must-Know Problems
Implement Trie
- Implement Trie (Prefix Tree) (LC 208)
-
- Implement Trie — canonical insert / search / startsWith
-
- Design Add and Search Words Data Structure — wildcard '.' matching via DFS on all children
-
- Map Sum Pairs — value-augmented trie summing subtree values
Prefix Search
- 14. Longest Common Prefix — single-child chain walk (also solvable without trie, but trie is cleanest)
-
- Longest Word in Dictionary — words where every ancestor node has
is_end=True
- Longest Word in Dictionary — words where every ancestor node has
-
- Search Suggestions System — prefix walk + sorted DFS to return top 3
-
- Prefix and Suffix Search — augmented trie storing both prefix and suffix information
Word Formation
- 648. Replace Words — trie returning first
is_endduring word traversal -
- Word Break — trie + DP
-
- Word Break II — trie + backtracking + memoization
-
- Concatenated Words — trie + recursive word-break check per word
Grid Word Search
- 79. Word Search — DFS backtracking (no trie needed; single pattern)
-
- Word Search II — trie essential; prune on mismatch, remove found words in-place
XOR Trie
- 421. Maximum XOR of Two Numbers in an Array — canonical XOR binary trie
-
- Maximum XOR With an Element From Array — offline sort + incremental XOR trie build
Clarification Questions to Ask
- Is the alphabet fixed lowercase, or can it include uppercase, digits, or unicode? (Determines array vs. hashmap children.)
- Can the word list contain empty strings? (Affects root
is_endhandling.) - Is the word set static (build once, many queries) or dynamic (interleaved inserts, deletes, queries)?
- Does "search" mean exact match or prefix match? (Determines whether
is_endmust be checked.) - Are delete operations required? (Determines whether count augmentation or node pruning is needed.)
- For grid problems: can the same cell be reused in a single path?
- For XOR problems: are integers 32-bit? Can they be negative?
Common Interview Mistakes
- Checking
node is not Noneinstead ofnode.is_endinsearch— returns true for any prefix, not just complete words. - Forgetting
node.is_end = Trueat the end of insert — everysearchreturns false. - Returning
Trueat the end ofstartsWithbut returningnode.is_endforsearch— is correct but candidates often swap these. - In Word Search II, not setting
is_end = Falseafter finding a word — produces duplicate entries and causes TLE on large grids. - Building a new trie per query call instead of building it once outside — turns O(n·m + q·m) into O(q·n·m).
- In XOR trie, iterating bits from LSB to MSB — the greedy opposite-bit choice is wrong when made from the less significant end first.
- Deleting a node that has children — orphans all words extending through that node.
Typical Follow-Up Escalations
- "Now support wildcard '.' matching" → DFS across all children whenever the character is '.'; prune only when no children exist.
- "Now support delete" → toggle
is_end = False; optionally prune dead (no children, notis_end) nodes bottom-up. - "The alphabet can be any unicode character" → switch to hashmap children; array indexing no longer applies.
- "Optimize space" → compressed (Patricia/Radix) trie; or for read-only datasets, a sorted array of strings + binary search as a prefix-existence substitute.
- "Find all occurrences of k patterns in a text simultaneously" → Aho-Corasick failure links; O(patterns_total + text_length + matches).
- "Find the k-th lexicographically smallest inserted string" → count-augmented trie; walk greedily subtracting subtree counts until k lands in the current branch.
Read Next
Start reading to get personalized recommendations
Explore Unread
Great job! You've read all available articles