Word Search on a Grid (DFS Backtracking + Trie) — Practice Tasks¶
All tasks must be solved in Go, Java, and Python. Each task ships with starter code or a precise spec in all three languages. Implement the DFS-backtracking (and, for the multi-word tasks, the Trie) logic and validate against the evaluation criteria. Always test against a brute-force oracle on small inputs before trusting an optimized version. For Word Search II, the oracle is "run Word Search I once per dictionary word."
Beginner Tasks (5)¶
Task 1 — Word exists (Word Search I)¶
Problem. Implement exist(board, word) returning true iff word can be traced through orthogonally adjacent, non-reused cells. Use DFS backtracking with in-place marking.
Input / Output spec. - Read M, N, then M rows of the grid (a string each), then word. - Print true or false.
Constraints. - 1 ≤ M, N ≤ 200, 1 ≤ |word| ≤ 1000. Letters are uppercase A–Z.
Hint. Three base cases (matched all / out of bounds / mismatch), then mark '#', recurse 4 neighbors, unmark.
Starter — Go.
package main
import "fmt"
func exist(board [][]byte, word string) bool {
// TODO: DFS from each cell; mark/explore/unmark
return false
}
func main() {
var m, n int
fmt.Scan(&m, &n)
board := make([][]byte, m)
for i := 0; i < m; i++ {
var row string
fmt.Scan(&row)
board[i] = []byte(row)
}
var word string
fmt.Scan(&word)
fmt.Println(exist(board, word))
}
Starter — Java.
import java.util.*;
public class Task1 {
static int rows, cols;
static boolean exist(char[][] board, String word) {
// TODO
return false;
}
public static void main(String[] args) {
Scanner sc = new Scanner(System.in);
int m = sc.nextInt(), n = sc.nextInt();
char[][] board = new char[m][];
for (int i = 0; i < m; i++) board[i] = sc.next().toCharArray();
String word = sc.next();
System.out.println(exist(board, word));
}
}
Starter — Python.
import sys
def exist(board, word):
# TODO: DFS from each cell; mark/explore/unmark
return False
def main():
data = sys.stdin.read().split()
it = iter(data)
m, n = int(next(it)), int(next(it))
board = [list(next(it)) for _ in range(m)]
word = next(it)
print(str(exist(board, word)).lower())
if __name__ == "__main__":
main()
Evaluation criteria. - Correct existence result, verified by hand on a small board. - The board is unchanged after the call (restoration invariant). - No out-of-bounds access; O(1) extra space via in-place marking.
Task 2 — Word exists with a separate visited array¶
Problem. Re-implement exist using a separate M × N boolean visited array instead of in-place marking. Confirm it gives identical results.
Input / Output spec. Same as Task 1.
Constraints. 1 ≤ M, N ≤ 200, 1 ≤ |word| ≤ 1000.
Hint. Set visited[r][c] = true on the way in, false on backtrack. The board is read-only this time.
Evaluation criteria. - Matches Task 1 on every input. - The board is never mutated (only visited changes). - Demonstrates the thread-safe-board alternative to the sentinel trick.
Task 3 — Count starting cells that can begin the word¶
Problem. Given a board and a word, count how many distinct starting cells (r, c) can begin a successful trace of the word.
Input / Output spec. - Read M, N, the grid, then word. Print the count.
Constraints. 1 ≤ M, N ≤ 50, 1 ≤ |word| ≤ 20.
Hint. Run the DFS from each cell, but instead of short-circuiting on the first success, tally how many starts return true.
Worked check. On a board where the word appears via two different starting cells, the count is 2. On a board where it does not appear at all, the count is 0.
Evaluation criteria. - Counts distinct starts, not distinct paths. - Matches a brute-force enumeration on small boards. - Board unchanged after counting.
Task 4 — Build a Trie and test prefix/word membership¶
Problem. Implement a Trie with insert(word), search(word) (exact word present), and startsWith(prefix) (some word has this prefix). This Trie powers Word Search II.
Input / Output spec. - Read a list of words to insert, then a list of (op, string) queries; print true/false per query.
Constraints. Up to 10^4 words, lowercase a–z.
Hint. Use array[26] children and a boolean end flag (or store the word at the terminal node). search checks end; startsWith just checks the path exists.
Evaluation criteria. - search distinguishes a full word from a mere prefix. - startsWith returns true for any inserted prefix. - O(L) per operation.
Task 5 — Single-word search on a Boggle board (8 directions)¶
Problem. Implement existBoggle(board, word) allowing all 8 neighbors (including diagonals). Same mark/explore/unmark, larger direction set.
Input / Output spec. Same as Task 1; print true/false.
Constraints. 1 ≤ M, N ≤ 50, 1 ≤ |word| ≤ 50.
Hint. Use a direction array of 8 offsets (±1, 0), (0, ±1), (±1, ±1). Bounds-check every neighbor.
Evaluation criteria. - Correctly uses 8 directions (diagonal traces succeed where 4-dir would fail). - Board unchanged after the call. - Matches a brute-force enumeration on small boards.
Intermediate Tasks (5)¶
Task 6 — Word Search II (find all dictionary words)¶
Problem. Implement findWords(board, words) returning all dictionary words that can be traced. Build a Trie and DFS the grid once with prefix pruning. De-duplicate results.
Input / Output spec. - Read the grid, then the dictionary. Print the found words sorted, one per line.
Constraints. 1 ≤ M, N ≤ 12, up to 3·10^4 words, 1 ≤ |word| ≤ 10, lowercase.
Hint. Carry a Trie node in the DFS. Prune when the node lacks a child for the cell's letter. Store the word at the terminal node and clear it on report.
Reference oracle (Python) — use this to validate.
def brute_find_words(board, words):
def exist(word):
rows, cols = len(board), len(board[0])
def dfs(r, c, i):
if i == len(word):
return True
if not (0 <= r < rows and 0 <= c < cols) or board[r][c] != word[i]:
return False
s, board[r][c] = board[r][c], "#"
ok = (dfs(r-1,c,i+1) or dfs(r+1,c,i+1) or
dfs(r,c-1,i+1) or dfs(r,c+1,i+1))
board[r][c] = s
return ok
return any(dfs(r, c, 0) for r in range(rows) for c in range(cols))
return sorted({w for w in words if exist(w)})
Evaluation criteria. - Output equals brute_find_words on small boards/dictionaries. - Each word reported at most once. - Single grid sweep (not one DFS per word).
Task 7 — Word Search II with leaf pruning¶
Problem. Extend Task 6 to prune dead Trie leaves: after recording a word, detach any Trie node that has no children and no word, propagating up. Measure the speedup on a large dictionary.
Constraints. 1 ≤ M, N ≤ 12, up to 5·10^4 words.
Hint. Pass the parent (and the child index) into the DFS, or return a "prune me" signal upward. Detach only when a node is truly dead (no children, no word).
Evaluation criteria. - Identical results to Task 6. - Trie node count decreases as words are found (assert/log it). - Measurable speedup vs Task 6 on a large dictionary.
Task 8 — Count and score (Boggle scoring)¶
Problem. On a Boggle board (8 directions, minimum word length 3), find all dictionary words and compute the total score using the table below. Each distinct word counts once.
| Word length | Points |
|---|---|
| 3–4 | 1 |
| 5 | 2 |
| 6 | 3 |
| 7 | 5 |
| 8+ | 11 |
Constraints. 1 ≤ M, N ≤ 8, up to 10^4 words.
Hint. Reuse the Trie sweep with 8 directions; filter words shorter than 3; map each found word's length to points and sum.
Reference oracle (Python).
def score_of(word):
L = len(word)
if L < 3: return 0
if L <= 4: return 1
if L == 5: return 2
if L == 6: return 3
if L == 7: return 5
return 11
Evaluation criteria. - Total score matches summing score_of over the brute-force found set. - Words shorter than 3 are excluded. - Uses 8-direction adjacency.
Task 9 — Longest word on the board¶
Problem. Return the longest dictionary word that can be traced (ties broken lexicographically smallest). Use the Trie sweep.
Constraints. 1 ≤ M, N ≤ 12, up to 3·10^4 words.
Hint. Track the best terminal hit during the sweep; update when a found word is longer, or equal length but lexicographically smaller.
Evaluation criteria. - Matches the max-length (tie: smallest) word from the brute-force found set. - Single sweep. - Handles the case where no word is found (return empty string).
Task 10 — Word Search I that returns the path¶
Problem. Modify Word Search I to return the list of cells [(r,c), …] forming a successful trace, or an empty list if none. Useful for highlighting in a UI.
Constraints. 1 ≤ M, N ≤ 50, 1 ≤ |word| ≤ 30.
Hint. Push (r, c) onto a path list before recursing and pop it on backtrack (parallel to mark/unmark). On the success base case, return a copy of the path.
Evaluation criteria. - Returned path is adjacent, distinct, and spells the word. - Empty list iff the word does not exist. - Path length equals |word|.
Advanced Tasks (5)¶
Task 11 — Thread-safe Word Search II (immutable Trie)¶
Problem. Solve Word Search II without mutating the Trie or the board, so the same Trie and board can be searched concurrently. Use a per-solve visited array and a per-solve found-set.
Constraints. 1 ≤ M, N ≤ 12, up to 5·10^4 words, multiple boards solved in parallel.
Hint. Do not clear terminal markers or prune leaves on the shared Trie. Collect found words in a per-solve set. Each goroutine/thread owns its visited grid.
Evaluation criteria. - Results identical to the single-threaded Task 6. - The shared Trie is provably unmodified after all solves (checksum or re-search test). - No data races (run with the race detector / thread sanitizer where available).
Task 12 — Board-letter pre-filter¶
Problem. Before the Trie sweep, drop dictionary words that cannot possibly appear: words using a letter absent from the board, using a letter more often than available, or longer than M·N. Report how many words were filtered.
Constraints. 1 ≤ M, N ≤ 12, up to 10^5 words.
Hint. Build a 26-length count of board letters; for each word build its own count and compare. The filter is O(board + word).
Evaluation criteria. - Filtered set produces the same found words as the unfiltered sweep. - Reports a correct filtered count. - Measurable speedup on a large dictionary with a small board.
Task 13 — Aho-Corasick along straight lines (variant)¶
Problem. For the straight-line variant (words must appear along a full row, column, or diagonal, reading forward), find all dictionary words using Aho-Corasick on each line as a 1D text. This is a different, easier problem than free-form grid paths.
Constraints. 1 ≤ M, N ≤ 500, up to 10^4 words.
Hint. Extract each row, column, and diagonal as a string; run a single Aho-Corasick automaton over each. No backtracking is needed because there is no turning and no visited-set constraint.
Evaluation criteria. - Finds all words that appear along a straight line. - Does not find words requiring a turn (that is Word Search II, not this variant). - O(lines · length + matches); documents why backtracking is unnecessary here.
Task 14 — DAWG-style suffix sharing (memory)¶
Problem. Build a Trie, then measure node count; implement a simple suffix-merging pass (or use a known DAWG construction) and re-measure. Run Word Search II on the compacted structure, reconstructing matched words from the DFS path.
Constraints. Up to 10^5 words, lowercase.
Hint. A DAWG shares identical suffix subtrees, so you cannot store a per-node word — rebuild the word from the path during the DFS and collect into a set.
Degree/size sanity check. For a natural-language dictionary, the DAWG should have far fewer nodes than the Trie (English shares many suffixes like -ing, -ed). Assert dawg_nodes < trie_nodes.
Evaluation criteria. - Same found words as the plain-Trie sweep. - Fewer nodes than the plain Trie (report both counts). - Word reconstruction from the path is correct (subset of dictionary).
Task 15 — Decide when matrix exponentiation (walks) is the wrong tool¶
Problem. You are given a problem statement and must decide whether Word Search backtracking applies or whether the problem actually allows cell reuse (walks), which is a different polynomial problem. Implement classify(problem) returning one of: WORD_SEARCH_DFS (distinct cells, single/few words), WORD_SEARCH_II_TRIE (distinct cells, dictionary), AHO_CORASICK_LINES (straight-line matching), or WALK_COUNT_MATRIX (cell reuse allowed → not backtracking; see 11-graphs/32-paths-fixed-length).
Constraints / cases to handle. - Distinct-cell trace of one word → WORD_SEARCH_DFS. - Distinct-cell trace of a dictionary → WORD_SEARCH_II_TRIE. - Words along rows/cols/diagonals only → AHO_CORASICK_LINES. - Count length-k label sequences with reuse allowed → WALK_COUNT_MATRIX.
Hint. The pivotal question is the no-reuse (distinct cells) constraint. If reuse is allowed, the problem is a memoryless walk count (polynomial, matrix power); if reuse is forbidden, it is a simple-path backtracking search.
Evaluation criteria. - Correctly classifies all four canonical cases. - The WALK_COUNT_MATRIX branch explicitly cites the memoryless-vs-visited-set distinction. - Justification references the right complexity (O(M·N·4^L) for DFS, etc.).
Benchmark Task¶
Task B — Naive per-word loop vs Trie sweep crossover¶
Problem. Empirically find the dictionary size W at which the Trie sweep (Word Search II) overtakes the naive "Word Search I per word" loop on a fixed board. Implement two workloads on a random board (fixed seed):
- (a) Naive loop: run Word Search I for each of
Wwords —O(W · M·N · 4^Lmax). - (b) Trie sweep: build a Trie once, DFS the grid once —
O(K + M·N · 4^Lmax).
Measure wall-clock time for a 12×12 board across W ∈ {1, 10, 100, 1000, 10000}. Report a table and identify the crossover W.
Starter — Python.
import random
import time
import string
def gen_board(m, n, seed):
r = random.Random(seed)
return [[r.choice("abcde") for _ in range(n)] for _ in range(m)]
def gen_words(count, seed):
r = random.Random(seed + 1)
return ["".join(r.choice("abcde") for _ in range(r.randint(3, 6)))
for _ in range(count)]
def word_search_i(board, word):
# TODO: single-word DFS backtracking
return False
def naive_loop(board, words):
# TODO: run word_search_i for each word; return found set
return set()
def trie_sweep(board, words):
# TODO: build Trie, single DFS sweep; return found set
return set()
def bench(fn, board, words):
start = time.perf_counter()
fn(board, words)
return (time.perf_counter() - start) * 1000.0
def main():
board = gen_board(12, 12, 42)
print("W naive_ms trie_ms")
for w in [1, 10, 100, 1000, 10000]:
words = gen_words(w, 7)
na = bench(naive_loop, [row[:] for row in board], words)
tr = bench(trie_sweep, [row[:] for row in board], words)
print(f"{w:<8d} {na:<12.2f} {tr:<10.2f}")
if __name__ == "__main__":
main()
Evaluation criteria. - Same seed produces the same board and word list across Go, Java, and Python. - Naive loop (a) and Trie sweep (b) return the same found set at every W. - The naive loop's time grows ~linearly in W; the Trie sweep stays roughly flat. Report the crossover W. - Report includes the mean across at least 3 runs and does not time board/word generation or cloning. - Writeup: a short note on the measured crossover and why the Trie sweep is essentially independent of W.
Stretch Tasks (3)¶
Task S1 — Word Search I that returns ALL distinct paths¶
Problem. Given a board and a word, return every distinct spelling path (each as a list of cells). Unlike Task 10 (which returns one path), enumerate them all.
Input / Output spec. - Read M, N, the grid, then word. Print each path on its own line as (r,c) (r,c) ….
Constraints. 1 ≤ M, N ≤ 8, 1 ≤ |word| ≤ 8 (small, because the count can be large).
Hint. Do not short-circuit on the first success. On the success base case, append a copy of the current path to a results list, then continue exploring (still unmarking on the way out so later paths are found).
Evaluation criteria. - Returns every distinct path, each spelling the word with distinct cells. - The count matches a brute-force enumeration on small boards. - Board unchanged after the call.
Task S2 — Boggle solver with Qu tile¶
Problem. Solve Word Search II on a Boggle board where one tile may be the two-letter Qu. A Qu tile matches two characters of the word in a single grid step. Find all dictionary words (8 directions, length ≥ 3).
Constraints. 1 ≤ M, N ≤ 5, up to 10^4 words.
Hint. When the current cell is a Qu tile, descend the Trie by two characters (q then u) in one step; mark the single cell as usual. Bounds-check 8 neighbors.
Evaluation criteria. - Words crossing a Qu tile (e.g. "quit") are found. - The single Qu cell is marked once and restored once. - Matches a brute-force solver that treats Qu as an atomic two-letter step.
Task S3 — Parallel multi-board scoring service¶
Problem. Given one shared dictionary and B boards, score every board (Boggle scoring) in parallel against an immutable shared Trie, using a per-board visited array and found-set. Return the score of each board.
Constraints. 1 ≤ M, N ≤ 5, up to 10^4 words, 1 ≤ B ≤ 10^3 boards.
Hint. Build the Trie once. Launch one worker per board (goroutine / thread / process). No Trie mutation — collect found words per board in a local set, then score. Merge nothing across boards.
Evaluation criteria. - Per-board scores match a single-threaded reference. - The shared Trie is provably unmodified after all boards are scored. - No data races (run with the race detector / thread sanitizer where available). - Throughput scales with worker count up to the core count.
General Guidance for All Tasks¶
- Always validate against the brute-force oracle first. For single-word tasks, hand-check small boards; for Word Search II, run Word Search I per word and compare the result sets.
- Get the unmark right. The most common bug is failing to restore the cell on every return path. Assert the board is unchanged after each solve.
- Pick a sentinel outside the alphabet (
'#'fora–z/A–Z), or use avisitedarray when the board must stay read-only (e.g. concurrency). - One word → Word Search I; a dictionary → Word Search II (Trie). Never loop Word Search I over a dictionary in production.
- De-duplicate found words (clear the terminal marker or use a set) — one word can be spelled by multiple paths.
- Keep the Trie shareable when concurrent: do not mutate it (no leaf pruning on a shared Trie); track per-solve state externally.
- Distinct cells only. Word Search forbids reuse; if a problem allows reuse it is a walk problem (matrix power / layered DP), not backtracking — see
11-graphs/32-paths-fixed-length.
Suggested progression¶
Work the tasks in this order to build the topic from the ground up:
- Tasks 1–2 establish the single-word DFS in both marking styles (in-place and
visited); confirm they agree. - Tasks 3, 10 vary the output of the single-word search (count starts, return the path) — same engine, different reporting.
- Task 4 builds the Trie in isolation, decoupled from the grid, so its bugs are caught before the sweep.
- Tasks 5, 6 introduce the two big generalizations: 8-direction adjacency and the Trie sweep.
- Tasks 7–9 layer on the production optimizations (leaf pruning, scoring, longest word).
- Tasks 11–14 add concurrency, pre-filtering, the line-only Aho-Corasick variant, and memory (DAWG).
- Task 15 and S1–S3 test judgment (which tool applies) and advanced engineering (all-paths,
Qutile, parallel service). - Task B ties it together empirically: measure the naive-vs-Trie crossover and confirm it matches the theory.
Common cross-task pitfalls¶
- Forgetting to unmark on the success path (not just the failure path) — the early
return truemust still restore the cell. - Counting paths instead of starts (Task 3) or paths instead of words (Tasks 6, 9) — clarify which the spec wants.
- Mutating a shared Trie in the concurrent tasks (11, S3) — use a per-solve set instead.
- Off-by-one in adjacency when switching to 8 directions (Tasks 5, 8, S2) — bounds-check every neighbor.
Each task must be implemented and tested in Go, Java, and Python, with identical outputs across all three on the same seeded inputs.