The Three Laws of TDD — Practice Tasks¶
Category: Craftsmanship Disciplines — write production code only to make a failing test pass, in the tightest possible loop.
10 graded TDD katas. The point is not the final code — it's the sequence of red-green-refactor steps that gets you there. Each task gives a starter, an ordered test list, and a worked walkthrough showing the steps. Do each one by hand before expanding the solution, and resist reading ahead: write one failing test at a time.
How to use these: open a test file and a source file side by side, run tests on a keystroke, and obey the three laws strictly — even when it feels silly. The discomfort is the lesson. For why these specific exercises, see Kata & Deliberate Practice.
Table of Contents¶
- Task 1: Drive a Stack
- Task 2: FizzBuzz
- Task 3: Roman Numerals
- Task 4: Prime Factors
- Task 5: String Calculator
- Task 6: Fix a Bug Test-First
- Task 7: Triangulate an Average
- Task 8: Bowling Score
- Task 9: Inject a Boundary (Fast Tests)
- Task 10: Characterize Then TDD Legacy
- Practice Tips
Task 1: Drive a Stack¶
Goal: Build a Stack purely from failing tests. Add exactly one capability per nano-cycle. No method exists until a test demands it.
Test list (write one at a time, in order):
[ ] new stack is empty
[ ] after one push, not empty
[ ] push then pop returns the pushed value
[ ] pop on empty raises / errors
[ ] LIFO order: push 1,2 then pop gives 2 then 1
Walkthrough + Solution
### Python — the step sequence# RED: assert Stack().is_empty()
class Stack:
def is_empty(self): return True # GREEN (fake it)
# RED: s.push(1); assert not s.is_empty()
class Stack:
def __init__(self): self._items = []
def push(self, x): self._items.append(x)
def is_empty(self): return len(self._items) == 0 # GREEN (real state forced)
# RED: s.push(1); assert s.pop() == 1
def pop(self): return self._items.pop() # GREEN
# RED: pytest.raises(IndexError): Stack().pop() → already passes (list raises)
# RED: push 1,2 → pop==2, pop==1 → already passes (LIFO via list)
# REFACTOR: nothing to clean; the design emerged minimally.
type Stack[T any] struct{ items []T }
func (s *Stack[T]) IsEmpty() bool { return len(s.items) == 0 }
func (s *Stack[T]) Push(x T) { s.items = append(s.items, x) }
func (s *Stack[T]) Pop() (T, bool) {
var zero T
if len(s.items) == 0 { return zero, false }
x := s.items[len(s.items)-1]
s.items = s.items[:len(s.items)-1]
return x, true
}
Task 2: FizzBuzz¶
Goal: The classic warm-up. Print 1..n but Fizz for multiples of 3, Buzz for 5, FizzBuzz for both. Drive every rule from a test.
Test list:
[ ] 1 -> "1"
[ ] 2 -> "2" (forces str(n), not a constant)
[ ] 3 -> "Fizz"
[ ] 5 -> "Buzz"
[ ] 15 -> "FizzBuzz"
Walkthrough + Solution
### Python — step sequence# RED: fizzbuzz(1) == "1" → return "1" (fake)
# RED: fizzbuzz(2) == "2" → return str(n) (triangulated)
# RED: fizzbuzz(3) == "Fizz"→ if n%3==0: return "Fizz"; return str(n)
# RED: fizzbuzz(5) == "Buzz"→ add if n%5==0: return "Buzz"
# RED: fizzbuzz(15)=="FizzBuzz" → handle both first
def fizzbuzz(n):
if n % 15 == 0: return "FizzBuzz"
if n % 3 == 0: return "Fizz"
if n % 5 == 0: return "Buzz"
return str(n)
# REFACTOR (green): remove the magic 15 and the duplication
def fizzbuzz(n):
out = ("Fizz" if n % 3 == 0 else "") + ("Buzz" if n % 5 == 0 else "")
return out or str(n)
Task 3: Roman Numerals¶
Goal: Convert an integer to Roman numerals. A famous kata because the test order dramatically changes how easily each step is reached.
Test list (order matters):
[ ] 1 -> "I"
[ ] 2 -> "II" (forces repetition / a loop)
[ ] 3 -> "III"
[ ] 4 -> "IV" (forces the subtractive pair)
[ ] 5 -> "V"
[ ] 9 -> "IX"
[ ] 58 -> "LVIII" (forces the value table)
[ ] 1994 -> "MCMXCIV"(general)
Walkthrough + Solution
### Java — the emergent design// After 1,2,3: you have a loop appending "I".
// 4 ("IV") tempts a special case — RESIST. Instead introduce a value/symbol table.
// 5,9,58,1994 all fall out of the table once "IV","IX","XL","XC","CD","CM" are entries.
String toRoman(int n) {
int[] values = {1000, 900, 500, 400, 100, 90, 50, 40, 10, 9, 5, 4, 1};
String[] syms = {"M","CM","D","CD","C","XC","L","XL","X","IX","V","IV","I"};
StringBuilder out = new StringBuilder();
for (int i = 0; i < values.length; i++) {
while (n >= values[i]) {
out.append(syms[i]);
n -= values[i];
}
}
return out.toString();
}
Task 4: Prime Factors¶
Goal: Return the prime factors of n as a list (12 -> [2,2,3]). Robert Martin's signature kata: the algorithm emerges from triangulation without you ever "designing" it.
Test list:
[ ] 1 -> []
[ ] 2 -> [2]
[ ] 3 -> [3]
[ ] 4 -> [2,2] (forces a loop)
[ ] 6 -> [2,3] (forces dividing by the next factor)
[ ] 8 -> [2,2,2]
[ ] 9 -> [3,3]
Walkthrough + Solution
### Python — watch the algorithm grow one test at a time# 1 -> [] : return []
# 2 -> [2] : if n>1: return [n]; return []
# 4 -> [2,2] : while n%2==0: append 2; n//=2 ... then if n>1 append n
# 6 -> [2,3], 9->[3,3]: generalize the divisor from 2 to a loop
def prime_factors(n):
factors = []
divisor = 2
while n > 1:
while n % divisor == 0:
factors.append(divisor)
n //= divisor
divisor += 1
return factors
Task 5: String Calculator¶
Goal: add("") is 0, add("1") is 1, add("1,2") is 3, then support newlines and custom delimiters. Roy Osherove's kata — great for practicing incremental requirements.
Test list:
[ ] "" -> 0
[ ] "1" -> 1
[ ] "1,2" -> 3
[ ] "1,2,3,4" -> 10 (arbitrary count)
[ ] "1\n2,3" -> 6 (newline delimiter)
[ ] "//;\n1;2" -> 3 (custom delimiter)
Walkthrough + Solution
### Python **Step notes:** `""->0` is a guard you add first. `"1"->1` forces `int(...)`. `"1,2"->3` forces splitting + sum. The arbitrary-count test forces the comprehension over a split (not a hard-coded two-operand parse). Newline and custom delimiter each get driven in as a separate red, with a refactor pass to keep the parsing readable. **Lesson:** TDD shines for *incrementally specified* requirements — each new rule is a new red, and the earlier tests guard you against breaking prior behavior as you extend the parser.Task 6: Fix a Bug Test-First¶
Goal: Practice the single most valuable everyday application of the laws: never fix a bug without a failing test that reproduces it first.
Given bug report: "average([]) crashes with a divide-by-zero in production."
Walkthrough + Solution
### Step 1 — reproduce as a failing test (RED), before touching the code ### Step 2 — minimum fix (GREEN) ### Step 3 — confirm the existing happy-path test still passes; refactor if needed (nothing here). **Lesson:** the failing test proves you understand the bug, proves the fix works, and *stays in the suite forever* as a regression guard. A fix without a failing-first test proves nothing and invites the bug to return. This is the habit to adopt before any other TDD habit. (See [Professional](professional.md) — "reproduce every bug with a failing test.")Task 7: Triangulate an Average¶
Goal: Practice the fake-it → triangulate → generalize progression explicitly, narrating each law.
Test list:
[ ] mean([4]) == 4 (fake it: return 4)
[ ] mean([2,4]) == 3 (triangulate: forces sum/len)
[ ] mean([]) raises (degenerate guard)
Walkthrough + Solution
# RED: mean([4]) == 4
def mean(xs): return 4 # GREEN — fake it (Law 3 minimum)
# RED: mean([2,4]) == 3
# return 4 fails → the second example forces the real formula
def mean(xs): return sum(xs) / len(xs) # GREEN — triangulated
# RED: mean([]) should raise ValueError
def mean(xs):
if not xs:
raise ValueError("mean of empty sequence")
return sum(xs) / len(xs)
Task 8: Bowling Score¶
Goal: A harder kata (spares and strikes) where the test list and small steps matter most. Compute a ten-pin bowling game's total.
Test list:
[ ] gutter game (all 0) -> 0
[ ] all ones -> 20
[ ] one spare then a 3 then zeros -> 16
[ ] one strike then 3,4 then zeros -> 24
[ ] perfect game (12 strikes) -> 300
Walkthrough + Solution
### Python — score by frames, bonuses driven in one test at a timeclass Game:
def __init__(self):
self.rolls = []
def roll(self, pins):
self.rolls.append(pins)
def score(self):
total, i = 0, 0
for _ in range(10): # ten frames
if self.rolls[i] == 10: # strike
total += 10 + self.rolls[i+1] + self.rolls[i+2]
i += 1
elif self.rolls[i] + self.rolls[i+1] == 10: # spare
total += 10 + self.rolls[i+2]
i += 2
else:
total += self.rolls[i] + self.rolls[i+1]
i += 2
return total
Task 9: Inject a Boundary¶
Goal: Experience the design pressure of TDD directly — make untestable code testable by injecting a boundary, then TDD the logic.
Given — untestable: constructs its own clock, so "expired" can't be tested deterministically.
import datetime
class Token:
def is_expired(self):
return datetime.datetime.now() > self.expiry # depends on wall-clock!
Walkthrough + Solution
### Step 1 — the test pressure: you can't test expiry without controlling "now". Inject a clock.class Token:
def __init__(self, expiry, clock=datetime.datetime.now):
self.expiry, self.clock = expiry, clock
def is_expired(self):
return self.clock() > self.expiry
Task 10: Characterize Then TDD Legacy¶
Goal: Practice the professional legacy on-ramp — you can't TDD untested code directly; you characterize, seam, then TDD.
Given — legacy, no tests, hard to read:
def fee(amount, tier, weekend):
f = amount * 0.02
if tier == "gold": f *= 0.5
if weekend: f += 1.0
return round(f, 2)
Walkthrough + Solution
### Step 1 — characterize current behavior (test-AFTER, pin what it does today) ### Step 2 — with the net in place, NOW TDD a new requirement (red first) ### Step 3 — refactor on green, protected by both the characterization and the new tests. **Lesson:** the three laws can't start until there's a seam and a safety net. Characterization tests (test-after) come *first* on legacy; only then do you TDD the change. Trying to skip straight to test-first on untestable legacy is how teams wrongly decide TDD "doesn't work." See [Professional](professional.md) and [Working with Legacy Code](../../working-with-legacy-code/README.md).Practice Tips¶
- One failing test at a time. Keep a written test list; don't write the next test until the current one is green.
- Always see red before green. If a new test passes immediately, break the code to confirm it can fail.
- Use the right path to green: fake-it + triangulate when unsure, obvious-implementation when confident.
- Never skip the refactor beat — that's where FizzBuzz and Roman numerals go from ladder to clean.
- Order tests for the smallest code jump — degenerate case first, hardest case last. The order is design.
- Fix every bug test-first (Task 6) — adopt this habit before any other.
- When code is hard to test, fix the design, not the test (Task 9) — listen to the design pressure.
- On legacy, characterize first (Task 10) — the laws come last, not first.
← Interview · Craftsmanship Disciplines · Roadmap · Next: Find-Bug
In this topic