Fragile Tests — Exercises
Category: Testing Anti-Patterns → Fragile Tests — hands-on practice making tests survive a refactor.
These are fix-it exercises, not recognition quizzes. For each one you get a problem statement, a brittle starting test (in Go, Java, or Python — the language varies on purpose), acceptance criteria, and a collapsible solution. The point is to make the change: turn a test that pins implementation details into one that pins the contract, so a behavior-preserving refactor leaves it green.
How to use this file. Read the problem, rewrite the test in your editor before opening the solution, then compare. The "why it's better" note matters more than the diff: a robust test is one that fails for exactly one reason, namely that the behavior broke. Refer back to junior.md for the shapes and middle.md for the countermoves.
Table of Contents
| # | Exercise | Fragility source | Lang | Difficulty |
|---|---|---|---|---|
| 1 | Stop reading private state | Private state | Python | ★ easy |
| 2 | Drop the over-specified equals | Over-specification | Java | ★ easy |
| 3 | Parse, don't string-match the JSON | Output format | Go | ★★ medium |
| 4 | Outcome over interaction | Mock interactions | Java | ★★ medium |
| 5 | Kill the order and log assertions | Order + log text | Python | ★★ medium |
| 6 | Replace the god mock with a fake + contract test | White-box mocking | Go | ★★★ hard |
Exercise 1 — Stop reading private state
Fragility source: private state · Language: Python · Difficulty: ★ easy
The test reaches into _balance and _history. A behavior-preserving change to how the account stores its data breaks it.
class Account:
    def __init__(self):
        self._balance = 0
        self._history = []

    def deposit(self, amount):
        self._balance += amount
        self._history.append(("deposit", amount))

    def balance(self):
        return self._balance

# The brittle test:
def test_deposit_brittle():
    acc = Account()
    acc.deposit(100)
    assert acc._balance == 100                   # private field
    assert acc._history == [("deposit", 100)]    # private field, exact shape
Acceptance criteria
- No access to _-prefixed names.
- The test still verifies that a deposit increases the balance.
- The test would survive changing _history to a different structure (e.g. a list of dataclass events) or dropping it entirely.
Hint: drive the object through its public methods and assert on the public result.
Solution
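A minimal sketch of the robust test that the note below describes, assuming the Account class from the problem statement (repeated here only so the block runs on its own). The test name is illustrative.

```python
class Account:
    """Same class as in the problem statement, repeated so this block is self-contained."""

    def __init__(self):
        self._balance = 0
        self._history = []

    def deposit(self, amount):
        self._balance += amount
        self._history.append(("deposit", amount))

    def balance(self):
        return self._balance


def test_deposit_increases_balance():
    acc = Account()
    acc.deposit(100)
    # Only the public result is asserted; nothing here names _balance or _history.
    assert acc.balance() == 100
```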
**Why it's better.** The robust test exercises `deposit` and asserts on `balance()`, the only thing a caller can observe. The internal `_history` list is an implementation detail with no public promise, so the test says nothing about it. You can now refactor storage freely (record events as objects, drop history, cache the balance) and this test stays green; it fails only if a deposit *actually* stops increasing the balance. If history is genuinely part of the contract (e.g. there's a public `statement()`), test *that* public method, not the private list.
Exercise 2 — Drop the over-specified equals
Fragility source: over-specification · Language: Java · Difficulty: ★ easy
The test asserts the whole User object, including a generated id, a timestamp, and a version field that the "registration sets status ACTIVE" behavior doesn't promise.
record User(long id, String email, Status status, Instant createdAt, int version) {}

class Registration {
    User register(String email) {
        return new User(IdGen.next(), email, Status.ACTIVE, Clock.now(), 1);
    }
}

// The brittle test:
@Test
void register_brittle() {
    User u = new Registration().register("sam@x.io");
    assertThat(u).isEqualTo(
        new User(1L, "sam@x.io", Status.ACTIVE, Instant.parse("2026-06-10T00:00:00Z"), 1));
}
Acceptance criteria
- The test passes regardless of the generated id value, the current time, and the version scheme.
- It still verifies the two things registration actually promises: the email is carried through and the status is ACTIVE.
- It would survive adding a new field to User.
Hint: assert field-by-field on only the fields this behavior is responsible for; assert existence (not value) for generated ones.
Solution
@Test
void register_carriesEmailAndActivates() {
    User u = new Registration().register("sam@x.io");
    assertThat(u.email()).isEqualTo("sam@x.io");       // input echoed: a real promise
    assertThat(u.status()).isEqualTo(Status.ACTIVE);   // the behavior under test
    assertThat(u.id()).isPositive();                   // an id was assigned (value not pinned)
    // createdAt and version are incidental to THIS behavior → not asserted here.
}
Exercise 3 — Parse, don't string-match the JSON
Fragility source: output format · Language: Go · Difficulty: ★★ medium
The test compares the serialized JSON byte-for-byte, pinning key order and whitespace — both of which a serializer can change without altering the data.
type Cart struct {
	Items []Item `json:"items"`
	Total int    `json:"total"`
}

type Item struct {
	SKU string `json:"sku"`
	Qty int    `json:"qty"`
}

func Serialize(c Cart) ([]byte, error) { return json.Marshal(c) }

// The brittle test:
func TestSerialize_brittle(t *testing.T) {
	out, _ := Serialize(Cart{Items: []Item{{SKU: "A", Qty: 2}}, Total: 20})
	assert.Equal(t, `{"items":[{"sku":"A","qty":2}],"total":20}`, string(out))
}
Acceptance criteria
- The test passes regardless of key order or whitespace in the output.
- It still verifies that total and the items serialize with the right values.
- It would survive adding a new (optional) field to the JSON.
Hint: unmarshal the output into a map[string]any (or a struct) and assert on the parsed values.
Solution
func TestSerialize_includesItemsAndTotal(t *testing.T) {
	out, err := Serialize(Cart{Items: []Item{{SKU: "A", Qty: 2}}, Total: 20})
	require.NoError(t, err)

	var got map[string]any
	require.NoError(t, json.Unmarshal(out, &got))

	// encoding/json decodes JSON numbers into float64 when the target is any.
	assert.EqualValues(t, 20, got["total"])
	assert.Equal(t, []any{
		map[string]any{"sku": "A", "qty": float64(2)},
	}, got["items"])
}
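The same move sketched in Python, for comparison: two serializations that differ only in key order and whitespace are unequal as strings but equal once parsed. The serialize_cart helper and its compact flag are hypothetical stand-ins for Serialize, not part of the exercise.

```python
import json


def serialize_cart(cart, compact=True):
    """Hypothetical serializer: the two modes differ only in layout and key order."""
    if compact:
        return json.dumps(cart, separators=(",", ":"))
    return json.dumps(cart, indent=2, sort_keys=True)


cart = {"total": 20, "items": [{"sku": "A", "qty": 2}]}
compact = serialize_cart(cart, compact=True)
pretty = serialize_cart(cart, compact=False)

# A byte-for-byte comparison sees two different outputs...
strings_match = compact == pretty

# ...while the parsed values are identical, which is what the contract is about.
parsed_match = json.loads(compact) == json.loads(pretty)


def test_serialize_includes_items_and_total():
    got = json.loads(serialize_cart(cart, compact=False))
    assert got["total"] == 20
    assert got["items"] == [{"sku": "A", "qty": 2}]
```

A test written against the parsed values keeps passing whichever layout the serializer picks.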
Exercise 4 — Outcome over interaction
Fragility source: mock interactions · Language: Java · Difficulty: ★★ medium
The test verifies that the repository and mailer were called, in order — pinning the internal choreography. It would pass even if the user were saved with the wrong data, and it breaks the moment you reorder or batch the internal calls.
class Signup {
    private final UserRepo repo;
    private final Mailer mailer;

    Signup(UserRepo repo, Mailer mailer) { this.repo = repo; this.mailer = mailer; }

    void register(String email) {
        repo.save(new User(email, Status.ACTIVE));
        mailer.sendWelcome(email);
    }
}

// The brittle test:
@Test
void register_brittle() {
    UserRepo repo = mock(UserRepo.class);
    Mailer mailer = mock(Mailer.class);
    new Signup(repo, mailer).register("sam@x.io");

    InOrder o = inOrder(repo, mailer);
    o.verify(repo).save(any(User.class));       // asserts the call, not the data
    o.verify(mailer).sendWelcome("sam@x.io");   // asserts call order
    verifyNoMoreInteractions(repo, mailer);     // freezes the implementation
}
Acceptance criteria
- The test asserts on observable outcomes: the user is persisted as ACTIVE, and a welcome email exists for the address.
- It would survive reordering the two internal calls, or sending the welcome via a different mechanism that still results in a welcome email.
- It would fail if the user were saved with the wrong email or status (the brittle one wouldn't catch that).
Hint: replace the verifying mocks with simple in-memory fakes that record state, then assert on that state.
Solution
// Simple fakes: real, inspectable implementations of the seams.
class FakeUserRepo implements UserRepo {
    private final Map<String, User> byEmail = new HashMap<>();

    public void save(User u) { byEmail.put(u.email(), u); }

    Optional<User> findByEmail(String e) { return Optional.ofNullable(byEmail.get(e)); }
}

class FakeMailer implements Mailer {
    final List<String> welcomed = new ArrayList<>();

    public void sendWelcome(String email) { welcomed.add(email); }
}

@Test
void register_persistsActiveUserAndSendsWelcome() {
    FakeUserRepo repo = new FakeUserRepo();
    FakeMailer mailer = new FakeMailer();
    new Signup(repo, mailer).register("sam@x.io");

    // Outcome 1: the user is persisted, ACTIVE, with the right email.
    assertThat(repo.findByEmail("sam@x.io"))
        .get()
        .extracting(User::status)
        .isEqualTo(Status.ACTIVE);

    // Outcome 2: a welcome email exists for that address.
    assertThat(mailer.welcomed).containsExactly("sam@x.io");
}
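To see why an interaction test can miss wrong-data bugs, here is a Python sketch of the same situation using unittest.mock from the standard library. The buggy_register function, the dict-shaped user, and the two fakes are hypothetical; the bug (status saved as PENDING instead of ACTIVE) is deliberate.

```python
from unittest.mock import Mock


def buggy_register(email, repo, mailer):
    """Hypothetical signup with a deliberate bug: the user is saved as PENDING."""
    repo.save({"email": email, "status": "PENDING"})
    mailer.send_welcome(email)


class FakeUserRepo:
    def __init__(self):
        self.by_email = {}

    def save(self, user):
        self.by_email[user["email"]] = user


class FakeMailer:
    def __init__(self):
        self.welcomed = []

    def send_welcome(self, email):
        self.welcomed.append(email)


def interaction_test_passes():
    """Mirrors the brittle test: checks that calls happened, not what was saved."""
    repo, mailer = Mock(), Mock()
    buggy_register("sam@x.io", repo, mailer)
    try:
        repo.save.assert_called_once()
        mailer.send_welcome.assert_called_once_with("sam@x.io")
    except AssertionError:
        return False
    return True


def outcome_test_passes():
    """Mirrors the robust test: checks the persisted state through the fakes."""
    repo, mailer = FakeUserRepo(), FakeMailer()
    buggy_register("sam@x.io", repo, mailer)
    saved = repo.by_email.get("sam@x.io")
    return (
        saved is not None
        and saved["status"] == "ACTIVE"
        and mailer.welcomed == ["sam@x.io"]
    )
```

With the bug in place, the interaction check reports success while the outcome check reports failure, which is the gap the third acceptance criterion targets.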
Exercise 5 — Kill the order and log assertions
Fragility source: order + log text · Language: Python · Difficulty: ★★ medium
Two fragilities in one test: it pins the order of a result whose contract is "the set of notified users," and it greps the log prose, which any rewording breaks.
import logging

def notify_active(users, notifier, logger):
    notified = []
    for u in users:
        if u.active:
            notifier.send(u.id)
            notified.append(u.id)
    logger.info(f"Notified {len(notified)} active users: {notified}")
    return notified

# The brittle test:
def test_notify_brittle(caplog):
    # The root logger defaults to WARNING; without this the INFO record is never captured.
    caplog.set_level(logging.INFO)
    users = [User(1, True), User(2, False), User(3, True)]
    notifier = FakeNotifier()
    result = notify_active(users, notifier, logging.getLogger())
    assert result == [1, 3]                                    # pins order
    assert notifier.sent == [1, 3]                             # pins order
    assert "Notified 2 active users: [1, 3]" in caplog.text    # pins exact log prose
Acceptance criteria
- The test verifies the set of notified users (1 and 3), not a specific order.
- It does not assert on the log message text at all.
- It would survive changing iteration order, reformatting the log, or switching the log to structured fields.
Hint: compare as sets (or use sorted), and drop the log assertion — assert the behavior the log describes instead.
Solution
def test_notify_sends_to_active_users():
    users = [User(1, True), User(2, False), User(3, True)]
    notifier = FakeNotifier()
    result = notify_active(users, notifier, logging.getLogger())
    assert set(result) == {1, 3}          # the SET notified; order is incidental
    assert set(notifier.sent) == {1, 3}   # same, on the observable side effect
    # No log-text assertion: the count/list is already covered by `result`.
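The tests in this exercise use User and FakeNotifier without showing them. Below is a minimal sketch of what they might look like (assumed shapes, not taken from the source), together with a shuffled-input check that the set-based assertions do not depend on iteration order. notify_active is repeated from the problem statement so the block runs on its own.

```python
import logging
import random
from dataclasses import dataclass


@dataclass
class User:
    """Assumed shape: positional (id, active), matching User(1, True) in the tests."""
    id: int
    active: bool


class FakeNotifier:
    """Assumed fake: records every id passed to send()."""

    def __init__(self):
        self.sent = []

    def send(self, user_id):
        self.sent.append(user_id)


def notify_active(users, notifier, logger):
    notified = []
    for u in users:
        if u.active:
            notifier.send(u.id)
            notified.append(u.id)
    logger.info(f"Notified {len(notified)} active users: {notified}")
    return notified


def test_notify_is_order_independent():
    users = [User(1, True), User(2, False), User(3, True)]
    random.Random(0).shuffle(users)  # any input order should yield the same set
    notifier = FakeNotifier()
    result = notify_active(users, notifier, logging.getLogger())
    assert set(result) == {1, 3}
    assert set(notifier.sent) == {1, 3}
```

One trade-off to keep in mind: a set comparison also hides duplicates. If "each user is notified exactly once" is part of the contract, comparing sorted lists, as the hint suggests, keeps that guarantee while still ignoring order.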
Exercise 6 — Replace the god mock with a fake + contract test
Fragility source: white-box mocking · Language: Go · Difficulty: ★★★ hard
A service is tested with a mock Store that scripts return values and verifies the exact call sequence. The test is a transcript of the implementation: every refactor retranscribes it. Replace the god mock with a fake, and add a contract test so the fake is proven equivalent to the real store.
type Store interface {
	Get(key string) (int, bool)
	Set(key string, val int)
	Incr(key string) int
}

// Subject: a counter service that increments and returns the new value.
type Counter struct{ store Store }

func (c *Counter) Bump(key string) int { return c.store.Incr(key) }

// The brittle test: scripts and verifies the exact interaction.
func TestBump_brittle(t *testing.T) {
	store := new(MockStore)
	store.On("Get", "x").Return(0, false) // mirrors internal calls...
	store.On("Set", "x", 1).Return()      // ...that Bump might not even make
	store.On("Incr", "x").Return(1)

	c := &Counter{store: store}
	got := c.Bump("x")

	assert.Equal(t, 1, got)
	store.AssertExpectations(t) // fails if internal call pattern changes
}
Acceptance criteria
- TestBump asserts on the outcome (the returned counter value, and the stored state), not on which Store methods were called.
- A FakeStore (real in-memory implementation) replaces the mock.
- A reusable StoreContract(t, newStore) runs against both FakeStore and the real implementation, proving they agree, so fake-based tests aren't lying.
Hint: write the fake as a plain map-backed Store; write the contract test once and parameterize it over a func() Store factory.
Solution
// 1. A real, inspectable fake.
type FakeStore struct{ m map[string]int }

func NewFakeStore() *FakeStore { return &FakeStore{m: map[string]int{}} }

func (f *FakeStore) Get(k string) (int, bool) { v, ok := f.m[k]; return v, ok }
func (f *FakeStore) Set(k string, v int)      { f.m[k] = v }
func (f *FakeStore) Incr(k string) int        { f.m[k]++; return f.m[k] }

// 2. The subject test: asserts OUTCOME, not interaction.
func TestBump_returnsIncrementedValue(t *testing.T) {
	store := NewFakeStore()
	c := &Counter{store: store}

	assert.Equal(t, 1, c.Bump("x")) // first bump → 1
	assert.Equal(t, 2, c.Bump("x")) // second → 2

	got, ok := store.Get("x") // observable state in the store
	assert.True(t, ok)
	assert.Equal(t, 2, got)
}

// 3. The contract test: run against EVERY Store implementation.
func StoreContract(t *testing.T, newStore func() Store) {
	t.Run("incr from absent starts at 1", func(t *testing.T) {
		s := newStore()
		assert.Equal(t, 1, s.Incr("k"))
	})
	t.Run("incr accumulates", func(t *testing.T) {
		s := newStore()
		s.Incr("k")
		assert.Equal(t, 2, s.Incr("k"))
	})
	t.Run("set then get round-trips", func(t *testing.T) {
		s := newStore()
		s.Set("k", 42)
		v, ok := s.Get("k")
		assert.True(t, ok)
		assert.Equal(t, 42, v)
	})
}

func TestFakeStore_satisfiesContract(t *testing.T) {
	StoreContract(t, func() Store { return NewFakeStore() })
}

func TestRedisStore_satisfiesContract(t *testing.T) {
	StoreContract(t, func() Store { return NewRedisStore(testClient) })
}
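The fake-plus-contract shape is not specific to Go. The sketch below shows the same idea in Python under stated assumptions: FakeStore mirrors the Go fake, and CountingStore is a hypothetical second implementation standing in for a real backend, since no production store is available here. A real suite would pass a factory for the production store instead.

```python
class FakeStore:
    """In-memory fake, mirroring the Go FakeStore."""

    def __init__(self):
        self._m = {}

    def get(self, key):
        # Returns (value, present), like the Go Get.
        return self._m.get(key, 0), key in self._m

    def set(self, key, val):
        self._m[key] = val

    def incr(self, key):
        self._m[key] = self._m.get(key, 0) + 1
        return self._m[key]


class CountingStore(FakeStore):
    """Hypothetical second implementation; stands in for a real backend."""

    def __init__(self):
        super().__init__()
        self.ops = 0

    def incr(self, key):
        self.ops += 1
        return super().incr(key)


def store_contract(new_store):
    """Checks every implementation must pass; new_store is a zero-argument factory."""
    # incr from absent starts at 1
    assert new_store().incr("k") == 1

    # incr accumulates
    s = new_store()
    s.incr("k")
    assert s.incr("k") == 2

    # set then get round-trips
    s = new_store()
    s.set("k", 42)
    assert s.get("k") == (42, True)


# The same contract runs against each implementation via its factory.
for factory in (FakeStore, CountingStore):
    store_contract(factory)
```

Each check builds a fresh store through the factory, so the contract assumes the factory hands back an empty store; for a shared backend that would mean clearing or namespacing keys between checks.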
Summary — the moves you practiced
Across these six exercises, the same handful of transformations turn fragile into robust:
- Drive through the public API; never read private state (Ex. 1). The internal data structure is not the contract.
- Assert the minimum the behavior promises (Ex. 2). Drop incidental fields — generated ids, timestamps, versions — and assert existence not value for generated ones.
- Parse, don't string-match serialized output (Ex. 3). Pin values, not byte layout, unless the wire format genuinely is the contract.
- Assert outcomes, not interactions (Ex. 4). Replace verifying mocks with fakes that record state; this also catches wrong-data bugs the interaction test missed.
- Compare sets when order isn't the contract, and don't assert on log prose (Ex. 5). Assert the behavior the log describes, through the return value or structured fields.
- Replace god mocks with a fake + a contract test (Ex. 6). The contract test proves the fake matches the real implementation, so fast fake-based tests stay trustworthy and refactor-resilient.
The unifying check, applied to every solution: would this test survive a behavior-preserving refactor, and would it still fail if the behavior actually broke? A robust test answers yes to both.
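The two-sided check can itself be sketched in code. Assuming a test that deposits 100 and asserts on balance(), as Exercise 1 calls for, the snippet below runs it against a behavior-preserving refactor and against a genuinely broken implementation. EventSourcedAccount, BrokenAccount, and deposit_test_passes are hypothetical names introduced for illustration.

```python
class EventSourcedAccount:
    """Behavior-preserving refactor: no _balance field; the balance is derived from events."""

    def __init__(self):
        self._events = []

    def deposit(self, amount):
        self._events.append(amount)

    def balance(self):
        return sum(self._events)


class BrokenAccount:
    """Behavior change: deposits are silently dropped."""

    def deposit(self, amount):
        pass

    def balance(self):
        return 0


def deposit_test_passes(account_cls):
    """A public-API deposit test, parameterized over the implementation."""
    acc = account_cls()
    acc.deposit(100)
    return acc.balance() == 100


# Question 1: does the test survive a behavior-preserving refactor?
survives_refactor = deposit_test_passes(EventSourcedAccount)

# Question 2: does it still fail when the behavior actually breaks?
catches_breakage = not deposit_test_passes(BrokenAccount)
```

A test that reads _balance directly would fail the first question here, since the refactored class no longer has that field.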
Related Topics
- junior.md: what a fragile test looks like and the three first habits.
- middle.md: the four creep patterns and the contract-vs-implementation rule.
- find-bug.md: spot the coupling in brittle test snippets.
- optimize.md: refactor a whole over-specified test file end-to-end.
- Over-Mocking: exercises on mock-induced fragility.
- The mocking-strategies, unit-testing-patterns, and test-data-management skills: fakes vs mocks, contract tests, builders.