Over-Mocking: Refactoring Practice
Category: Testing Anti-Patterns → Over-Mocking — mocking so much that the test verifies the mocks, not the behavior.
This file is not "spot the smell" — find-bug.md does that. Here you take a brittle, mock-everything test that already passes and refactor it into a test that verifies behavior: fakes for stateful collaborators, outcome/state assertions, interaction checks only at the genuine boundaries — plus a contract test so the boundary the fake hides is verified against reality. The skill on display is the transformation and the trade-off reasoning, not just the destination.
The discipline:
- Name what each collaborator is: value object, stateful port, boundary you own, third-party, side-effect-only port. The treatment follows from the kind (the middle.md decision table).
- Replace mocks with the least powerful double that works: real object > fake > stub > mock. Pull the assertion toward observable state.
- Back the remaining boundary doubles with a higher-fidelity test — an integration or contract test — so the fast test's speed doesn't cost you fidelity.
- Prove the new test is stronger: introduce a real production bug and confirm the new test reds where the old one stayed green.
Each solution weighs the classicist/mockist trade-off explicitly, because right-sizing doubles is a judgment call, not a rule you can apply blind.
How to use this file: read the "Before" test, plan your move sequence yourself before expanding the solution, then compare. The gap between your plan and the worked one is where the learning is.
Table of Contents
| # | Exercise | Move | Lang |
|---|---|---|---|
| 1 | From verify-the-save to assert-the-state | Mock → fake + state | Python |
| 2 | Un-mock the value object | Mock → real object | Java |
| 3 | Wrap the third party, then contract-test it | Mock SDK → port + fake + integration | Go |
| 4 | Right-size a mock-everything service test | Whole-suite right-sizing | Java |
| 5 | Make the boundary honest with a consumer-driven contract | Mock → CDC | (process + pseudocode) |
Exercise 1: From verify-the-save to assert-the-state
Move: mock → fake + state assertion. Goal: a test that fails when the deposit math is wrong.
Before: mock-everything, green over a real bug
from unittest.mock import MagicMock

def test_deposit():
    repo = MagicMock()
    account = MagicMock()
    account.balance = 100
    repo.get.return_value = account

    Wallet(repo).deposit("acc-1", 50)

    repo.get.assert_called_once_with("acc-1")
    repo.save.assert_called_once_with(account)  # only checks calls
This passes even if deposit does account.balance += 0. It asserts the conversation (get, save), never the result (balance == 150).
Refactored solution
**Plan:** (1) replace the `MagicMock` repo with an in-memory **fake** that holds real state; (2) drop the `assert_called` interaction checks; (3) assert on the balance read back from the fake; (4) prove it by breaking the production line.

class FakeAccountRepo:
    def __init__(self, accounts):
        self._a = {a.id: a for a in accounts}

    def get(self, account_id):
        return self._a[account_id]

    def save(self, account):
        self._a[account.id] = account

def test_deposit_increases_balance():
    repo = FakeAccountRepo([Account(id="acc-1", balance=100)])
    new_balance = Wallet(repo).deposit("acc-1", 50)
    assert new_balance == 150                # outcome
    assert repo.get("acc-1").balance == 150  # state, read back from the fake
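Step (4) of the plan can be sketched as a self-contained script. `Account`, `Wallet`, and `BuggyWallet` below are hypothetical stand-ins (the real production classes are not shown in this file); the sketch runs the old mock-only assertions and the new state assertions against both a correct and a deliberately broken wallet.

```python
from dataclasses import dataclass
from unittest.mock import MagicMock


@dataclass
class Account:
    id: str
    balance: int


class FakeAccountRepo:
    """In-memory fake with the same shape as the one in the solution."""

    def __init__(self, accounts):
        self._accounts = {a.id: a for a in accounts}

    def get(self, account_id):
        return self._accounts[account_id]

    def save(self, account):
        self._accounts[account.id] = account


class Wallet:
    """Hypothetical production code (assumed, not taken from the source)."""

    def __init__(self, repo):
        self._repo = repo

    def deposit(self, account_id, amount):
        account = self._repo.get(account_id)
        account.balance += amount
        self._repo.save(account)
        return account.balance


class BuggyWallet(Wallet):
    def deposit(self, account_id, amount):
        account = self._repo.get(account_id)
        account.balance += 0  # injected bug: the amount is dropped
        self._repo.save(account)
        return account.balance


def old_mock_test(wallet_cls):
    """The 'Before' assertions: interactions only."""
    repo = MagicMock()
    account = MagicMock()
    account.balance = 100
    repo.get.return_value = account
    wallet_cls(repo).deposit("acc-1", 50)
    repo.get.assert_called_once_with("acc-1")
    repo.save.assert_called_once_with(account)


def new_state_test(wallet_cls):
    """The refactored assertions: outcome and state."""
    repo = FakeAccountRepo([Account(id="acc-1", balance=100)])
    new_balance = wallet_cls(repo).deposit("acc-1", 50)
    assert new_balance == 150
    assert repo.get("acc-1").balance == 150


def passes(test, wallet_cls):
    """Run a test function and report whether it stayed green."""
    try:
        test(wallet_cls)
    except AssertionError:
        return False
    return True


print("old test, buggy wallet:", passes(old_mock_test, BuggyWallet))
print("new test, buggy wallet:", passes(new_state_test, BuggyWallet))
```

One limitation of this particular fake: it hands out the stored object itself, so a forgotten `save` would still go unnoticed. A fake that stores and returns copies would catch that as well.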
Exercise 2: Un-mock the value object
Move: mock → real object. Goal: test the actual computation, not a scripted return.
Before: mocks the Money value object
@Test void invoiceTotal_mocked() {
    Money line1 = mock(Money.class);
    Money line2 = mock(Money.class);
    Money partial = mock(Money.class);
    when(line1.add(line2)).thenReturn(partial);
    when(partial.amount()).thenReturn(new BigDecimal("30"));

    Invoice inv = new Invoice(List.of(line1, line2));

    assertThat(inv.total().amount()).isEqualByComparingTo("30"); // tests the stub
}
The "total" is whatever partial.amount() was stubbed to return. The real summation never runs.
Refactored solution
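As a runnable illustration of this move (real value objects, real arithmetic, an assertion on the real total), here is a Python analogue of the Java exercise. `Money` and `Invoice` are hypothetical minimal stand-ins written for this sketch; the production classes may differ.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Money:
    """Hypothetical value object: immutable, compared by value."""

    amount: Decimal

    def add(self, other: "Money") -> "Money":
        return Money(self.amount + other.amount)


class Invoice:
    """Hypothetical invoice that sums its line amounts."""

    def __init__(self, lines):
        self._lines = list(lines)

    def total(self) -> Money:
        result = Money(Decimal("0"))
        for line in self._lines:
            result = result.add(line)
        return result


def test_invoice_total_sums_real_money():
    # No doubles at all: the real add() and the real total() both run.
    invoice = Invoice([Money(Decimal("10")), Money(Decimal("20"))])
    assert invoice.total().amount == Decimal("30")


test_invoice_total_sums_real_money()
```

Because nothing is scripted, changing `add` to subtract or skipping a line in `total` changes the computed sum and the assertion fails.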
**Plan:** (1) `Money` is a value object, so delete every mock of it; (2) construct real `Money` values; (3) assert on the real `total()`; (4) prove it by breaking `Money.add` or `Invoice.total`.

**Proof it's stronger.** Inject a bug into `Invoice.total()` (skip the last line) or into `Money.add` (subtract instead of add). The old test can't notice (it scripted `30`); the new test fails because the real sum is now wrong.

**Trade-off note.** There is *no* trade-off here: mocking a value object is never the right call. It has no boundary to isolate and no I/O to avoid; the arithmetic *is* the unit under test. Both schools (classicist and disciplined mockist) agree: **don't mock value objects.** If mocking `Money` felt necessary, that was a signal the test was reaching for `mock()` reflexively rather than asking "is this a boundary?"

Exercise 3: Wrap the third party, then contract-test it
Move: mock the SDK → own a port, fake it in unit tests, integration-test the adapter.
Before: mocks the third-party S3 client directly
// Production code touches the AWS SDK type directly.
type ReportArchiver struct{ s3 *awss3.Client }

func (a *ReportArchiver) Archive(name string, data []byte) (string, error) {
	_, err := a.s3.PutObject(context.TODO(), &awss3.PutObjectInput{
		Bucket: aws.String("reports"), Key: aws.String(name), Body: bytes.NewReader(data),
	})
	if err != nil {
		return "", err
	}
	return "s3://reports/" + name, nil
}

// Test mocks the AWS SDK, a type we don't own.
func TestArchive_MockingS3(t *testing.T) {
	m := new(MockS3Client)
	m.On("PutObject", mock.Anything, mock.Anything).Return(&awss3.PutObjectOutput{}, nil)
	// ...assert PutObject was called with some input...
}
The test freezes a guess about the AWS SDK and never proves the real upload works.
Refactored solution
**Plan:** (1) define a narrow port `BlobStore` in *our* terms; (2) move the SDK call into an adapter that's the only place importing the AWS SDK; (3) unit-test `ReportArchiver` against a **fake** `BlobStore` and assert behavior (the returned URL, the stored bytes); (4) write one **integration test** for the adapter against real S3 / LocalStack.

// 1. Port we own.
type BlobStore interface {
	Put(ctx context.Context, key string, data []byte) error
}

// 2. Adapter: the ONLY file importing the AWS SDK.
type s3BlobStore struct {
	client *awss3.Client
	bucket string
}

func (s *s3BlobStore) Put(ctx context.Context, key string, data []byte) error {
	_, err := s.client.PutObject(ctx, &awss3.PutObjectInput{
		Bucket: aws.String(s.bucket), Key: aws.String(key), Body: bytes.NewReader(data),
	})
	if err != nil {
		return fmt.Errorf("s3 put %q: %w", key, err)
	}
	return nil
}

// Production code now depends on the port.
type ReportArchiver struct{ store BlobStore }

func (a *ReportArchiver) Archive(ctx context.Context, name string, data []byte) (string, error) {
	if err := a.store.Put(ctx, name, data); err != nil {
		return "", err
	}
	return "s3://reports/" + name, nil
}

// 3. Unit test: a recording FAKE of our port. Fast, honest, asserts behavior.
type fakeBlobStore struct{ puts map[string][]byte }

func newFakeBlobStore() *fakeBlobStore { return &fakeBlobStore{puts: map[string][]byte{}} }

func (f *fakeBlobStore) Put(_ context.Context, key string, data []byte) error {
	f.puts[key] = data
	return nil
}

func TestArchive_ReturnsURLAndStoresBytes(t *testing.T) {
	store := newFakeBlobStore()
	arch := &ReportArchiver{store: store}

	url, err := arch.Archive(context.Background(), "q1.pdf", []byte("PDFDATA"))

	require.NoError(t, err)
	require.Equal(t, "s3://reports/q1.pdf", url)              // behavior: URL shape
	require.Equal(t, []byte("PDFDATA"), store.puts["q1.pdf"]) // behavior: stored content
}

// 4. Integration test: the adapter against real S3 / LocalStack. Verifies the seam.
// It lives in its own file; the build constraint must come before the package clause.
// localStackClient and getObject are test helpers not shown here.
//go:build integration
func TestS3BlobStore_Put(t *testing.T) {
	store := &s3BlobStore{client: localStackClient(t), bucket: "reports"}
	require.NoError(t, store.Put(context.Background(), "k", []byte("x")))
	require.Equal(t, []byte("x"), getObject(t, "reports", "k")) // real round-trip
}
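A fake earns trust when the same behavioral checks run against both the fake and the real adapter. The following Python sketch shows one way to do that; it is an analogue of the Go port, not a translation. The `get` method is an assumption added so the contract can read data back (the Go `BlobStore` only has `Put`), and `DirectoryBlobStore` is a hypothetical file-backed stand-in for a real adapter such as the S3 one.

```python
import tempfile
from pathlib import Path


class InMemoryBlobStore:
    """The fake used by fast unit tests."""

    def __init__(self):
        self._blobs = {}

    def put(self, key, data):
        self._blobs[key] = bytes(data)

    def get(self, key):
        return self._blobs[key]


class DirectoryBlobStore:
    """Stand-in for a real adapter: one file per key under a root directory."""

    def __init__(self, root):
        self._root = Path(root)

    def put(self, key, data):
        (self._root / key).write_bytes(data)

    def get(self, key):
        return (self._root / key).read_bytes()


def check_blob_store_contract(store):
    """Behavior every implementation of the port is expected to share."""
    store.put("q1.pdf", b"PDFDATA")
    assert store.get("q1.pdf") == b"PDFDATA"  # round-trip

    store.put("q1.pdf", b"V2")
    assert store.get("q1.pdf") == b"V2"  # a second put overwrites


# Same contract, two implementations.
check_blob_store_contract(InMemoryBlobStore())
with tempfile.TemporaryDirectory() as root:
    check_blob_store_contract(DirectoryBlobStore(root))
```

If the fake and the adapter ever disagree on a case the contract covers, one of the two runs fails, which is the signal that the fast tests have drifted from reality.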
Exercise 4: Right-size a mock-everything service test
Move: whole-suite right-sizing — pick the correct double per collaborator.
Before: five mocks, asserts only interactions
@Test void checkout_mocked() {
    OrderRepository repo = mock(OrderRepository.class);
    PricingService pricing = mock(PricingService.class);
    Clock clock = mock(Clock.class);
    PaymentGateway payment = mock(PaymentGateway.class);
    EmailSender emailer = mock(EmailSender.class);
    when(pricing.total(any())).thenReturn(new BigDecimal("20"));
    when(clock.instant()).thenReturn(Instant.parse("2026-01-01T12:00:00Z"));

    new CheckoutService(repo, pricing, clock, payment, emailer)
        .checkout(new Cart(List.of(new Item("widget", 10, 2)), "tok", "a@b.com"));

    verify(payment).charge(any());
    verify(repo).save(any());
    verify(emailer).send(any()); // all interaction, no outcome
}
Five mocks, including pricing (pure logic) and repo (stateful). Every assertion is verify(...any()) — wrong totals, wrong recipients, wrong persisted state all pass.
Refactored solution
**Plan: classify each collaborator and assign the right double:**

| Collaborator | Was | Should be | Why |
|---|---|---|---|
| `pricing` | mock | **real** | pure logic, so exercise it |
| `repo` | mock | **fake** | stateful, so assert persisted state |
| `clock` | mock | **stub** (fixed) | non-determinism only |
| `payment` | mock | **mock + args** | boundary command, no local state |
| `emailer` | mock | **mock + args** (or recording fake) | side-effect-only port |

@Test void checkout_charges_persists_and_emails_with_correct_data() {
    var repo = new InMemoryOrderRepository();
    var pricing = new RealPricingService(); // real pure logic
    var clock = Clock.fixed(Instant.parse("2026-01-01T12:00:00Z"), UTC);
    var payment = mock(PaymentGateway.class); // boundary
    var emailer = mock(EmailSender.class); // side-effect-only port
    var svc = new CheckoutService(repo, pricing, clock, payment, emailer);
    var cart = new Cart(List.of(new Item("widget", 10, 2)), "tok", "a@b.com");

    Order order = svc.checkout(cart);

    // outcomes / state
    assertThat(order.total()).isEqualByComparingTo("20");
    assertThat(repo.findById(order.id()).placedAt())
        .isEqualTo(Instant.parse("2026-01-01T12:00:00Z"));
    // boundary command: pin the ARGUMENTS, not mere occurrence
    verify(payment).charge(argThat(c -> c.amount().compareTo(new BigDecimal("20")) == 0
        && c.token().equals("tok")));
    // side-effect-only port: pin recipient
    verify(emailer).send(argThat(e -> e.to().equals("a@b.com")));
}
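The plan lists a recording fake as an alternative to a mock for `emailer`. A minimal Python sketch of that option follows; `Email`, `RecordingEmailSender`, and `send_order_confirmation` are hypothetical names invented for this illustration, not part of the checkout code above.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Email:
    to: str
    subject: str


@dataclass
class RecordingEmailSender:
    """Recording fake: keeps what was sent so the test can assert on it."""

    sent: list = field(default_factory=list)

    def send(self, email):
        self.sent.append(email)


def send_order_confirmation(emailer, recipient, order_id):
    """Hypothetical production code under test."""
    emailer.send(Email(to=recipient, subject=f"Order {order_id} confirmed"))


def test_confirmation_goes_to_the_customer():
    emailer = RecordingEmailSender()

    send_order_confirmation(emailer, "a@b.com", "o1")

    # Asserts the recorded effect as plain data: exactly one email, to the right address.
    assert emailer.sent == [Email(to="a@b.com", subject="Order o1 confirmed")]


test_confirmation_goes_to_the_customer()
```

Compared with `verify(...argThat(...))`, the assertion reads as a state check, and it also fails if a second, unexpected email is sent.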
Exercise 5: Make the boundary honest with a consumer-driven contract
Move: a lone mock of an external service → a consumer-driven contract the provider verifies.
Before: a mock of another team's API, standing alone
# Your service calls the Inventory team's HTTP API.
def test_reserve_stock():
    inventory_api = MagicMock()
    inventory_api.reserve.return_value = {"status": "reserved", "ref": "R1"}

    result = FulfillmentService(inventory_api).fulfill(order_id="o1", sku="s1", qty=2)

    assert result.reservation_ref == "R1"
This encodes your guess about the Inventory API's response shape (status, ref). Nothing checks that the real service still returns that shape. When Inventory renames ref → reservation_id, your suite stays green and production breaks.
Refactored solution
**Plan:** keep the fast test, but turn the mock's assumptions into a **consumer-driven contract** (Pact-style) that the *provider* verifies against their real service in *their* pipeline. Now the mock can't silently drift.

# Consumer side: define the interaction as a contract, run the test against
# Pact's local mock server (which RECORDS the contract), then publish it.
def test_reserve_stock_contract(pact):
    (pact
        .given("sku s1 has stock")
        .upon_receiving("a reserve request for 2 units of s1")
        .with_request("POST", "/reservations", body={"sku": "s1", "qty": 2})
        .will_respond_with(200, body={"status": "reserved", "ref": Like("R1")}))

    with pact:
        client = InventoryClient(pact.uri)  # points at the mock server
        result = FulfillmentService(client).fulfill(order_id="o1", sku="s1", qty=2)
        assert result.reservation_ref == "R1"
    # On success, Pact records the contract (the request/response shape) in a pact file,
    # which is then published to the broker.
Provider side (Inventory team's CI, separate repo):
- Pull every consumer contract from the broker.
- Replay each recorded request against the REAL Inventory service.
- Assert the real responses satisfy the consumers' expectations.
- If they rename `ref` -> `reservation_id`, the provider build FAILS here,
telling them — before deploy — that they'd break Fulfillment.
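What the provider-side steps amount to can be sketched in plain Python. This is not Pact's API: `Like`, `satisfies`, `verify`, and the two provider functions are toy stand-ins invented for the sketch, and the assumption that extra response fields are tolerated is a choice made here for illustration.

```python
class Like:
    """Toy matcher: accept any value of the same type as the example."""

    def __init__(self, example):
        self.example = example


def satisfies(expected, actual):
    """Check a real response against the consumer's expectation."""
    if isinstance(expected, Like):
        return type(actual) is type(expected.example)
    if isinstance(expected, dict):
        # Every expected key must be present; extra keys are tolerated.
        return isinstance(actual, dict) and all(
            key in actual and satisfies(value, actual[key])
            for key, value in expected.items()
        )
    return expected == actual


CONTRACT = {
    "request": {"method": "POST", "path": "/reservations",
                "body": {"sku": "s1", "qty": 2}},
    "response": {"status": 200,
                 "body": {"status": "reserved", "ref": Like("R1")}},
}


def current_provider(method, path, body):
    """Stand-in for the real Inventory service as it behaves today."""
    return 200, {"status": "reserved", "ref": "R-7731"}


def renamed_provider(method, path, body):
    """Stand-in for the service after renaming ref to reservation_id."""
    return 200, {"status": "reserved", "reservation_id": "R-7731"}


def verify(contract, provider):
    """Replay the recorded request and compare the real response."""
    request, expected = contract["request"], contract["response"]
    status, body = provider(request["method"], request["path"], request["body"])
    return status == expected["status"] and satisfies(expected["body"], body)


print("current provider ok:", verify(CONTRACT, current_provider))
print("renamed provider ok:", verify(CONTRACT, renamed_provider))
```

The renamed provider fails verification even though the consumer's own fast test would stay green, which is the drift the contract exists to catch.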
Summary: the refactoring playbook
Across all five, the transformation followed the same shape:
- Classify the collaborator (value object / pure logic / stateful port / owned boundary / third-party / side-effect-only port). The kind dictates the double.
- Downgrade to the least powerful double that lets you assert behavior: real > fake > stub > mock. Most mocks become fakes or disappear.
- Pull the assertion onto observable behavior (return value, fake state, persisted row), and where the effect leaves the system, verify the interaction with its arguments, never bare any().
- Back each remaining boundary double with higher fidelity proportional to the cost of it being wrong: nothing for values/logic, an in-process contract for owned adapters, an integration test for the real seam, a consumer-driven contract for external services.
- Prove the new test is stronger by injecting a real production bug and watching the new test red where the old one stayed green. If breaking the logic doesn't break a test, the test was never testing the logic.
The trade-off you weigh every time is isolation vs fidelity — the classicist/mockist axis. The professional answer is never global ("always mock" / "never mock") but per-collaborator: assert state where there's state, verify interactions only where the interaction is the whole point, and never let a boundary double stand without something honest behind it.
Related Topics
- junior.md / middle.md / senior.md / professional.md: the level files.
- find-bug.md: spot the over-mocking that these exercises refactor away.
- tasks.md: smaller, focused versions of these moves.
- Fragile Tests: refactoring the closely related anti-pattern.
- Design → Coupling and State — when the refactor must reach into production design.
- The mocking-strategies, dependency-injection, and integration-testing skills.