Designing for Testability: Middle Level

Focus: "Why?" and "When does it bend?" — testability as a proxy for good design, the trade-offs of injection, seams, pure cores, and the humble object, and the cases where chasing testability damages the production design instead of improving it.

Table of Contents

Testability as a proxy, and where the proxy lies
Test-induced design damage: the counter-force
How to inject: constructor vs container vs manual wiring
Seams without over-abstracting
Pure core, imperative shell: minimizing the mocking surface
The Humble Object boundary
Sociable vs solitary tests, and the design that enables each
Don't mock what you don't own: when integration tests win
Designing for observability
Common Mistakes
Test Yourself
Cheat Sheet
Summary
Further Reading
Related Topics

Testability as a proxy, and where the proxy lies

The central claim of this chapter: hard-to-test code is usually badly designed code. When a unit resists testing, the test is reporting a structural problem — hidden dependencies, tight coupling to globals, or logic fused to a side effect. The pain you feel writing the test is the same pain the next maintainer will feel changing the code.

The friction maps almost one-to-one onto design defects:

Test pain	The design defect it reveals
"I can't construct this object without a live database."	Construction does real work; collaborators are hidden, not injected.
"The test passes alone but fails in the suite."	Shared global mutable state.
"I have to mock six things to test one behavior."	The unit reaches across too many boundaries; low cohesion.
"The result is only visible as a log line."	No observable return value or event — the method is a black hole.
"It returns a different answer every run."	Un-injected `time.Now()`, randomness, or I/O.

So far, so clean. But the word that matters in "usually badly designed" is usually. Testability is a proxy metric, and every proxy metric can be gamed or misread. Optimizing the proxy instead of the goal is where teams go wrong — which is the entire next section.

The honest framing: testability is a symptom reader, not a design goal. Good decoupling makes code testable. The reverse — contorting code to satisfy a test harness — does not reliably make it well designed, and sometimes makes it worse.
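To make the last row of the table above concrete, here is a minimal Python sketch. The function names and the 30-minute limit are illustrative assumptions, not taken from any real codebase. The first version reads the system clock itself; the second receives the current time as a parameter, so a test can pin it.

```python
from datetime import datetime, timedelta, timezone

SESSION_LIMIT = timedelta(minutes=30)  # hypothetical limit, for illustration only


def is_expired_hidden_clock(started_at: datetime) -> bool:
    # Hard to test: the answer depends on when the test happens to run.
    return datetime.now(timezone.utc) - started_at > SESSION_LIMIT


def is_expired(started_at: datetime, now: datetime) -> bool:
    # Testable: the caller supplies "now", so the result is deterministic.
    return now - started_at > SESSION_LIMIT


start = datetime(2026, 1, 1, 12, 0, tzinfo=timezone.utc)
assert is_expired(start, start + timedelta(minutes=31))
assert not is_expired(start, start + timedelta(minutes=30))
```

The injected version is also the better production shape: the caller decides what "now" means, which helps with replays and batch jobs as well as tests.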

Test-induced design damage: the counter-force

David Heinemeier Hansson (DHH) coined test-induced design damage: changes made to production code purely to enable a particular testing style, which degrade the design that would otherwise be natural. This is the necessary counterweight to "make everything testable."

Symptoms of damage:

An interface with exactly one implementation, created so a test can supply a mock — adding indirection nobody else needs.
Public methods that exist only so a test can reach internal state — breaking encapsulation to assert on a private intermediate.
Constructor parameters multiplying because every collaborator, however trivial, is injected "for testability," turning a 2-argument object into an 8-argument one.
Logic spread thin across many tiny classes so each is "unit testable in isolation," when one cohesive class would have been clearer and a single sociable test would have covered it.

# Damage: a one-impl interface + injected formatter purely to mock it in a test.
class Greeter:
    def __init__(self, name_formatter: "NameFormatter", clock: "Clock", repo: "UserRepo"):
        self._fmt = name_formatter
        self._clock = clock
        self._repo = repo

    def greeting(self, user_id: str) -> str:
        user = self._repo.get(user_id)
        return f"Hello, {self._fmt.format(user.name)}"  # formatting is pure string work

# Healthier: keep pure formatting inline; inject only the real boundary (the repo).
class Greeter:
    def __init__(self, repo: "UserRepo"):  # quoted, like the annotations above, so the snippet loads without UserRepo defined
        self._repo = repo

    def greeting(self, user_id: str) -> str:
        user = self._repo.get(user_id)
        return f"Hello, {user.name.title()}"  # test by calling greeting() with a fake repo

The discriminating question: would I keep this seam if no test existed? If the abstraction earns its place by enabling substitution that production genuinely needs (a second payment provider, a swappable storage backend), keep it. If its only consumer is the test, you are paying production complexity to subsidize a test — and a better-shaped test (a fake, a sociable test, an integration test) would not demand it.

The resolution between "design for testability" and "don't damage design for tests" is not a contradiction. Both say the same thing: let good design make code testable; do not let test mechanics dictate bad design. When they conflict, the test is usually asking for the wrong kind of isolation.
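A test for the healthier Greeter might look like the sketch below. It assumes a hypothetical `User` record and a hand-written `InMemoryUserRepo`; neither name comes from the original, and the Greeter is repeated only so the block runs on its own.

```python
from dataclasses import dataclass


@dataclass
class User:
    name: str


class InMemoryUserRepo:
    """Hand-written fake: same get() shape as the real repo, backed by a dict."""

    def __init__(self, users: dict[str, User]):
        self._users = users

    def get(self, user_id: str) -> User:
        return self._users[user_id]


class Greeter:
    def __init__(self, repo):
        self._repo = repo

    def greeting(self, user_id: str) -> str:
        user = self._repo.get(user_id)
        return f"Hello, {user.name.title()}"


def test_greeting_formats_name():
    greeter = Greeter(InMemoryUserRepo({"u1": User(name="ada lovelace")}))
    assert greeter.greeting("u1") == "Hello, Ada Lovelace"


test_greeting_formats_name()
```

The formatting is exercised through `greeting()` itself, so no formatter interface is needed to cover it.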

How to inject: constructor vs container vs manual wiring

Dependency injection is the load-bearing technique for testability — it converts hidden collaborators into substitutable parameters. But how you inject has trade-offs.

Constructor injection (the default)

Dependencies are mandatory parameters of the constructor. The object is impossible to build in an invalid state, dependencies are explicit, and tests pass fakes directly with zero framework.

type OrderService struct {
    repo    OrderRepo
    payment PaymentGateway
    clock   func() time.Time // inject time as a function — no global time.Now()
}

func NewOrderService(repo OrderRepo, payment PaymentGateway, clock func() time.Time) *OrderService {
    return &OrderService{repo: repo, payment: payment, clock: clock}
}

// Test: wire fakes by hand. No container, no magic.
svc := NewOrderService(
    &FakeOrderRepo{},
    &StubPaymentGateway{approve: true},
    func() time.Time { return time.Date(2026, 1, 1, 0, 0, 0, 0, time.UTC) },
)

When it bends: a constructor with 8+ parameters is a smell — but the smell is low cohesion, not "too much DI." The fix is to split the class, not to hide the dependencies in a container.

Framework container (Spring, Guice, Dagger, dependency-injector)

A container resolves and wires the graph at startup from configuration or annotations.

@Service
class OrderService {
    private final OrderRepo repo;
    private final PaymentGateway payment;
    private final Clock clock;

    OrderService(OrderRepo repo, PaymentGateway payment, Clock clock) {
        this.repo = repo; this.payment = payment; this.clock = clock;
    }
}

Trade-off: containers shine for wiring large graphs and managing lifecycles/scopes. But note the class above is still constructor-injected — the container only chooses arguments. The anti-pattern is field/setter injection driven by the container (@Autowired on a private field), which makes the object un-testable without the container and hides dependencies. Prefer constructor injection even when a container is present; then unit tests need no container at all.

Manual wiring (composition root)

A single place — main, a factory, an app bootstrap — constructs the whole graph by hand.

Trade-off: manual wiring is maximally explicit and needs no framework. It scales poorly for very large graphs, but it is ideal up to mid-size apps and is the most test-friendly option. Many teams use manual wiring in tests and a container in production, both served by the same constructor-injected classes.

Rule: the class should be unit-testable with hand-wired fakes regardless of how production wires it. If your class only works under the container, the container has become a hidden dependency.
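A composition root can be as small as one function. The sketch below is a Python illustration under stated assumptions: `build_app`, `InMemoryOrderRepo`, and the shape of `OrderService` are hypothetical, and a real application would construct a database-backed repository where the in-memory one appears.

```python
from datetime import datetime, timezone
from typing import Callable


class InMemoryOrderRepo:
    def __init__(self):
        self.saved = {}

    def save(self, order_id: str, placed_at: datetime) -> None:
        self.saved[order_id] = placed_at


class OrderService:
    # The constructor only assigns; every collaborator arrives from outside.
    def __init__(self, repo, clock: Callable[[], datetime]):
        self._repo = repo
        self._clock = clock

    def place(self, order_id: str) -> None:
        self._repo.save(order_id, self._clock())


def build_app() -> OrderService:
    # Composition root: the one place that knows the concrete types.
    # A real app would build a database-backed repo here instead.
    return OrderService(InMemoryOrderRepo(), lambda: datetime.now(timezone.utc))


# A test wires the same class by hand, with a fixed clock.
fixed = datetime(2026, 1, 1, tzinfo=timezone.utc)
repo = InMemoryOrderRepo()
OrderService(repo, lambda: fixed).place("o-1")
assert repo.saved == {"o-1": fixed}
```

Because `OrderService` never refers to `build_app`, swapping the root for a container later would not touch the class or its tests.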

Seams without over-abstracting

A seam (Michael Feathers, Working Effectively with Legacy Code) is a place where you can alter behavior without editing the code at that point — typically by substituting a collaborator. Seams are what make isolation possible.

The middle-level skill is knowing when to introduce one. The failure modes sit at both extremes:

No seam: a method constructs its collaborator inline (with new) or calls a static, so there is no point at which to substitute. You cannot test the logic without the real thing.
Too many seams: every collaboration is abstracted behind an interface "just in case," producing a forest of single-implementation interfaces (the test-induced damage above).

The heuristic: add a seam when you need to substitute now — a real boundary you must fake (clock, network, DB, randomness), or a genuine variation point production needs (two providers). Do not add an interface speculatively for a type that has one implementation and crosses no boundary. You can introduce the seam the moment a second case appears; YAGNI applies to seams too.

# Over-abstraction: TaxCalculator is pure arithmetic with one impl. No seam needed.
class ITaxCalculator(Protocol):       # delete this
    def tax(self, amount: Decimal) -> Decimal: ...

# Right: call the pure function directly; test it directly. The seam that DOES
# earn its place is the boundary to an external rates service:
class TaxRatesProvider(Protocol):     # keep this — it hits the network
    def rate_for(self, region: str) -> Decimal: ...

Boundaries are the natural seam locations. See ../07-boundaries/README.md: the line between your code and the outside world (DB, HTTP, filesystem, time) is exactly where a seam pays for itself, and exactly where it does not need an excuse.
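Putting the two halves of the tax example together, a sketch might look like this. The function names `tax` and `tax_for_region`, the `FixedRates` fake, and the rounding to two decimals are assumptions made for illustration.

```python
from decimal import Decimal
from typing import Protocol


class TaxRatesProvider(Protocol):
    def rate_for(self, region: str) -> Decimal: ...


def tax(amount: Decimal, rate: Decimal) -> Decimal:
    # Pure arithmetic: no interface, tested directly.
    return (amount * rate).quantize(Decimal("0.01"))


def tax_for_region(amount: Decimal, region: str, rates: TaxRatesProvider) -> Decimal:
    # The only seam is the provider that would hit the network in production.
    return tax(amount, rates.rate_for(region))


class FixedRates:
    """Test fake: satisfies TaxRatesProvider structurally, with no network."""

    def __init__(self, table: dict[str, Decimal]):
        self._table = table

    def rate_for(self, region: str) -> Decimal:
        return self._table[region]


assert tax(Decimal("100.00"), Decimal("0.20")) == Decimal("20.00")
assert tax_for_region(Decimal("50.00"), "DE", FixedRates({"DE": Decimal("0.19")})) == Decimal("9.50")
```

The calculation has no seam at all, and the provider has exactly one, placed where the code meets the outside world.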

Pure core, imperative shell: minimizing the mocking surface

The most reliable way to need fewer mocks is to have less to mock. Push decision logic into pure functions that take data and return data; confine side effects to a thin outer layer.

Pure core: deterministic, no I/O, no clock, no globals. Tested by passing inputs and asserting outputs — no doubles at all.
Imperative shell: does the I/O, then hands data to the core and acts on the result. Thin enough that a few integration tests cover it.

// Pure core: all the rules, zero I/O. Trivially testable with plain values.
func DecideDiscount(order Order, now time.Time) Decision {
    if order.Total > 100 && isWeekend(now) {
        return Decision{Discount: order.Total * 0.10, Reason: "weekend-bulk"}
    }
    return Decision{Discount: 0}
}

// Imperative shell: I/O in, pure call, I/O out.
func (s *OrderService) ApplyDiscount(id string) error {
    order, err := s.repo.Get(id)        // side effect (read)
    if err != nil { return err }
    decision := DecideDiscount(order, s.clock())  // pure
    return s.repo.SaveDiscount(id, decision)      // side effect (write)
}

Tests for DecideDiscount need no fakes, run in microseconds, and cover the entire rules matrix exhaustively. Only the shell — small and boring — needs the repo faked. Compare this with logic interleaved with I/O, where every test must mock a chain of calls.

Preference order for isolating dependencies: real object → in-memory fake → stub → mock. Reach for deep mock chains last. A mock.repo.expects.get().returns(...).then.save().verify() ritual couples the test to the implementation's call sequence; a hand-written in-memory fake couples it to behavior. See ../15-pure-functions/README.md for why pure functions eliminate the question entirely.
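The difference between testing the core with plain values and testing the shell with a fake can be sketched in Python. This is a loose translation of the discount example, not the original code: amounts are integer cents to keep the arithmetic exact, and `InMemoryOrderRepo` with its method names is hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Decision:
    discount_cents: int
    reason: str = ""


def decide_discount(total_cents: int, now: datetime) -> Decision:
    # Pure core: the same inputs always give the same Decision.
    if total_cents > 10_000 and now.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
        return Decision(discount_cents=total_cents // 10, reason="weekend-bulk")
    return Decision(discount_cents=0)


class InMemoryOrderRepo:
    """Fake for the shell: stores totals and saved decisions in dicts."""

    def __init__(self, totals):
        self.totals = dict(totals)
        self.discounts = {}

    def get_total(self, order_id):
        return self.totals[order_id]

    def save_discount(self, order_id, decision):
        self.discounts[order_id] = decision


def apply_discount(repo, clock, order_id):
    # Imperative shell: read, call the core, write.
    decision = decide_discount(repo.get_total(order_id), clock())
    repo.save_discount(order_id, decision)


SATURDAY = datetime(2026, 1, 3)
MONDAY = datetime(2026, 1, 5)

# Core: plain values in, plain values out, no doubles.
assert decide_discount(20_000, SATURDAY) == Decision(2_000, "weekend-bulk")
assert decide_discount(20_000, MONDAY) == Decision(0)
assert decide_discount(10_000, SATURDAY) == Decision(0)

# Shell: assert on the fake's resulting state, not on a call sequence.
repo = InMemoryOrderRepo({"o-1": 20_000})
apply_discount(repo, lambda: SATURDAY, "o-1")
assert repo.discounts["o-1"].reason == "weekend-bulk"
```

The shell test would keep passing if `apply_discount` were refactored to read the order twice or to batch its writes, which is the practical meaning of coupling to behavior instead of call order.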

The Humble Object boundary

Some code genuinely cannot be unit-tested cheaply: a UI event handler, a framework callback, a main, a thread entry point, a thin DB adapter. The Humble Object pattern (Gerard Meszaros) says: make that layer so dumb it barely needs testing, and move all the logic it would have held into a testable plain object next to it.

# Humble: the framework handler does no logic — it adapts and delegates.
class CheckoutHandler:                       # framework-bound, "humble"
    def __init__(self, checkout: Checkout):  # Checkout is a plain, testable object
        self._checkout = checkout

    def on_request(self, http_request):                 # untestable framework edge
        cmd = parse(http_request)                       # trivial mapping
        result = self._checkout.place(cmd)              # ALL logic lives here
        return to_http_response(result)                 # trivial mapping

The handler stays at "extract input, call core, format output" — three lines you cover with one or two integration tests. Checkout holds the rules and is unit-tested in full isolation. The humble layer is the deliberate home for the un-testable, keeping it small instead of letting logic leak into a place you can't reach.

This is the same shape as pure-core/imperative-shell, viewed from the boundary: the Humble Object is the imperative shell at a specific framework edge.
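A fuller sketch of the split shows where the tests land. Everything beyond the handler shape is an assumption for illustration: the request is modelled as a plain dict, and `PlaceOrder`, `PlaceResult`, and the single quantity rule are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlaceOrder:
    sku: str
    quantity: int


@dataclass(frozen=True)
class PlaceResult:
    ok: bool
    message: str


class Checkout:
    """Plain object: holds the rules and knows nothing about HTTP."""

    def place(self, cmd: PlaceOrder) -> PlaceResult:
        if cmd.quantity <= 0:
            return PlaceResult(ok=False, message="quantity must be positive")
        return PlaceResult(ok=True, message=f"ordered {cmd.quantity} x {cmd.sku}")


def parse(http_request: dict) -> PlaceOrder:
    # Trivial mapping; the request is modelled as a dict for this sketch.
    return PlaceOrder(sku=http_request["sku"], quantity=int(http_request["quantity"]))


def to_http_response(result: PlaceResult) -> tuple[int, str]:
    return (200 if result.ok else 422, result.message)


class CheckoutHandler:
    def __init__(self, checkout: Checkout):
        self._checkout = checkout

    def on_request(self, http_request: dict) -> tuple[int, str]:
        return to_http_response(self._checkout.place(parse(http_request)))


# The rules are unit-tested on Checkout directly, with no framework involved.
assert Checkout().place(PlaceOrder("A1", 0)).ok is False
# One thin end-to-end check covers the handler's mapping.
assert CheckoutHandler(Checkout()).on_request({"sku": "A1", "quantity": "2"}) == (200, "ordered 2 x A1")
```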

Sociable vs solitary tests, and the design that enables each

Martin Fowler distinguishes:

Solitary test: the unit under test is isolated; all collaborators are replaced with doubles.
Sociable test: the unit is tested together with its real collaborators (the ones that are cheap and in-process), substituting only true boundaries.

Neither is universally correct, and the design dictates which is even possible:

Design property	Enables
Collaborators are pure / cheap / in-memory	Sociable tests — use the real ones, fewer doubles, tests survive refactors
Collaborators cross expensive boundaries (DB, net)	Solitary tests for the unit + integration tests for the boundary
Logic fused to a side effect	Forces over-mocked solitary tests (a smell)

A pure-core/imperative-shell design naturally produces sociable core tests: the core's collaborators are other pure functions, so you run them for real and only fake the shell's boundaries. Over-mocked solitary tests are often a sign that the unit is reaching across a boundary it shouldn't — the design, not the test, is wrong.

Trade-off: solitary tests pinpoint the failing unit precisely but couple to call structure (refactor-fragile). Sociable tests are refactor-robust and catch integration bugs but localize failures less precisely. Use solitary where isolation is genuinely needed (true boundaries); prefer sociable for clusters of cheap, pure collaborators.
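The two styles can be put side by side on one small unit. In this Python sketch, `Cart` and `line_total` are hypothetical, and the injectable `totaler` parameter exists only to make the contrast visible; for a pure function like this one it would not be worth keeping in production code.

```python
from unittest.mock import Mock


def line_total(price_cents: int, quantity: int) -> int:
    return price_cents * quantity


class Cart:
    def __init__(self, totaler=line_total):
        self._totaler = totaler
        self._lines = []

    def add(self, price_cents: int, quantity: int) -> None:
        self._lines.append((price_cents, quantity))

    def total(self) -> int:
        return sum(self._totaler(p, q) for p, q in self._lines)


def test_sociable():
    # Real collaborator: asserts the behavior a caller actually sees.
    cart = Cart()
    cart.add(250, 2)
    cart.add(100, 1)
    assert cart.total() == 600


def test_solitary():
    # Doubled collaborator: isolates Cart, but also pins how it calls line_total.
    totaler = Mock(return_value=7)
    cart = Cart(totaler)
    cart.add(250, 2)
    assert cart.total() == 7
    totaler.assert_called_once_with(250, 2)


test_sociable()
test_solitary()
```

If `Cart.total` were rewritten to compute the sum inline, the sociable test would still pass while the solitary one would fail on `assert_called_once_with`, even though the observable behavior is unchanged.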

Don't mock what you don't own: when integration tests win

A well-known rule (Steve Freeman & Nat Pryce, Growing Object-Oriented Software, Guided by Tests): don't mock types you don't own. Mocking a third-party client (the AWS SDK, a payment library, a DB driver) bakes your assumptions about its behavior into the test. When the real library behaves differently (a retry, a wrapped error, a pagination quirk), your mock-based test stays green while production breaks.

The design response is to wrap the third party behind a thin port you own (an interface in your domain language), and:

Unit-test your code against the port with a fake you control.
Integration-test the adapter (the real implementation of the port) against the real dependency or a high-fidelity substitute (Testcontainers, an embedded server, a sandbox).

// Port you own — mock this in unit tests.
interface PaymentGateway { PaymentResult charge(Money amount, Card card); }

// Adapter you own, wrapping Stripe (which you do NOT mock).
class StripeGateway implements PaymentGateway {
    private final com.stripe.StripeClient stripe;   // third party
    public PaymentResult charge(Money amount, Card card) { /* translate + call */ }
}
// StripeGateway is verified by an INTEGRATION test against Stripe's test mode,
// never by mocking com.stripe.StripeClient.

This is where designing for testability connects directly to ../07-boundaries/README.md: the boundary wrapper is simultaneously your seam, your humble object, and the line between unit and integration testing. Integration tests are the right tool when the contract you depend on lives outside your code — and no amount of mocking can verify a contract you don't own.

Designing for observability

A behavior you cannot observe, you cannot assert. Methods that do work but expose nothing — mutating private state, writing only a log line, firing-and-forgetting — are testable only through brittle, indirect means (parsing logs, reflecting into privates).

Design for observability by giving each meaningful behavior an assertable output:

Return a value instead of mutating hidden state. A Decision, a Result, a computed object you can compare.
Emit a domain event the test can capture (a recorded event, a callback, a message on an in-memory bus) rather than a side effect buried in a vendor SDK.
Make state queryable through a public accessor when a return value doesn't fit.

// Hard to observe: only a log line proves anything happened.
func (s *Svc) Process(o Order) {
    if o.Fraudulent { log.Printf("rejected %s", o.ID); return }  // assert how?
    s.charge(o)
}

// Observable: return a result the test compares directly.
func (s *Svc) Process(o Order) ProcessResult {
    if o.Fraudulent {
        return ProcessResult{Status: Rejected, Reason: "fraud"}
    }
    s.charge(o) // still performs the charge, as in the version above
    return ProcessResult{Status: Charged, Amount: o.Total}
}

Returning a result also tends to improve the production design: it makes the outcome explicit to callers, composes better, and removes the hidden-side-effect coupling. Observability is another case where the testable shape and the well-designed shape coincide.
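When a return value does not fit, the domain-event option from the list above can be sketched as follows. The names `FraudScreen` and `OrderRejected` are hypothetical, and the publisher is modelled as a plain callable instead of a real message bus.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class OrderRejected:
    order_id: str
    reason: str


class FraudScreen:
    def __init__(self, publish: Callable[[object], None]):
        self._publish = publish  # in production this might forward to a message bus

    def screen(self, order_id: str, fraudulent: bool) -> None:
        if fraudulent:
            self._publish(OrderRejected(order_id, "fraud"))


# In a test, a plain list's append method serves as the recorder.
events = []
FraudScreen(events.append).screen("o-1", fraudulent=True)
FraudScreen(events.append).screen("o-2", fraudulent=False)
assert events == [OrderRejected("o-1", "fraud")]
```

The test compares captured events by value, so it never needs to parse a log line or inspect private state.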

Common Mistakes

Adding an interface with one implementation "for testability." If only the test consumes it, it's test-induced design damage. Add the seam when production needs substitution, or fake the concrete via a different technique.
Field/setter injection under a container (@Autowired private field). The object becomes un-testable without the framework. Use constructor injection even with a container.
Mocking what you don't own. Wrap the third party in a port; unit-test against the port, integration-test the adapter against the real thing.
Deep mock chains (when(a.b()).thenReturn(c); when(c.d())...). The test now asserts call structure, not behavior, and breaks on every refactor. Prefer an in-memory fake or a sociable test.
Exposing privates / adding getters only for tests. This is the test reaching into intermediates. Assert on the public outcome; if there isn't one, the method needs a return value, and the test does not need a getter.
Un-injected time.Now(), random, uuid, env reads. Non-deterministic units. Inject a clock/RNG/ID source — usually as a simple function, not a heavyweight interface.
God constructors doing real work (opening connections, hitting the network on new). Construction should only assign; real work happens in methods. Otherwise the object can't be instantiated in a test.
Splitting one cohesive class into many tiny ones so each is "isolatable." This trades clarity for an isolation you didn't need. A sociable test over the cohesive unit is often better.
Treating every solitary-test mock as virtuous. Over-mocking is a design smell signaling the unit reaches across a boundary it shouldn't own.

Test Yourself

You introduce an interface so a unit test can supply a mock. The interface has exactly one production implementation and will likely never have another. Good design?

Answer

This is the textbook case of **test-induced design damage**. The seam exists solely to serve the test; production gets indirection it doesn't need. Better options: write an in-memory **fake** of the concrete type, use a **sociable** test if the collaborator is cheap and pure, or restructure so the logic is a pure function you call directly. Keep the interface only if production genuinely needs to substitute implementations (a real variation point or an external boundary). The discriminating question: *would I keep this seam if no test existed?*

A unit test passes when run alone but fails inside the full suite. What design defect is the test reporting, and is the fix in the test or the production code?

Answer

**Global mutable state** that leaks between tests (a static cache, a singleton registry, a module-level variable, a shared DB row). The robust fix is in production code: remove the global and inject the dependency so each test gets its own instance. Resetting the global in test teardown is a band-aid that papers over the coupling and breaks again under parallel test execution.
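A minimal Python sketch of the leak and the fix, using a hypothetical request deduplicator:

```python
_SEEN_GLOBAL = set()  # module-level state shared by every test in the process


def is_duplicate_global(request_id: str) -> bool:
    seen = request_id in _SEEN_GLOBAL
    _SEEN_GLOBAL.add(request_id)
    return seen


class Deduplicator:
    """Same logic, but the state belongs to an instance the caller creates."""

    def __init__(self):
        self._seen = set()

    def is_duplicate(self, request_id: str) -> bool:
        seen = request_id in self._seen
        self._seen.add(request_id)
        return seen


# Two "tests" that each expect a fresh start.
assert is_duplicate_global("r-1") is False
assert is_duplicate_global("r-1") is True  # the second one sees the first one's leftovers

assert Deduplicator().is_duplicate("r-1") is False
assert Deduplicator().is_duplicate("r-1") is False  # fresh instance, no leakage
```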

When is mocking a dependency the wrong choice, and what replaces it?

Answer

When you **don't own the dependency** (a third-party SDK, a DB driver), mocking bakes your assumptions about its behavior into the test; the test stays green while production breaks on the real behavior. Wrap the third party behind a **port you own**, unit-test your code against a fake of that port, and **integration-test the adapter** against the real dependency (or a high-fidelity substitute like Testcontainers). You can only meaningfully verify a contract you own.

A method does its work but its only output is a log line. How do you make it testable, and why does that also improve the design?

Answer

Give it an **assertable output** — return a result object, emit a capturable domain event, or expose queryable state — instead of relying on a side effect. Tests then compare a value directly rather than parsing logs. This improves the production design too: the outcome becomes explicit to callers, the method composes better, and you remove a hidden side-effect coupling. Observable and well-designed are the same shape here.

Your OrderService constructor has grown to 8 parameters. Is the answer to move them into a DI container so the constructor "looks cleaner"?

Answer

No. The container would only *hide* the dependencies, not remove them — the object still does eight things. Eight constructor parameters is a **low-cohesion** signal: the class has too many responsibilities. The fix is to **split the class** (or group genuinely-related collaborators into a smaller object), reducing each piece to a focused unit. Constructor injection is doing its job here: it surfaced a design smell the container would have concealed.

You have a pure pricing calculation and a service that reads an order from the DB, prices it, and saves the result. How do you structure this to minimize mocking?

Answer

Apply **pure core, imperative shell**. Put all pricing rules in a pure function (`Decision = price(order, now)`) that takes plain data and returns plain data — tested exhaustively with zero doubles. Keep the service thin: read (I/O), call the pure function, write (I/O). Only the shell needs the repo faked, and a couple of tests cover it. This collapses the mocking surface to the single real boundary (the repo) instead of mocking a chain of calls inside interleaved logic.

When should you prefer a sociable test over a solitary one?

Answer

Prefer **sociable** when the collaborators are cheap, in-process, and ideally pure — run them for real, fake only true boundaries (DB, network, clock). Sociable tests are refactor-robust and catch integration bugs between your own units. Prefer **solitary** when a collaborator crosses an expensive or external boundary, or when you need to pinpoint exactly which unit failed. If you find yourself forced into heavy mocking for a solitary test, that's usually the *design* telling you the unit reaches across a boundary it shouldn't.

Why is a "God constructor" that opens a DB connection on new a testability problem, and what's the fix?

Answer

You cannot instantiate the object in a test without a live database, because construction has a hidden side effect. It also couples object lifetime to resource lifetime. The fix: constructors should only **assign** injected dependencies; real work (connecting, fetching) happens in methods, called when needed. Inject the connection (or a factory/port for it) so tests supply a fake. Construction becomes cheap and free of side effects.
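One way to defer the work is to inject a connection factory. The sketch below is a Python illustration under assumptions: `ReportStore`, `FakeConnection`, and the `fetch` method are hypothetical names, not a real driver API.

```python
class ReportStore:
    def __init__(self, connect):
        # Only assigns: no connection is opened during construction.
        self._connect = connect
        self._conn = None

    def load(self, report_id: str) -> str:
        if self._conn is None:
            self._conn = self._connect()  # real work is deferred to first use
        return self._conn.fetch(report_id)


class FakeConnection:
    def __init__(self, rows):
        self._rows = rows

    def fetch(self, report_id: str) -> str:
        return self._rows[report_id]


opened = []


def fake_connect():
    opened.append("connected")
    return FakeConnection({"r-1": "Q1 summary"})


store = ReportStore(fake_connect)
assert opened == []  # constructing did not connect
assert store.load("r-1") == "Q1 summary"
assert opened == ["connected"]
```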

Cheat Sheet

Situation	Do	Avoid
Need to substitute a true boundary (DB, net, clock, RNG)	Inject a port/seam	Calling the concrete/static directly
Type has one impl, crosses no boundary	Call it directly; no interface	An interface "just for the mock"
Wiring dependencies	Constructor injection (default)	Field/setter injection under a container
Large object graph in production	Container or composition root	Making the class depend on the container
Logic mixed with I/O	Pure core + imperative shell	Mocking a chain inside interleaved code
Framework edge (handler, `main`)	Humble Object — delegate to plain object	Logic trapped in the untestable layer
Third-party dependency	Port you own + adapter; integration-test adapter	Mocking the type you don't own
Cheap, pure collaborators	Sociable test with real objects	Mocking everything reflexively
Behavior with no visible output	Return a value / emit an event	Asserting via logs or private state
Constructor has 8+ params	Split the class (low cohesion)	Hiding params in a container

The one-line test: Would I keep this seam / abstraction / split if no test existed? If yes, it's design. If only the test needs it, it's damage.

Summary

Testability is a proxy for good design, not a goal in itself. Hard-to-test code is usually badly coupled — but optimizing the proxy blindly produces test-induced design damage.
The discriminating question for any seam, interface, or split is: would I keep it if no test existed? Production-justified substitution earns its place; test-only indirection does not.
Constructor injection is the default; containers and manual wiring are how you supply arguments, and the class must stay testable without either.
Add seams at real boundaries, when you need substitution now — not speculatively for single-implementation types.
Pure core, imperative shell minimizes the mocking surface; prefer real objects → fakes → stubs → mocks, with deep mock chains last.
Humble Object keeps untestable framework edges trivially thin so all logic lives in testable plain objects.
Don't mock what you don't own — wrap it in a port, unit-test the port, integration-test the adapter. Integration tests are the right tool for contracts that live outside your code.
Design for observability: give behaviors assertable return values or events; the observable shape and the well-designed shape usually coincide.
