Abstraction Failures — Refactoring Practice¶
Category: Design Anti-Patterns → Abstraction Failures — the chosen abstraction fights the problem instead of fitting it. Covers (collectively): Golden Hammer · Inner-Platform Effect · Interface Bloat · Premature Abstraction
These are not "spot the smell" puzzles — find-bug.md does that. Here the code is abstracted in the wrong shape but already working, and your job is to bend it back toward the problem without changing behavior. The skill on display is the process, not just the destination:
- Pin behavior first. Write a characterization test — a test that records what the code does now, bugs and quirks included. You are freezing behavior, not blessing it. When the whole point is performance, add a benchmark alongside, so "faster" is a measured claim, not a hope.
- Take small, reversible steps. One named refactoring at a time (Inline Class, Extract Interface, Replace Conditional with Lookup Table, …), tests green after each, commit.
- Separate structural from behavioral commits. A refactor must preserve behavior; a measured perf win is allowed precisely because the characterization test proves output is unchanged while the benchmark proves the cost dropped.
Abstraction failures are subtle because the code runs. The Golden Hammer works — slowly. The Inner-Platform rule engine evaluates rules — unmaintainably. The bloated interface compiles — until an implementer throws UnsupportedOperationException. The premature abstraction has exactly one subclass — forever. Each worked solution names the canonical refactoring moves (Fowler's Refactoring, Feathers' Working Effectively with Legacy Code) so you build the vocabulary, not just the instinct.
One counter-case matters. Not every abstraction is wrong. Exercise 11 hands you a "premature-looking" abstraction that is actually justified — and the correct refactoring is to keep it and document why. Knowing when not to inline is as important as knowing how.
How to use this file: read the "Before" code, write down the move sequence yourself (and, for the perf ones, predict the speedup) before expanding the solution, then compare.
Table of Contents¶
| # | Exercise | Anti-pattern(s) | Lang | Key moves |
|---|---|---|---|---|
| 1 | Regex where a substring will do | Golden Hammer | Go | Replace Algorithm; benchmark |
| 2 | Map used as a tiny lookup | Golden Hammer | Java | Replace Data Structure; benchmark |
| 3 | ORM N+1 → one bulk query | Golden Hammer | Python | Replace Algorithm (bulk fetch); benchmark |
| 4 | Inline a one-implementation interface | Premature Abstraction | Go | Inline Class / Remove Speculative Generality |
| 5 | Collapse a one-case Strategy | Premature Abstraction | Python | Inline Function, Remove Middle Man |
| 6 | Segregate a bloated interface | Interface Bloat | Java | Extract Interface (ISP), role interfaces |
| 7 | Kill the UnsupportedOperationException | Interface Bloat | Go | Interface Segregation, Replace Throw with Type |
| 8 | Retire a config-driven mini-language | Inner-Platform Effect | Python | Replace Interpreter with Code |
| 9 | Rule engine → plain tested functions | Inner-Platform Effect | Java | Replace Conditional with Polymorphism / table |
| 10 | In-DB scripting → host code | Inner-Platform + Golden Hammer | Python | Move logic to host; bulk query |
| 11 | The abstraction you KEEP | Premature Abstraction (counter-case) | Go | Recognize a justified seam; do not inline |
| 12 | Plugin instead of a DSL | Inner-Platform Effect | Python | Replace DSL with constrained plugin API |
| 13 | The full combo cleanup | All four | Java | The whole toolbox, in order |
Exercise 1 — Regex where a substring will do¶
Anti-pattern: Golden Hammer (regex for everything). Goal: replace a compiled regex used for a literal prefix/suffix check with plain string operations and show the speedup. Constraints: identical boolean results for every input; this is a hot path, so prove the win with a benchmark.
// Before — a regex object built to answer "does this look like an image filename?"
var imageRe = regexp.MustCompile(`(?i)\.(jpg|jpeg|png|gif)$`)
func IsImage(name string) bool {
return imageRe.MatchString(name)
}
Refactored
**Move sequence** 1. **Characterize.** Table test the exact truth values today: `"a.JPG"`→true, `"a.jpeg"`→true, `"a.png "`→false (trailing space), `"a.txt"`→false, `""`→false, `"jpg"`→false. The `(?i)` and the anchored `$` are part of the contract — record them. 2. **Replace Algorithm** (Fowler). The pattern is just *case-insensitive suffix membership*. Lower-case once, then `strings.HasSuffix` against a small set. No regex engine, no backtracking, no per-call allocation. 3. **Benchmark before and after** on the same inputs; keep the regex version behind a build tag or in the test file so the benchmark can compare them directly.// After — plain suffix check over a fixed set.
var imageExts = []string{".jpg", ".jpeg", ".png", ".gif"}
func IsImage(name string) bool {
lower := strings.ToLower(name)
for _, ext := range imageExts {
if strings.HasSuffix(lower, ext) {
return true
}
}
return false
}
Exercise 2 — Map used as a tiny lookup¶
Anti-pattern: Golden Hammer (HashMap as the universal collection). Goal: replace a heap-allocating HashMap rebuilt on every call with a flat array/slice lookup over a tiny, fixed key space. Constraints: same return value per input; prove the allocation and time win.
// Before — a 7-entry map rebuilt on every call to translate a weekday index to a name.
String dayName(int dow) { // dow in 0..6
Map<Integer, String> names = new HashMap<>();
names.put(0, "Sun"); names.put(1, "Mon"); names.put(2, "Tue");
names.put(3, "Wed"); names.put(4, "Thu"); names.put(5, "Fri");
names.put(6, "Sat");
String n = names.get(dow);
if (n == null) throw new IllegalArgumentException("bad dow: " + dow);
return n;
}
Refactored
**Move sequence** 1. **Characterize.** Test `0..6` → exact names, and an out-of-range value (`7`, `-1`) → the same `IllegalArgumentException` message contract. 2. **Replace Data Structure** (Fowler: *Change Reference to Value* in spirit). The key space is a dense, contiguous `0..6` — that's the *definition* of an array index. A `HashMap` here pays for hashing, boxing of `Integer` keys, and bucket allocation to model what an array does for free. 3. **Hoist the constant.** The table is immutable; build it once as a `static final` array, not per call.// After — a hoisted, immutable array; lookup is one bounds check + one load.
private static final String[] DAY_NAMES =
{ "Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat" };
String dayName(int dow) {
if (dow < 0 || dow >= DAY_NAMES.length)
throw new IllegalArgumentException("bad dow: " + dow);
return DAY_NAMES[dow];
}
Exercise 3 — ORM N+1 → one bulk query¶
Anti-pattern: Golden Hammer (the ORM is the only data tool). Goal: replace a per-row lazy-load loop (the classic N+1) with a single bulk query. Constraints: the returned data must be identical; prove the query-count and latency win.
# Before — one query for orders, then one query PER order for its customer (N+1).
def order_summaries(session):
orders = session.query(Order).all() # 1 query
out = []
for o in orders:
cust = session.query(Customer).get(o.customer_id) # N queries (one per order)
out.append({"order_id": o.id, "customer": cust.name, "total": o.total})
return out
Refactored
**Move sequence** 1. **Characterize.** Seed a fixture DB (say 3 customers, 5 orders) and snapshot the returned list of dicts. *Also* assert the **query count** — that's the real subject. Wrap the call in a query counter (SQLAlchemy: `event.listen(engine, "before_cursor_execute", ...)`). 2. **Replace Algorithm** with a set-based fetch. Instead of asking the DB N times "who is customer X?", ask once: "give me every customer in this id set." Build an `id → name` dict in memory, then join locally. 3. (ORM-idiomatic alternative: a single `join`/`selectinload` eager load — note both, pick the one that keeps the contract.)# After — two queries total, regardless of order count.
def order_summaries(session):
orders = session.query(Order).all() # 1 query
cust_ids = {o.customer_id for o in orders}
customers = session.query(Customer).filter(Customer.id.in_(cust_ids)).all() # 1 query
name_by_id = {c.id: c.name for c in customers}
return [
{"order_id": o.id, "customer": name_by_id[o.customer_id], "total": o.total}
for o in orders
]
Exercise 4 — Inline a one-implementation interface¶
Anti-pattern: Premature Abstraction (speculative interface). Goal: remove an interface + factory introduced "in case we need another backend," when there is exactly one implementation and no second one in sight. Constraints: no behavior change; callers may be updated.
// Before — an interface, a factory, and a single concrete type. No second impl, ever.
type Clock interface {
Now() time.Time
}
type systemClock struct{}
func (systemClock) Now() time.Time { return time.Now() }
func NewClock() Clock { return systemClock{} }
// usage scattered through the package:
// c := NewClock()
// t := c.Now()
Refactored
**Move sequence** 1. **Confirm it's speculative, not seam-bearing.** Two questions: *Is there a second implementation?* (`grep` for `Clock` implementers → only `systemClock`.) *Does a test need to fake it?* If a test *does* substitute a fake clock, this is a **legitimate seam — stop, this is Exercise 11's situation, keep it.** Only proceed if nothing fakes it and nothing else implements it. 2. **Inline Class / Remove Speculative Generality** (Fowler). Collapse interface + factory + concrete into the one function that does the work. 3. **Update call sites** from `NewClock().Now()` to the direct call. Compile, run tests. **What improved & how to verify.** Three declarations (interface, struct, factory) collapse to one function; readers no longer chase an indirection that has a single destination. **Verify**: behavior is literally unchanged (`time.Now()` either way), and the build compiles with call sites updated. **The judgment, not the mechanics, is the lesson** — the *Rule of Three* says you extract an abstraction when the third concrete case reveals the shape, not before. A guessed interface around one implementation almost always guesses wrong, then ossifies. If a real second backend (or a test seam) appears later, *re-extract then* — against a real need, the interface shape will be right. > If you found a test faking the clock in step 1, you do **not** do this refactoring. See [Exercise 11](#exercise-11--the-abstraction-you-keep-counter-case).Exercise 5 — Collapse a one-case Strategy¶
Anti-pattern: Premature Abstraction (Strategy pattern with one strategy). Goal: inline a Strategy hierarchy that has exactly one concrete strategy and a registry that only ever holds it. Constraints: same computed discount; callers may change.
# Before — a Strategy interface, one strategy, and a "selector" that always returns it.
from abc import ABC, abstractmethod
class DiscountStrategy(ABC):
@abstractmethod
def apply(self, total: float) -> float: ...
class StandardDiscount(DiscountStrategy):
def apply(self, total: float) -> float:
return total * 0.9 if total > 100 else total
def select_strategy(_customer) -> DiscountStrategy:
return StandardDiscount() # never returns anything else
def final_price(customer, total):
strategy = select_strategy(customer)
return round(strategy.apply(total), 2)
Refactored
**Move sequence** 1. **Characterize.** Test the boundary (`total = 100` → no discount, `total = 100.01` → 10% off) and a couple of values. Snapshot. 2. **Confirm singularity.** `grep` subclasses of `DiscountStrategy` → only `StandardDiscount`; `select_strategy` ignores `customer` and always returns the same type. This is speculative generality, not a real polymorphic point. 3. **Inline Function + Remove Middle Man** (Fowler). Fold `apply` into a plain function; delete the ABC, the class, and the selector. **What improved & how to verify.** A three-part ceremony (abstract base, concrete class, selector) collapses to the four lines that actually compute anything. The `customer` parameter is kept in the signature only if callers rely on it — if it's genuinely unused, removing it is a *behavioral/API* change and belongs in its own commit. **Verify**: the boundary characterization test (`> 100`) is byte-identical. **When this is the wrong move:** if a *second* discount rule (loyalty, seasonal) is in the current sprint's backlog with real requirements, the Strategy seam is about to earn its keep — leave it. The anti-pattern is abstraction *ahead of need*, not abstraction itself.Exercise 6 — Segregate a bloated interface¶
Anti-pattern: Interface Bloat. Goal: split a fat interface that no implementer fully needs into small role interfaces (Interface Segregation Principle). Constraints: every existing call site keeps working; implementers stop being forced to satisfy methods they don't use.
// Before — a fat "repository" no caller uses in full; some impls fake half of it.
interface UserStore {
User findById(long id);
List<User> findAll();
void save(User u);
void delete(long id);
void bulkImport(InputStream csv); // only the admin importer needs this
byte[] exportCsv(); // only the reporting job needs this
void reindexSearch(); // only the search subsystem needs this
Stats computeStats(); // only the dashboard needs this
}
Refactored
**Move sequence** 1. **Characterize the clients, not the interface.** For each *caller*, list which methods it actually invokes. The read-path handler uses `findById`/`findAll`; the admin importer uses `bulkImport`; the dashboard uses `computeStats`; etc. This client-by-method matrix *is* the segregation map. 2. **Extract Interface** (Fowler), once per cohesive role. Each new interface contains only the methods one kind of client needs. 3. The big concrete implementation can `implements` all the small interfaces (it still does everything); **but each consumer now depends only on the narrow role it uses.** That's the ISP payoff — recompilation and test-doubling shrink to the role.// After — role interfaces; clients depend on the slice they use.
interface UserReader {
User findById(long id);
List<User> findAll();
}
interface UserWriter {
void save(User u);
void delete(long id);
}
interface UserBulkIo {
void bulkImport(InputStream csv);
byte[] exportCsv();
}
interface UserAnalytics {
void reindexSearch();
Stats computeStats();
}
// One concrete class can still satisfy all roles:
class JdbcUserStore implements UserReader, UserWriter, UserBulkIo, UserAnalytics { /* ... */ }
// Consumers narrow to a role:
class ProfileHandler {
private final UserReader users; // can't even see bulkImport / exportCsv
ProfileHandler(UserReader users) { this.users = users; }
}
Exercise 7 — Kill the UnsupportedOperationException¶
Anti-pattern: Interface Bloat (the smoking gun: implementers throw "not supported"). Goal: an interface so wide that a real implementer can't honor it and throws at runtime — segregate so the type system, not an exception, expresses capability. Constraints: callers that only use supported methods are unaffected; the runtime throw disappears.
// Before — a read-only view is forced to "implement" mutation, so it panics.
type Collection interface {
Get(i int) (string, bool)
Len() int
Add(s string)
Remove(i int)
}
type immutableView struct{ data []string }
func (v immutableView) Get(i int) (string, bool) {
if i < 0 || i >= len(v.data) { return "", false }
return v.data[i], true
}
func (v immutableView) Len() int { return len(v.data) }
func (v immutableView) Add(s string) { panic("immutable: Add not supported") } // bloat tell
func (v immutableView) Remove(i int) { panic("immutable: Remove not supported") } // bloat tell
Refactored
**Move sequence** 1. **Characterize.** Test `Get`/`Len` on the view, and that callers which only read never trigger the panics. (The panics themselves are the bug we're designing *out*, so we don't preserve them — we make them unreachable.) 2. **Interface Segregation** along the capability boundary: reading and mutating are different roles. **Extract Interface** `Reader` (the methods every collection supports) and `Mutator` (the methods only some support). 3. **Replace Throw with Type.** `immutableView` implements only `Reader`. A caller that needs mutation accepts a `Mutator` (or the combined `MutableCollection`); passing an `immutableView` where mutation is required is now a **compile error**, not a runtime panic.// After — capability is in the type; the panic is unrepresentable.
type Reader interface {
Get(i int) (string, bool)
Len() int
}
type Mutator interface {
Add(s string)
Remove(i int)
}
// A full collection composes both roles.
type MutableCollection interface {
Reader
Mutator
}
type immutableView struct{ data []string }
func (v immutableView) Get(i int) (string, bool) {
if i < 0 || i >= len(v.data) { return "", false }
return v.data[i], true
}
func (v immutableView) Len() int { return len(v.data) }
// No Add / Remove — and that's the point: it simply isn't a Mutator.
// A function that mutates demands the capability in its signature:
func appendAll(m Mutator, items []string) {
for _, it := range items { m.Add(it) }
}
// appendAll(immutableView{...}, ...) // ← does not compile
Exercise 8 — Retire a config-driven mini-language¶
Anti-pattern: Inner-Platform Effect. Goal: replace a tiny home-grown expression language (parsed from config strings at runtime) with plain, tested host code. Constraints: the rules currently in config must produce identical decisions; new rules become code changes (accepted trade-off).
# Before — a string "DSL" interpreted at runtime to decide free shipping.
# config: "total > 50 AND country == US OR vip == true"
def eligible_for_free_shipping(order, rule_string):
# a hand-rolled tokenizer + evaluator (~80 lines) that supports
# ==, >, <, AND, OR over a fixed set of order attributes.
tokens = tokenize(rule_string)
return evaluate(tokens, {
"total": order.total, "country": order.country, "vip": order.vip,
})
Refactored
**Move sequence** 1. **Inventory the real rules.** The "language" exists to express the *rules actually configured in production*. Pull every distinct `rule_string` from config/history. If there are three of them and they've changed twice a year, the interpreter's flexibility is theoretical — and it reimplements Python's own boolean expressions, badly (the Inner-Platform tell). 2. **Characterize.** For each real rule string, a test: representative orders → the eligibility decision the interpreter returns today. This pins behavior across the rewrite. 3. **Replace Interpreter with Code.** Each rule string becomes a named Python predicate — the host language *already has* `>`, `==`, `and`, `or`, tested and fast. Where rules vary by deployment, select among named predicates by key, not by parsing a string.# After — rules are ordinary, tested functions; the interpreter is deleted.
def standard_free_shipping(order) -> bool:
return (order.total > 50 and order.country == "US") or order.vip
# If the deployment really needs to choose among a few known rules:
RULES = {
"standard": standard_free_shipping,
"eu_promo": lambda o: o.total > 30 or o.vip,
}
def eligible_for_free_shipping(order, rule_name="standard") -> bool:
return RULES[rule_name](order)
Exercise 9 — Rule engine → plain tested functions¶
Anti-pattern: Inner-Platform Effect (a data-driven "rule engine"). Goal: migrate a generic condition/action rule engine (rules stored as data, dispatched by a loop) back to explicit, polymorphic code. Constraints: the same actions fire for the same inputs; the migration is incremental.
// Before — a generic engine: rules are rows; the engine is a soup of ifs over "operators".
class Rule {
String field; String op; Object value; String action; // stringly-typed everything
}
class RuleEngine {
void run(Map<String,Object> facts, List<Rule> rules, Actions actions) {
for (Rule r : rules) {
Object f = facts.get(r.field);
boolean match;
if (r.op.equals("eq")) match = f.equals(r.value);
else if (r.op.equals("gt")) match = ((Number) f).doubleValue() > ((Number) r.value).doubleValue();
else if (r.op.equals("lt")) match = ((Number) f).doubleValue() < ((Number) r.value).doubleValue();
else throw new IllegalStateException("unknown op " + r.op);
if (match) actions.fire(r.action); // action also stringly dispatched
}
}
}
Refactored
**Move sequence** 1. **Enumerate the real rules** (as in Ex. 8): the engine's generality only matters if the rule *set* is large and genuinely volatile. Usually it's a dozen fixed business rules wearing a generic costume. 2. **Characterize.** For the current rule rows, drive `run` with representative `facts` and assert which `actions.fire(...)` calls happen, in order. Mock `Actions`. 3. **Replace the engine with explicit rules.** Each rule becomes a small typed object implementing a `Rule` interface with a real `boolean matches(Facts)` and a typed action — **Replace Conditional with Polymorphism**. The stringly-typed `op`/`field`/`value` triple and the `if (op.equals(...))` ladder both disappear; the type system replaces them.// After — each rule is a typed, tested object; no interpreter, no stringly-typed dispatch.
interface Rule {
boolean matches(Facts f);
void fire(Actions a);
}
final class HighValueRule implements Rule {
public boolean matches(Facts f) { return f.orderTotal() > 1000; }
public void fire(Actions a) { a.flagForReview(); }
}
final class VipRule implements Rule {
public boolean matches(Facts f) { return f.isVip(); }
public void fire(Actions a) { a.applyVipPerks(); }
}
class Rules {
private final List<Rule> rules = List.of(new HighValueRule(), new VipRule());
void run(Facts f, Actions a) {
for (Rule r : rules) if (r.matches(f)) r.fire(a);
}
}
Exercise 10 — In-DB scripting → host code¶
Anti-pattern: Inner-Platform Effect + Golden Hammer. Goal: move business logic encoded as interpreted strings inside database rows (a "scripting" column the app evals) out into versioned host code, and fix the row-by-row access while you're there. Constraints: same computed fees; same totals.
# Before — each fee rule is a Python expression stored in a DB column and eval'd per row.
# table fee_rules(id, name, expr) e.g. expr = "amount * 0.029 + 0.30"
def total_fees(session, txns):
rules = session.query(FeeRule).all() # the "scripts"
total = 0.0
for t in txns:
for r in rules:
fee = eval(r.expr, {"amount": t.amount}) # interpreting DB-stored code (!!)
total += fee
return round(total, 2)
Refactored
**Move sequence** 1. **Name the two anti-patterns.** `eval` of DB-stored strings is the Inner-Platform Effect *and* a security hole (arbitrary code execution). Looping per-txn × per-rule with logic in data is the Golden Hammer (treating the DB as a code store). Both must go. 2. **Characterize.** Snapshot `total_fees` for a fixture set of txns against the *current* rule rows, so the migration is provably equivalent. 3. **Replace Interpreter with Code.** Each `expr` becomes a named, tested Python function. Fee rules live in version control, reviewed and unit-tested — not in a table where a bad `UPDATE` ships arbitrary code to production. 4. **Fix the access pattern.** The fee functions are pure; compute over the txns directly with a comprehension. No per-row DB navigation, no `eval` per row.# After — fee rules are versioned functions; eval and the inner loop are gone.
def standard_processing_fee(amount: float) -> float:
return amount * 0.029 + 0.30
FEE_RULES = [standard_processing_fee] # add named, tested functions here
def total_fees(txns) -> float:
total = sum(rule(t.amount) for t in txns for rule in FEE_RULES)
return round(total, 2)
Exercise 11 — The abstraction you KEEP (counter-case)¶
Anti-pattern: Premature Abstraction — but here the abstraction is justified. Goal: practice recognizing a legitimate seam and not inlining it. The discipline is knowing when to stop. Constraints: behavior unchanged — because you change nothing structural except adding a clarifying comment/test.
// "Before" — looks like Exercise 4: an interface with one production implementation.
type PaymentGateway interface {
Charge(amountCents int64, token string) (chargeID string, err error)
}
type stripeGateway struct{ apiKey string }
func (g stripeGateway) Charge(amountCents int64, token string) (string, error) {
return stripe.Charge(g.apiKey, amountCents, token) // real network call to Stripe
}
// And in tests:
type fakeGateway struct{ lastAmount int64 }
func (f *fakeGateway) Charge(amountCents int64, token string) (string, error) {
f.lastAmount = amountCents
return "test_charge_123", nil
}
func TestCheckoutChargesCorrectAmount(t *testing.T) {
fake := &fakeGateway{}
checkout(fake, cart) // ← the seam earns its keep
if fake.lastAmount != 4200 { t.Fatal("wrong amount") }
}
Refactored
**Decision sequence — and the decision is: KEEP IT.** 1. **Run the same two questions as Exercise 4.** *Is there a second implementation?* Yes — `fakeGateway` in tests. *Does removing the interface break a test seam?* Yes — `TestCheckoutChargesCorrectAmount` substitutes a fake to avoid a real Stripe charge during tests. 2. **Recognize the difference from Exercise 4.** A `Clock` that nothing fakes is speculative. A `PaymentGateway` that a test fakes — to avoid hitting a paid external API with side effects (real money!) — is a **legitimate test seam**. The interface here isn't speculation about a *future* second implementation; the second implementation **already exists** (the fake) and serves a present need (fast, deterministic, side-effect-free tests). 3. **Do nothing structural. Make the justification explicit instead.** The correct "refactoring" is to *not* inline, and to record *why*, so the next reader doesn't mistake it for Exercise 4 and delete it.// After — unchanged code, plus a comment that states the seam's reason.
// PaymentGateway is an intentional seam: it is faked in tests to avoid
// real Stripe charges (network + money side effects). Do NOT inline —
// the fake implementation is the justifying second case (see gateway_test.go).
type PaymentGateway interface {
Charge(amountCents int64, token string) (chargeID string, err error)
}
Exercise 12 — Plugin instead of a DSL¶
Anti-pattern: Inner-Platform Effect. Goal: the requirement is real runtime extensibility — but instead of a home-grown DSL (Ex. 8/9's cure doesn't apply when configurability is genuinely needed), expose a constrained plugin API in the host language. Constraints: existing built-in behaviors unchanged; third parties can add behaviors without editing core.
# Before — a half-built DSL grown to support "custom" export formats at runtime.
# Users write strings like: "FIELD name | UPPER | PAD 10 | FIELD price | MULT 1.1"
# parsed and interpreted by ~150 lines of pipeline-language code.
def export_row(row, pipeline_string):
ops = parse_pipeline(pipeline_string) # custom tokenizer + op table
return run_pipeline(ops, row) # custom interpreter
Refactored
**Move sequence** 1. **Distinguish from Exercise 8.** There, configurability was *theoretical* (three static rules) → inline to code. **Here, extensibility is a real product requirement**: third parties must add export formats *without a core deploy*. So we don't delete the extension point — we *replace the DSL with a plugin protocol in the host language.* 2. **Characterize.** Pin the built-in pipelines' output for fixture rows. 3. **Replace DSL with a constrained plugin API.** Define a narrow `Formatter` protocol (one method). Built-ins implement it; third parties implement it in real Python — with their language's full tooling (tests, types, debugger) and *no* hand-rolled parser/interpreter. Register plugins by name. The "language" is gone; the extensibility remains.# After — a registry of host-language plugins; no parser, no interpreter.
from typing import Protocol
class Formatter(Protocol):
def format(self, row: dict) -> str: ...
class NameUpperFormatter:
def format(self, row: dict) -> str:
return row["name"].upper().ljust(10)
_REGISTRY: dict[str, Formatter] = {}
def register(name: str, formatter: Formatter) -> None:
_REGISTRY[name] = formatter
register("name_upper", NameUpperFormatter()) # built-in
# third parties: register("my_format", MyFormatter()) — real code, their own tests
def export_row(row: dict, formatter_name: str) -> str:
return _REGISTRY[formatter_name].format(row)
Exercise 13 — The full combo cleanup¶
Anti-pattern: all four at once. Goal: one component exhibiting a Golden Hammer (regex + per-call map), Interface Bloat, a tiny Inner-Platform rule string, and a Premature Abstraction. Tackle them in a safe order. Constraints: preserve the routing decision and the audit string; one anti-pattern per commit.
// Before — a notification router with a bit of every abstraction failure.
interface Channel { // BLOAT: nobody implements all of this
void send(String to, String body);
void sendBulk(List<String> to, String body); // only the newsletter job uses this
byte[] renderPreview(String body); // only the admin UI uses this
Stats deliveryStats(); // only the dashboard uses this
}
abstract class AbstractRouter { // PREMATURE: one subclass, ever
abstract Channel pick(String userPref);
}
class Router extends AbstractRouter {
Channel pick(String userPref) {
// GOLDEN HAMMER 1: regex to test a fixed literal
if (Pattern.matches("sms", userPref)) return sms;
// GOLDEN HAMMER 2: a map rebuilt every call for 2 keys
Map<String,Channel> m = new HashMap<>();
m.put("email", email); m.put("push", push);
// INNER-PLATFORM: a stringly "rule" parsed inline
if (userPref.startsWith("rule:")) {
String[] kv = userPref.substring(5).split("="); // "channel=email"
return m.get(kv[1]);
}
return m.getOrDefault(userPref, email);
}
}
Refactored
**Move sequence — safest, most-isolated moves first; one anti-pattern per commit.** 1. **Characterize.** Table test `pick` for `"sms"`, `"email"`, `"push"`, `"rule:channel=email"`, and an unknown pref (→ default `email`). Assert the chosen `Channel` for each. This locks routing. 2. **Inline the Premature Abstraction** (`AbstractRouter` → `Router`). One subclass, no test fake (apply Ex. 11's two-question test → both no) → **Inline Class / Remove Speculative Generality**. Own commit; pure structural. 3. **Segregate the bloated `Channel`** (ISP). `send` is the only method `Router` uses — extract `MessageSender { send(...) }`; move `sendBulk`/`renderPreview`/`deliveryStats` to role interfaces their real clients depend on. `Router` now depends only on `MessageSender`. Own commit. 4. **Replace the Inner-Platform rule string** with the plain dispatch it's faking — `"rule:channel=email"` is a baroque way to say `"email"`. Normalize at the edge (or drop the format if no caller sends it — check first). Own commit. 5. **Fix the two Golden Hammers.** Regex-for-literal → `userPref.equals("sms")`; per-call `HashMap` → a hoisted `static final` table. Own commit, with a micro-benchmark if this is hot.// After — flat dispatch over a hoisted table; narrow dependency; no interpreter.
interface MessageSender { // the only role Router needs
void send(String to, String body);
}
// (sendBulk / renderPreview / deliveryStats live on BulkSender / Previewer /
// StatsSource — implemented by the concrete channels, used by their real clients.)
class Router {
private static final Set<String> KNOWN = Set.of("sms", "email", "push");
private final Map<String, MessageSender> channels; // injected once, not rebuilt
Router(Map<String, MessageSender> channels) { this.channels = channels; }
MessageSender pick(String userPref) {
String key = userPref.startsWith("rule:") // normalize legacy "rule:channel=x"
? userPref.substring(5).split("=")[1]
: userPref;
return KNOWN.contains(key) ? channels.get(key) : channels.get("email");
}
}
Refactoring discipline — the recap¶
Every exercise above runs the same loop. Internalize it and the specific moves become mechanical:
green → one named refactoring → green → commit → repeat
(a MEASURED perf win is allowed: characterization proves output,
benchmark proves cost — they are separate, recorded claims)
- Characterize before you change. A test that pins current behavior (quirks and all) is the seatbelt. For the perf exercises (1, 2, 3, 10) add a benchmark so "faster" is a number you can defend, and an assertion on the resource that was being wasted (query count in Ex. 3, allocations in Ex. 2) so the regression can't creep back.
- Match the tool to the problem (cure for Golden Hammer). Regex is for patterns, not fixed prefixes; a hash map is for sparse keys, not a dense
0..n; an ORM's per-object navigation is for single objects, not sets. The fix is almost always "use the smaller, more specific tool," and it's almost always faster. - Don't rebuild the platform inside the platform (cure for Inner-Platform Effect). If you're writing a tokenizer, an operator table, or an
evalof stored strings, you're reimplementing your host language — badly and insecurely. Push logic into real, tested, version-controlled code. When extensibility is a genuine requirement, expose a constrained plugin API, not a DSL (Ex. 12). - Segregate by client, not by convenience (cure for Interface Bloat). When an implementer throws "not supported," the interface is a union of capabilities that should be separate role interfaces. Split by who uses what together; let the type system, not a runtime exception, express capability.
- Honor the Rule of Three — both directions (cure for Premature Abstraction). Extract an abstraction only when a real case needs it; inline one that has a single destination. But count test doubles and real boundaries as cases — the one abstraction you keep (Ex. 11) is as important as the ones you inline.
- Name the move. "Replace Algorithm," "Replace Data Structure," "Inline Class," "Remove Speculative Generality," "Extract Interface," "Replace Conditional with Polymorphism," "Replace Interpreter with Code." A named move is a known-safe, often IDE-automatable transformation — not an improvisation.
- Separate structural from behavioral commits. A refactor preserves behavior; a perf win is allowed because the characterization test proves output is unchanged. Capability tightenings (the removed
panicin Ex. 7, the impossible-to-construct illegal call in the bloat fixes) get called out in the PR so reviewers expect them.
| Move | Cures | Exercises |
|---|---|---|
| Replace Algorithm (regex→strings, per-row→bulk) | Golden Hammer | 1, 3, 10 |
| Replace Data Structure (map→array, hoist constant) | Golden Hammer | 2 |
| Inline Class / Remove Speculative Generality | Premature Abstraction | 4, 5, 13 |
| Inline Function / Remove Middle Man | Premature Abstraction | 5 |
| Extract Interface → role interfaces (ISP) | Interface Bloat | 6, 7, 13 |
| Replace Throw with Type | Interface Bloat | 7 |
| Replace Interpreter with Code | Inner-Platform Effect | 8, 9, 10 |
| Replace Conditional with Polymorphism | Inner-Platform Effect | 9 |
| Replace DSL with constrained Plugin API | Inner-Platform Effect | 12 |
| Recognize a justified seam — keep it | Premature Abstraction (counter-case) | 11 |
Related Topics¶
tasks.md— guided exercises building these moves from scratch.find-bug.md— the spotting counterpart: identify the abstraction failure, don't fix it.junior.md·middle.md·senior.md·professional.md— recognize → countermove → migrate at scale → perf/testability implications.- Refactoring → Refactoring Techniques — the mechanical catalog for every move named above (Inline Class, Extract Interface, Replace Algorithm).
- Clean Code → Classes — the Interface Segregation Principle and cohesion, the target state for Exercises 6–7.
- Design Patterns → Structural — Adapter/Facade for narrowing wide interfaces; the positive counterparts to bloat.
- Design Patterns → Behavioral — Strategy done right (extracted at the Rule of Three), the counterpart to Exercises 5 and 9.
In this topic