Bloaters — Senior Level¶
Focus: "How to architect?" "How to optimize?" — system-scale impact, tooling, CI integration, and large-codebase migration patterns.
Table of Contents¶
- Bloaters at architectural scale
- Detection: linters, AST analysis, custom tooling
- Architectural cures
- Migration patterns for large codebases
- CI/CD integration
- Metrics that catch Bloaters early
- Code-review heuristics
- Architectural anti-patterns built on Bloaters
- The "rewrite vs. refactor" decision
- Review questions
Bloaters at architectural scale¶
A Long Method is the same smell as a Long Service or a Long Microservice. A Large Class is the same smell as a Monolithic Service. Bloaters scale up — the cures scale up too.
| Code-level smell | Architectural-level analog |
|---|---|
| Long Method | A microservice that does too many things in one HTTP handler |
| Large Class | A "monolith" microservice (sometimes called a "macroservice") |
| Primitive Obsession | A String userId flowing through service boundaries with no schema |
| Long Parameter List | An API endpoint with 14 query parameters |
| Data Clumps | Repeated address fields across 12 different REST endpoints |
The cures map the same way:
| Code-level cure | Architectural-level cure |
|---|---|
| Extract Method | Extract a private function / library |
| Extract Class | Extract a microservice (or a bounded context within a monolith) |
| Replace Data Value with Object | Define a schema-versioned domain type, share via .proto / .avsc / openapi.yaml |
| Introduce Parameter Object | Define a request DTO; version it |
| Extract Class (for clumps) | Extract a shared schema referenced by all endpoints |
Warning: scaling up the cure must be deliberate. "Extract Microservice" applied carelessly produces a distributed monolith — same code-level smells, plus latency and ops overhead.
Detection: linters, AST analysis, custom tooling¶
You can't ship a fix for what you can't measure. Production-grade Bloater detection uses several layers:
1. Built-in linter rules¶
| Tool | Long Method | Large Class | Long Param List | Notes |
|---|---|---|---|---|
| SonarQube | S138 (max method length) | S1820 (too many fields), S1448 (too many methods) | S107 (too many params) | Configurable thresholds; baseline mode for legacy code |
| IntelliJ inspections | "Method is too long" | "Class is too big" | "Method has too many parameters" | Severity tunable per project |
| Checkstyle (Java) | MethodLength | ClassDataAbstractionCoupling, ClassFanOutComplexity | ParameterNumber | XML-configured |
| PMD (Java) | ExcessiveMethodLength, CognitiveComplexity | GodClass, ExcessiveClassLength | ExcessiveParameterList | Rule-based; CI-friendly |
| golangci-lint | funlen (function length), gocognit (cognitive complexity), gocyclo (cyclomatic, per function) | — (Go has no classes; watch file and package size instead) | revive: argument-limit | Compose multiple linters |
| Pylint | R0915 (too-many-statements), C0301 line length | R0902 (too-many-instance-attributes) | R0913 (too-many-arguments) | Configurable in .pylintrc |
| ruff (Python, fast) | PLR0915 | PLR0902 | PLR0913 | Reimplements a large subset of Pylint rules under PLR-prefixed codes |
| ESLint (JS/TS) | max-lines-per-function, max-statements, complexity | max-classes-per-file | max-params | Plugin-extendable |
2. Cognitive complexity (preferred over cyclomatic)¶
Cyclomatic complexity counts decision points (if, &&, case). It's blind to nesting — a flat 20-branch switch scores the same as 20 deeply nested ifs.
Cognitive complexity (Sonar's invention) penalizes nesting. A 3-deep nested if scores higher than a flat sequence with the same branch count. This matches what humans actually find hard.
function classify(x) {
  if (x > 0) {              // +1
    if (x > 100) {          // +2 (nested once)
      if (x > 1000) {       // +3 (nested twice)
        return "huge";
      }
    }
  }
}
// Cyclomatic: 4. Cognitive: 6. The cognitive score reflects the actual reading cost.
Recommended thresholds:
- Cognitive complexity per method: 15 (warn), 25 (fail)
- Method length: 50 lines (warn)
- Class field count: 15 (warn), 25 (fail)
- Method parameter count: 5 (warn), 8 (fail)
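As a rough illustration, cognitive complexity can be approximated over Python's built-in `ast` module. This is a simplified sketch of Sonar's metric: each branching construct costs 1 plus the current nesting depth. The real algorithm also handles `else`, boolean-operator sequences, recursion, and lambdas.

```python
import ast

def cognitive_complexity(source: str) -> int:
    """Simplified approximation: +1 per branch, plus +1 per nesting level."""
    tree = ast.parse(source)
    score = 0

    def walk(node, nesting):
        nonlocal score
        for child in ast.iter_child_nodes(node):
            if isinstance(child, (ast.If, ast.For, ast.While, ast.ExceptHandler)):
                score += 1 + nesting      # the branch itself, plus nesting penalty
                walk(child, nesting + 1)  # children sit one level deeper
            else:
                walk(child, nesting)

    walk(tree, 0)
    return score

nested = """
def classify(x):
    if x > 0:            # +1
        if x > 100:      # +2 (nested once)
            if x > 1000: # +3 (nested twice)
                return "huge"
"""
flat = """
def classify(x):
    if x > 1000:   # +1
        return "huge"
    if x > 100:    # +1
        return "large"
    if x > 0:      # +1
        return "small"
"""
print(cognitive_complexity(nested))  # 6
print(cognitive_complexity(flat))    # 3
```

Same branch count, very different scores: the nested version is penalized exactly where readers struggle.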
3. Custom AST checks¶
For project-specific Bloaters, build a small AST analyzer:
- Java: Spoon, JavaParser, Eclipse JDT
- Python: `ast` (built-in), `libcst` (preserves formatting)
- Go: `go/ast`, `go/types`, `golang.org/x/tools/go/analysis`
- TypeScript: the TypeScript Compiler API
Useful custom checks:
- "Money as primitive" detector: any parameter or field named `*amount`, `*price`, `*cost`, `*total` whose type is `double`/`float`/`BigDecimal` instead of a project-defined `Money` type.
- "ID as bare string" detector: any parameter or field whose name ends in `Id`/`_id` whose type is `String`/`UUID` instead of a typed ID.
- "Address clump" detector: any class declaring 3+ of {street, city, state, zip, country} as separate fields.
- "Boolean parameter explosion" detector: any method with ≥3 boolean parameters.
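Two of the detectors above can be sketched with Python's built-in `ast`. The `UserService` class and the exact naming rules are invented for illustration:

```python
import ast

SOURCE = """
class UserService:
    def create(self, user_id: str, admin: bool, notify: bool, dry_run: bool):
        ...
"""

class BloaterVisitor(ast.NodeVisitor):
    def __init__(self):
        self.findings = []

    def visit_FunctionDef(self, node):
        args = node.args.args
        # "Boolean parameter explosion": >= 3 parameters annotated as bool
        bools = [a.arg for a in args
                 if isinstance(a.annotation, ast.Name) and a.annotation.id == "bool"]
        if len(bools) >= 3:
            self.findings.append(f"{node.name}: {len(bools)} boolean params {bools}")
        # "ID as bare string": *_id / *Id parameter typed as plain str
        for a in args:
            if (a.arg.endswith(("_id", "Id"))
                    and isinstance(a.annotation, ast.Name)
                    and a.annotation.id == "str"):
                self.findings.append(f"{node.name}: bare-string id {a.arg!r}")
        self.generic_visit(node)

visitor = BloaterVisitor()
visitor.visit(ast.parse(SOURCE))
for finding in visitor.findings:
    print(finding)
```

Wire a script like this into CI and the project-specific rules become as enforceable as any stock linter rule.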
4. Architectural fitness functions¶
For the architectural-scale Bloaters, ArchUnit (Java) and NetArchTest (.NET) let you assert architecture invariants in tests:
@ArchTest
static final ArchRule services_must_not_exceed_15_methods =
    classes().that().resideInAPackage("..service..")
        .should(new ArchCondition<JavaClass>("have at most 15 methods") {
            @Override
            public void check(JavaClass clazz, ConditionEvents events) {
                int methodCount = clazz.getMethods().size();
                if (methodCount > 15) {
                    events.add(SimpleConditionEvent.violated(clazz,
                        clazz.getName() + " has " + methodCount + " methods"));
                }
            }
        });

@ArchTest
static final ArchRule rest_endpoints_must_use_dto =
    methods().that().areAnnotatedWith(PostMapping.class)
        .should().haveRawParameterTypes(DescribedPredicate.describe(
            "exactly one DTO parameter",
            params -> params.size() == 1));
These run as JUnit tests; they fail the build if the architecture drifts.
Architectural cures¶
From a 5,000-line OrderManager to a bounded context¶
A Large Class at module scale becomes a bounded context (DDD term). The transformation:
- Identify cohesion clusters inside the class. Build a method-field usage matrix: rows = methods, columns = fields; mark each cell where a method touches a field. Clusters appear as block-diagonal patterns.
- Extract one cluster at a time into its own package/module/service.
- Define a stable interface between the new module and the original. Avoid leaking internal types.
- Migrate callers to the new interface.
- Repeat until the original class is small or empty.
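The first step can be sketched in Python. The `OrderManager` snippet and the greedy single-pass grouping are invented for illustration; a real analysis would merge clusters transitively:

```python
import ast
from collections import defaultdict

SOURCE = """
class OrderManager:
    def ship(self):    self.address; self.carrier
    def pack(self):    self.address; self.carrier
    def invoice(self): self.total;   self.tax
    def refund(self):  self.total;   self.tax
"""

# Rows = methods, columns = fields: record every `self.<field>` access.
usage = defaultdict(set)
for node in ast.walk(ast.parse(SOURCE)):
    if isinstance(node, ast.ClassDef):
        for method in node.body:
            if isinstance(method, ast.FunctionDef):
                for sub in ast.walk(method):
                    if (isinstance(sub, ast.Attribute)
                            and isinstance(sub.value, ast.Name)
                            and sub.value.id == "self"):
                        usage[method.name].add(sub.attr)

# Greedily group methods that share any field (the "block-diagonal" clusters).
clusters = []
for name, fields in usage.items():
    for cluster in clusters:
        if cluster["fields"] & fields:
            cluster["methods"].append(name)
            cluster["fields"] |= fields
            break
    else:
        clusters.append({"methods": [name], "fields": set(fields)})

for c in clusters:
    # Each cluster is an Extract Class candidate.
    print(sorted(c["methods"]), "->", sorted(c["fields"]))
```

Here the shipping methods and the billing methods fall into two disjoint clusters, each a natural extraction target.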
From a god service to microservices¶
The same process at service scale:
- Identify subdomain clusters via use-case analysis or event storming.
- Extract one subdomain into its own service. Define the anti-corruption layer that maps between old and new domain models.
- Migrate clients via strangler fig (route new calls to new service; old calls keep working).
- Eventually retire the corresponding code from the monolith.
Critical: the order matters. Extract the least-coupled cluster first. Extracting the most-coupled cluster first creates a chatty distributed system that's worse than the monolith.
Schema-driven cures for service-boundary Primitive Obsession¶
When a String userId flows through five services with five different validation rules:
- Define `UserId` in a shared schema (Protobuf, Avro, or JSON Schema).
- Generate code per language.
- Each service uses the generated type — validation, serialization, and equality come for free.
- Schema versioning becomes the contract.
This is the architectural analog of Replace Data Value with Object.
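As a code-level illustration of the same idea, here is what a validated ID type might look like in Python. The `u-` plus eight hex digits format is invented for this sketch:

```python
import re
from dataclasses import dataclass

@dataclass(frozen=True)
class UserId:
    """A typed ID: validation, equality, and hashing come with the type."""
    value: str

    def __post_init__(self):
        # One validation rule, enforced in every service that uses the type.
        if not re.fullmatch(r"u-[0-9a-f]{8}", self.value):
            raise ValueError(f"malformed user id: {self.value!r}")

print(UserId("u-deadbeef"))  # valid: passes the shared rule

try:
    UserId("12345")          # invalid: rejected at the boundary
except ValueError as e:
    print("rejected:", e)
```

With code generated from a shared schema, every service gets this same type, so the "five services, five validation rules" problem disappears.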
Migration patterns for large codebases¶
Strangler fig¶
Originated by Martin Fowler. Wrap the bloater (a class, a service, a database table) in a new interface. Route new use cases to a new implementation. Migrate old callers gradually. The new code "strangles" the old, replacing it leaf by leaf.
When to use: large, change-prone bloaters. The bloater stays operational throughout.
Branch by abstraction¶
Introduce an interface over the bloater. Provide two implementations: the old (delegates to bloater) and the new (refactored). Switch via feature flag. Both implementations live side-by-side until rollout completes; then the old one is deleted.
When to use: when you need to roll out the change gradually with a kill-switch.
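Branch by abstraction in miniature, as a Python sketch with invented names (`TaxCalculator` and the flat 20% rate stand in for the real bloater):

```python
# The abstraction introduced over the bloater.
class TaxCalculator:
    def tax(self, amount: float) -> float:
        raise NotImplementedError

class LegacyTaxCalculator(TaxCalculator):
    def tax(self, amount):
        # Delegates to (stands in for) the existing bloated implementation.
        return round(amount * 0.2, 2)

class RefactoredTaxCalculator(TaxCalculator):
    def tax(self, amount):
        return round(amount * 0.2, 2)  # same behavior, cleaner internals

USE_NEW_TAX_CALCULATOR = False  # the kill-switch feature flag

def make_calculator() -> TaxCalculator:
    if USE_NEW_TAX_CALCULATOR:
        return RefactoredTaxCalculator()
    return LegacyTaxCalculator()

print(make_calculator().tax(100.0))  # 20.0 either way; flip the flag to roll out
```

Callers only ever see `TaxCalculator`, so the flag can be flipped (and flipped back) without touching them; once rollout completes, the legacy class and the flag are deleted.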
Mikado method¶
Attempt the desired refactoring. Record everything that breaks. Revert. Now the broken things become prerequisite refactorings. Apply them. Re-attempt. Recurse.
The output is a dependency tree of small refactorings (the "Mikado graph"), executed bottom-up in safe steps.
When to use: when the bloater is too tangled to attack head-on.
Characterization tests¶
Before refactoring a bloater that has no tests:
- Run the bloater on representative inputs in production.
- Capture inputs and outputs (use VCR-style recording).
- Replay them as tests against the current code — they pass by construction.
- Refactor; the captured tests now act as a safety net.
These tests don't document what should be true — they document what is true today. Once the refactoring is done, replace them with intent-based tests.
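A minimal record-and-replay sketch of those steps; the `legacy_discount` function and its quirk are invented to stand in for the bloater:

```python
# The bloater under refactoring: behavior to preserve, quirks included.
def legacy_discount(total, tier):
    if tier == "gold":
        return round(total * 0.8, 2)
    if tier == "silver":
        return round(total * 0.9, 2)
    return total  # quirk: unknown tiers silently get no discount

# Steps 1-2: capture representative inputs and the CURRENT outputs.
cases = [(100.0, "gold"), (100.0, "silver"), (100.0, "platinum")]
recording = [(args, legacy_discount(*args)) for args in cases]

# Step 3: replay as tests; they pass by construction against today's code.
def matches_recording(fn):
    return all(fn(*args) == expected for args, expected in recording)

assert matches_recording(legacy_discount)

# Step 4: refactor; the recording is the safety net (the quirk must survive).
def refactored_discount(total, tier):
    rates = {"gold": 0.8, "silver": 0.9}
    return round(total * rates.get(tier, 1.0), 2)

print(matches_recording(refactored_discount))  # True
```

Note that the recording locks in the "unknown tier gets no discount" quirk; whether that behavior is desirable is a separate question, answered later by the intent-based tests that replace these.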
CI/CD integration¶
Baseline mode for legacy code¶
The hardest part of introducing a linter to a legacy codebase: the first run produces 10,000 violations. Don't fail the build on all of them — adopt a baseline:
- Run the linter once. Save the violations to `lint-baseline.json`.
- Future builds fail only on new violations or changes to existing violation lines.
- The baseline shrinks over time as violations are fixed.
SonarQube, ESLint, golangci-lint, and ruff all support some form of baseline or new-issues-only workflow.
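Mechanically, a baseline reduces to a set difference. A sketch with invented violation records (real tools also fingerprint surrounding code so that shifted line numbers don't defeat the baseline):

```python
import json

# Day 1: run the linter, persist every violation as the baseline.
first_run = [
    ["billing.py", "too-many-arguments", 40],
    ["billing.py", "too-many-statements", 112],
]
with open("lint-baseline.json", "w") as f:  # hypothetical baseline file
    json.dump(first_run, f)

def new_violations(current, baseline_path):
    """Fail the build only on violations absent from the baseline."""
    with open(baseline_path) as f:
        baseline = {tuple(v) for v in json.load(f)}
    return [v for v in current if tuple(v) not in baseline]

# A later build: one old violation remains, one new one was introduced.
current = [
    ["billing.py", "too-many-statements", 112],
    ["billing.py", "too-many-arguments", 205],  # new: fails the build
]
print(new_violations(current, "lint-baseline.json"))
```

Violations that get fixed simply stop appearing in `current`, so the effective backlog shrinks without anyone editing the baseline file.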
"Don't make it worse" gate¶
A weaker but practical gate: PRs cannot increase the violation count for changed files. This:
- Allows urgent fixes that don't worsen the code.
- Prevents drift toward more bloating.
- Doesn't require fixing the entire backlog.
Per-PR complexity diff¶
Some teams report the cognitive-complexity delta per PR as a bot comment or status check. This makes the cost of "just one more conditional" visible at review time.
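A crude sketch of the idea: count branch nodes in the base and head versions of a changed file and report the difference. Real pipelines run a cognitive-complexity tool on both revisions; the file contents here are invented:

```python
import ast

def branch_count(source: str) -> int:
    """Crude complexity proxy: number of branching nodes in the AST."""
    return sum(isinstance(n, (ast.If, ast.For, ast.While, ast.ExceptHandler))
               for n in ast.walk(ast.parse(source)))

# Hypothetical base and head versions of a changed file in a PR.
base = "def f(x):\n    if x:\n        return 1\n    return 0\n"
head = ("def f(x):\n    if x:\n        if x > 10:\n"
        "            return 2\n        return 1\n    return 0\n")

delta = branch_count(head) - branch_count(base)
print(f"complexity delta: {delta:+d}")  # complexity delta: +1
```

Posting that `+1` on the PR turns an invisible cost into a reviewable one.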
Quality gates by responsibility area¶
A common compromise: stricter thresholds for new code, looser for legacy. Tag legacy directories in the linter config. New files get full enforcement; legacy files get the baseline approach.
Metrics that catch Bloaters early¶
| Metric | Catches | Trigger threshold |
|---|---|---|
| Lines of code per method | Long Method | > 50 |
| Cognitive complexity per method | Long Method (nuanced) | > 15 |
| Field count per class | Large Class | > 15 |
| Method count per class | Large Class | > 30 |
| Parameter count per method | Long Parameter List | > 5 |
| Boolean parameters per method | LPL with hidden behavior | ≥ 3 |
| Fan-out (classes referenced) | Large Class | > 20 |
| Fan-in (classes referencing this) | Possible god class | > 30 |
| Change frequency × file size | Bloater hotspots | top 5% × > 500 lines |
The change frequency × size metric (sometimes called "code hotspot") is especially useful: a 2,000-line class that hasn't changed in 3 years isn't worth refactoring; a 600-line class changed every week is the highest-value target.
Tooling: `git log --pretty=format: --name-only | sort | uniq -c | sort -nr` gives a quick "files changed most often" list (the empty format string keeps commit hashes out of the counts). Combine with `wc -l` for size.
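The hotspot ranking itself is a one-liner over churn and size. A sketch with invented numbers (in a real repo, churn comes from the `git log` output and sizes from `wc -l`):

```python
from collections import Counter

# Invented churn data: one entry per file per commit, as git log would emit.
touched = ["billing.py", "orders.py", "billing.py",
           "billing.py", "util.py", "billing.py"]
churn = Counter(touched)  # changes per file

# Invented sizes in lines of code.
sizes = {"billing.py": 600, "orders.py": 2000, "util.py": 80}

def hotspot_score(path: str) -> int:
    # Change frequency x size: big files that change often rank first.
    return churn[path] * sizes.get(path, 0)

ranked = sorted(churn, key=hotspot_score, reverse=True)
for path in ranked:
    print(f"{hotspot_score(path):6d}  {path}")
# billing.py (4 changes x 600 lines = 2400) outranks the static 2000-line orders.py
```

This reproduces the point above: the frequently-changed 600-line file is a higher-value target than the untouched 2,000-line one.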
Code-review heuristics¶
Reviewers should flag:
- Adding a 6th parameter to an existing method. Suggest parameter object.
- Adding a 16th field to an existing class. Suggest extracting a sub-concept.
- Adding a third boolean parameter. Suggest split into separate methods.
- Adding a 4-field clump of `String`s with related names. Suggest class extraction.
- Methods crossing 60 lines in the diff. Suggest extraction.
- A new function returning a tuple of 4+ unnamed values (Go especially). Suggest a result struct.
- Adding `// step 1`, `// step 2` comments inside a method. The "steps" want to be methods.
Code review is the cheapest place to catch Bloaters — they're 5 lines now, not 500.
Architectural anti-patterns built on Bloaters¶
| Anti-pattern | Underlying smell |
|---|---|
| God Class / God Object | Large Class |
| Anemic Domain Model | Data Clumps + Primitive Obsession (data with no behavior) |
| Shotgun Architecture | Long Parameter List spreading across services |
| Magic Strings / Magic Numbers | Primitive Obsession |
| Distributed Monolith | Long Method scaled to "single workflow spread across N services" |
| Stringly-Typed APIs | Primitive Obsession at API surface |
| The Big Ball of Mud | All five Bloaters compounded |
Naming the anti-pattern often clarifies the underlying smell — and points to which cure to start with.
The "rewrite vs. refactor" decision¶
When a bloater is bad enough to consider a rewrite:
| Favors refactor | Favors rewrite |
|---|---|
| Bloater is < 10,000 lines | Bloater is > 100,000 lines |
| Tests exist or can be added | No tests, no chance of writing them |
| Changes are frequent (work is amortized) | Changes are rare |
| Domain understanding lives in the team | Domain understanding has been lost |
| Tech stack is current | Tech stack is dead (e.g., Java 6, Python 2) |
| Business logic is correct | Business logic itself is wrong |
Most decisions should favor refactoring. Rewrites famously fail more often than refactors — Joel Spolsky's "Things You Should Never Do" essay (2000) and many post-mortems confirm: the rewrite usually re-introduces every bug the original learned to handle, plus new ones, plus the business logic the rewriter didn't know about.
Compromise: strangler fig lets you incrementally rewrite without committing to a big-bang replacement. New functionality goes to the new system; old paths stay in the bloater until they age out.
Review questions¶
- A 5,000-line class is changed once a year and has no test failures. Should you refactor it? Probably not on its own merits. Cost-benefit: 1 change/year × low pain = low ROI on refactoring. But check: is it blocking a needed change now? If yes, refactor enough to unblock the change. Otherwise leave it.
- Your linter reports 8,000 Bloater violations on Day 1. What's the practical move? Adopt a baseline. Don't try to fix all 8,000 — you'll lose. Configure the linter to fail only on new violations and on changes to violation-bearing lines. The backlog shrinks naturally as files are touched.
- A team argues that microservices "by design" prevent Large Class. Defend or refute. Refute. Microservices change where the smell appears. A 5,000-line class becomes a 5,000-line service. The smell follows the code. The cure (Extract Class → Extract Service) is the same idea applied at a different scale.
- Cognitive complexity vs. cyclomatic complexity — when is cyclomatic still useful? Cyclomatic is still useful for test coverage planning — it estimates the number of distinct paths through a method, which roughly maps to the number of tests needed. For maintainability, cognitive complexity is closer to what humans experience.
- You're asked to refactor a Bloater but only have 30% test coverage. Plan? Add characterization tests for the changed code paths first. Use VCR-style recording or property-based tests to capture current behavior. Then refactor inside the test net. Finally, replace characterization tests with intent-based tests that document desired behavior.
- An architectural review claims "every endpoint should take exactly one DTO." Is that always right? Mostly yes. But for health checks, simple GETs by ID, and metrics endpoints, a wrapper DTO is pure overhead. The rule's spirit is "no Long Parameter List at the boundary" — apply it where it pays.
- What's the relationship between Bloaters and code coverage? Bloaters are usually undertested because they're hard to test. A 400-line method with 30 branches needs many tests; teams often write 1–2 happy-path tests and call it done. Splitting the bloater makes tests possible. Coverage as a metric follows the refactor, not precedes it.
- A team uses varargs (`String... names`) to "avoid" Long Parameter List. Is that valid? No — it disguises the smell. Varargs is fine when the parameters are genuinely homogeneous (e.g. `String.format(String, Object...)`). It's wrong when used as a hatch to avoid naming individual parameters; the cure for hidden parameters is parameter objects, not making them invisible.
- Strangler fig vs. branch by abstraction — which when? Strangler fig: gradual replacement at the use-site level, often spanning months or years. Branch by abstraction: side-by-side implementations gated by a flag, usually completing in days or weeks. Strangler fig for big architectural moves; branch by abstraction for tactical refactors with rollback safety.
- Why does cognitive complexity penalize nesting more than line count? Reading nested code requires holding the surrounding context in your head as you descend. Every nesting level adds a layer of "what's true at this point." A flat 100-line method imposes a fixed memory cost; a 5-deep nested 50-line method imposes a stack of conditions. The latter is harder despite being shorter.
Next: professional.md — runtime, JIT, GC, allocation patterns, and the actual measured cost of value objects vs. primitives.