Shotgun Surgery - Specification
Audience: engineers who want to detect shotgun surgery objectively, with numbers, instead of relying on the gut feeling that "this PR touches too many files." This file defines the metrics, the thresholds, and the tooling.
Fowler in Refactoring (2nd ed., ch. 3) names the smell but does not quantify it. Quantification came later, from the temporal-coupling research that Adam Tornhill compiled in Your Code as a Crime Scene (Pragmatic, 2015) and operationalized in the CodeScene product. The sections below cover the metrics that matter in practice, their thresholds, and the open-source and commercial tools that compute them.
1. Primary metric: change coupling percentage
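Definition. For files A and B, let commits(A) be the set of commits that modified A, and commits(B) likewise. The change coupling between A and B is:

$$\text{coupling}(A, B) = \frac{|\text{commits}(A) \cap \text{commits}(B)|}{|\text{commits}(A) \cup \text{commits}(B)|}$$

This is the Jaccard index over commit sets. Some tools use a directional variant:

$$\text{coupling}(A \to B) = \frac{|\text{commits}(A) \cap \text{commits}(B)|}{|\text{commits}(A)|}$$

which reads as "when A changes, what fraction of the time does B also change?" Both are reported as percentages. The directional form is more useful for spotting shotgun surgery, because it tells you the blast radius from a specific origin file.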
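As a minimal sketch of the two definitions, assuming the commit sets have already been extracted as sets of commit hashes (the function names and sample hashes below are illustrative, not taken from any tool):

```python
def jaccard_coupling(commits_a: set[str], commits_b: set[str]) -> float:
    """Symmetric coupling: shared commits over all commits touching either file."""
    union = commits_a | commits_b
    if not union:
        return 0.0
    return len(commits_a & commits_b) / len(union)


def directional_coupling(commits_a: set[str], commits_b: set[str]) -> float:
    """Directional coupling A -> B: fraction of A's commits that also modified B."""
    if not commits_a:
        return 0.0
    return len(commits_a & commits_b) / len(commits_a)


# Hypothetical commit ids, for illustration only.
order = {"c1", "c2", "c3", "c4"}
order_dto = {"c2", "c3", "c4", "c5", "c6", "c7"}

print(jaccard_coupling(order, order_dto))      # 3 shared commits out of 7 distinct
print(directional_coupling(order, order_dto))  # 3 of Order's 4 commits
print(directional_coupling(order_dto, order))  # 3 of OrderDTO's 6 commits
```

The two directional values differ for the same pair, which is why the directional form can point at an origin file while the Jaccard form cannot.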
Thresholds (empirical, from CodeScene's published guidance and Tornhill's books):
| Coupling % | Interpretation | Action |
|---|---|---|
| 0 - 20% | Independent or weakly related files | No action |
| 20 - 40% | Moderate coupling; investigate if surprising | Add to watch list |
| 40 - 70% | Strong coupling; refactor candidate | Open a refactoring ticket |
| 70 - 100% | Shotgun surgery confirmed | Refactor this iteration; do not defer |
These thresholds assume a meaningful sample - at least 30 commits per file over at least 6 months. Below that, the signal is noise.
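The table and the sample-size floor could be encoded roughly as follows. This is a sketch under two assumptions that the text does not settle: a value that sits exactly on a band boundary (20, 40, 70) is placed in the higher band, and only the commit-count floor is checked, not the six-month span. The function name is hypothetical.

```python
MIN_COMMITS_PER_FILE = 30  # sample-size floor stated above


def interpret_coupling(coupling_pct: float, commits_a: int, commits_b: int) -> str:
    """Map a coupling percentage to the action from the threshold table."""
    if min(commits_a, commits_b) < MIN_COMMITS_PER_FILE:
        return "insufficient data"
    if coupling_pct < 20:
        return "no action"
    if coupling_pct < 40:
        return "add to watch list"
    if coupling_pct < 70:
        return "open a refactoring ticket"
    return "refactor this iteration"


print(interpret_coupling(78, commits_a=47, commits_b=52))
print(interpret_coupling(100, commits_a=5, commits_b=5))
```

Returning "insufficient data" before looking at the percentage keeps a young file with a handful of commits from being reported as a confirmed case.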
2. Secondary metric: bidirectional change count
The simpler, less statistical version: files touched per commit. Over a window (typically 3 or 12 months), count the files each commit touched and bucket the commits by that count. The distribution is the signal:
| Files per commit | Interpretation |
|---|---|
| 1 - 3 | Normal change |
| 4 - 8 | Feature work or refactor |
| 9 - 20 | Suspicious - likely shotgun surgery |
| > 20 | Almost certainly shotgun or generated code |
Mechanical commits (formatter runs, license-header updates, codegen output) inflate this count; exclude them before computing the distribution, for example by skipping paths marked linguist-generated=true in .gitattributes or matching a path glob.
The metric to track over time is the p95 of files-per-commit. If it climbs from 6 to 14 over two quarters, shotgun surgery is accumulating somewhere even if no individual PR is alarming.
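One way to compute the p95 is sketched below. It assumes the log was produced with `git log --no-merges --name-only --pretty=format:@@%H`, where the `@@` prefix is an arbitrary marker chosen here to separate commits, and it uses the nearest-rank percentile definition; both choices are assumptions, as are the function names.

```python
def files_per_commit(log_text: str) -> list[int]:
    """Count files per commit in '@@<hash>'-delimited `git log --name-only` output."""
    counts = []
    for chunk in log_text.split("@@")[1:]:
        # First line of each chunk is the hash; the rest are file paths or blanks.
        paths = [line for line in chunk.splitlines()[1:] if line.strip()]
        counts.append(len(paths))
    return counts


def p95(values: list[int]) -> int:
    """Nearest-rank 95th percentile, using integer arithmetic for the rank."""
    if not values:
        raise ValueError("no commits in window")
    ordered = sorted(values)
    rank = (95 * len(ordered) + 99) // 100  # ceil(0.95 * n)
    return ordered[rank - 1]


# Hypothetical two-commit log, for illustration only.
sample_log = "@@a1\nOrder.java\nOrderDTO.java\n\n@@b2\nREADME.md\n"
print(files_per_commit(sample_log))
print(p95(files_per_commit(sample_log)))
```

Running this once per quarter over the same window length gives the trend line; the absolute value matters less than its direction.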
3. Hotspot metric (composite)
A file is a shotgun-surgery hotspot if it satisfies all three:
- It appears in more than 5% of all commits in the window.
- It has at least 3 partners with coupling >= 40%.
- Its cyclomatic complexity exceeds the codebase median by 2x.
The third filter discriminates between "boring high-traffic file" (e.g., a constants file, low complexity) and "central tangle" (high complexity + high coupling = the real target).
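The three conditions can be expressed as a single predicate. The sketch below assumes "exceeds the codebase median by 2x" means strictly greater than twice the median, and the `FileStats` record and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FileStats:
    path: str
    commits_touching: int              # commits in the window that modified the file
    partner_coupling_pct: list[float]  # coupling % with each co-changing file
    cyclomatic_complexity: float


def is_hotspot(f: FileStats, total_commits: int, median_complexity: float) -> bool:
    """True only if all three hotspot conditions hold."""
    high_traffic = f.commits_touching > 0.05 * total_commits
    strong_partners = sum(1 for pct in f.partner_coupling_pct if pct >= 40) >= 3
    tangled = f.cyclomatic_complexity > 2 * median_complexity
    return high_traffic and strong_partners and tangled


# Hypothetical figures: a busy but trivial constants file vs. a central tangle.
constants = FileStats("Constants.java", 60, [55, 48, 41], 2)
order_service = FileStats("OrderService.java", 35, [78, 62, 45, 12], 31)

print(is_hotspot(constants, total_commits=400, median_complexity=6))
print(is_hotspot(order_service, total_commits=400, median_complexity=6))
```

The constants file passes the traffic and partner checks and is rejected only by the complexity check, which is the discrimination the third filter is meant to provide.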
CodeScene reports hotspots ranked by a composite of code health (a 1-10 metric) and effort spent. The output is a prioritized list, not a flat dump.
4. Tools
CodeScene (commercial, hosted or on-prem). Walks the full git history, computes coupling per pair, ranks hotspots, integrates with PR checks. Output includes the X-Ray view (function-level coupling within a file), team-knowledge maps, and code-health trend lines. Free tier exists for OSS projects. Most teams end up here once they take coupling seriously.
Code Maat (open-source, JVM CLI by Adam Tornhill). The predecessor to CodeScene's analysis engine. Runs as a Clojure jar against git log output:

```shell
# Export the history in the format Code Maat's git2 parser expects
git log --all --numstat --date=short \
  --pretty=format:'--%h--%ad--%aN' --no-renames > evo.log

# Run the coupling analysis and save the result as CSV
java -jar code-maat-1.0.4-standalone.jar \
  -l evo.log -c git2 -a coupling > coupling.csv
```

The CSV has columns entity, coupled, degree, average-revs. Filter degree >= 40 to find pairs to refactor. Code Maat also offers summary, revisions, entity-effort, and entity-ownership analyses. Free, scriptable, CI-friendly.
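Filtering the coupling CSV takes only the standard library. The sketch below assumes the four columns named above and an integer `degree`; the sample rows are invented for illustration, and in practice the text would come from reading `coupling.csv`.

```python
import csv
import io


def strong_pairs(csv_text: str, min_degree: int = 40) -> list[tuple[str, str, int]]:
    """Return (entity, coupled, degree) rows at or above min_degree, strongest first."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = [
        (row["entity"], row["coupled"], int(row["degree"]))
        for row in reader
        if int(row["degree"]) >= min_degree
    ]
    return sorted(rows, key=lambda r: r[2], reverse=True)


# Hypothetical output rows, for illustration only.
sample_csv = """entity,coupled,degree,average-revs
src/Order.java,src/OrderMapper.java,64,40
src/Order.java,src/OrderDTO.java,82,45
src/Invoice.java,src/Util.java,18,33
"""

for entity, coupled, degree in strong_pairs(sample_csv):
    print(f"{degree:3d}%  {entity} <-> {coupled}")
```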
gitqualia (open-source Python). Lightweight, generates HTML reports of coupling and hotspots. Good for a first look without setting up a JVM tool.
git-of-theseus (open-source Python). Plots code-age over time, complementary metric. A file whose oldest lines keep getting younger is a shotgun-surgery target.
Custom git log scripts. For ad-hoc investigation:

```shell
# Top 20 most-changed files in the last 12 months
git log --since="12 months ago" --name-only --pretty=format: \
  | grep -v '^$' | sort | uniq -c | sort -rn | head -20

# Files most often changed alongside Order.java.
# --full-diff lists every file in each matching commit; without it,
# the path filter would restrict the output to Order.java itself.
TARGET=src/main/java/com/acme/order/Order.java
git log --since="12 months ago" --full-diff --name-only --pretty=format: \
    -- "$TARGET" \
  | grep -v '^$' | grep -vxF "$TARGET" \
  | sort | uniq -c | sort -rn | head -10
```
The second command answers the directional question: "when Order.java changes, what else changes?" The output ranks the shotgun radius.
5. Calibration warnings
Three traps that turn change coupling into noise:
5.1 Bulk renames and large reformats. A single commit that runs google-java-format on the whole repo creates artificial coupling between every pair of files in it. Filter such commits by author, message pattern, or size threshold (for example, read the file count from git log --shortstat and drop commits that touch more than 100 files).
5.2 Monorepo skew. In a polyglot monorepo, package-lock.json, pom.xml, and BUILD.bazel files change with almost everything. They will dominate the top-coupled list. Either exclude them by path glob or analyze each language tree separately.
5.3 Short history. A new file with 5 commits and 5 coupled changes shows 100% coupling but no real signal. Require a minimum of 20-30 commits before trusting the percentage.
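A pre-processing pass that applies all three calibrations before any coupling is computed might look like the sketch below. It assumes commits are held as a mapping from hash to the list of paths touched; the glob list, the 100-file cap, and the 30-commit floor mirror the examples in this section but remain tunable assumptions, and the function names are hypothetical.

```python
from collections import Counter
from fnmatch import fnmatch

EXCLUDED_GLOBS = ["*package-lock.json", "*pom.xml", "*BUILD.bazel"]  # 5.2
MAX_FILES_PER_COMMIT = 100                                           # 5.1


def clean_commits(commits: dict[str, list[str]]) -> dict[str, list[str]]:
    """Drop bulk commits entirely and strip build/lock files from the rest."""
    cleaned = {}
    for sha, files in commits.items():
        if len(files) > MAX_FILES_PER_COMMIT:
            continue
        kept = [f for f in files if not any(fnmatch(f, g) for g in EXCLUDED_GLOBS)]
        if kept:
            cleaned[sha] = kept
    return cleaned


def trusted_files(commits: dict[str, list[str]], min_commits: int = 30) -> set[str]:
    """Files with enough history for their coupling percentage to mean something (5.3)."""
    counts = Counter(f for files in commits.values() for f in set(files))
    return {f for f, n in counts.items() if n >= min_commits}


# Hypothetical history, for illustration only.
history = {
    "a1": ["src/Order.java", "pom.xml"],
    "b2": [f"src/File{i}.java" for i in range(150)],  # repo-wide reformat
    "c3": ["web/package-lock.json"],
}
print(clean_commits(history))
```

Filtering first and counting second matters: a file's commit count should not include the bulk commits that were just discarded.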
6. PR-level early warning
You can catch shotgun surgery before it lands by gating PRs:
- PR size warning. Flag PRs touching more than 15 files. Not block - warn.
- Coupling-aware diff bot. When a PR touches file A, the bot comments: "Historically, file A co-changes with files X, Y, Z. This PR does not touch them. Confirm intentional." CodeScene ships this; Code Maat plus a few hundred lines of Python reproduces it.
- Module boundary lint. ArchUnit rules that flag new edges between modules. Shotgun surgery rarely arrives in one PR; it accumulates one cross-module reference at a time.
7. Reporting cadence
A pragmatic rhythm:
| Cadence | Activity |
|---|---|
| Per PR | Files-touched count visible in CI summary |
| Weekly | Top-10 coupled pairs delta vs last week |
| Monthly | Hotspot refresh; pick one cluster to refactor |
| Quarterly | p95 files-per-commit trend; module-boundary review |
Without a cadence the data is just a graph. With one, it becomes the input to the refactoring backlog.
8. What you write down
For each detected shotgun-surgery cluster, the ticket should capture:

```text
Cluster: Order, OrderDTO, OrderMapper, OrderValidator, OrderEventV2
Window: 2025-10-01 to 2026-04-30
Commits in window: 47
Files in cluster: 5
Average coupling (pairwise): 78%
Top business reason for co-change: adding a new field to Order
Proposed fix: Inline OrderDTO into Order; move validation into Order; sealed OrderEvent
Estimated ROI: 60% reduction in files-per-commit for order changes
```

This template, repeated for every cluster, turns the abstract smell into a tracked refactoring stream.
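The "Average coupling (pairwise)" field can be derived from the same commit sets used for the primary metric. The sketch below assumes it is the mean Jaccard coupling over all unordered pairs in the cluster, which is one plausible reading of the field; the function name and sample data are hypothetical.

```python
from itertools import combinations


def average_pairwise_coupling(commit_sets: dict[str, set[str]]) -> float:
    """Mean Jaccard coupling, as a percentage, over every pair of files in a cluster."""
    pairs = list(combinations(sorted(commit_sets), 2))
    if not pairs:
        return 0.0
    total = 0.0
    for a, b in pairs:
        union = commit_sets[a] | commit_sets[b]
        if union:
            total += len(commit_sets[a] & commit_sets[b]) / len(union)
    return 100 * total / len(pairs)


# Hypothetical three-file cluster, for illustration only.
cluster = {
    "Order.java": {"c1", "c2", "c3", "c4"},
    "OrderDTO.java": {"c1", "c2", "c3", "c4"},
    "OrderMapper.java": {"c1", "c2"},
}
print(f"{average_pairwise_coupling(cluster):.0f}%")
```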
9. Canonical literature: where the smell and its cures are defined
The metrics above quantify the smell; the canonical text names it and prescribes the refactorings. Map every claim back to these sources.
| Claim | Authoritative source |
|---|---|
| The smell itself: "one change → many little edits in many classes" | Fowler, Refactoring, 2nd ed. (2018), ch. 3, "Shotgun Surgery" |
| Its mirror image: "one class → many reasons to change" | Fowler, Refactoring, 2nd ed., ch. 3, "Divergent Change" |
| Cure — gather scattered behaviour onto its data | Fowler, ch. 8, Move Function (was Move Method) and Move Field |
| Cure — fold a thin helper back into its owner | Fowler, ch. 7, Inline Class |
| Cure — give a smeared free-function family a class home | Fowler, ch. 6, Combine Functions into Class |
| Cure — replace scattered switch/type-code with dispatch | Fowler, ch. 10, Replace Conditional with Polymorphism; ch. 12, Replace Type Code with Subclasses |
| The underlying principle both smells violate | Martin, Clean Code (2008), ch. 10, and Agile Software Development (2002), ch. 8 — Single Responsibility Principle |
| "Reason to change" = a single actor/stakeholder | Martin, Clean Architecture (2017), ch. 7 — SRP restated as "one actor" |
The two-sentence diagnosis Fowler gives is exact and worth memorising verbatim: Divergent Change occurs "when one class is commonly changed in different ways for different reasons"; Shotgun Surgery is "the opposite … when every time you make a kind of change, you have to make a lot of little changes to a lot of different classes." Both are SRP failures — Divergent Change crams many responsibilities into one class; Shotgun Surgery smears one responsibility across many. See ../../03-design-principles/01-solid-principles/.
10. Connascence: the precise coupling vocabulary
Meilir Page-Jones's connascence gives the smell a sharper name than "coupling". Two elements are connascent if a change to one requires a matching change to the other to preserve correctness. Shotgun Surgery is high degree connascence (many elements connascent on one fact) combined with low locality (those elements live far apart). The specific forms that produce scattered edits:
| Connascence form | How it causes Shotgun Surgery | Fix direction |
|---|---|---|
| Connascence of Name | A field/enum-constant name (Currency.EUR, status == "SHIPPED") is repeated across N call sites; renaming forces N edits | Encapsulate; let one type own the name |
| Connascence of Position | The same positional argument order or tuple layout (street, city, zip) recurs in many signatures; adding a field reorders all of them | Introduce a value object / record (see ../08-data-clumps/) |
| Connascence of Algorithm | The same rule (a regex, a tax formula) is copy-pasted; changing it requires a treasure hunt | Extract to one place; Combine Functions into Class |
| Connascence of Type / Meaning | A magic value's interpretation is duplicated across modules | Replace with a named type / sealed hierarchy |
Page-Jones's two operative laws apply directly: minimise overall connascence by encapsulation, and where connascence remains, maximise its locality — keep connascent elements in the same class/module so a change stays in one file. Shotgun Surgery is exactly the violation of the locality law. The remedy is always to raise locality: move the connascent elements into one home (Move Function/Move Field/Combine Functions into Class), then dissolve the empty helpers (Inline Class).
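To make the locality law concrete, here is a small sketch of the Connascence of Position fix from the table: the positional (street, city, zip) layout is replaced by one value object that owns the field names. The `Address` type, its `country` field with its default, and both functions are hypothetical illustrations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Address:
    street: str
    city: str
    zip_code: str
    country: str = "DE"  # added later; the default is an arbitrary example value


def shipping_label(address: Address) -> str:
    """Reads fields by name, so adding a field to Address does not change this signature."""
    return f"{address.street}, {address.zip_code} {address.city}"


def tax_region(address: Address) -> str:
    """The only function that needed editing when `country` was introduced."""
    return address.country


home = Address(street="Main St 1", city="Berlin", zip_code="10115")
print(shipping_label(home))
print(tax_region(home))
```

With bare positional arguments, adding `country` would have meant editing every signature and call site that passes the tuple along; with the value object, the edit stays in `Address` and in the one function that uses the new field.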
11. Reading list
- Martin Fowler — Refactoring: Improving the Design of Existing Code, 2nd ed., Addison-Wesley, 2018. Ch. 3 names Shotgun Surgery and Divergent Change as a paired diagnosis; chs. 6–8 and 10–12 give the cures (Combine Functions into Class, Inline Class, Move Function, Move Field, Replace Conditional with Polymorphism, Replace Type Code with Subclasses).
- Robert C. Martin — Clean Code, Prentice Hall, 2008, ch. 10, and Agile Software Development, 2002, ch. 8. The Single Responsibility Principle, which both smells violate.
- Robert C. Martin — Clean Architecture, Prentice Hall, 2017, ch. 7. SRP recast as "a module should have one, and only one, reason to change — one actor." The cleanest lens for locating the misplaced responsibility.
- Meilir Page-Jones — What Every Programmer Should Know About Object-Oriented Design, Dorset House, 1995. The connascence taxonomy; degree and locality; the two laws this file leans on.
- Adam Tornhill — Your Code as a Crime Scene, Pragmatic Bookshelf, 2015 (2nd ed. 2024), and Software Design X-Rays, 2018. Temporal coupling, change coupling, and the hotspot metrics in §§1–3 above.
- Michael Feathers — Working Effectively with Legacy Code, Prentice Hall, 2004. Seams and characterization tests — the safety net for gathering scattered behaviour without behaviour change.
- Kent Beck — Tidy First?, O'Reilly, 2023. Small, safe, reversible structural moves — the discipline for executing a gather refactor incrementally.
The spec sections in this file measure the smell; the literature above defines it and prescribes the fix. Reach for the metrics to decide which cluster to attack; reach for Fowler's catalogue to decide which move dissolves it.
Memorize this: Change coupling above 40%, measured over at least 30 commits per file, marks a refactor candidate; above 70% it confirms shotgun surgery. Code Maat or CodeScene computes it; a 30-line shell script approximates it. Track p95 files-per-commit as the leading indicator, refactor the top cluster monthly, and gate PR size as a guardrail.