Coverage — Professional¶

By the time you have a sizeable Go service in production, coverage stops being a tutorial topic and becomes an organizational lever. This page is about applying the tooling to large teams: how to set targets that improve quality instead of corrupting it, what to do with services like Codecov and Coveralls, how to detect "fake coverage", and how to reason about coverage as a signal rather than a goal.

1. Coverage as signal, not target¶

The most important rule of professional coverage practice follows from Goodhart's law: when a measure becomes a target, it ceases to be a good measure. Coverage tracks one specific thing: what fraction of statements ran during the test suite. It does not track:

correctness of assertions (you can cover 100% with no t.Errorf at all),
meaningful inputs (a test calling Process(nil) is one test, not coverage of all valid inputs),
branch combinations (a && b has four input combinations, but statement coverage distinguishes at most two outcomes: the guarded block ran or it did not),
concurrent interleavings,
failure modes that depend on environment, time, or external services,
the absence of defensive code that should exist.

Use coverage as a quick smell test: low coverage is a problem; high coverage is not proof of anything. A 30% number is bad, a 90% number is suspicious until you read the tests, and a 100% number deserves the closest reading of all, because it is often reached by tests that exercise execution without verifying behavior.
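The gap between statement coverage and operand combinations can be made concrete with a small sketch. The function names and the refund rule below are hypothetical, chosen only for illustration: two inputs execute every statement of both the correct and the buggy version, yet only the full set of four combinations tells them apart.

```go
package main

import "fmt"

// canRefund is the intended rule: both conditions must hold.
func canRefund(isOwner, withinWindow bool) bool {
	if isOwner && withinWindow {
		return true
	}
	return false
}

// canRefundBuggy uses || by mistake. It has the same statements, so the
// same two inputs used below execute every statement in it as well.
func canRefundBuggy(isOwner, withinWindow bool) bool {
	if isOwner || withinWindow {
		return true
	}
	return false
}

type input struct{ isOwner, withinWindow bool }

// agree reports whether both implementations return the same result
// for every input in the given set.
func agree(inputs []input) bool {
	for _, in := range inputs {
		if canRefund(in.isOwner, in.withinWindow) != canRefundBuggy(in.isOwner, in.withinWindow) {
			return false
		}
	}
	return true
}

func main() {
	// Two inputs reach both return statements: 100% statement coverage.
	twoInputs := []input{{true, true}, {false, false}}
	// All four combinations of the two operands.
	allInputs := []input{{true, true}, {true, false}, {false, true}, {false, false}}

	fmt.Println("two inputs expose the bug:", !agree(twoInputs))
	fmt.Println("four inputs expose the bug:", !agree(allInputs))
}
```

A test suite built from the two-input set would report full statement coverage for either version, which is why the percentage alone cannot vouch for the condition.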

2. Setting team policies that work¶

A working coverage policy has four properties:

It rewards adding tests where it matters and tolerates uncovered code where it does not.
It is enforced automatically, not in code review.
It is robust to refactors that legitimately reduce a percentage by removing dead code.
It is owned by the team, not imposed by a quality department.

Concretely:

Per-directory floors, not whole-repo floors. For example, internal/billing/ must be at 85% and rising; cmd/admin-cli/ may be at 0%.
Delta gates, not absolute gates. A PR may merge if the delta on touched files is non-negative, even if the absolute number stays at 60%.
Critical-path tagging. Mark specific functions or packages as "covered or fail". This is more useful than a global percentage.
No reward for trivial coverage. Constructors, getters, and one-line String() methods should be excluded from the denominator if they dominate the count.

3. The fake-coverage problem¶

Engineers under coverage pressure produce tests like this:

func TestProcessAllBranches(t *testing.T) {
    Process(input1)
    Process(input2)
    Process(input3)
}

There are no assertions. The function runs and the test passes regardless of output. The coverage number goes up. The bug rate does not go down.

Detecting fake coverage at scale is hard. Heuristics that work:

Lint for tests with no calls to t.Error*, t.Fatal*, t.Fail*, require.*, or assert.* in the test body. (False positives include sub-test setup and tests that rely on Example outputs.)
Mutation testing: tools like go-mutesting apply small changes (mutations) to the source and check whether any test fails. If a mutation survives, the covered code is not meaningfully tested. Mutation scores are a much stronger signal than coverage but are 10–100x slower to compute.
Code review focus: when a PR adds coverage but no assertions, push back.

In practice the social mechanism (review culture) outperforms automation for fake coverage, because the human reviewer can see at a glance whether the test exercises behavior or just executes code.
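The lint heuristic above can be sketched with go/parser and go/ast from the standard library. This is a rough illustration, not an existing linter: it assumes the testing parameter is named t and that testify-style packages are imported under the names require and assert, so it shares the false positives already mentioned.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"strings"
)

// sampleTests is a made-up test file; Process is hypothetical and does
// not need to exist because the source is only parsed, never compiled.
const sampleTests = `package billing

import "testing"

func TestProcessAllBranches(t *testing.T) {
	Process(1)
	Process(2)
}

func TestProcessRejectsNegative(t *testing.T) {
	if err := Process(-1); err == nil {
		t.Fatal("expected an error for negative input")
	}
}
`

// hasAssertion reports whether body contains a call such as t.Errorf,
// t.Fatal, t.FailNow, require.X, or assert.X.
func hasAssertion(body *ast.BlockStmt) bool {
	found := false
	ast.Inspect(body, func(n ast.Node) bool {
		call, ok := n.(*ast.CallExpr)
		if !ok {
			return true
		}
		sel, ok := call.Fun.(*ast.SelectorExpr)
		if !ok {
			return true
		}
		recv, ok := sel.X.(*ast.Ident)
		if !ok {
			return true
		}
		name := sel.Sel.Name
		switch {
		case recv.Name == "require" || recv.Name == "assert":
			found = true
		case recv.Name == "t" && (strings.HasPrefix(name, "Error") ||
			strings.HasPrefix(name, "Fatal") || strings.HasPrefix(name, "Fail")):
			found = true
		}
		return !found // stop descending once an assertion is seen
	})
	return found
}

// assertionFreeTests returns the names of top-level TestXxx functions
// in src whose bodies contain no recognizable assertion call.
func assertionFreeTests(src string) ([]string, error) {
	file, err := parser.ParseFile(token.NewFileSet(), "sample_test.go", src, 0)
	if err != nil {
		return nil, err
	}
	var names []string
	for _, decl := range file.Decls {
		fn, ok := decl.(*ast.FuncDecl)
		if !ok || fn.Recv != nil || fn.Body == nil || !strings.HasPrefix(fn.Name.Name, "Test") {
			continue
		}
		if !hasAssertion(fn.Body) {
			names = append(names, fn.Name.Name)
		}
	}
	return names, nil
}

func main() {
	names, err := assertionFreeTests(sampleTests)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("tests without assertions:", names)
}
```

A check like this is best run as an informational report; tests that delegate assertions to a helper function would be flagged incorrectly.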

4. Codecov and Coveralls integration¶

Codecov and Coveralls are SaaS services that consume Go coverage profiles, display them as PR comments, and store history. They share the same general workflow:

# In CI:
go test -race -coverprofile=cover.out -covermode=atomic ./...

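# Codecov (legacy bash uploader, which Codecov has deprecated; newer setups use the Codecov CLI or the official GitHub Action):
bash <(curl -s https://codecov.io/bash) -f cover.out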

# Coveralls:
goveralls -coverprofile=cover.out -service=github

Both tools accept the legacy text profile format. Useful features:

PR coverage delta: a comment on every PR showing per-file coverage changes. This is the single most useful feature; it reframes coverage as a delta metric.
Sunburst and graph views: visualizations of coverage by directory.
Coverage history: a trend graph. Useful for spotting "the day the metric collapsed" (usually a generated-code dump).
Status checks: GitHub or GitLab status checks that block PRs failing a configured policy.

Configuration tips:

Configure codecov.yml (or .coveralls.yml) at the repo root with directory-specific targets and ignore patterns for generated code, vendor, and mocks.
Upload from a single CI job that runs the full test suite. If the suite is sharded, upload each shard's profile for the same commit and label it with Codecov's flags feature so the uploads are merged into one report.
Use the informational: true setting in Codecov to display deltas without blocking merges, then ratchet up once the team trusts the numbers.

5. Coverage in monorepos¶

A monorepo with hundreds of packages multiplies coverage complexity:

A single go test -cover ./... invocation can run out of memory or take hours.
Different teams own different subdirectories with different quality bars.
Generated code (proto, mock, OpenAPI) explodes the line count.

Practical patterns:

Per-team profiles: each team's CI runs only go test -cover ./teams/foo/... and uploads with a team-specific flag.
Shared core profile: a single job runs go test -cover -coverpkg=./internal/core/... ./... so that integration tests in any team count toward core coverage.
Profile merging: a nightly job merges all profiles for a global view, using go tool covdata merge for the binary coverage data directories of Go 1.20+, or text-profile arithmetic built on golang.org/x/tools/cover for legacy text profiles.
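For text profiles, the merge arithmetic is simple enough to sketch with the standard library alone: identical blocks are matched by their file-and-range key and their hit counts are summed. The sketch below is an assumption-laden illustration (all inputs share one mode, no spaces in file paths, output sorted lexicographically rather than by position); dedicated tools handle more edge cases.

```go
package main

import (
	"bufio"
	"fmt"
	"sort"
	"strconv"
	"strings"
)

// mergeProfiles merges Go text coverage profiles by summing the hit
// counts of blocks that share the same "file:range" key.
func mergeProfiles(profiles ...string) (string, error) {
	type entry struct{ stmts, count int }
	merged := map[string]*entry{}
	mode := ""
	for _, p := range profiles {
		sc := bufio.NewScanner(strings.NewReader(p))
		for sc.Scan() {
			line := strings.TrimSpace(sc.Text())
			if line == "" {
				continue
			}
			if strings.HasPrefix(line, "mode:") {
				m := strings.TrimSpace(strings.TrimPrefix(line, "mode:"))
				if mode == "" {
					mode = m
				} else if m != mode {
					return "", fmt.Errorf("cannot merge modes %q and %q", mode, m)
				}
				continue
			}
			fields := strings.Fields(line)
			if len(fields) != 3 {
				return "", fmt.Errorf("malformed profile line: %q", line)
			}
			stmts, errStmts := strconv.Atoi(fields[1])
			count, errCount := strconv.Atoi(fields[2])
			if errStmts != nil || errCount != nil {
				return "", fmt.Errorf("malformed profile line: %q", line)
			}
			if e, ok := merged[fields[0]]; ok {
				e.count += count
			} else {
				merged[fields[0]] = &entry{stmts: stmts, count: count}
			}
		}
		if err := sc.Err(); err != nil {
			return "", err
		}
	}

	keys := make([]string, 0, len(merged))
	for k := range merged {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	var out strings.Builder
	fmt.Fprintf(&out, "mode: %s\n", mode)
	for _, k := range keys {
		e := merged[k]
		count := e.count
		if mode == "set" && count > 1 {
			count = 1 // set mode records only hit or not hit
		}
		fmt.Fprintf(&out, "%s %d %d\n", k, e.stmts, count)
	}
	return out.String(), nil
}

func main() {
	// Made-up profiles from two hypothetical team jobs.
	teamA := "mode: set\npkg/a.go:1.1,2.2 3 0\npkg/b.go:1.1,2.2 2 1\n"
	teamB := "mode: set\npkg/a.go:1.1,2.2 3 1\npkg/b.go:1.1,2.2 2 1\n"
	merged, err := mergeProfiles(teamA, teamB)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Print(merged)
}
```

A block counts as covered in the merged view if any contributing job hit it, which is the property the global dashboard needs.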

6. Identifying low-value 100%-covered code¶

Some code can be trivially covered without producing useful tests:

Getters and setters: func (x *T) Foo() string { return x.foo }.
One-line constructors: func NewT() *T { return &T{} }.
String() methods on enums.
Error() methods returning a constant.

These pad the denominator but contribute nothing. To identify them programmatically:

Parse the profile.
Walk the AST of each covered file.
Mark any function body whose statement count is ≤2 and contains no control flow as "trivial".
Compute the coverage percentage excluding trivial functions.

The second number is usually 5–15 points lower and is a more honest indicator. Some teams call this "meaningful coverage" and track it as a secondary metric.
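The "mark trivial functions" step can be sketched with go/ast. The sketch below covers only that step, under stated assumptions: it counts top-level statements in the body (not the statement counts recorded in the profile), and it treats if, for, range, switch, select, go, defer, and function literals as control flow. The sample source is hypothetical.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// sampleSource is a made-up file; errEmptyName is never resolved because
// the source is only parsed, not type-checked.
const sampleSource = `package shop

type Item struct{ name string }

func NewItem() *Item { return &Item{} }

func (i *Item) Name() string { return i.name }

func (i *Item) Validate() error {
	if i.name == "" {
		return errEmptyName
	}
	return nil
}
`

// isTrivial reports whether fn has at most two top-level statements and
// contains no control flow anywhere in its body.
func isTrivial(fn *ast.FuncDecl) bool {
	if fn.Body == nil || len(fn.Body.List) > 2 {
		return false
	}
	trivial := true
	ast.Inspect(fn.Body, func(n ast.Node) bool {
		switch n.(type) {
		case *ast.IfStmt, *ast.ForStmt, *ast.RangeStmt, *ast.SwitchStmt,
			*ast.TypeSwitchStmt, *ast.SelectStmt, *ast.GoStmt,
			*ast.DeferStmt, *ast.FuncLit:
			trivial = false
		}
		return trivial // stop descending once control flow is found
	})
	return trivial
}

// trivialFuncs returns the names of trivial functions and methods in src,
// in source order.
func trivialFuncs(src string) ([]string, error) {
	file, err := parser.ParseFile(token.NewFileSet(), "sample.go", src, 0)
	if err != nil {
		return nil, err
	}
	var names []string
	for _, decl := range file.Decls {
		if fn, ok := decl.(*ast.FuncDecl); ok && isTrivial(fn) {
			names = append(names, fn.Name.Name)
		}
	}
	return names, nil
}

func main() {
	names, err := trivialFuncs(sampleSource)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("trivial functions:", names)
}
```

To finish the computation, the line ranges of the flagged functions would be subtracted from both the covered and total statement counts taken from the profile.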

7. Coverage budgets for new code only¶

A pragmatic policy that resists Goodhart: only enforce coverage on new code in a PR.

# Pseudocode
changed_lines = lines added in: git diff -U0 main...HEAD -- '*.go'
covered_changed_lines = changed_lines inside a cover.out block with count > 0
if covered_changed_lines / changed_lines < 0.80: fail

This is the policy embodied by tools like diff_cover. It encourages teams to test what they write today without forcing a quixotic backfill of legacy code. Critical-path tagging supplements this for legacy code that must improve.
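The intersection step of a new-code gate can be sketched in Go. The sketch assumes the changed lines have already been extracted from the diff and mapped to the file names used in the profile (profiles use import paths, diffs use repository paths); the file names, line numbers, and the 80% threshold are illustrative only. Changed lines that fall in no profile block, such as comments and declarations, are left out of the ratio.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// block is one entry of a text coverage profile.
type block struct {
	file               string
	startLine, endLine int
	count              int
}

// parseBlock parses a profile line of the form
// file:startLine.startCol,endLine.endCol numStmts count
func parseBlock(line string) (block, error) {
	bad := fmt.Errorf("malformed profile line: %q", line)
	fields := strings.Fields(line)
	if len(fields) != 3 {
		return block{}, bad
	}
	colon := strings.LastIndex(fields[0], ":")
	if colon < 0 {
		return block{}, bad
	}
	start, end, ok := strings.Cut(fields[0][colon+1:], ",")
	if !ok {
		return block{}, bad
	}
	startLine, _, _ := strings.Cut(start, ".")
	endLine, _, _ := strings.Cut(end, ".")
	s, errS := strconv.Atoi(startLine)
	e, errE := strconv.Atoi(endLine)
	c, errC := strconv.Atoi(fields[2])
	if errS != nil || errE != nil || errC != nil {
		return block{}, bad
	}
	return block{file: fields[0][:colon], startLine: s, endLine: e, count: c}, nil
}

// changedLineCoverage counts changed lines that fall inside at least one
// profile block (measured) and those where such a block was hit (covered).
func changedLineCoverage(blocks []block, changed map[string][]int) (covered, measured int) {
	for file, lines := range changed {
		for _, line := range lines {
			inBlock, hit := false, false
			for _, b := range blocks {
				if b.file == file && line >= b.startLine && line <= b.endLine {
					inBlock = true
					hit = hit || b.count > 0
				}
			}
			if inBlock {
				measured++
				if hit {
					covered++
				}
			}
		}
	}
	return covered, measured
}

func main() {
	var blocks []block
	for _, line := range []string{
		"example.com/shop/pay.go:10.2,12.3 2 1",
		"example.com/shop/pay.go:14.2,16.3 2 0",
	} {
		b, err := parseBlock(line)
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		blocks = append(blocks, b)
	}
	// Hypothetical changed lines, as if extracted from the PR diff.
	changed := map[string][]int{"example.com/shop/pay.go": {5, 11, 15}}

	covered, measured := changedLineCoverage(blocks, changed)
	if measured > 0 && float64(covered)/float64(measured) < 0.80 {
		fmt.Printf("FAIL: %d of %d changed lines covered\n", covered, measured)
		return
	}
	fmt.Println("OK")
}
```

Treating a line as covered when any enclosing block was hit is the lenient choice; a stricter gate could require every enclosing block to be hit.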

8. Coverage reports as PR artifacts¶

Every PR should include a coverage delta as a first-class artifact. Useful forms:

PR comment: per-file delta with red/green coloring. Codecov and Coveralls do this automatically.
HTML artifact: the output of go tool cover -html uploaded to the CI's artifact store, linked from the PR.
Annotated diff: lines added in the PR that are not covered by tests, displayed inline in the review UI. On GitHub, Codecov's checks integration can produce these annotations.

The PR comment is the highest-leverage piece because it puts the information in front of every reviewer with zero clicks.

9. Coverage during incident response¶

During an outage post-mortem, coverage often gets blamed: "we had 80% coverage and still crashed". Resist this framing. Instead, ask three more useful questions:

Was the specific code path that failed covered?
If yes, what assertions did the test make? Did they cover the failure mode?
If no, what was the cost of writing that test before the incident, and is it worth writing now?

This redirects post-mortem energy from "raise coverage" (which produces fake coverage) to "test the next-most-likely failure" (which actually reduces incident rate).

10. Coverage for libraries vs. services¶

The two have different optima:

Libraries ship to many callers, have stable API surface, and benefit from very high coverage (95%+). Every public function is part of the contract. Test the contract.
Services have many internal helpers, mutable architecture, and short-lived implementation details. A blanket 95% target slows refactors. 70–80% with strong assertions on critical paths is usually better.

Mature teams set different targets for pkg/ (library code) and internal/ (service code).

11. Coverage versus mutation testing¶

When coverage saturates and the team wants a deeper quality signal, mutation testing is the next step. A mutation tool like go-mutesting mutates source (e.g. > → >=, + → -, removed return statements) and runs the test suite. A "killed" mutation is one that makes at least one test fail; a "survived" mutation is one that compiles and passes every test despite changed behavior, which is a strong signal that the covered code is under-asserted.

Mutation scores are expensive (each mutation requires a full test build/run). Used selectively on critical packages, they reveal fake coverage that no static analysis can find. They are not a replacement for coverage but a complement.
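The killed/survived distinction can be shown by hand without any tool. In this sketch the function, the mutant, and both suites are hypothetical: the mutant applies the > to >= change, the weak suite executes every statement yet lets the mutant survive, and one boundary case is enough to kill it.

```go
package main

import "fmt"

// overLimit is the original code under test.
func overLimit(balance, limit int) bool { return balance > limit }

// overLimitMutant is what a mutation tool might generate: > became >=.
func overLimitMutant(balance, limit int) bool { return balance >= limit }

type testCase struct {
	balance, limit int
	want           bool
}

// weakSuite executes the function on both outcomes but skips the boundary.
var weakSuite = []testCase{
	{150, 100, true},
	{50, 100, false},
}

// strongSuite adds the boundary case where > and >= disagree.
var strongSuite = []testCase{
	{150, 100, true},
	{50, 100, false},
	{100, 100, false},
}

// passes reports whether impl returns the wanted result for every case.
func passes(impl func(int, int) bool, suite []testCase) bool {
	for _, tc := range suite {
		if impl(tc.balance, tc.limit) != tc.want {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println("original passes weak suite:", passes(overLimit, weakSuite))
	fmt.Println("mutant survives weak suite:", passes(overLimitMutant, weakSuite))
	fmt.Println("mutant killed by strong suite:", !passes(overLimitMutant, strongSuite))
}
```

Both suites give the original function the same statement coverage; only the mutation result separates them.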

12. Integration coverage policy¶

With Go 1.20+ binary coverage, you can require that integration tests against a staging environment exercise at least X% of the production code. This is much harder to game than unit coverage because it requires real traffic patterns.

A working pattern:

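The production binary is built a second time with go build -cover.
A non-production staging environment runs that instrumented binary with GOCOVERDIR set, so coverage data files accumulate in that directory when each process exits.
The data is converted with go tool covdata textfmt and merged nightly with unit-test coverage.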
The merged percentage and the per-file delta are reported to Codecov.

Teams report that integration coverage often raises the headline number while revealing uncovered error paths that unit tests never reached (because unit tests rarely exercise the timeout/retry/circuit-breaker error paths exercised by real traffic).

13. Anti-patterns to avoid¶

Linting "test files must have coverage": makes no sense; coverage is of production code.
Demanding 100% across the board: drives engineers to write tests for if err != nil { return err } lines and ignore real defects.
Blocking PRs on absolute coverage: penalizes refactors that delete dead code (which appears as "lowering coverage").
Using coverage as a performance metric for engineers: turns it from a signal into a target instantly. Coverage belongs to teams, not individuals.
Treating coverage and assertion quality as one number: two engineers can produce identical coverage numbers with wildly different test quality. The number alone is not enough.

14. Summary table¶

Stage	Useful coverage practice
0–30% coverage	Add tests anywhere; the number is meaningfully low.
30–70%	Target critical paths; per-directory floors.
70–85%	Per-file deltas on PRs; ignore generated code.
85%+	Mutation testing; integration coverage; meaningful-coverage refinement.
100% claimed	Treat as a red flag until tests reviewed.

Coverage in a professional setting is a habit, not a number. The habit is: every change adds meaningful tests for the behavior it changes, and the tooling makes the gaps visible. The number is a side effect.

15. Coverage in incident response¶

After a production incident, the natural reaction is "we need more tests". Coverage data can guide that, but it can also mislead:

The code that failed may already be covered. The bug was in the test's assumptions, not in the line's execution.
The bug may be a race condition that no statement-level coverage could detect.
The bug may be in a third-party library whose code is not in your profile.

The right post-incident question is not "what was the coverage of this line?" but "what test would have caught this?". Coverage is one input, but the post-mortem should focus on test design, not test quantity.

That said, if the post-mortem identifies that the buggy code was at 0% coverage, the absence of coverage is itself a finding. Write a regression test, raise coverage of that file, and move on.

16. Coverage and engineering performance reviews¶

A persistent question: should engineer-level coverage be a performance metric? Teams that have tried it consistently answer: no.

When coverage becomes an individual performance metric:

Engineers write assertion-free tests to pad their numbers.
PR reviewers stop pushing back on weak tests because the author needs the credit.
Engineers avoid working on hard-to-test areas because the coverage hit reflects badly on them.
The coverage number drifts up while bug rates stay constant.

Use team-level coverage trends to evaluate testing culture; use individual contributions in code review for technical depth. Do not conflate the two.

17. The "coverage as documentation" pattern¶

A subtler use of coverage: as documentation of system behavior. A well-written test that covers a function communicates how the function is meant to be used. Combined with Example functions and godoc, the tests that produce the coverage double as documentation.

For libraries especially, every public function should have:

An Example (covers happy path, doubles as doc).
A TestXxx for edge cases (covers boundaries).
Real-world consumer code in the test suite (validates integration assumptions).

These three together push coverage up and serve as living documentation. Users of the library can read the tests to understand the contract.

18. Coverage for security-critical code¶

Security-critical code (authentication, authorization, cryptography wrappers, input validation) deserves higher coverage standards than business logic. Common practice:

Mandatory 90%+ coverage on internal/auth/, internal/crypto/, internal/security/.
Mandatory mutation testing on the same packages.
Mandatory third-party security review for non-trivial changes.

Coverage is necessary but not sufficient. A 100%-covered authentication module can still have constant-time-comparison bugs, padding-oracle vulnerabilities, or session-fixation flaws. Coverage tells you the lines were run; security audits tell you the lines were right.
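The constant-time case is a compact illustration. Both hypothetical helpers below return the same results for the same inputs, so any functional test gives them identical coverage, yet only the second uses crypto/subtle.ConstantTimeCompare, whose running time does not depend on where equal-length inputs first differ.

```go
package main

import (
	"crypto/subtle"
	"fmt"
)

// tokenEqualNaive compares with ==. The comparison may return as soon as
// a differing byte is found, which can leak timing information.
func tokenEqualNaive(got, want string) bool {
	return got == want
}

// tokenEqualConstantTime uses subtle.ConstantTimeCompare, which returns 1
// when the slices are equal and 0 otherwise.
func tokenEqualConstantTime(got, want string) bool {
	return subtle.ConstantTimeCompare([]byte(got), []byte(want)) == 1
}

func main() {
	const want = "s3cr3t-token" // made-up value for illustration
	for _, got := range []string{"s3cr3t-token", "s3cr3t-tokeX", "short"} {
		// The two helpers agree on every input, so coverage and
		// functional assertions cannot tell them apart.
		fmt.Println(got, tokenEqualNaive(got, want), tokenEqualConstantTime(got, want))
	}
}
```

Telling the two apart takes a reviewer or a targeted static check, which is the point of pairing coverage with security review.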

19. Coverage in regulated environments¶

For projects subject to regulatory compliance (medical devices, financial systems, aerospace), coverage requirements may be externally imposed:

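FDA: medical-device software is commonly developed to IEC 62304, which scales verification rigor with the software safety class; coverage of safety-critical code is typical supporting evidence.
FAA: DO-178C requires statement coverage at Level C, decision coverage at Level B, and MC/DC at Level A (Go's native tooling reports statement coverage only; this is a problem for the higher levels).
PCI DSS: does not prescribe a coverage percentage; coverage reports usually serve as one piece of evidence for its secure-development requirements.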

For Go projects in regulated contexts, expect to supplement the native coverage tooling with:

Third-party branch-coverage tools (rare, usually built in-house).
Manual documentation of why specific gaps are acceptable.
Mutation-testing reports as a stronger proxy for "tested" than coverage alone.

Most Go services are not in regulated environments; this section is informational.

20. Coverage governance over years¶

For a long-lived Go service, coverage policy should evolve:

Year 1: establish baseline; no gates; let trends emerge.
Year 2: introduce per-directory floors for critical packages.
Year 3: introduce delta gates; PR cannot reduce coverage of touched files.
Year 4: introduce mutation testing on critical packages; coverage targets become floors, mutation score becomes the active discipline.
Year 5+: re-evaluate; some packages may be deprecated, others may be promoted to "critical".

Coverage policies are not set-and-forget. They reflect the maturing testing culture of the team.

21. Coverage of cross-cutting concerns¶

Logging, metrics, tracing, error wrapping — these are cross-cutting concerns that appear throughout a codebase. Their direct coverage is typically high (every function logs, so every log statement is covered) but their behavioral correctness is rarely tested.

Examples of cross-cutting concerns where coverage misleads:

"Did we log the right thing?" — coverage says yes (the log call ran), but the message format could be wrong.
"Did the trace span have the right attributes?" — coverage says the call to set attributes ran, but the test does not check what was set.
"Did the metric increment?" — covered, but the metric name or labels could be wrong.

For these, you need behavioral tests with assertions on the side effects (captured logs, captured spans, captured metrics) — not just coverage.

22. Coverage as a team contract¶

Some teams use coverage as a contract among themselves:

"We don't merge code that lowers coverage of touched files."
"We don't merge code that introduces uncovered defensive branches without justification."
"We don't write tests without assertions just to lift the number."

This is an informal contract, enforced by code review, not CI. The CI provides the data; the team provides the discipline.

The strength of this pattern is that it makes coverage a shared value, not a rule imposed from above. The weakness is that it requires consistent enforcement; if any team member relaxes the standard, the contract erodes.

23. Tooling beyond `go tool cover`¶

The Go ecosystem has several third-party coverage tools:

gocovsh: an interactive TUI for browsing coverage profiles. Useful for fast inspection.
gocovmerge: a small tool for merging multiple profiles into one, simpler than rolling your own.
gocover-cobertura: converts profiles to Cobertura XML for Jenkins/GitLab CI.
goverbose: a wrapper that produces more detailed reports.
golangci-lint: not a coverage tool but indispensable alongside; the gocyclo, nakedret, and errcheck linters surface code-quality issues that coverage cannot.

Pick what fits your workflow. The defaults are usually enough.

24. The "coverage trap" for new engineers¶

A new engineer joining a Go team is likely to ask: "what is our coverage target?". The right answer is usually "we don't have one — we use coverage as a guide, not a target". The wrong answer is "85% on every PR".

The wrong answer leads predictably to:

Time wasted on trivial tests.
Frustration when a clean refactor temporarily drops coverage.
Cynicism about testing in general.

A senior engineer's first conversation with a new hire on testing should make this distinction clear. Coverage is one input; engineering judgment is the output.

25. Coverage as part of onboarding¶

When a new engineer joins, they should be shown:

How to run coverage locally.
How to read the HTML report.
How the CI displays coverage on PRs.
The team's culture around coverage (what is gated, what is informational).
The Goodhart trap and why the team does not chase the absolute number.

This is a 30-minute onboarding session. It saves months of confusion and prevents the new engineer from developing the wrong instincts.

26. Decommissioning coverage¶

If a project is being decommissioned (sunset, archived), there is no value in maintaining coverage discipline. Decommissioning steps:

Stop reporting coverage in CI.
Cancel Codecov/Coveralls subscriptions for the project.
Archive past coverage data if regulatory requirements demand it.
Remove coverage gates from CI configs.

This is a freeing change. Resources go to projects that benefit from the discipline.

27. Cross-language coverage in polyglot teams¶

A team with Go, Python, and TypeScript services should standardize on one coverage SaaS (Codecov is the typical choice) for consistency. Each language has its own coverage tool but they all emit a format Codecov accepts:

Go: cover.out text profile.
Python: coverage.xml Cobertura format.
TypeScript: LCOV format from Istanbul/c8.

The unified Codecov dashboard shows per-service trends. Inter-service comparisons are uneven (different languages have different coverage semantics), but within a service the trend is meaningful.

28. A short essay on Goodhart's law¶

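The British economist Charles Goodhart observed in 1975 that a statistical regularity tends to collapse once pressure is placed on it for control purposes. The popular wording, "when a measure becomes a target, it ceases to be a good measure", is a later paraphrase. The original context was monetary policy: any indicator the Bank of England targeted would be gamed.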

For coverage, the dynamics are identical:

Coverage as a measure: an honest indicator of which lines were exercised by tests.
Coverage as a target: an incentive to maximize the number, regardless of whether the underlying tests are useful.

The failure unfolds in a predictable sequence:

Engineers find ways to lift the number (assertion-free tests, mocking everything, ignoring hard-to-cover code).
The number rises; the bugs do not fall.
Management sees the rising number and concludes the strategy is working.
The strategy is not working; the metric is now decoupled from the goal.

To resist Goodhart's law for coverage:

Use coverage as a delta metric, not an absolute one.
Couple with assertion-quality review.
Periodically audit fake coverage (mutation testing, manual review).
Make the metric a team concern, not an individual one.
Be willing to abandon the metric if it stops correlating with engineering quality.

This is the central professional challenge of coverage. Get this right and coverage is one of the most useful tools in your kit. Get it wrong and it becomes a distraction worse than no metric.

29. The professional engineer's coverage rhythm¶

A working professional engineer's typical interaction with coverage:

Daily: runs go test -cover ./mypackage after substantial changes. Glances at the HTML if the number surprised them.
Per-PR: reads the Codecov comment on their PR, fixes any per-file regressions before requesting review.
Per-sprint: reviews team coverage trends in the sprint retrospective. Discusses any persistent gaps.
Per-quarter: reviews per-directory floors with team leads. Adjusts based on what is critical now.
Per-incident: in post-mortems, asks "would more coverage have caught this?" — sometimes yes, often no.

This rhythm is sustainable, informative, and resistant to Goodhart's law. Aim for it.

30. Closing¶

Coverage at professional level is about wielding a flawed metric thoughtfully. The flaws are well-known: it measures statements not branches, executions not assertions, count not quality. The thoughtful wielding is also well-known: use it as a flashlight, not a scorecard; report deltas, not absolutes; pair with other signals; make it a team concern.

Done well, coverage is a quiet, useful background discipline. Done badly, it is a metric that consumes engineering time while masking quality issues. The difference is mostly about culture and a little about tooling.

Aim for the quiet, useful kind.

31. Quarterly coverage review¶

A useful artifact for senior teams: a quarterly coverage review document. Contents:

Module-wide coverage trend over the quarter.
Per-package trend for critical packages.
Notable PRs that lifted or dropped coverage significantly.
Open coverage gaps that the team has chosen to accept.
Adjustments to per-directory floors.
Tooling changes (e.g., updated Codecov config).

This 10-minute read sets the context for the next quarter's testing investment. It also creates a record: when a future engineer asks "why is legacy/foo at 23% coverage?", the quarterly review explains the team's reasoning.

32. Engineering manager perspective on coverage¶

An engineering manager evaluating coverage data should ask:

Is the trend healthy? Rising slowly, flat, or declining?
Are the critical packages above their floors?
Are PRs being rejected for coverage reasons? How often? Are the rejections useful?
Are engineers complaining about the metric? What is the substance of the complaints?
Is the metric correlated with bug-escape rates over the past quarter?

The manager's job is not to enforce a specific number. It is to ensure the metric is serving the team's goals and to intervene if the metric is being misused.

33. Communicating coverage to non-engineers¶

Stakeholders (product managers, executives, customers) sometimes ask about coverage. The right responses:

"We track coverage as one of several quality signals."
"Our critical packages are at 85%; the overall service is at 78%."
"Coverage alone is not the goal; we also track bug-escape rate, time-to-detect, and customer-visible incidents."
"A higher coverage number does not directly translate to fewer bugs; it depends on the quality of the underlying tests."

Resist the urge to commit to a specific coverage target in business terms. "We will reach 90% coverage by Q4" sounds good but reduces engineering flexibility. Better: "we will improve testing of these specific high-risk areas by Q4".

34. Coverage in vendor and contract relationships¶

If you ship a Go library to other teams or customers, coverage data is a credibility signal. Publishing coverage numbers (via Codecov badge in README, or in release notes) demonstrates engineering investment.

For contracts: if you provide Go services under SLA, coverage targets in the contract are usually a bad idea (Goodhart trap). Better contract language: "we use industry-standard testing practices including unit testing and code review". This conveys the discipline without exposing the metric to gaming.

35. Coverage and developer happiness¶

Surprisingly, coverage practice affects developer happiness:

Engineers like seeing their PRs receive useful feedback.
Engineers dislike being blocked by metrics they consider arbitrary.
Engineers like the satisfaction of "covering" a tricky code path with a clever test.
Engineers dislike chasing a number that does not reflect real quality.

A coverage culture that emphasizes feedback over gates, learning over compliance, and judgment over metrics will improve happiness. The opposite culture drives engineers away.

36. The "is coverage worth it" debate¶

Periodically, an engineer (often a senior one) argues that coverage should be abandoned entirely. The arguments:

"It encourages fake tests."
"It is gamed without addressing real quality."
"It costs CI time."
"We could spend the time on mutation testing instead."

These critiques have merit. The response should be substantive, not dismissive:

Show the value: per-PR coverage feedback, gap identification, trend visibility.
Acknowledge the limits: it is one signal, not the whole story.
Discuss alternatives: are we ready for mutation testing? What would that look like?
Reach consensus: the team chooses what to track.

A team that has this debate every year is healthy. A team that has never had it has probably not examined the practice in depth.

37. Documenting the coverage policy¶

The team's coverage policy deserves a written document. A short one:

# Coverage Policy

Coverage is a signal, not a target. We use it as a diagnostic to find untested code.

## Practices
- Unit tests run with `-race -coverprofile` on every PR.
- Codecov reports per-file deltas in PR comments.
- We do not enforce absolute coverage gates.
- We enforce per-directory floors for critical packages (`internal/auth`, `internal/billing`).
- We exclude generated code (`*_gen.go`, `*.pb.go`, `*_mock.go`) from the denominator.

## What we look for in code review
- Tests with assertions, not just coverage.
- Coverage of new code (delta-positive on touched files).
- Justification for accepting coverage gaps.

## What we do not do
- Set absolute coverage targets for the whole repo.
- Use coverage as an individual performance metric.
- Block PRs for legitimate coverage drops (e.g., dead-code removal).

This document removes ambiguity, supports onboarding, and constrains scope creep. Update it annually.

38. Coverage and the post-mortem culture¶

In a healthy post-mortem culture, coverage is one input among many. It is not blamed for incidents (causes are traced to systems and processes, not pinned on a metric). It is consulted to ask "would a test have caught this?".

In an unhealthy culture, coverage becomes a scapegoat: "we had 80% coverage and still had an outage, so let's mandate 95%". This is exactly the Goodhart trap. Mandating 95% will not reduce outages; it will produce more assertion-free tests.

The senior engineer's responsibility in post-mortems: redirect the conversation from "raise the number" to "improve the test design". Coverage is a tool to find gaps, not a target to satisfy.

39. Industry comparison¶

Across the industry, mature Go teams tend to settle around:

70-85% module-wide coverage.
90%+ on critical packages.
100% on small, library-style packages.
0-30% on cmd/* packages (until Go 1.20 integration coverage adoption).

Teams reporting > 95% module-wide coverage are either (a) heavy users of integration coverage, (b) library projects with naturally testable surface, or (c) gaming the metric. Inspect such claims.

Teams reporting < 30% are either (a) starting fresh, (b) in a deep testing-debt situation, or (c) testing primarily through manual QA. None is necessarily bad, but the number itself is a conversation starter.

40. The "should coverage be public" question¶

For open-source Go projects, publishing coverage data builds trust. Codecov, Coveralls, and SonarCloud all support badges and public dashboards. A 90% coverage badge on a README is a small but real signal to potential users.

For closed-source corporate projects, coverage is internal data. Sharing it externally (e.g., in marketing) is rarely meaningful and can be misleading. Customers care about service reliability, not test metrics.

41. Closing the professional page¶

You should now be able to:

Set coverage policy for a team.
Resist Goodhart's law and explain why it matters.
Detect and address fake coverage.
Integrate with Codecov/Coveralls thoughtfully.
Lead post-mortem and quarterly reviews involving coverage data.
Communicate coverage to non-engineers without overpromising.

Coverage at professional level is leadership, not just operations. You shape the team's relationship to the metric. Done well, that relationship is healthy and sustainable. Done badly, it drives engineering anti-patterns for years.

Choose deliberately.

42. A worked organizational scenario¶

Imagine you are tech lead of a 20-engineer team. The product is a Go-based SaaS. The current state:

Module-wide coverage: 64%.
Critical packages (billing, auth, data): 70-80%.
CI: coverage is reported but not gated.
Engineering culture: testing varies widely by team.

Your goal: improve testing discipline without triggering Goodhart's law.

A three-month plan:

Month 1 — Visibility. Set up Codecov integration. Configure PR comments showing per-file coverage deltas. No gates yet. Document the coverage policy as discussed in section 37. Run a 30-minute team meeting walking through the policy.

Month 2 — Soft gates. Introduce per-directory floors for billing, auth, and data (set just below current coverage so they are not blockers immediately). Configure Codecov status checks as "informational", not "required". Watch how engineers react.

Month 3 — Hard gates on the critical packages. Promote the soft gates to required status checks on critical packages. Pair with a code-review checklist that asks "did you add assertions for new code?". Begin a quarterly review process.

Outcomes to expect:

Coverage on critical packages rises to 85-90% over the quarter.
A few PRs are legitimately blocked; review whether each block was useful.
Some engineers push back; respond by explaining the rationale, not by relaxing the policy.
Bug-escape rate on critical packages should fall over six months (lagging indicator).

After three months, evaluate: are engineers happier? Are bugs fewer? Are PRs flowing well? Adjust based on data, not gut feeling.

43. The coverage discussion in performance reviews¶

When evaluating a senior engineer's testing contributions in a performance review, coverage data alone is misleading. Better signals:

Did they identify and close meaningful coverage gaps?
Did they advocate for assertion-quality tests in code review?
Did they push back against fake coverage?
Did they teach junior engineers good testing habits?
Did they contribute to the coverage tooling and pipeline?

These are leadership and engineering judgment indicators, not metric-chasing indicators. They build a healthier team.

44. Closing the professional page¶

You should leave this page able to:

Articulate why coverage is a signal, not a target.
Design coverage policy that resists Goodhart's law.
Communicate coverage to executives without overcommitting.
Lead post-mortems that use coverage as one signal among many.
Mentor engineers in healthy coverage practice.
Recognize and address fake coverage in your team.

The remaining pages of the subsection are specification (reference material), interview (review questions), tasks (exercises), find-bug (anti-patterns), and optimize (performance). Use them as needed.

Coverage practice at the professional level is mostly about discipline, judgment, and communication. The tools are simple. The leadership is the hard part.