go test: Professional
1. Standardize the CI test command
Define one canonical CI invocation so every repo tests the same way:
go test -race -covermode=atomic -shuffle=on -timeout=... ./...
Encode it in a Makefile/CI step and document the rationale:
- `-race`: catch concurrency bugs (a DATA RACE blocks merge).
- `-covermode=atomic`: race-safe coverage counters.
- `-shuffle=on`: surface hidden test-ordering dependencies (log the seed).
- `-timeout`: fail fast on hangs with a goroutine dump rather than a stuck job.
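To make the `-race` rationale concrete, here is a minimal sketch of the kind of code that gate protects. The `Counter` type, the `incConcurrently` helper, and the test name are hypothetical and exist only for illustration. The point is that a test has to exercise concurrent access for the race detector to have anything to observe.

```go
package main

import (
	"sync"
	"testing"
)

// Counter is a hypothetical type used only for illustration.
// Without the mutex the code would still compile, but running the
// test under `go test -race` would report a DATA RACE on n.
type Counter struct {
	mu sync.Mutex
	n  int
}

// Inc increments the counter while holding the lock.
func (c *Counter) Inc() {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.n++
}

// Value returns the current count while holding the lock.
func (c *Counter) Value() int {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.n
}

// incConcurrently calls Inc once from each of `workers` goroutines
// and waits for all of them to finish.
func incConcurrently(c *Counter, workers int) {
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			c.Inc()
		}()
	}
	wg.Wait()
}

// TestCounterConcurrent would normally live in a _test.go file.
func TestCounterConcurrent(t *testing.T) {
	c := &Counter{}
	incConcurrently(c, 100)
	if got := c.Value(); got != 100 {
		t.Fatalf("Value() = %d, want 100", got)
	}
}
```

A single-goroutine test of `Inc` would pass with or without the mutex, which is why concurrency-sensitive code needs a test shaped like this one for `-race` to be useful.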
2. Coverage as a signal, not a gate-by-number
Reporting coverage is useful; hard-failing a PR on a coverage percentage often backfires, because tests that assert nothing can game the metric. Professional policy:
- Track coverage trends and surface deltas in PRs (e.g., via a coverage bot).
- Require coverage for new, risky code through review judgment, not a blanket repo threshold.
- Use `go tool cover -html` in review when a change drops coverage to see what is untested.
go tool cover -func=cover.out | tail -1 # repo total
go tool cover -html=cover.out -o cover.html # artifact for the PR
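The following sketch illustrates why a percentage alone is a weak gate. `Clamp` and both test names are hypothetical. The two tests execute the same statements and therefore report the same coverage, but only the second one can catch a wrong result.

```go
package main

import "testing"

// Clamp is a hypothetical function under test: it limits v to [lo, hi].
func Clamp(v, lo, hi int) int {
	if v < lo {
		return lo
	}
	if v > hi {
		return hi
	}
	return v
}

// TestClampGamed executes every branch of Clamp, so each statement
// counts as covered, but it checks nothing about the results.
func TestClampGamed(t *testing.T) {
	Clamp(-1, 0, 10)
	Clamp(11, 0, 10)
	Clamp(5, 0, 10)
}

// TestClamp covers the same branches and asserts on each result.
func TestClamp(t *testing.T) {
	cases := []struct{ v, lo, hi, want int }{
		{-1, 0, 10, 0},
		{11, 0, 10, 10},
		{5, 0, 10, 5},
	}
	for _, c := range cases {
		if got := Clamp(c.v, c.lo, c.hi); got != c.want {
			t.Errorf("Clamp(%d, %d, %d) = %d, want %d", c.v, c.lo, c.hi, got, c.want)
		}
	}
}
```

A coverage tool cannot tell these two tests apart; a reviewer reading the assertions can.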
3. Separate fast and slow tests
Keep the inner loop and PR gate fast; run heavy suites separately.
go test ./... # unit tests, fast (PR gate)
go test -tags=integration ./... # integration, on a schedule or merge
Policy: unit tests must be hermetic (no network/DB), run in seconds, and never t.Skip silently in CI. Integration/e2e tests live behind tags and run on a slower cadence.
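One common way to keep a unit test hermetic is to put the external dependency behind an interface and substitute an in-memory fake. The sketch below assumes a hypothetical `UserStore` dependency and `Greeting` function; none of these names come from a real codebase.

```go
package main

import (
	"errors"
	"testing"
)

// UserStore is a hypothetical dependency. In production it might be
// backed by a database; the unit test never needs to know.
type UserStore interface {
	Name(id int) (string, error)
}

// Greeting builds a message using whatever store it is given.
func Greeting(s UserStore, id int) (string, error) {
	name, err := s.Name(id)
	if err != nil {
		return "", err
	}
	return "hello, " + name, nil
}

var errNotFound = errors.New("user not found")

// fakeStore is an in-memory stand-in, so the test touches no network or DB.
type fakeStore map[int]string

// Name looks the user up in the map and reports errNotFound when absent.
func (f fakeStore) Name(id int) (string, error) {
	name, ok := f[id]
	if !ok {
		return "", errNotFound
	}
	return name, nil
}

// TestGreeting would normally live in a _test.go file.
func TestGreeting(t *testing.T) {
	store := fakeStore{1: "ada"}
	got, err := Greeting(store, 1)
	if err != nil || got != "hello, ada" {
		t.Fatalf("Greeting(1) = %q, %v; want %q, nil", got, err, "hello, ada")
	}
	if _, err := Greeting(store, 2); !errors.Is(err, errNotFound) {
		t.Fatalf("Greeting(2) error = %v, want errNotFound", err)
	}
}
```

A test that exercises the real store would go in a separate file whose first line is `//go:build integration`, so that it compiles only under `go test -tags=integration ./...`.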
4. Flaky test policy
Flakes erode trust in CI. Establish a clear protocol:
- A `DATA RACE` is never a flake: fix it, never retry.
- Quarantine genuinely flaky tests (tag/skip with a tracking issue) rather than auto-retrying everything, which hides real intermittent bugs.
- Forbid a blanket `--retry` on the whole suite (retries come from a CI runner or wrapper; `go test` itself has no retry flag). If a test needs retries, it is testing something nondeterministic that should be made deterministic (fake clock, seeded RNG, controlled concurrency).
- Use `-shuffle=on` and `-race` in CI to find nondeterminism early.
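As an illustration of "make it deterministic instead of retrying", the sketch below injects a clock and a seeded random source. The `Clock` interface, `fixedClock`, `Expired`, `Pick`, `seeded`, and `epoch` are all hypothetical names chosen for this example.

```go
package main

import (
	"math/rand"
	"testing"
	"time"
)

// Clock is a hypothetical seam that lets tests control the current time.
type Clock interface{ Now() time.Time }

// realClock is what production code would pass in.
type realClock struct{}

func (realClock) Now() time.Time { return time.Now() }

// fixedClock always reports the same instant, which makes tests repeatable.
type fixedClock struct{ t time.Time }

func (c fixedClock) Now() time.Time { return c.t }

// Expired reports whether deadline is strictly in the past according to clk.
func Expired(clk Clock, deadline time.Time) bool {
	return clk.Now().After(deadline)
}

// seeded returns a random source that yields the same sequence for the same seed.
func seeded(seed int64) *rand.Rand { return rand.New(rand.NewSource(seed)) }

// Pick returns one element chosen by r. items must be non-empty.
func Pick(r *rand.Rand, items []string) string {
	return items[r.Intn(len(items))]
}

// epoch is an arbitrary fixed instant used by the tests.
var epoch = time.Date(2024, 1, 1, 0, 0, 0, 0, time.UTC)

func TestExpired(t *testing.T) {
	if Expired(fixedClock{t: epoch}, epoch) {
		t.Error("deadline equal to now should not count as expired")
	}
	if !Expired(fixedClock{t: epoch.Add(time.Second)}, epoch) {
		t.Error("deadline one second ago should count as expired")
	}
}

func TestPickIsReproducible(t *testing.T) {
	items := []string{"a", "b", "c"}
	if a, b := Pick(seeded(42), items), Pick(seeded(42), items); a != b {
		t.Errorf("same seed gave %q and %q", a, b)
	}
}
```

Neither test sleeps or depends on wall-clock time, so there is nothing for a retry to paper over.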
5. Determinism and test data
- Inject clocks, RNG seeds, and IDs so tests are reproducible (`-shuffle` seed, fixed `rand` seed).
- Use `t.TempDir()` and `t.Cleanup()` for filesystem isolation that auto-cleans.
- Use `testing.Short()` (`-short`) to let `go test -short` skip long tests in quick loops.
- Prefer golden files with an explicit `-update` flag for large expected outputs:
var update = flag.Bool("update", false, "update golden files")
// compare against testdata/x.golden; rewrite when -update is set
6. Performance regression tracking
For performance-sensitive code, make benchmarks part of the process.
Policy: store a baseline, run benchmarks with -count ≥ 10, and review benchstat deltas for significant regressions. Treat CI benchmark absolutes as noisy; compare relative deltas on the same runner.
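A minimal sketch of a benchmark that fits this policy follows. `joinWords` is a hypothetical hot path, and `old.txt` and `new.txt` in the comments are placeholder file names for the baseline and the candidate run.

```go
package main

import (
	"strings"
	"testing"
)

// joinWords is a hypothetical performance-sensitive function.
func joinWords(words []string) string {
	var sb strings.Builder
	for i, w := range words {
		if i > 0 {
			sb.WriteByte(' ')
		}
		sb.WriteString(w)
	}
	return sb.String()
}

// sink keeps the result alive so the call is not optimized away.
var sink string

// BenchmarkJoinWords would normally live in a _test.go file.
//
// Collect samples and compare them against a stored baseline, e.g.:
//
//	go test -run='^$' -bench=JoinWords -count=10 . > new.txt
//	benchstat old.txt new.txt
func BenchmarkJoinWords(b *testing.B) {
	words := strings.Fields("the quick brown fox jumps over the lazy dog")
	b.ReportAllocs()
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		sink = joinWords(words)
	}
}
```

`benchstat` is distributed separately from the Go toolchain (`golang.org/x/perf/cmd/benchstat`). Running both the baseline and the candidate on the same runner keeps the comparison relative, as the policy above recommends.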
7. Caching strategy in CI
- The test cache is usually cold per CI job; cache `GOCACHE` and `GOMODCACHE` to speed up compilation, not to reuse cached test results.
- For integration suites that touch untracked external state, run with `-count=1` so a stale `(cached)` pass cannot mask a regression.
- Do not rely on the test cache for correctness gates; rely on it for local speed.
8. Reviewing for misuse
| Smell | Why it's wrong | Fix |
|---|---|---|
| `t.Skip` with no reason in CI | silently disables coverage | require a reason + tracking issue |
| Retrying the whole suite on failure | hides real flakes/races | quarantine specific tests; fix nondeterminism |
| Unit tests hitting a real DB/network | slow, flaky, non-hermetic | tag as integration; fake the dependency |
| Coverage-gaming tests (assert nothing) | false confidence | review asserts, not just the % |
| No `-race` in CI | concurrency bugs ship | add `-race` to the gate |
| Bumping `-timeout` to "fix" a hang | masks a deadlock | read the goroutine dump; fix the hang |
9. Summary
Standardize one CI test command (-race -covermode=atomic -shuffle=on -timeout=...), treat coverage as a reviewed signal rather than a gamed number, and split fast hermetic unit tests from tag-gated integration tests. Enforce a flaky-test policy that fixes nondeterminism instead of retrying, make tests deterministic (clocks, seeds, t.TempDir, golden files with -update), track benchmark deltas with benchstat, and use -count=1 for integration suites that touch untracked state. A DATA RACE always blocks merge.
Further reading
- `go help testflag`
- Coverage for integration tests: https://go.dev/doc/build-cover
- `testing` (TempDir, Cleanup, Short): https://pkg.go.dev/testing