Error Design — Best Practices — Professional Level¶
Table of Contents¶
- Introduction
- The Cost Model of Error Construction
- Allocation Patterns of Common Idioms
- Designing Error Types for Zero Allocation
- The
%wVerb: Whatfmt.ErrorfActually Does - Error Equality and Pointer Identity
- Design Evolution: From Sentinel to Family to Code
- Cross-Process Error Contracts
- Error Versioning Across Major Releases
- Errors and Generics
- Thread-Safety of Error Values
- Compiler and Inliner Interactions
- Summary
- Further Reading
Introduction¶
Focus: "What happens under the hood?" and "How do I design errors at the limit?"
Senior level was about systems. Professional level is about the limits: the cost of constructing an error, the ABI implications of typed errors, the ways errors.Is actually walks the chain, and the design constraints that show up only at very high request rates or in cross-process protocols.
This file is for engineers who have to defend error design against benchmarks and version-skew matrices.
The Cost Model of Error Construction¶
Reference: Go 1.21, amd64, modern hardware, no instrumentation.
| Operation | Time | Allocs |
|---|---|---|
errors.New("msg") at package level (init once) | 0/call | 0 |
errors.New("msg") per call | ~30 ns | 1 |
fmt.Errorf("ctx: %w", err) | ~150 ns | 2-3 |
fmt.Errorf("ctx: %v", err) | ~120 ns | 1-2 |
fmt.Errorf("a %d b %s: %w", id, name, err) | ~250 ns | 3-4 |
Comparing with == (sentinel) | < 1 ns | 0 |
errors.Is(err, target) chain depth N | ~5N ns | 0 |
errors.As(err, &t) chain depth N | ~10N ns | 0 |
Custom struct error (&MyErr{...}) | ~30 ns | 1 |
errors.Join(a, b, c) | ~80 ns | 1 (slice) |
The dominant cost in fmt.Errorf is the fmt machinery: it parses the format string, runs the reflect-based argument formatting, and allocates the result string. The %w adds a reference store but is otherwise the same as %v.
Implications:
- A wrap per layer at typical request rates (1k req/s, 5 layers): ~10 µs/req, ~5k allocs/s. Acceptable.
- A wrap inside a hot loop (parser, validator, 1M calls/s): ~150 ms of CPU, millions of allocations. Not acceptable.
- A sentinel comparison is free. There is no reason to avoid
errors.Is.
For high-throughput paths, see Designing Error Types for Zero Allocation below.
Allocation Patterns of Common Idioms¶
errors.New(s):
errorString struct ({ s string }, 16 bytes). - The string s is shared with the caller; no copy. - At package level, the allocation happens once, in init. fmt.Errorf("ctx: %w", err): 1. Format-string parse and arg processing. 2. A wrapError{msg, err} struct (%w path) — one allocation. 3. The composed message string — one allocation. 4. Possibly an []any for varargs — escape-analysis dependent, often elided.
So 2-3 allocations per fmt.Errorf call.
fmt.Errorf("ctx: %v", err): - Same as %w but produces a plain errorString, no wrapError. - 1-2 allocations.
errors.Is(err, target): - No allocation. - Walks Unwrap() and compares with == and Is method calls. - Each step is a virtual call (interface dispatch).
errors.As(err, &t): - No allocation (the target is a stack pointer the caller supplies). - Walks Unwrap() and tries reflect-based type matching at each step. - Slightly more expensive than Is due to reflection.
Designing Error Types for Zero Allocation¶
For high-throughput error paths, allocation matters. Two techniques:
Technique 1: Pre-allocated sentinel¶
var ErrInvalidUTF8 = errors.New("invalid UTF-8")
func decode(b []byte) (string, error) {
if !utf8.Valid(b) {
return "", ErrInvalidUTF8 // no allocation
}
// ...
}
The error value is allocated once at init; every return reuses the same pointer. Compare with errors.Is works perfectly.
The trade-off: the sentinel cannot carry per-call data. If callers need to know which byte was invalid, you need a typed error and an allocation per failure.
Technique 2: Stack-allocated typed errors via interface¶
A typed error is normally &MyErr{...} — a heap allocation because it escapes through the error interface. But careful design can keep it stack-bound when possible.
In practice, escape analysis usually causes errors returned across function boundaries to escape. This is one of those cases where the compiler is right and you should not fight it.
A more practical approach: pre-allocate a pool of errors:
var validationErrPool = sync.Pool{
New: func() any { return &ValidationError{} },
}
func newValidation(field, reason string) *ValidationError {
e := validationErrPool.Get().(*ValidationError)
e.Field = field
e.Reason = reason
return e
}
func releaseValidation(e *ValidationError) {
e.Field = ""
e.Reason = ""
validationErrPool.Put(e)
}
Caveat: pooling errors is only safe if the consumer reliably releases them. For most code paths, this is fragile and the allocation savings are small. Do not pool errors unless you have a measured hot path and a clear lifecycle.
Technique 3: Skip wrapping on cold paths¶
If you know the wrap context is rarely useful (e.g., a parser that returns error strings to a user who ignores them), do not wrap:
// hot path: cheap sentinel
if !ok { return ErrInvalid }
// cold path: rich wrap
if err := slowOp(); err != nil {
return fmt.Errorf("slow op on %s: %w", name, err)
}
The compiler cannot make this decision for you. Profile, then prune.
The %w Verb: What fmt.Errorf Actually Does¶
fmt.Errorf with %w produces a *fmt.wrapError:
type wrapError struct {
msg string
err error
}
func (e *wrapError) Error() string { return e.msg }
func (e *wrapError) Unwrap() error { return e.err }
Properties: - Single wrap (%w once): produces *wrapError whose Unwrap() returns one error. - Multiple wraps in Go 1.20+: produces *wrapErrors (note the s) whose Unwrap() returns []error. - The msg is the formatted text with the wrapped error's text already substituted. So e.Error() does not call e.err.Error() again.
Implication for performance: you pay for the format once at construction, not at every Error() call.
err := fmt.Errorf("foo: %w", innerErr)
fmt.Println(err.Error()) // does not call innerErr.Error() again
If you want lazy formatting (rare), implement your own type whose Error() composes on demand. The standard library does not.
Deep wraps and the chain¶
A 5-deep wrap chain produces 5 *wrapError values. Each is a tiny allocation (~32 bytes). For 1M requests with 5 wraps each, that is 160 MB of short-lived garbage — significant GC pressure.
For high-volume services, design wraps to be 1-2 layers deep, not 5. The structured error approach (Op/Kind/Path) replaces multiple wraps with one struct.
Error Equality and Pointer Identity¶
Two errors.New("x") calls produce two different errors:
This is intentional. Identity, not message text, defines a sentinel. If you want both calls to produce the "same" error, define a package-level sentinel and reuse it.
This also means typed errors with value receivers and identical fields are equal:
type myErr struct{ code int }
func (e myErr) Error() string { return "x" }
a := myErr{code: 1}
b := myErr{code: 1}
a == b // true (value comparison)
errors.Is(myErr{code: 1}, myErr{code: 1}) // true
For pointer-receiver types, the values are compared by pointer:
type myErr struct{ code int }
func (e *myErr) Error() string { return "x" }
a := &myErr{code: 1}
b := &myErr{code: 1}
a == b // false (different pointers)
errors.Is(a, b) // false unless one implements Is()
The implication for design: pointer-receiver typed errors require an Is method if you want them to be matchable across construction sites. Most "family" patterns implement Is for exactly this reason.
Design Evolution: From Sentinel to Family to Code¶
A useful evolution path as your service grows:
Stage 1: Single sentinel¶
Works for one-or-two call sites. Loses context — you cannot tell which thing was not found.
Stage 2: Wrapped sentinel¶
Adds context; preserves identity. Good for most cases.
Stage 3: Typed family¶
type NotFoundError struct{ Resource, Key string }
func (e *NotFoundError) Error() string { ... }
func (e *NotFoundError) Is(t error) bool { return t == ErrNotFound }
Adds structured context (callers can read Resource directly). Still matchable.
Stage 4: Structured Error with Kind enum¶
One type, all kinds. Heavier construction but every error is uniform.
Stage 5: Cross-process error code¶
The internal Kind enum maps to a stable string code at the API boundary. The code is the contract; the kind enum can evolve internally.
You do not have to climb this ladder. Most services live happily at stage 2 or 3. Stage 4 is appropriate when every layer needs structured access; stage 5 is required for stable public APIs.
Cross-Process Error Contracts¶
When errors cross process boundaries (RPC, message queue, HTTP), you cannot send a Go error chain. You send:
- A code (string or enum).
- A message (string).
- Optional structured details.
The receiver reconstructs an internal error. Three patterns:
Pattern: gRPC status.Status with details¶
st, _ := status.New(codes.NotFound, "user not found").WithDetails(
&errdetails.ResourceInfo{
ResourceType: "user",
ResourceName: fmt.Sprintf("%d", id),
},
)
return st.Err()
The receiver decodes:
s, _ := status.FromError(err)
for _, d := range s.Details() {
if r, ok := d.(*errdetails.ResourceInfo); ok {
// structured access
}
}
This is the modern recommended approach for gRPC. The proto definitions live in google.golang.org/genproto/googleapis/rpc/errdetails.
Pattern: HTTP JSON with stable codes¶
The code is the cross-process contract. Documented, versioned, never changed without a major release.
Pattern: Message queue with kind¶
A queue message that fails permanently includes the failure kind in its dead-letter envelope:
{
"original_message": { ... },
"failure": {
"kind": "invalid",
"message": "schema validation failed",
"ts": "..."
}
}
The dead-letter consumer can sort by kind and triage.
Common mistakes¶
- Sending the raw
err.Error()text as the cross-process contract — couples remote callers to your internal wording. - Not versioning error codes — adding
user.not_found_v2because someone changed the meaning ofuser.not_found. - Forgetting to translate at the receiving end — leaking foreign error sentinels into your domain.
Error Versioning Across Major Releases¶
Errors are part of your public API. They evolve with semver:
| Change | Semver impact |
|---|---|
| Add a new sentinel | minor (additive) |
| Add a field to typed error | minor (additive) |
| Remove or rename a sentinel | major (breaking) |
| Change semantics of a kind | major (breaking, even if name unchanged) |
| Add a new kind to an enum | minor (callers should have a default branch) |
| Stop returning a documented error in favor of another | major |
| Improve error messages | safe (no semver impact, but mind tests) |
For libraries with strict compatibility guarantees (golang.org/x/exp graduating to stdlib, for example), every error addition is reviewed. For application code, a CHANGELOG entry is usually enough.
Deprecation pattern¶
Goimports aliasing keeps both working. After two release cycles, remove ErrMissing. Tooling (staticcheck's SA1019) warns callers.
Errors and Generics¶
Generic functions over types that can fail:
type Result[T any] struct {
Val T
Err error
}
func Map[T, U any](s []T, f func(T) (U, error)) ([]U, error) {
out := make([]U, 0, len(s))
for i, t := range s {
u, err := f(t)
if err != nil {
return nil, fmt.Errorf("at index %d: %w", i, err)
}
out = append(out, u)
}
return out, nil
}
Two design notes:
erroris not a type parameter. It is the sameerrorinterface everywhere. Generics do not change error design.- Resist
Result[T]-style monadic types. Idiomatic Go uses(T, error); introducingResultadds friction without adding value. Languages whereResultshines (Rust, Haskell) have language-level support; Go does not.
Use generics in error design when: - You write a wrapper that propagates errors (retry helpers, transformation pipelines). - You build typed error containers (Map[K]error, Set[error]). - You build telemetry helpers that operate on any error.
Do not use generics to create an error monad. It is not idiomatic and other Go developers will refactor it out.
Thread-Safety of Error Values¶
By convention, an error value should be safe to read from any goroutine. Implementations should not have state-changing Error() methods.
Sentinels are trivially safe (immutable). Typed errors are safe if their fields are immutable after construction. The standard library treats this as a contract: errors flow across goroutines all the time.
If you implement an error type with mutable state (a buffer, a counter), you have a thread-safety bug waiting to happen. Always treat constructed errors as immutable values.
// Bad
type MyErr struct{ Visited int }
func (e *MyErr) Error() string {
e.Visited++ // mutation in Error()!
return "..."
}
// Good
type MyErr struct{ Field string } // immutable post-construction
func (e *MyErr) Error() string { return "..." }
Compiler and Inliner Interactions¶
Three points where the compiler influences error design:
1. Escape analysis on returns¶
When you return &MyErr{...}, escape analysis says "this struct escapes via the error interface" and heap-allocates it. There is no way around this without changing the interface (Result[T] patterns, but see above).
2. Inlining of if err != nil { return err }¶
A simple error check inlines without overhead. A wrap (fmt.Errorf) is too large to inline; the compiler emits a function call.
errors.Is is small enough to be inlined for the common single-step case (err == target); deeper walks require the actual function. errors.As does not inline (uses reflect).
3. Interface devirtualization¶
When the compiler can prove that an error is concretely *MyErr, it can devirtualize the call to Error(). In practice this is rare because errors usually pass through interface boundaries. Profile-guided optimization (Go 1.20+) can speculatively devirtualize, but the gain on error paths is small.
For maximum performance, avoid the error interface in hot paths. A (T, ok bool) return is faster when "failure" is binary and the caller does not need a message:
func tryFastPath(b []byte) (Token, bool) // never allocates
func slowParse(b []byte) (Token, error) // descriptive error
The fast path returns bool; only the slow path constructs an error. This pattern is in the standard library — strconv.ParseInt returns an error, but strconv.Atoi is special-cased and bytes.Cut returns a bool.
Summary¶
At professional level, error design is governed by cost models, ABI commitments, and cross-process contracts. The cheap path is a sentinel comparison or a simple wrap; the expensive path is fmt.Errorf in a hot loop. Sentinels are pointer-equal across calls; typed errors with pointer receivers are not — design Is methods accordingly. Cross-process contracts use stable codes, not Go pointer chains. Error versioning is part of semver. Generics do not change error design; do not introduce Result[T]. The compiler sees errors as just another interface — escape analysis usually heap-allocates them, and that is acceptable except in the hottest loops where you should pre-allocate sentinels and skip wrapping. Most of the discipline at this level is knowing when not to optimize — the wrap is fine, the allocation is fine, until a benchmark proves otherwise.
Further Reading¶
$GOROOT/src/errors/errors.goand$GOROOT/src/errors/wrap.go— the entire stdlib error machinery.$GOROOT/src/fmt/errors.go—fmt.Errorfand thewrapErrortype.- Go Blog — Error Values, Improved
- Russ Cox — Error syntax for Go
- google.golang.org/grpc/status — gRPC status with details
- google.aip.dev/193 — Google AIP for errors
go test -bench=. -benchmem— measure your own paths- github.com/cockroachdb/errors — production error library