Futures & Promises — Professional¶
Focus: staff/principal-level decisions. Futures are not a library feature in Go; they are an emergent property of channels, goroutines, and contexts. The hard parts are not writing `Resolve` and `Await`. They are: who owns cancellation, how the deferred value behaves under partial failure, what happens when a Future graph crosses a process boundary, and what an unbounded fan-out does to a server at 03:00. Opinionated where the field agrees, explicit about trade-offs where it does not.
1. Futures as a system primitive¶
A Future is a deferred value with a cancellation contract. Conflating it with neighbouring primitives costs production. The taxonomy:
| Primitive | Identity | Coupling | Failure model | Typical use |
|---|---|---|---|---|
| Future / Promise | Anonymous deferred value | Producer/consumer share a handle | One value or one error | "Fetch this user; I'll need it in ~10 ms" |
| Goroutine-per-task | None; fire-and-forget | Side effects only | Panic crashes process | "Log this; I don't care when" |
| Actor (Erlang/Akka) | Long-lived addressable mailbox | Caller knows actor ref | Supervisor tree; restart | "Stateful entity processes a stream" |
| RPC (gRPC/HTTP) | Endpoint with schema | Tight: caller knows callee | Sync deadline; explicit error | "Charge this card and tell me now" |
| Channel-of-results | Stream of values | Anonymous producer, many values | Stream closes on done or error | "Yield results as they arrive" |
| Pub/Sub topic | Event broadcast | Loose; producer ignorant of subs | At-least-once; no return | "An order was placed" |
Four distinctions matter:
- Cardinality. A Future is exactly one outcome. A channel-of-results is many. Confusing these turns a Future into a leaked goroutine blocking on the second send.
- Cancellation direction. RPC: caller to callee. Actor: supervisor to actor. Future: cooperative; the consumer signals via `context.Context`, the producer must observe.
- Identity. Actors have stable mailbox addresses. Futures are anonymous handles: once awaited and discarded, unreachable. Futures are unfit for long-lived stateful work.
- Composition. Futures compose with `All`/`First`/`Map`; actors with message passing; RPC with sequential calls; Pub/Sub with topic chains. Each has different latency and failure semantics.
The rule: Future for one deferred value with cancellation; actor for stateful identity; RPC for inline result; Pub/Sub for fan-out notification. The Future is the smallest; pick it when the work is bounded, the result is one value, and overlap with other work matters.
2. Quantitative cost analysis¶
Numbers below are Go 1.22, amd64, Linux 6.6 on a tuned 16-core box.
2.1 The four costs of a Future¶
| Operation | What it measures | Approx. cost |
|---|---|---|
| `go func() {}` | goroutine spawn | ~1.0 µs (~2.5 KB stack) |
| `ch <- v; <-ch` | unbuffered chan send+recv | ~50 ns |
| `ctx.Done()` select branch | context cancellation check | ~3 ns (one chan recv) |
| `context.WithCancel` | new cancellable ctx | ~100 ns (no goroutine; cheap) |
| `context.WithTimeout` | same + timer | ~600 ns (timer allocation) |
| `sync.Once.Do` (fast path) | once-per-Future fulfilment | ~5 ns |
A bare Future built on a goroutine and a one-shot channel costs about 1 µs to spawn, 50 ns to deliver, 3 ns to poll cancellation. Against an I/O call (network: 100 µs–10 ms; disk: 50 µs–10 ms), the overhead is in the noise. Against in-process CPU work (1–100 ns), the Future is a 10x to 1000x tax. Futures pay for I/O, not for compute.
2.2 Combinator costs¶
| Operation | Approx. cost |
|---|---|
| `errgroup.WithContext` + 8 `g.Go` + `Wait` | ~10 µs total fixed (8 goroutines) |
| `errgroup.SetLimit(64)` + 1000 tasks | ~1 ms throughput floor (semaphore amortized) |
| `singleflight.Do` (no contention) | ~200 ns |
| `singleflight.Do` (1000 callers, 1 winner) | ~500 ns/caller (waiter wakeup amortized) |
| `sync.WaitGroup` Add/Done/Wait (4 g) | ~300 ns |
By the numbers above, errgroup costs ~1.25 µs per goroutine all-in (10 µs across 8 goroutines), roughly 0.25 µs more than a raw go statement. singleflight is essentially free relative to the work it dedups.
2.3 The hidden cost — goroutine stack¶
| Goroutines | Resident memory (min) | Scheduler overhead |
|---|---|---|
| 1 000 | ~2.5 MB | imperceptible |
| 100 000 | ~250 MB | GC and scheduler still smooth |
| 1 000 000 | ~3 GB | GC pauses grow; scheduler latency visible |
| 10 000 000 | OOM territory | runtime degrades |
A Future-per-request pattern at 50 k req/s with 20 ms of work per request keeps ~1 000 Futures in flight (50 000 × 0.020 s), which is fine. Fan the same 20 ms work out to 1 M concurrent Futures and the process needs ~3 GB for stacks alone, which is OOM on most containers. Concurrency is cheap, not free. Section 9 makes this quantitative with Little's Law.
3. Structured concurrency — borrowing from Trio/Kotlin; errgroup as Go's answer¶
The textbook critique of futures: unstructured concurrency leaks. A goroutine spawned inside a function can outlive it. Trio (Python) and Kotlin coroutines formalized the answer — structured concurrency. Go's errgroup is the closest idiomatic match.
The principle: every goroutine has a parent scope; the scope cannot return until every child has completed or been cancelled. No goroutine outlives its lexical parent.
```go
func fanOutStructured(ctx context.Context, ids []string) ([]User, error) {
	g, gctx := errgroup.WithContext(ctx)
	g.SetLimit(32)
	out := make([]User, len(ids))
	for i, id := range ids {
		i, id := i, id // per-iteration copies; redundant from Go 1.22 on
		g.Go(func() error {
			u, err := fetchUser(gctx, id)
			if err != nil {
				return err
			}
			out[i] = u // each goroutine writes only its own index
			return nil
		})
	}
	if err := g.Wait(); err != nil {
		return nil, err
	}
	return out, nil
}
```
When fanOutStructured returns, every goroutine has finished or seen gctx.Done(). No leak by construction. The unstructured version with bare go func() writing to an unbuffered channel is correct only if the caller drains forever. Structured concurrency is the default; unstructured is the optimization with a proof obligation.
| Property | errgroup | Trio nursery | Kotlin coroutineScope |
|---|---|---|---|
| Parent waits for children | Yes (Wait) | Yes | Yes |
| First child error cancels siblings | Yes | Yes | Yes |
| Concurrency cap | SetLimit | CapacityLimiter | Semaphore |
| Panic propagation | No (recover yourself) | Yes | Yes |
| Nested scopes | Yes (nest errgroup) | Yes | Yes |
The one gap: Go does not propagate panics across goroutines. An unrecovered panic in a function passed to g.Go terminates the whole process, exactly as it would in any other goroutine, unless the function itself recovers. This is intentional but surprising; §4 covers the recovery wrapper every production codebase needs.
4. Cancellation models — explicit ctx, deadline propagation, panic propagation¶
There are three sources of cancellation in a Future graph: the consumer gave up, a deadline elapsed, a sibling failed. All three must flow to the producer without coupling.
4.1 Explicit context¶
Every Future must accept context.Context and observe it on every blocking point. The contract:
| Producer responsibility | Consumer responsibility |
|---|---|
| Pass `ctx` to every downstream call | Provide `ctx` with appropriate deadline |
| Select on `ctx.Done()` in every wait | Cancel `ctx` when no longer needed |
| Return `ctx.Err()` on cancellation | Treat `ctx.Err()` as a non-retriable error |
A Future that ignores ctx is a goroutine leak waiting for a slow downstream. The canonical wrapper:
```go
func Go[T any](ctx context.Context, fn func(context.Context) (T, error)) *Future[T] {
	f := NewFuture[T]()
	go func() {
		// Convert a panic in fn into a rejection so the awaiter is never stranded.
		defer func() {
			if r := recover(); r != nil {
				f.Reject(fmt.Errorf("panic: %v\n%s", r, debug.Stack()))
			}
		}()
		v, err := fn(ctx)
		if err != nil {
			f.Reject(err)
			return
		}
		f.Resolve(v)
	}()
	return f
}
```
Three load-bearing details: ctx is passed explicitly (no globals); panics are recovered and surfaced through Reject (so a panic neither crashes the process nor strands the awaiter); and exactly one of Resolve or Reject runs on every path, because the deferred recover fires only when fn panicked before either was reached.
4.2 Deadline propagation¶
A deadline must flow through every layer:
```go
func ServeOrder(ctx context.Context, id string) (*Order, error) {
	ctx, cancel := context.WithTimeout(ctx, 250*time.Millisecond)
	defer cancel()
	g, gctx := errgroup.WithContext(ctx)
	var (
		user  User
		items []Item
		risk  RiskScore
	)
	// All three calls share gctx, and therefore one absolute deadline.
	g.Go(func() error { var e error; user, e = userSvc.Get(gctx, id); return e })
	g.Go(func() error { var e error; items, e = inventorySvc.Items(gctx, id); return e })
	g.Go(func() error { var e error; risk, e = riskSvc.Score(gctx, id); return e })
	if err := g.Wait(); err != nil {
		return nil, err
	}
	return assemble(user, items, risk), nil
}
```
If userSvc.Get takes 200 ms, the other two calls still hit the gctx deadline at the same wall-clock instant, 250 ms after ServeOrder started, not 250 ms after each of them started. Deadlines are absolute (time.Time), not durations. context.WithTimeout(parent, d) uses the earlier of parent.Deadline() and now+d, never a later one. A child cannot outlive its parent.
The opposite mistake: each layer resets its own timeout. A request with a 300 ms client budget that passes through three services, each granting itself a fresh 250 ms, can keep working for up to 750 ms, long after the client gave up at 300 ms. Compute the remaining budget; never reset it: remaining := time.Until(parentDeadline); ctx, cancel := context.WithTimeout(ctx, remaining - margin).
4.3 Panic propagation¶
Go does not propagate panics across goroutines. An unrecovered panic in a g.Go function terminates the process; it is never silently converted to an error. Every production Future spawner must wrap its body in recover and convert the panic to an error. The Go[T] helper above does this; errgroup does not. Either wrap every g.Go or use a wrapper:
```go
func safeGo(g *errgroup.Group, fn func() error) {
	g.Go(func() (err error) {
		// The named result lets the deferred recover overwrite the return value.
		defer func() {
			if r := recover(); r != nil {
				err = fmt.Errorf("panic: %v\n%s", r, debug.Stack())
			}
		}()
		return fn()
	})
}
```
A team that does not wrap panics in goroutines has a process-crash bug per major version of their dependencies. Wrap once, in the project's future package.
5. Distributed Futures — pending requests across services, correlation IDs, durable promises¶
A Future inside one process is a channel. A Future across processes is a correlation problem. The pattern: client sends a request with correlation_id, server replies asynchronously, client matches reply to the pending Future.
5.1 Correlation IDs over async transports¶
```go
type Pending struct {
	mu        sync.Mutex
	futures   map[string]*Future[Response]
	transport Transport // assumed: any type with Publish(ctx, Request) error
}

func (p *Pending) Send(ctx context.Context, req Request) (*Future[Response], error) {
	cid := uuid.NewString()
	req.CorrelationID = cid
	f := NewFuture[Response]()
	// Register before publishing so a fast reply always finds its entry.
	p.mu.Lock()
	p.futures[cid] = f
	p.mu.Unlock()
	if err := p.transport.Publish(ctx, req); err != nil {
		p.mu.Lock()
		delete(p.futures, cid)
		p.mu.Unlock()
		return nil, err
	}
	return f, nil
}

func (p *Pending) onReply(resp Response) {
	p.mu.Lock()
	f, ok := p.futures[resp.CorrelationID]
	delete(p.futures, resp.CorrelationID)
	p.mu.Unlock()
	if ok {
		f.Resolve(resp) // resolve outside the lock
	}
}
```
Three pitfalls. Memory leak under packet loss — if the reply never arrives, the entry sits forever; attach a per-Future expiry that calls Reject(ErrTimeout) and removes the entry. Replay attacks — sign the correlation ID or include it in an authenticated envelope. Process restart — pending futures vanish; the caller sees an error; idempotent retry recovers, which means the protocol must be idempotent.
5.2 Durable promises — Temporal, durable execution¶
A pending Future across services is not durable. If the client dies, the Future is gone — even when the server completes. For multi-minute or multi-hour Futures, the answer is durable execution: Temporal, Cadence, AWS Step Functions. The handle is persisted; the awaiter can crash and resume.
```go
func OnboardCustomer(ctx workflow.Context, customerID string) error {
	var profile Profile
	err := workflow.ExecuteActivity(ctx, ProvisionAccount, customerID).Get(ctx, &profile)
	if err != nil {
		return err
	}
	// This Future survives worker crashes and host failures.
	return workflow.ExecuteActivity(ctx, SendWelcomeEmail, profile).Get(ctx, nil)
}
```
f.Get(ctx, ...) looks like Await, but workflow state is persisted to a datastore. If the worker dies between ExecuteActivity and Get, a new worker resumes from the last persisted step.
| Property | In-process Future | RPC | Temporal activity |
|---|---|---|---|
| Lifetime | Goroutine | TCP connection | Hours to days |
| Failure mode | Process restart loses state | Connection drop loses request | Survives worker/host/broker restarts |
| Latency floor | 50 ns | 100 µs | 1 ms (persistence) |
| Cancellation | ctx.Done() | Stream close | Workflow signal |
Decision: in-process Futures for sub-second work, RPC for inline request/response, Temporal for "might take an hour and must not be lost." Mixing the third with the first ("emit a job, poll on a goroutine") reinvents Temporal poorly.
6. Backpressure & flow control — semaphores, weighted, token bucket¶
A Future is a permission slip to spawn a goroutine. Unbounded permission slips destroy servers. Three primitives dominate.
6.1 Counting semaphore — golang.org/x/sync/semaphore¶
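```go
sem := semaphore.NewWeighted(64)
g, gctx := errgroup.WithContext(ctx)
var acquireErr error
for _, item := range items {
	item := item
	// Acquire fails only when gctx is done: a sibling failed or the caller gave up.
	if acquireErr = sem.Acquire(gctx, 1); acquireErr != nil {
		break // stop spawning, but still wait for running tasks below
	}
	g.Go(func() error {
		defer sem.Release(1)
		return process(gctx, item)
	})
}
if err := g.Wait(); err != nil {
	return err // the first task error, which is what cancelled gctx
}
return acquireErr
```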
errgroup.SetLimit(64) is the simpler form when every task has equal weight. The semaphore is for weighted work — a heavy task takes 4 slots, a light one takes 1. This is correct backpressure when downstream resource cost varies per request.
6.2 Token bucket — golang.org/x/time/rate¶
A semaphore caps concurrency. A token bucket caps rate.
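```go
lim := rate.NewLimiter(rate.Limit(500), 100) // 500 ops/s, burst 100
for _, item := range items {
	if err := lim.Wait(ctx); err != nil {
		return err
	}
	// The limiter bounds the start rate only; add a concurrency cap
	// if process can outlast the refill interval.
	go process(ctx, item)
}
```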
Wait blocks until a token is available or ctx cancels. Use this for outbound API quotas (third-party limits), database connection conservation, and downstream protection. Pair with circuit breakers: when downstream returns 429, halve the limiter rate.
6.3 Bounded channel as a queue¶
jobs := make(chan Job, 256) plus select { case jobs <- j: case <-ctx.Done(): } is the smallest backpressure mechanism in Go: producers slow to consumer speed when the buffer fills. Pitfall: never close the channel from the producer side with multiple producers; close from a coordinator after sync.WaitGroup.Wait.
| Mechanism | Caps | Right for |
|---|---|---|
| `errgroup.SetLimit` | Concurrency | Equal-weight fan-out |
| `semaphore.Weighted` | Concurrent weight | Heterogeneous tasks |
| `rate.Limiter` | Rate (ops/s) | External quota, DB load |
| Bounded channel | Queue depth | Producer-consumer with multiple workers |
The pathology these all prevent: an HTTP handler that does go expensiveBackgroundWork(req) with no bound and no observability. A traffic spike spawns 100 k goroutines, each holding a 10 KB working set; the process OOMs in 30 s. Every fan-out point needs a documented concurrency cap.
7. Observability — OpenTelemetry tracing through Future graphs¶
A Future graph is a tree (the caller spawns N futures; each may spawn more). The tree is invisible at runtime unless you instrument it. Three observability layers cover the failure modes.
7.1 Traces¶
Every Go starts a child span, and the span ends when the producer goroutine returns. OpenTelemetry's trace.Span travels in context.Context, so the child sees the parent automatically:
```go
func tracedGo[T any](ctx context.Context, name string, fn func(context.Context) (T, error)) *Future[T] {
	ctx, span := tracer.Start(ctx, name)
	f := NewFuture[T]()
	go func() {
		defer span.End() // registered first, so it runs last
		defer func() {
			if r := recover(); r != nil {
				span.RecordError(fmt.Errorf("panic: %v", r))
				span.SetStatus(codes.Error, "panic")
				f.Reject(fmt.Errorf("panic: %v", r))
			}
		}()
		v, err := fn(ctx)
		if err != nil {
			span.RecordError(err)
			span.SetStatus(codes.Error, err.Error())
			f.Reject(err)
			return
		}
		f.Resolve(v)
	}()
	return f
}
```
A request that fans out to 8 backends produces a trace with 8 child spans; the slowest one is the request's critical path. Jaeger or Tempo show this as a gantt; the longest bar is the optimization target.
7.2 Metrics¶
| Metric | Type | Why |
|---|---|---|
| `future_inflight` | Gauge | Concurrent unresolved Futures by class |
| `future_duration_seconds` | Histogram | Latency by operation |
| `future_resolved_total{outcome}` | Counter | Success/error/cancelled counts |
| `future_queue_wait_seconds` | Histogram | Time blocked on semaphore before spawn |
| `future_panic_total` | Counter | Panics caught by the recover wrapper |
Queue wait is the signal that backpressure is biting: when future_queue_wait_seconds.p99 climbs above the operation's own latency, the system is saturated.
7.3 Slow-Future detection¶
A logger that dumps stacks when in-flight count exceeds a threshold is the highest-leverage tool in this layer:
```go
go func() {
	t := time.NewTicker(30 * time.Second)
	defer t.Stop()
	for range t.C {
		if n := futureInflightCount.Load(); n > 10_000 {
			buf := make([]byte, 1<<20)
			written := runtime.Stack(buf, true) // bytes actually written
			// Log only the filled part of buf, not the trailing zero bytes.
			log.Warn("future_inflight", "n", n, "stacks", string(buf[:written]))
		}
	}
}()
```
This is the difference between "we have a leak somewhere" and "9 800 of 10 000 stacks are blocked on a chan recv in productSvc.Lookup."
8. Failure modes — goroutine leak inventory, double resolve, context lifecycle bugs¶
The Future implementations in the middle level were correct under the assumption that producers and consumers cooperate. Production breaks that assumption.
8.1 The goroutine leak inventory¶
| Leak | Cause | Detection | Fix |
|---|---|---|---|
| Awaiter gives up, producer blocks on send | Unbuffered chan, no `ctx.Done()` branch in producer | `runtime.NumGoroutine` climbs forever | Buffer size 1 or select on `ctx.Done()` |
| `time.After` in a loop | Each iteration allocates a timer that lives until expiry (before Go 1.23) | Heap profile shows timer allocations growing | `time.NewTimer` + `Reset` |
| `context.WithCancel` without `defer cancel()` | Child context stays registered with its parent until the parent is cancelled | `go vet` warns; memory accumulates | Always `defer cancel()` |
| `errgroup.Wait` never called | Group goroutines never observed | Pending forever | Always pair `g.Go` with `g.Wait` |
| Producer panics; awaiter never notified | No recover in goroutine | Process crash or silent hang | Recover wrapper (§4.3) |
| Fan-out without bound | Spike spawns N=traffic goroutines | goroutine count tracks RPS | `SetLimit` or semaphore |
The first is the most common and most insidious. Reproduce:
```go
// BUG: producer leaks if consumer cancels
func leakyFetch(ctx context.Context, id string) <-chan User {
	ch := make(chan User)
	go func() {
		u := slowFetch(id) // ignores ctx
		ch <- u            // blocks forever if no reader
	}()
	return ch
}
```
If the caller does select { case u := <-leakyFetch(ctx, id): ...; case <-ctx.Done(): }, then on cancellation the goroutine sits in ch <- u forever. Fix:
```go
func fetch(ctx context.Context, id string) <-chan User {
	ch := make(chan User, 1) // buffer 1 absorbs a late send
	go func() {
		// Closing unblocks the reader when the fetch fails;
		// readers should use `u, ok := <-ch`.
		defer close(ch)
		u, err := slowFetch(ctx, id) // ctx threaded through
		if err == nil {
			select {
			case ch <- u:
			case <-ctx.Done(): // don't block on a dead reader
			}
		}
	}()
	return ch
}
```
8.2 Double resolve¶
A Future fulfilled twice loses the second value silently — or panics if it's a raw close(ch). sync.Once is the minimum. The subtler variant: a race between Resolve (success) and Reject (cancellation). The loser's result is dropped, which is correct only if both branches are idempotent. A Future resolving to an open file handle that gets dropped on cancellation leaks the FD. Wrap resource-holding Futures with close-on-discard logic, or return a Result[T, error] with Close().
8.3 Context lifecycle bugs¶
| Bug | Symptom |
|---|---|
| Storing `ctx` in a struct | The context outlives its scope; linters such as containedctx warn; cancellation never propagates |
| `context.Background()` deep in a request | Request-scoped values lost; deadlines lost; logs missing correlation IDs |
| `context.WithValue` for non-request data | Type-unsafe; opaque; tests fail to provide the value |
| `ctx` shared across requests | One request's cancel kills another's work |
| `context.TODO()` shipped to production | "I'll fix this later" never gets fixed |
The rule: context.Context is a function parameter, never a struct field; one context per goroutine tree; derived contexts never share parents across requests. A linter (contextcheck) catches the worst offenders.
9. Concurrency limits — Little's Law applied to Future fan-out¶
Little's Law: L = λ · W (in-flight = arrival × latency). It applies exactly to Futures.
Worked example. A handler fans out to 5 backends at 50 ms mean, 2 000 req/s. L = 2 000 × 5 × 0.050 = 500 mean concurrent Futures; with p99 = 200 ms tail, L_p99 ≈ 2 000. Set errgroup.SetLimit per request and a process-wide semaphore at ~2 000 plus headroom. At 50 k req/s, L = 12 500 mean and L_p99 = 50 000 tail — fine for goroutines, but downstream services must absorb the same fan-out.
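Server resource exhaustion. 50 000 goroutines × ~8 KB of stack each (stacks start near 2 KB and grow with call depth) ≈ 400 MB. Each downstream connection takes an FD; the default soft nofile limit is commonly 1 024, so raising it (for example ulimit -n 65536) is mandatory for any fan-out service.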
| Failure | Trigger | Mitigation |
|---|---|---|
| OOM from goroutine stacks | Unbounded fan-out under spike | Per-handler + global cap |
| FD exhaustion | Many outbound connections | Connection pooling; raise nofile |
| Downstream collapse | Fan-out amplifies request count | Rate limiter + circuit breaker |
| GC pressure | Many small allocations per Future | sync.Pool on hot paths |
| Scheduler starvation | Tight CPU loops without yields | runtime.Gosched() or yielding I/O |
The rule: every fan-out must be sized to the slowest downstream. If inventorySvc p99 is 500 ms and you fan out 100 calls/req at 100 req/s, that's 5 000 in-flight to inventory. Does it have capacity? Check before you ship.
10. Security — auth context propagation, sandboxed execution, timeout enforcement¶
A Future inherits the caller's context, which includes its authorization. Three failure classes are common.
Auth context loss. A handler stuffs identity into ctx; downstream futures must see it. The bug: spawning a Future with context.Background() (to avoid cancellation when the request ends) loses identity, trace, and deadline. The fix: context.WithoutCancel(ctx) (Go 1.21+) preserves values while detaching the deadline. Never strip the context wholesale.
Sandboxed execution. A Future running user-supplied code needs hard resource limits — CPU, memory, wall-clock, FS, network. Go has no in-process sandbox.
| Sandbox | Isolation | Latency |
|---|---|---|
| `os/exec` subprocess | Process-level; OS RLIMIT | Fork ~1 ms |
| WASM (wazero, wasmtime-go) | In-process; no syscalls | ~10 µs invocation |
| gVisor / Firecracker | Kernel-level | 100 ms cold start |
| Lua / Starlark embed | Cooperative; trust interpreter | Microseconds |
For multi-tenant services, WASM is the modern answer: explicit memory cap, deadline, zero ambient permissions. Starlark is fine for "buggy customer code", not "hostile customer code".
Timeout enforcement. A Future without a deadline is a security issue — a slow downstream attacker holds resources indefinitely. Every external call gets a wall-clock budget (request budget minus elapsed minus margin). A timeout is a security control, not just UX. Handlers without a cap on customer-influenced calls are one slow-loris away from goroutine exhaustion.
11. Anti-patterns at scale¶
| Anti-pattern | Symptom | Fix |
|---|---|---|
| Future-per-sync-call | Goroutine spawn dwarfs the work | Inline; Futures for I/O only |
| `go expensiveWork()` in a handler | Goroutine count tracks RPS; OOM under spike | errgroup with `SetLimit` |
| Awaiter ignores `ctx`; producer ignores `ctx` | Cancellation does nothing; goroutines leak | `ctx` on both sides, observed at every blocking point |
| Future stored in a struct field | Same Future awaited by multiple goroutines; one wins, rest hang | One Future, one Await; or use `sync.Once` resolve + broadcast |
| `chan T` returned, no buffering, `ctx` ignored | Producer blocks forever if consumer gives up | Buffer 1, select on `ctx.Done()` in producer |
| `errgroup.Wait` discarded | First error lost; siblings still running | Always check `g.Wait()` |
| `context.Background()` inside a request | Auth, trace, deadline all lost | `context.WithoutCancel` if you must detach |
| `time.After` in tight loop | Timer leak per iteration | `time.NewTimer` + `Reset` |
| No recover in Future body | Panic crashes process | Wrapper that converts panic to error |
| Fan-out with no semaphore | Spike to OOM | `SetLimit` or weighted semaphore |
| Future-of-Future-of-Future chain | Latency stacks; debugging impossible | Flatten; one Future per logical step |
| Polling a Future from outside | Busy loop on select with default | Block on `Done()` or `Await(ctx)` |
| Synchronous-looking API hides goroutine | Caller has no way to cancel | Accept `ctx`; document goroutine lifecycle |
| Future for cross-host RPC | Crash loses pending work silently | Use durable execution (Temporal) for hour-scale work |
| Cancelling parent leaves children running | Children hold downstream resources after request abort | Always use `errgroup.WithContext` or derived `ctx` |
| Awaiting in shutdown without timeout | Graceful shutdown hangs forever | `context.WithTimeout` around `g.Wait()` |
| Future inside a hot loop | Microsecond work overwhelmed by goroutine spawn | Use a worker pool or batch |
| `singleflight` returning shared mutable | Two callers mutate the same value | Treat result as immutable; defensive copy |
The deepest anti-pattern: using Futures as a default for asynchrony. Many problems are sequential. A handler that does three things in order at 5 ms each does not benefit from three Futures; it pays goroutine overhead for nothing. Futures are for overlap; sequential work is for sequential code. The g, gctx := errgroup.WithContext(ctx) line is not free — it is a commitment that the work is parallelizable, the failures compose with first-error-cancels, and the result is worth the orchestration cost.
12. Closing principles¶
A Future is a deferred value with a cancellation contract. Honor both halves:
- The deferred value is exactly one outcome. Not zero (use `chan struct{}`), not many (use a stream), not maybe. A Future that resolves twice, never, or to nothing is a bug. `sync.Once` guards the implementation; the signature `(T, error)` documents the contract.
- Cancellation is cooperative. Go cannot kill a goroutine from outside. The producer must observe `ctx.Done()` at every blocking point. A Future that ignores cancellation is a leak waiting for a slow downstream. The smallest correct Future is `<-chan Result[T]` with `ctx` threaded through and a `Done()` branch on the producer's select.
- Structured concurrency is the default. Every goroutine has a lexical parent. Use `errgroup.WithContext` and `Wait`; the parent does not return until every child has finished or cancelled. Unstructured concurrency carries a proof obligation: who owns the lifetime, and who cleans up. The proof must be written down.
- Concurrency is cheap, not free. ~1 µs spawn + ~2.5 KB initial stack per Future; 100 k concurrent is ~250 MB before any stack grows. Every fan-out needs `SetLimit` or a semaphore sized to the slowest downstream. Little's Law: `L = λ · W`. Measure both, pick a cap, alert when you approach it.
- Deadlines shrink; never grow. A child context's deadline is the parent's minus headroom. Cross-service calls carry the deadline in headers (`grpc-timeout`); each hop subtracts overhead and refuses work that cannot fit.
- Recover panics in every Future spawner. Go does not propagate panics across goroutines; an unrecovered panic crashes the process. One `defer recover` wrapper, in the project's `future` package, used everywhere.
- Observability is non-optional. OpenTelemetry spans across `ctx` make the graph a trace. Metrics on `future_inflight` and `future_duration_seconds` make it a dashboard. A slow-Future logger turns "we have a leak" into "the leak is in `productSvc.Lookup`."
- Distributed Futures are correlation problems. Pending entries live in a `map[correlation_id]*Future` with expiry; without expiry, it leaks. For work that must survive restarts, use durable execution (Temporal), not in-process Futures.
- Security follows the context. Auth, trace, deadline all live in `ctx`. Dropping the context (`context.Background()`) loses all three. Use `context.WithoutCancel` to detach lifetime while preserving values. Sandbox untrusted work in WASM or subprocesses.
- Futures are for I/O overlap. Not a general-purpose primitive. Sequential CPU work runs faster sequentially. The `errgroup.WithContext` line is a commitment that work is parallelizable, failures compose, and the orchestration cost is worth paying.
Get these right and Futures are invisible: handlers fan out, traces show the critical path, lag stays bounded. Get them wrong and the on-call incident is 50 000 goroutines blocked on the same downstream, a context stored in a struct that was never cancelled, and a panic three deploys ago silently crashing one pod per hour. The Future is the easiest pattern to write; the hardest to operate.