Context Tree — Senior¶
Reading the Source¶
The context package is one of the cleanest examples of careful concurrent programming in the standard library. Most of its work happens in three places: the cancelCtx struct, the propagateCancel function, and the cancel method. Once you can read those, you understand the whole tree.
We will reproduce the relevant code in pseudocode (close enough to the real source that you can map line-for-line) and discuss the trade-offs.
cancelCtx¶
type cancelCtx struct {
Context // embedded parent
mu sync.Mutex // protects the fields below
done atomic.Value // chan struct{}, lazily allocated
children map[canceler]struct{} // set to nil after cancel
err error // set once, under mu
cause error // since Go 1.20, set with err
}
Three lazinesses to notice:
- done is lazy. It is only allocated when something calls Done(). Many short request-scoped contexts never have a watcher; they don't pay for a channel.
- children is lazy. It is allocated on the first call to propagateCancel that registers a child.
- cause is implicit. If never set, it equals err.
atomic.Value is used for done because Done() may be called concurrently with cancel(). An atomic load makes that race-free without contending for mu.
The cancel method¶
func (c *cancelCtx) cancel(removeFromParent bool, err, cause error) {
if err == nil {
panic("context: internal error: missing cancel error")
}
if cause == nil {
cause = err
}
c.mu.Lock()
if c.err != nil {
c.mu.Unlock()
return // already cancelled — first-write-wins
}
c.err = err
c.cause = cause
d, _ := c.done.Load().(chan struct{})
if d == nil {
c.done.Store(closedchan) // sentinel
} else {
close(d)
}
for child := range c.children {
// NOTE: child holds its own mu, so deadlock-free only because
// we never grab a child's mu while holding parent.mu... except
// here. The recursion is safe because children form a tree:
// no cycles, and a child cannot back-cancel its parent.
child.cancel(false, err, cause)
}
c.children = nil
c.mu.Unlock()
if removeFromParent {
removeChild(c.Context, c)
}
}
Key invariants:
- First-write-wins. Only the first cancellation sets err/cause. Later cancellations are no-ops.
- closedchan sentinel. If nobody ever called Done() on this node, the runtime hands out a pre-closed singleton instead of allocating a fresh channel.
- Children cancel recursively under the parent's lock. This is the load-bearing line. We will look at it more carefully in a moment.
- removeFromParent is false during cascade. When a parent is cancelling its children, there is no need for each child to call removeChild(parent, child); the parent will reset children = nil anyway. removeFromParent = true only when the cancel comes from user code (the returned CancelFunc).
propagateCancel¶
func propagateCancel(parent Context, child canceler) {
done := parent.Done()
if done == nil {
return // parent is never cancellable (e.g. Background, WithoutCancel)
}
select {
case <-done:
child.cancel(false, parent.Err(), Cause(parent))
return
default:
}
if p, ok := parentCancelCtx(parent); ok {
p.mu.Lock()
if p.err != nil {
child.cancel(false, p.err, p.cause)
} else {
if p.children == nil {
p.children = make(map[canceler]struct{})
}
p.children[child] = struct{}{}
}
p.mu.Unlock()
} else {
// Parent is custom; we cannot register in its children map.
// Spawn a goroutine that forwards the signal.
go func() {
select {
case <-parent.Done():
child.cancel(false, parent.Err(), Cause(parent))
case <-child.Done():
}
}()
}
}
Two paths:
- Fast path: parent is a built-in cancelCtx (or timerCtx, which embeds one). Register the child in p.children under p.mu. Cost: one map insert.
- Slow path: parent is a custom Context. The runtime cannot reach into a foreign struct, so it spawns a goroutine that selects on parent.Done() and the new child's Done(). Every derived cancelable child of a custom context spawns one goroutine.
This is the single biggest reason to avoid implementing your own Context. Wrappers added by middleware (a Context that adds a logger, for example) silently turn each request's cancellation tree into a goroutine-per-derive forest.
parentCancelCtx is a runtime-internal helper that walks up the parent chain through valueCtx nodes (which embed their parent transparently) until it finds a real cancelCtx, or returns (nil, false) if it cannot find one.
removeChild¶
func removeChild(parent Context, child canceler) {
p, ok := parentCancelCtx(parent)
if !ok {
return
}
p.mu.Lock()
if p.children != nil {
delete(p.children, child)
}
p.mu.Unlock()
}
Called when a user invokes the CancelFunc returned by WithCancel (or when a timer fires in timerCtx). It removes the child from the parent's map so the entry can be GCed.
If you forget defer cancel(), the child stays in the parent's map until either the parent cancels or the child is GCed. As long as the parent is reachable, the child is reachable through parent.children, so it cannot be GCed. Hence the memory leak.
timerCtx¶
type timerCtx struct {
cancelCtx
timer *time.Timer
deadline time.Time
}
func WithDeadline(parent Context, d time.Time) (Context, CancelFunc) {
if cur, ok := parent.Deadline(); ok && cur.Before(d) {
return WithCancel(parent) // first-deadline-wins short-circuit
}
c := &timerCtx{
cancelCtx: newCancelCtx(parent),
deadline: d,
}
propagateCancel(parent, c)
if dur := time.Until(d); dur <= 0 {
c.cancel(true, DeadlineExceeded, nil)
return c, func() { c.cancel(false, Canceled, nil) }
}
c.timer = time.AfterFunc(dur, func() {
c.cancel(true, DeadlineExceeded, nil)
})
return c, func() { c.cancel(true, Canceled, nil) }
}
timerCtx embeds cancelCtx, so its cascade behaviour is identical. The only addition is a time.Timer that fires the cancel on deadline expiry. The user-returned CancelFunc stops the timer (inside cancel via c.timer.Stop() — omitted here) and removes the child from its parent.
valueCtx¶
type valueCtx struct {
Context // parent
key, val any
}
func (c *valueCtx) Value(key any) any {
if c.key == key {
return c.val
}
return c.Context.Value(key)
}
valueCtx has no cancellation state of its own. Done(), Err(), and Deadline() all delegate to the embedded Context. Only Value is overridden.
This means: if you call WithCancel(someValueCtx), propagateCancel walks up through the valueCtx to the real parent — the cancelCtx somewhere above — and registers there. The cascade ignores valueCtx entirely.
parentCancelCtx handles this skip:
func parentCancelCtx(parent Context) (*cancelCtx, bool) {
done := parent.Done()
if done == closedchan || done == nil {
return nil, false
}
p, ok := parent.Value(&cancelCtxKey).(*cancelCtx)
// ...
}
It uses a sentinel key (&cancelCtxKey) that cancelCtx.Value answers with c itself. So parent.Value(&cancelCtxKey) walks through valueCtx nodes and returns the nearest *cancelCtx ancestor. Elegant.
withoutCancelCtx¶
type withoutCancelCtx struct {
c Context
}
func (withoutCancelCtx) Deadline() (time.Time, bool) { return time.Time{}, false }
func (withoutCancelCtx) Done() <-chan struct{} { return nil }
func (withoutCancelCtx) Err() error { return nil }
func (w withoutCancelCtx) Value(key any) any { return w.c.Value(key) }
WithoutCancel returns this wrapper. Crucially, Done() returns nil. When propagateCancel runs against a child of withoutCancelCtx, it sees done == nil and returns immediately — no registration, no watcher. The detached child is a standalone subtree.
Value still delegates, so request-scoped values propagate.
AfterFunc¶
Implemented as a special child:
type afterFuncCtx struct {
cancelCtx
once sync.Once
f func()
}
func (a *afterFuncCtx) cancel(removeFromParent bool, err, cause error) {
a.cancelCtx.cancel(false, err, cause)
if removeFromParent {
removeChild(a.Context, a)
}
a.once.Do(func() { go a.f() })
}
func AfterFunc(ctx Context, f func()) (stop func() bool) {
a := &afterFuncCtx{f: f}
a.cancelCtx.Context = ctx
propagateCancel(ctx, a)
return func() bool {
stopped := false
a.once.Do(func() { stopped = true })
if stopped {
removeChild(ctx, a)
}
return stopped
}
}
(Real source is more careful; this is for shape.) Two design points:
- The callback runs in a new goroutine, so it cannot deadlock the cascade by trying to acquire locks the cascade holds.
- stop() uses the same once to deregister, racing the callback safely.
If you call stop() after the cascade has already fired but before f has run, stop() returns false and f still runs. This is intentional: once the runtime has decided to invoke f, the user cannot cancel that decision.
The Race Profile¶
Several races could happen in a tree under load. The package avoids them with careful invariants.
Race 1: Done() and cancel() concurrent.
Done() does an atomic load. cancel() takes mu, then atomically stores closedchan or closes the existing channel. The atomic.Value ensures the reader either gets nil (initial), the user-allocated chan struct{}, or closedchan — never a torn value.
Race 2: two cancel()s on the same node.
Both grab mu. The first sets err. The second finds err != nil and returns. First-write-wins.
Race 3: child cancellation overlapping parent cancellation.
Parent cancels at the same instant the user calls childCancel(). Both acquire their own mu. The parent's cascade calls child.cancel(false, ...), which finds the child already cancelled (because user got there first) — returns. No double-effect, no deadlock.
Race 4: deadlock between parent.mu and child.mu.
The parent's cascade holds parent.mu while calling child.cancel(false, ...). That call acquires child.mu. Could a chain of locks deadlock? No, because:
- Children never lock parents.
- The lock order is strictly down the tree.
removeChild acquires parent.mu from a child's cancel function — but only with removeFromParent = true, which the cascade does not set. So during cascade, no child reaches up.
Cause Walk¶
context.Cause(ctx) walks up the parent chain looking for the original cancellation cause:
func Cause(c Context) error {
if cc, ok := c.Value(&cancelCtxKey).(*cancelCtx); ok {
cc.mu.Lock()
defer cc.mu.Unlock()
return cc.cause
}
return c.Err()
}
The walk uses the same cancelCtxKey trick. If the nearest cancelCtx ancestor has a cause, return it. Otherwise fall back to Err().
This means every Cause call is O(depth): it walks the value chain to the nearest cancel ancestor and takes that ancestor's lock once. A tree of depth 10 incurs about 10 interface-call hops per lookup.
The Watcher Goroutine for Custom Contexts¶
We mentioned this in passing; here is the full cost.
Suppose someone wrote:
type loggingCtx struct{ context.Context }
func (loggingCtx) Done() <-chan struct{} { ... } // wraps parent
Now every middleware in the chain wraps req.Context() in a loggingCtx. When the handler derives a child:
parentCancelCtx(loggingCtxInstance) cannot find a built-in cancelCtx through loggingCtx. Even when loggingCtx embeds a Context and forwards Value (so the &cancelCtxKey lookup succeeds), the elided part of parentCancelCtx also compares the candidate's done channel with parent.Done(); a wrapper that overrides Done() fails that identity check. In practice most custom contexts fail one check or the other, and the runtime falls back to the goroutine path.
For each derive, one goroutine. For a fan-out of 100, 100 goroutines. Multiply by request rate.
The fix: do not implement Context yourself. Use WithValue to attach data; let the runtime see the real ancestor.
Tree Depth and Stack¶
The cascade is implemented with explicit recursion (for child := range c.children { child.cancel(...) }). A depth-10 tree uses 10 stack frames per cancel. The Go runtime's stacks grow as needed, but in extreme cases a million-deep tree could cause a stack overflow. Real apps stay well below.
The closedchan Sentinel¶
A single closed channel shared across the whole process. Any Done() call on an already-cancelled node returns this singleton. Two consequences:
- Comparing channels: if ctx.Done() == closedchan would work, but closedchan is internal; user code cannot reach the sentinel.
- Many cancelled contexts share the same Done() value. This is by design: a closed channel is always ready, so identity doesn't matter.
A Worked Example of Cascade Timing¶
b := context.Background()
p, cancelP := context.WithCancel(b)
a, _ := context.WithCancel(p)
a1, _ := context.WithCancel(a)
a2, _ := context.WithCancel(a)
c, _ := context.WithCancel(p)
cancelP()
The runtime path:
- cancelP() calls p.cancel(true, Canceled, Canceled). p.mu.Lock(). p.err = Canceled. p.cause = Canceled. close(p.done) (if allocated) or store closedchan.
- Iterate p.children. Suppose order is a, c.
- Call a.cancel(false, ...). a.mu.Lock(). Set fields. Close a.done. Iterate a.children = {a1, a2}.
- Call a1.cancel(false, ...). Lock, set, close. Children empty.
- Call a2.cancel(false, ...). Lock, set, close. Children empty. a.children = nil. Unlock a.
- Call c.cancel(false, ...). Same as above. p.children = nil. Unlock p.
- Outer caller: removeChild(b, p) (no-op because Background has no children map).
The whole cascade runs on the goroutine that called cancelP(). Total time: O(N) under a chain of locks, but each lock is held briefly. For 1000 nodes, the cascade typically completes in tens of microseconds.
The Stop() Decision¶
When you call cancel() on a timerCtx, the runtime calls c.timer.Stop(). If the timer has already fired (deadline reached), Stop returns false — the cancellation has already happened. The user's call becomes a no-op via the first-write-wins check.
If the timer is still pending, Stop returns true. Now the user's cancel proceeds as if it were a plain WithCancel's cancel.
Production Stories¶
Two patterns I have seen in production:
Story 1: Cascade Storm¶
A pool of 100,000 idle goroutines each held a child context derived from a shared root. Some bug caused the root to cancel every second (a flapping deadline). Each cancel cascaded into 100,000 children, holding root.mu for milliseconds. Latency spiked.
The fix was twofold: (1) reduce the cancel rate, (2) flatten the tree so each goroutine derived only when about to do work, not eagerly at startup.
Story 2: Goroutine Leak via Custom Context¶
A homemade Context wrapper added structured logging fields. Every RPC derived WithCancel from it. Each derive spawned a watcher goroutine. Under load the process accumulated millions of goroutines. The fix: drop the custom context, use WithValue with a logger key.
When to Reach for WithoutCancel and AfterFunc¶
- WithoutCancel when a background task must outlive the trigger and you want value propagation. Audit logs, asynchronous notifications, metrics emission.
- AfterFunc for resource cleanup (closing a connection, releasing a semaphore) that you used to write as a <-ctx.Done() goroutine.
If you are migrating from Go 1.20 or earlier, audit your <-ctx.Done() goroutines and replace them with AfterFunc where possible.
Summary¶
The cancelCtx-propagateCancel-cancel triad is small and elegant. Trees are wired through children maps with lazy allocation; cancellation cascades depth-first under per-node mutexes; the only goroutine overhead is the watcher created when the parent is a non-built-in Context. Once you have read the source — and we strongly recommend doing so — every subtle behaviour of the API (first-deadline-wins, idempotent cancel, Cause propagation, WithoutCancel short-circuit) follows from one or two lines of code.