Context Internals — Middle¶
Table of Contents¶
- From Users to Implementers
- The cancelCtx Struct in Detail
- propagateCancel — The Linker
- The parentCancelCtx Optimisation
- The Slow Path: A Forwarder Goroutine
- removeChild — Breaking the Link
- Lazy Done Channel Allocation
- The closedchan Trick
- Atomic err Plus Mutex — Why Both?
- Walking the cancel Method Step by Step
- How the Children Map Drains
- What Happens on Double Cancel
- Reading the cancelCtx Source
From Users to Implementers¶
At junior level we toured the types from outside. At middle level we trace the actual control flow when you call context.WithCancel(parent). Three things happen, in this order:
1. A cancelCtx struct is heap-allocated.
2. propagateCancel is called to link this child to its parent.
3. A CancelFunc closure is returned.
The first is mundane allocation. The third is a closure that calls c.cancel(true, Canceled, nil). The interesting work is in step two: propagateCancel is the function that wires the cancellation tree together, and its implementation choices dominate the package's performance.
We will spend most of this page on that one function — and on the cancel method that fires when the wiring eventually pays off.
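From the outside, the effect of the linking step can be observed with a minimal sketch like the one below. The helper name cascade is made up for illustration; everything else is the public context API.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// cascade links a child to a cancelable parent, cancels only the parent,
// and reports whether the child ended up canceled as well.
func cascade() bool {
	parent, cancelParent := context.WithCancel(context.Background())
	child, cancelChild := context.WithCancel(parent) // allocate, link, return closure
	defer cancelChild()

	cancelParent() // the parent's CancelFunc closure
	<-child.Done() // does not block: the cascade ran inside cancelParent
	return errors.Is(child.Err(), context.Canceled)
}

func main() {
	fmt.Println("child canceled with parent:", cascade())
}
```

The child is never canceled directly here; it is canceled only because the link to its parent exists.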
The cancelCtx Struct in Detail¶
Reproduced from src/context/context.go in recent Go releases, with comments lightly adapted:
type cancelCtx struct {
Context // the embedded parent
mu sync.Mutex // protects following fields
done atomic.Value // of chan struct{}, created lazily,
// closed by first cancel call
children map[canceler]struct{} // set to nil by the first cancel call
err atomic.Value // set to non-nil by the first cancel call
cause error // set to non-nil by the first cancel call
}
The choice of atomics + mutex is deliberate. The two fields that may need to be read from many goroutines at once — done and err — are atomics, so reads do not have to take the mutex. The other fields are accessed only when there is a write happening (cancel time, child registration), so the mutex is fine.
Imagine a hot loop that calls ctx.Err() on every iteration to decide whether to keep going.
Such a loop can stay lock-free because Err() is implemented as an atomic load. If Err were behind the mutex, every iteration would acquire and release a lock, which would dominate the loop. The atomic+mutex split makes the hot path cheap.
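A loop of that kind might look like the following sketch. The function names sumUntilCanceled and sumWith are hypothetical; the only library call on the hot path is ctx.Err().

```go
package main

import (
	"context"
	"fmt"
)

// sumUntilCanceled adds up items and polls ctx.Err() before each one.
func sumUntilCanceled(ctx context.Context, items []int) (int, error) {
	total := 0
	for _, v := range items {
		if err := ctx.Err(); err != nil {
			return total, err // stop as soon as cancellation is visible
		}
		total += v
	}
	return total, nil
}

// sumWith runs the loop against a live or an already-canceled context and
// reports the partial sum plus whether the loop finished without an error.
func sumWith(canceled bool) (int, bool) {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	if canceled {
		cancel()
	}
	total, err := sumUntilCanceled(ctx, []int{1, 2, 3})
	return total, err == nil
}

func main() {
	fmt.Println(sumWith(false))
	fmt.Println(sumWith(true))
}
```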
A handy summary:
| Field | Atomic? | Lock-protected? | Why |
|---|---|---|---|
| Context | No (set once at construction, then immutable) | No | Embedded parent never changes |
| mu | n/a (it is the lock) | n/a | Guards children and cause, and serialises stores to done and err |
| done | Yes (atomic.Value) | Also under mu when storing | Reads must be fast |
| children | No | Yes | Mutated only at child register / cancel |
| err | Yes (atomic.Value) | Also under mu when storing | Err() must be fast |
| cause | No | Yes | Read only via Cause(ctx), which already takes mu |
Why store err and cause differently? err must be read on every Err() call (potentially in a hot loop), so atomic. cause is read only when somebody explicitly asks via context.Cause(ctx), which is rare — a mutex-protected read is fine.
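The two values are also distinct from the caller's point of view. A small sketch, assuming Go 1.20 or later for WithCancelCause and Cause (the helper name and the error text are made up):

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// errAndCause cancels with a custom cause and returns what each accessor sees.
func errAndCause() (errIsCanceled bool, cause string) {
	ctx, cancel := context.WithCancelCause(context.Background())
	cancel(errors.New("upstream closed")) // hypothetical cause for illustration
	return errors.Is(ctx.Err(), context.Canceled), context.Cause(ctx).Error()
}

func main() {
	isCanceled, cause := errAndCause()
	fmt.Println(isCanceled, cause)
}
```

Err() keeps reporting the generic context.Canceled, while Cause carries the specific reason.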
propagateCancel — The Linker¶
This is the function that connects a freshly-created child to its parent. Read it carefully:
func (c *cancelCtx) propagateCancel(parent Context, child canceler) {
c.Context = parent
done := parent.Done()
if done == nil {
return // parent is never canceled
}
select {
case <-done:
// parent is already canceled
child.cancel(false, parent.Err(), Cause(parent))
return
default:
}
if p, ok := parentCancelCtx(parent); ok {
// parent is a *cancelCtx, or derives from one.
p.mu.Lock()
if err := p.err.Load(); err != nil {
// parent has already been canceled
child.cancel(false, err.(error), p.cause)
} else {
if p.children == nil {
p.children = make(map[canceler]struct{})
}
p.children[child] = struct{}{}
}
p.mu.Unlock()
return
}
if a, ok := parent.(afterFuncer); ok {
// parent implements an AfterFunc method.
c.mu.Lock()
stop := a.AfterFunc(func() {
child.cancel(false, parent.Err(), Cause(parent))
})
c.Context = stopCtx{
Context: parent,
stop: stop,
}
c.mu.Unlock()
return
}
goroutines.Add(1)
go func() {
select {
case <-parent.Done():
child.cancel(false, parent.Err(), Cause(parent))
case <-child.Done():
}
}()
}
Five branches, each handling a different parent shape. Reading top to bottom:
1. Parent's Done() is nil: the parent is Background, TODO, or some similar uncancelable context. There is nothing to listen to, so return immediately. The child still functions as a cancel target if its own CancelFunc is invoked, but cancellation does not flow down because there is no cancellation to flow.
2. Parent is already canceled: the select { case <-done: } succeeds. We cancel the child immediately and return. The child never gets registered anywhere because its lifetime has effectively ended.
3. Parent is a known *cancelCtx (or wraps one in a recognisable way): parentCancelCtx(parent) returns the underlying *cancelCtx. We register ourselves in parent.children and return. No goroutine is spawned.
4. Parent implements AfterFunc(func()) func() bool, a custom interface: we register a callback. When the parent cancels, our callback fires and cancels us.
5. Otherwise, the slow path: spawn a goroutine that does a two-case select on parent.Done() and child.Done(). When either fires, we react.
The branches are ordered by frequency: case 3 (recognised *cancelCtx) is by far the most common in real code. The slow path is reserved for custom contexts.
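The AfterFunc branch is the least familiar of the five, so here is a hedged sketch of a custom parent that would take it. The type hookCtx and all of its internals are hypothetical. It assumes Go 1.21 or later for the AfterFunc branch; on older releases the same program falls back to the forwarder goroutine and behaves the same from the outside.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync"
	"time"
)

// hookCtx is a hypothetical custom context. It is not a *cancelCtx, but it
// has an AfterFunc method with the signature the package looks for.
type hookCtx struct {
	mu    sync.Mutex
	done  chan struct{}
	err   error
	next  int
	hooks map[int]func()
}

func newHookCtx() *hookCtx {
	return &hookCtx{done: make(chan struct{}), hooks: make(map[int]func())}
}

func (c *hookCtx) Deadline() (time.Time, bool) { return time.Time{}, false }
func (c *hookCtx) Done() <-chan struct{}       { return c.done }
func (c *hookCtx) Value(key any) any           { return nil }

func (c *hookCtx) Err() error {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.err
}

// AfterFunc registers f to run after cancellation and returns a stop function.
func (c *hookCtx) AfterFunc(f func()) func() bool {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.err != nil {
		go f() // already canceled: run right away, nothing left to stop
		return func() bool { return false }
	}
	id := c.next
	c.next++
	c.hooks[id] = f
	return func() bool {
		c.mu.Lock()
		defer c.mu.Unlock()
		_, ok := c.hooks[id]
		delete(c.hooks, id)
		return ok
	}
}

// cancel records the error, closes done, then runs each hook in its own
// goroutine so a hook may call Err without deadlocking on mu.
func (c *hookCtx) cancel() {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.err != nil {
		return
	}
	c.err = context.Canceled
	close(c.done)
	for _, f := range c.hooks {
		go f()
	}
	c.hooks = nil
}

// childFollowsHookParent reports whether a standard child is canceled
// after its custom parent is canceled.
func childFollowsHookParent() bool {
	parent := newHookCtx()
	child, cancel := context.WithCancel(parent)
	defer cancel()
	parent.cancel()
	select {
	case <-child.Done():
		return errors.Is(child.Err(), context.Canceled)
	case <-time.After(2 * time.Second):
		return false
	}
}

func main() {
	fmt.Println("child followed parent:", childFollowsHookParent())
}
```

The error is set before any hook runs because the registered callback reads parent.Err() and expects it to be non-nil.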
The parentCancelCtx Optimisation¶
The crucial helper:
func parentCancelCtx(parent Context) (*cancelCtx, bool) {
done := parent.Done()
if done == closedchan || done == nil {
return nil, false
}
p, ok := parent.Value(&cancelCtxKey).(*cancelCtx)
if !ok {
return nil, false
}
pdone, _ := p.done.Load().(chan struct{})
if pdone != done {
return nil, false
}
return p, true
}
How does this work? Two clever tricks:
Trick 1: &cancelCtxKey as a Sentinel¶
The package declares an unexported package-level variable, cancelCtxKey, of type int.
Its address — &cancelCtxKey — is used as a private context key. The cancelCtx.Value method is overridden to return itself when this key is queried:
func (c *cancelCtx) Value(key any) any {
if key == &cancelCtxKey {
return c
}
return value(c.Context, key)
}
So parent.Value(&cancelCtxKey) walks up the chain (through any valueCtx wrappers) until it finds a cancelCtx, which returns itself. If the chain has no cancelCtx, we get nil.
This is how parentCancelCtx pierces through arbitrarily many valueCtx wrappers to find the innermost real cancelable. It is a brilliant little protocol: the unexported cancelCtxKey lets the package do reflection-free type recognition through the Value interface.
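The same protocol can be reproduced in user code. The sketch below is an analogy, not the package's implementation: trackerKey, tracker, findTracker, and the key types are all hypothetical names.

```go
package main

import (
	"context"
	"fmt"
)

// trackerKey is a private variable; only its address is used as a key.
var trackerKey int

// tracker is a hypothetical wrapper that answers a lookup of &trackerKey
// with itself, mirroring how cancelCtx answers &cancelCtxKey.
type tracker struct {
	context.Context
	name string
}

func (t *tracker) Value(key any) any {
	if key == &trackerKey {
		return t
	}
	return t.Context.Value(key)
}

// findTracker pierces any number of wrappers through the Value chain.
func findTracker(ctx context.Context) (*tracker, bool) {
	t, ok := ctx.Value(&trackerKey).(*tracker)
	return t, ok
}

type userKey struct{}
type traceKey struct{}

// nearestTrackerName buries a tracker under two value wrappers and finds it.
func nearestTrackerName() string {
	var ctx context.Context = &tracker{Context: context.Background(), name: "request-42"}
	ctx = context.WithValue(ctx, userKey{}, "alice")
	ctx = context.WithValue(ctx, traceKey{}, "abc123")
	if t, ok := findTracker(ctx); ok {
		return t.name
	}
	return ""
}

// backgroundHasTracker shows the miss case: no tracker in the chain.
func backgroundHasTracker() bool {
	_, ok := findTracker(context.Background())
	return ok
}

func main() {
	fmt.Println(nearestTrackerName(), backgroundHasTracker())
}
```

Because the key is the address of an unexported variable, no other package can construct a matching key or collide with it.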
Trick 2: The done Channel Cross-Check¶
Even after we have a *cancelCtx in hand, we need to ensure it is the real cancel source for the parent. Someone might have wrapped a cancelCtx in a custom type that overrides Done() to return a different channel. In that case, we must not register ourselves on the inner cancelCtx's children list — because cancelling the outer wrapper would not flow through to that channel.
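The check is the pdone != done comparison in parentCancelCtx. It compares the parent's Done() channel against the inner cancelCtx's done channel. If they match, the cancelCtx is the authoritative source. If they differ, parentCancelCtx returns false and propagateCancel falls through to the AfterFunc check and then the slow path.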
This is a beautiful piece of defensive design. It accommodates custom contexts that wrap a cancelCtx for their parent but want different cancellation semantics.
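A sketch of a wrapper that fails the cross-check on purpose may make the reasoning concrete. The type gated and its gate channel are hypothetical; the behaviour shown follows from the parentCancelCtx listing above.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// gated is a hypothetical wrapper around a standard context. It overrides
// Done and Err so that it is canceled by its own gate channel, not by the
// inner cancelCtx.
type gated struct {
	context.Context
	gate chan struct{}
}

func (g *gated) Done() <-chan struct{} { return g.gate }

func (g *gated) Err() error {
	select {
	case <-g.gate:
		return context.Canceled
	default:
		return nil
	}
}

// gateControlsChild reports whether the child stays live when only the inner
// context is canceled, and whether it is canceled once the gate closes.
func gateControlsChild() (liveAfterInnerCancel, doneAfterGate bool) {
	inner, cancelInner := context.WithCancel(context.Background())
	g := &gated{Context: inner, gate: make(chan struct{})}
	child, cancelChild := context.WithCancel(g)
	defer cancelChild()

	cancelInner() // child is not in inner's children map, so nothing cascades
	liveAfterInnerCancel = child.Err() == nil

	close(g.gate) // the forwarder goroutine sees this and cancels child
	select {
	case <-child.Done():
		doneAfterGate = true
	case <-time.After(2 * time.Second):
	}
	return liveAfterInnerCancel, doneAfterGate
}

func main() {
	fmt.Println(gateControlsChild())
}
```

Had the child been registered on the inner cancelCtx, canceling the inner context would have canceled the child even though the wrapper still reported itself as live.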
The Slow Path: A Forwarder Goroutine¶
If parentCancelCtx returns (nil, false) and there's no AfterFunc method, we hit:
goroutines.Add(1)
go func() {
select {
case <-parent.Done():
child.cancel(false, parent.Err(), Cause(parent))
case <-child.Done():
}
}()
This goroutine sits in a select. It is parked on two channels:
- The parent's Done(): wakes when the parent cancels.
- The child's Done(): wakes when the child cancels (so we can exit and stop watching the parent).
When either fires, the goroutine runs once and exits. It is small and bounded.
But it is still a goroutine. Each one costs ~2 KB of stack while parked, plus scheduler bookkeeping. If you derive 100,000 cancelable contexts off a custom-implemented parent, you spawn 100,000 forwarder goroutines.
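The package keeps a counter for tests: an unexported atomic counter named goroutines, incremented each time the slow path fires. Because it is unexported, only the package's own tests can read it. From user code, runtime.NumGoroutine() gives a rough view of how many forwarders are parked.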
When Does the Slow Path Fire?¶
In normal Go code, never. Every standard-library context type is handled by one of the earlier branches, most often the parentCancelCtx fast path. The slow path fires only when:
- You define a custom type that implements context.Context but does not wrap a recognisable cancelCtx and does not implement AfterFunc.
- You wrap a stdlib context but override Done() in a way that changes the channel identity.
- A testing mock implements the interface without delegating.
For framework authors writing their own context types, this is the most important page in this document.
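To see the cost of the first case, a hand-rolled parent can be put under a rough measurement. This is a sketch under assumptions: manualCtx and deriveMany are hypothetical, and the goroutine delta comes from runtime.NumGoroutine, which counts every goroutine in the process and is only an approximation.

```go
package main

import (
	"context"
	"fmt"
	"runtime"
	"time"
)

// manualCtx is a hypothetical hand-rolled context: no embedded standard
// context and no AfterFunc method, so children can only follow it through
// a forwarder goroutine.
type manualCtx struct {
	done chan struct{}
}

func (m *manualCtx) Deadline() (time.Time, bool) { return time.Time{}, false }
func (m *manualCtx) Done() <-chan struct{}       { return m.done }
func (m *manualCtx) Value(key any) any           { return nil }

func (m *manualCtx) Err() error {
	select {
	case <-m.done:
		return context.Canceled
	default:
		return nil
	}
}

// deriveMany derives n children and reports how many goroutines appeared
// while doing so, and whether every child followed the parent's cancellation.
func deriveMany(n int) (delta int, allCanceled bool) {
	parent := &manualCtx{done: make(chan struct{})}
	before := runtime.NumGoroutine()
	children := make([]context.Context, 0, n)
	for i := 0; i < n; i++ {
		child, cancel := context.WithCancel(parent)
		defer cancel() // fine for a small n; runs when deriveMany returns
		children = append(children, child)
	}
	delta = runtime.NumGoroutine() - before

	close(parent.done) // wake every forwarder
	timeout := time.After(2 * time.Second)
	for _, child := range children {
		select {
		case <-child.Done():
		case <-timeout:
			return delta, false
		}
	}
	return delta, true
}

func main() {
	delta, ok := deriveMany(100)
	fmt.Println("extra goroutines while parked:", delta, "all canceled:", ok)
}
```

Adding an AfterFunc method to such a type, or embedding a standard cancelable context without overriding Done, would keep it off this path.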
removeChild — Breaking the Link¶
When a child cancels, we need to remove it from the parent's children map so the parent does not leak references. That is what removeChild does:
func removeChild(parent Context, child canceler) {
if s, ok := parent.(stopCtx); ok {
s.stop()
return
}
p, ok := parentCancelCtx(parent)
if !ok {
return
}
p.mu.Lock()
if p.children != nil {
delete(p.children, child)
}
p.mu.Unlock()
}
Two cases:
- parent is a stopCtx: the parent had registered an AfterFunc callback for this child. Call stop() to unregister it.
- parent is a recognised *cancelCtx: delete this child from its children map.
Note that the slow-path case is not represented here. The forwarder goroutine handles its own cleanup: when the child's Done fires, the goroutine's select takes the <-child.Done() branch and the goroutine exits. There is no separate map entry to remove.
Why removeChild Matters¶
Without it, every child your code creates would stay in its parent's children map forever, even after the child completed. Long-running parents (think context.Background() plus a top-level cancelCtx in main) would accumulate millions of dead children, exhausting memory.
The Go runtime cannot rescue you here. The children map holds live references — these are not weak references. GC will not collect a child that the parent's map still points to.
This is why you must call the returned CancelFunc, even when the work completed naturally. Calling it triggers the removeFromParent=true branch in cancel, which calls removeChild. Skipping this is a classic context leak. go vet -lostcancel catches it.
Lazy Done Channel Allocation¶
Let us re-read Done:
func (c *cancelCtx) Done() <-chan struct{} {
d := c.done.Load()
if d != nil {
return d.(chan struct{})
}
c.mu.Lock()
defer c.mu.Unlock()
d = c.done.Load()
if d == nil {
d = make(chan struct{})
c.done.Store(d)
}
return d.(chan struct{})
}
Two reads of c.done.Load(). Why?
This is double-checked locking. The first read is atomic and lock-free — if the channel already exists, we return without touching the mutex at all. The second read happens inside the lock; it ensures we do not allocate twice if two goroutines race on the first read.
It works because atomic.Value's Store and Load are sequentially consistent under the Go memory model: a Load that observes the stored channel also observes everything that happened before the Store.
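The pattern is easy to lift out and test in isolation. The sketch below uses a hypothetical lazyChan type as a stand-in for the done field; it is not the package's code, only the same double-checked shape.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// lazyChan is a hypothetical stand-in for how cancelCtx handles done.
type lazyChan struct {
	mu sync.Mutex
	ch atomic.Value // of chan struct{}
}

// get returns the channel, allocating it on first use.
func (l *lazyChan) get() chan struct{} {
	if v := l.ch.Load(); v != nil {
		return v.(chan struct{}) // fast path: no lock taken
	}
	l.mu.Lock()
	defer l.mu.Unlock()
	if v := l.ch.Load(); v != nil {
		return v.(chan struct{}) // another goroutine won the race
	}
	c := make(chan struct{})
	l.ch.Store(c)
	return c
}

// distinctChannels calls get from n goroutines and counts distinct results.
func distinctChannels(n int) int {
	var l lazyChan
	results := make([]chan struct{}, n)
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			results[i] = l.get()
		}(i)
	}
	wg.Wait()

	seen := make(map[chan struct{}]bool)
	for _, c := range results {
		seen[c] = true
	}
	return len(seen)
}

func main() {
	fmt.Println("distinct channels:", distinctChannels(64))
}
```

Without the second Load inside the lock, two goroutines that both miss on the fast path could each allocate and store a channel, and callers would end up waiting on different channels.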
Why Lazy?¶
Many contexts never have Done observed:
- A handler creates ctx, cancel := context.WithTimeout(...), calls a downstream service, the downstream service succeeds before the deadline, and we never enter a select on ctx.Done().
- A goroutine derives a context just to pass to a method that takes context.Context but does not actually block.
In these cases, the channel allocation would be wasted. Go's standard library habit is to defer allocation until the moment of need; cancelCtx.Done is a textbook example.
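The cost saved is one make(chan struct{}): a single heap allocation of roughly 100 bytes on 64-bit platforms for the runtime's channel header (an unbuffered chan struct{} needs no buffer, and the exact size varies by Go version). Per-request, small. At 200k req/s, real.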
The closedchan Trick¶
Look at the cancel method:
func (c *cancelCtx) cancel(removeFromParent bool, err, cause error) {
// ...
d, _ := c.done.Load().(chan struct{})
if d == nil {
c.done.Store(closedchan)
} else {
close(d)
}
// ...
}
When cancel runs and the channel was never allocated, instead of allocating a fresh closed channel, we substitute the package-global closedchan.
This is one channel, created once and closed in the package's init function. Every canceled-but-never-observed context shares it.
A later call to Done() on a canceled-but-never-observed context returns this shared channel.
The caller does not know or care that it is shared. From the outside, a closed channel is a closed channel. Memory saved, behaviour preserved.
This trick — substituting a singleton "already-done" channel — is one of the most economical decisions in the package.
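The sharing can be observed from user code by comparing channel values. Treat this as a sketch of an implementation detail, not a guarantee of the API: the helper sharedDone is hypothetical, and the equality it reports depends on the closedchan substitution described above.

```go
package main

import (
	"context"
	"fmt"
)

// sharedDone cancels two unrelated contexts before anyone calls Done, then
// reports whether Done is closed and whether both return the same channel.
func sharedDone() (closed, same bool) {
	a, cancelA := context.WithCancel(context.Background())
	b, cancelB := context.WithCancel(context.Background())
	cancelA() // Done was never called on a or b before this point
	cancelB()

	select {
	case <-a.Done():
		closed = true
	default:
	}
	same = a.Done() == b.Done() // channel values compare by identity
	return closed, same
}

func main() {
	fmt.Println(sharedDone())
}
```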
Atomic err Plus Mutex — Why Both?¶
Why is err an atomic.Value even though we always store it inside the mutex? Two reasons:
- Reads do not need the mutex. Err() is called frequently. Avoiding lock acquisition on the read path is a big win.
- Err may need to be visible before cancel finishes. Suppose goroutine A calls cancel, while goroutine B is in Err(). If B reads through the mutex, it might block until A is done. If B reads atomically, B sees the new error as soon as A has stored it, before A has finished closing the done channel.
But wait: returning a non-nil err before the done channel has been closed would violate the spec. The spec says: "If Done is not yet closed, Err returns nil. If Done is closed, Err returns a non-nil error explaining why."
So the implementation has to be careful. Re-read Err():
func (c *cancelCtx) Err() error {
if err := c.err.Load(); err != nil {
// Ensure the done channel has been closed before returning a non-nil error.
<-c.Done()
return err.(error)
}
return nil
}
After atomically loading err, if it is non-nil, we receive from Done() — which forces a "happens-before" relationship: by the time we have received from Done, the channel has been closed, which means cancel has run the close(d) (or stored closedchan).
So the invariant from the spec, that Done is closed before Err returns non-nil, is preserved even though the atomic store and the channel close are two separate operations.
This is subtle. It is the kind of code you write only after the race detector has yelled at you a few times.
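The invariant can be exercised with a small stress loop. This is only a sketch: the helper errImpliesDone is hypothetical, and a passing run shows the property held on those rounds, not that it is proven.

```go
package main

import (
	"context"
	"fmt"
	"runtime"
)

// errImpliesDone polls Err while another goroutine cancels. It reports
// whether Done was already closed every time Err was seen to be non-nil.
func errImpliesDone(rounds int) bool {
	for i := 0; i < rounds; i++ {
		ctx, cancel := context.WithCancel(context.Background())
		done := ctx.Done() // force the channel to be allocated up front
		go cancel()
		for ctx.Err() == nil {
			runtime.Gosched() // let the canceling goroutine run
		}
		select {
		case <-done:
			// closed, as the spec requires once Err is non-nil
		default:
			return false
		}
	}
	return true
}

func main() {
	fmt.Println("invariant held:", errImpliesDone(1000))
}
```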
Walking the cancel Method Step by Step¶
Here is the full body, annotated:
func (c *cancelCtx) cancel(removeFromParent bool, err, cause error) {
if err == nil {
panic("context: internal error: missing cancel error")
}
if cause == nil {
cause = err
}
c.mu.Lock()
if c.err.Load() != nil {
c.mu.Unlock()
return // already canceled
}
c.err.Store(err)
c.cause = cause
d, _ := c.done.Load().(chan struct{})
if d == nil {
c.done.Store(closedchan)
} else {
close(d)
}
for child := range c.children {
// NOTE: acquiring the child's lock while holding parent's lock.
child.cancel(false, err, cause)
}
c.children = nil
c.mu.Unlock()
if removeFromParent {
removeChild(c.Context, c)
}
}
Numbered walk:
1. Sanity checks: err cannot be nil; if cause is nil, fall back to err.
2. Take the mutex.
3. Check idempotency: if already canceled (err.Load() != nil), release the mutex and return. This is what makes cancel safe to call repeatedly: subsequent calls are no-ops.
4. Store the error and cause.
5. Close the done channel: either close the lazily-allocated one or substitute closedchan.
6. Cascade to children: for each child registered with us, call its cancel recursively. The comment notes that we acquire the child's mutex while holding our own. This is safe because the lock order is always parent-to-child; there is no opposing direction.
7. Drop the children map by setting it to nil. This releases the references so the GC can reclaim the children once nothing else points to them.
8. Release the mutex.
9. If removeFromParent is true (i.e., this is the external cancel call, not a cascading one), remove ourselves from our parent's children map.
The lock-order observation is important. If two cancel calls raced from two unrelated parents trying to cancel overlapping subtrees, we would need a global lock order to avoid deadlock. The package's structure — cancel always cascades down — provides that order naturally.
Why removeFromParent Is Different for Cascading vs External¶
The argument removeFromParent is true only when called from the outside (your CancelFunc closure). When cancel recurses into children, it passes false because the parent is about to set c.children = nil anyway — there is no point in each child also trying to delete itself from a map that is about to be wiped.
This shaves a lock acquisition per child during cascade. For a wide fan-out, it adds up.
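The cascade and the removeFromParent distinction can be modelled in a few dozen lines. The node type below is a hypothetical toy, stripped of the done channel and the atomics, and is not how the package is written; it only mirrors the locking and unlinking order.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// node is a hypothetical, stripped-down cancelCtx: just the mutex, the
// children set, and the error.
type node struct {
	mu       sync.Mutex
	parent   *node
	children map[*node]struct{}
	err      error
}

// newNode creates a node and registers it with parent, if any.
func newNode(parent *node) *node {
	n := &node{parent: parent}
	if parent == nil {
		return n
	}
	parent.mu.Lock()
	defer parent.mu.Unlock()
	if parent.err != nil {
		n.err = parent.err // parent already canceled: start out canceled
		return n
	}
	if parent.children == nil {
		parent.children = make(map[*node]struct{})
	}
	parent.children[n] = struct{}{}
	return n
}

func (n *node) cancel(removeFromParent bool, err error) {
	n.mu.Lock()
	if n.err != nil {
		n.mu.Unlock()
		return // already canceled: idempotent
	}
	n.err = err
	for child := range n.children {
		child.cancel(false, err) // our lock is held; locks nest top-down only
	}
	n.children = nil
	n.mu.Unlock()

	if removeFromParent && n.parent != nil {
		n.parent.mu.Lock()
		delete(n.parent.children, n) // no-op if the parent's map is already nil
		n.parent.mu.Unlock()
	}
}

func (n *node) canceled() bool {
	n.mu.Lock()
	defer n.mu.Unlock()
	return n.err != nil
}

// demo builds root -> {a -> leaf, b}, cancels a externally, and reports
// what is left registered on root and who ended up canceled.
func demo() (remaining int, leafCanceled, siblingLive bool) {
	root := newNode(nil)
	a := newNode(root)
	b := newNode(root)
	leaf := newNode(a)

	a.cancel(true, errors.New("a stopped")) // external call: unlinks a from root

	root.mu.Lock()
	remaining = len(root.children)
	root.mu.Unlock()
	return remaining, leaf.canceled(), !b.canceled()
}

func main() {
	fmt.Println(demo())
}
```

The leaf is canceled with false, so it never touches the map of a, which a wipes in one assignment right after the loop.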
How the Children Map Drains¶
Walk through this scenario:
root, cancel := context.WithCancel(context.Background())
defer cancel()
for i := 0; i < 1000; i++ {
sub, _ := context.WithCancel(root)
go work(sub)
}
We register 1000 children in root.children. None of the sub-contexts has its cancel called (we threw away the cancel func, which is a bug, but let's continue).
When the work goroutines finish, the sub-contexts become unreachable from user code. But they are still reachable from root.children. They will not be garbage-collected.
Now cancel() runs at the end of the outer function:
- The root's cancel method fires.
- It cascades into each of the 1000 children, setting their err and closing their done.
- It then sets root.children = nil.
- Now the sub-contexts are unreachable from user code and from the parent. GC can free them.
So the memory leak (children retained while parent lives) lasts only as long as the parent. If root lives for the whole process, those 1000 sub-contexts leak for the whole process. If root is short-lived, the leak is bounded.
The lesson: discarding a CancelFunc is only safe if the parent will cancel soon.
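A version of the scenario that does not depend on the parent's lifetime keeps each CancelFunc and calls it when the work ends. The sketch below assumes Go 1.19 or later for atomic.Int64; work and runAll are hypothetical stand-ins.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"sync/atomic"
)

// work is a hypothetical stand-in for real per-task work.
func work(ctx context.Context) {
	select {
	case <-ctx.Done():
	default:
	}
}

// runAll starts n tasks, each with its own sub-context, and returns how
// many finished. Every task releases its sub-context when it is done.
func runAll(n int) int64 {
	root, cancel := context.WithCancel(context.Background())
	defer cancel()

	var finished atomic.Int64
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		sub, cancelSub := context.WithCancel(root)
		wg.Add(1)
		go func() {
			defer wg.Done()
			defer cancelSub() // unlinks sub from root's children when work ends
			work(sub)
			finished.Add(1)
		}()
	}
	wg.Wait()
	return finished.Load()
}

func main() {
	fmt.Println("tasks finished:", runAll(1000))
}
```

With this shape the children map of root shrinks as tasks complete, so a long-lived root no longer accumulates finished sub-contexts.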
What Happens on Double Cancel¶
What if you call cancel() twice? The idempotency check at the top of cancel (the c.err.Load() != nil test, made under the mutex) handles it.
The second call takes the mutex, sees that err is already non-nil, releases the mutex, and returns. Done channel stays closed. Children map stays nil. Everything is a no-op.
This is why defer cancel() is safe even when you call cancel() explicitly earlier. Defer always fires; the second invocation is a no-op.
Cause Semantics on Double Cancel¶
WithCancelCause returns a CancelCauseFunc that takes an error. If you call it twice with different causes:
ctx, cancel := context.WithCancelCause(parent)
cancel(errors.New("first reason"))
cancel(errors.New("second reason"))
The second call is a no-op. The cause stays "first reason". This is consistent with the err semantics — first cancellation wins, subsequent ones are silently dropped.
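The snippet above can be turned into a runnable check. This sketch assumes Go 1.20 or later; the helper name firstCauseWins is made up, and context.Background() stands in for parent.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// firstCauseWins cancels twice with different causes, with a deferred third
// call on top, and returns what the context reports afterwards.
func firstCauseWins() (cause string, errIsCanceled bool) {
	ctx, cancel := context.WithCancelCause(context.Background())
	defer cancel(nil) // safe: by then the context is already canceled
	cancel(errors.New("first reason"))
	cancel(errors.New("second reason")) // no-op
	return context.Cause(ctx).Error(), errors.Is(ctx.Err(), context.Canceled)
}

func main() {
	fmt.Println(firstCauseWins())
}
```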
Reading the cancelCtx Source¶
Best way to internalise this material: open src/context/context.go and read these functions in order:
| Order | Function/Type | Lines (approx) |
|---|---|---|
| 1 | Context interface | 71-160 |
| 2 | emptyCtx + singletons | 181-225 |
| 3 | WithCancel + withCancel | 240-285 |
| 4 | parentCancelCtx | 382-398 |
| 5 | removeChild | 399-415 |
| 6 | canceler interface | 417-422 |
| 7 | closedchan + init | 423-428 |
| 8 | cancelCtx type | 431-442 |
| 9 | cancelCtx.Value | 441-446 |
| 10 | cancelCtx.Done | 448-461 |
| 11 | cancelCtx.Err | 463-473 |
| 12 | cancelCtx.propagateCancel | 475-528 |
| 13 | cancelCtx.cancel | 549-580 |
Line numbers drift between Go versions, but the order of definitions is stable. The cumulative reading time is about an hour.
Things to underline as you read:
- The three synchronisation primitives in cancelCtx (one mutex, two atomic.Value) and the two plain fields the mutex guards (one error, one map).
- The five branches in propagateCancel.
- The two-level check in parentCancelCtx: Value(&cancelCtxKey) plus the done channel cross-check.
- The closedchan substitution in cancel.
- The atomic-load-then-receive pattern in Err.
Next: senior.md — timerCtx, valueCtx, afterFuncCtx, withoutCancelCtx, and what Cause actually looks like underneath.