When to Use sync.Cond — Middle
Table of Contents
- What this file assumes
- The three rules of sync.Cond
- Why the wait must be a for loop
- Signal vs Broadcast
- A correct bounded queue
- When a channel is the better answer
- The cases channels cannot express
- Lost wakeups and the lock discipline
- Cancellation: Cond's weak spot
- Common middle-level mistakes
- Cheat sheet
- Self-assessment checklist
- Summary
- Further reading
What this file assumes
You can:
- Explain that sync.Cond lets goroutines wait for a condition guarded by a mutex.
- Write a Wait/Signal pair that compiles.
You will learn here:
- The three non-negotiable rules for using Cond correctly.
- Exactly when a channel replaces Cond and when it cannot.
- How lost wakeups happen and why Wait must sit in a for loop.
- Why Cond and cancellation don't mix, and what to do about it.
The three rules of sync.Cond
- Hold the lock around Wait and around the state changes that precede Signal or Broadcast. Wait atomically unlocks and blocks, then re-locks before it returns. You must hold L when you call it.
- Always re-check the condition in a for loop, never an if. Wait can return without the condition being true: Go's Wait returns only after a Signal or Broadcast, but another goroutine can change the state before the woken waiter re-acquires the lock (for example, multiple waiters racing for one item).
- Change the shared state under the lock, then signal. Signal/Broadcast may be called with or without the lock held, but the state mutation that the waiters are testing must be protected.
Why the wait must be a for loop
Consider two consumers on one queue and one producer that pushes a single item and calls Signal. One consumer is asleep in Wait; the other arrives just after the push and takes the item before the woken consumer re-acquires the lock. Without a re-check:
c.L.Lock()
if len(q) == 0 { // BUG: if, not for
	c.Wait()
}
item := q[0] // a woken consumer may find the queue empty again; this index then panics
q = q[1:]
c.L.Unlock()
Signal may wake one waiter, but by the time it re-acquires the lock another goroutine could have taken the item. The re-check in a for loop re-tests len(q) == 0 after re-locking and goes back to sleep if the item is gone. This is the single most common Cond bug.
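The corrected wait can be sketched as follows. This is a minimal illustration, not a library API: the queue type, its push and pop methods, and the consumeAll helper are hypothetical names introduced here. The only change that matters relative to the buggy snippet is that the predicate is tested in a for loop.

```go
package main

import (
	"fmt"
	"sync"
)

// queue is a hypothetical unbounded FIFO used only for this illustration.
type queue struct {
	mu    sync.Mutex
	cond  *sync.Cond
	items []int
}

func newQueue() *queue {
	q := &queue{}
	q.cond = sync.NewCond(&q.mu)
	return q
}

// push appends one item under the lock, then wakes one waiter.
func (q *queue) push(v int) {
	q.mu.Lock()
	q.items = append(q.items, v)
	q.mu.Unlock()
	q.cond.Signal()
}

// pop blocks until an item is available and removes it.
func (q *queue) pop() int {
	q.mu.Lock()
	defer q.mu.Unlock()
	for len(q.items) == 0 { // for, not if: re-test after every wakeup
		q.cond.Wait()
	}
	v := q.items[0]
	q.items = q.items[1:]
	return v
}

// consumeAll starts one consumer per item, pushes 1..consumers,
// and returns the sum of everything the consumers popped.
func consumeAll(consumers int) int {
	q := newQueue()
	results := make(chan int, consumers)
	for i := 0; i < consumers; i++ {
		go func() { results <- q.pop() }()
	}
	for i := 1; i <= consumers; i++ {
		q.push(i)
	}
	sum := 0
	for i := 0; i < consumers; i++ {
		sum += <-results
	}
	return sum
}

func main() {
	fmt.Println(consumeAll(2)) // prints 3 (1 + 2)
}
```

Whichever consumer loses the race for an item simply finds the queue empty on its re-check and waits again, so no consumer ever indexes an empty slice.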
Signal vs Broadcast
- Signal() wakes one waiting goroutine (if any). Use it when one state change satisfies exactly one waiter: for example, one item pushed, one consumer can proceed.
- Broadcast() wakes all waiters. Use it when a state change might satisfy many waiters, or when waiters wait on different conditions over the same mutex: for example, "configuration reloaded" should wake everyone so each re-checks its own predicate.
When in doubt, Broadcast is safe (everyone re-checks and most go back to sleep); Signal is an optimization you take only when you can prove exactly one waiter can make progress.
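A small sketch of the "different conditions over the same mutex" case, assuming a hypothetical counter type whose waiters each block until the count reaches their own threshold. The names counter, add, waitAtLeast, and runThresholds are invented for this example.

```go
package main

import (
	"fmt"
	"sync"
)

// counter is a hypothetical shared counter; each waiter has its own threshold.
type counter struct {
	mu   sync.Mutex
	cond *sync.Cond
	n    int
}

func newCounter() *counter {
	c := &counter{}
	c.cond = sync.NewCond(&c.mu)
	return c
}

// add changes the state under the lock, then wakes every waiter,
// because any number of thresholds may have just been reached.
func (c *counter) add(delta int) {
	c.mu.Lock()
	c.n += delta
	c.mu.Unlock()
	c.cond.Broadcast()
}

// waitAtLeast blocks until the counter is >= threshold and returns
// the value it observed while holding the lock.
func (c *counter) waitAtLeast(threshold int) int {
	c.mu.Lock()
	defer c.mu.Unlock()
	for c.n < threshold {
		c.cond.Wait()
	}
	return c.n
}

// runThresholds starts one waiter per threshold, increments the counter up
// to the highest threshold, and returns how many waiters were satisfied.
func runThresholds(thresholds []int) int {
	c := newCounter()
	satisfied := make(chan bool, len(thresholds))
	highest := 0
	for _, t := range thresholds {
		if t > highest {
			highest = t
		}
		go func(t int) {
			satisfied <- c.waitAtLeast(t) >= t
		}(t)
	}
	for i := 0; i < highest; i++ {
		c.add(1)
	}
	ok := 0
	for range thresholds {
		if <-satisfied {
			ok++
		}
	}
	return ok
}

func main() {
	fmt.Println(runThresholds([]int{1, 3, 5})) // prints 3
}
```

With Signal in place of Broadcast, an increment could wake only the waiter for threshold 5, which would go back to sleep while the waiter for threshold 1 stayed blocked even though its condition already held.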
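A correct bounded queue
// BoundedQueue is a FIFO with a fixed capacity.
// Both condition variables share the single mutex mu.
type BoundedQueue struct {
	mu       sync.Mutex
	notEmpty *sync.Cond // signaled when an item is added
	notFull  *sync.Cond // signaled when an item is removed
	items    []int
	capacity int
}

func New(capacity int) *BoundedQueue {
	q := &BoundedQueue{capacity: capacity}
	q.notEmpty = sync.NewCond(&q.mu)
	q.notFull = sync.NewCond(&q.mu)
	return q
}

// Push blocks while the queue is full.
func (q *BoundedQueue) Push(v int) {
	q.mu.Lock()
	defer q.mu.Unlock()
	for len(q.items) == q.capacity {
		q.notFull.Wait()
	}
	q.items = append(q.items, v)
	q.notEmpty.Signal() // one item added, so one consumer can proceed
}

// Pop blocks while the queue is empty.
func (q *BoundedQueue) Pop() int {
	q.mu.Lock()
	defer q.mu.Unlock()
	for len(q.items) == 0 {
		q.notEmpty.Wait()
	}
	v := q.items[0]
	q.items = q.items[1:]
	q.notFull.Signal() // one slot freed, so one producer can proceed
	return v
}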
Two condition variables share one mutex: one signals "space available", the other "item available". Each Wait is in a for loop. This is the canonical multi-condition Cond example.
When a channel is the better answer
The bounded queue above behaves like a buffered channel: create it with make(chan int, capacity), push with ch <- v, and pop with v := <-ch.
One line each, no Cond, no lock, and it integrates with select for timeouts and cancellation. If your wake condition is "a value was added" or "a value was removed", use a channel. The channel is a condition variable specialized for value handoff.
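A brief sketch of the channel form, including a pop that gives up when a context is done. The helper names popCtx and popWithin are hypothetical, chosen for this example only.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// popCtx receives one value from ch, or returns ctx.Err() once ctx is done.
func popCtx(ctx context.Context, ch <-chan int) (int, error) {
	select {
	case v := <-ch:
		return v, nil
	case <-ctx.Done():
		return 0, ctx.Err()
	}
}

// popWithin is a convenience wrapper that waits at most ms milliseconds.
func popWithin(ch <-chan int, ms int) (int, error) {
	ctx, cancel := context.WithTimeout(context.Background(), time.Duration(ms)*time.Millisecond)
	defer cancel()
	return popCtx(ctx, ch)
}

func main() {
	ch := make(chan int, 2) // the "bounded queue" with capacity 2

	ch <- 1   // push: blocks only while the buffer is full
	v := <-ch // pop: blocks only while the buffer is empty
	fmt.Println(v)

	// The queue is empty now, so this pop times out instead of blocking forever.
	_, err := popWithin(ch, 10)
	fmt.Println(err)
}
```

The same select shape extends to a cancellable push by adding a send case, which is the flexibility the Cond version lacks.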
The cases channels cannot express
sync.Cond earns its place when the wake condition is not a value transfer:
- "Wake all waiters when the configuration is reloaded." No value is handed to any specific waiter; everyone re-reads shared state.
- "Wake when a shared counter crosses a threshold" where multiple goroutines test different thresholds against the same state.
- "Resume all paused workers when a global pause flag clears."
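type Gate struct {
	mu   sync.Mutex
	cond *sync.Cond
	open bool
}

// NewGate is a constructor added here for completeness: cond must be
// created with the Gate's own mutex before any method is called.
func NewGate() *Gate {
	g := &Gate{}
	g.cond = sync.NewCond(&g.mu)
	return g
}

// WaitOpen blocks until the gate is open.
func (g *Gate) WaitOpen() {
	g.mu.Lock()
	for !g.open {
		g.cond.Wait()
	}
	g.mu.Unlock()
}

// Open sets the flag under the lock, then wakes every waiter.
func (g *Gate) Open() {
	g.mu.Lock()
	g.open = true
	g.mu.Unlock()
	g.cond.Broadcast() // wake everyone waiting on the gate
}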
A close(chan struct{}) can model a one-shot gate, but Cond handles a gate that opens and closes repeatedly, which a channel cannot (you can't re-open a closed channel).
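To illustrate the repeatable case, here is a sketch that extends the gate with a Close method and cycles it twice. Close, NewGate, and the cycle helper are assumptions added for this example and are not part of the Gate shown above.

```go
package main

import (
	"fmt"
	"sync"
)

// Gate can be opened and closed any number of times.
type Gate struct {
	mu   sync.Mutex
	cond *sync.Cond
	open bool
}

func NewGate() *Gate {
	g := &Gate{}
	g.cond = sync.NewCond(&g.mu)
	return g
}

// WaitOpen blocks until the gate is open.
func (g *Gate) WaitOpen() {
	g.mu.Lock()
	for !g.open {
		g.cond.Wait()
	}
	g.mu.Unlock()
}

// Open lets all current and future waiters through.
func (g *Gate) Open() {
	g.mu.Lock()
	g.open = true
	g.mu.Unlock()
	g.cond.Broadcast()
}

// Close makes later calls to WaitOpen block again. No wakeup is needed:
// nobody is waiting for the gate to become closed.
func (g *Gate) Close() {
	g.mu.Lock()
	g.open = false
	g.mu.Unlock()
}

// cycle runs two rounds of n workers through the same gate, closing it
// between rounds, and returns how many workers passed in total.
func cycle(n int) int {
	g := NewGate()
	var mu sync.Mutex // protects passed
	passed := 0
	for round := 0; round < 2; round++ {
		var wg sync.WaitGroup
		for i := 0; i < n; i++ {
			wg.Add(1)
			go func() {
				defer wg.Done()
				g.WaitOpen()
				mu.Lock()
				passed++
				mu.Unlock()
			}()
		}
		g.Open()
		wg.Wait() // every worker of this round is through
		g.Close() // reuse the same gate for the next round
	}
	return passed
}

func main() {
	fmt.Println(cycle(4)) // prints 8
}
```

One caveat of this simple design: a waiter that is woken by Open but re-acquires the lock only after a later Close sees the gate shut and sleeps again. The sketch avoids that by waiting for each round to finish before closing.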
Lost wakeups and the lock discipline
A lost wakeup happens when a Signal fires while no goroutine is yet in Wait, and the signal is simply discarded. Cond does not count signals. The defense is the lock discipline: because the waiter holds the lock while checking the condition and Wait releases it atomically, and the signaler changes state under the lock before signaling, a waiter cannot "miss" a state change — it either sees the new state on its for-check (and never sleeps) or is asleep and gets woken. Break the discipline (signal a state change without holding the lock during the mutation) and lost wakeups return.
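The discipline can be seen in a small sketch where the wakeup fires before anyone waits. The flag type and the two helper functions are hypothetical names for this illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// flag is a hypothetical one-bit condition guarded by a mutex.
type flag struct {
	mu    sync.Mutex
	cond  *sync.Cond
	ready bool
}

func newFlag() *flag {
	f := &flag{}
	f.cond = sync.NewCond(&f.mu)
	return f
}

// set records the state change under the lock, then wakes waiters.
func (f *flag) set() {
	f.mu.Lock()
	f.ready = true
	f.mu.Unlock()
	f.cond.Broadcast()
}

// wait tests the predicate under the lock before sleeping, so a state
// change that happened earlier is never missed.
func (f *flag) wait() {
	f.mu.Lock()
	for !f.ready {
		f.cond.Wait()
	}
	f.mu.Unlock()
}

// signalBeforeWait fires the Broadcast while nobody is waiting. The
// Broadcast itself is discarded, but wait still returns because the
// predicate is already true. A bare Wait with no predicate would hang here.
func signalBeforeWait() bool {
	f := newFlag()
	f.set()
	f.wait()
	return true
}

// signalConcurrently races set against wait. Either ordering terminates:
// the waiter sees ready on its check, or it is asleep and gets woken.
func signalConcurrently() bool {
	f := newFlag()
	go f.set()
	f.wait()
	return true
}

func main() {
	fmt.Println(signalBeforeWait(), signalConcurrently()) // prints true true
}
```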
Cancellation: Cond's weak spot
sync.Cond.Wait cannot be cancelled. There is no WaitContext. A goroutine blocked in Wait stays blocked until signaled — even if its context.Context is cancelled. Workarounds:
- Broadcast on shutdown. Set a closed flag under the lock, Broadcast, and have each waiter's for loop also test the flag and return.
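// Assumes BoundedQueue gains a `closed bool` field and the package declares
// var errClosed = errors.New("queue closed").
// Pop takes no context: Wait could not observe one anyway.
func (q *BoundedQueue) Pop() (int, error) {
	q.mu.Lock()
	defer q.mu.Unlock()
	for len(q.items) == 0 {
		if q.closed {
			return 0, errClosed // woken by Close with nothing left to drain
		}
		q.notEmpty.Wait()
	}
	// Same dequeue as the earlier Pop.
	v := q.items[0]
	q.items = q.items[1:]
	q.notFull.Signal()
	return v, nil
}

// Close marks the queue closed and wakes every waiter so each re-checks.
// Push needs the same closed test in its loop to react to notFull.Broadcast.
func (q *BoundedQueue) Close() {
	q.mu.Lock()
	q.closed = true
	q.mu.Unlock()
	q.notEmpty.Broadcast()
	q.notFull.Broadcast()
}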
- Prefer a channel when per-operation timeout/cancellation matters, because select { case v := <-ch: case <-ctx.Done(): } is built for it. This is a common reason experienced engineers avoid Cond entirely.
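The broadcast-on-shutdown workaround can be sketched end to end as follows. To stay short, this version is unbounded and uses a single condition variable, which is a simplification of the bounded queue above; ClosableQueue, NewClosableQueue, and closeWakesWaiters are hypothetical names, and errClosed is declared here as an assumption.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

var errClosed = errors.New("queue closed")

// ClosableQueue is an unbounded FIFO whose blocked Pop calls are
// released by Close.
type ClosableQueue struct {
	mu       sync.Mutex
	notEmpty *sync.Cond
	items    []int
	closed   bool
}

func NewClosableQueue() *ClosableQueue {
	q := &ClosableQueue{}
	q.notEmpty = sync.NewCond(&q.mu)
	return q
}

// Push rejects new items once the queue is closed.
func (q *ClosableQueue) Push(v int) error {
	q.mu.Lock()
	defer q.mu.Unlock()
	if q.closed {
		return errClosed
	}
	q.items = append(q.items, v)
	q.notEmpty.Signal()
	return nil
}

// Pop blocks until an item arrives or the queue is closed and drained.
func (q *ClosableQueue) Pop() (int, error) {
	q.mu.Lock()
	defer q.mu.Unlock()
	for len(q.items) == 0 {
		if q.closed {
			return 0, errClosed
		}
		q.notEmpty.Wait()
	}
	v := q.items[0]
	q.items = q.items[1:]
	return v, nil
}

// Close sets the flag under the lock, then wakes every waiter.
func (q *ClosableQueue) Close() {
	q.mu.Lock()
	q.closed = true
	q.mu.Unlock()
	q.notEmpty.Broadcast()
}

// closeWakesWaiters blocks n goroutines in Pop on an empty queue, closes
// the queue, and returns how many of them came back with errClosed.
func closeWakesWaiters(n int) int {
	q := NewClosableQueue()
	errs := make(chan error, n)
	for i := 0; i < n; i++ {
		go func() {
			_, err := q.Pop()
			errs <- err
		}()
	}
	q.Close()
	count := 0
	for i := 0; i < n; i++ {
		if <-errs == errClosed {
			count++
		}
	}
	return count
}

func main() {
	fmt.Println(closeWakesWaiters(3)) // prints 3
}
```

This gives shutdown for the whole queue, not a deadline for one call; a single caller still cannot abandon its own Pop early.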
Common middle-level mistakes
- if instead of for around Wait: the textbook lost-item bug.
- Calling Wait without holding L: Wait unlocks L first, so this either crashes on an unlock of an unlocked mutex or releases a lock that another goroutine holds.
- Mutating shared state without the lock, then signaling: reintroduces lost wakeups.
- Signal when multiple waiters could proceed but only one is woken and it can't make progress: use Broadcast.
- Expecting Wait to honor a context: it never does.
- Reaching for Cond when a buffered channel would do: adds lock plumbing for no gain.
Cheat sheet
| Situation | Use |
|---|---|
| Producer/consumer value handoff | buffered channel |
| One-shot "go" signal to many | close(chan struct{}) |
| Repeatable open/close gate | sync.Cond + Broadcast |
| Wake all on shared-state change | sync.Cond + Broadcast |
| Wake exactly one that can proceed | sync.Cond + Signal |
| Need per-wait timeout/cancel | channel + select, not Cond |
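For the one-shot row of the table, a minimal sketch of releasing many goroutines by closing a channel. The releaseAll helper is a hypothetical name for this example.

```go
package main

import (
	"fmt"
	"sync"
)

// releaseAll parks n goroutines on one channel, closes it once,
// and returns how many goroutines were released.
func releaseAll(n int) int {
	start := make(chan struct{})
	var wg sync.WaitGroup
	var mu sync.Mutex // protects released
	released := 0
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			<-start // a receive on a closed channel returns immediately
			mu.Lock()
			released++
			mu.Unlock()
		}()
	}
	close(start) // the one-shot "go" signal; it cannot be undone or repeated
	wg.Wait()
	return released
}

func main() {
	fmt.Println(releaseAll(5)) // prints 5
}
```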
Self-assessment checklist
- I always wrap Wait in a for loop and can explain why.
- I hold the lock around Wait and the state mutation.
- I can choose Signal vs Broadcast with a reason.
- I can rewrite a Cond queue as a channel and know when not to.
- I can name a wake condition that channels cannot express.
- I know Wait ignores context and how to add shutdown.
Summary
sync.Cond is for waiting on an arbitrary condition over shared state guarded by a mutex. Three rules keep it correct: hold the lock, re-check in a for loop, mutate-then-signal. Use Signal for one-waiter-can-proceed and Broadcast for everyone-re-checks. Most "wait for an item" problems are really channels in disguise — reach for Cond only when the wake condition is not a value handoff (repeatable gates, shared-state thresholds, config reloads). Its fatal weakness is cancellation: Wait ignores context, so when per-operation timeouts matter, a channel wins.
In senior.md we'll look at where Cond actually survives in production code, how to wrap it safely behind an API, and the measured cases where it beats both channels and busy-polling.
Further reading
- sync.Cond docs: https://pkg.go.dev/sync#Cond
- Bryan C. Mills, "Rethinking Classical Concurrency Patterns" (argues for channels over Cond)
- src/sync/cond.go: short and worth reading
- The Go Memory Model: https://go.dev/ref/mem