Channel Runtime Behaviour — Middle Level¶
Table of Contents¶
- Introduction
- The hchan Lock in Context
- chansend Walkthrough with State Diagrams
- chanrecv Walkthrough with State Diagrams
- Direct Hand-off in Detail
- Buffer Rotation on Full-Buffer Receive
- closechan and the Wake List
- Wait Queue Behaviour
- Parking Mechanics: gopark
- Waking Mechanics: goready
- Select Readiness Check
- Non-Blocking Variants
- Practical Implications
- Summary
Introduction¶
The junior page introduced the runtime functions and what they do at a glance. This page turns to the source code: we look at every step in chansend and chanrecv, walk through what changes in the hchan struct on each operation, and discuss the design choices that make the runtime predictable under contention.
The function names, lock points, and state transitions in this document are accurate against Go 1.22's runtime/chan.go. The pseudocode follows the source closely; the only liberties are simplification of error paths and tracing hooks.
The hchan Lock in Context¶
c.lock is a runtime.mutex, the runtime's internal mutex. Unlike sync.Mutex, it is not used through the sync package API; it is the lower-level futex/sema-based primitive declared in runtime/lock_futex.go (Linux) or lock_sema.go (others). Key properties:
- The lock word is a single uintptr key; on Linux the contended path uses a futex syscall, on macOS and most other platforms a semaphore.
- Acquisition is uncontended-fast (one CAS) and contended-slow (a futex or semaphore syscall).
- The lock is held for at most the duration of one channel op — bounded by a few hundred nanoseconds in the no-park case.
- It is released before goready is called, so wakeups happen outside the critical section.
The lock is the only synchronisation inside the channel. Everything in hchan — the buffer slots, the queue pointers, the closed flag — is protected exclusively by it. There is no double-checked locking, no per-slot atomic, no "fast path without lock." If you see a stack trace in runtime.lock2, that goroutine is queuing for c.lock.
The shortness of the critical section is why channels scale reasonably. But it is also why channels do not scale infinitely: a million senders contending on one channel will queue on c.lock no matter how fancy your hardware is.
chansend Walkthrough with State Diagrams¶
Annotated source¶
The actual signature is:
// chansend implements `c <- ep`.
// If block is false (set by select with default), and the send cannot
// complete immediately, it returns false without blocking.
func chansend(c *hchan, ep unsafe.Pointer, block bool, callerpc uintptr) bool {
The body, with annotations:
// (A) Nil channel: block forever, or fast-fail.
if c == nil {
if !block {
return false
}
gopark(nil, nil, waitReasonChanSendNilChan, traceBlockForever, 2)
throw("unreachable")
}
// (B) Non-blocking fast check (select default).
// If the channel is open AND we cannot send (no receiver, no room),
// bail without locking.
if !block && c.closed == 0 && full(c) {
return false
}
// (C) Acquire the lock.
lock(&c.lock)
// (D) Closed check.
if c.closed != 0 {
unlock(&c.lock)
panic(plainError("send on closed channel"))
}
// (E) Receiver waiting? Direct hand-off.
if sg := c.recvq.dequeue(); sg != nil {
send(c, sg, ep, func() { unlock(&c.lock) }, 3)
return true
}
// (F) Buffer has room? Copy in.
if c.qcount < c.dataqsiz {
qp := chanbuf(c, c.sendx)
typedmemmove(c.elemtype, qp, ep)
c.sendx++
if c.sendx == c.dataqsiz {
c.sendx = 0
}
c.qcount++
unlock(&c.lock)
return true
}
// (G) Non-blocking and could not send: bail.
if !block {
unlock(&c.lock)
return false
}
// (H) Park ourselves.
gp := getg()
mysg := acquireSudog()
mysg.elem = ep
mysg.g = gp
mysg.c = c
mysg.waitlink = nil
gp.waiting = mysg
c.sendq.enqueue(mysg)
atomic.Store8(&gp.parkingOnChan, 1)
gopark(chanparkcommit, unsafe.Pointer(&c.lock), waitReasonChanSend, traceBlockChanSend, 2)
// (I) Woken up. Either the receiver took our value, or the channel closed.
// If closed, mysg.success is false → panic.
KeepAlive(ep)
if mysg.elem != nil {
throw("chansend: bad sudog elem")
}
mysg.c = nil
closed := !mysg.success
gp.param = nil
releaseSudog(mysg)
if closed {
if c.closed == 0 {
throw("chansend: spurious wakeup")
}
panic(plainError("send on closed channel"))
}
return true
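The closed-channel check in branch (D) is directly observable from user code. A minimal sketch (the helper name sendOnClosed is ours, not from the runtime):

```go
package main

import "fmt"

// sendOnClosed attempts a send on a closed channel and reports whether
// the runtime panicked, mirroring branch (D) of chansend.
func sendOnClosed() (panicked bool) {
	defer func() {
		if r := recover(); r != nil {
			panicked = true
			fmt.Println("recovered:", r)
		}
	}()
	ch := make(chan int, 1)
	close(ch)
	ch <- 42 // chansend sees c.closed != 0, unlocks, and panics
	return false
}

func main() {
	fmt.Println("panicked:", sendOnClosed()) // panicked: true
}
```

Note the panic fires even though the buffer has room: the closed check in (D) comes before the buffer check in (F).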
State diagram: typical send paths¶
send(c, v)
|
+-------------+---------------+
| | |
v v v
[nil channel] [open channel] [closed channel]
| | |
block forever v panic
+----+---+
| |
recvq ready no waiter
| |
direct hand-off v
qcount < cap?
/ \
yes no
/ \
copy to buffer block?
/ \
yes no
/ \
park on return false
sendq (select default)
What changes inside hchan per branch¶
| Branch | qcount | sendx | recvq | sendq | closed |
|---|---|---|---|---|---|
| Direct hand-off | unchanged | unchanged | head removed | unchanged | unchanged |
| Buffer copy | +1 | +1 mod dataqsiz | unchanged | unchanged | unchanged |
| Park on sendq | unchanged | unchanged | unchanged | tail appended | unchanged |
| Closed panic | unchanged | unchanged | unchanged | unchanged | unchanged |
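The branches in the table map onto user-visible behaviour of a non-blocking send. A sketch, assuming the usual select-with-default lowering (trySend is our name for what the compiler emits, not a runtime function):

```go
package main

import "fmt"

// trySend is what a select with a send case and a default compiles to:
// chansend with block=false.
func trySend(ch chan int, v int) bool {
	select {
	case ch <- v:
		return true // branch (E) or (F): waiting receiver or buffer room
	default:
		return false // branch (B)/(G): would block, bail out
	}
}

func main() {
	ch := make(chan int, 2)
	fmt.Println(trySend(ch, 1)) // true: buffer copy, qcount 0→1
	fmt.Println(trySend(ch, 2)) // true: buffer copy, qcount 1→2
	fmt.Println(trySend(ch, 3)) // false: qcount == dataqsiz and no receiver
}
```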
chanrecv Walkthrough with State Diagrams¶
Annotated source¶
func chanrecv(c *hchan, ep unsafe.Pointer, block bool) (selected, received bool) {
// (A) Nil channel: block forever or fast-fail.
if c == nil {
if !block {
return
}
gopark(nil, nil, waitReasonChanReceiveNilChan, traceBlockForever, 2)
throw("unreachable")
}
// (B) Non-blocking fast check.
if !block && empty(c) {
if atomic.Load(&c.closed) == 0 {
return false, false
}
if empty(c) {
if ep != nil {
typedmemclr(c.elemtype, ep)
}
return true, false
}
}
// (C) Lock.
lock(&c.lock)
// (D) Closed and empty: return zero.
if c.closed != 0 && c.qcount == 0 {
unlock(&c.lock)
if ep != nil {
typedmemclr(c.elemtype, ep)
}
return true, false
}
// (E) Sender waiting? Two sub-cases below.
if sg := c.sendq.dequeue(); sg != nil {
recv(c, sg, ep, func() { unlock(&c.lock) }, 3)
return true, true
}
// (F) Buffer non-empty.
if c.qcount > 0 {
qp := chanbuf(c, c.recvx)
if ep != nil {
typedmemmove(c.elemtype, ep, qp)
}
typedmemclr(c.elemtype, qp)
c.recvx++
if c.recvx == c.dataqsiz {
c.recvx = 0
}
c.qcount--
unlock(&c.lock)
return true, true
}
if !block {
unlock(&c.lock)
return false, false
}
// (G) Park on recvq.
gp := getg()
mysg := acquireSudog()
mysg.elem = ep
mysg.g = gp
mysg.c = c
gp.waiting = mysg
c.recvq.enqueue(mysg)
gopark(chanparkcommit, unsafe.Pointer(&c.lock), waitReasonChanReceive, traceBlockChanRecv, 2)
// (H) Woken.
success := mysg.success
gp.param = nil
mysg.c = nil
releaseSudog(mysg)
return true, success
}
State diagram¶
v, ok := <-c
|
+-------------+---------------+
| | |
v v v
[nil channel] [open channel] [closed channel]
| | |
block forever v qcount>0?
+----+---+ / \
| | yes no
v v | |
sendq ready no waiter v v
| | drain return (true,false)
direct hand-off v buffer
or buffer rotate qcount>0?
/ \
yes no
| |
drain buffer block?
/ \
yes no
| |
park on return (false,false)
recvq
What changes inside hchan per branch¶
| Branch | qcount | recvx | recvq | sendq | closed |
|---|---|---|---|---|---|
| Direct hand-off (unbuffered) | unchanged | unchanged | unchanged | head removed | unchanged |
| Buffer rotate (full buffer + parked sender) | unchanged | +1 mod cap | unchanged | head removed | unchanged |
| Buffer drain | −1 | +1 mod cap | unchanged | unchanged | unchanged |
| Park on recvq | unchanged | unchanged | tail appended | unchanged | unchanged |
| Closed-and-empty | unchanged | unchanged | unchanged | unchanged | unchanged |
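The buffer-drain and closed-and-empty branches combine into the familiar "drain then zero value" behaviour after close. A small sketch (drainClosed is an illustrative helper of ours):

```go
package main

import "fmt"

// drainClosed exercises two chanrecv branches on a closed channel:
// buffered values drain first (branch F), then closed-and-empty
// returns the zero value with ok=false (branch D).
func drainClosed() (vals []int, lastOK bool) {
	ch := make(chan int, 2)
	ch <- 10
	ch <- 20
	close(ch)
	for i := 0; i < 3; i++ {
		v, ok := <-ch
		vals = append(vals, v)
		lastOK = ok
	}
	return
}

func main() {
	vals, ok := drainClosed()
	fmt.Println(vals, ok) // [10 20 0] false
}
```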
Direct Hand-off in Detail¶
The send helper is the implementation of direct hand-off. It is called from chansend when a parked receiver is found (recv, covered in the next section, is its mirror image in chanrecv when a sender is parked).
// send processes a send operation on an empty channel c.
// The value ep sent by the sender is copied to the receiver sg.
// The receiver is then woken up to go on its merry way.
func send(c *hchan, sg *sudog, ep unsafe.Pointer, unlockf func(), skip int) {
if sg.elem != nil {
sendDirect(c.elemtype, sg, ep)
sg.elem = nil
}
gp := sg.g
unlockf()
gp.param = unsafe.Pointer(sg)
sg.success = true
goready(gp, skip+1)
}
sendDirect does a memmove from ep (the sender's stack) into sg.elem (the receiver's stack address). The runtime has special handling here: the receiver is parked, and stack moves are synchronised against channel operations, so the receiver's ep pointer is a stable target while the sender holds the lock. Before the memmove, sendDirect issues the necessary write barriers (typeBitsBulkBarrier) if the element type contains pointers, because the GC needs to know about the cross-stack write.
Why direct hand-off is fast¶
Without direct hand-off, an unbuffered rendezvous would require:
- Sender locks c.lock.
- Sender sees no receiver, no buffer → parks.
- Receiver later locks c.lock.
- Receiver sees the parked sender → takes the value from the sender's sudog.elem into the receiver's ep.
- Receiver wakes the sender.
That is two lock-unlock pairs and an extra copy through the sudog. With direct hand-off, the receiver is already parked, and the sender writes directly into the receiver's ep:
- Sender locks c.lock.
- Sender sees the parked receiver → typedmemmove(elemtype, sg.elem, ep).
- Sender unlocks and goreadys the receiver.
Saving one lock-unlock pair and one copy is a non-trivial fraction of the operation's total cost.
The cross-stack write¶
sg.elem points into the receiver's stack frame (where the receive expression's destination is). The sender writes into that address from its own stack. This is unusual because Go's garbage collector usually relies on goroutine stacks being private. The trick: while a goroutine is parked on a channel, the runtime will not move its stack without synchronising with channel operations (the activeStackChans flag forces stack copying to take the channel lock), so the address is stable for the duration of the write. A write barrier is issued if elemtype contains pointers, so the GC sees the write.
Buffer Rotation on Full-Buffer Receive¶
The "sender parked + buffer full" case is subtler than direct hand-off. The receiver wants to drain a value; the sender wants to add a value; the buffer is full. The runtime does this:
func recv(c *hchan, sg *sudog, ep unsafe.Pointer, unlockf func(), skip int) {
if c.dataqsiz == 0 {
// Unbuffered: direct copy from sender's ep into receiver's ep.
if ep != nil {
recvDirect(c.elemtype, sg, ep)
}
} else {
// Buffered, full: take from head of buffer; put sender's value at tail.
qp := chanbuf(c, c.recvx)
if ep != nil {
typedmemmove(c.elemtype, ep, qp)
}
typedmemmove(c.elemtype, qp, sg.elem)
c.recvx++
if c.recvx == c.dataqsiz {
c.recvx = 0
}
c.sendx = c.recvx // assignment: sendx tracks recvx in the full-buffer case
}
sg.elem = nil
gp := sg.g
unlockf()
gp.param = unsafe.Pointer(sg)
sg.success = true
goready(gp, skip+1)
}
This is a clever optimisation. The buffer is full, so sendx == recvx (think of the ring: head and tail meet). The receiver:
- Reads buffer[recvx] into its ep.
- Writes the sender's value into the same slot.
- Advances recvx.
- Sets sendx = recvx (the slot just written is now the new tail).
Net effect: the buffer stays full, but the receiver got one value out and the parked sender got their value in. No copy through sudog.elem to the buffer separately — the buffer slot does double duty.
If a queue of senders is parked, each subsequent receive does the same rotation, walking through senders FIFO.
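The rotation is invisible to user code except through ordering: even with a full buffer and parked senders, values come out strictly FIFO. A sketch (rotation is our helper name):

```go
package main

import "fmt"

// rotation fills a buffer, parks a third sender, then receives.
// The full-buffer receive rotates the parked sender's value into the
// freed slot, so FIFO order across buffer and sendq is preserved.
func rotation() []int {
	ch := make(chan int, 2)
	ch <- 1
	ch <- 2 // buffer now full: sendx == recvx
	done := make(chan struct{})
	go func() {
		ch <- 3 // parks on sendq until a receiver makes room
		close(done)
	}()
	got := []int{<-ch} // takes 1 from the head; 3 lands at the tail
	<-done             // the sender has been woken by goready
	got = append(got, <-ch, <-ch)
	return got
}

func main() {
	fmt.Println(rotation()) // [1 2 3]
}
```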
closechan and the Wake List¶
Source¶
func closechan(c *hchan) {
if c == nil {
panic(plainError("close of nil channel"))
}
lock(&c.lock)
if c.closed != 0 {
unlock(&c.lock)
panic(plainError("close of closed channel"))
}
c.closed = 1
var glist gList
// Drain recvq.
for {
sg := c.recvq.dequeue()
if sg == nil {
break
}
if sg.elem != nil {
typedmemclr(c.elemtype, sg.elem)
sg.elem = nil
}
if sg.releasetime != 0 {
sg.releasetime = cputicks()
}
gp := sg.g
gp.param = unsafe.Pointer(sg)
sg.success = false
glist.push(gp)
}
// Drain sendq.
for {
sg := c.sendq.dequeue()
if sg == nil {
break
}
sg.elem = nil
if sg.releasetime != 0 {
sg.releasetime = cputicks()
}
gp := sg.g
gp.param = unsafe.Pointer(sg)
sg.success = false
glist.push(gp)
}
unlock(&c.lock)
// Wake each goroutine.
for !glist.empty() {
gp := glist.pop()
gp.schedlink = 0
goready(gp, 3)
}
}
Why two passes?¶
Drain inside the lock, wake outside the lock. If goready were called inside the critical section, it could cause cascading scheduling work while still holding c.lock, slowing other channel ops. By collecting goroutines into glist and releasing the lock first, the wake-up cost does not block other operations on the channel — even though, at this point, the channel is closed and no further operations on it make sense.
What woken senders do¶
A parked sender, on wake-up, returns from gopark inside chansend. The post-park code reads mysg.success. closechan set it to false. The sender then panics with "send on closed channel."
So: closing a channel that has parked senders causes deferred panics in the sending goroutines. This is not a single panic in the closing goroutine — it is N panics, one per sender, after they wake. If those panics are unrecoverable (no defer recover), they crash the program.
What woken receivers do¶
A parked receiver returns from gopark inside chanrecv. mysg.success is false. The receiver returns (true, false) — the standard "channel closed" return — without panicking.
Receivers always handle close gracefully. Senders never do.
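The asymmetry is observable: a sender parked at the moment of close panics in its own goroutine, after it wakes. A sketch under that scenario (closeWithParkedSender is our name; the recovered message is the runtime's):

```go
package main

import "fmt"

// closeWithParkedSender parks a sender on an unbuffered channel, then
// closes the channel. closechan drains the sendq with success=false,
// and the woken sender panics with "send on closed channel".
func closeWithParkedSender() string {
	ch := make(chan int)
	started := make(chan struct{})
	result := make(chan string, 1)
	go func() {
		defer func() {
			result <- fmt.Sprint(recover())
		}()
		close(started)
		ch <- 1 // parks on sendq (or fails fast if close already ran)
	}()
	<-started
	close(ch) // wakes the parked sender with mysg.success = false
	return <-result
}

func main() {
	fmt.Println(closeWithParkedSender()) // send on closed channel
}
```

Either interleaving (sender parks first, or close wins the race) ends in the same panic, which is why close-with-active-senders is a protocol bug rather than a timing bug.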
Wait Queue Behaviour¶
waitq is the doubly-linked list of sudog used for both recvq and sendq:
type waitq struct {
first *sudog
last *sudog
}
func (q *waitq) enqueue(sgp *sudog) {
sgp.next = nil
x := q.last
if x == nil {
sgp.prev = nil
q.first = sgp
q.last = sgp
return
}
sgp.prev = x
x.next = sgp
q.last = sgp
}
func (q *waitq) dequeue() *sudog {
for {
sgp := q.first
if sgp == nil {
return nil
}
y := sgp.next
if y == nil {
q.first = nil
q.last = nil
} else {
y.prev = nil
q.first = y
sgp.next = nil
}
// Filter out sudogs that were already woken by select.
if sgp.isSelect && !sgp.g.selectDone.CompareAndSwap(0, 1) {
continue
}
return sgp
}
}
Two notable details:
- Strict FIFO order. The runtime does not allow "barging" in the wait queue itself — but barging can happen, because a new send/recv that arrives while waiters are still parked may complete its operation under the lock and only then check the queue. By the time the queue is checked, the new arrival has already won.
- isSelect filtering. A sudog from a select may be on multiple channels' queues. When one channel wakes the goroutine, that goroutine's selectDone is CAS'd. The dequeue function on other channels will skip already-claimed sudogs.
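The selectDone CAS means exactly one channel claims a selecting goroutine, no matter how many of its channels become ready. From user code this shows up as "one case fires, the other value stays put" (singleWinner is our illustrative helper):

```go
package main

import "fmt"

// singleWinner blocks a goroutine in a select over two channels, then
// makes both ready. Exactly one case runs; the other channel's value
// is left unconsumed, so the buffered counts sum to 1.
func singleWinner() int {
	a := make(chan int, 1)
	b := make(chan int, 1)
	fired := make(chan struct{}, 1)
	go func() {
		select {
		case <-a:
		case <-b:
		}
		fired <- struct{}{}
	}()
	a <- 1
	b <- 1
	<-fired                // exactly one case consumed a value
	return len(a) + len(b) // the other value is still buffered
}

func main() {
	fmt.Println(singleWinner()) // 1
}
```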
Parking Mechanics: gopark¶
gopark puts the current goroutine to sleep:
func gopark(unlockf func(*g, unsafe.Pointer) bool, lock unsafe.Pointer, reason waitReason, traceEv byte, traceskip int) {
mp := acquirem()
gp := mp.curg
status := readgstatus(gp)
if status != _Grunning && status != _Gscanrunning {
throw("gopark: bad g status")
}
mp.waitlock = lock
mp.waitunlockf = unlockf
gp.waitreason = reason
mp.waittraceev = traceEv
mp.waittraceskip = traceskip
releasem(mp)
mcall(park_m)
}
mcall(park_m) switches to the M's g0 (system stack), then runs park_m:
func park_m(gp *g) {
mp := getg().m
casgstatus(gp, _Grunning, _Gwaiting)
dropg()
if fn := mp.waitunlockf; fn != nil {
ok := fn(gp, mp.waitlock)
mp.waitunlockf = nil
mp.waitlock = nil
if !ok {
// unlock failed; resurrect the goroutine.
casgstatus(gp, _Gwaiting, _Grunnable)
execute(gp, true)
}
}
schedule()
}
Steps:
- Goroutine status → _Gwaiting.
- The M drops the G (no longer associated).
- The unlock function is called — for channel ops this is chanparkcommit, which calls unlock(&c.lock).
- schedule() finds another G to run.
chanparkcommit:
func chanparkcommit(gp *g, chanLock unsafe.Pointer) bool {
gp.activeStackChans = true
atomic.Store8(&gp.parkingOnChan, 0)
unlock((*mutex)(chanLock))
return true
}
The key trick: the channel lock is released as part of parking, after the goroutine's status has switched to _Gwaiting. The wake-side must hold c.lock to dequeue the sudog, so it cannot act on the sudog until the parker has fully gone to sleep. Without this dance, the waker could see a sudog whose goroutine status is still _Grunning, and goready on a running goroutine is a fatal error.
Waking Mechanics: goready¶
func goready(gp *g, traceskip int) {
systemstack(func() {
ready(gp, traceskip, true)
})
}
func ready(gp *g, traceskip int, next bool) {
status := readgstatus(gp)
casgstatus(gp, _Gwaiting, _Grunnable)
runqput(getg().m.p.ptr(), gp, next)
wakep()
}
The receiver/closer calls goready after the parker has already released the channel lock. The flow:
- Goroutine status: _Gwaiting → _Grunnable.
- runqput puts the goroutine on the current P's local runqueue (at the head if next is true, at the tail otherwise).
- wakep wakes an idle P to run the goroutine if no P is currently spinning.
The next parameter is true for channel wakeups — Go preferentially runs the woken goroutine soon, on the assumption that the just-completed channel op is part of a hot path.
Select Readiness Check¶
selectgo (in runtime/select.go) handles select:
func selectgo(cas0 *scase, order0 *uint16, pc0 *uintptr, nsends, nrecvs int, block bool) (int, bool) {
ncases := nsends + nrecvs
scases := cas0[:ncases:ncases]
pollorder := order0[:ncases:ncases] // permutation
lockorder := order0[ncases:][:ncases:ncases] // sorted by lock address
// 1. Generate pollorder via Fisher-Yates shuffle.
norder := 0
for i := range pollorder {
cas := &scases[i]
if cas.c == nil {
cas.elem = nil
continue
}
j := fastrandn(uint32(norder + 1))
pollorder[norder] = pollorder[j]
pollorder[j] = uint16(i)
norder++
}
pollorder = pollorder[:norder]
// 2. Sort lockorder by channel address.
// Heap-sort, in place, using channel pointer as key.
...
// 3. Lock all channels (in lockorder).
sellock(scases, lockorder)
// 4. Pass 1: find any case that is immediately ready.
var casi int
var cas *scase
var caseSuccess bool
var recvOK bool
var sg *sudog
loop:
for _, casei := range pollorder {
casi = int(casei)
cas = &scases[casi]
c := cas.c
if casi >= nsends {
// Receive case.
sg = c.sendq.dequeue()
if sg != nil {
goto recv
}
if c.qcount > 0 {
goto bufrecv
}
if c.closed != 0 {
goto rclose
}
} else {
// Send case.
if c.closed != 0 {
goto sclose
}
sg = c.recvq.dequeue()
if sg != nil {
goto send
}
if c.qcount < c.dataqsiz {
goto bufsend
}
}
}
// 5. None ready. If default, unlock all and take default.
if !block {
selunlock(scases, lockorder)
casi = -1
goto retc
}
// 6. Park on every channel.
...
gopark(selparkcommit, nil, waitReasonSelect, traceBlockSelect, 1)
// 7. Woken. Find which channel fired. Dequeue from the rest.
...
goto retc
The big picture:
- One linear scan in the shuffled pollorder picks the first ready case.
- If no case is ready and there is no default, the goroutine parks itself on every channel in the select.
- When one channel wakes the goroutine, the runtime walks the other channels and removes the goroutine's sudog from their wait queues.
Lock acquisition order: deadlock avoidance¶
lockorder is sorted by hchan pointer address. All select statements in the program agree on this ordering. So if two selects share two channels — say A and B — they both lock min(addr(A), addr(B)) first. No cycle in the lock graph, and hence no lock-ordering deadlock, is possible.
This is a classic lock-hierarchy trick. The cost is a sort (heap-sort, O(n log n)) per select. Selects with many cases pay this cost, but n is usually small.
Why shuffle?¶
pollorder is shuffled to avoid bias. If we always checked source-order, a select with many ready cases would always take the first one. With shuffle, ready cases are chosen with equal probability — this prevents one channel from monopolising a worker that selects across many channels.
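The shuffle is visible as the language-level guarantee that a select among ready cases picks pseudo-randomly. A quick fairness check (pollOrderFairness is our helper; over 10,000 rounds the odds of one case never winning are negligible):

```go
package main

import "fmt"

// pollOrderFairness runs a select where both cases are always ready.
// The shuffled pollorder means each case wins sometimes; a fixed
// source-order scan would pick case a every single time.
func pollOrderFairness(iters int) (aWins, bWins int) {
	a := make(chan int, 1)
	b := make(chan int, 1)
	for i := 0; i < iters; i++ {
		a <- 1
		b <- 1
		select {
		case <-a:
			aWins++
			<-b // drain the other for the next round
		case <-b:
			bWins++
			<-a
		}
	}
	return
}

func main() {
	a, b := pollOrderFairness(10000)
	fmt.Println(a > 0, b > 0) // true true
}
```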
Non-Blocking Variants¶
chansend and chanrecv accept a block parameter:
chansend(c, &v, true, callerpc) // ch <- v (blocking)
chansend(c, &v, false, callerpc) // select { case ch <- v: ... default: }
chanrecv(c, &v, true) // <-ch (blocking)
chanrecv(c, &v, false) // select { case v = <-ch: ... default: }
When block=false:
- The fast-path "non-blocking pre-check" (step B above) bails without locking if the channel is open and there is no work to do.
- After taking the lock, if the operation cannot complete immediately, the function returns false instead of parking.
The fast non-blocking check is a read-without-lock. It can race against another goroutine that changes c.qcount or c.recvq. But: races that miss a ready operation are acceptable for non-blocking — at worst, the default case fires when a true blocking op would have succeeded. The runtime accepts this. It does not accept missing a closed flag, which is why the closed check requires the atomic load.
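The closed-flag subtlety has a user-visible consequence: a non-blocking receive on a closed channel does not take the default — it completes with the zero value. A sketch (tryRecv is our name for the lowered form):

```go
package main

import "fmt"

// tryRecv is what a select with one receive case and a default
// compiles to: chanrecv with block=false.
func tryRecv(ch chan int) (v int, ok, ready bool) {
	select {
	case v, ok = <-ch:
		return v, ok, true
	default:
		return 0, false, false
	}
}

func main() {
	ch := make(chan int, 1)
	fmt.Println(tryRecv(ch)) // 0 false false: empty and open → default fires
	close(ch)
	fmt.Println(tryRecv(ch)) // 0 false true: closed → case fires with ok=false
}
```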
Practical Implications¶
Why select cases are not free¶
Each case in a select is checked under a lock on its channel. A select with 8 cases on 8 different channels acquires 8 locks. That is 8x the lock cost of a single channel op. For very wide selects, this is measurable.
Why a small buffer is sometimes worse than no buffer¶
An unbuffered channel between two ready goroutines uses direct hand-off and no buffer copy. A buffered channel (capacity 1) between the same two ready goroutines uses one buffer copy plus one drain — twice the data movement, in the worst case.
If the producer is consistently slightly faster than the consumer, a small buffer absorbs the difference. If they run in lockstep, an unbuffered channel is faster.
Why close is a fanout¶
Closing a channel with N parked receivers wakes all N. Each wake is a goready, which puts the goroutine on a runqueue and may invoke wakep. If you close a channel with thousands of receivers, the closing goroutine is responsible for that whole avalanche — and it happens before close returns.
This is fine for "done" channels with a handful of receivers. It is not fine for "fan-out to a thousand workers and close" patterns where you want bounded wake-up time.
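The fanout itself is easy to provoke: one close wakes every parked receiver before close returns. A sketch (closeFanout and the channel names are ours):

```go
package main

import (
	"fmt"
	"sync"
)

// closeFanout parks n receivers on one channel, then closes it.
// closechan drains the entire recvq into a wake list and goreadys
// every goroutine on it.
func closeFanout(n int) int {
	done := make(chan struct{})
	woken := make(chan int, n)
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			<-done // parks on recvq (or observes closed immediately)
			woken <- id
		}(i)
	}
	close(done) // one close pays for all n wakeups
	wg.Wait()
	return len(woken)
}

func main() {
	fmt.Println(closeFanout(1000)) // 1000
}
```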
Why direct hand-off helps latency but not throughput¶
Direct hand-off saves one copy and one wake. For a single rendezvous, that is a real win. For high-throughput pipelines, the bottleneck is c.lock contention, which direct hand-off does not address. To scale, you shard channels or batch sends.
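Sharding is the standard escape hatch for c.lock contention. A minimal sketch under invented names (shardedSend and the shard/merge structure are illustrative, not a library pattern from the runtime); note that ordering across shards is not preserved:

```go
package main

import (
	"fmt"
	"sync"
)

// shardedSend spreads nItems sends across nShards channels, so senders
// contend on nShards separate c.lock instances instead of one.
func shardedSend(nItems, nShards int) int {
	shards := make([]chan int, nShards)
	for i := range shards {
		shards[i] = make(chan int, 64)
	}
	var senders sync.WaitGroup
	for i := 0; i < nItems; i++ {
		senders.Add(1)
		go func(v int) {
			defer senders.Done()
			shards[v%nShards] <- v // each shard has its own lock
		}(i)
	}
	go func() {
		senders.Wait()
		for _, s := range shards {
			close(s)
		}
	}()
	// Merge: one drainer per shard, counting arrivals.
	counts := make(chan int, nShards)
	for _, s := range shards {
		go func(s chan int) {
			n := 0
			for range s {
				n++
			}
			counts <- n
		}(s)
	}
	total := 0
	for i := 0; i < nShards; i++ {
		total += <-counts
	}
	return total
}

func main() {
	fmt.Println(shardedSend(1000, 8)) // 1000
}
```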
Summary¶
At middle level, the channel runtime is a small state machine: lock, decide based on closed/queues/buffer, act, unlock. The actions are:
- Hand off directly to a parked partner.
- Use the ring buffer.
- Park on a wait queue.
- (For close) drain both queues into a wake-list and wake everyone after unlocking.
The non-obvious bits are: direct hand-off across stacks, full-buffer rotation, the selectgo lock-ordering trick, and the dance between gopark and chanparkcommit to release the channel lock at exactly the right moment.
You now have enough of the model to read runtime/chan.go and runtime/select.go straight through. The senior level adds the sudog allocator, memory-model details, and scheduler interactions.