
Goroutine Common Pitfalls — Find the Bug

The heaviest file in this subsection. Thirty broken programs across the full pitfall catalogue, from beginner-level to staff-level subtle. For each: read carefully, predict the symptom (panic, race, leak, deadlock, wrong output, fatal error, performance regression), identify the root cause, and sketch the fix before reading the solution.

If you have read the rest of this subsection — and especially the junior.md catalogue — every bug here should be recognisable in family, even if not at first glance.


How to use this file

  1. Read the snippet to the end before forming an opinion. Many bugs hide in line N+3, not where your eye lands.
  2. State the symptom: what does running this program produce? Panic? Race? Hang? Wrong output? Fatal error?
  3. State the cause: name the pitfall family — captured variable, missing cancel, wrong-closer, atomic mixing, etc.
  4. Sketch the fix: not just "add a mutex," but specifically which mutex around which code.
  5. Read the solution and compare. The solution explains why your fix works and what alternative fixes look like.

Easy

Bug 1 — Captured i in a worker spawn

package main

import (
    "fmt"
    "sync"
)

func main() {
    var wg sync.WaitGroup
    for i := 0; i < 5; i++ {
        wg.Add(1)
        go func() {
            defer wg.Done()
            fmt.Println(i)
        }()
    }
    wg.Wait()
}

Observation. On Go 1.21, output is often 5 5 5 5 5. On Go 1.22+, some permutation of 0 1 2 3 4.

Find the bug.


Bug 2 — wg.Add(1) after go

var wg sync.WaitGroup
for i := 0; i < 100; i++ {
    go func() {
        wg.Add(1)
        defer wg.Done()
        doWork()
    }()
}
wg.Wait()
fmt.Println("done")

Observation. "done" prints almost immediately, before most goroutines finish.

Find the bug.


Bug 3 — Sleep instead of Wait

go heavyWork()
time.Sleep(time.Second)
fmt.Println("hopefully done")

Observation. On the developer's laptop, prints after work completes. On a busy CI runner, prints before.

Find the bug.


Bug 4 — Panic in a worker

func startWorker(jobs <-chan Job) {
    go func() {
        for j := range jobs {
            process(j)  // can panic on malformed Job
        }
    }()
}

Observation. A single bad input crashes the whole service.

Find the bug.


Bug 5 — Unread result channel

func compute(x int) int {
    ch := make(chan int)
    go func() {
        ch <- expensive(x)
    }()
    if cached, ok := cache[x]; ok {
        return cached
    }
    return <-ch
}

Observation. runtime.NumGoroutine() grows over time.

Find the bug.


Bug 6 — for range without close

ch := make(chan int)
go func() {
    for v := range ch {
        process(v)
    }
}()
for i := 0; i < 10; i++ {
    ch <- i
}
// program continues, never closes ch

Observation. Receiver goroutine leaks.

Find the bug.


Bug 7 — Concurrent map write

m := make(map[int]int)
var wg sync.WaitGroup
for i := 0; i < 100; i++ {
    wg.Add(1)
    go func(i int) {
        defer wg.Done()
        m[i] = i * 2
    }(i)
}
wg.Wait()

Observation. Sometimes runs fine; sometimes crashes with fatal error: concurrent map writes. Cannot be recovered.

Find the bug.


Bug 8 — Captured &item

type Item struct { Name string }
for _, item := range items {
    go func() {
        process(&item)
    }()
}

Observation (Go 1.21). All goroutines process the last item.

Find the bug.


Bug 9 — Double close

ch := make(chan int)
var wg sync.WaitGroup
for i := 0; i < 5; i++ {
    wg.Add(1)
    go func(i int) {
        defer wg.Done()
        defer close(ch)         // each goroutine closes
        ch <- i
    }(i)
}
wg.Wait()

Observation. Panic: close of closed channel.

Find the bug.


Bug 10 — Forgotten cancel()

func fetch(parent context.Context, url string) (*Response, error) {
    ctx, _ := context.WithTimeout(parent, 5*time.Second)
    return doHTTP(ctx, url)
}

Observation. go vet warns. Under load, memory grows.

Find the bug.


Medium

Bug 11 — time.After in a select loop

func consumer(messages <-chan Message) {
    for {
        select {
        case m := <-messages:
            handle(m)
        case <-time.After(time.Second):
            return
        }
    }
}

Observation. Memory grows under high message rates.

Find the bug.


Bug 12 — defer f.Close() in a loop

func processFiles(names []string) error {
    for _, name := range names {
        f, err := os.Open(name)
        if err != nil {
            return err
        }
        defer f.Close()
        if err := process(f); err != nil {
            return err
        }
    }
    return nil
}

Observation. On 10 000 files, fails with "too many open files."

Find the bug.


Bug 13 — Mutex over HTTP

type Cache struct {
    mu sync.Mutex
    data map[string][]byte
}

func (c *Cache) Get(key string) ([]byte, error) {
    c.mu.Lock()
    defer c.mu.Unlock()
    if v, ok := c.data[key]; ok {
        return v, nil
    }
    resp, err := http.Get("https://upstream/" + key)
    if err != nil {
        return nil, err
    }
    defer resp.Body.Close()
    b, _ := io.ReadAll(resp.Body)
    c.data[key] = b
    return b, nil
}

Observation. Cache hits are fast; cache misses serialise everything.

Find the bug.


Bug 14 — Atomic + plain read

var requests int64

func handle() {
    atomic.AddInt64(&requests, 1)
    process()
}

func report() {
    for range time.Tick(time.Second) {
        fmt.Println("requests:", requests) // plain read
    }
}

Observation. Race detector flags a race. Numbers occasionally look stale.

Find the bug.


Bug 15 — Singleton via if == nil

var db *sql.DB

func DB() *sql.DB {
    if db == nil {
        db = openDB()
    }
    return db
}

Observation. Under heavy startup, several DB pools are created; only the last is reachable.

Find the bug.


Bug 16 — Background goroutine outliving request

func handleUpload(w http.ResponseWriter, r *http.Request) {
    body, _ := io.ReadAll(r.Body)
    go func() {
        time.Sleep(10 * time.Second)
        s3.Upload(body)
    }()
    w.WriteHeader(http.StatusAccepted)
}

Observation. Memory grows linearly with request rate.

Find the bug.


Bug 17 — errgroup ignoring ctx

g, ctx := errgroup.WithContext(parent)
for _, url := range urls {
    url := url
    g.Go(func() error {
        return slowFetch(url)   // doesn't use ctx
    })
}
err := g.Wait()

Observation. When one URL fails, Wait still takes the full slowFetch time for all others.

Find the bug.


Bug 18 — Closing input mid-iteration

jobs := make(chan Job)
var wg sync.WaitGroup
for i := 0; i < 8; i++ {
    wg.Add(1)
    go func() {
        defer wg.Done()
        for j := range jobs {
            process(j)
        }
    }()
}

for _, j := range allJobs {
    jobs <- j
    if j.Final {
        close(jobs)
    }
}
wg.Wait()

Observation. Sometimes panics: send on closed channel.

Find the bug.


Bug 19 — WaitGroup passed by value

func spawn(wg sync.WaitGroup, work func()) {
    go func() {
        defer wg.Done()
        work()
    }()
}

func main() {
    var wg sync.WaitGroup
    for i := 0; i < 10; i++ {
        wg.Add(1)
        spawn(wg, work)
    }
    wg.Wait()
}

Observation. Wait blocks forever. go vet warns.

Find the bug.


Bug 20 — Tight loop and Gosched

runtime.GOMAXPROCS(1)

var wg sync.WaitGroup
wg.Add(2)
go func() {
    defer wg.Done()
    for i := 0; i < 1_000_000_000; i++ {
        runtime.Gosched()
        _ = i
    }
}()
go func() {
    defer wg.Done()
    fmt.Println("hello")
}()
wg.Wait()

Observation. "hello" prints quickly, but the program then appears to hang for a very long time.

Find the bug.


Hard

Bug 21 — Shutdown race

type Service struct {
    jobs   chan Job
    cancel context.CancelFunc
    wg     sync.WaitGroup
}

func (s *Service) Shutdown() {
    close(s.jobs)
    s.cancel()
    s.wg.Wait()
}

func (s *Service) Submit(j Job) {
    s.jobs <- j
}

Observation. During shutdown under load, panic: send on closed channel.

Find the bug.


Bug 22 — time.Tick leak

func monitor() {
    for t := range time.Tick(time.Second) {
        publish(t)
        if shouldStop() {
            return
        }
    }
}

Observation. The goroutine returns, but the underlying ticker keeps running until process exit. Reproduced by calling monitor() repeatedly.

Find the bug.


Bug 23 — Nested LockOSThread confusion

func enterNS(fd int) error {
    runtime.LockOSThread()
    defer runtime.UnlockOSThread()
    return syscall.Setns(fd, syscall.CLONE_NEWNET)
}

func makeCall(ns int) error {
    runtime.LockOSThread()
    defer runtime.UnlockOSThread()
    if err := enterNS(ns); err != nil {
        return err
    }
    return dialAndRead()
}

Observation. Sometimes the dial happens in the wrong namespace.

Find the bug.


Bug 24 — sync.Once capturing config

var (
    once sync.Once
    cfg  *Config
)

func Initialize(c *Config) {
    once.Do(func() {
        cfg = c
        startBackground(c)
    })
}

Observation. Multiple callers pass different configs; only the first wins. Second callers silently use the first caller's config.

Find the bug.


Bug 25 — RWMutex upgrade

type Cache struct {
    mu   sync.RWMutex
    data map[string][]byte
}

func (c *Cache) GetOrLoad(key string) []byte {
    c.mu.RLock()
    if v, ok := c.data[key]; ok {
        c.mu.RUnlock()
        return v
    }
    c.mu.RUnlock()
    c.mu.Lock()
    v := load(key)
    c.data[key] = v
    c.mu.Unlock()
    return v
}

Observation. Under cache-miss bursts, load(key) is called multiple times for the same key.

Find the bug.


Bug 26 — Send blocks on shutdown

func (s *Service) Run(ctx context.Context) {
    for {
        select {
        case <-ctx.Done():
            return
        case j := <-s.jobs:
            s.results <- s.process(j)
        }
    }
}

Observation. During shutdown, Run hangs and never returns. ctx.Done() is closed; s.jobs is empty; but the goroutine is stuck.

Find the bug.


Bug 27 — init spawning unmanaged goroutine

package telemetry

var counters atomic.Int64

func init() {
    go func() {
        for range time.Tick(time.Second) {
            flush(counters.Load())
        }
    }()
}

Observation. Every test in every package that imports telemetry runs the flusher. Tests are slow and flaky.

Find the bug.


Bug 28 — Panic taking down all peers

func parallelMap(items []Item, fn func(Item) int) []int {
    out := make([]int, len(items))
    var wg sync.WaitGroup
    for i := range items {
        i := i
        wg.Add(1)
        go func() {
            defer wg.Done()
            out[i] = fn(items[i])
        }()
    }
    wg.Wait()
    return out
}

Observation. Works perfectly until one fn(item) panics on a bad input — then the whole process crashes.

Find the bug.


Bug 29 — Cgo holding M

package main

/*
#include <unistd.h>
void slow(void) { sleep(2); }
*/
import "C"

import (
    "fmt"
    "runtime"
    "time"
)

func main() {
    for round := 0; round < 10; round++ {
        for i := 0; i < 100; i++ {
            go C.slow()
        }
        time.Sleep(2500 * time.Millisecond)
        fmt.Println("round", round, "goroutines:", runtime.NumGoroutine())
    }
}

Observation. OS thread count climbs to around a hundred and never comes back down.

Find the bug.


Bug 30 — Context not propagated in DB call

func (s *Service) Lookup(ctx context.Context, id string) (*User, error) {
    return s.db.Query("SELECT * FROM users WHERE id = $1", id)
}

Observation. Client cancels request; server-side query continues, returning when the DB finally responds. Heavy retries pile up zombie queries.

Find the bug.


Solutions

Solution 1

Pre-1.22, i is one variable shared across all iterations of the for loop. The closure captures &i. By the time the goroutines run, the loop is finished and i == 5. All goroutines read 5.

Fix: pass i as a parameter.

go func(i int) { fmt.Println(i) }(i)

This works on every Go version. On 1.22+, the original code also works because each iteration has a fresh i. Still prefer the parameter pass for portability and explicitness.


Solution 2

wg.Add(1) is inside the goroutine body, so it runs in parallel with wg.Wait(). The race: Wait may run before any goroutine has executed its Add, see counter = 0, and return immediately.

Fix:

for i := 0; i < 100; i++ {
    wg.Add(1)               // parent, serial with the for loop
    go func() {
        defer wg.Done()
        doWork()
    }()
}

The sync.WaitGroup docs are explicit: "calls with a positive delta that occur when the counter is zero must happen before a Wait."


Solution 3

time.Sleep is hope, not synchronisation. The goroutine may take 500 ms, 2 s, or longer. The program prints "hopefully done" with the work still running.

Fix:

done := make(chan struct{})
go func() {
    heavyWork()
    close(done)
}()
<-done
fmt.Println("definitely done")

Or WaitGroup, or errgroup. Never time.Sleep.


Solution 4

A panic inside the goroutine, with no recover in the goroutine's defer chain, kills the entire process. Recovering in some other goroutine does not help.

Fix: defend the goroutine boundary.

go func() {
    defer func() {
        if r := recover(); r != nil {
            log.Printf("worker panic: %v\n%s", r, debug.Stack())
        }
    }()
    for j := range jobs {
        process(j)
    }
}()

Better fix: process should not panic on user input; return an error.


Solution 5

The goroutine sends on an unbuffered channel: ch <- expensive(x). If the function returns early via the cache hit, no one reads ch, the send blocks forever, the goroutine leaks.

Fix 1: buffer the channel.

ch := make(chan int, 1)

Now the send completes whether or not anyone reads. Once the goroutine exits, nothing references the channel, so it and its buffered value are garbage-collected.

Fix 2: reorder — check the cache before spawning the goroutine.

if cached, ok := cache[x]; ok {
    return cached
}
ch := make(chan int)
go func() {
    ch <- expensive(x)
}()
return <-ch

Now the goroutine exists only when someone is guaranteed to read from it, and a cache hit no longer kicks off a wasted expensive(x).

Solution 6

for v := range ch exits only when ch is closed. The producer's for i := 0; i < 10; i++ finishes, but the program never closes ch. The receiver waits forever.

Fix:

go func() {
    for i := 0; i < 10; i++ {
        ch <- i
    }
    close(ch)
}()

If multiple senders, use the single-closer pattern: a goroutine that waits on a WaitGroup of senders and then closes.


Solution 7

The built-in map is not safe for concurrent use. The Go runtime adds explicit checks that, on detecting concurrent writes, abort the program with fatal error: concurrent map writes. This is not a panic — recover does not catch it.

Fix 1: mutex.

var mu sync.Mutex
go func(i int) {
    defer wg.Done()
    mu.Lock()
    m[i] = i * 2
    mu.Unlock()
}(i)

Fix 2: sync.Map.

var sm sync.Map
go func(i int) {
    defer wg.Done()
    sm.Store(i, i*2)
}(i)

sync.Map is best for write-once-read-many or disjoint-key-sets-per-goroutine. For most map use cases, map + Mutex is faster.


Solution 8

item is the loop variable, mutated each iteration. Even though it is declared by for _, item := range items, in Go ≤ 1.21 it is a single variable. &item is the same address every time. By the time the goroutines run, item holds the last value.

Fix on all versions:

for _, item := range items {
    item := item        // shadow
    go func() {
        process(&item)
    }()
}

Or:

for _, item := range items {
    go func(item Item) {
        process(&item)
    }(item)
}

Note: even on Go 1.22+, only for ... range and for i := ...; ...; ... loop variables have per-iteration scope. Other variables inside the body still need explicit shadowing if captured.


Solution 9

Each of the five goroutines defer close(ch). The first close succeeds; the second panics with close of closed channel. Even before that, sends from goroutines that have not yet closed may race with a close from another goroutine — send on closed channel.

Fix: single-closer pattern.

ch := make(chan int)
var wg sync.WaitGroup
for i := 0; i < 5; i++ {
    wg.Add(1)
    go func(i int) {
        defer wg.Done()
        ch <- i
    }(i)
}
go func() {
    wg.Wait()
    close(ch)
}()
for v := range ch {
    fmt.Println(v)
}

The closer goroutine waits for all senders and then closes once.


Solution 10

context.WithTimeout returns a cancel function that must be called. Discarding it with the blank identifier means the context's timer and bookkeeping are not released until the deadline fires. Under high call rates, in-flight contexts accumulate. go vet's lostcancel check warns that the cancel function should be called, not discarded.

Fix:

ctx, cancel := context.WithTimeout(parent, 5*time.Second)
defer cancel()
return doHTTP(ctx, url)

defer cancel() releases resources whether or not the timeout fires.


Solution 11

time.After creates a fresh Timer on every call. Before Go 1.23, that timer stays in the runtime's timer heap until it fires, whether or not anyone still wants it. In a hot select loop you allocate one timer per iteration: at 100 k messages/s with a 1 s timeout, roughly 100 k timers are pending at any moment.

Fix: use a single Timer reused across iterations.

timer := time.NewTimer(time.Second)
defer timer.Stop()
for {
    select {
    case m := <-messages:
        handle(m)
        if !timer.Stop() {
            <-timer.C
        }
        timer.Reset(time.Second)
    case <-timer.C:
        return
    }
}

The Stop-and-drain dance is unfortunate. Go 1.23 changed timer semantics — the timer channel is effectively unbuffered, so a stale tick can no longer linger in it — and the drain before Reset is no longer needed on recent toolchains.


Solution 12

defer runs at function exit, not loop iteration exit. With 10 000 files, 10 000 file descriptors accumulate before any closes. The OS hits ulimit -n and os.Open returns errors.

Fix: extract the body to a function so defer scopes per-iteration.

func processFiles(names []string) error {
    for _, name := range names {
        if err := processOne(name); err != nil {
            return err
        }
    }
    return nil
}

func processOne(name string) error {
    f, err := os.Open(name)
    if err != nil {
        return err
    }
    defer f.Close()
    return process(f)
}

Solution 13

The mutex protects both the in-memory map and the HTTP request. While the request is in flight, no other goroutine can take the lock — every call, hit or miss, queues behind it. Latency for all other operations spikes to the upstream round-trip time; throughput collapses.

Fix: move the HTTP outside the critical section.

func (c *Cache) Get(key string) ([]byte, error) {
    c.mu.Lock()
    v, ok := c.data[key]
    c.mu.Unlock()
    if ok {
        return v, nil
    }

    // I/O outside lock
    resp, err := http.Get("https://upstream/" + key)
    if err != nil {
        return nil, err
    }
    defer resp.Body.Close()
    b, _ := io.ReadAll(resp.Body)

    c.mu.Lock()
    c.data[key] = b
    c.mu.Unlock()
    return b, nil
}

Better: singleflight to deduplicate concurrent misses for the same key.
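
What singleflight buys can be seen in a hand-rolled miss-deduplicator. This is a toy sketch of the idea only — production code should reach for golang.org/x/sync/singleflight rather than this:

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// call tracks one in-flight load; late arrivals wait on wg and share val.
type call struct {
	wg  sync.WaitGroup
	val []byte
}

// group deduplicates concurrent loads per key — a toy version of what
// golang.org/x/sync/singleflight.Group does.
type group struct {
	mu     sync.Mutex
	flight map[string]*call
}

func (g *group) Do(key string, load func() []byte) []byte {
	g.mu.Lock()
	if g.flight == nil {
		g.flight = make(map[string]*call)
	}
	if c, ok := g.flight[key]; ok {
		g.mu.Unlock()
		c.wg.Wait() // someone is already loading this key; share the result
		return c.val
	}
	c := &call{}
	c.wg.Add(1)
	g.flight[key] = c
	g.mu.Unlock()

	c.val = load() // exactly one goroutine per key runs the load
	c.wg.Done()

	g.mu.Lock()
	delete(g.flight, key)
	g.mu.Unlock()
	return c.val
}

// demo fires 10 concurrent misses for the same key; the load runs once.
func demo() int32 {
	var g group
	var loads int32
	start := make(chan struct{})
	var wg sync.WaitGroup
	for i := 0; i < 10; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			<-start
			g.Do("user:42", func() []byte {
				atomic.AddInt32(&loads, 1)
				time.Sleep(100 * time.Millisecond) // simulate the upstream call
				return []byte("profile")
			})
		}()
	}
	close(start)
	wg.Wait()
	return atomic.LoadInt32(&loads)
}

func main() {
	fmt.Println("loads for 10 concurrent misses:", demo())
}
```

The WaitGroup inside call doubles as the happens-before edge: waiters read c.val only after the loader's Done.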


Solution 14

atomic.AddInt64 and the plain read of requests mix two protocols. Atomics establish happens-before only with other atomic operations on the same address. The plain read races with the writes: on 32-bit platforms it can even observe a torn value, and the race detector flags it either way.

Fix:

fmt.Println("requests:", atomic.LoadInt64(&requests))

Or use the typed wrapper:

var requests atomic.Int64
requests.Add(1)
fmt.Println("requests:", requests.Load())

atomic.Int64 makes the protocol explicit at the type level — you cannot accidentally do a plain read.


Solution 15

The db == nil check and the db = openDB() assignment are not atomic — classic check-then-act. Two goroutines may both see nil, both call openDB, both assign. The last write wins; the earlier pools become unreachable, but their open connections linger, since nothing ever calls Close on them. The unsynchronised read of db is also a data race in its own right.

Fix: sync.Once.

var (
    db   *sql.DB
    once sync.Once
)

func DB() *sql.DB {
    once.Do(func() {
        db = openDB()
    })
    return db
}

sync.Once.Do runs the function exactly once. Concurrent calls block until the first completes; subsequent calls return immediately.


Solution 16

The goroutine captures body and outlives the request. Under load, each request pins its body for 10 seconds. At 1 k RPS, 10 k goroutines hold 10 k bodies — gigabytes.

Fix 1: bounded worker pool consuming from a buffered channel.

type S3Uploader struct {
    queue chan []byte
}

func (u *S3Uploader) Submit(body []byte) error {
    select {
    case u.queue <- body:
        return nil
    default:
        return errBackpressure
    }
}

Workers consume from queue at controlled concurrency.
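
A minimal sketch of that worker side, reusing the S3Uploader and errBackpressure names from above. The upload function is a stand-in for the real s3.Upload, and a production version would also guard Submit against use after Close:

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"sync/atomic"
)

var errBackpressure = errors.New("upload queue full")

// S3Uploader pairs the bounded queue from Fix 1 with a fixed worker pool.
type S3Uploader struct {
	queue chan []byte
	wg    sync.WaitGroup
}

func NewS3Uploader(workers, depth int, upload func([]byte)) *S3Uploader {
	u := &S3Uploader{queue: make(chan []byte, depth)}
	for i := 0; i < workers; i++ {
		u.wg.Add(1)
		go func() {
			defer u.wg.Done()
			for body := range u.queue { // exits when the queue is closed
				upload(body)
			}
		}()
	}
	return u
}

// Submit never blocks: callers get explicit backpressure instead of the
// process accumulating one goroutine (and one pinned body) per request.
func (u *S3Uploader) Submit(body []byte) error {
	select {
	case u.queue <- body:
		return nil
	default:
		return errBackpressure
	}
}

// Close stops intake, lets the workers drain the queue, and waits.
func (u *S3Uploader) Close() {
	close(u.queue)
	u.wg.Wait()
}

func demo() int64 {
	var total int64
	u := NewS3Uploader(4, 64, func(b []byte) {
		atomic.AddInt64(&total, int64(len(b)))
	})
	for i := 0; i < 10; i++ {
		_ = u.Submit(make([]byte, 100))
	}
	u.Close()
	return atomic.LoadInt64(&total)
}

func main() {
	fmt.Println("bytes uploaded:", demo()) // 10 × 100 = 1000
}
```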

Fix 2: respond to the client after upload completes. This may not be appropriate for StatusAccepted-style flows, but it gives the client a signal.


Solution 17

errgroup.WithContext cancels ctx as soon as the first function passed to g.Go returns an error. But slowFetch ignores ctx, so the remaining fetches run to completion anyway. Wait returns the first error only after every slow, cancellation-blind fetch has finished.

Fix:

g.Go(func() error {
    return slowFetchCtx(ctx, url)        // pass ctx
})

Inside slowFetchCtx, the HTTP request uses http.NewRequestWithContext(ctx, ...) so cancellation propagates to the network layer.


Solution 18

The producer sends, then checks if j.Final { close(jobs) }. If j.Final was the last job to send, the close is fine. But the producer continues to the next iteration of for _, j := range allJobs — which is a send on closed channel.

Fix: close after the loop, not inside.

for _, j := range allJobs {
    jobs <- j
}
close(jobs)
wg.Wait()

If j.Final is supposed to signal early termination, use a separate mechanism (a done channel, or a context).
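
A sketch of the context-based variant: the producer stops sending on Final (or on cancellation), and the single deferred close runs exactly once, after the last send. The Job type here is a minimal stand-in for the original's:

```go
package main

import (
	"context"
	"fmt"
)

type Job struct{ Final bool }

// produce is a sketch of "Final means stop early" done safely: stop
// sending, then let the one deferred close run.
func produce(ctx context.Context, jobs chan<- Job, all []Job) {
	defer close(jobs) // single closer, always after the last send
	for _, j := range all {
		select {
		case jobs <- j:
		case <-ctx.Done():
			return
		}
		if j.Final {
			return // no further send can follow the close
		}
	}
}

// demo shows the channel closing after the Final job, never mid-send.
func demo() int {
	jobs := make(chan Job)
	all := []Job{{}, {}, {Final: true}, {}, {}}
	go produce(context.Background(), jobs, all)
	n := 0
	for range jobs { // drains and observes the close
		n++
	}
	return n
}

func main() {
	fmt.Println("jobs received:", demo()) // 3: the two leading jobs plus Final
}
```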


Solution 19

sync.WaitGroup contains internal state. Passing by value copies the state. The function's local wg is independent from the caller's. Done on the local copy does not decrement the caller's counter; the caller's Wait blocks forever.

go vet's copylocks check catches this.

Fix:

func spawn(wg *sync.WaitGroup, work func()) {
    go func() {
        defer wg.Done()
        work()
    }()
}

Solution 20

Pre-Go 1.14, the scheduler could not preempt at arbitrary instructions; only at function call points. A tight loop with no function calls held the M indefinitely.

But this code does call runtime.Gosched() — a yield. The yield happens; the second goroutine runs and prints "hello". The bug is simply that the loop still has to finish: a billion iterations, each paying for a Gosched call, can take minutes. wg.Wait() blocks until the first goroutine returns — after a billion yields.

Fix: give the loop a real exit condition (or a realistic bound). Note also that on Go 1.14+, asynchronous preemption makes even a Gosched-free tight loop preemptible, so the manual yield is unnecessary for scheduler fairness.


Solution 21

Shutdown closes s.jobs first. If a Submit is in flight, the send on the now-closed channel panics.

Fix sequence:

  1. Cancel context (signal producers).
  2. Wait for producers to drain (via a separate WaitGroup or a "draining" flag).
  3. Close s.jobs.
  4. Wait for consumers via s.wg.

Or: make Submit context-aware, so the send and the shutdown signal race safely inside one select.

func (s *Service) Submit(j Job) error {
    select {
    case <-s.shutdownCtx.Done():
        return errClosed
    case s.jobs <- j:
        return nil
    }
}

The select lets a blocked Submit give up when shutdown begins. It is safe only in combination with the ordering above — if Shutdown still closes s.jobs while a Submit sits in the send case, the panic remains. An alternative is to never close s.jobs at all and let consumers exit via the context.
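
One way to realise the never-close alternative: drop close(s.jobs) entirely and let cancellation stop both sides. A sketch, with NewService and the process hook invented for the example; jobs still queued at shutdown are dropped, so if they must drain, use the ordered sequence instead:

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync"
	"sync/atomic"
)

var errClosed = errors.New("service shutting down")

type Job struct{ N int }

// Service never closes s.jobs: both Submit and the workers select on the
// shutdown context, so no send can ever hit a closed channel.
type Service struct {
	jobs    chan Job
	ctx     context.Context
	cancel  context.CancelFunc
	wg      sync.WaitGroup
	process func(Job)
}

func NewService(workers int, process func(Job)) *Service {
	ctx, cancel := context.WithCancel(context.Background())
	s := &Service{jobs: make(chan Job), ctx: ctx, cancel: cancel, process: process}
	for i := 0; i < workers; i++ {
		s.wg.Add(1)
		go s.worker()
	}
	return s
}

func (s *Service) worker() {
	defer s.wg.Done()
	for {
		select {
		case <-s.ctx.Done():
			return
		case j := <-s.jobs:
			s.process(j)
		}
	}
}

func (s *Service) Submit(j Job) error {
	select {
	case <-s.ctx.Done():
		return errClosed
	case s.jobs <- j:
		return nil
	}
}

// Shutdown: signal first, then wait. There is no close(s.jobs) to race with.
func (s *Service) Shutdown() {
	s.cancel()
	s.wg.Wait()
}

func demo() (int64, error) {
	var processed int64
	var done sync.WaitGroup
	done.Add(5)
	s := NewService(2, func(Job) {
		atomic.AddInt64(&processed, 1)
		done.Done()
	})
	for i := 0; i < 5; i++ {
		if err := s.Submit(Job{N: i}); err != nil {
			return 0, err
		}
	}
	done.Wait() // make the demo deterministic: all five jobs handled
	s.Shutdown()
	return atomic.LoadInt64(&processed), s.Submit(Job{}) // post-shutdown Submit
}

func main() {
	n, err := demo()
	fmt.Printf("processed=%d, submit after shutdown: %v\n", n, err)
}
```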


Solution 22

time.Tick returns the channel of a Ticker you can never stop. When monitor returns, the ticker keeps firing into a channel no one reads. Before Go 1.23 that ticker (and its runtime timer) lives until process exit, so calling monitor() repeatedly accumulates leaked tickers. Go 1.23 made unreferenced tickers collectible, but relying on that is fragile.

Fix:

func monitor() {
    t := time.NewTicker(time.Second)
    defer t.Stop()
    for tick := range t.C {
        publish(tick)
        if shouldStop() {
            return
        }
    }
}

time.NewTicker with defer t.Stop() is the pattern that is safe on every Go version.


Solution 23

LockOSThread calls nest: two locks require two unlocks before the goroutine is unpinned. When enterNS is called from makeCall, its deferred UnlockOSThread only drops the count from two to one — makeCall's outer lock still pins the goroutine, so the dial happens in the namespace Setns selected. That path is fine.

The bug bites when enterNS is called without an already-pinned caller. Its deferred unlock drops the count to zero, the goroutine becomes unpinned, and the scheduler is free to migrate it to another thread — one still in the original namespace — before the caller does anything with the new one. Worse, the thread that ran Setns goes back to the pool with its namespace altered, polluting whichever goroutine lands on it next.

Fix: be explicit about who pins. Either:

  • Do not pin in enterNS; require the caller to be pinned.
  • Or pin once and pass through.

Reference-counted locking is rarely the right pattern; explicit ownership is cleaner.
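
The two behaviours can be made concrete with a toy counted pin that mirrors LockOSThread's nesting rule. pinner is illustrative only — the real runtime API exposes no count:

```go
package main

import "fmt"

// pinner mimics LockOSThread's reference counting: the goroutine is pinned
// to its thread while the count is positive.
type pinner struct{ n int }

func (p *pinner) Lock()        { p.n++ }
func (p *pinner) Unlock()      { p.n-- }
func (p *pinner) Pinned() bool { return p.n > 0 }

// enterNSSelfish pins and unpins itself, like the buggy enterNS. Called
// with no outer pin, the count hits zero the moment it returns — the
// goroutine may migrate off the thread whose namespace it just changed.
func enterNSSelfish(p *pinner) {
	p.Lock()
	defer p.Unlock()
	// setns(...) would happen here
}

// enterNSHonest documents "caller must hold the pin" and touches nothing.
func enterNSHonest(p *pinner) {
	// setns(...) would happen here; p must already be pinned by the caller
}

func main() {
	var standalone pinner
	enterNSSelfish(&standalone)
	fmt.Println("pinned after selfish helper:", standalone.Pinned()) // false: bug

	var owned pinner
	owned.Lock() // the caller owns the pin for the whole operation
	enterNSHonest(&owned)
	fmt.Println("pinned after honest helper:", owned.Pinned()) // true
	owned.Unlock()
}
```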


Solution 24

sync.Once.Do runs the function exactly once. The first caller's c is bound; subsequent callers' c is ignored. Worse: the second caller has no way to know its config was discarded.

Fix patterns:

  • Document the contract: "first caller wins."
  • Return an error if called twice with different configs.
  • Or remove the singleton pattern: pass config explicitly into each function call.

A general lesson: sync.Once for idempotent initialisation only. If the function has caller-specific parameters, Once is the wrong tool.
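
A sketch of the "error on conflicting configs" option — Config, startBackground, and ErrAlreadyInitialized are stand-ins for the package's real names:

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// Config and startBackground are stand-ins for the package's real types.
type Config struct{ Endpoint string }

var started int

func startBackground(c *Config) { started++ }

var (
	initMu sync.Mutex
	cfg    *Config
)

var ErrAlreadyInitialized = errors.New("initialized with a different config")

// Initialize replaces the silent sync.Once with an explicit contract:
// first caller wins, identical repeats are no-ops, conflicts are errors.
func Initialize(c *Config) error {
	initMu.Lock()
	defer initMu.Unlock()
	switch {
	case cfg == nil:
		cfg = c
		startBackground(c)
		return nil
	case *cfg == *c:
		return nil // idempotent re-init with the same settings
	default:
		return ErrAlreadyInitialized // the second caller finds out
	}
}

func main() {
	fmt.Println(Initialize(&Config{Endpoint: "https://a"})) // <nil>
	fmt.Println(Initialize(&Config{Endpoint: "https://a"})) // <nil>
	fmt.Println(Initialize(&Config{Endpoint: "https://b"})) // the error
	fmt.Println("background started", started, "time(s)")
}
```

A plain mutex replaces sync.Once here because the function now has a result that depends on prior calls; Once cannot express that.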


Solution 25

Between RUnlock and Lock, another goroutine may have completed the same GetOrLoad(key) and populated the map. The current goroutine then re-loads, calling load(key) redundantly. If load has side effects (counter increment, billing event, external API call), duplication is a bug.

Fix: double-checked locking.

func (c *Cache) GetOrLoad(key string) []byte {
    c.mu.RLock()
    v, ok := c.data[key]
    c.mu.RUnlock()
    if ok { return v }

    c.mu.Lock()
    defer c.mu.Unlock()
    if v, ok := c.data[key]; ok {
        return v        // someone else loaded
    }
    v = load(key)
    c.data[key] = v
    return v
}

Or use singleflight.Group to deduplicate load calls.


Solution 26

ctx.Done() and <-s.jobs are in a select. If s.results <- s.process(j) blocks (no consumer), the goroutine is stuck in the send, not in the select. ctx.Done() cannot help — the select has already chosen the jobs case and moved past it.

Fix: nest a select around the send.

case j := <-s.jobs:
    result := s.process(j)
    select {
    case s.results <- result:
    case <-ctx.Done():
        return
    }

Every potentially blocking operation in a long-running goroutine must be ctx-aware.


Solution 27

init spawns a goroutine that lives forever. The goroutine cannot be stopped. Every test in every package that imports telemetry (directly or transitively) runs the flusher. Tests pollute each other; goleak flags the leak.

Fix: replace with an explicit lifecycle.

type Telemetry struct {
    cancel context.CancelFunc
    wg     sync.WaitGroup
}

func Start(ctx context.Context) *Telemetry {
    ctx, cancel := context.WithCancel(ctx)
    t := &Telemetry{cancel: cancel}
    t.wg.Add(1)
    go t.run(ctx)   // run exits when ctx is cancelled
    return t
}

func (t *Telemetry) Stop() {
    t.cancel()
    t.wg.Wait()
}

Callers (including test setup) explicitly start and stop.


Solution 28

Each fn runs in its own goroutine. A panic in any of them has no local recover. The runtime terminates the entire process.

Fix: install a recover per goroutine.

for i := range items {
    i := i
    wg.Add(1)
    go func() {
        defer wg.Done()
        defer func() {
            if r := recover(); r != nil {
                log.Printf("item %d panicked: %v", i, r)
                // record the failure somehow
            }
        }()
        out[i] = fn(items[i])
    }()
}

Better: design fn to return an error rather than panic. The recover is defence in depth; the real fix is upstream input validation.


Solution 29

Each go C.slow() parks a goroutine inside a C call for two seconds. While a goroutine is in C code it exclusively occupies an M; the runtime cannot multiplex other goroutines onto that thread. 100 concurrent cgo calls therefore force the runtime to create on the order of 100 OS threads, and idle threads are never destroyed — the thread count ratchets up to the peak cgo concurrency and stays there.

Fix: bound cgo concurrency with a semaphore.

sem := make(chan struct{}, 10)
for round := 0; round < 10; round++ {
    for i := 0; i < 100; i++ {
        sem <- struct{}{}
        go func() {
            defer func() { <-sem }()
            C.slow()
        }()
    }
}

Now at most 10 Ms are held by cgo calls at any time.


Solution 30

s.db.Query uses no context. Cancellation of ctx (from a disconnected client) does not propagate to the DB driver. The query runs to completion, returning when the DB finally responds. Heavy retries → many zombie queries piled up.

Fix:

return s.db.QueryContext(ctx, "SELECT * FROM users WHERE id = $1", id)

QueryContext (and ExecContext) propagate cancellation to the driver, which can cancel mid-query on supported databases.


Wrap-up

These thirty bugs span the entire pitfall catalogue. They share a few patterns:

  • Lifetime bugs. 5, 6, 10, 11, 16, 22, 27. The goroutine, channel, or context outlives its useful scope.
  • Ordering bugs. 2, 3, 9, 18, 19, 21, 23. Operations happen in the wrong order or are not synchronised.
  • Sharing bugs. 1, 7, 8, 13, 14, 15, 17, 24, 25, 28, 30. Two goroutines touch state unsafely.
  • Runtime-aware bugs. 4, 12, 20, 26, 29. The bug shows up only when you understand the runtime: stack vs heap, M holding, async preemption, cgo.

Pattern recognition is the goal. After a year of practice, every one of these should jump off the page within five seconds.

Continue to optimize.md for performance-focused exercises, or revisit junior.md for the catalogue overview.