Goroutines — Middle Level

Table of Contents

  1. Introduction
  2. When to Spawn a Goroutine
  3. The Goroutine Lifecycle
  4. Coordination Tools in Practice
  5. Cancellation with context.Context
  6. Goroutine Leaks: Detection and Prevention
  7. Worker Pools
  8. Fan-Out / Fan-In
  9. Pipelines
  10. errgroup and Structured Coordination
  11. Goroutine-Safe APIs
  12. Race Detection in CI
  13. Best Practices for Established Codebases
  14. Pitfalls You Will Meet
  15. Self-Assessment
  16. Summary

Introduction

You already know the mechanics: go f() spawns a new path of execution, WaitGroup joins them, panics in goroutines kill the program. The middle-level question is what to do with that power in real code: which problems benefit from goroutines, which patterns are idiomatic, how to cancel cleanly, and how to detect leaks before they become production incidents.

After reading this you will:

  • Have a checklist for "should this be a goroutine?"
  • Know the canonical patterns: fan-out, fan-in, pipeline, worker pool.
  • Use context.Context to cancel goroutine trees.
  • Use errgroup.Group to spawn and join with error handling.
  • Detect goroutine leaks with runtime.NumGoroutine, pprof, and tests.
  • Understand the implicit contract of "goroutine-safe" types.
  • Run the race detector productively in CI.

When to Spawn a Goroutine

Goroutines are not free. They cost memory, scheduler attention, and mental load. Use them when concurrency is useful, not because they are a Go feature.

Spawn a goroutine when:

  • The work blocks on I/O (network, disk, subprocess) and the caller has other work to do.
  • You need parallelism across CPU cores for a CPU-bound algorithm.
  • You have N independent tasks and you want to run them simultaneously to reduce wall-clock time.
  • You need a long-running background activity (ticker, queue consumer, supervisor).
  • You want to isolate a slow or untrusted operation behind a timeout.

Do not spawn a goroutine when:

  • The work is shorter than a microsecond (a function call is faster).
  • The work is purely synchronous and the caller has nothing else to do — f() is simpler.
  • You only spawn it to "get out of the current call stack" — that is what helper functions are for.
  • You cannot articulate when it will exit. If you do not have an exit strategy, you have a leak.
  • The result needs to be returned synchronously. A goroutine cannot return to its parent's frame.

The "blocking I/O" rule

A typical Go web server creates one goroutine per incoming request. While that goroutine waits on a database query or an HTTP call to a downstream service, the runtime parks it and reuses the OS thread for another goroutine. This is why a Go server can handle 50 000 concurrent requests on a handful of OS threads without breaking a sweat.

Without goroutines, that same workload would either:

  • Use 50 000 OS threads (infeasible: the thread stacks alone would consume far too much memory).
  • Use a callback / event loop (functional but harder to reason about).

Go made the bet that lightweight scheduled units of execution + blocking-style I/O is a better programming model than callbacks. Goroutines are the embodiment of that bet.


The Goroutine Lifecycle

A goroutine progresses through a small set of states. The Go runtime tracks each one in a struct called g:

+-------+  go f()   +-----------+  schedule  +----------+  return   +-------+
| Gidle | --------> | Grunnable | ---------> | Grunning | --------> | Gdead |
+-------+           +-----------+            +----------+           +-------+
                          ^                        |
                          | ready                  | block on I/O,
                          | (I/O done,             | channel, lock,
                          |  lock acquired)        | or sleep
                          |                        v
                          |                   +----------+
                          +------------------ | Gwaiting |
                                              +----------+

The states are not user-visible directly, but they shape how you reason about goroutines:

  • Grunnable: ready to run, waiting for an OS thread (M) and processor slot (P).
  • Grunning: actively executing on a thread.
  • Gwaiting: parked because of a blocking operation (channel receive, mutex acquire, syscall, sleep). Does not consume an OS thread.
  • Gdead: returned. The struct may be reused for a future goroutine.

Why this matters at middle level

You can spawn 500 000 goroutines that are all blocked on time.Sleep or net.Conn.Read, and they will not keep 500 000 OS threads busy. They are parked. The runtime needs only enough threads (Ms) to run the runnable goroutines, plus a few for syscalls. That is the source of Go's I/O scaling.

It also explains why a goroutine dump may show only a handful of goroutines running while runtime.NumGoroutine() reports 50 000: most are parked.
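
A quick way to see this on your machine (a minimal sketch, not from the source): spawn many sleeping goroutines and compare the goroutine count with the processor count.

package main

import (
    "fmt"
    "runtime"
    "time"
)

func main() {
    for i := 0; i < 100000; i++ {
        go func() { time.Sleep(time.Hour) }() // parked in Gwaiting
    }
    time.Sleep(time.Second) // give them all time to start and park

    fmt.Println("goroutines:", runtime.NumGoroutine()) // about 100001
    fmt.Println("GOMAXPROCS:", runtime.GOMAXPROCS(0))  // e.g. 8
}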


Coordination Tools in Practice

The standard library offers several primitives. Pick the one that matches the question you are answering.

Question                                                Tool
"Wait for N goroutines to finish."                      sync.WaitGroup
"Wait for N goroutines, collect first error."           errgroup.Group
"Run code exactly once."                                sync.Once
"Pass a value safely from one goroutine to another."    channels
"Cancel a tree of goroutines."                          context.Context
"Protect shared mutable state."                         sync.Mutex / sync.RWMutex
"Read or write an atomic value."                        sync/atomic, atomic.Value, atomic.Pointer[T]
"Sequence work between two goroutines."                 unbuffered channel
"Pool short-lived buffers."                             sync.Pool

Two recurring mistakes at this level:

  1. Reaching for Mutex when a channel would be cleaner. "Don't communicate by sharing memory; share memory by communicating." That said, mutexes are not evil — they are the right answer for "many goroutines reading and updating a small in-memory map."
  2. Reaching for channels when a Mutex would be cleaner. Building a complex coordination dance with channels for simple shared state is over-engineering.

The rule of thumb: if the data flows linearly between goroutines, use channels; if it sits in a shared structure that several goroutines mutate, use a mutex.
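
For the mutex side of that rule, a minimal sketch (the Hits type is illustrative, not from the source):

// Hits counts occurrences per key. It is safe for concurrent use.
type Hits struct {
    mu sync.Mutex
    m  map[string]int
}

func NewHits() *Hits {
    return &Hits{m: make(map[string]int)}
}

func (h *Hits) Inc(key string) {
    h.mu.Lock()
    defer h.mu.Unlock()
    h.m[key]++
}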


Cancellation with context.Context

context.Context is how Go expresses "this work might be cancelled." Every long-running goroutine should accept a context.Context and return early when it is cancelled.

The contract

func DoWork(ctx context.Context, in <-chan Item) error {
    for {
        select {
        case <-ctx.Done():
            return ctx.Err() // context.Canceled or context.DeadlineExceeded
        case item, ok := <-in:
            if !ok {
                return nil
            }
            if err := process(ctx, item); err != nil {
                return err
            }
        }
    }
}

The pattern: every blocking operation is in a select with <-ctx.Done(). If the parent cancels, the goroutine exits within one iteration.

Constructing contexts

// Manual: the parent decides when to cancel.
ctx, cancel := context.WithCancel(parent)
defer cancel()

// Relative deadline: cancelled automatically after 5 seconds.
ctx, cancel := context.WithTimeout(parent, 5*time.Second)
defer cancel()

// Absolute deadline: cancelled at a fixed point in time.
ctx, cancel := context.WithDeadline(parent, time.Now().Add(time.Minute))
defer cancel()
Always call cancel() to release resources. The defer is the safe default.

Propagating cancellation

go fetchA(ctx)
go fetchB(ctx)

When cancel() is called, both fetchA and fetchB see ctx.Done() close and return. The cancellation tree mirrors the goroutine tree.

context is not a goroutine manager

context.Context does not start, stop, or count goroutines. It carries cancellation signals and request-scoped values. Pair it with errgroup.Group, WaitGroup, or your own coordination to actually wait for goroutines to exit.
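
One way to pair them (a sketch, assuming fetchA and fetchB honour their context):

ctx, cancel := context.WithCancel(context.Background())
var wg sync.WaitGroup

wg.Add(2)
go func() { defer wg.Done(); fetchA(ctx) }()
go func() { defer wg.Done(); fetchB(ctx) }()

cancel()  // signal: stop working
wg.Wait() // confirm: both goroutines have returned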


Goroutine Leaks: Detection and Prevention

A goroutine leak is a goroutine that never returns. Common causes:

  • Sending to an unbuffered channel that nobody reads.
  • Receiving from a channel that nobody closes.
  • Holding a mutex forever.
  • Looping with no exit condition.
  • Forgotten background ticker.

Detection

runtime.NumGoroutine over time

go func() {
    // time.Tick never stops its ticker; acceptable here because this
    // loop lives for the whole process.
    for range time.Tick(10 * time.Second) {
        log.Printf("goroutines: %d", runtime.NumGoroutine())
    }
}()

A rising trend means a leak; a stable count is healthy. Export the value as a metric (Prometheus, OpenTelemetry, etc.).

pprof goroutine dump

Expose net/http/pprof:

import _ "net/http/pprof"
go http.ListenAndServe("localhost:6060", nil)

Then:

curl localhost:6060/debug/pprof/goroutine?debug=2

You get a stack trace per goroutine. Group by stack: thousands of identical stacks point straight at the leak.

goleak in tests

import "go.uber.org/goleak"

func TestMain(m *testing.M) {
    goleak.VerifyTestMain(m)
}

After every test, goleak checks that no goroutines remain that did not exist at start. A leaking test fails the test suite.
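
For a single test, goleak.VerifyNone works the same way (the test body here is illustrative):

func TestWorkerStops(t *testing.T) {
    defer goleak.VerifyNone(t) // fails the test if extra goroutines remain
    w := NewWorker()
    w.Stop()
}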

Prevention patterns

Buffered channel of size 1 for "send result and exit"

res := make(chan int, 1)
go func() { res <- compute() }()

select {
case v := <-res:
    use(v)
case <-ctx.Done():
    // goroutine still finishes and writes to res, but no one cares
}

The buffer of 1 means the send always completes, so the goroutine always exits. Without the buffer, if the receiver gives up via ctx.Done(), the sender blocks forever — leak.

Always pair a goroutine with a way to stop it

type Worker struct {
    cancel context.CancelFunc
    done   chan struct{}
}

func NewWorker() *Worker {
    ctx, cancel := context.WithCancel(context.Background())
    w := &Worker{cancel: cancel, done: make(chan struct{})}
    go w.run(ctx)
    return w
}

func (w *Worker) run(ctx context.Context) {
    defer close(w.done)
    for {
        select {
        case <-ctx.Done():
            return
        case <-time.After(time.Second):
            tick()
        }
    }
}

func (w *Worker) Stop() {
    w.cancel()
    <-w.done // wait for run() to actually return
}

Two channels: cancel to signal stop, done to confirm exit. The Stop is synchronous — when it returns, the goroutine has truly finished.


Worker Pools

A worker pool is the standard answer to "I have a stream of work, I want bounded concurrency."

Skeleton

func RunPool(ctx context.Context, workers int, jobs <-chan Job, results chan<- Result) {
    var wg sync.WaitGroup
    for i := 0; i < workers; i++ {
        wg.Add(1)
        go func() {
            defer wg.Done()
            for {
                select {
                case <-ctx.Done():
                    return
                case job, ok := <-jobs:
                    if !ok {
                        return
                    }
                    // Guard the send as well: if the consumer stops reading
                    // after cancellation, an unguarded send would leak this worker.
                    select {
                    case results <- process(job):
                    case <-ctx.Done():
                        return
                    }
                }
            }
        }()
    }
    go func() {
        wg.Wait()
        close(results)
    }()
}
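
Hypothetical usage, showing the close discipline: the producer owns and closes jobs; RunPool closes results once the workers drain. pending and handle are assumed.

jobs := make(chan Job)
results := make(chan Result)

go func() {
    defer close(jobs)
    for _, j := range pending {
        select {
        case jobs <- j:
        case <-ctx.Done():
            return
        }
    }
}()

RunPool(ctx, 8, jobs, results)
for r := range results {
    handle(r)
}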

Sizing the pool

Workload                                Heuristic
CPU-bound                               runtime.GOMAXPROCS(0) workers
I/O-bound                               tens to thousands; profile to find the sweet spot
Network-bound to a single downstream    match the downstream's connection limit
Mixed                                   start with 2 * GOMAXPROCS, tune from metrics

Oversizing a pool wastes memory and adds scheduler overhead. Undersizing leaves CPUs idle. Measure.

Why not "one goroutine per job"?

For unbounded input — say, an HTTP server's request stream — naive "one goroutine per request" is fine because each request is independent and short-lived. For tasks that hammer a fixed downstream (a database, a third-party API), a pool is the right answer: it bounds the concurrency you impose on that downstream.


Fan-Out / Fan-In

A pattern for parallelising a slow per-item operation.

Fan-out

func FanOut(ctx context.Context, items []Item, workers int) <-chan Result {
    out := make(chan Result)
    in := make(chan Item)

    var wg sync.WaitGroup
    for i := 0; i < workers; i++ {
        wg.Add(1)
        go func() {
            defer wg.Done()
            for item := range in {
                select {
                case out <- process(item):
                case <-ctx.Done():
                    return
                }
            }
        }()
    }

    go func() {
        defer close(in)
        for _, it := range items {
            select {
            case in <- it:
            case <-ctx.Done():
                return
            }
        }
    }()

    go func() {
        wg.Wait()
        close(out)
    }()

    return out
}

Fan-in

Merging multiple result channels into one:

func FanIn(channels ...<-chan Result) <-chan Result {
    out := make(chan Result)
    var wg sync.WaitGroup
    for _, c := range channels {
        wg.Add(1)
        go func(c <-chan Result) {
            defer wg.Done()
            for v := range c {
                out <- v
            }
        }(c)
    }
    go func() { wg.Wait(); close(out) }()
    return out
}

FanIn is one of the most reused snippets in production Go. Note that as written it assumes the consumer drains out to completion; if the consumer may give up early, guard the out <- v send with a ctx.Done() case, as in FanOut.
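
Putting the two together (a hedged sketch; batchA, batchB, and handle are assumed):

a := FanOut(ctx, batchA, 4)
b := FanOut(ctx, batchB, 4)
for r := range FanIn(a, b) {
    handle(r)
}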


Pipelines

A pipeline is a chain of stages, each running in its own goroutine, connected by channels.

items --> [parse] --> [enrich] --> [persist] --> done

func parse(in <-chan []byte) <-chan Record {
    out := make(chan Record)
    go func() {
        defer close(out)
        for raw := range in {
            r, err := decode(raw)
            if err != nil {
                continue // skip malformed input (or route it to an error channel)
            }
            out <- r
        }
    }()
    return out
}

func enrich(in <-chan Record) <-chan Record {
    out := make(chan Record)
    go func() {
        defer close(out)
        for r := range in {
            r.Score = score(r)
            out <- r
        }
    }()
    return out
}

func persist(in <-chan Record) <-chan error {
    errs := make(chan error)
    go func() {
        defer close(errs)
        for r := range in {
            if err := save(r); err != nil {
                errs <- err
            }
        }
    }()
    return errs
}

Compose:

errs := persist(enrich(parse(input)))
for err := range errs { log.Println(err) }

Pipelines decouple stages, allow per-stage parallelism (each stage can spawn N workers), and are easy to test (each stage takes channels in, returns a channel out).
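
These stages assume every channel is drained to completion. If a stage can be abandoned early, thread a context through and guard each send, as in FanOut. A hedged variant of enrich:

func enrich(ctx context.Context, in <-chan Record) <-chan Record {
    out := make(chan Record)
    go func() {
        defer close(out)
        for r := range in {
            r.Score = score(r)
            select {
            case out <- r:
            case <-ctx.Done():
                return // abandon the pipeline without leaking this stage
            }
        }
    }()
    return out
}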


errgroup and Structured Coordination

golang.org/x/sync/errgroup is the canonical "spawn N goroutines, wait for all, collect first error" tool.

import "golang.org/x/sync/errgroup"

func FetchAll(ctx context.Context, urls []string) error {
    g, ctx := errgroup.WithContext(ctx)
    for _, url := range urls {
        url := url // copy the loop variable (needed before Go 1.22; harmless after)
        g.Go(func() error {
            return fetch(ctx, url)
        })
    }
    return g.Wait()
}

Key behaviours:

  • g.Go(f) spawns a goroutine running f.
  • If any f returns a non-nil error, the derived ctx is cancelled — the rest of the goroutines should observe ctx.Done() and exit.
  • g.Wait returns the first non-nil error (or nil if all succeed).

Newer versions of golang.org/x/sync add (*Group).SetLimit(n), which bounds the number of concurrently running goroutines without writing your own pool; call it before the first g.Go.

g.SetLimit(8)
for _, url := range urls {
    url := url
    g.Go(func() error { return fetch(ctx, url) })
}

errgroup is the right answer 80% of the time when you would otherwise reach for WaitGroup.
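
A common extension (a sketch; fetchOne returning (Result, error) is assumed): collect results by index so no mutex is needed.

results := make([]Result, len(urls))
g, ctx := errgroup.WithContext(ctx)
for i, url := range urls {
    i, url := i, url // per-iteration copies (pre-Go 1.22)
    g.Go(func() error {
        r, err := fetchOne(ctx, url)
        if err != nil {
            return err
        }
        results[i] = r // each goroutine writes a distinct index: no race, no lock
        return nil
    })
}
if err := g.Wait(); err != nil {
    return err
}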


Goroutine-Safe APIs

A type is goroutine-safe (or "concurrent-safe") if multiple goroutines can call its methods simultaneously without corrupting state. Document this in the doc comment of the type.

Examples from the standard library

Type                                           Goroutine-safe?
sync.Mutex, sync.RWMutex, sync.WaitGroup,
sync.Once, sync.Pool, sync.Map                 Yes — that is their purpose
time.Timer, time.Ticker                        Concurrent Stop is safe
net/http.Client, net/http.Transport            Yes — designed for sharing
database/sql.DB                                Yes — the connection pool is internal
Most *os.File operations                       Yes for concurrent reads/writes; semantics depend on the OS
Built-in map                                   No — guard with a Mutex or use sync.Map
Built-in slice                                 No — concurrent append races

When you write a new type

Make a deliberate choice:

// Cache stores values keyed by string. It is safe for concurrent use by
// multiple goroutines.
type Cache struct {
    mu sync.RWMutex
    m  map[string]string
}
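
A minimal sketch of the methods that back that promise:

func (c *Cache) Get(key string) (string, bool) {
    c.mu.RLock()
    defer c.mu.RUnlock()
    v, ok := c.m[key]
    return v, ok
}

func (c *Cache) Set(key, value string) {
    c.mu.Lock()
    defer c.mu.Unlock()
    c.m[key] = value
}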

Or:

// Builder collects fragments. Methods on Builder are NOT safe for concurrent
// use; pass an instance through one goroutine at a time.
type Builder struct { ... }

Either is acceptable. Silent ambiguity is not.


Race Detection in CI

The Go race detector is the single most useful concurrency tool. It instruments memory accesses at compile time and reports unsynchronised reads/writes at runtime.

Run mode

go test -race ./...
go run -race main.go
go build -race -o myapp

CI integration

Every CI run should include:

- name: Test with race
  run: go test -race -count=1 ./...

-count=1 defeats the test cache so the race detector runs every time.

Cost

The race detector typically multiplies execution time by 2–20× and memory use by 5–10×. Acceptable for tests; prohibitive for most production workloads. Some teams ship a -race build to canary servers; most do not.

What it finds

  • Data races — two goroutines, at least one writing, no synchronisation between them.
  • Not deadlocks (use the standard runtime detector or go test -timeout).
  • Not logical concurrency bugs (use code review and tests).

A test that does not exercise the racy code path will not catch the race. Race detection is only as good as your tests.
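
A minimal example of what it flags (a sketch, not from the source): two goroutines writing the same integer without synchronisation.

func TestCounterRace(t *testing.T) {
    var n int
    var wg sync.WaitGroup
    for i := 0; i < 2; i++ {
        wg.Add(1)
        go func() {
            defer wg.Done()
            for j := 0; j < 1000; j++ {
                n++ // unsynchronised read-modify-write: -race reports this
            }
        }()
    }
    wg.Wait()
    _ = n
}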


Best Practices for Established Codebases

  1. Never spawn a goroutine in a function whose name does not signal it. Functions named Get, Find, Compute should not have side-effect background work. Use Watch, StartLoop, Run, Serve.
  2. Every long-running goroutine takes a context.Context. No exceptions for production code.
  3. Every long-running goroutine has a way to confirm it has stopped. Either by closing a done channel, by calling wg.Done, or by signaling through errgroup.
  4. Pair channels with their close path. Document who closes and when. A channel without a close path is a leak.
  5. Pass loop variables by parameter. Even in Go 1.22+, the explicit form go func(x X){...}(x) reads more clearly.
  6. Test with the race detector. Make -race mandatory in CI.
  7. Use errgroup over hand-rolled WaitGroup + chan error. Less code, fewer bugs.
  8. Profile before assuming the scheduler is the bottleneck. Most "Go is slow" reports are CPU or GC, not the scheduler.

Pitfalls You Will Meet

Reading from a closed channel does not block

ch := make(chan int)
close(ch)
v, ok := <-ch // v == 0, ok == false, no block

A for v := range ch loop over a closed channel simply finishes. Lean on this behaviour for loop termination instead of inventing a separate "done" signal.

Sending to a closed channel panics

ch := make(chan int)
close(ch)
ch <- 1 // panic: send on closed channel

This is why the convention is "the sender closes." If multiple goroutines send, none of them can safely close.

select with multiple ready cases picks one at random

select {
case <-ch1:
case <-ch2:
}

If both ch1 and ch2 are ready, Go picks one pseudo-randomly. Do not rely on priority.

ctx.Done() returns a channel, not a value

if ctx.Done() {           // BUG: does not compile; Done() returns a channel, not a bool
    return
}
select {
case <-ctx.Done():        // correct
    return
default:
}

WithTimeout does not "cancel itself" — you still need defer cancel()

ctx, cancel := context.WithTimeout(parent, time.Second)
// MISSING defer cancel(): the context's timer and internal state are held until the deadline fires

go vet will warn about this. Always call cancel.

runtime.Gosched() is not preemption

It only yields to runnable goroutines. If no other goroutine is runnable, the calling goroutine continues immediately. Go 1.14+ has async preemption, which makes Gosched rarely useful in production.


Self-Assessment

  • I can articulate when not to use a goroutine.
  • I know the four canonical patterns: spawn-and-wait, worker pool, fan-out/fan-in, pipeline.
  • I always thread context.Context through long-running goroutines.
  • I can detect a goroutine leak using pprof or goleak.
  • I know the difference between WaitGroup and errgroup and when to use each.
  • I run go test -race in CI and it has caught at least one real bug.
  • I document concurrency safety in the comment of every type I export.
  • I never use time.Sleep for synchronisation.
  • I know that "the sender closes the channel" is convention, not language enforcement.
  • I have refactored a WaitGroup + chan error block into errgroup.Group at least once.

Summary

Goroutines pay off in production when each one does substantial blocking or independent work. The patterns are few and reusable — spawn-and-wait, worker pool, fan-out/fan-in, pipeline — and standard primitives (WaitGroup, errgroup, context.Context, channels, sync.Mutex) cover almost every scenario.

Three habits separate code that ships from code that leaks:

  1. Every goroutine has an exit story. Closed channel, cancelled context, finished work — pick one before you write go.
  2. context.Context is the cancellation mechanism, not a value bag. Pass it explicitly.
  3. Race detector in CI. Always.

The next level — senior — focuses on architectural decisions: structured concurrency, supervisor patterns, designing pools that scale, and avoiding the failure modes of the patterns described here.