Memory Model — Hands-on Tasks¶
Exercises that build intuition for the Go memory model and the race detector.
Easy¶
Task 1 — Reproduce a data race¶
Write code that has an obvious data race:
```go
var counter int
var wg sync.WaitGroup
for i := 0; i < 1000; i++ {
    wg.Add(1)
    go func() {
        defer wg.Done()
        counter++ // unsynchronised read-modify-write: data race
    }()
}
wg.Wait()
fmt.Println(counter)
```
Run with `go run main.go` and with `go run -race main.go`. Compare the outputs and verify the race detector reports the race.
Goal. See the race detector in action.
Task 2 — Fix with atomic¶
Take Task 1 and fix with atomic.Int64. Verify the result is always 1000.
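A minimal sketch of one possible fix (the `atomicCount` helper is just for this example; `atomic.Int64` requires Go 1.19+):

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// atomicCount runs 1000 goroutines that each increment a shared
// atomic.Int64. Add is a single atomic read-modify-write, so there
// is no data race and the result is deterministic.
func atomicCount() int64 {
	var counter atomic.Int64
	var wg sync.WaitGroup
	for i := 0; i < 1000; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			counter.Add(1)
		}()
	}
	wg.Wait()
	return counter.Load()
}

func main() {
	fmt.Println(atomicCount()) // always prints 1000
}
```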
Task 3 — Fix with mutex¶
Same problem, fix with sync.Mutex. Verify correctness.
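A possible shape of the mutex-based fix (`mutexCount` is a name invented for this sketch):

```go
package main

import (
	"fmt"
	"sync"
)

// mutexCount protects the shared counter with a sync.Mutex: the
// unlock in one goroutine happens-before the next lock, so every
// increment is observed.
func mutexCount() int {
	var (
		mu      sync.Mutex
		counter int
		wg      sync.WaitGroup
	)
	for i := 0; i < 1000; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			mu.Lock()
			counter++
			mu.Unlock()
		}()
	}
	wg.Wait()
	return counter
}

func main() {
	fmt.Println(mutexCount()) // always prints 1000
}
```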
Task 4 — Fix with channel¶
Same problem, fix with a channel-based design: each goroutine sends 1 on a channel; a single counting goroutine reads.
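One way the channel-based design can look (names are placeholders): the counter is confined to a single goroutine, so no lock is needed.

```go
package main

import (
	"fmt"
	"sync"
)

// channelCount confines the counter to one goroutine: workers send 1
// on a channel, and only the counting goroutine touches the total.
func channelCount() int {
	ch := make(chan int)
	done := make(chan int)

	// Single counting goroutine: the only reader/writer of total.
	go func() {
		total := 0
		for v := range ch {
			total += v
		}
		done <- total
	}()

	var wg sync.WaitGroup
	for i := 0; i < 1000; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			ch <- 1
		}()
	}
	wg.Wait()
	close(ch) // no more senders; the counting loop exits
	return <-done
}

func main() {
	fmt.Println(channelCount()) // always prints 1000
}
```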
Task 5 — Lazy init with sync.Once¶
Write a GetCache() function that lazily initialises and returns a *Cache. Multiple goroutines call it concurrently. Use sync.Once to ensure init happens once.
Verify with race detector and stress test.
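A sketch of the `sync.Once` pattern (the `Cache` type here is a stand-in; yours will differ):

```go
package main

import (
	"fmt"
	"sync"
)

// Cache is a stand-in type for the exercise.
type Cache struct {
	data map[string]string
}

var (
	cacheOnce sync.Once
	cache     *Cache
)

// GetCache lazily initialises the cache exactly once, no matter how
// many goroutines race to call it: Once.Do guarantees the function
// runs once and that its effects happen-before every Do return.
func GetCache() *Cache {
	cacheOnce.Do(func() {
		cache = &Cache{data: make(map[string]string)}
	})
	return cache
}

func main() {
	var wg sync.WaitGroup
	results := make([]*Cache, 100)
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			results[i] = GetCache()
		}(i)
	}
	wg.Wait()
	// All callers must see the same pointer.
	for _, c := range results {
		if c != results[0] {
			panic("GetCache returned different instances")
		}
	}
	fmt.Println("single instance:", results[0] != nil)
}
```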
Task 6 — Configuration hot-reload¶
Write a small in-process config service:
- `currentConfig() *Config` — readers call this on every operation.
- `reload(*Config)` — operators call this to update.
Use atomic.Pointer[Config]. Verify race-free under stress.
Medium¶
Task 7 — RWMutex vs Mutex benchmark¶
Benchmark a cache with two implementations:
- `sync.Mutex`.
- `sync.RWMutex`.
Use testing.B.RunParallel with read-heavy and balanced workloads. Compare ns/op.
Goal. See when RWMutex wins.
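One possible skeleton (drop the benchmark into a `_test.go` file and run `go test -bench=.`; the cache types and the 99%-read split are illustrative choices, not prescribed by the task):

```go
package main

import (
	"sync"
	"testing"
)

// MutexCache serialises readers and writers alike.
type MutexCache struct {
	mu sync.Mutex
	m  map[string]int
}

func (c *MutexCache) Get(k string) int    { c.mu.Lock(); defer c.mu.Unlock(); return c.m[k] }
func (c *MutexCache) Set(k string, v int) { c.mu.Lock(); defer c.mu.Unlock(); c.m[k] = v }

// RWCache lets readers run in parallel; writers still exclude everyone.
type RWCache struct {
	mu sync.RWMutex
	m  map[string]int
}

func (c *RWCache) Get(k string) int    { c.mu.RLock(); defer c.mu.RUnlock(); return c.m[k] }
func (c *RWCache) Set(k string, v int) { c.mu.Lock(); defer c.mu.Unlock(); c.m[k] = v }

// BenchmarkRWReadHeavy models a ~99%-read workload; write an identical
// benchmark over MutexCache and compare ns/op.
func BenchmarkRWReadHeavy(b *testing.B) {
	c := &RWCache{m: map[string]int{"k": 1}}
	b.RunParallel(func(pb *testing.PB) {
		i := 0
		for pb.Next() {
			if i%100 == 0 {
				c.Set("k", i)
			} else {
				c.Get("k")
			}
			i++
		}
	})
}

func main() {
	// Sanity check outside of `go test`.
	c := &RWCache{m: map[string]int{}}
	c.Set("a", 1)
	println(c.Get("a"))
}
```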
Task 8 — sync.Map vs sharded mutex¶
Implement a concurrent cache two ways:
- `sync.Map`.
- 16 shards, each a `map[K]V` + `sync.Mutex`.
Benchmark both under read-heavy and write-heavy workloads.
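The sharded variant can be sketched like this (FNV hashing and string keys are assumptions for the example):

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sync"
)

const numShards = 16

// ShardedMap splits the key space over 16 independently locked maps,
// so writers to different shards never contend with each other.
type ShardedMap struct {
	shards [numShards]struct {
		mu sync.Mutex
		m  map[string]int
	}
}

func NewShardedMap() *ShardedMap {
	s := &ShardedMap{}
	for i := range s.shards {
		s.shards[i].m = make(map[string]int)
	}
	return s
}

// shard hashes the key to pick one of the 16 shards.
func (s *ShardedMap) shard(key string) int {
	h := fnv.New32a()
	h.Write([]byte(key))
	return int(h.Sum32() % numShards)
}

func (s *ShardedMap) Get(key string) (int, bool) {
	sh := &s.shards[s.shard(key)]
	sh.mu.Lock()
	defer sh.mu.Unlock()
	v, ok := sh.m[key]
	return v, ok
}

func (s *ShardedMap) Set(key string, v int) {
	sh := &s.shards[s.shard(key)]
	sh.mu.Lock()
	defer sh.mu.Unlock()
	sh.m[key] = v
}

func main() {
	m := NewShardedMap()
	m.Set("hello", 42)
	v, ok := m.Get("hello")
	fmt.Println(v, ok)
}
```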
Task 9 — Goroutine leak with channels¶
Write a function that returns a result through a channel, with a timeout:
```go
func fetch(ctx context.Context) int {
	ch := make(chan int)
	go func() { ch <- compute() }() // blocks forever if nobody receives
	select {
	case v := <-ch:
		return v
	case <-ctx.Done():
		return 0
	}
}
```
Test what happens on timeout. Use goleak to detect the leak. Fix by buffering the channel.
Task 10 — False sharing experiment¶
Write two counter structs:
- Adjacent `int64` fields (no padding).
- `int64` fields padded to cache line size.
Each accessed atomically from a different goroutine. Benchmark both. Observe the padded version is significantly faster on multi-core machines.
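The two struct layouts might look like this (the 64-byte cache line is an assumption that holds on typical x86 hardware, not something the code detects):

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// SharedLine packs two counters into the same cache line; every
// atomic update by one goroutine invalidates the line for the other.
type SharedLine struct {
	a atomic.Int64
	b atomic.Int64
}

// PaddedLine pushes the second counter onto its own cache line
// (assuming 64-byte lines: 8 bytes of a + 56 bytes of padding).
type PaddedLine struct {
	a atomic.Int64
	_ [56]byte
	b atomic.Int64
}

// hammer bumps the two counters from two goroutines, n times each.
func hammer(a, b *atomic.Int64, n int) {
	var wg sync.WaitGroup
	wg.Add(2)
	go func() {
		defer wg.Done()
		for i := 0; i < n; i++ {
			a.Add(1)
		}
	}()
	go func() {
		defer wg.Done()
		for i := 0; i < n; i++ {
			b.Add(1)
		}
	}()
	wg.Wait()
}

func main() {
	s, p := &SharedLine{}, &PaddedLine{}
	hammer(&s.a, &s.b, 1_000_000)
	hammer(&p.a, &p.b, 1_000_000)
	fmt.Println(s.a.Load(), s.b.Load(), p.a.Load(), p.b.Load())
}
```

Wrap each `hammer` call in a `testing.B` benchmark to compare the two layouts.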
Task 11 — Memory model educational test¶
Write a small program that intentionally relies on undefined behaviour:
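No program survives in this section; a minimal candidate that matches the "prints 42" note below is the classic unsynchronised flag (this is a sketch of intentionally broken code, not a pattern to reuse):

```go
package main

import "fmt"

var (
	x    int
	done bool
)

// setup runs in a second goroutine with NO synchronisation: the
// write to done may become visible before the write to x, and the
// reader's busy-wait on done is not guaranteed to terminate at all.
func setup() {
	x = 42
	done = true // racy flag: no happens-before edge to the reader
}

func main() {
	go setup()
	for !done { // racy busy-wait; a compiler may hoist this load
	}
	fmt.Println(x) // often 42 on x86, but undefined per the memory model
}
```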
Run with `go run`. Note that on x86 it "works" (prints 42). Run with `go run -race`. Note the race report.
Now compile with optimisations disabled (`go run -gcflags='-N -l' main.go`). Does it still work? (Probably yes, on x86.) Now imagine running on ARM or with future compiler optimisations.
Goal. Internalise that "works on my machine" is not enough.
Task 12 — sync.Pool benchmark¶
Allocate a 4 KB buffer per call in a hot loop. Compare with a sync.Pool-based version. Measure allocations with b.ReportAllocs() and b.Run().
Observe the dramatic reduction in allocs/op with pooling.
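A possible pair of benchmarks (place in a `_test.go` file; `process` is a stand-in for real work on the buffer):

```go
package main

import (
	"sync"
	"testing"
)

// bufPool hands out reusable 4 KB scratch buffers.
var bufPool = sync.Pool{
	New: func() any { return make([]byte, 4096) },
}

// process is a stand-in for work that needs a scratch buffer.
func process(buf []byte) byte {
	for i := range buf {
		buf[i] = byte(i)
	}
	return buf[len(buf)-1]
}

// BenchmarkAlloc allocates a fresh buffer every iteration.
func BenchmarkAlloc(b *testing.B) {
	b.ReportAllocs()
	for i := 0; i < b.N; i++ {
		buf := make([]byte, 4096) // one heap allocation per iteration
		process(buf)
	}
}

// BenchmarkPool reuses buffers across iterations via sync.Pool.
func BenchmarkPool(b *testing.B) {
	b.ReportAllocs()
	for i := 0; i < b.N; i++ {
		buf := bufPool.Get().([]byte)
		process(buf)
		bufPool.Put(buf)
	}
}

func main() {
	// Sanity check outside of `go test`.
	buf := bufPool.Get().([]byte)
	println(len(buf), process(buf))
	bufPool.Put(buf)
}
```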
Hard¶
Task 13 — Implement a concurrent LRU cache¶
Build an LRU cache that supports Get, Set, Delete from many goroutines. Bounded size. LRU eviction on overflow.
- Use a `sync.Mutex` to protect the LRU list and the lookup map.
- Make sure `Get` under high contention does not block too long.
- Test with `-race`.
Goal. Apply synchronisation to a real data structure.
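One shape this can take, assuming `int` keys and values and `container/list` for recency order:

```go
package main

import (
	"container/list"
	"fmt"
	"sync"
)

type entry struct {
	key, val int
}

// LRU is a mutex-protected LRU cache: a doubly linked list keeps
// recency order (front = most recently used) and a map gives O(1) lookup.
type LRU struct {
	mu    sync.Mutex
	cap   int
	order *list.List // elements hold *entry
	items map[int]*list.Element
}

func NewLRU(capacity int) *LRU {
	return &LRU{cap: capacity, order: list.New(), items: make(map[int]*list.Element)}
}

func (c *LRU) Get(key int) (int, bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	el, ok := c.items[key]
	if !ok {
		return 0, false
	}
	c.order.MoveToFront(el) // touch: mark most recently used
	return el.Value.(*entry).val, true
}

func (c *LRU) Set(key, val int) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if el, ok := c.items[key]; ok {
		el.Value.(*entry).val = val
		c.order.MoveToFront(el)
		return
	}
	c.items[key] = c.order.PushFront(&entry{key, val})
	if c.order.Len() > c.cap { // evict the least recently used entry
		oldest := c.order.Back()
		c.order.Remove(oldest)
		delete(c.items, oldest.Value.(*entry).key)
	}
}

func (c *LRU) Delete(key int) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if el, ok := c.items[key]; ok {
		c.order.Remove(el)
		delete(c.items, key)
	}
}

func main() {
	c := NewLRU(2)
	c.Set(1, 10)
	c.Set(2, 20)
	c.Get(1)     // 1 is now most recent
	c.Set(3, 30) // evicts 2
	_, ok := c.Get(2)
	fmt.Println(ok) // false: 2 was evicted
}
```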
Task 14 — Lock-free single-producer single-consumer ring buffer¶
Implement a ring buffer with one producer and one consumer goroutine. Use only atomic operations on the head and tail indices. No mutex.
Verify correctness with -race and stress runs.
Task 15 — Property-based concurrent test¶
Use pgregory.net/rapid to generate random sequences of Get, Set, Delete operations on your LRU cache from Task 13. After each sequence, verify invariants:
- All keys in the map are also in the list.
- The list size matches the map size.
- No duplicates in the list.
Run the test for 10 minutes; treat any failure as a bug.
Task 16 — Mutex profile diagnosis¶
Take an existing concurrent program (your worker pool from earlier tasks). Enable mutex profiling:
Run under load. Capture a profile:
Identify the mutex with the highest contention. Reason about whether to refactor.
Task 17 — Implement a singleflight¶
Build a singleflight-like structure that dedupes concurrent calls for the same key:
100 goroutines calling with the same key should result in compute("key") running once; all 100 receive the same result.
Use channels + mutex. Compare to golang.org/x/sync/singleflight.
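A channels-plus-mutex sketch (type and method names mimic `singleflight` but are invented here; real code also needs error handling and panic recovery):

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// call tracks one in-flight computation; waiters block on done.
type call struct {
	done chan struct{}
	val  int
}

// Group dedupes concurrent calls per key: the first caller computes,
// everyone else waits on the same call and shares the result.
type Group struct {
	mu    sync.Mutex
	calls map[string]*call
}

func NewGroup() *Group { return &Group{calls: make(map[string]*call)} }

func (g *Group) Do(key string, fn func() int) int {
	g.mu.Lock()
	if c, ok := g.calls[key]; ok {
		g.mu.Unlock()
		<-c.done // someone else is already computing
		return c.val
	}
	c := &call{done: make(chan struct{})}
	g.calls[key] = c
	g.mu.Unlock()

	c.val = fn() // only the first caller runs fn
	close(c.done)

	g.mu.Lock()
	delete(g.calls, key) // allow future recomputation
	g.mu.Unlock()
	return c.val
}

func main() {
	g := NewGroup()
	var computed atomic.Int32
	var wg sync.WaitGroup
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			g.Do("key", func() int {
				computed.Add(1)
				time.Sleep(10 * time.Millisecond) // keep the call in flight
				return 7
			})
		}()
	}
	wg.Wait()
	fmt.Println("compute ran", computed.Load(), "time(s)") // typically 1
}
```

The `close(c.done)` happens-after `c.val = fn()`, and a waiter's `<-c.done` happens-after the close, so waiters safely read `c.val` without a lock.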
Task 18 — Concurrent state machine¶
Implement a state machine (e.g., a connection's lifecycle: Idle → Connecting → Connected → Disconnecting → Closed). Multiple goroutines may request state transitions. Ensure transitions are atomic (no goroutine sees an invalid intermediate state).
Options:
- Mutex around state.
- Atomic CAS on state.
- Owner goroutine processing events from a channel.
Pick one; explain why.
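A sketch of the CAS option (state names follow the lifecycle above; the transition table itself is up to you):

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

type State int32

const (
	Idle State = iota
	Connecting
	Connected
	Disconnecting
	Closed
)

// Conn guards its state with a single atomic CAS: a transition either
// wins atomically or fails, so no goroutine can observe a half-done
// transition.
type Conn struct {
	state atomic.Int32
}

// transition moves from -> to; it returns false if another goroutine
// changed the state first (the CAS loses).
func (c *Conn) transition(from, to State) bool {
	return c.state.CompareAndSwap(int32(from), int32(to))
}

func (c *Conn) State() State { return State(c.state.Load()) }

func main() {
	c := &Conn{} // starts in Idle (the zero value)

	// Many goroutines race to start the connection; exactly one wins.
	var wg sync.WaitGroup
	var winners atomic.Int32
	for i := 0; i < 10; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			if c.transition(Idle, Connecting) {
				winners.Add(1)
			}
		}()
	}
	wg.Wait()
	fmt.Println(winners.Load(), c.State() == Connecting) // 1 true
}
```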
Task 19 — Cross-shard read consistency¶
Build a sharded counter (each goroutine owns a per-CPU shard). On read, sum all shards. Verify that two consecutive reads from the same goroutine see monotonically non-decreasing values — even under heavy concurrent writes.
Hint: this is subtle because reading shard 1, then shard 2 may see an update that happened after shard 1's read.
Goal. Understand snapshot consistency.
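The counter itself can be sketched as follows (8 shards and 64-byte cache lines are assumptions for the example; the interesting part of the task is reasoning about `Sum`):

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

const shardCount = 8

// ShardedCounter spreads increments over padded shards to avoid
// contention between writers on different shards.
type ShardedCounter struct {
	shards [shardCount]struct {
		n atomic.Int64
		_ [56]byte // pad to an assumed 64-byte cache line
	}
}

func (c *ShardedCounter) Add(shard int) {
	c.shards[shard%shardCount].n.Add(1)
}

// Sum is NOT an atomic snapshot: shards keep moving while it walks
// them. Consecutive Sums from one goroutine are still non-decreasing,
// because each shard's value only ever grows; work out why as part of
// the exercise.
func (c *ShardedCounter) Sum() int64 {
	var total int64
	for i := range c.shards {
		total += c.shards[i].n.Load()
	}
	return total
}

func main() {
	var c ShardedCounter
	var wg sync.WaitGroup
	for s := 0; s < shardCount; s++ {
		wg.Add(1)
		go func(s int) {
			defer wg.Done()
			for i := 0; i < 10000; i++ {
				c.Add(s)
			}
		}(s)
	}
	wg.Wait()
	fmt.Println(c.Sum()) // 80000 once all writers are done
}
```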
Task 20 — Write a "happens-before" exercise¶
For each of the following pairs, decide whether A happens-before B per the Go memory model:
- Inside one goroutine: A = `x = 1`, B = `y = 2`.
- A = `ch <- 1` in goroutine G1; B = `<-ch` in G2.
- A = `close(done)` in G1; B = `<-done` in G2.
- A = `mu.Unlock()` in G1; B = `mu.Lock()` in G2.
- A = `wg.Done()` in G1; B = `wg.Wait()` in G2 returning.
- A = some operation in G1; B = some operation in G2, no synchronisation.
- A = goroutine G1's exit; B = the parent calling `wg.Wait()` returning.
Justify each.
Goal. Verify your mental model of happens-before.
Solutions and hints¶
Task 1 expected output¶
Without -race: result varies, usually < 1000. With -race: race report with two stack traces.
Task 2 sketch¶
Always 1000.
Task 6 sketch¶
```go
var cfg atomic.Pointer[Config]

func Init() {
	cfg.Store(&Config{...})
}

func Current() *Config {
	return cfg.Load()
}

func Reload(newCfg *Config) {
	cfg.Store(newCfg)
}
```
Treat the returned *Config as immutable.
Task 14 sketch¶
```go
type RingBuffer struct {
	buf  [1024]int
	head atomic.Uint64 // producer writes
	tail atomic.Uint64 // consumer reads
}

func (r *RingBuffer) Push(v int) bool {
	h := r.head.Load()
	t := r.tail.Load()
	if h-t == uint64(len(r.buf)) { // full
		return false
	}
	r.buf[h%uint64(len(r.buf))] = v
	r.head.Store(h + 1) // publish only after the slot is written
	return true
}

func (r *RingBuffer) Pop() (int, bool) {
	t := r.tail.Load()
	h := r.head.Load()
	if h == t { // empty
		return 0, false
	}
	v := r.buf[t%uint64(len(r.buf))]
	r.tail.Store(t + 1) // free the slot only after reading it
	return v, true
}
```
Single producer / single consumer is safe because each side owns one index.
Task 20 answers¶
- A happens-before B (sequential within goroutine).
- A happens-before B (channel synchronisation).
- A happens-before B (the close of a channel happens-before a receive that observes the close).
- A happens-before B (mutex synchronisation).
- A happens-before B (waitgroup synchronisation).
- A and B are concurrent (no synchronisation).
- Depends. If the parent's `wg.Wait()` corresponds to G1's `wg.Done()`, then G1's `Done` happens-before the return. The exit itself is not directly part of the model; the `Done()` call is.
Wrap-up¶
After these tasks you should:
- Reproduce and fix data races with atomic, mutex, channel, and `sync.Once`.
- Recognise common patterns (LRU cache, ring buffer, sharded counter).
- Use `goleak` and `pgregory.net/rapid` for concurrent testing.
- Reason about happens-before in concrete code.
- Use the race detector as a daily tool.
The next file (find-bug.md) tests your race-spotting skills on broken code.