Memory Management in Depth: Hands-on Tasks

Work through these in order. Each has explicit acceptance criteria. Use Go 1.22+ (1.24+ for cleanup-API tasks).

Task 1: Reading `MemStats`

Write a program that prints `HeapAlloc`, `HeapInuse`, `HeapIdle`, and `NumGC` once a second for 10 seconds.

Acceptance criteria

- [ ] Output shows values updating each second.
- [ ] You can describe in one sentence what each field means.
- [ ] You replace `ReadMemStats` with `metrics.Read` from `runtime/metrics`, reading the equivalent metrics, and confirm the numbers agree to within a few KiB.
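As a starting point for this task, the sketch below reads the four fields once a second and, alongside them, one `runtime/metrics` value. The helper name `sample` is made up for illustration, and pairing `HeapAlloc` with `/memory/classes/heap/objects:bytes` is an assumption worth checking against the `runtime/metrics` documentation for your Go version.

```go
package main

import (
	"fmt"
	"runtime"
	"runtime/metrics"
	"time"
)

// heapObjectsMetric is assumed to be the runtime/metrics counterpart of
// MemStats.HeapAlloc (bytes occupied by live and not-yet-swept objects).
const heapObjectsMetric = "/memory/classes/heap/objects:bytes"

// sample reads heap statistics through both APIs, back to back.
func sample() (ms runtime.MemStats, heapObjects uint64) {
	runtime.ReadMemStats(&ms)

	s := []metrics.Sample{{Name: heapObjectsMetric}}
	metrics.Read(s)
	// Kind is KindBad if this Go version does not support the metric.
	if s[0].Value.Kind() == metrics.KindUint64 {
		heapObjects = s[0].Value.Uint64()
	}
	return ms, heapObjects
}

func main() {
	for i := 0; i < 10; i++ {
		ms, obj := sample()
		fmt.Printf("HeapAlloc=%d HeapInuse=%d HeapIdle=%d NumGC=%d metrics.objects=%d\n",
			ms.HeapAlloc, ms.HeapInuse, ms.HeapIdle, ms.NumGC, obj)
		time.Sleep(time.Second)
	}
}
```

The two readings are taken at slightly different instants, so small differences between `HeapAlloc` and the metric are expected.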

Task 2: Observe a GC cycle

Run any allocation-heavy program with `GODEBUG=gctrace=1`.

Acceptance criteria

- [ ] You capture at least 5 GC-trace lines from stderr.
- [ ] You annotate one of the lines field by field (cycle, time, %CPU, phase times, heap sizes, goal, P count).
- [ ] You change `GOGC` (e.g., to 50 and 200) and observe that the cycle count and heap goals change accordingly.

Task 3: Watch escape analysis

Write `func returnsInt() *int { x := 42; return &x }` and `func returnsValue() int { x := 42; return x }`.

Acceptance criteria

- [ ] `go build -gcflags="-m"` reports `moved to heap: x` for the first and nothing for the second.
- [ ] You write a third function, `func passToInterface(x int) { fmt.Println(x) }`, and observe the escape note for the interface conversion.
- [ ] You rewrite one of the escaping functions so it stops escaping and confirm the message goes away.
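The three functions from this task fit in one file, sketched below. `fillInt` is a hypothetical rewrite added here for illustration: the caller supplies the storage, so nothing needs to outlive the callee's frame. The exact wording of the `-m` diagnostics varies between compiler versions, so treat the comments as what to look for, not as exact output.

```go
package main

import "fmt"

// returnsInt returns the address of a local, so x must outlive the frame.
// Build with: go build -gcflags="-m" and look for a "moved to heap" note.
func returnsInt() *int { x := 42; return &x }

// returnsValue copies x out, so x can stay on the stack.
func returnsValue() int { x := 42; return x }

// passToInterface converts x to an interface value for fmt.Println,
// which is where the escape note for the conversion comes from.
func passToInterface(x int) { fmt.Println(x) }

// fillInt (hypothetical) writes into caller-owned storage instead of
// returning a pointer to a local.
func fillInt(dst *int) { *dst = 42 }

func main() {
	var v int
	fillInt(&v)
	fmt.Println(*returnsInt(), returnsValue(), v)
	passToInterface(v)
}
```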

Task 4: Bench an allocation regression

Use the snippet below as a starting point.

```go
// build_test.go
package build

import "testing"

// BenchmarkBuild measures repeated string concatenation: each += that
// produces a longer string may allocate a fresh backing array.
func BenchmarkBuild(b *testing.B) {
    b.ReportAllocs()
    for i := 0; i < b.N; i++ {
        var s string
        for _, p := range []string{"a", "b", "c", "d"} {
            s += p
        }
        _ = s
    }
}
```

Acceptance criteria

- [ ] `go test -bench=. -benchmem` reports a baseline.
- [ ] You replace `s +=` with `strings.Builder` (with `Grow`) and the new benchmark shows fewer allocs/op.
- [ ] You capture both runs and compute the diff with `benchstat`.
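One possible shape for the `strings.Builder` variant is sketched here as a standalone program, using `testing.AllocsPerRun` for a quick comparison outside `go test`. The names `buildConcat`, `buildBuilder`, and `sink` are illustrative, and the benchmark plus `benchstat` run from the criteria above is still the measurement that counts.

```go
package main

import (
	"fmt"
	"strings"
	"testing"
)

var parts = []string{"a", "b", "c", "d"}

// sink keeps results reachable so the compiler cannot optimize them away.
var sink string

// buildConcat is the baseline: repeated += on a string.
func buildConcat(parts []string) string {
	var s string
	for _, p := range parts {
		s += p
	}
	return s
}

// buildBuilder sizes the buffer once with Grow, then appends into it.
func buildBuilder(parts []string) string {
	n := 0
	for _, p := range parts {
		n += len(p)
	}
	var sb strings.Builder
	sb.Grow(n)
	for _, p := range parts {
		sb.WriteString(p)
	}
	return sb.String()
}

func main() {
	concat := testing.AllocsPerRun(1000, func() { sink = buildConcat(parts) })
	builder := testing.AllocsPerRun(1000, func() { sink = buildBuilder(parts) })
	fmt.Printf("allocs/run: concat=%.0f builder=%.0f\n", concat, builder)
}
```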

Task 5: A `sync.Pool` that helps (and one that doesn't)

Build two HTTP handlers: one that allocates a 4 KiB scratch slice per request, and one that obtains it from a `sync.Pool`. Hit them with `hey -n 100000 -c 100`.

Acceptance criteria

- [ ] The pooled version's `pprof -alloc_objects` profile shows ~10× fewer allocations.
- [ ] You then change the workload to "allocate a 64 MiB buffer" and observe that pooling now keeps at least 64 MiB resident for as long as requests keep arriving.
- [ ] You add a capacity check before `Put` that discards oversized buffers, and verify the residency drops.

Task 6: Heap leak via slice retention

Write `func header(path string) []byte` that reads a file and returns the first 100 bytes via `raw[:100]`.

Acceptance criteria

- [ ] You write a benchmark that calls it 100 times on a 50 MiB file, keeps every returned slice, and confirm `HeapAlloc` grows by about 5 GiB (100 × 50 MiB).
- [ ] You change the function to return `slices.Clone(raw[:100])` and confirm `HeapAlloc` stays flat across iterations.
- [ ] You write a short comment in the code explaining why the original leaked.

Task 7: Goroutine leak

Write a producer that sends on an unbuffered channel with no consumer. Spawn 1000 producers.

Acceptance criteria

- [ ] `runtime.NumGoroutine()` reports >1000 after spawning.
- [ ] You capture `/debug/pprof/goroutine?debug=2` and find the leaked goroutines in the dump.
- [ ] You fix the leak by adding a `context.Context` with cancellation and verify the goroutine count returns to baseline.

Task 8: `GOMEMLIMIT` in action

Write a program that allocates 100 MiB slices in a loop and drops the reference each iteration (so each is garbage at the start of the next).

Acceptance criteria

- [ ] Without `GOMEMLIMIT`, the program reaches steady-state RSS around 200 MiB (one live + headroom).
- [ ] With `GOMEMLIMIT=150MiB`, GC runs more often, steady-state RSS drops, and `GCCPUFraction` rises.
- [ ] You annotate the result with the trade-off in your own words.
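The allocation loop for this task could be sketched as below. The helper `churn`, the round count, and the page-touching stride are assumptions. RSS itself has to be read from outside the process (for example with `ps` or `top`); the program only prints the runtime's own counters.

```go
package main

import (
	"fmt"
	"runtime"
)

// sink holds the single live slice; a package-level variable forces the
// allocation onto the heap.
var sink []byte

// churn allocates `rounds` slices of `size` bytes. The previous slice is
// dropped before each allocation, so at most one is live at a time.
func churn(rounds, size int) runtime.MemStats {
	for i := 0; i < rounds; i++ {
		sink = nil
		sink = make([]byte, size)
		// Touch one byte per 4 KiB so the pages are actually faulted in.
		for j := 0; j < len(sink); j += 4096 {
			sink[j] = 1
		}
	}
	var ms runtime.MemStats
	runtime.ReadMemStats(&ms)
	return ms
}

func main() {
	ms := churn(200, 100<<20)
	fmt.Printf("NumGC=%d GCCPUFraction=%.4f HeapSys=%d MiB HeapReleased=%d MiB\n",
		ms.NumGC, ms.GCCPUFraction, ms.HeapSys>>20, ms.HeapReleased>>20)
}
```

Run it once as `go run .` and once as `GOMEMLIMIT=150MiB go run .`, then compare the printed counters with the RSS you observe.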

Task 9: From `SetFinalizer` to `AddCleanup`

Pick a small type that owns an OS resource (e.g., wraps an `os.File`) and attach a finalizer.

Acceptance criteria

- [ ] You demonstrate (with `runtime.GC()`) that the finalizer runs once.
- [ ] You demonstrate that a cycle of two finalizer-bearing objects pointing at each other is never collected.
- [ ] You rewrite using `runtime.AddCleanup` (Go 1.24+) and confirm the cycle no longer blocks collection.

Task 10: pprof a real heap

Pick any Go service you've written, expose `/debug/pprof/heap`, and generate steady load.

Acceptance criteria

- [ ] You capture an in-use profile and identify the top-3 allocation sites.
- [ ] You capture an alloc-objects profile and explain how it differs from in-use.
- [ ] You pick one site and reduce its allocations by ≥30% (preallocation, pooling, or value-vs-pointer change). You benchmark the change before and after.

Task 11: Stack growth, observed

Write a recursive function whose argument is a struct slightly under the initial stack size, recursing N times.

Acceptance criteria

- [ ] Reading `runtime.MemStats.StackInuse` at the deepest call shows stack memory growing with N.
- [ ] You rewrite iteratively and confirm `StackInuse` stays flat.
- [ ] You explain why the runtime is forced to copy on growth and what that costs.
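One way to make the growth measurable is to sample `runtime.MemStats.StackInuse` from the deepest frame, as in this sketch. The 1900-byte `frame` type (a little under a 2 KiB stack), the `atBottom` callback, and the depth of 1000 are assumptions chosen for illustration.

```go
package main

import (
	"fmt"
	"runtime"
)

// frame is a hypothetical by-value argument a little under 2 KiB, so each
// call copies roughly that much onto the stack.
type frame [1900]byte

// recurse descends n levels, calls atBottom at the deepest point, and
// returns the depth it reached.
//
//go:noinline
func recurse(f frame, n int, atBottom func()) int {
	if n == 0 {
		atBottom()
		return 0
	}
	return 1 + recurse(f, n-1, atBottom)
}

// stackInuse reports the bytes currently held in stack spans.
func stackInuse() uint64 {
	var ms runtime.MemStats
	runtime.ReadMemStats(&ms)
	return ms.StackInuse
}

func main() {
	before := stackInuse()
	var atDepth uint64
	depth := recurse(frame{}, 1000, func() { atDepth = stackInuse() })
	fmt.Printf("depth=%d StackInuse before=%d KiB, at deepest call=%d KiB\n",
		depth, before>>10, atDepth>>10)
}
```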

Task 12: Capacity planning

For an HTTP service you maintain (or build a small one), produce a one-page capacity plan.

Acceptance criteria

- [ ] You measure steady-state `live_heap` under realistic load.
- [ ] You measure peak goroutine count and average stack size.
- [ ] You compute `peak_rss ≈ live_heap × (1 + GOGC/100) + stacks × goroutines + overheads`.
- [ ] You compare to actual RSS and explain any gap (cgo, mapped files, retained idle pages).
- [ ] You write the values you'd set for `GOMEMLIMIT` and `GOGC` in production, with a one-paragraph justification.
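The estimate in the third criterion is simple enough to keep as code next to the plan. The sketch below encodes the formula as written; the `plan` type is made up for this example, and every number in `main` is a placeholder to be replaced with your own measurements.

```go
package main

import "fmt"

// plan holds the measured inputs to the estimate (all sizes in bytes).
type plan struct {
	LiveHeap   uint64 // steady-state live heap
	GOGC       int    // GOGC percentage, e.g. 100
	Goroutines int    // peak goroutine count
	AvgStack   uint64 // average stack size per goroutine
	Overheads  uint64 // runtime metadata, cgo, mapped files, ...
}

// peakRSS computes live_heap × (1 + GOGC/100) + stacks × goroutines + overheads.
func (p plan) peakRSS() uint64 {
	heap := p.LiveHeap + p.LiveHeap*uint64(p.GOGC)/100
	stacks := p.AvgStack * uint64(p.Goroutines)
	return heap + stacks + p.Overheads
}

func main() {
	// Placeholder inputs, not measurements.
	p := plan{
		LiveHeap:   400 << 20,
		GOGC:       100,
		Goroutines: 10000,
		AvgStack:   8 << 10,
		Overheads:  50 << 20,
	}
	fmt.Printf("estimated peak RSS: %d MiB\n", p.peakRSS()>>20)
}
```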

Stretch Task 13: Allocation-free hot path

Pick a small but realistic kernel (e.g., parse a fixed binary header, sum a slice with conditions, render a tiny template).

Acceptance criteria

- [ ] A benchmark with `-benchmem` reports 0 allocs/op.
- [ ] You document each technique used (preallocated output, value receivers, avoided `interface{}`, etc.) with a one-line note.
- [ ] You write a regression benchmark that fails CI if the count rises above zero.
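For the regression check, `testing.AllocsPerRun` gives a number that a test can assert on directly. The sketch below uses the "fixed binary header" kernel; the 8-byte layout, the `header` type, and the helper `headerAllocs` are invented for illustration.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
	"testing"
)

// header is a hypothetical fixed 8-byte wire header.
type header struct {
	Magic   uint16
	Version uint16
	Length  uint32
}

var errShort = errors.New("short header")

// parseHeader decodes by value and returns by value, so nothing needs to
// be heap-allocated.
func parseHeader(b []byte) (header, error) {
	if len(b) < 8 {
		return header{}, errShort
	}
	return header{
		Magic:   binary.BigEndian.Uint16(b[0:2]),
		Version: binary.BigEndian.Uint16(b[2:4]),
		Length:  binary.BigEndian.Uint32(b[4:8]),
	}, nil
}

// sinkHeader keeps the result live so the call is not optimized away.
var sinkHeader header

// headerAllocs reports the average number of allocations per parseHeader call.
func headerAllocs() float64 {
	buf := []byte{0xCA, 0xFE, 0, 1, 0, 0, 0, 16}
	return testing.AllocsPerRun(1000, func() {
		h, err := parseHeader(buf)
		if err != nil {
			panic(err)
		}
		sinkHeader = h
	})
}

func main() {
	fmt.Println("allocs per call:", headerAllocs())
}
```

In a real repository the same check would live in a `_test.go` file as a `Test` function that calls `t.Fatalf` when the count is above zero, so that CI fails on a regression.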

Submission

Each task should produce:

- A short writeup (5–15 lines) of what you observed.
- The code you ran or modified.
- The benchmark or profile output that backs your conclusions.

These artifacts are what turn "I read about memory" into "I can debug it in production".