Cleanup Ordering — Hands-On Tasks¶
Practical exercises for mastering cleanup ordering in Go. Each task has a problem statement, hints, and a reference solution. Work through them in order.
Task 1: Basic File Copy with Proper Close¶
Problem. Write a function copyFile(src, dst string) error that copies the contents of src to dst. Use defer for cleanup. Handle close errors properly: if the read or write succeeded but a close fails, return the close error. If multiple operations fail, return the first failure.
Hint. Use a named return value. The deferred close should only set the return if no other error has been reported.
Reference solution.
package main
import (
"io"
"os"
)
func copyFile(src, dst string) (err error) {
in, err := os.Open(src)
if err != nil {
return err
}
defer func() {
if cerr := in.Close(); cerr != nil && err == nil {
err = cerr
}
}()
out, err := os.Create(dst)
if err != nil {
return err
}
defer func() {
if cerr := out.Close(); cerr != nil && err == nil {
err = cerr
}
}()
_, err = io.Copy(out, in)
return err
}
Task 2: Worker with Cancel-Drain-Close¶
Problem. Write a Worker type that processes string messages from a channel. It should: 1. Have a Submit(msg string) method. 2. Have a Stop(ctx context.Context) error method that: - Stops accepting new messages. - Drains buffered messages. - Returns when all messages are processed or the context expires. 3. Be safe for concurrent use of Submit and Stop.
Hint. Use a context, a WaitGroup, and sync.Once. The processing goroutine should drain on cancel.
Reference solution.
package main
import (
"context"
"errors"
"fmt"
"sync"
)
type Worker struct {
in chan string
ctx context.Context
cancel context.CancelFunc
wg sync.WaitGroup
once sync.Once
err error
}
func NewWorker(buf int) *Worker {
ctx, cancel := context.WithCancel(context.Background())
w := &Worker{
in: make(chan string, buf),
ctx: ctx,
cancel: cancel,
}
w.wg.Add(1)
go w.run()
return w
}
func (w *Worker) run() {
defer w.wg.Done()
for {
select {
case <-w.ctx.Done():
// drain
for {
select {
case m := <-w.in:
process(m)
default:
return
}
}
case m := <-w.in:
process(m)
}
}
}
func process(m string) {
fmt.Println("processed:", m)
}
func (w *Worker) Submit(m string) error {
select {
case w.in <- m:
return nil
case <-w.ctx.Done():
return errors.New("worker stopped")
}
}
func (w *Worker) Stop(ctx context.Context) error {
w.once.Do(func() {
w.cancel()
done := make(chan struct{})
go func() { w.wg.Wait(); close(done) }()
select {
case <-done:
case <-ctx.Done():
w.err = ctx.Err()
}
})
return w.err
}
func main() {
w := NewWorker(10)
for i := 0; i < 5; i++ {
w.Submit(fmt.Sprintf("m%d", i))
}
ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
defer cancel()
w.Stop(ctx)
}
Task 3: AfterFunc with Synchronisation¶
Problem. Write a function runWithDeadlineAndCleanup(ctx context.Context, run func() error, cleanup func()) error that: 1. Runs run() in a goroutine. 2. If ctx is cancelled before run completes, calls cleanup(). 3. Always waits for cleanup (if triggered) to finish before returning. 4. Returns run's error or ctx.Err().
Hint. Use context.AfterFunc and a WaitGroup for the cleanup.
Reference solution.
package main
import (
"context"
"errors"
"fmt"
"sync"
"time"
)
func runWithDeadlineAndCleanup(ctx context.Context, run func() error, cleanup func()) error {
var wg sync.WaitGroup
wg.Add(1)
stop := context.AfterFunc(ctx, func() {
defer wg.Done()
cleanup()
})
defer func() {
if stop() {
wg.Done() // never ran
}
wg.Wait()
}()
return run()
}
func main() {
ctx, cancel := context.WithTimeout(context.Background(), 50*time.Millisecond)
defer cancel()
err := runWithDeadlineAndCleanup(ctx,
func() error {
time.Sleep(100 * time.Millisecond)
return errors.New("done")
},
func() {
fmt.Println("cleanup ran")
},
)
fmt.Println(err)
}
Note: this exercise has a subtle bug. If stop() returns true, we manually decrement the WaitGroup — but we're decrementing it once even though wg.Done would have decremented it once. So the total is 1 increment and 1 decrement. Correct.
If stop() returns false, the AfterFunc was already running. We wg.Wait() for it. Correct.
Task 4: LifecycleManager¶
Problem. Implement a LifecycleManager with these methods: - Add(c Component) to register components. - Start(ctx) error to start each in registration order; unwind on failure. - Stop(ctx) error to stop each in reverse order; join errors.
Component is an interface with Name() string, Start(ctx) error, and Stop(ctx) error.
Hint. Use errors.Join for multi-component failures.
Reference solution.
package main
import (
"context"
"errors"
"fmt"
)
type Component interface {
Name() string
Start(ctx context.Context) error
Stop(ctx context.Context) error
}
type Manager struct {
items []Component
started []Component
}
func (m *Manager) Add(c Component) { m.items = append(m.items, c) }
func (m *Manager) Start(ctx context.Context) error {
for _, c := range m.items {
if err := c.Start(ctx); err != nil {
// unwind
for i := len(m.started) - 1; i >= 0; i-- {
_ = m.started[i].Stop(context.Background())
}
m.started = nil
return fmt.Errorf("start %s: %w", c.Name(), err)
}
m.started = append(m.started, c)
}
return nil
}
func (m *Manager) Stop(ctx context.Context) error {
var errs []error
for i := len(m.started) - 1; i >= 0; i-- {
if err := m.started[i].Stop(ctx); err != nil {
errs = append(errs, fmt.Errorf("stop %s: %w", m.started[i].Name(), err))
}
}
m.started = nil
return errors.Join(errs...)
}
Task 5: Graceful HTTP Shutdown¶
Problem. Write a main function that: 1. Starts an HTTP server on :0. 2. Handles SIGINT/SIGTERM to trigger graceful shutdown. 3. Shutdown should drain in-flight requests with a 10-second deadline. 4. Logs each shutdown step.
Reference solution.
package main
import (
"context"
"log/slog"
"net/http"
"os/signal"
"syscall"
"time"
)
func main() {
ctx, stop := signal.NotifyContext(context.Background(), syscall.SIGINT, syscall.SIGTERM)
defer stop()
mux := http.NewServeMux()
mux.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
time.Sleep(100 * time.Millisecond)
w.Write([]byte("ok\n"))
})
srv := &http.Server{Addr: ":0", Handler: mux}
go func() {
if err := srv.ListenAndServe(); err != http.ErrServerClosed {
slog.Error("listen", "err", err)
}
}()
slog.Info("server started")
<-ctx.Done()
slog.Info("shutdown signal received")
shutdownCtx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
defer cancel()
if err := srv.Shutdown(shutdownCtx); err != nil {
slog.Error("shutdown", "err", err)
} else {
slog.Info("clean shutdown")
}
}
Task 6: Idempotent Close¶
Problem. Write a SafeCloser type that wraps an io.Closer and makes Close idempotent. Multiple calls to Close should return the same error (from the first call).
Hint. sync.Once.
Reference solution.
package main
import (
"fmt"
"io"
"sync"
)
type SafeCloser struct {
c io.Closer
once sync.Once
err error
}
func NewSafeCloser(c io.Closer) *SafeCloser {
return &SafeCloser{c: c}
}
func (s *SafeCloser) Close() error {
s.once.Do(func() { s.err = s.c.Close() })
return s.err
}
type fakeCloser struct{ n int }
func (f *fakeCloser) Close() error {
f.n++
return fmt.Errorf("close #%d", f.n)
}
func main() {
c := NewSafeCloser(&fakeCloser{})
fmt.Println(c.Close())
fmt.Println(c.Close())
fmt.Println(c.Close())
}
Output: same error three times. The underlying fakeCloser.Close runs once.
Task 7: Error Propagation from Multiple Closers¶
Problem. Write closeAll(closers ...io.Closer) error that closes each closer and joins all non-nil errors.
Reference solution.
package main
import (
"errors"
"io"
)
func closeAll(closers ...io.Closer) error {
var errs []error
for _, c := range closers {
if err := c.Close(); err != nil {
errs = append(errs, err)
}
}
return errors.Join(errs...)
}
Usage:
Or defer closeAll in reverse order if you want LIFO via explicit ordering.
Task 8: Time-Bounded Cleanup¶
Problem. Write a function cleanupWithDeadline(ctx context.Context, fn func() error) error that: 1. Runs fn() in a goroutine. 2. Returns when fn completes or ctx expires, whichever first. 3. If ctx expires, returns ctx.Err() immediately; the goroutine may continue running.
Reference solution.
package main
import (
"context"
"errors"
"fmt"
"time"
)
func cleanupWithDeadline(ctx context.Context, fn func() error) error {
done := make(chan error, 1)
go func() { done <- fn() }()
select {
case err := <-done:
return err
case <-ctx.Done():
return ctx.Err()
}
}
func main() {
ctx, cancel := context.WithTimeout(context.Background(), 50*time.Millisecond)
defer cancel()
err := cleanupWithDeadline(ctx, func() error {
time.Sleep(100 * time.Millisecond)
return nil
})
if errors.Is(err, context.DeadlineExceeded) {
fmt.Println("cleanup did not finish in time")
}
}
Task 9: Drain a Channel on Cancel¶
Problem. Write a function drainConsumer(ctx context.Context, in <-chan int, process func(int)) that consumes from in until cancelled, then drains remaining buffered values, then returns.
Reference solution.
package main
import (
"context"
"fmt"
"time"
)
func drainConsumer(ctx context.Context, in <-chan int, process func(int)) {
for {
select {
case <-ctx.Done():
// drain
for {
select {
case v, ok := <-in:
if !ok { return }
process(v)
default:
return
}
}
case v, ok := <-in:
if !ok { return }
process(v)
}
}
}
func main() {
ch := make(chan int, 10)
for i := 0; i < 5; i++ { ch <- i }
ctx, cancel := context.WithTimeout(context.Background(), 50*time.Millisecond)
defer cancel()
drainConsumer(ctx, ch, func(v int) { fmt.Println(v) })
}
Task 10: A Component with Full Lifecycle¶
Problem. Implement a MetricsExporter component that: 1. Has Start(ctx) that begins a background goroutine emitting metrics every 10ms. 2. Has Stop(ctx) that: - Stops the goroutine. - Flushes any pending metrics (with a 5-second internal timeout). - Returns when complete or ctx expires. 3. Is idempotent. 4. Has a Name() string method returning "metrics".
Hint. Use the same template as Worker (Task 2) with an explicit flush step.
Reference solution.
package main
import (
"context"
"fmt"
"sync"
"time"
)
type MetricsExporter struct {
cancel context.CancelFunc
done chan struct{}
once sync.Once
err error
queue chan string
}
func NewMetricsExporter() *MetricsExporter {
return &MetricsExporter{
done: make(chan struct{}),
queue: make(chan string, 100),
}
}
func (m *MetricsExporter) Name() string { return "metrics" }
func (m *MetricsExporter) Start(parent context.Context) error {
ctx, cancel := context.WithCancel(parent)
m.cancel = cancel
go m.run(ctx)
return nil
}
func (m *MetricsExporter) run(ctx context.Context) {
defer close(m.done)
t := time.NewTicker(10 * time.Millisecond)
defer t.Stop()
for {
select {
case <-ctx.Done():
// flush
for {
select {
case x := <-m.queue:
fmt.Println("emit:", x)
default:
return
}
}
case <-t.C:
fmt.Println("tick")
}
}
}
func (m *MetricsExporter) Stop(ctx context.Context) error {
m.once.Do(func() {
m.cancel()
select {
case <-m.done:
case <-ctx.Done():
m.err = ctx.Err()
}
})
return m.err
}
func main() {
e := NewMetricsExporter()
e.Start(context.Background())
time.Sleep(30 * time.Millisecond)
ctx, cancel := context.WithTimeout(context.Background(), time.Second)
defer cancel()
e.Stop(ctx)
e.Stop(ctx) // safe
}
Task 11: Trace Probe Helper¶
Problem. Write trace(name string) func() that returns a closure for use as defer trace("op")(). It should: 1. Log "enter
Reference solution.
package main
import (
"fmt"
"time"
)
func trace(name string) func() {
start := time.Now()
fmt.Println("enter", name)
return func() {
fmt.Println("exit", name, "after", time.Since(start))
}
}
func doWork() {
defer trace("doWork")()
time.Sleep(20 * time.Millisecond)
}
func main() {
doWork()
}
Task 12: Recover-and-Log Goroutine Launcher¶
Problem. Write safeGo(name string, fn func()) that starts fn in a goroutine, recovering from panics and logging them with the goroutine's name.
Reference solution.
package main
import (
"fmt"
"runtime/debug"
)
func safeGo(name string, fn func()) {
go func() {
defer func() {
if r := recover(); r != nil {
fmt.Printf("panic in %s: %v\n%s\n", name, r, debug.Stack())
}
}()
fn()
}()
}
func main() {
safeGo("worker-1", func() {
panic("boom")
})
safeGo("worker-2", func() {
fmt.Println("worker 2 ok")
})
time.Sleep(100 * time.Millisecond)
}
Task 13: Cleanup on Cause¶
Problem. Write a function cancelWithCause(ctx, parent context.Context) error that registers an AfterFunc on ctx to log the cause when ctx is cancelled. Return the cause when the function completes.
Reference solution.
package main
import (
"context"
"errors"
"fmt"
"time"
)
func cancelWithCause(ctx context.Context) error {
stop := context.AfterFunc(ctx, func() {
cause := context.Cause(ctx)
fmt.Println("cancelled because:", cause)
})
defer stop()
<-ctx.Done()
return context.Cause(ctx)
}
func main() {
ctx, cancel := context.WithCancelCause(context.Background())
go func() {
time.Sleep(20 * time.Millisecond)
cancel(errors.New("user aborted"))
}()
err := cancelWithCause(ctx)
fmt.Println("final:", err)
}
Task 14: Compose Three Cleanups¶
Problem. Write a function that: 1. Acquires a mutex. 2. Opens a file. 3. Begins a transaction.
On any failure or successful return, releases in reverse order.
Reference solution.
package main
import (
"errors"
"fmt"
"os"
"sync"
)
var mu sync.Mutex
type tx struct{ committed bool }
func (t *tx) Commit() error { t.committed = true; return nil }
func (t *tx) Rollback() error { return nil }
func work(path string) (err error) {
mu.Lock()
defer mu.Unlock()
f, err := os.Open(path)
if err != nil { return err }
defer func() {
if cerr := f.Close(); cerr != nil && err == nil {
err = cerr
}
}()
t := &tx{}
defer func() {
if err != nil {
if rerr := t.Rollback(); rerr != nil {
err = errors.Join(err, rerr)
}
}
}()
fmt.Println("doing work")
return t.Commit()
}
func main() {
fmt.Println(work("/etc/hostname"))
}
Task 15: Two-Phase Shutdown¶
Problem. Build a service with two-phase shutdown: - Phase 1 (Drain): each component stops accepting new work. - Phase 2 (Close): each component releases resources.
The phases run sequentially across all components.
Reference solution.
package main
import (
"context"
"errors"
"fmt"
)
type Component interface {
Name() string
Drain(ctx context.Context) error
Close(ctx context.Context) error
}
type Service struct {
components []Component
}
func (s *Service) Shutdown(ctx context.Context) error {
var errs []error
for _, c := range s.components {
if err := c.Drain(ctx); err != nil {
errs = append(errs, fmt.Errorf("drain %s: %w", c.Name(), err))
}
}
for _, c := range s.components {
if err := c.Close(ctx); err != nil {
errs = append(errs, fmt.Errorf("close %s: %w", c.Name(), err))
}
}
return errors.Join(errs...)
}
Task 16: AfterFunc with WaitGroup¶
Problem. Write a wrapper around context.AfterFunc that adds a Wait method to block until the callback finishes (or never runs if deregistered).
Reference solution.
package main
import (
"context"
"fmt"
"sync"
)
type WaitableAfterFunc struct {
stop func() bool
done chan struct{}
once sync.Once
}
func NewWaitableAfterFunc(ctx context.Context, fn func()) *WaitableAfterFunc {
w := &WaitableAfterFunc{done: make(chan struct{})}
w.stop = context.AfterFunc(ctx, func() {
defer w.once.Do(func() { close(w.done) })
fn()
})
return w
}
func (w *WaitableAfterFunc) Stop() bool {
if w.stop() {
w.once.Do(func() { close(w.done) })
return true
}
return false
}
func (w *WaitableAfterFunc) Wait() { <-w.done }
func main() {
ctx, cancel := context.WithCancel(context.Background())
waf := NewWaitableAfterFunc(ctx, func() { fmt.Println("callback") })
cancel()
waf.Wait()
fmt.Println("done")
}
Task 17: errgroup with Cleanup¶
Problem. Use errgroup.WithContext to spawn three workers, each of which opens a file and processes it. Each worker's file must close cleanly. The first error cancels all workers.
Reference solution.
package main
import (
"context"
"fmt"
"os"
"golang.org/x/sync/errgroup"
)
func worker(ctx context.Context, path string) error {
f, err := os.Open(path)
if err != nil { return err }
defer f.Close()
// simulate work
return nil
}
func main() {
g, ctx := errgroup.WithContext(context.Background())
paths := []string{"/etc/hostname", "/etc/hosts", "/etc/passwd"}
for _, p := range paths {
p := p
g.Go(func() error { return worker(ctx, p) })
}
if err := g.Wait(); err != nil {
fmt.Println(err)
}
}
Task 18: A Test for Cleanup¶
Problem. Write a test that asserts a Service does not leak goroutines after Start/Stop.
Reference solution.
package myservice
import (
"context"
"runtime"
"testing"
"time"
)
func TestNoLeak(t *testing.T) {
baseline := runtime.NumGoroutine()
for i := 0; i < 100; i++ {
s := New()
if err := s.Start(context.Background()); err != nil {
t.Fatal(err)
}
ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
if err := s.Stop(ctx); err != nil {
t.Fatal(err)
}
cancel()
}
runtime.GC()
runtime.GC()
if n := runtime.NumGoroutine(); n > baseline+2 {
t.Errorf("leaked goroutines: baseline=%d after=%d", baseline, n)
}
}
Task 19: Cleanup with t.Cleanup¶
Problem. Write a test helper tempDB(t *testing.T) *DB that creates a database connection and registers t.Cleanup to close it.
Reference solution.
package db_test
import (
"testing"
)
func tempDB(t *testing.T) *DB {
db, err := connect()
if err != nil { t.Fatal(err) }
t.Cleanup(func() { db.Close() })
return db
}
func TestUse(t *testing.T) {
db := tempDB(t)
// ... use db ...
// db is closed automatically when test ends
}
Task 20: Complete End-to-End Service¶
Problem. Write a main function that: 1. Creates a LifecycleManager. 2. Registers three components: HTTP server, worker pool, metrics exporter. 3. Starts all with a 30-second deadline. 4. Waits for SIGTERM. 5. Stops all in reverse order with a 30-second deadline. 6. Logs every phase.
Hint. Combine Tasks 4, 5, and 10. This is the integration test of your knowledge.
Reference solution. Left as an exercise. Combine the previous tasks. The structure is:
func main() {
ctx, stop := signal.NotifyContext(...)
defer stop()
var m Manager
m.Add(NewHTTPServer())
m.Add(NewWorkerPool())
m.Add(NewMetricsExporter())
startCtx, _ := context.WithTimeout(ctx, 30*time.Second)
if err := m.Start(startCtx); err != nil { ... }
<-ctx.Done()
stopCtx, _ := context.WithTimeout(context.Background(), 30*time.Second)
m.Stop(stopCtx)
}
Bonus Tasks¶
Bonus 1: Cleanup with Retry¶
Write a cleanupWithRetry(ctx, fn func() error, attempts int) that retries fn up to attempts times with exponential backoff, until success or context expires.
Bonus 2: Cleanup with Circuit Breaker¶
Modify Bonus 1 to skip future calls if a circuit breaker has opened.
Bonus 3: Test Multi-Phase Shutdown¶
Write a test that asserts components are stopped in correct order using a recording mechanism.
Bonus 4: Profile Defer Cost¶
Write a benchmark comparing open-coded defer, heap defer, and explicit cleanup. Use pprof to verify.
Bonus 5: Implement safeStop¶
Write a helper that wraps a Stop call with panic recovery, converting any panic into an error.
How to Practice¶
- Code each task by hand. Do not copy the reference solutions until you have tried.
- Run each program and verify the output.
- Modify tasks to add complications: more components, parallel shutdown, error injection.
- Write tests for each task.
- Time yourself: aim to complete Tasks 1-10 in an hour after practising.
Cleanup ordering is a skill built by repetition. The patterns become muscle memory.
End of tasks.