Go Empty Struct — Optimize¶
Instructions¶
Each exercise presents a wasteful or sub-optimal use of (or alternative to) the empty struct. Identify the issue, write an optimised version, and explain. Difficulty levels: Easy, Medium, Hard.
Exercise 1 (Easy) — map[string]bool for Pure Membership¶
Problem:

```go
seen := map[string]bool{}
for _, id := range ids {
    seen[id] = true
}
for _, id := range ids {
    if seen[id] {
        process(id)
    }
}
```
Question: How much memory is wasted, and how do you fix it?
Solution
**Issue**: Each entry's value is a `bool` — one byte plus alignment. The bool is always `true`; `false` is never stored, so the value byte carries no information.

**Optimisation** — switch to `map[string]struct{}`. The value type is zero-sized, so the map stores keys only.

**Memory saved**:

- 1 million entries × 1 byte ≈ 1 MB of direct value bytes.
- The bucket layout shrinks: in the classic bucket-based Go map implementation, each bucket stores 8 keys plus 8 values. With `bool` values the value array occupies 8 bytes per bucket; with `struct{}` values it occupies 0.
- Roughly 5-10% smaller bucket footprint, which also improves the cache hit rate.

**Key insight**: When the value never carries information, choose `struct{}`.

Exercise 2 (Easy) — Buffered chan struct{} Capacity 1 As One-Shot¶
Problem:

```go
notify := make(chan struct{}, 1)
go func() {
    work()
    select {
    case notify <- struct{}{}:
    default:
    }
}()
<-notify
```
Question: What is wrong, and how do you simplify?
Solution
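A sketch of the close-based one-shot signal this solution describes (`runOnce` and the callback are illustrative names):

```go
package main

import "fmt"

// runOnce starts fn in a goroutine and blocks until fn signals
// completion by closing the channel. close broadcasts: every receiver
// wakes, and there is no buffered slot or select/default to manage.
func runOnce(fn func()) {
    done := make(chan struct{})
    go func() {
        defer close(done) // the one-shot signal
        fn()
    }()
    <-done // returns as soon as done is closed
}

func main() {
    ran := false
    runOnce(func() { ran = true })
    fmt.Println(ran) // → true
}
```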
**Issue**: A buffered channel of capacity 1 is used as a one-shot notification, with `select`/`default` to avoid blocking on the send. The pattern works but is unidiomatic: the consumer can receive only once, the channel is never closed, and any later signal is silently dropped.

**Optimisation** — close to broadcast: replace the buffered send with `close(notify)`. After the close, every receive returns immediately with the zero value. There is no buffered slot to manage, no `select`/`default`, and the worker cannot accidentally double-send.

**Benefits**:

- Multiple consumers can wait on the same `notify`; all of them wake up.
- `defer close(notify)` makes the close path obvious.
- No risk of panicking on a second send (there is no send).

**Key insight**: For one-shot signals, prefer `close` over a buffered ping.

Exercise 3 (Easy) — Verifying Set Memory Savings¶
Problem:

```go
package main

import "testing"

func BenchmarkBoolSet(b *testing.B) {
    for i := 0; i < b.N; i++ {
        m := map[int]bool{}
        for j := 0; j < 1024; j++ {
            m[j] = true
        }
    }
}

func BenchmarkStructSet(b *testing.B) {
    for i := 0; i < b.N; i++ {
        m := map[int]struct{}{}
        for j := 0; j < 1024; j++ {
            m[j] = struct{}{}
        }
    }
}
```
Question: How do you make this benchmark show the actual memory saving?
Solution
**Issue**: The benchmarks never report allocation statistics, so the memory saving is invisible in the output. Run with `-benchmem` (or call `b.ReportAllocs()` inside each benchmark). `b.ResetTimer()` would only matter if there were setup work before the loop, which these benchmarks do not have.

**Improvement**: run `go test -bench=. -benchmem`. Expected (Go 1.22, amd64): the struct version saves ~12% in bytes/op and ~10% in time/op at this size.

**Key insight**: The memory savings are real; verify with `-benchmem` and report the alloc count.

Exercise 4 (Medium) — Avoiding chan struct{} When Data Is Needed¶
Problem:

```go
done := make(chan struct{})
go func() {
    result := compute()
    _ = result // discarded
    close(done)
}()
<-done
```
Question: What is wrong with using chan struct{} here?
Solution
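A sketch of the typed-channel alternative this solution recommends (`compute` and `computeAsync` are illustrative stand-ins):

```go
package main

import "fmt"

// compute is a stand-in for the real work (hypothetical).
func compute() int { return 42 }

// computeAsync delivers the result on a typed channel instead of
// signalling on chan struct{} and discarding the value.
func computeAsync() <-chan int {
    out := make(chan int, 1) // buffered: the worker never blocks
    go func() {
        out <- compute()
        close(out)
    }()
    return out
}

func main() {
    result := <-computeAsync()
    fmt.Println(result) // → 42
}
```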
**Issue**: The result is discarded. The signal channel is the right shape, but the producer computed something useful and threw it away. Either use the result or remove the computation:

- If you need the result later, deliver it on a typed channel (e.g. `chan Result`) instead of signalling on `chan struct{}`.
- If the result is genuinely unwanted, delete the `compute()` call and keep only the bare `close(done)` signal.

**Key insight**: `chan struct{}` is for signals. If you have data to deliver, use a typed channel.

Exercise 5 (Medium) — Set Reload Pattern Without Lock Contention¶
Problem:

```go
var (
    mu      sync.RWMutex
    blocked map[string]struct{}
)

func reload(ids []string) {
    mu.Lock()
    defer mu.Unlock()
    blocked = make(map[string]struct{}, len(ids))
    for _, id := range ids {
        blocked[id] = struct{}{}
    }
}

func isBlocked(id string) bool {
    mu.RLock()
    defer mu.RUnlock()
    _, ok := blocked[id]
    return ok
}
```
Question: A high-traffic service runs isBlocked millions of times per second. The RLock shows up in pprof. How do you optimise?
Solution
**Issue**: Each `isBlocked` call takes the read lock. The lock is uncontended most of the time but still pays atomic-update overhead on every `RLock`/`RUnlock` pair.

**Optimisation** — atomic pointer swap:

```go
import "sync/atomic"

var blocked atomic.Pointer[map[string]struct{}]

func reload(ids []string) {
    m := make(map[string]struct{}, len(ids))
    for _, id := range ids {
        m[id] = struct{}{}
    }
    blocked.Store(&m)
}

func isBlocked(id string) bool {
    m := blocked.Load()
    if m == nil {
        return false
    }
    _, ok := (*m)[id]
    return ok
}
```

Readers now pay a single atomic load; `reload` builds the new set off to the side and publishes it with one pointer store.

**Key insight**: For read-mostly snapshots, replace the read lock with an atomic pointer swap.
Exercise 6 (Medium) — Goroutine Spawn Per Receiver vs Broadcast¶
Problem:

```go
type Listener struct{ ch chan struct{} }

var listeners []Listener

func subscribe() chan struct{} {
    ch := make(chan struct{}, 1)
    listeners = append(listeners, Listener{ch})
    return ch
}

func fire() {
    for _, l := range listeners {
        select {
        case l.ch <- struct{}{}:
        default:
        }
    }
}
```
Question: With 10000 listeners, fire is slow. Optimise.
Solution
**Issue**: `fire` walks every listener and attempts a non-blocking send. With 10000 listeners that is 10000 channel operations per fire.

**Optimisation** — single broadcast channel:

```go
type Broadcast struct {
    mu sync.Mutex
    ch chan struct{}
}

func New() *Broadcast { return &Broadcast{ch: make(chan struct{})} }

func (b *Broadcast) Done() <-chan struct{} {
    b.mu.Lock()
    defer b.mu.Unlock()
    return b.ch
}

func (b *Broadcast) Fire() {
    b.mu.Lock()
    defer b.mu.Unlock()
    close(b.ch)
    b.ch = make(chan struct{})
}
```

Each subscriber waits on the channel returned by `Done()`; a single `close` in `Fire` wakes all of them at once, regardless of the subscriber count.
Exercise 7 (Medium) — Method-Only Type vs Stateful Struct¶
Problem:

```go
type NopLogger struct{ name string }

func (l NopLogger) Info(msg string)  {}
func (l NopLogger) Error(msg string) {}

func handle(l NopLogger) {
    l.Info("hello")
}
```
Question: Is name carrying its weight?
Solution
**Issue**: `name` is never used. Each instance carries a 16-byte string header for nothing, and every call like `handle(NopLogger{name: "..."})` copies that header just to discard it.

**Optimisation**: drop the field, leaving `type NopLogger struct{}`. After that, every instance is zero bytes and all values are identical.

**Key insight**: If a type has no per-instance state, drop the fields. The struct collapses to zero bytes.

Exercise 8 (Hard) — Trailing Zero-Size Field Wasting Bytes¶
Problem:
Question: What is unsafe.Sizeof(Header{}), and why?
Solution
**Issue**: A trailing zero-size field forces the compiler to add padding so that a pointer to that field still points inside the struct rather than one past its end. On amd64 the struct is rounded up to the next alignment boundary, so a binary-protocol struct expected to be exactly 8 bytes silently grows, and cgo bindings break.

**Optimisation** — move the marker earlier or remove it:

- Option A — remove the marker entirely (preferred).
- Option B — place the marker first; a leading zero-size field needs no padding.
- Option C — if the goal is to forbid positional literals, use an unexported non-zero-size field instead.

**Key insight**: Trailing zero-size fields cost padding, up to a full word. Remove or move them.

Exercise 9 (Hard) — chan struct{} In Hot Path Loops¶
Problem:
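The original listing is missing here; based on the solution text, the pattern presumably looked like this (a reconstruction, with `process` and `run` as illustrative names):

```go
package main

import "fmt"

// process is a hypothetical stand-in for the per-iteration work.
func process(i int) {}

// run polls quit on every iteration: the per-iteration select is the
// cost under discussion.
func run(quit <-chan struct{}, n int) int {
    done := 0
    for i := 0; i < n; i++ {
        select {
        case <-quit:
            return done
        default:
        }
        process(i)
        done++
    }
    return done
}

func main() {
    quit := make(chan struct{})
    fmt.Println(run(quit, 1000)) // → 1000 (quit never fires)
}
```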
Question: Is this efficient for very tight loops?
Solution
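A sketch of the batched cancellation check this solution describes (`process` and `runBatched` are illustrative names):

```go
package main

import "fmt"

// process is a hypothetical stand-in for the per-iteration work.
func process(i int) {}

// runBatched polls quit only every 1024 iterations, amortising the
// channel poll across the batch.
func runBatched(quit <-chan struct{}, n int) int {
    done := 0
    for i := 0; i < n; i++ {
        if i&1023 == 0 { // check once per 1024 iterations
            select {
            case <-quit:
                return done
            default:
            }
        }
        process(i)
        done++
    }
    return done
}

func main() {
    quit := make(chan struct{})
    fmt.Println(runBatched(quit, 1_000_000)) // → 1000000
}
```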
**Issue**: `select { case <-quit: ... default: }` is a non-blocking channel poll that costs a few nanoseconds per iteration. For tight `process` calls (sub-microsecond), the cancellation check can dominate the loop.

**Optimisation** — batch the check: poll `quit` only once every 1024 iterations. The latency to honour cancellation is then bounded by 1024 × cost(process), typically tens of microseconds, and the per-iteration CPU overhead drops to a counter test.

**Key insight**: Cancellation checks have a cost. Batch them when the latency tolerance allows.

Exercise 10 (Hard) — Verifying No Allocation for Empty Struct¶
Problem:
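The original listing is missing here; based on the solution text, it presumably resembled this (a reconstruction: a loop materialising empty-struct values, with `fill` as an illustrative name):

```go
package main

import "fmt"

// fill materialises n empty-struct values, the shape of the loop in
// question.
func fill(n int) []struct{} {
    vals := make([]struct{}, 0, n)
    for i := 0; i < n; i++ {
        vals = append(vals, struct{}{}) // does this allocate?
    }
    return vals
}

func main() {
    fmt.Println(len(fill(1000))) // → 1000
}
```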
Question: Does this allocate?
Solution
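A runnable check of the shared-address claim, assuming the standard gc toolchain (`newEmpty` and `sink` are helpers introduced here; the language spec allows but does not require this address sharing):

```go
package main

import "fmt"

// sink forces the pointers to escape to the heap, where gc resolves
// all zero-size allocations to the runtime's shared base address.
var sink *struct{}

func newEmpty() *struct{} {
    p := &struct{}{}
    sink = p
    return p
}

func main() {
    a, b := newEmpty(), newEmpty()
    // Under gc, both escaping zero-size allocations return the same
    // address (runtime.zerobase), so no per-value heap growth occurs.
    fmt.Println(a == b)
}
```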
**Discussion**: A literal `struct{}{}` does not allocate; the compiler folds it away during compilation. The loop emits no allocations.

**Verify**: build with `go build -gcflags=-m` and look for the absence of "moved to heap" and "escapes to heap" lines for the literal.

**What about `&struct{}{}`?** Every escaping `&struct{}{}` yields the same address (`runtime.zerobase`): no heap allocation, no GC cost.

**Verify with `runtime.MemStats`**: `HeapAlloc` does not grow in proportion to the number of `&struct{}{}` values created.

**Key insight**: Empty struct values and pointers are free at the runtime level. A slice of such pointers still grows, but the values themselves do not.

Bonus Exercise (Hard) — Migrating a Large Codebase¶
Problem: A 200-file project has 60 instances of map[string]bool used as sets. Plan a migration to map[string]struct{}.
Solution
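A minimal version of the `Set[T comparable]` wrapper this plan introduces (method names `Add`/`Has` are the ones the refactor step uses; `Delete`/`Len` are plausible extras):

```go
package main

import "fmt"

// Set is a generic set backed by map[T]struct{}: keys only, zero-size
// values.
type Set[T comparable] map[T]struct{}

func (s Set[T]) Add(v T)      { s[v] = struct{}{} }
func (s Set[T]) Has(v T) bool { _, ok := s[v]; return ok }
func (s Set[T]) Delete(v T)   { delete(s, v) }
func (s Set[T]) Len() int     { return len(s) }

func main() {
    m := Set[string]{}
    m.Add("a")
    m.Add("b")
    fmt.Println(m.Has("a"), m.Has("z"), m.Len()) // → true false 2
}
```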
**Plan**:

1. **Identify true sets**: grep for `map[X]bool` and inspect each occurrence. If the only writes are `m[k] = true` and the only reads are `m[k]` (or `if m[k]`), it is a set.
2. **Mark exclusions**: any case where `m[k] = false` is meaningful (default-on with explicit disable) is NOT a set. Skip those.
3. **Introduce a typed wrapper**: define a single `Set[T comparable] map[T]struct{}` in a shared package.
4. **Refactor** each true-set occurrence:
   - `m := map[string]bool{}` → `m := Set[string]{}`
   - `m[k] = true` → `m.Add(k)` (or `m[k] = struct{}{}` if not using the wrapper)
   - `if m[k]` → `if m.Has(k)`
   - `for k := range m` stays as-is.
5. **Run tests**: `go test ./...` and `go test -race ./...`.
6. **Benchmark hot paths**: confirm savings or no regression with `-benchmem`.
7. **Lint cleanup**: configure `staticcheck` and `gocritic` to permit the new pattern; suppress any false positives.
8. **Document**: add a short note to the package doc explaining the set type.

**Migration metrics** (typical):

- Reduction in alloc bytes for hot maps: 5-10%.
- Code-clarity gain: high — `Set[T]` reads better than `map[T]bool`.
- Risk: low — the semantics are identical for true sets.

**Key insight**: The migration is a careful grep-and-replace plus a typed wrapper. Do it once for the whole codebase rather than ad hoc.

Summary¶
The empty struct pattern is mostly already optimal — the optimisations covered here are about choosing the empty-struct idiom over alternatives (bool maps, buffered ping channels, stateful structs) and about avoiding the two known cost surfaces (trailing zero-size field, per-iteration cancellation checks). When the empty struct is in the right place, it costs zero memory and zero CPU; the optimisation work is mostly about picking it for the right job.