Unnecessary Allocation — Exercises¶
Category: Performance Anti-Patterns → Unnecessary Allocation — hands-on practice cutting throwaway objects in a measured hot path.
These are do-it exercises, not recognition quizzes. Each gives a problem, starting code (Go, Java, or Python — the language varies on purpose), acceptance criteria, and a collapsible solution. The recurring discipline is the one from the level files: prove the allocation with -benchmem / a profiler before you fix it, fix it without changing behavior, then prove the fix with the same measurement.
How to use this file. Try each in your editor before opening the solution — and where it says "prove it," actually run the benchmark. The
allocs/opnumber is the point; the gap between what you expected and what the tool reported is where the learning is. Refer back tomiddle.mdfor the tooling andsenior.mdfor the judgment about when to stop.
Table of Contents¶
| # | Exercise | Form | Lang | Key tool |
|---|---|---|---|---|
| 1 | Fix the O(n²) string build | String building | Go | -benchmem, strings.Builder |
| 2 | Presize a collection | Un-presized growth | Go | -benchmem, make(…,0,n) |
| 3 | Remove the autoboxing | Boxing | Java | JMH -prof gc |
| 4 | Reuse a buffer in a hot loop | Reuse vs re-create | Go | -benchmem, aliasing check |
| 5 | Confirm a value stays on the stack | Escape analysis | Go | -gcflags=-m |
| 6 | Collapse the pipeline's intermediates | Intermediate collections | Python | tracemalloc |
Exercise 1 — Fix the O(n²) string build¶
Form: string building. Goal: make Join linear and drop allocations to a constant. Acceptance: identical output for all inputs; allocs/op independent of input length; a benchmark proving the before/after.
// Before — quadratic, n allocations.
func Join(parts []string) string {
s := ""
for _, p := range parts {
s = s + p + "\n"
}
return s
}
Solution
// After — O(n), presized buffer → 1 allocation.
func Join(parts []string) string {
n := 0
for _, p := range parts {
n += len(p) + 1 // +1 for '\n'
}
var b strings.Builder
b.Grow(n) // presize the buffer: no growth reallocations
for _, p := range parts {
b.WriteString(p)
b.WriteByte('\n')
}
return b.String()
}
Exercise 2 — Presize a collection¶
Form: un-presized growth. Goal: eliminate the reallocation chain when the final size is known. Acceptance: same slice contents; allocs/op drops to 1; benchmark proving it.
// Before — grows from nil; reallocates ~log2(n) times.
func Evens(n int) []int {
out := []int{}
for i := 0; i < n; i++ {
if i%2 == 0 {
out = append(out, i)
}
}
return out
}
Solution
// After — exact capacity known: n/2 evens in [0,n).
func Evens(n int) []int {
out := make([]int, 0, (n+1)/2) // presize to the count of evens
for i := 0; i < n; i++ {
if i%2 == 0 {
out = append(out, i)
}
}
return out
}
Exercise 3 — Remove the autoboxing¶
Form: boxing. Goal: eliminate per-element Integer allocation in a hot sum. Acceptance: same result; gc.alloc.rate.norm drops toward zero; JMH proof.
// Before — List<Integer>: every element is a boxed heap object.
long sumOf(List<Integer> nums) {
long total = 0;
for (Integer n : nums) { // unboxing each iteration
total += n;
}
return total;
}
Solution
// After — int[]: flat, contiguous, zero per-element objects.
long sumOf(int[] nums) {
long total = 0;
for (int n : nums) { // no boxing, no unboxing
total += n;
}
return total;
}
@Benchmark
public long boxed(BoxedState s) { return sumOf(s.list); } // List<Integer>
@Benchmark
public long primitive(PrimState s) { return sumOf(s.arr); } // int[]
# mvn ... -prof gc
boxed avgt 42.1 us/op boxed:gc.alloc.rate.norm 0.0 B/op (the LIST upstream did the boxing)
primitive avgt 9.7 us/op primitive:gc.alloc.rate.norm 0.0 B/op
Exercise 4 — Reuse a buffer in a hot loop¶
Form: reuse vs re-create. Goal: allocate the scratch buffer once, not per record. Acceptance: same emitted bytes; allocs/op drops to a constant; and you must verify no aliasing bug.
// Before — a fresh buffer every record.
func EncodeAll(records [][]byte, w io.Writer) error {
for _, r := range records {
buf := make([]byte, 0, 256) // allocated every iteration
buf = encode(buf, r)
if _, err := w.Write(buf); err != nil {
return err
}
}
return nil
}
Solution
// After — one buffer, reset with buf[:0] each iteration.
func EncodeAll(records [][]byte, w io.Writer) error {
buf := make([]byte, 0, 256) // allocated once
for _, r := range records {
buf = encode(buf[:0], r) // reuse the backing array
if _, err := w.Write(buf); err != nil {
return err
}
}
return nil
}
Exercise 5 — Confirm a value stays on the stack¶
Form: escape analysis. Goal: make a helper return its result without heap-allocating, and prove it with -gcflags=-m. Acceptance: -gcflags=-m shows "does not escape" (or no "moved to heap") for the local; allocs/op == 0.
type Vec struct{ X, Y, Z float64 }
// Before — returns *Vec; the local escapes to the heap.
func AddPtr(a, b Vec) *Vec {
v := Vec{a.X + b.X, a.Y + b.Y, a.Z + b.Z}
return &v // escapes: pointer outlives the call
}
Solution
// After — return the value; nothing escapes, stays on the stack.
func Add(a, b Vec) Vec {
return Vec{a.X + b.X, a.Y + b.Y, a.Z + b.Z}
}
# before
./vec.go:N: moved to heap: v ← AddPtr heap-allocates
# after
./vec.go:M: Add ... does not escape ← no allocation
Exercise 6 — Collapse the pipeline's intermediates¶
Form: intermediate collections. Goal: stop materializing a list at every transformation step. Acceptance: same total; peak allocation drops; tracemalloc proof.
# Before — three full lists in memory for a one-pass computation.
def total_active_spend(users):
active = [u for u in users if u.is_active] # list 1
spends = [u.monthly_spend for u in active] # list 2
annual = [s * 12 for s in spends] # list 3
return sum(annual)
Solution
# After — one lazy generator pipeline; nothing materialized.
def total_active_spend(users):
return sum(u.monthly_spend * 12 for u in users if u.is_active)
import tracemalloc
def measure(fn, users):
tracemalloc.start()
fn(users)
_, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()
return peak
users = make_users(1_000_000, active_ratio=0.5)
print("before peak:", measure(total_active_spend_before, users)) # e.g. 24.1 MB
print("after peak:", measure(total_active_spend_after, users)) # ~0.0 MB
The discipline — recap¶
Every exercise ran the same loop:
- Measure before and after.
go test -benchmem, JMH-prof gc, Pythontracemalloc. A fix that doesn't moveallocs/op/gc.alloc.rate.norm/ peak fixed nothing. - Same behavior. Every fix above is output-identical. If the answer changed, it's a bug, not an optimization.
- Defeat the measurement traps. Assign benchmark results to a sink (DCE), use realistic sizes (O(n²) hides at n=5), warm up on the JVM (scalar replacement is steady-state only).
- Know when to stop. Each "when it mattered" note is the point — these fixes earn their keep only on a hot path. On cold code, the clearer original wins.
| Fix | Form | Allocation result |
|---|---|---|
strings.Builder + Grow | string building | n → 1 |
make([]T, 0, n) | un-presized growth | log n → 1 |
int[] over List<Integer> | boxing | per-element → 0 |
hoist + buf[:0] | reuse | n → 1 (mind aliasing) |
| return value, not pointer | escape | heap → stack (0) |
| generator pipeline | intermediates | n lists → constant |
Related Topics¶
find-bug.md— the spotting counterpart: identify the needless allocation (and the one that's necessary).optimize.md— take a full allocation-heavy hot path, profile it, and cut allocations end-to-end.middle.md·senior.md— the forms and tooling, then hot-path judgment.- Premature Optimization Traps — the measure-first discipline these exercises rest on.
- The
profiling-techniquesandbig-o-analysisskills.
In this topic