N+1 in Code — Practice Tasks¶
Category: Performance Anti-Patterns → N+1 in Code — per-item work in a loop that should have been done once.
These are build-the-cure exercises. Each gives you an N+1 (or its setup) and asks you to apply the right move — batch, preload, hoist, set-membership, or a guard test. Every solution states the call/comparison count before and after, because that deterministic number is how you prove the N+1 is gone, not merely sped up.
How to use this file: write your solution and predict the call-count drop before expanding. The gap between your prediction and the measured count is where the learning is.
Table of Contents¶
| # | Exercise | Cure | Lang |
|---|---|---|---|
| 1 | Convert per-item calls to a batch | Bulk call + de-dupe | Go |
| 2 | Preload a lookup map before a loop | Preload (Identity Map) | Java |
| 3 | Hoist the loop-invariant work | Hoist | Python |
| 4 | Replace a nested-loop scan with a set | Set membership | Go |
| 5 | Write a test that asserts the call count | Guard test | Python |
| 6 | Chunk an oversized batch | Paginated batching | Go |
Exercise 1 — Convert per-item calls to a batch¶
Cure: bulk call + key de-duplication. Goal: turn 1 + N RPCs into 2. Constraint: identical enrichment result; fetch each distinct user only once.
// Before — one GetUser RPC per order. 400 orders, 12 ms each ≈ 4.8 s.
func enrich(ctx context.Context, orders []Order) error {
for i := range orders {
u, err := userClient.GetUser(ctx, orders[i].UserID) // +N RPCs
if err != nil {
return err
}
orders[i].User = u
}
return nil
}
Write the batched version (assume userClient.GetUsers(ctx, []int64) ([]User, error) exists).
Solution
// After — 1 batched RPC regardless of order count.
func enrich(ctx context.Context, orders []Order) error {
seen := make(map[int64]struct{})
ids := make([]int64, 0, len(orders))
for _, o := range orders {
if _, ok := seen[o.UserID]; !ok { // de-dupe: distinct ids only
seen[o.UserID] = struct{}{}
ids = append(ids, o.UserID)
}
}
users, err := userClient.GetUsers(ctx, ids) // 1 batched RPC
if err != nil {
return err
}
byID := make(map[int64]User, len(users))
for _, u := range users {
byID[u.ID] = u
}
for i := range orders {
orders[i].User = byID[orders[i].UserID] // in-memory, O(1)
}
return nil
}
Exercise 2 — Preload a lookup map before a loop¶
Cure: preload into a map (Fowler's Identity Map, by hand). Goal: turn 1 + N queries into 2. Constraint: no batch method exists, but you can widen a query to WHERE id IN (...).
// Before — 1 query for products + N queries for categories.
List<Product> products = productRepo.findAll(); // 1
for (Product p : products) {
Category c = categoryRepo.findById(p.getCategoryId()); // +N
p.setCategory(c);
}
Rewrite assuming categoryRepo.findAllById(Collection<Long>) runs one WHERE id IN (...) query.
Solution
// After — preload all needed categories once; index by id.
List<Product> products = productRepo.findAll(); // 1
Set<Long> ids = products.stream()
.map(Product::getCategoryId)
.collect(Collectors.toSet()); // distinct keys
Map<Long, Category> categories = categoryRepo.findAllById(ids).stream() // 1
.collect(Collectors.toMap(Category::getId, c -> c));
for (Product p : products) {
p.setCategory(categories.get(p.getCategoryId())); // in-memory
}
Exercise 3 — Hoist the loop-invariant work¶
Cure: hoist invariant computation out of the loop. Goal: stop recomputing a value that doesn't depend on the item. Constraint: identical output; only move work that's truly loop-invariant.
# Before — recompiles the regex AND reloads the config every iteration.
import re
def filter_messages(messages, config):
out = []
for m in messages:
pattern = re.compile(config.get_banned_regex()) # recompiled N times
max_len = config.get_max_length() # re-fetched N times
if len(m.text) <= max_len and not pattern.search(m.text):
out.append(m)
return out
Solution
**Regex compilations: `N` → `1`; config reads: `N` → `1`.** For 100k messages this is 100k regex compilations down to one. The body still runs `n` times (the `len`/`search` are genuinely per-item) — only the *invariant* parts moved out. **Watch for the trap:** anything that depends on `m` (the loop variable) cannot be hoisted; moving it out would be a correctness bug, not an optimization.Exercise 4 — Replace a nested-loop scan with a set¶
Cure: set membership. Goal: turn O(n × m) into O(n + m). Constraint: same filtered result.
// Before — for each event, linearly scan the blocked-IP slice. O(n × m).
func allowedEvents(events []Event, blockedIPs []string) []Event {
var out []Event
for _, e := range events { // n
blocked := false
for _, ip := range blockedIPs { // × m
if e.SourceIP == ip {
blocked = true
break
}
}
if !blocked {
out = append(out, e)
}
}
return out
}
Solution
// After — build the blocked set once; membership is O(1). Total O(n + m).
func allowedEvents(events []Event, blockedIPs []string) []Event {
blocked := make(map[string]struct{}, len(blockedIPs))
for _, ip := range blockedIPs { // m, once
blocked[ip] = struct{}{}
}
out := make([]Event, 0, len(events))
for _, e := range events { // n
if _, isBlocked := blocked[e.SourceIP]; !isBlocked { // O(1)
out = append(out, e)
}
}
return out
}
Exercise 5 — Write a test that asserts the call count¶
Cure: a guard test that locks the batched shape. Goal: a test that fails if an N+1 regresses. Constraint: assert count, not latency.
You've batched enrich_orders to call user_service.get_many(ids) once and never user_service.get(id) per item. Write a test that protects this.
Solution
# Guard test — asserts the SHAPE: one batch call, zero per-item calls.
def test_enrich_orders_is_batched(mocker):
batch = mocker.spy(user_service, "get_many") # the batched call
per_item = mocker.spy(user_service, "get") # must NOT be used in a loop
orders = make_orders(count=200) # 200 orders, 25 distinct users
enrich_orders(orders)
assert batch.call_count == 1, \
f"expected 1 batched call, got {batch.call_count}"
assert per_item.call_count == 0, \
f"N+1 regressed: {per_item.call_count} per-item calls"
# Optional: prove de-dupe — the batch received distinct ids only.
(ids_arg,), _ = batch.call_args
assert len(ids_arg) == len(set(ids_arg)) == 25
Exercise 6 — Chunk an oversized batch¶
Cure: paginated (chunked) batching. Goal: one IN (...) would exceed the parameter limit; split it. Constraint: same result; no query exceeds the chunk size.
// Before — one giant IN with potentially 100k+ ids: hits the param cap, errors.
func loadUsers(ctx context.Context, ids []int64) (map[int64]User, error) {
users, err := userRepo.FindByIDs(ctx, ids) // breaks if len(ids) > ~65k
if err != nil {
return nil, err
}
byID := make(map[int64]User, len(users))
for _, u := range users {
byID[u.ID] = u
}
return byID, nil
}
Rewrite so no single query exceeds 1,000 ids.
Solution
// After — chunked batching: O(n / chunk) queries, none over the limit.
func loadUsers(ctx context.Context, ids []int64) (map[int64]User, error) {
const chunk = 1000
byID := make(map[int64]User, len(ids))
for start := 0; start < len(ids); start += chunk {
end := min(start+chunk, len(ids))
users, err := userRepo.FindByIDs(ctx, ids[start:end]) // 1 query per chunk
if err != nil {
return nil, err
}
for _, u := range users {
byID[u.ID] = u
}
}
return byID, nil
}
The Pattern Behind All Six¶
Every exercise answers one question: "What expensive thing am I doing once per item that I could do once for all items?"
| Move | Turns | Into | Exercise |
|---|---|---|---|
| Bulk call + de-dupe | N RPCs | 1 RPC | 1 |
| Preload into map | 1+N queries | 2 queries | 2 |
| Hoist invariant | N computations | 1 | 3 |
| Set membership | n×m comparisons | n+m | 4 |
| Guard test | (locks the win) | red on regression | 5 |
| Chunked batching | one illegal giant call | n/chunk valid calls | 6 |
The deterministic before/after count is the proof. If the count didn't drop from O(n) to O(1) (or O(n/chunk)), you didn't fix the N+1 — you just moved it.
Related Topics¶
- N+1 in Code → middle.md — the four shapes worked through (the cures practiced here).
- N+1 in Code → senior.md — detection and the guard-test discipline behind Exercise 5.
- N+1 in Code → find-bug.md — spot the hidden N+1 (the counterpart skill).
- Wrong Data Structure → tasks.md — the set-vs-scan exercise behind Exercise 4.
- Refactoring → Refactoring Techniques — Split Loop, hoist invariant, replace loop with pipeline.
- The
database-performance,big-o-analysis, andcaching-strategiesskills.
In this topic