Proxy: Optimization Exercises

Each exercise shows a working proxy and a measurable improvement. Numbers are illustrative (Go 1.22, typical hardware); reproduce them on your own machine with go test -bench=. -benchmem.

Exercise 1: Caching proxy on a hot read path

Before — every read hits the backend:

func (s *Store) Read(id string) (User, error) {
    return s.db.QueryUser(id) // ~2ms per call
}

After — a caching proxy with a bounded LRU:

func (c *CachingStore) Read(id string) (User, error) {
    if u, ok := c.lru.Get(id); ok {
        return u, nil // ~50ns
    }
    u, err := c.real.Read(id)
    if err == nil {
        c.lru.Add(id, u)
    }
    return u, err
}

Metric	Before	After (90% hit)
Mean read latency	2ms	~0.2ms
Backend QPS at 10k req/s	10,000	1,000

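Break-even: as a rule of thumb, below ~20% hit rate the cache's memory, eviction churn, and invalidation complexity outweigh the savings, even though the ~50ns lookup itself is negligible next to a ~2ms backend call. Measure your hit rate before keeping the proxy.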
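The figures in the table follow from a simple expected-cost model, sketched here. The function names and the model (every request pays the cache lookup, and misses additionally pay the backend call) are assumptions made for illustration, not code from the exercise. On latency alone the break-even hit rate is lookup/backend, which is tiny for these numbers, so the ~20% figure is best read as a rule of thumb that also prices in memory and invalidation cost.

```go
package main

import (
	"fmt"
	"math"
	"time"
)

// meanLatency models a caching proxy in which every request pays the cache
// lookup and only misses additionally pay the backend call.
func meanLatency(hitRate float64, lookup, backend time.Duration) time.Duration {
	missCost := math.Round((1 - hitRate) * float64(backend))
	return lookup + time.Duration(missCost)
}

// backendQPS is the share of the request rate that still reaches the backend.
func backendQPS(requestQPS, hitRate float64) float64 {
	return requestQPS * (1 - hitRate)
}

// latencyBreakEven is the hit rate h at which cached and uncached mean latency
// are equal: lookup + (1-h)*backend = backend, so h = lookup/backend.
func latencyBreakEven(lookup, backend time.Duration) float64 {
	return float64(lookup) / float64(backend)
}

func main() {
	lookup, backend := 50*time.Nanosecond, 2*time.Millisecond
	fmt.Println("mean latency at 90% hits:", meanLatency(0.90, lookup, backend))
	fmt.Printf("backend QPS at 10k req/s: %.0f\n", backendQPS(10000, 0.90))
	fmt.Printf("latency-only break-even hit rate: %.4f%%\n", 100*latencyBreakEven(lookup, backend))
}
```

With the table's inputs (50ns lookup, 2ms backend, 90% hits) the model gives roughly 0.2ms mean latency and about 1,000 backend requests per second.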

Exercise 2: Collapse concurrent misses (singleflight)

Before — a cold key hit by 200 goroutines triggers 200 backend calls.

After:

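func (c *CachingStore) Read(id string) (User, error) {
    if u, ok := c.lru.Get(id); ok {
        return u, nil
    }
    // c.sf is a singleflight.Group (golang.org/x/sync/singleflight):
    // concurrent misses for the same id share one backend call.
    v, err, _ := c.sf.Do(id, func() (any, error) {
        u, err := c.real.Read(id)
        if err == nil {
            c.lru.Add(id, u)
        }
        return u, err
    })
    if err != nil {
        return User{}, err // don't depend on v's dynamic type on the error path
    }
    return v.(User), nil
}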

Metric	Before	After
Backend calls for 200 concurrent misses	200	1
Backend peak load on cache expiry	spike	flat

Exercise 3: RWMutex → sharded cache under contention

Before — a single RWMutex around the cache map; at 64 cores the write lock and reader-counter cache line contend.

After — shard the cache by hash(key) % N:

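type cacheShard struct {
    mu sync.RWMutex
    m  map[string]User // must be initialized with make before use
}
type shardedCache struct {
    shards [256]cacheShard
}
// shard selects a shard by FNV-1a hash (hash/fnv) of the key.
func (c *shardedCache) shard(key string) *cacheShard {
    h := fnv.New32a()
    h.Write([]byte(key))
    return &c.shards[h.Sum32()%256]
}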

Metric	Before	After
Read throughput @ 64 goroutines	8M ops/s	55M ops/s
Lock contention (mutex profile)	high	negligible

Each key touches only its shard, so independent keys never contend.
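A fuller sketch of the same idea, with a constructor and `Get`/`Set` methods, might look like the following. The `User` fields, the constructor, and the method names are assumptions for illustration; the shard count of 256 and the FNV hash follow the exercise.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sync"
)

// User is a placeholder for the exercise's User type.
type User struct{ Name string }

const numShards = 256

// cacheShard pairs one lock with one map, so keys that land in different
// shards never touch the same lock.
type cacheShard struct {
	mu sync.RWMutex
	m  map[string]User
}

type shardedCache struct {
	shards [numShards]cacheShard
}

// newShardedCache allocates every shard's map up front.
func newShardedCache() *shardedCache {
	c := &shardedCache{}
	for i := range c.shards {
		c.shards[i].m = make(map[string]User)
	}
	return c
}

// shard picks the shard for a key using the FNV-1a hash.
func (c *shardedCache) shard(key string) *cacheShard {
	h := fnv.New32a()
	h.Write([]byte(key)) // Write on a hash.Hash never returns an error
	return &c.shards[h.Sum32()%numShards]
}

// Get takes only the read lock of the key's shard.
func (c *shardedCache) Get(key string) (User, bool) {
	s := c.shard(key)
	s.mu.RLock()
	defer s.mu.RUnlock()
	u, ok := s.m[key]
	return u, ok
}

// Set takes only the write lock of the key's shard.
func (c *shardedCache) Set(key string, u User) {
	s := c.shard(key)
	s.mu.Lock()
	defer s.mu.Unlock()
	s.m[key] = u
}

func main() {
	c := newShardedCache()
	c.Set("u1", User{Name: "Ada"})
	u, ok := c.Get("u1")
	fmt.Println(u.Name, ok)
}
```

This sketch is unbounded. A bounded variant would give each shard its own eviction policy, which keeps eviction work under the shard lock as well.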

Exercise 4: Warm a virtual proxy to remove cold-start spikes

Before — a lazy proxy dials the DB on the first request; that request pays ~300ms.

After — warm during readiness:

func (s *Server) Ready() error {
    return s.lazyDB.Ping() // forces lazy init before traffic
}

Metric	Before	After
First-request latency	300ms	normal (~2ms)
Cold-start error rate during deploy	spikes	flat

The proxy stays lazy in code; you just control when the cost lands.
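One way such a lazy proxy could be structured is sketched below, assuming `sync.Once` guards the dial and `Ping` simply forces it. The `lazyDB` internals, the `conn` type, and the injected `dial` function are hypothetical; the exercise only shows the `Ping` call.

```go
package main

import (
	"fmt"
	"sync"
)

// conn is a hypothetical stand-in for a real database connection.
type conn struct{ addr string }

// lazyDB is a virtual proxy: it dials on first use, not at construction.
type lazyDB struct {
	once sync.Once
	dial func() (*conn, error) // injected so the sketch runs without a real DB
	c    *conn
	err  error
}

// get dials at most once. Note that sync.Once also remembers a failed dial,
// so a real proxy would need its own retry policy.
func (l *lazyDB) get() (*conn, error) {
	l.once.Do(func() { l.c, l.err = l.dial() })
	return l.c, l.err
}

// Ping forces initialization; a readiness probe can call it before traffic.
func (l *lazyDB) Ping() error {
	_, err := l.get()
	return err
}

// Query uses the connection, dialing first only if nobody has yet.
func (l *lazyDB) Query(q string) (string, error) {
	c, err := l.get()
	if err != nil {
		return "", err
	}
	return c.addr + ": " + q, nil
}

func main() {
	dials := 0
	db := &lazyDB{dial: func() (*conn, error) {
		dials++
		return &conn{addr: "db:5432"}, nil
	}}
	fmt.Println("dials before readiness:", dials)
	if err := db.Ping(); err != nil {
		fmt.Println("not ready:", err)
		return
	}
	out, _ := db.Query("SELECT 1")
	fmt.Println(out, "| dials:", dials)
}
```

Construction stays cheap, and the readiness check decides when the dial cost is paid; the first real query then finds the connection already open.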

Exercise 5: Avoid interface-dispatch cost on a 10M-call inner loop

Before — a logging proxy wraps a method called in a tight loop 10M times; each call pays interface dispatch + a log call guarded by a level check.

After — move the proxy to a coarser boundary (per request, not per inner-loop iteration), and gate logging with an atomic level check, not a function call.

Metric	Before	After
Loop wall time	420ms	60ms
Allocations (log formatting)	10M	0 (skipped)

Lesson: proxies belong at coarse boundaries; proxying a hot inner loop multiplies indirection cost.
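A minimal sketch of the "after" shape, under assumed names: the hot loop is a plain function with no proxy in it, the logging proxy wraps the per-request `Summer` interface, and the log call is gated by a single atomic load so no log arguments are built when debug logging is off. `Summer`, `loggingSummer`, `sumSquares`, and the level constants are hypothetical.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

const (
	levelInfo int32 = iota
	levelDebug
)

// logLevel is read with one atomic load per request.
var logLevel atomic.Int32

// sumSquares is the hot inner loop: a concrete function, no proxy, no logging.
func sumSquares(xs []int) int {
	total := 0
	for _, x := range xs {
		total += x * x
	}
	return total
}

// Summer is the coarse, per-request boundary that the proxy wraps.
type Summer interface {
	Sum(xs []int) int
}

type realSummer struct{}

func (realSummer) Sum(xs []int) int { return sumSquares(xs) }

// loggingSummer is the logging proxy, placed around the whole request.
type loggingSummer struct {
	next Summer
	logf func(format string, args ...any)
}

func (p loggingSummer) Sum(xs []int) int {
	total := p.next.Sum(xs) // one interface dispatch per request, not per element
	if logLevel.Load() >= levelDebug {
		// Arguments are only boxed and formatted when debug logging is on.
		p.logf("sum over %d items = %d", len(xs), total)
	}
	return total
}

func main() {
	var s Summer = loggingSummer{
		next: realSummer{},
		logf: func(format string, args ...any) { fmt.Printf(format+"\n", args...) },
	}
	fmt.Println(s.Sum([]int{1, 2, 3})) // level is info: no log line
	logLevel.Store(levelDebug)
	fmt.Println(s.Sum([]int{1, 2, 3})) // level is debug: one log line per request
}
```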

Exercise 6: Negative caching with short TTL

Before — repeated lookups for a missing key hit the backend every time (cache only stores hits).

After — cache "not found" with a short TTL (e.g., 5s) to absorb repeated misses without serving stale negatives long.

Metric	Before	After
Backend calls for a hot missing key	every request	1 per 5s

Be deliberate: negative caching trades a small staleness window for backend protection.
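A sketch of negative caching under stated assumptions: a sentinel `ErrNotFound` marks a missing key, negative entries expire after `negTTL`, positive entries never expire, and the map is unbounded. The clock is injected so the TTL can be exercised without sleeping. All names here are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"time"
)

var ErrNotFound = errors.New("not found")

// fakeClock is a manually advanced clock for the demo.
type fakeClock struct{ t time.Time }

func (f *fakeClock) Now() time.Time          { return f.t }
func (f *fakeClock) Advance(d time.Duration) { f.t = f.t.Add(d) }

type entry struct {
	val     string
	missing bool      // true for a cached "not found"
	expires time.Time // only consulted for negative entries in this sketch
}

type negCache struct {
	mu     sync.Mutex
	m      map[string]entry
	negTTL time.Duration
	now    func() time.Time
	fetch  func(key string) (string, error)
}

func newNegCache(negTTL time.Duration, now func() time.Time, fetch func(string) (string, error)) *negCache {
	return &negCache{m: make(map[string]entry), negTTL: negTTL, now: now, fetch: fetch}
}

func (c *negCache) Get(key string) (string, error) {
	c.mu.Lock()
	e, ok := c.m[key]
	c.mu.Unlock()
	if ok && !e.missing {
		return e.val, nil
	}
	if ok && c.now().Before(e.expires) {
		return "", ErrNotFound // served from the negative entry
	}
	v, err := c.fetch(key)
	c.mu.Lock()
	defer c.mu.Unlock()
	switch {
	case err == nil:
		c.m[key] = entry{val: v}
	case errors.Is(err, ErrNotFound):
		c.m[key] = entry{missing: true, expires: c.now().Add(c.negTTL)}
	}
	// Other errors (timeouts, outages) are deliberately not cached.
	return v, err
}

func main() {
	clock := &fakeClock{}
	calls := 0
	c := newNegCache(5*time.Second, clock.Now, func(string) (string, error) {
		calls++
		return "", ErrNotFound
	})
	for i := 0; i < 3; i++ {
		c.Get("ghost") // only the first lookup reaches the backend
	}
	clock.Advance(6 * time.Second) // negative entry has now expired
	c.Get("ghost")
	fmt.Println("backend calls:", calls)
}
```

Only a definite "not found" is cached here. Caching transient failures the same way would turn a brief outage into a TTL-long one for that key.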

Measurement checklist

Measure hit rate before keeping a caching proxy (break-even ~20%).
Add singleflight where concurrent misses are possible.
Shard the cache only if the mutex profile shows contention.
Warm lazy proxies if first-call latency matters.
Keep proxies at coarse boundaries, not hot inner loops.
Bound the cache and expose size/hit-rate metrics.
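For the hit-rate items in this checklist, a small set of atomic counters is usually enough to start with. The sketch below assumes the caching proxy calls `Hit` or `Miss` on each lookup; `cacheStats` and its methods are hypothetical names, and exporting the values to a metrics system is left out.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// cacheStats counts lookups without taking a lock. Share it by pointer,
// because the atomic fields must not be copied.
type cacheStats struct {
	hits   atomic.Uint64
	misses atomic.Uint64
}

func (s *cacheStats) Hit()  { s.hits.Add(1) }
func (s *cacheStats) Miss() { s.misses.Add(1) }

// HitRate returns hits / (hits + misses), or 0 before any lookup.
func (s *cacheStats) HitRate() float64 {
	h, m := s.hits.Load(), s.misses.Load()
	if h+m == 0 {
		return 0
	}
	return float64(h) / float64(h+m)
}

func main() {
	stats := &cacheStats{}
	for i := 0; i < 10; i++ {
		if i == 0 {
			stats.Miss() // the first lookup for a key misses
		} else {
			stats.Hit()
		}
	}
	fmt.Printf("hit rate: %.2f\n", stats.HitRate())
}
```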