Optimization Workflow: Senior

1. The senior shift: decisions over techniques

At the junior and middle levels the workflow is mechanical: measure, identify, change, re-measure. At the senior level the loop is the same, but the binding constraint shifts from "how do I make this fast" to "should I be working on this at all". The senior engineer's job in performance work is to decide:

Which problem to take on.
When to stop a line of work.
Whether the optimization is worth the long-term cost.
Whether the team can sustain the change.

Every Go engineer above junior level shares the same table of techniques. What separates the senior engineer is judgment about which entry to apply, when, and at what cost.

2. Opportunity cost is the senior's lens

Engineering time is the scarcest resource on a team. The right framing for a candidate optimization is not "is it possible?" but "is it the best use of the next two weeks?"

Candidate work	Opportunity cost
Shave 5% off a function used 10× per second	Two weeks of engineer time
Add a missing index to a query that runs 10k× per second	Same two weeks
Fix a goroutine leak crashing one pod a week	Same two weeks
Build a benchmark harness that catches regressions for the next year	Same two weeks

All four are "performance work." Only one of them is the right thing to do this sprint. The senior calls that out before anyone writes code.

3. The "don't optimize" decision

There are concrete situations where the right senior move is to decline the optimization request entirely.

Situation	Why decline
The code isn't on a hot path	Confirmed by a real production profile; cold paths don't move user-visible numbers
The system is bottlenecked elsewhere	Optimizing X helps nothing if database queries dominate
The change risks correctness or readability disproportionately	The bug introduced will cost more than the latency gained
The workload will change soon	The hot function next quarter will be different
The user-visible improvement is below the threshold of perception	A 3 ms improvement in a 200 ms API response is not noticed

"Not now" or "not ever" is a valid output of the optimization workflow. Saying it requires confidence, which requires data — even a "no" needs a profile to back it up.

4. Goal-setting at the system level

Junior goals are per-benchmark. Middle goals are per-endpoint. Senior goals are per-system, expressed as SLOs:

"p99 of /checkout ≤ 200 ms at 1000 RPS, 99.9% of the time, over a rolling 28-day window."
"Idle service memory ≤ 400 MiB, peak ≤ 800 MiB."
"Per-request CPU time ≤ 8 ms at steady state."

These numbers come from the business (or are implied by user expectations) and constrain which optimizations are worth doing. A 30% improvement in a function that's not on the SLO's critical path is, at the system level, often invisible.

The senior version of "set the goal" is "set the goal in terms of what users and operators perceive."
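To make an SLO like the p99 target above concrete, here is a minimal sketch that checks one batch of latency samples against a target using the nearest-rank percentile. The names `percentile`, `meetsSLO`, `syntheticLatencies`, and `checkoutTarget` are hypothetical, and the samples are synthetic. A real SLO is evaluated by the monitoring system over the whole rolling window, not by a one-off check like this.

```go
package main

import (
	"fmt"
	"math"
	"sort"
	"time"
)

// checkoutTarget mirrors the 200 ms p99 figure from the example SLO.
const checkoutTarget = 200 * time.Millisecond

// percentile returns the nearest-rank p-th percentile (0 < p <= 100).
// It copies the input so the caller's slice is not reordered.
func percentile(samples []time.Duration, p float64) time.Duration {
	if len(samples) == 0 {
		return 0
	}
	sorted := append([]time.Duration(nil), samples...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i] < sorted[j] })

	// Nearest rank: the smallest sample with at least p% of samples at or below it.
	rank := int(math.Ceil(p * float64(len(sorted)) / 100))
	if rank < 1 {
		rank = 1
	}
	if rank > len(sorted) {
		rank = len(sorted)
	}
	return sorted[rank-1]
}

// meetsSLO reports whether the p-th percentile is within the target.
func meetsSLO(samples []time.Duration, p float64, target time.Duration) bool {
	return percentile(samples, p) <= target
}

// syntheticLatencies returns 100 made-up samples: 1 ms, 2 ms, ..., 100 ms.
func syntheticLatencies() []time.Duration {
	out := make([]time.Duration, 0, 100)
	for i := 1; i <= 100; i++ {
		out = append(out, time.Duration(i)*time.Millisecond)
	}
	return out
}

func main() {
	s := syntheticLatencies()
	fmt.Println("p99:", percentile(s, 99))
	fmt.Println("meets target:", meetsSLO(s, 99, checkoutTarget))
}
```

Framing the check around a percentile and a target, rather than a single benchmark number, keeps the question on what users perceive.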

5. The cost model

Every optimization has a four-part cost. The senior tracks all four explicitly.

Cost dimension	Example
Engineering hours	"Two weeks to land, with a 60% chance of needing a follow-up"
Ongoing maintenance	"Adds a `sync.Pool` we'll have to remember when refactoring"
Readability	"The clear loop becomes a hand-unrolled, lookup-table version"
Failure modes	"Cache invalidation bugs, retention quirks"

A 5% latency improvement that costs four weeks of work and adds a new failure mode is a bad deal in most contexts. A 5% improvement that takes 30 minutes and removes a footgun is excellent. Senior judgment is in the multiplier you put on each cost.

6. Diminishing returns, formalized

The first time through the loop on a service typically nets 20–50% improvement in the hotspot. The second pass often nets another 15–25%. The third pass tends to be 5–10%. The fourth is 2–5%. After that you are spending engineering time for variance-level wins.

The Pareto curve is steep, and the senior decision is to claim "good enough" earlier than feels comfortable. The 80/20 rule applies: get the first 80% of the available wins quickly, then stop. The remaining 20% is rarely worth its long-term cost.
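The shape of that curve can be worked through numerically. The sketch below assumes each pass removes a fraction of whatever hotspot cost is left, and uses the midpoints of the ranges quoted above (35%, 20%, 7.5%, 3.5%). Both the compounding model and the helper name `cumulative` are illustrative assumptions, not measurements.

```go
package main

import "fmt"

// cumulative returns, after each pass, the total fraction of the original
// hotspot cost that has been removed, assuming every pass's gain applies
// to the cost remaining after the previous passes.
func cumulative(gains []float64) []float64 {
	out := make([]float64, len(gains))
	remaining := 1.0
	for i, g := range gains {
		remaining *= 1 - g
		out[i] = 1 - remaining
	}
	return out
}

func main() {
	// Midpoints of the per-pass ranges quoted in the text (assumed, not measured).
	gains := []float64{0.35, 0.20, 0.075, 0.035}
	for i, c := range cumulative(gains) {
		fmt.Printf("after pass %d: %.1f%% of original hotspot cost removed\n", i+1, c*100)
	}
}
```

Under these assumptions the first two passes remove about 48% of the original cost, and the next two add fewer than six further points between them, which is the argument for stopping early.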

7. Where the next 5% comes from

When the first round of wins is gone, the second round is qualitatively different. Common second-round sources:

Source	Typical mechanism
Algorithmic recast	Replace the algorithm entirely — not optimize, rewrite
Data layout	Field reordering, SoA over AoS, denser representation
Batching	Amortize fixed costs across many items
Removing a step	Cache results, precompute at startup, eliminate redundant work
Concurrency	Parallelize a serial step, with care
Specialization	A separate fast path for the common case
Language-level move	Generics over interface, avoid reflection, inline-friendly helpers

The senior engineer notices when round-one moves stop yielding and steps back to ask "is there a different algorithm or shape entirely?" rather than continuing to inch the same loop forward.
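As one small instance of the data-layout row, the sketch below compares two orderings of the same three fields. The struct names `padded` and `packed` are hypothetical, and the byte counts in the comments assume a typical 64-bit platform where `int64` is 8-byte aligned.

```go
package main

import (
	"fmt"
	"unsafe"
)

// padded places the 8-byte field between two 1-byte fields, so the
// compiler inserts padding after each bool (24 bytes on 64-bit platforms).
type padded struct {
	active bool
	id     int64
	dirty  bool
}

// packed holds the same fields, largest first, so the two bools share
// one padded tail (16 bytes on 64-bit platforms).
type packed struct {
	id     int64
	active bool
	dirty  bool
}

// sizes reports the in-memory size of each layout in bytes.
func sizes() (paddedSize, packedSize uintptr) {
	return unsafe.Sizeof(padded{}), unsafe.Sizeof(packed{})
}

func main() {
	p, q := sizes()
	fmt.Printf("padded=%d bytes, packed=%d bytes\n", p, q)
}
```

The saving only matters when many such values sit in a slice on a hot path; for a handful of structs, the clearer field order is usually the better choice.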

8. When not to chase a profile result

A CPU profile is a snapshot. A 30-second sample of a service at 11 PM on a Tuesday is not the average workload. The senior reads profiles with these reservations:

Reservation	Implication
Sample may not be representative	Capture multiple profiles across times of day, traffic types
The hot function may be hot because of a bug	The fix is not optimization but correctness
Profile may show framework overhead	"X% is in gRPC unmarshaling" — irreducible unless you switch frameworks
The hot function may be intentionally hot	Sometimes the encryption is supposed to dominate
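The numbers may be inflated by debug code	Log arguments are evaluated before the level check, so `slog.Debug(fmt.Sprintf(...))` still allocates the formatted string when debug logging is off; an `if debug { ... }` guard avoids that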

The skill is reading the profile in context, not just sorting by sample count.
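The debug-code reservation is easy to reproduce. In this sketch, assuming the standard `log/slog` default logger (which drops debug records unless reconfigured), the hypothetical `payload` type counts how often it is formatted. The eager call pays for `fmt.Sprintf` on every request even though nothing is logged, while the guarded call does not.

```go
package main

import (
	"context"
	"fmt"
	"log/slog"
)

// payload counts how many times it is formatted, standing in for an
// expensive-to-print request object.
type payload struct{ formatted *int }

func (p payload) String() string {
	*p.formatted++
	return "payload{...}"
}

// logEager formats the message before slog can check the level.
func logEager(p payload) {
	slog.Debug(fmt.Sprintf("handling %v", p))
}

// logGuarded checks the level first, so formatting is skipped when
// debug logging is disabled.
func logGuarded(ctx context.Context, p payload) {
	if slog.Default().Enabled(ctx, slog.LevelDebug) {
		slog.Debug(fmt.Sprintf("handling %v", p))
	}
}

// formatCounts runs both variants three times and reports how often
// each one formatted its payload.
func formatCounts() (eager, guarded int) {
	ctx := context.Background()
	for i := 0; i < 3; i++ {
		logEager(payload{&eager})
		logGuarded(ctx, payload{&guarded})
	}
	return eager, guarded
}

func main() {
	e, g := formatCounts()
	fmt.Printf("eager formatted %d times, guarded formatted %d times\n", e, g)
}
```

If a profile shows formatting or allocation under a logging call, checking for this pattern before touching the surrounding code is usually the cheaper first step.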

9. The "performance is a feature" principle

The opposite of "premature optimization is the root of all evil" is "performance is a feature". Both are correct in different contexts.

Context	Which principle wins
Prototype code, internal tool	"Don't optimize prematurely"
User-facing latency-sensitive endpoint	"Performance is a feature"
Library code	Both; design for performance, but don't micro-optimize without data
Code that runs at cost-relevant scale	"Performance is a feature" (cost is the customer)
Test helpers	"Don't optimize prematurely"

Most engineers absorb one principle and apply it everywhere. The senior knows when to switch contexts.

10. Three things juniors over-rotate on

Pattern	Why it's overdone
`sync.Pool` everywhere	Adds complexity; only pays off in hot allocation paths
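`unsafe.String`/`unsafe.Slice`	Risk is real (mutating the backing bytes breaks string immutability); the allocation and copy saved are rarely worth it
Replacing `interface{}` with generics indiscriminately	Generics aren't free; Go's compiler shares one instantiation per GC shape, so method calls on pointer-shaped type arguments can go through a dictionary instead of being fully monomorphized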

The senior knows these tools and uses them sparingly. The middle engineer uses them when needed. The junior uses them everywhere.
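For reference, this is roughly what a justified `sync.Pool` looks like: one pool, one hot allocation, and a reset before reuse. It is a minimal sketch; `bufPool` and `render` are hypothetical names, and whether the pool pays off at all depends on a profile showing the buffer allocation as a real cost.

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
)

// bufPool recycles buffers between calls to render.
var bufPool = sync.Pool{
	New: func() any { return new(bytes.Buffer) },
}

// render formats a key/value pair using a pooled buffer.
func render(name string, n int) string {
	buf := bufPool.Get().(*bytes.Buffer)
	defer bufPool.Put(buf)
	buf.Reset() // a pooled buffer may still hold a previous caller's bytes

	fmt.Fprintf(buf, "%s=%d", name, n)
	return buf.String() // String copies, so the result outlives the buffer
}

func main() {
	fmt.Println(render("requests", 42))
	fmt.Println(render("errors", 0))
}
```

The `Reset` call and the copy made by `String` are the details that are easy to forget during a refactor, which is the maintenance cost the table refers to.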

11. The opportunity for system-level wins

Single-function optimizations rarely move the needle on system-level latency. The levers that do:

System-level lever	Typical impact
Removing an N+1 query pattern	10×–100× on the affected endpoint
Adding a coherent cache layer with proper invalidation	5×–50× on read-heavy paths
Switching from sync to async where the API allows	2×–10× on throughput
Replacing a slow downstream call with a batched / parallel one	2×–10× on tail latency
Reducing payload size by half (smaller JSON, gzip, schema changes)	1.5×–3× on network-bound paths
Right-sizing the connection pool or worker count	Variable; often dramatic in contention-bound services

A senior optimization pass spends as much time on the dependency graph and architecture as on hot functions.
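The first lever in the table can be illustrated without a database. In the sketch below, a hypothetical in-memory `store` counts round trips, so the per-item lookup loop and the batched lookup can be compared directly. The type, its methods, and the sample data are all assumptions made for illustration.

```go
package main

import "fmt"

// store is a stand-in for a database client that counts round trips.
type store struct {
	users      map[int]string
	roundTrips int
}

// userByID simulates one query per call.
func (s *store) userByID(id int) string {
	s.roundTrips++
	return s.users[id]
}

// usersByIDs simulates a single batched query (for example, WHERE id IN (...)).
func (s *store) usersByIDs(ids []int) map[int]string {
	s.roundTrips++
	out := make(map[int]string, len(ids))
	for _, id := range ids {
		out[id] = s.users[id]
	}
	return out
}

// namesOneByOne issues one round trip per ID.
func namesOneByOne(s *store, ids []int) []string {
	names := make([]string, 0, len(ids))
	for _, id := range ids {
		names = append(names, s.userByID(id))
	}
	return names
}

// namesBatched issues one round trip for all IDs.
func namesBatched(s *store, ids []int) []string {
	byID := s.usersByIDs(ids)
	names := make([]string, 0, len(ids))
	for _, id := range ids {
		names = append(names, byID[id])
	}
	return names
}

// roundTripCounts runs both variants against fresh stores.
func roundTripCounts(ids []int) (oneByOne, batched int) {
	data := map[int]string{1: "ana", 2: "bo", 3: "cy"}
	a, b := &store{users: data}, &store{users: data}
	namesOneByOne(a, ids)
	namesBatched(b, ids)
	return a.roundTrips, b.roundTrips
}

func main() {
	o, b := roundTripCounts([]int{1, 2, 3})
	fmt.Printf("one-by-one: %d round trips, batched: %d\n", o, b)
}
```

No amount of tuning inside `userByID` closes that gap, because the cost is in the number of round trips rather than in any single function.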

12. Pushback as a skill

Performance asks land on the senior engineer constantly. Many of them are wrong, or premature, or pointed at the wrong place. The senior is comfortable saying things like:

"Show me a profile from prod that says this function is the cost."
"What's the user-visible improvement? Not just the percentage."
"What does this break for the next engineer who reads the code?"
"What was the goal again? If we're already meeting it, why this?"
"We can do this, but the next sprint's roadmap loses item X. Is that the trade you want?"

The pushback isn't obstruction; it's the senior's contribution. Doing the wrong optimization fast is a worse outcome than doing the right one slowly.

13. The "is this regression real" judgment

When CI shows a benchmark slowed down 8%, three responses are reasonable depending on context:

Response	Reasoning
Block the PR	This benchmark is on a critical path; tolerance is < 2%
Investigate	It might be noise; re-run with `-count=20` and check the spread
Accept	The PR fixes a correctness bug; the trade is worth it

The senior decides which. Treating every regression as a block creates noise; treating none as a block creates drift. The judgment lives in knowing which benchmarks measure things that matter to users.
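A rough first-pass version of that triage can be sketched in code. The rule below is a hypothetical heuristic, not what `benchstat` computes: it treats a change as possible noise when it falls inside two sample standard deviations of the noisier run, and only then compares it with the benchmark's tolerance. The function names and sample numbers are made up for illustration.

```go
package main

import (
	"fmt"
	"math"
)

// meanStddev returns the sample mean and sample standard deviation.
// It assumes len(xs) >= 2.
func meanStddev(xs []float64) (mean, sd float64) {
	for _, x := range xs {
		mean += x
	}
	mean /= float64(len(xs))
	var sumSq float64
	for _, x := range xs {
		d := x - mean
		sumSq += d * d
	}
	return mean, math.Sqrt(sumSq / float64(len(xs)-1))
}

// triage classifies a change in ns/op between a baseline and a candidate.
// tolerance is the allowed relative slowdown, e.g. 0.02 for 2%.
func triage(base, cand []float64, tolerance float64) string {
	bm, bsd := meanStddev(base)
	cm, csd := meanStddev(cand)
	delta := (cm - bm) / bm
	noise := 2 * math.Max(bsd, csd) / bm // crude two-sigma band, relative to baseline

	switch {
	case math.Abs(delta) <= noise:
		return "investigate" // cannot tell from these samples; re-run with more
	case delta > tolerance:
		return "block"
	default:
		return "pass"
	}
}

func main() {
	steady := []float64{100, 101, 99, 100, 100}
	slower := []float64{108, 109, 107, 108, 108}
	fmt.Println(triage(steady, slower, 0.02))
}
```

The "accept" row in the table stays a human decision: no threshold can know that the PR fixes a correctness bug.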

14. Documenting why this is fast

Every non-obvious optimization gets a comment that future readers can act on.

```go
// We use a fixed-size [20]byte stack buffer because a base-10 int64 needs
// at most 20 bytes (19 digits plus sign). Stack-allocating it avoids the heap
// allocation that strconv.FormatInt would do, which mattered in
// BenchmarkEncode_NoAlloc (allocs/op dropped from 3 to 1).
//
// Do NOT replace with strconv.FormatInt without first re-running that
// benchmark and the BenchmarkRender_p99 in pkg/render.
var buf [20]byte
b := strconv.AppendInt(buf[:0], n, 10)
```

The comment names:

The technique (fixed-size stack buffer).
The reason (avoid heap allocation in a hot path).
The proof (benchmark name + measured improvement).
The hazard (the obvious-looking refactor that would silently undo it).

Future-you, six months from now, will not remember any of this. The comment is what protects the optimization from being reverted by accident.

15. The senior optimization checklist

Before declaring an optimization done, the senior verifies:

Check	Why
The numbers are reproducible (`benchstat` p < 0.05)	No win exists without statistical significance
The hot path no longer dominates pprof's `top10` output	The optimization solved the original problem, not a side issue
Functional tests still pass, including under `go test -race`	Performance isn't worth correctness
The change is reviewable	Five small commits are better than one giant one
A regression test exists	If the optimization gets undone, CI catches it
The trade-offs are documented	Readability, memory, complexity costs noted
The team understands the change	Not just the author

Skipping any item is an open invitation for the optimization to silently regress, fail, or confuse the next reader.
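For the "a regression test exists" item, one inexpensive guard is an allocation check built on `testing.AllocsPerRun`. The sketch below wraps the measurement in a plain function so it runs standalone. In a real repository the same check would more likely sit in a `_test.go` file and fail via `t.Fatalf`. The names `appendID` and `allocsPerAppend` are hypothetical.

```go
package main

import (
	"fmt"
	"math"
	"strconv"
	"testing"
)

// appendID is the optimized path under guard: it formats n into a
// caller-provided buffer instead of returning a new string.
func appendID(dst []byte, n int64) []byte {
	return strconv.AppendInt(dst, n, 10)
}

// allocsPerAppend reports the average number of heap allocations per call
// when the destination buffer already has enough capacity.
func allocsPerAppend() float64 {
	dst := make([]byte, 0, 20) // 20 bytes fits any base-10 int64, sign included
	return testing.AllocsPerRun(1000, func() {
		dst = appendID(dst[:0], math.MinInt64)
	})
}

func main() {
	if allocs := allocsPerAppend(); allocs != 0 {
		fmt.Printf("regression: appendID allocates %.0f times per call\n", allocs)
		return
	}
	fmt.Println("appendID stays allocation-free")
}
```

An allocation count is deterministic in a way wall-clock time is not, so a guard like this can block a PR without the noise problems of a timing threshold.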

16. Summary

Senior performance work is mostly judgment: about which problem to take on, which signal to trust, when to stop, and how to communicate trade-offs. The mechanical loop you learned at the junior and middle levels is still the engine, but the senior decides when to start it, what target to aim it at, and when to declare it finished. The technical tools are public; the discipline of using them well is what makes the role.