Builder Pattern — Optimize
1. Goal of this file
This file is about when a naïve builder is slow or wasteful, and when the fix is worth shipping. Junior taught the shape, middle taught the variants. Optimize is about the cases where a textbook builder shows up in a CPU or allocation profile and you have to do something about it.
The honest envelope: most builders construct one server at startup, one SQL query per HTTP handler, one test fixture per t.Run. At those frequencies, the pattern is essentially free — a single *Builder allocation, a handful of field writes, and a Build() call that copies into the target. Nobody notices.
It becomes visible when:
- The builder runs per request (SQL query per handler, HTTP request per outbound call, log entry per write).
- The builder does string concatenation with `+=` instead of `strings.Builder`.
- The builder value-copies on every step instead of pointer-mutating.
- The builder deep-clones maps/slices that the caller never mutates.
- The builder has a `sync.Mutex` on every step for thread-safety nobody asked for.
- The builder is rebuilt for every call when the prefix is identical.
Baseline you need to beat. From middle.md §12:
BenchmarkDirectStructInit-8 500000000 2.1 ns/op 0 B/op 0 allocs/op
BenchmarkPointerBuilder-8 20000000 54.7 ns/op 48 B/op 1 allocs/op
BenchmarkValueBuilder-8 5000000 213.5 ns/op 240 B/op 5 allocs/op
The pointer-receiver builder costs ~55 ns and one allocation. That's the number to beat — or, more often, the number that's already fine.
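Figures like these come from `go test -bench` with `b.ReportAllocs()`. As a rough cross-check outside a benchmark file, the allocations for one chain can be counted with `testing.AllocsPerRun`. The sketch below uses a hypothetical `Config`/`ConfigBuilder` pair (not the types from middle.md); the exact count depends on inlining and escape analysis, so treat the result as indicative only.

```go
package main

import "testing"

// Config is a hypothetical target type used only for this illustration.
type Config struct {
	Addr    string
	MaxConn int
}

// ConfigBuilder is a minimal pointer-receiver builder for Config.
type ConfigBuilder struct{ cfg Config }

func NewConfigBuilder() *ConfigBuilder { return &ConfigBuilder{} }

func (b *ConfigBuilder) Addr(a string) *ConfigBuilder { b.cfg.Addr = a; return b }
func (b *ConfigBuilder) MaxConn(n int) *ConfigBuilder { b.cfg.MaxConn = n; return b }

// Build copies the accumulated state into a fresh Config.
func (b *ConfigBuilder) Build() *Config {
	out := b.cfg
	return &out
}

// sink keeps each result reachable so the compiler cannot discard the work.
var sink *Config

// AllocsPerBuild reports the average number of heap allocations for one
// builder chain, as measured by testing.AllocsPerRun.
func AllocsPerBuild() float64 {
	return testing.AllocsPerRun(1000, func() {
		sink = NewConfigBuilder().Addr(":8080").MaxConn(100).Build()
	})
}
```

Calling `AllocsPerBuild()` before and after a change gives a quick signal on whether an allocation was actually removed, before investing in a full benchmark run.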
Structure of the file:
- Real wins (§3–§9): receiver choice, `strings.Builder`, lazy init, copy-on-write, removing mutexes, `sync.Pool`, prefix caching.
- Wins that aren't always wins (§10–§14): closure lists, terminal memoization, validation split, config caching, reflection vs codegen.
- Cost-benefit framing (§15).
2. Table of Contents
- Goal of this file
- Table of Contents
- Exercise 1: Value-receiver builder allocating on every step
- Exercise 2: SQL builder using `+=` instead of `strings.Builder`
- Exercise 3: Defensive slice/map allocation in `New()`
- Exercise 4: Deep-clone allocating maps caller never mutates
- Exercise 5: `sync.Mutex` on every step
- Exercise 6: `sync.Pool` for per-request HTTP request builders
- Exercise 7: SQL builder regenerating identical prefix bytes
- Exercise 8: Generic `Builder[T]` with closure-list — replace with direct field set
- Exercise 9: Multi-terminal builder recomputing the same SQL twice
- Exercise 10: Validation in `Build()` repeated per call
- Exercise 11: Config file re-parsed on every `Build()`
- Exercise 12: Reflection in `Build()` — replace with code generation
- When NOT to optimize
- The optimization checklist
- Summary
3. Exercise 1: Value-receiver builder allocating on every step
Scenario
Someone wrote the builder with value-receiver semantics (middle.md §5.2) because "it's safer / immutable / fork-friendly". The builder is never forked. Every chain step copies the entire builder struct. For a 6-step chain on the 128-byte builder below (64-bit platform), that's six full struct copies per construction, on top of the slice-growth allocations. Whether each copy stays on the stack depends on escape analysis; either way the copying buys nothing.
Before
package query
import (
"errors"
"fmt"
"strings"
)
type Builder struct {
table string
columns []string
wheres []string
args []any
orderBy string
limit int
err error
}
// Value receiver — each step returns a *copy* of the builder.
func Select(cols ...string) Builder {
return Builder{columns: cols}
}
func (b Builder) From(t string) Builder {
if b.err != nil { return b }
if t == "" { b.err = errors.New("From: empty table"); return b }
b.table = t
return b
}
func (b Builder) Where(cond string, args ...any) Builder {
if b.err != nil { return b }
b.wheres = append(b.wheres, cond)
b.args = append(b.args, args...)
return b
}
func (b Builder) OrderBy(col string) Builder {
if b.err != nil { return b }
b.orderBy = col
return b
}
func (b Builder) Limit(n int) Builder {
if b.err != nil { return b }
if n < 0 { b.err = fmt.Errorf("Limit: negative"); return b }
b.limit = n
return b
}
func (b Builder) Build() (string, []any, error) {
if b.err != nil { return "", nil, b.err }
var sb strings.Builder
sb.WriteString("SELECT ")
sb.WriteString(strings.Join(b.columns, ", "))
sb.WriteString(" FROM ")
sb.WriteString(b.table)
if len(b.wheres) > 0 {
sb.WriteString(" WHERE ")
sb.WriteString(strings.Join(b.wheres, " AND "))
}
if b.orderBy != "" {
sb.WriteString(" ORDER BY ")
sb.WriteString(b.orderBy)
}
if b.limit > 0 {
fmt.Fprintf(&sb, " LIMIT %d", b.limit)
}
return sb.String(), b.args, nil
}
Benchmark
func BenchmarkValueBuilder(b *testing.B) {
b.ReportAllocs()
for i := 0; i < b.N; i++ {
_, _, _ = Select("id", "name", "email").
From("users").
Where("active = ?", true).
Where("created_at > ?", "2024-01-01").
OrderBy("created_at DESC").
Limit(100).
Build()
}
}
The cost adds up across the six chain steps, the slice-growth allocations, and the final SQL string: eleven allocations for one query.
After
Switch the receiver to pointer. The chain semantics stay identical at the call site; only the internals change.
type Builder struct {
table string
columns []string
wheres []string
args []any
orderBy string
limit int
err error
}
// Pointer receiver — one allocation, mutations in place.
func Select(cols ...string) *Builder {
return &Builder{columns: cols}
}
func (b *Builder) From(t string) *Builder {
if b.err != nil { return b }
if t == "" { b.err = errors.New("From: empty table"); return b }
b.table = t
return b
}
func (b *Builder) Where(cond string, args ...any) *Builder {
if b.err != nil { return b }
b.wheres = append(b.wheres, cond)
b.args = append(b.args, args...)
return b
}
func (b *Builder) OrderBy(col string) *Builder {
if b.err != nil { return b }
b.orderBy = col
return b
}
func (b *Builder) Limit(n int) *Builder {
if b.err != nil { return b }
if n < 0 { b.err = fmt.Errorf("Limit: negative"); return b }
b.limit = n
return b
}
func (b *Builder) Build() (string, []any, error) {
if b.err != nil { return "", nil, b.err }
var sb strings.Builder
sb.WriteString("SELECT ")
sb.WriteString(strings.Join(b.columns, ", "))
sb.WriteString(" FROM ")
sb.WriteString(b.table)
if len(b.wheres) > 0 {
sb.WriteString(" WHERE ")
sb.WriteString(strings.Join(b.wheres, " AND "))
}
if b.orderBy != "" {
sb.WriteString(" ORDER BY ")
sb.WriteString(b.orderBy)
}
if b.limit > 0 {
fmt.Fprintf(&sb, " LIMIT %d", b.limit)
}
return sb.String(), b.args, nil
}
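The one thing the value-receiver version appears to offer is cheap forking. Even there, a plain value copy still shares the backing arrays of `wheres` and `args`, so a fork is only safe with an explicit slice copy. If some call site does need to fork, a pointer builder can expose that as a method. A minimal sketch, where `Clone` and the trimmed `Builder` are illustrative additions rather than part of the API above:

```go
package main

// Builder is a trimmed pointer-receiver query builder (illustrative only).
type Builder struct {
	table  string
	wheres []string
	args   []any
}

func From(table string) *Builder { return &Builder{table: table} }

func (b *Builder) Where(cond string, args ...any) *Builder {
	b.wheres = append(b.wheres, cond)
	b.args = append(b.args, args...)
	return b
}

// Clone returns an independent builder. The slices are copied so that
// appends on the clone can never write into the original's backing arrays.
func (b *Builder) Clone() *Builder {
	c := *b
	c.wheres = append([]string(nil), b.wheres...)
	c.args = append([]any(nil), b.args...)
	return &c
}
```

The copy cost is then paid only at the call sites that ask for a fork, instead of on every chain step.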
4. Exercise 2: SQL builder using += instead of strings.Builder
Scenario
The SQL builder accumulates the query as a string field, appending with += in each step. Every += allocates a fresh string holding the concatenation. For a 6-clause query, that's 6 string allocations whose sizes grow with each step.
Before
package query
import "fmt"
type Builder struct {
sql string // accumulated query
args []any
err error
}
func Select(cols ...string) *Builder {
return &Builder{sql: "SELECT " + columnsJoin(cols)}
}
func columnsJoin(cols []string) string {
out := ""
for i, c := range cols {
if i > 0 { out += ", " }
out += c // each += allocates
}
return out
}
func (b *Builder) From(t string) *Builder {
if b.err != nil { return b }
b.sql += " FROM " + t // allocates
return b
}
func (b *Builder) Where(cond string, args ...any) *Builder {
if b.err != nil { return b }
if !contains(b.sql, " WHERE ") {
b.sql += " WHERE " + cond
} else {
b.sql += " AND " + cond
}
b.args = append(b.args, args...)
return b
}
func (b *Builder) OrderBy(col string) *Builder {
if b.err != nil { return b }
b.sql += " ORDER BY " + col
return b
}
func (b *Builder) Limit(n int) *Builder {
if b.err != nil { return b }
b.sql += fmt.Sprintf(" LIMIT %d", n)
return b
}
func (b *Builder) Build() (string, []any, error) {
if b.err != nil { return "", nil, b.err }
return b.sql, b.args, nil
}
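// contains reports whether sub occurs in s (a hand-rolled strings.Contains,
// so this file needs no extra import).
func contains(s, sub string) bool {
for i := 0; i+len(sub) <= len(s); i++ {
if s[i:i+len(sub)] == sub { return true }
}
return false
}
Benchmark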
func BenchmarkStringConcat(b *testing.B) {
b.ReportAllocs()
for i := 0; i < b.N; i++ {
_, _, _ = Select("id", "name", "email").
From("users").
Where("active = ?", true).
Where("created_at > ?", "2024-01-01").
OrderBy("created_at DESC").
Limit(100).
Build()
}
}
Every += allocates a new string. The contains(b.sql, " WHERE ") scan adds O(n) work per Where call.
After
Accumulate fragments and assemble once in `Build()`. The fragments live in a slice; the final string is one allocation backed by `strings.Builder`.
package query
import (
"fmt"
"strings"
)
type Builder struct {
columns []string
table string
wheres []string
args []any
orderBy string
limit int
err error
}
func Select(cols ...string) *Builder {
return &Builder{columns: cols}
}
func (b *Builder) From(t string) *Builder {
if b.err != nil { return b }
b.table = t
return b
}
func (b *Builder) Where(cond string, args ...any) *Builder {
if b.err != nil { return b }
b.wheres = append(b.wheres, cond)
b.args = append(b.args, args...)
return b
}
func (b *Builder) OrderBy(col string) *Builder {
if b.err != nil { return b }
b.orderBy = col
return b
}
func (b *Builder) Limit(n int) *Builder {
if b.err != nil { return b }
b.limit = n
return b
}
func (b *Builder) Build() (string, []any, error) {
if b.err != nil { return "", nil, b.err }
var sb strings.Builder
// Pre-size the buffer based on expected output length.
sb.Grow(64 + 16*len(b.wheres))
sb.WriteString("SELECT ")
for i, c := range b.columns {
if i > 0 { sb.WriteString(", ") }
sb.WriteString(c)
}
sb.WriteString(" FROM ")
sb.WriteString(b.table)
if len(b.wheres) > 0 {
sb.WriteString(" WHERE ")
for i, w := range b.wheres {
if i > 0 { sb.WriteString(" AND ") }
sb.WriteString(w)
}
}
if b.orderBy != "" {
sb.WriteString(" ORDER BY ")
sb.WriteString(b.orderBy)
}
if b.limit > 0 {
fmt.Fprintf(&sb, " LIMIT %d", b.limit)
}
return sb.String(), b.args, nil
}
5. Exercise 3: Defensive slice/map allocation in New()
Scenario
The builder constructor allocates empty slices and maps "just in case" the caller will append. Many callers never do: they call Build() with the defaults. The allocations are pure waste.
Before
package server
type Server struct {
addr string
headers map[string]string
handlers []Handler
tags []string
}
type Builder struct {
addr string
headers map[string]string
handlers []Handler
tags []string
err error
}
// Defensive: pre-allocate everything in case caller wants to append.
func NewBuilder() *Builder {
return &Builder{
headers: make(map[string]string, 16), // allocated even if unused
handlers: make([]Handler, 0, 8),
tags: make([]string, 0, 8),
}
}
func (b *Builder) Addr(a string) *Builder { b.addr = a; return b }
func (b *Builder) Header(k, v string) *Builder {
b.headers[k] = v
return b
}
func (b *Builder) Handler(h Handler) *Builder {
b.handlers = append(b.handlers, h)
return b
}
func (b *Builder) Build() *Server {
return &Server{
addr: b.addr,
headers: b.headers,
handlers: b.handlers,
tags: b.tags,
}
}
type Handler func()
Benchmark
func BenchmarkMinimalBuild(b *testing.B) {
b.ReportAllocs()
for i := 0; i < b.N; i++ {
_ = NewBuilder().Addr(":8080").Build() // never touches headers/handlers/tags
}
}
Five allocations: builder, headers map, handlers slice, tags slice, server. For a caller that only sets Addr, only the builder and server are actually used.
After
Lazy-init: allocate slices and maps only when the corresponding method is called.
func NewBuilder() *Builder {
return &Builder{} // zero-initialized; no allocations beyond the builder itself
}
func (b *Builder) Addr(a string) *Builder { b.addr = a; return b }
func (b *Builder) Header(k, v string) *Builder {
if b.headers == nil {
b.headers = make(map[string]string, 4) // small initial size
}
b.headers[k] = v
return b
}
func (b *Builder) Handler(h Handler) *Builder {
b.handlers = append(b.handlers, h) // append handles nil slice
return b
}
func (b *Builder) Tag(t string) *Builder {
b.tags = append(b.tags, t)
return b
}
func (b *Builder) Build() *Server {
return &Server{
addr: b.addr,
headers: b.headers, // may be nil — Server should tolerate that
handlers: b.handlers,
tags: b.tags,
}
}
func BenchmarkFullBuild(b *testing.B) {
b.ReportAllocs()
for i := 0; i < b.N; i++ {
_ = NewBuilder().
Addr(":8080").
Header("X-A", "1").
Header("X-B", "2").
Handler(func() {}).
Tag("prod").
Build()
}
}
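The comment on `headers` in `Build()` matters: once `NewBuilder` stops pre-allocating, the built `Server` may hold a nil map and nil slices. In Go, reads, `len`, and `range` on nil maps and slices are safe; only a write to a nil map panics. A small sketch of accessors that tolerate nil (the `HandlerCount` and `SetHeader` methods are hypothetical, added only to show the read and write paths):

```go
package main

type Handler func()

type Server struct {
	headers  map[string]string
	handlers []Handler
}

// Header reads from the map; indexing a nil map yields the zero value.
func (s *Server) Header(k string) string { return s.headers[k] }

// HandlerCount works on a nil slice; len of a nil slice is 0.
func (s *Server) HandlerCount() int { return len(s.handlers) }

// SetHeader is the one place that must guard: writing to a nil map panics.
func (s *Server) SetHeader(k, v string) {
	if s.headers == nil {
		s.headers = make(map[string]string, 4)
	}
	s.headers[k] = v
}
```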
6. Exercise 4: Deep-clone allocating maps caller never mutates
Scenario
Following the "Build copies in" rule from junior.md §10.2, the builder deep-copies its maps into the resulting Server. The Server never mutates them. The defensive copy is pure waste.
Before
package server
type Server struct {
headers map[string]string
}
type Builder struct {
headers map[string]string
}
func NewBuilder() *Builder {
return &Builder{headers: make(map[string]string, 16)}
}
func (b *Builder) Header(k, v string) *Builder {
b.headers[k] = v
return b
}
func (b *Builder) Build() *Server {
// Defensive deep copy — costs O(n)
h := make(map[string]string, len(b.headers))
for k, v := range b.headers {
h[k] = v
}
return &Server{headers: h}
}
func (s *Server) Header(k string) string { return s.headers[k] } // read-only
Benchmark
func BenchmarkBuildWithDeepCopy(b *testing.B) {
b.ReportAllocs()
var bld *Builder
for i := 0; i < b.N; i++ {
bld = NewBuilder()
for j := 0; j < 20; j++ {
bld.Header(fmt.Sprintf("X-%d", j), "v")
}
_ = bld.Build()
}
}
The deep copy in Build() accounts for about 2,500 ns of the total, roughly a third.
After
Copy-on-write: hand the map to the Server *by reference*, and clone it only if the builder is mutated again after Build(). Since this scenario never reuses the builder and the Server is read-only, the clone never happens.
type Server struct {
headers map[string]string
}
type Builder struct {
headers map[string]string
builtOnce bool // marks the builder as consumed
}
func NewBuilder() *Builder {
return &Builder{headers: make(map[string]string, 16)}
}
func (b *Builder) Header(k, v string) *Builder {
if b.builtOnce {
// Builder was already consumed; the map is owned by a Server.
// Clone now to avoid corrupting that Server.
m := make(map[string]string, len(b.headers)+1)
for k, v := range b.headers { m[k] = v }
b.headers = m
b.builtOnce = false
}
b.headers[k] = v
return b
}
func (b *Builder) Build() *Server {
// Hand ownership of the map to the Server. No copy.
b.builtOnce = true
return &Server{headers: b.headers}
}
func (s *Server) Header(k string) string { return s.headers[k] }
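The behaviour worth pinning down with a test is that a Server built earlier stays unchanged when the builder is reused. The sketch below restates the copy-on-write builder compactly. It uses `maps.Clone` for the copy, which assumes Go 1.21 or newer, and it allocates the map lazily, which is a small departure from the version above; the `shared` field name is also an illustrative choice.

```go
package main

import "maps"

type Server struct{ headers map[string]string }

func (s *Server) Header(k string) string { return s.headers[k] }

type Builder struct {
	headers map[string]string
	shared  bool // true while a built Server also holds b.headers
}

func NewBuilder() *Builder { return &Builder{} }

func (b *Builder) Header(k, v string) *Builder {
	if b.headers == nil {
		b.headers = make(map[string]string, 4)
		b.shared = false
	} else if b.shared {
		// A Server owns the current map; write into a private copy instead.
		b.headers = maps.Clone(b.headers)
		b.shared = false
	}
	b.headers[k] = v
	return b
}

// Build hands the map to the Server without copying it.
func (b *Builder) Build() *Server {
	b.shared = true
	return &Server{headers: b.headers}
}
```

Note that this only protects against the builder mutating the map; if the Server ever gains a write path, it would need its own guard.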
7. Exercise 5: sync.Mutex on every step
Scenario
Someone added a sync.Mutex to the builder "to make it safe". The mutex is taken on every step method. No code path actually uses the builder concurrently, so the lock is pure overhead.
Before
package query
import (
"fmt"
"strings"
"sync"
)
type Builder struct {
mu sync.Mutex
table string
columns []string
wheres []string
args []any
err error
}
func Select(cols ...string) *Builder {
return &Builder{columns: cols}
}
func (b *Builder) From(t string) *Builder {
b.mu.Lock()
defer b.mu.Unlock()
if b.err != nil { return b }
b.table = t
return b
}
func (b *Builder) Where(cond string, args ...any) *Builder {
b.mu.Lock()
defer b.mu.Unlock()
if b.err != nil { return b }
b.wheres = append(b.wheres, cond)
b.args = append(b.args, args...)
return b
}
func (b *Builder) Build() (string, []any, error) {
b.mu.Lock()
defer b.mu.Unlock()
if b.err != nil { return "", nil, b.err }
var sb strings.Builder
sb.WriteString("SELECT ")
sb.WriteString(strings.Join(b.columns, ", "))
sb.WriteString(" FROM ")
sb.WriteString(b.table)
if len(b.wheres) > 0 {
sb.WriteString(" WHERE ")
sb.WriteString(strings.Join(b.wheres, " AND "))
}
return sb.String(), b.args, nil
}
Benchmark
func BenchmarkLockedBuilder(b *testing.B) {
b.ReportAllocs()
for i := 0; i < b.N; i++ {
_, _, _ = Select("id").
From("users").
Where("a = ?", 1).
Where("b = ?", 2).
Where("c = ?", 3).
Build()
}
}
sync.(*Mutex).Lock and sync.(*Mutex).Unlock show up prominently in the per-call CPU profile, at roughly 25% of total time.
After
Remove the mutex. Document that the builder is single-threaded (which it always was, by design; see middle.md §13.5).
type Builder struct {
table string
columns []string
wheres []string
args []any
err error
}
func Select(cols ...string) *Builder {
return &Builder{columns: cols}
}
func (b *Builder) From(t string) *Builder {
if b.err != nil { return b }
b.table = t
return b
}
func (b *Builder) Where(cond string, args ...any) *Builder {
if b.err != nil { return b }
b.wheres = append(b.wheres, cond)
b.args = append(b.args, args...)
return b
}
func (b *Builder) Build() (string, []any, error) {
if b.err != nil { return "", nil, b.err }
var sb strings.Builder
sb.WriteString("SELECT ")
sb.WriteString(strings.Join(b.columns, ", "))
sb.WriteString(" FROM ")
sb.WriteString(b.table)
if len(b.wheres) > 0 {
sb.WriteString(" WHERE ")
sb.WriteString(strings.Join(b.wheres, " AND "))
}
return sb.String(), b.args, nil
}
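// Build-tag sketch: this file is compiled only when the race detector is off.
//go:build !race
type Builder struct { /* fields */ }
func (b *Builder) Where(cond string, args ...any) *Builder { /* no lock */ return b }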
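Removing the lock does not rule out concurrent programs; it moves the rule to the call site: one builder per goroutine. A sketch with a trimmed, hypothetical builder (all names are illustrative), where each goroutine constructs and finishes its own chain and writes to its own slot of the result slice:

```go
package main

import (
	"strings"
	"sync"
)

type Builder struct {
	table  string
	wheres []string
}

func From(t string) *Builder { return &Builder{table: t} }

func (b *Builder) Where(c string) *Builder {
	b.wheres = append(b.wheres, c)
	return b
}

func (b *Builder) Build() string {
	var sb strings.Builder
	sb.WriteString("SELECT * FROM ")
	sb.WriteString(b.table)
	if len(b.wheres) > 0 {
		sb.WriteString(" WHERE ")
		sb.WriteString(strings.Join(b.wheres, " AND "))
	}
	return sb.String()
}

// BuildConcurrently runs n goroutines. Each one owns its builder from
// construction to Build(), so no builder state is shared and no lock is needed.
func BuildConcurrently(n int) []string {
	out := make([]string, n)
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			out[i] = From("users").Where("a = ?").Where("b = ?").Build()
		}(i)
	}
	wg.Wait()
	return out
}
```

Running such code under `go test -race` is a reasonable way to confirm that no builder leaks across goroutines.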
8. Exercise 6: sync.Pool for per-request HTTP request builders
Scenario
An outbound HTTP client builds a request per call with the builder pattern. The builder itself is a single allocation, but at 100k requests/sec that's 100k builder allocations/sec hitting the GC.
Before
package httpx
import (
"io"
"net/http"
"strings"
)
type RequestBuilder struct {
method string
url string
headers http.Header
body io.Reader
err error
}
func NewRequest(method, url string) *RequestBuilder {
return &RequestBuilder{
method: method,
url: url,
headers: make(http.Header, 8),
}
}
func (b *RequestBuilder) Header(k, v string) *RequestBuilder {
if b.err != nil { return b }
b.headers.Set(k, v)
return b
}
func (b *RequestBuilder) Body(body io.Reader) *RequestBuilder {
if b.err != nil { return b }
b.body = body
return b
}
func (b *RequestBuilder) Build() (*http.Request, error) {
if b.err != nil { return nil, b.err }
req, err := http.NewRequest(b.method, b.url, b.body)
if err != nil { return nil, err }
for k, vs := range b.headers {
for _, v := range vs { req.Header.Add(k, v) }
}
return req, nil
}
Benchmark
func BenchmarkPerRequestBuilder(b *testing.B) {
b.ReportAllocs()
body := strings.NewReader("payload")
for i := 0; i < b.N; i++ {
_, _ = NewRequest("POST", "https://api.example.com/users").
Header("Content-Type", "application/json").
Header("X-Trace-ID", "abc123").
Header("X-Tenant", "t1").
Body(body).
Build()
}
}
The builder, the headers map, and the http.Request itself dominate the per-call allocations.
After
Pool the builder. Reset it between uses. The http.Request still allocates (you can't pool that without owning the http.Client), but the builder's allocation cost is amortized.
package httpx
import (
"io"
"net/http"
"sync"
)
type RequestBuilder struct {
method string
url string
headers http.Header
body io.Reader
err error
}
func (b *RequestBuilder) reset() {
b.method = ""
b.url = ""
b.body = nil
b.err = nil
// Keep headers map; clear contents.
for k := range b.headers {
delete(b.headers, k)
}
}
var requestBuilderPool = sync.Pool{
New: func() any {
return &RequestBuilder{headers: make(http.Header, 8)}
},
}
func AcquireRequest(method, url string) *RequestBuilder {
b := requestBuilderPool.Get().(*RequestBuilder)
b.method = method
b.url = url
return b
}
func ReleaseRequest(b *RequestBuilder) {
b.reset()
requestBuilderPool.Put(b)
}
func (b *RequestBuilder) Header(k, v string) *RequestBuilder {
if b.err != nil { return b }
b.headers.Set(k, v)
return b
}
func (b *RequestBuilder) Body(body io.Reader) *RequestBuilder {
if b.err != nil { return b }
b.body = body
return b
}
func (b *RequestBuilder) Build() (*http.Request, error) {
if b.err != nil { return nil, b.err }
req, err := http.NewRequest(b.method, b.url, b.body)
if err != nil { return nil, err }
for k, vs := range b.headers {
for _, v := range vs { req.Header.Add(k, v) }
}
return req, nil
}
// Usage
func makeRequest() (*http.Request, error) {
b := AcquireRequest("POST", "https://api.example.com/users")
defer ReleaseRequest(b)
return b.
Header("Content-Type", "application/json").
Header("X-Trace-ID", "abc123").
Header("X-Tenant", "t1").
Build()
}
// Variant: pool only the headers map instead of the whole builder.
var headersPool = sync.Pool{
New: func() any { h := make(http.Header, 8); return &h },
}
func NewRequest(method, url string) *RequestBuilder {
h := headersPool.Get().(*http.Header)
return &RequestBuilder{method: method, url: url, headers: *h}
}
// Caller calls Release after Build() returns:
func (b *RequestBuilder) Release() {
for k := range b.headers { delete(b.headers, k) }
h := b.headers
headersPool.Put(&h)
}
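With explicit acquire and release calls, the main risk is a caller that forgets to release the builder or keeps using it afterwards. One way to narrow that risk, sketched here as an assumption rather than as part of the API above, is a helper that scopes the borrow to a callback. `BuildPooled` and `configure` are hypothetical names, and the request body is left out for brevity.

```go
package main

import (
	"net/http"
	"sync"
)

type RequestBuilder struct {
	method, url string
	headers     http.Header
}

var pool = sync.Pool{
	New: func() any { return &RequestBuilder{headers: make(http.Header, 8)} },
}

func (b *RequestBuilder) Header(k, v string) *RequestBuilder {
	b.headers.Set(k, v)
	return b
}

// BuildPooled borrows a builder, lets configure fill it in, builds the
// request, and resets and returns the builder to the pool before returning.
func BuildPooled(method, url string, configure func(*RequestBuilder)) (*http.Request, error) {
	b := pool.Get().(*RequestBuilder)
	defer func() {
		for k := range b.headers {
			delete(b.headers, k)
		}
		b.method, b.url = "", ""
		pool.Put(b)
	}()
	b.method, b.url = method, url
	configure(b)
	req, err := http.NewRequest(b.method, b.url, nil)
	if err != nil {
		return nil, err
	}
	// Copy header values into the request's own map so nothing in the
	// returned request aliases the pooled builder.
	for k, vs := range b.headers {
		for _, v := range vs {
			req.Header.Add(k, v)
		}
	}
	return req, nil
}
```

The callback form costs a closure at the call site, so it trades a little of the pooling gain for safety; measure before adopting it on the hottest path.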
9. Exercise 7: SQL builder regenerating identical prefix bytes
Scenario
A service runs the same SELECT id, name, email FROM users WHERE prefix for thousands of queries per second, with different WHERE clauses. The builder rebuilds the entire prefix string every time.
Before
package userq
import (
"fmt"
"strings"
)
type Builder struct {
wheres []string
args []any
}
func NewUserQuery() *Builder { return &Builder{} }
func (b *Builder) Where(cond string, args ...any) *Builder {
b.wheres = append(b.wheres, cond)
b.args = append(b.args, args...)
return b
}
func (b *Builder) Build() (string, []any) {
var sb strings.Builder
sb.WriteString("SELECT id, name, email FROM users")
if len(b.wheres) > 0 {
sb.WriteString(" WHERE ")
for i, w := range b.wheres {
if i > 0 { sb.WriteString(" AND ") }
sb.WriteString(w)
}
}
return sb.String(), b.args
}
Benchmark
func BenchmarkUserQuery(b *testing.B) {
b.ReportAllocs()
for i := 0; i < b.N; i++ {
_, _ = NewUserQuery().
Where("active = ?", true).
Where("created_at > ?", "2024-01-01").
Build()
}
}
Most of the 160 bytes is the prefix "SELECT id, name, email FROM users WHERE " being formed anew each call.
After
Cache the prefix bytes. The builder writes them into the output buffer with a single `Write([]byte)` call.
var userQueryPrefix = []byte("SELECT id, name, email FROM users")
var whereSep = []byte(" WHERE ")
var andSep = []byte(" AND ")
func (b *Builder) Build() (string, []any) {
// Compute the exact output size up front so the buffer is allocated once
// instead of growing geometrically as fragments are appended.
size := len(userQueryPrefix)
if len(b.wheres) > 0 {
size += len(whereSep)
for i, w := range b.wheres {
if i > 0 { size += len(andSep) }
size += len(w)
}
}
var sb strings.Builder
sb.Grow(size)
sb.Write(userQueryPrefix)
if len(b.wheres) > 0 {
sb.Write(whereSep)
for i, w := range b.wheres {
if i > 0 { sb.Write(andSep) }
sb.WriteString(w)
}
}
return sb.String(), b.args
}
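The pre-sizing only pays off if the size computation matches what is actually written, so it is worth checking the two against each other. The sketch below separates the size computation into a hypothetical `sizeOf` helper and uses string constants rather than `[]byte` variables, a substitution made here for brevity; the names are illustrative.

```go
package main

import "strings"

const (
	prefix   = "SELECT id, name, email FROM users"
	whereSep = " WHERE "
	andSep   = " AND "
)

// sizeOf returns the exact byte length of the query buildSQL produces.
func sizeOf(wheres []string) int {
	size := len(prefix)
	for i, w := range wheres {
		if i == 0 {
			size += len(whereSep)
		} else {
			size += len(andSep)
		}
		size += len(w)
	}
	return size
}

// buildSQL assembles the query into a buffer grown once to sizeOf(wheres).
func buildSQL(wheres []string) string {
	var sb strings.Builder
	sb.Grow(sizeOf(wheres))
	sb.WriteString(prefix)
	for i, w := range wheres {
		if i == 0 {
			sb.WriteString(whereSep)
		} else {
			sb.WriteString(andSep)
		}
		sb.WriteString(w)
	}
	return sb.String()
}
```

A unit test asserting `len(buildSQL(w)) == sizeOf(w)` guards against the two loops drifting apart when a new clause type is added.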
10. Exercise 8: Generic Builder[T] with closure-list — replace with direct field set
Scenario
Middle.md §5.1 showed a generic Builder[T] that accumulates func(*T) closures. Each With(func) allocates one closure. For T-types where you control the package, you can replace the closure list with direct field sets, keeping the API ergonomic but eliminating the per-call closure allocations.
Before
package builderx
type Builder[T any] struct {
apply []func(*T)
err error
}
func New[T any]() *Builder[T] { return &Builder[T]{} }
func (b *Builder[T]) With(f func(*T)) *Builder[T] {
if b.err != nil { return b }
b.apply = append(b.apply, f)
return b
}
func (b *Builder[T]) Build() (*T, error) {
if b.err != nil { return nil, b.err }
var t T
for _, f := range b.apply {
f(&t)
}
return &t, nil
}
// Caller
type Server struct {
addr string
timeout time.Duration
logger *log.Logger
}
func makeServer() *Server {
s, _ := builderx.New[Server]().
With(func(s *Server) { s.addr = ":8080" }).
With(func(s *Server) { s.timeout = 30 * time.Second }).
With(func(s *Server) { s.logger = log.Default() }).
Build()
return s
}
Benchmark
func BenchmarkGenericBuilder(b *testing.B) {
b.ReportAllocs()
for i := 0; i < b.N; i++ {
_ = makeServer()
}
}
Three closures + the slice growth + the builder + the server = 8 allocations.
After
Drop the generic. Write a hand-rolled builder for `Server` with direct field setters.
package server
import (
"log"
"time"
)
type Server struct {
addr string
timeout time.Duration
logger *log.Logger
}
type Builder struct {
s Server
err error
}
func New() *Builder { return &Builder{} }
func (b *Builder) Addr(a string) *Builder { b.s.addr = a; return b }
func (b *Builder) Timeout(d time.Duration) *Builder { b.s.timeout = d; return b }
func (b *Builder) Logger(l *log.Logger) *Builder { b.s.logger = l; return b }
func (b *Builder) Build() (*Server, error) {
if b.err != nil { return nil, b.err }
out := b.s
return &out, nil
}
func makeServer() *Server {
s, _ := New().
Addr(":8080").
Timeout(30 * time.Second).
Logger(log.Default()).
Build()
return s
}
11. Exercise 9: Multi-terminal builder recomputing the same SQL twice
Scenario
The builder has multiple terminals (middle.md §7): .SQL() returns the query string for logging, .Run(ctx, db) executes it. A common pattern is:
b := query.Select(...).From(...).Where(...)
log.Println("executing:", b.SQL()) // call 1: builds SQL
rows, _ := b.Run(ctx, db) // call 2: builds SQL again
Each terminal calls the same internal assemble(). The second call redoes the work.
Before
package query
import (
"context"
"database/sql"
"strings"
)
type Builder struct {
table string
columns []string
wheres []string
args []any
err error
}
func (b *Builder) assemble() (string, []any, error) {
if b.err != nil { return "", nil, b.err }
var sb strings.Builder
sb.WriteString("SELECT ")
sb.WriteString(strings.Join(b.columns, ", "))
sb.WriteString(" FROM ")
sb.WriteString(b.table)
if len(b.wheres) > 0 {
sb.WriteString(" WHERE ")
sb.WriteString(strings.Join(b.wheres, " AND "))
}
return sb.String(), b.args, nil
}
func (b *Builder) SQL() string {
s, _, _ := b.assemble()
return s
}
func (b *Builder) Run(ctx context.Context, db *sql.DB) (*sql.Rows, error) {
s, args, err := b.assemble()
if err != nil { return nil, err }
return db.QueryContext(ctx, s, args...)
}
Benchmark
func BenchmarkLogAndRun(b *testing.B) {
b.ReportAllocs()
for i := 0; i < b.N; i++ {
q := Select("id", "name").From("users").Where("active = ?", true)
_ = q.SQL()
_ = q.SQL() // simulate logging + running
}
}
Two assembles, each doing a strings.Builder allocation and a strings.Join.
After
Memoize the result in the builder. The first `assemble()` computes; subsequent calls return the cached string.
type Builder struct {
table string
columns []string
wheres []string
args []any
err error
// memoized result
cachedSQL string
cachedArgs []any
cached bool
}
// Any mutation invalidates the cache.
func (b *Builder) From(t string) *Builder {
b.cached = false
b.table = t
return b
}
func (b *Builder) Where(cond string, args ...any) *Builder {
b.cached = false
b.wheres = append(b.wheres, cond)
b.args = append(b.args, args...)
return b
}
func (b *Builder) assemble() (string, []any, error) {
if b.err != nil { return "", nil, b.err }
if b.cached {
return b.cachedSQL, b.cachedArgs, nil
}
var sb strings.Builder
sb.WriteString("SELECT ")
sb.WriteString(strings.Join(b.columns, ", "))
sb.WriteString(" FROM ")
sb.WriteString(b.table)
if len(b.wheres) > 0 {
sb.WriteString(" WHERE ")
sb.WriteString(strings.Join(b.wheres, " AND "))
}
b.cachedSQL = sb.String()
b.cachedArgs = b.args
b.cached = true
return b.cachedSQL, b.cachedArgs, nil
}
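The property to verify is twofold: repeated terminals assemble once, and any mutation forces a fresh assembly. A compact sketch with a counter added purely for demonstration (the `assembles` field and the trimmed `Builder` are hypothetical):

```go
package main

import "strings"

type Builder struct {
	table     string
	wheres    []string
	cachedSQL string
	cached    bool
	assembles int // counts real assemblies; for demonstration only
}

func From(t string) *Builder { return &Builder{table: t} }

func (b *Builder) Where(cond string) *Builder {
	b.cached = false // any mutation invalidates the cache
	b.wheres = append(b.wheres, cond)
	return b
}

// SQL assembles the query on first use and returns the cached string afterwards.
func (b *Builder) SQL() string {
	if b.cached {
		return b.cachedSQL
	}
	b.assembles++
	var sb strings.Builder
	sb.WriteString("SELECT * FROM ")
	sb.WriteString(b.table)
	if len(b.wheres) > 0 {
		sb.WriteString(" WHERE ")
		sb.WriteString(strings.Join(b.wheres, " AND "))
	}
	b.cachedSQL, b.cached = sb.String(), true
	return b.cachedSQL
}
```

The easy mistake with this pattern is adding a new step method and forgetting the `b.cached = false` line, so a test along these lines is cheap insurance.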
12. Exercise 10: Validation in Build() repeated per call
Scenario
Build() runs O(n) validation across all fields. When the builder is constructed fresh from the same options every time, the validation is doing the same checks repeatedly. The first build proves the configuration is valid; subsequent rebuilds with the same options re-prove it.
Before
package server
import (
"errors"
"fmt"
"net"
"time"
)
type Server struct {
addr string
timeout time.Duration
maxConn int
tlsCert string
tlsKey string
}
type Builder struct {
addr string
timeout time.Duration
maxConn int
tlsCert string
tlsKey string
}
func NewBuilder() *Builder {
return &Builder{timeout: 30 * time.Second, maxConn: 100}
}
func (b *Builder) Addr(a string) *Builder { b.addr = a; return b }
func (b *Builder) Timeout(d time.Duration) *Builder { b.timeout = d; return b }
func (b *Builder) MaxConn(n int) *Builder { b.maxConn = n; return b }
func (b *Builder) TLS(cert, key string) *Builder {
b.tlsCert = cert
b.tlsKey = key
return b
}
func (b *Builder) Build() (*Server, error) {
// Validation runs every time
if b.addr == "" { return nil, errors.New("addr required") }
if _, _, err := net.SplitHostPort(b.addr); err != nil {
return nil, fmt.Errorf("addr: %w", err)
}
if b.timeout <= 0 { return nil, errors.New("timeout must be positive") }
if b.timeout > time.Hour { return nil, errors.New("timeout too large") }
if b.maxConn <= 0 { return nil, errors.New("maxConn must be positive") }
if b.maxConn > 100000 { return nil, errors.New("maxConn too large") }
if (b.tlsCert == "") != (b.tlsKey == "") {
return nil, errors.New("tlsCert and tlsKey must both be set or both empty")
}
if b.tlsCert != "" {
// imagine we also check files exist
if !fileExists(b.tlsCert) { return nil, errors.New("tlsCert file not found") }
if !fileExists(b.tlsKey) { return nil, errors.New("tlsKey file not found") }
}
return &Server{
addr: b.addr, timeout: b.timeout, maxConn: b.maxConn,
tlsCert: b.tlsCert, tlsKey: b.tlsKey,
}, nil
}
func fileExists(p string) bool { /* stat */ return true }
Benchmark
func BenchmarkRepeatedBuild(b *testing.B) {
b.ReportAllocs()
for i := 0; i < b.N; i++ {
_, _ = NewBuilder().
Addr(":8080").
Timeout(30 * time.Second).
MaxConn(1000).
TLS("cert.pem", "key.pem").
Build()
}
}
net.SplitHostPort is ~200 ns; the fileExists calls are ~200 ns each. That's ~600 ns of validation per build.
After
Split validation into "one-time" and "per-build". Validate once at startup; produce a `ValidatedConfig` that the per-call constructor trusts.
type ValidatedConfig struct {
addr string
timeout time.Duration
maxConn int
tlsCert string
tlsKey string
}
// Heavy validation, called once at startup.
func (b *Builder) Validate() (*ValidatedConfig, error) {
if b.addr == "" { return nil, errors.New("addr required") }
if _, _, err := net.SplitHostPort(b.addr); err != nil {
return nil, fmt.Errorf("addr: %w", err)
}
if b.timeout <= 0 { return nil, errors.New("timeout must be positive") }
if b.timeout > time.Hour { return nil, errors.New("timeout too large") }
if b.maxConn <= 0 { return nil, errors.New("maxConn must be positive") }
if b.maxConn > 100000 { return nil, errors.New("maxConn too large") }
if (b.tlsCert == "") != (b.tlsKey == "") {
return nil, errors.New("tlsCert and tlsKey must both be set or both empty")
}
if b.tlsCert != "" {
if !fileExists(b.tlsCert) { return nil, errors.New("tlsCert file not found") }
if !fileExists(b.tlsKey) { return nil, errors.New("tlsKey file not found") }
}
return &ValidatedConfig{
addr: b.addr, timeout: b.timeout, maxConn: b.maxConn,
tlsCert: b.tlsCert, tlsKey: b.tlsKey,
}, nil
}
// Cheap, called per request — no validation.
func (c *ValidatedConfig) NewServer() *Server {
return &Server{
addr: c.addr, timeout: c.timeout, maxConn: c.maxConn,
tlsCert: c.tlsCert, tlsKey: c.tlsKey,
}
}
// Once at startup
cfg, err := NewBuilder().Addr(":8080").Timeout(30*time.Second).MaxConn(1000).
TLS("cert.pem", "key.pem").Validate()
if err != nil { log.Fatal(err) }
// Per request
func handler() *Server { return cfg.NewServer() }
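// Variant: validate each field in its setter and run cross-field checks once in Build().
// Assumes Builder also carries err (error) and crossValidated (bool) fields.
func (b *Builder) Addr(a string) *Builder {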
if a == "" { b.err = errors.New("Addr: empty"); return b }
if _, _, err := net.SplitHostPort(a); err != nil {
b.err = fmt.Errorf("Addr: %w", err); return b
}
b.addr = a
return b
}
func (b *Builder) Build() (*Server, error) {
if b.err != nil { return nil, b.err }
if !b.crossValidated {
// Run cross-field checks once
if (b.tlsCert == "") != (b.tlsKey == "") {
return nil, errors.New("TLS cert/key mismatch")
}
b.crossValidated = true
}
return &Server{ /* ... */ }, nil
}
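Where there is no convenient startup hook, `sync.Once` gives the same validate-once behaviour lazily and is safe to call from many goroutines. A sketch with a hard-coded address and a stand-in `validate` function; every name here is hypothetical, and the counter exists only to show that the expensive path runs once.

```go
package main

import (
	"errors"
	"sync"
)

type ValidatedConfig struct{ addr string }

var (
	cfgOnce     sync.Once
	cfg         *ValidatedConfig
	cfgErr      error
	validations int // counts how often the expensive path ran; demo only
)

// validate stands in for the expensive checks (address parsing, file stats).
func validate(addr string) (*ValidatedConfig, error) {
	validations++
	if addr == "" {
		return nil, errors.New("addr required")
	}
	return &ValidatedConfig{addr: addr}, nil
}

// Config validates on first use and returns the stored result afterwards.
// A validation error is stored as well and returned on every later call.
func Config() (*ValidatedConfig, error) {
	cfgOnce.Do(func() { cfg, cfgErr = validate(":8080") })
	return cfg, cfgErr
}
```

Because the error is cached too, a bad configuration keeps failing until the process restarts, which matches the fail-at-startup intent of the split above.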
13. Exercise 11: Config file re-parsed on every Build()
Scenario
The builder is fed by a config file. The naïve implementation re-reads and re-parses the file on every Build(). For a service that rebuilds objects per request from the same config file, that's a file-system syscall and JSON parse per call.
Before
package server
import (
"encoding/json"
"fmt"
"os"
"time"
)
type fileConfig struct {
Addr string `json:"addr"`
Timeout time.Duration `json:"timeout"`
Logger string `json:"logger"`
}
type Builder struct {
configPath string
overrides map[string]any
err error
}
func FromFile(path string) *Builder {
return &Builder{configPath: path}
}
func (b *Builder) Override(k string, v any) *Builder {
if b.overrides == nil { b.overrides = make(map[string]any) }
b.overrides[k] = v
return b
}
func (b *Builder) Build() (*Server, error) {
if b.err != nil { return nil, b.err }
// Re-read on every build
data, err := os.ReadFile(b.configPath)
if err != nil { return nil, fmt.Errorf("read config: %w", err) }
var fc fileConfig
if err := json.Unmarshal(data, &fc); err != nil {
return nil, fmt.Errorf("parse config: %w", err)
}
s := &Server{
addr: fc.Addr,
timeout: fc.Timeout,
}
if v, ok := b.overrides["addr"]; ok { s.addr = v.(string) }
if v, ok := b.overrides["timeout"]; ok { s.timeout = v.(time.Duration) }
return s, nil
}
Benchmark¶
func BenchmarkFromFile(b *testing.B) {
b.ReportAllocs()
for i := 0; i < b.N; i++ {
_, _ = FromFile("/tmp/server.json").Build()
}
}
41 microseconds per Build() call, dominated by os.ReadFile (a syscall plus a buffer allocation) and json.Unmarshal.
After
Cache the parsed config. Re-parse only if the file's modtime changed (cheap stat call).
package server
import (
"encoding/json"
"fmt"
"os"
"sync"
"time"
)
type cachedConfig struct {
mtime time.Time
cfg fileConfig
}
var (
configCache = map[string]cachedConfig{}
configCacheMu sync.RWMutex
)
func loadConfig(path string) (fileConfig, error) {
// Stat first to check freshness
info, err := os.Stat(path)
if err != nil { return fileConfig{}, fmt.Errorf("stat: %w", err) }
configCacheMu.RLock()
c, ok := configCache[path]
configCacheMu.RUnlock()
if ok && c.mtime.Equal(info.ModTime()) {
return c.cfg, nil
}
// Cache miss or stale: re-read
data, err := os.ReadFile(path)
if err != nil { return fileConfig{}, fmt.Errorf("read: %w", err) }
var fc fileConfig
if err := json.Unmarshal(data, &fc); err != nil {
return fileConfig{}, fmt.Errorf("parse: %w", err)
}
configCacheMu.Lock()
configCache[path] = cachedConfig{mtime: info.ModTime(), cfg: fc}
configCacheMu.Unlock()
return fc, nil
}
func (b *Builder) Build() (*Server, error) {
if b.err != nil { return nil, b.err }
fc, err := loadConfig(b.configPath)
if err != nil { return nil, err }
s := &Server{addr: fc.Addr, timeout: fc.Timeout}
if v, ok := b.overrides["addr"]; ok { s.addr = v.(string) }
if v, ok := b.overrides["timeout"]; ok { s.timeout = v.(time.Duration) }
return s, nil
}
14. Exercise 12: Reflection in Build() — replace with code generation¶
Scenario¶
A "framework-style" builder uses reflection in Build() to populate fields by tag. It works for any target type but pays a reflection cost on every call.
Before¶
package builderx
import (
"fmt"
"reflect"
)
type Builder struct {
values map[string]any
err error
}
func New() *Builder { return &Builder{values: map[string]any{}} }
func (b *Builder) Set(field string, value any) *Builder {
b.values[field] = value
return b
}
func (b *Builder) Build(target any) error {
v := reflect.ValueOf(target)
if v.Kind() != reflect.Ptr || v.Elem().Kind() != reflect.Struct {
return fmt.Errorf("Build: target must be *struct")
}
elem := v.Elem()
t := elem.Type()
for i := 0; i < t.NumField(); i++ {
f := t.Field(i)
tag := f.Tag.Get("builder")
if tag == "" { continue }
val, ok := b.values[tag]
if !ok { continue }
fv := elem.Field(i)
if !fv.CanSet() { continue }
rv := reflect.ValueOf(val)
// A nil value yields an invalid reflect.Value; calling Type() on it would panic.
if !rv.IsValid() || !rv.Type().AssignableTo(fv.Type()) {
return fmt.Errorf("Build: %s: type mismatch", tag)
}
fv.Set(rv)
}
return nil
}
// Caller
type Server struct {
Addr string `builder:"addr"`
Timeout time.Duration `builder:"timeout"`
}
Benchmark¶
func BenchmarkReflectionBuild(b *testing.B) {
b.ReportAllocs()
for i := 0; i < b.N; i++ {
var s Server
_ = New().
Set("addr", ":8080").
Set("timeout", 30*time.Second).
Build(&s)
}
}
The reflection is the entire cost: reflect.ValueOf, Tag.Get, Field, Set — every call.
After
`go generate` produces a typed builder for each target type. The framework-level reflection vanishes; the generated code is direct field assignment.
//go:generate go run gen.go -type=Server
// generated_server_builder.go (DO NOT EDIT)
package server
import "time"
type ServerBuilder struct {
addr string
timeout time.Duration
addrSet, timeoutSet bool
err error
}
func NewServerBuilder() *ServerBuilder { return &ServerBuilder{} }
func (b *ServerBuilder) Addr(v string) *ServerBuilder {
b.addr = v; b.addrSet = true; return b
}
func (b *ServerBuilder) Timeout(v time.Duration) *ServerBuilder {
b.timeout = v; b.timeoutSet = true; return b
}
func (b *ServerBuilder) Build() (*Server, error) {
if b.err != nil { return nil, b.err }
s := &Server{}
if b.addrSet { s.Addr = b.addr }
if b.timeoutSet { s.Timeout = b.timeout }
return s, nil
}
// gen.go
package main
import (
"go/ast"
"go/parser"
"go/token"
"os"
"text/template"
)
const tmpl = `// Code generated. DO NOT EDIT.
package {{.Pkg}}
type {{.Name}}Builder struct {
{{- range .Fields }}
{{.LowerName}} {{.Type}}
{{.LowerName}}Set bool
{{- end }}
err error
}
func New{{.Name}}Builder() *{{.Name}}Builder { return &{{.Name}}Builder{} }
{{ range .Fields }}
func (b *{{$.Name}}Builder) {{.Name}}(v {{.Type}}) *{{$.Name}}Builder {
b.{{.LowerName}} = v
b.{{.LowerName}}Set = true
return b
}
{{ end }}
func (b *{{.Name}}Builder) Build() (*{{.Name}}, error) {
if b.err != nil { return nil, b.err }
s := &{{.Name}}{}
{{- range .Fields }}
if b.{{.LowerName}}Set { s.{{.Name}} = b.{{.LowerName}} }
{{- end }}
return s, nil
}
`
func main() {
// Parse the source, find the target type, extract fields, run template.
// (Implementation omitted for brevity; ~80 lines of go/ast.)
}
func BenchmarkGeneratedBuilder(b *testing.B) {
b.ReportAllocs()
for i := 0; i < b.N; i++ {
_, _ = NewServerBuilder().
Addr(":8080").
Timeout(30*time.Second).
Build()
}
}
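// Alternative to code generation: keep reflection, but cache each type's
// tag-to-field-index lookup so Build() skips the NumField/Tag.Get scan after the first call.
type typeInfo struct {
fieldsByTag map[string]int // tag -> field index
types []reflect.Type // types of the tagged fields, in declaration order
}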
var typeCache sync.Map // reflect.Type -> *typeInfo
func resolveType(t reflect.Type) *typeInfo {
if v, ok := typeCache.Load(t); ok { return v.(*typeInfo) }
ti := &typeInfo{fieldsByTag: map[string]int{}}
for i := 0; i < t.NumField(); i++ {
if tag := t.Field(i).Tag.Get("builder"); tag != "" {
ti.fieldsByTag[tag] = i
ti.types = append(ti.types, t.Field(i).Type)
}
}
typeCache.Store(t, ti)
return ti
}
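func (b *Builder) Build(target any) error {
v := reflect.ValueOf(target)
// Keep the target check: Elem() panics on anything that is not a pointer or interface.
if v.Kind() != reflect.Ptr || v.Elem().Kind() != reflect.Struct {
return fmt.Errorf("Build: target must be *struct")
}
elem := v.Elem()
ti := resolveType(elem.Type())
for tag, idx := range ti.fieldsByTag {
val, ok := b.values[tag]
if !ok { continue }
fv, rv := elem.Field(idx), reflect.ValueOf(val)
if !fv.CanSet() { continue }
// Set() panics on a type mismatch, so report it as an error instead.
if !rv.IsValid() || !rv.Type().AssignableTo(fv.Type()) {
return fmt.Errorf("Build: %s: type mismatch", tag)
}
fv.Set(rv)
}
return nil
}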
15. When NOT to optimize¶
The honest framing: most builders should not be optimized. The pattern is cheap. The wins exist only when:
| Condition | Threshold to bother |
|---|---|
| Builder frequency | > 10k calls/sec sustained |
| Profile shows builder methods in top 5 % CPU | Yes |
| Allocation profile shows builder closures/copies in top 10 | Yes |
| The "fix" doesn't break correctness (single-thread assumption, shared maps, COW) | Yes |
| You can write a regression test | Yes |
| The fix survives a Go version bump | Probably yes |
If you can't tick most of those, don't optimize. The builders in sqlx, squirrel, resty, protobuf-go are all "naïve" by the standards of this file — they ship because the simple version is good enough.
Specific anti-patterns to avoid:
| Anti-pattern | Why it's bad |
|---|---|
| Removing the deep-copy in Build() "for speed" without documenting the shared-state contract | Subtle aliasing bugs that survive code review |
| Switching to value-receivers "for immutability" without checking call sites | Slower (Exercise 1) and breaks chains that rely on mutation |
| Adding sync.Mutex "for safety" | Two-thirds slower (Exercise 5) for a guarantee nobody needed |
| Memoizing terminals (Exercise 9) when each builder is used once | Wasted memory for never-hit cache |
| Code generation (Exercise 12) for one or two target types | Build complexity exceeds the benefit |
| sync.Pool (Exercise 6) below 10k builds/sec | Pool overhead matches savings |
The default answer to "can we make this builder faster?" is no, it's fine. The yes cases are narrow and benchmark-justified.
16. The optimization checklist¶
Before shipping any optimization from this file:
- Baseline benchmark exists (the unoptimized builder).
- Optimized benchmark shows ≥ 2× improvement OR saves ≥ 1 allocation per call.
- pprof confirms the optimization targets a real hot spot (top 5 % CPU or top 10 allocs).
- The new code passes the same tests as the old.
- -gcflags=-m shows no unexpected escapes.
- -race is clean (especially for COW, snapshot, lock-removal patterns).
- Documentation explains the assumption the optimization makes ("do not retain after Release", "do not mutate the builder after Build").
- CI regression test (benchstat) compares against the baseline.
- Code review has signed off on the trade-off.
- The "When NOT to do this" condition from the relevant exercise has been checked.
If any item is missing, the optimization isn't ready.
17. Summary¶
The pointer-receiver builder is already fast: ~55 ns and one allocation per construction. Most optimizations in this file save 50–500 ns and 1–6 allocations. That matters at 100k QPS. It does not matter at 100 QPS.
The wins worth shipping cluster in six areas:
- Switch value-receivers to pointer-receivers (Exercise 1): eliminate per-step copies. Almost always correct.
- strings.Builder over += (Exercise 2): O(N) instead of O(N²) string assembly. Pure win.
- Lazy-init defensive slices/maps (Exercise 3): zero-cost if unused, cheap if used. Pure win.
- Cache prefix bytes / pre-size buffers (Exercise 7): single-allocation output strings. Marginal but easy.
- Pool builders that run per-request (Exercise 6): amortize the builder allocation. Real win above ~10k QPS.
- Split one-time validation from per-build construction (Exercise 10): move expensive checks off the hot path. Big win when applicable.
The wins that don't always pay off:
- sync.Pool for low-frequency callers (Exercise 6): pool overhead exceeds savings below 10k QPS.
- Memoizing terminal calls (Exercise 9): useless if each builder is used once.
- Code generation for moderate speedups (Exercise 12): build complexity isn't free; reserve for ≥ 10× wins.
- Removing defensive deep-copies (Exercise 4): introduces aliasing contracts; one bug and the whole thing leaks.
- Removing sync.Mutex (Exercise 5): correct only if you can prove single-thread usage; for public APIs the lock is the price of safety.
Always benchmark. Always check -race. Always confirm the optimization survives a Go version bump. Most production codebases need none of these optimizations; the pattern is fine as written in junior.md and middle.md.
Further reading¶
- sync.Pool: https://pkg.go.dev/sync#Pool
- strings.Builder: https://pkg.go.dev/strings#Builder
- Escape analysis: https://github.com/golang/go/wiki/CompilerOptimizations
- benchstat: https://pkg.go.dev/golang.org/x/perf/cmd/benchstat
- Sibling: middle.md (variant choices)
- Sibling: junior.md (the baseline shape)
- Related: 01-functional-options/optimize.md (same shape of file for the alternative pattern)
- Inspiration (zero-allocation patterns): https://github.com/Masterminds/squirrel
- Inspiration (codegen builders): https://github.com/ent/ent