Interface Internals — Optimize¶
This file focuses on performance, allocation control, and clean code in the runtime layer of interfaces. Every section maps to a real cost: itab lookups, boxing allocations, comparison panics, or escape regressions.
1. Avoid boxing primitives¶
Problem¶
An int is not a pointer, but the interface data word must hold an address. Converting an int to an interface therefore heap-allocates a copy per call (the runtime serves small values 0–255 from a static table, but the general case allocates).
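A minimal reproduction of the problem; the `put` function and `sink` global are hypothetical names used to force the value to escape:

```go
package main

import "fmt"

var sink any

// put boxes its argument: storing an int in an interface forces the
// runtime to place a copy of the value behind a pointer, because the
// interface data word can only hold an address.
func put(v any) { sink = v }

func main() {
	for i := 0; i < 3; i++ {
		put(i + 1000) // in general, one heap allocation per call
	}
	fmt.Println(sink) // 1002
}
```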
Fix — specialise the hot signature¶
Two signatures cost zero allocations.
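A sketch of the specialisation; `record`/`recordInt` are illustrative names, not a real API:

```go
package main

import "fmt"

var (
	anySink any
	intSink int
)

// record is the flexible entry point — callers passing an int pay a
// boxing allocation here.
func record(v any) { anySink = v }

// recordInt is the specialised hot signature — no interface, no boxing.
func recordInt(v int) { intSink = v }

func main() {
	recordInt(42)       // hot path: zero allocations
	record("rare case") // cold path keeps the flexible signature
	fmt.Println(intSink, anySink) // 42 rare case
}
```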
Fix — pre-box once¶
If the values are constants, allocate once and reuse.
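One way to pre-box, assuming hypothetical status-code constants:

```go
package main

import "fmt"

var sink any

// Box the constants exactly once at package init. The hot loop then
// copies a ready-made two-word interface header instead of allocating.
var (
	statusOK  any = 200
	statusErr any = 500
)

func put(v any) { sink = v }

func main() {
	for i := 0; i < 1000; i++ {
		put(statusOK) // reuses the pre-boxed value: no new allocation
	}
	fmt.Println(sink) // 200
}
```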
Measurement¶
The 1 allocs/op is the boxing cost. After the fix it should be 0.
2. Prefer concrete types in hot loops¶
Problem¶
```go
type Adder interface{ Add(int) int }

func sum(items []Adder, x int) int {
	total := 0
	for _, it := range items {
		total += it.Add(x) // itab indirection per element
	}
	return total
}
```
Each call goes through itab.fun[0]. The compiler cannot inline an interface call.
Fix¶
```go
type IntAdder struct{ n int }

func (a IntAdder) Add(x int) int { return a.n + x }

func sum(items []IntAdder, x int) int {
	total := 0
	for _, it := range items {
		total += it.Add(x) // statically dispatched, often inlined
	}
	return total
}
```
Static dispatch lets the compiler inline Add and unroll. Benchmarks typically show 2–4x speedups.
When to keep the interface¶
- Heterogeneous collections (Adder, Multiplier, Logger).
- Mocking boundary in tests.
- Public API surface.
Inside a single hot function, drop the interface.
3. Reduce itab pressure with type switches¶
Problem¶
```go
if c, ok := s.(Circle); ok {
	return c.Area()
}
if r, ok := s.(Rectangle); ok {
	return r.Area()
}
if t, ok := s.(Triangle); ok {
	return t.Area()
}
```
Three separate checks, each unpacking the interface again. Asserting to a concrete type compares the type word directly; asserting to another interface type is costlier still, because the runtime must hash (Shape, X) to find or build an itab.
Fix¶
```go
switch v := s.(type) {
case Circle:
	return v.Area()
case Rectangle:
	return v.Area()
case Triangle:
	return v.Area()
}
```
The compiler lowers this to a single read of the type descriptor followed by direct comparisons, instead of unpacking the interface once per branch.
Tip¶
If you have a few hot types and a long tail, branch on the hot ones first:
```go
switch v := op.(type) {
case AddOp:
	return v.do() // 90% of traffic
case MulOp:
	return v.do()
default:
	return op.Run()
}
```
4. Stop the typed-nil escape¶
Problem¶
```go
func find(id int) (*User, error) {
	var u *User
	if err := db.Get(id, &u); err != nil {
		return nil, err // err may hold (*MyErr)(nil) — a typed nil
	}
	return u, nil
}
```
If db.Get returns a typed-nil *MyErr, the wrapped error is non-nil at the call site even though "no error" is the intent.
Fix¶
Never return *MyErr directly from a function whose result type is error. Return the error interface and produce nil by hand, or normalise typed nils to a true nil at the boundary.
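A minimal sketch of both options; `MyErr`, `lookup`, and `normalize` are illustrative names:

```go
package main

import "fmt"

type MyErr struct{ msg string }

func (e *MyErr) Error() string { return e.msg }

// lookup returns the error interface, never a *MyErr variable:
// on success it hands back a literal nil, so the interface is (nil, nil).
func lookup(ok bool) error {
	if !ok {
		return &MyErr{"not found"}
	}
	return nil // literal nil, not a typed-nil *MyErr
}

// normalize flattens a typed nil at the boundary.
func normalize(err *MyErr) error {
	if err == nil {
		return nil // strip the type word before it escapes
	}
	return err
}

func main() {
	fmt.Println(lookup(true) == nil)   // true
	fmt.Println(normalize(nil) == nil) // true: typed nil flattened
}
```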
5. Avoid allocation from method values¶
Problem¶
```go
run := w.Run    // method value — receiver captured; closure escapes to the heap
go run()
ch <- conn.Read // method value stored in a channel — same allocation
defer s.Close() // can allocate too, when the defer is not open-coded
```
A method value binds the receiver to the function. If the value escapes (channel, goroutine, slice), the runtime heap-allocates the closure.
Fix — method expression¶
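A method expression is a plain function with an explicit receiver parameter, so nothing is captured; a sketch with a hypothetical `Worker` type:

```go
package main

import "fmt"

type Worker struct{ calls int }

func (w *Worker) Run() { w.calls++ }

func main() {
	w := &Worker{}

	// Method expression: (*Worker).Run is a plain func(*Worker).
	// No receiver is captured, so there is no closure to heap-allocate;
	// the receiver is passed explicitly at each call site.
	run := (*Worker).Run
	run(w)
	fmt.Println(w.calls) // 1
}
```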
Or eliminate the closure with a struct that owns the receiver:
Profiling¶
6. Batch reflect calls outside the hot path¶
Problem¶
```go
func encode(v any) []byte {
	t := reflect.TypeOf(v) // every call repeats this work
	fields := []reflect.StructField{}
	for i := 0; i < t.NumField(); i++ {
		fields = append(fields, t.Field(i))
	}
	/* ... */
}
```
reflect.TypeOf is cheap (it just reads the type word), but Field(i) walks an internal table. Repeated calls add up.
Fix — cache the type metadata¶
```go
var fieldCache sync.Map // map[reflect.Type][]reflect.StructField

func fields(t reflect.Type) []reflect.StructField {
	if v, ok := fieldCache.Load(t); ok {
		return v.([]reflect.StructField)
	}
	out := make([]reflect.StructField, t.NumField())
	for i := range out {
		out[i] = t.Field(i)
	}
	fieldCache.Store(t, out)
	return out
}
```
encoding/json and encoding/gob both cache per-type metadata this way internally.
7. Skip reflect entirely when you can¶
Problem¶
```go
func clone(v any) any {
	rv := reflect.ValueOf(v)
	out := reflect.New(rv.Type()).Elem()
	out.Set(rv)
	return out.Interface()
}
```
Generic-looking, slow.
Fix — generics¶
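A sketch of the generic replacement — for a shallow copy like the reflect version above, a plain value copy is all that is needed:

```go
package main

import "fmt"

// Clone is the generic replacement: a plain value copy, specialised
// per shape at compile time — no reflect.Value, no boxing, no itab.
func Clone[T any](v T) T {
	return v // same shallow-copy semantics as the reflect version
}

func main() {
	type User struct{ Name string }
	u := Clone(User{Name: "ana"})
	fmt.Println(u.Name, Clone(42)) // ana 42
}
```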
No reflection, no boxing, no itab. Use reflect only when the type is genuinely unknown at compile time (decoders, schema engines).
8. Comparison without panic¶
Problem¶
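A minimal reproduction: comparing two interface values whose dynamic type is uncomparable panics at runtime, even though the comparison compiles.

```go
package main

import "fmt"

func main() {
	var a any = []int{1}
	var b any = []int{1}

	defer func() {
		// comparing interfaces holding slices panics at runtime
		fmt.Println("recovered:", recover() != nil) // recovered: true
	}()
	_ = a == b // runtime panic: comparing uncomparable type []int
}
```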
Fix — check up front¶
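One sketch of the up-front check, using reflect.Type.Comparable; the `safeEqual` helper is a hypothetical name:

```go
package main

import (
	"fmt"
	"reflect"
)

// safeEqual compares two interface values without risking a panic:
// it checks comparability of the dynamic type first.
func safeEqual(a, b any) (equal, ok bool) {
	ta, tb := reflect.TypeOf(a), reflect.TypeOf(b)
	if ta != tb {
		return false, true // different dynamic types: unequal, but safe
	}
	if ta != nil && !ta.Comparable() {
		return false, false // == would panic; report "cannot compare"
	}
	return a == b, true
}

func main() {
	fmt.Println(safeEqual(1, 1))               // true true
	fmt.Println(safeEqual([]int{1}, []int{1})) // false false
}
```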
Or normalise to a stable key:
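One option is a string key via fmt.Sprintf — it costs an allocation but never panics; the `key` helper is illustrative:

```go
package main

import "fmt"

// key normalises an arbitrary value to a comparable string:
// %T pins the dynamic type, %v the value.
func key(v any) string {
	return fmt.Sprintf("%T:%v", v, v)
}

func main() {
	seen := map[string]bool{}
	seen[key([]int{1, 2})] = true       // slices become safe map keys
	fmt.Println(seen[key([]int{1, 2})]) // true
}
```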
For known shapes, hand-roll a key:
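A sketch of a hand-rolled key for a hypothetical shape cache:

```go
package main

import "fmt"

// shapeKey is a hand-rolled comparable key for a known shape:
// plain comparable fields only, so it is a valid map key and
// comparing it can never panic.
type shapeKey struct {
	kind string
	w, h int
}

func main() {
	cache := map[shapeKey]float64{}
	k := shapeKey{kind: "rect", w: 3, h: 4}
	cache[k] = 12.0
	fmt.Println(cache[shapeKey{kind: "rect", w: 3, h: 4}]) // 12
}
```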
Hand-rolled keys are zero-allocation and never panic.
9. Generics over any for performance¶
Problem¶
```go
func Sum(values []any) any {
	total := 0
	for _, v := range values {
		total += v.(int) // type assertion + panic risk
	}
	return total
}
```
Every element is boxed, and every iteration pays a checked type assertion (with a panic on mixed input).
Fix¶
```go
func Sum[T int | float64](values []T) T {
	var total T
	for _, v := range values {
		total += v
	}
	return total
}
```
The compiler generates one specialised version per T shape. No boxing. No itab. The inner loop becomes a tight scalar add.
10. Watch out for interface { ... } as a tag¶
Problem¶
```go
type Tag interface{ tag() }

type A struct{ n int }

func (A) tag() {}

type B struct{ n int }

func (B) tag() {}

var things []Tag
for _, x := range raw {
	things = append(things, A{n: x}) // boxes each element
}
```
A "marker interface" forces every concrete value into an iface header.
Fix — use a sum type pattern¶
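A sketch of the sum-type pattern; the `node`/`kind` names are illustrative:

```go
package main

import "fmt"

type kind uint8

const (
	kindA kind = iota
	kindB
)

// node replaces the Tag marker interface: one concrete struct,
// variant selected by the kind field — no iface header, no boxing.
type node struct {
	kind kind
	n    int
}

func main() {
	things := make([]node, 0, 3)
	for _, x := range []int{1, 2, 3} {
		things = append(things, node{kind: kindA, n: x}) // no per-element allocation
	}
	total := 0
	for _, t := range things {
		switch t.kind { // branch on kind instead of a type switch
		case kindA:
			total += t.n
		case kindB:
			total -= t.n
		}
	}
	fmt.Println(total) // 6
}
```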
A single struct, no boxing, branch on kind. Slightly more memory per element, no allocations during iteration.
The instruction encoding in the standard library's regexp/syntax (a struct with an Op field) follows the same pattern.
11. Inline-friendly methods¶
Problem¶
```go
type Counter interface{ Get() int }

type intCounter int

func (c intCounter) Get() int { return int(c) }

for i := 0; i < n; i++ {
	total += counter.Get() // not inlined — interface dispatch
}
```
Interface dispatch blocks inlining.
Fix — escape the interface inside the loop¶
```go
if c, ok := counter.(intCounter); ok {
	for i := 0; i < n; i++ {
		total += c.Get() // inlined — concrete
	}
} else {
	for i := 0; i < n; i++ {
		total += counter.Get()
	}
}
```
Pay the type assertion once; let the compiler inline the rest.
12. Profile the runtime cost¶
CPU¶
Symbols to look for:
- runtime.convT64, runtime.convTstring, runtime.convTslice — boxing.
- runtime.getitab — itab lookups in a hot loop.
- runtime.assertI2T2 — failing type assertions.
- runtime.ifaceeq — interface comparisons.
Memory¶
A spike in runtime.convT* callers tells you exactly where boxing happens.
Trace¶
Look for goroutine GC waits — boxing-heavy code triggers more cycles.
13. Cheat Sheet¶
```text
ALLOCATION CONTROL
─────────────────────────────
boxing primitive  → 1 alloc + 1 itab (avoid in hot loops)
boxing pointer    → 0 alloc, data = ptr
boxing slice/map  → 1 alloc for the header

DISPATCH
─────────────────────────────
concrete call     → inlined, ~1 ns
generic call      → inlined per shape
interface call    → itab + indirect, ~3 ns
reflect call      → ~100 ns

COMPARISON
─────────────────────────────
== on iface  : type word + data word (or deep ifaceeq)
               panic if underlying type is uncomparable
typed nil    : type != nil, data == nil → not equal to nil

PROFILING SYMBOLS
─────────────────────────────
runtime.convT*    : boxing
runtime.getitab   : itab lookup
runtime.assertI2T : type assertion
runtime.ifaceeq   : iface compare
```
Summary¶
The two-word interface header is cheap, but only when you respect it:
- Do not box primitives in hot loops — specialise or use generics.
- Prefer concrete types when the static type is known — let the compiler inline.
- Collapse repeated assertions into a type switch — fewer itab lookups.
- Treat typed-nil as a bug to be eliminated at the boundary.
- Avoid method values that escape; use method expressions.
- Cache reflect metadata; never reflect inside the inner loop.
- Skip reflect entirely with generics where the type is statically known.
- Use comparable hand-rolled keys when uncomparable types reach interfaces.
- Replace marker interfaces with sum-type structs when the variant set is fixed.
- Profile with convT* and getitab symbols — they flag exactly where the runtime pays.