Methods on Generic Types — Optimize¶
Table of Contents¶
- Receiver kind impact
- Method dispatch cost on generic types
- Method values and accidental heap allocation
- Inlining of generic methods
- Embedded methods and indirection
- Benchmark patterns for generic methods
- Practical optimization checklist
- Summary
Receiver kind impact¶
Choosing pointer vs value receiver on a generic type is a performance decision as well as a correctness decision.
Value receiver — copies on every call¶
Every call to Method() copies all eight int fields plus the slice header. For tiny types this is cheap; for fat structs it is wasteful.
Pointer receiver — no copy¶
A pointer receiver passes one machine word. Always cheap — but the caller must have an addressable Big[T] or a pointer.
Rule of thumb¶
| Struct size | Receiver |
|---|---|
| ≤ 16 bytes (1-2 words) | Value is fine |
| > 16 bytes | Pointer |
| Contains slices/maps/channels | Pointer (almost always) |
| Mutating | Pointer |
This rule is independent of generics. Generics inherit it unchanged.
Cost in numbers¶
Approximate per-call overhead from copying the receiver:
| Receiver size | Overhead vs pointer |
|---|---|
| 16 bytes | ~0 ns (registers) |
| 32 bytes | ~1 ns |
| 64 bytes | ~2-3 ns |
| 128 bytes | ~4-6 ns |
| 256 bytes | ~8-12 ns |
Tiny per-call costs that explode in tight loops with millions of calls.
Method dispatch cost on generic types¶
A method on a generic type is dispatched directly — no interface table, no virtual dispatch. The compiler stencils a body per GC shape and the call site goes straight to the right stencil.
Numeric path — essentially free¶
Calling c.Inc() for *Counter[int] is functionally identical to a hand-written func IncCounter(c *IntCounter) { c.n++ }. The compiler typically inlines both.
Pointer-shaped types — dictionary cost¶
For a generic method that does == or hashing on T, when T is pointer-shaped, the body cannot inline the operation — it must look it up in the runtime dictionary.
For Set[int], the map lookup uses the inlined integer hash. For Set[*Foo], the map lookup goes through the dictionary's hash function. Cost: a few extra nanoseconds per call.
Interface satisfaction — extra indirection¶
When a generic-instantiated value is held through an interface, the call goes through the interface table, not the direct dispatch:
Inside the method body, generic logic still runs as compiled. The overhead is purely the interface dispatch on the outer call.
Method values and accidental heap allocation¶
A method value captures the receiver. For pointer receivers, this typically forces the receiver onto the heap.
The escape¶
s cannot live on the stack — the closure (the method value) outlives makePusher. The compiler allocates s on the heap.
Why it matters¶
In a hot loop:
This allocates a method-value structure per iteration if the compiler cannot eliminate it. Replacement:
A 100x improvement in the inner loop is typical.
Detection¶
Use go build -gcflags="-m=2" to see escape decisions:
Look for these in performance-sensitive code.
Mitigation¶
- Don't create method values in hot loops — use direct method calls.
- Pass pointers explicitly when the method-value pattern is needed across function boundaries.
- Prefer free functions when the receiver does not need to be hidden.
Inlining of generic methods¶
The Go compiler inlines small functions including generic-type methods, but the rules are subtle.
When inlining works¶
- Method body is small (a few statements)
Tis a single concrete shape used at the call site- No type-dependent operations (equality, hashing) inside the body
- No
defer, no closures, no loops with too many iterations
For most simple methods like Get, Set, Push, Pop, Len, inlining works well.
When inlining fails¶
- Method uses generic operations (
==,<,rangeovermap[T]V) - Method calls into the runtime dictionary
- Method body is too large
- Method is exported and may be called from many shapes
Use -gcflags="-m=2" to see inlining decisions:
PGO (profile-guided optimisation)¶
Go 1.21+ introduced PGO. With a profile, the compiler can specialise hot generic functions and inline more aggressively. For a hot generic method, the speedup is typically 5-15%.
Embedded methods and indirection¶
When a generic type embeds another generic type, calling promoted methods involves an extra indirection.
type Inner[T any] struct{}
func (Inner[T]) Foo() {}
type Outer[T any] struct{ Inner[T] }
o := Outer[int]{}
o.Foo() // resolved as o.Inner.Foo()
The compiler's job is to expand o.Foo() to o.Inner.Foo() and inline the call. Most of the time this is free.
The "embedded pointer" pattern¶
A common pattern is to embed a pointer to a generic type:
Now method calls dereference the pointer first. If Inner[T] is large or shared, this saves copying. If Inner[T] is tiny, value embedding is cheaper.
Cost per embedding level¶
Each embedding level adds a tiny offset computation on access. For 1-2 levels, this is invisible. For deeply nested types (5+), call sites can become slower than direct fields.
Benchmark patterns for generic methods¶
Always benchmark before optimizing. Here are useful patterns.
Pattern 1 — Compare with hand-written¶
func BenchmarkPushGeneric(b *testing.B) {
s := &Stack[int]{}
for i := 0; i < b.N; i++ {
s.Push(i)
}
}
func BenchmarkPushHand(b *testing.B) {
var s []int
for i := 0; i < b.N; i++ {
s = append(s, i)
}
}
Result: usually within 1-3% on numeric types.
Pattern 2 — Compare with interface{} baseline¶
func BenchmarkPushIface(b *testing.B) {
var s []interface{}
for i := 0; i < b.N; i++ {
s = append(s, i)
}
}
The generic path is typically 5-10x faster than the interface path due to no boxing.
Pattern 3 — Method value vs direct call¶
func BenchmarkMethodValue(b *testing.B) {
s := &Stack[int]{}
push := s.Push
for i := 0; i < b.N; i++ { push(i) }
}
func BenchmarkDirect(b *testing.B) {
s := &Stack[int]{}
for i := 0; i < b.N; i++ { s.Push(i) }
}
The method-value version is usually slightly slower due to the captured receiver.
Pattern 4 — Pointer vs value receiver¶
Same struct, different receivers. The pointer version is typically faster for big structs.
Practical setup¶
Run with -count=10 and inspect benchstat output to filter noise.
Practical optimization checklist¶
Before merging generic code that might be performance-sensitive:
- Receiver is pointer for any method that mutates state
- Receiver is pointer for any struct larger than 16 bytes
- Method values are not created inside hot loops
-
-gcflags="-m=2"shows expected inlining for hot methods - No hidden interface satisfaction adds dispatch overhead
- Embedded types are positioned to minimize indirection
- Benchmarks compare generic, interface-based, and hand-written variants
- PGO is enabled for production builds if hot paths use generics
- No accidental escape to heap (check with
-gcflags="-m")
Summary¶
Optimizing methods on generic types comes down to the same fundamentals as optimizing classic Go methods, with two generic-specific concerns:
- GC shape stenciling can add small dictionary indirection for type-dependent operations on pointer-shaped types.
- Method values of generic methods cause the receiver to escape — avoid in hot loops.
The big wins:
- Pointer receivers for any non-trivial generic struct.
- Direct method calls instead of method values in hot paths.
- Free functions for shape-changing operations (no method dispatch overhead).
- PGO when the hot path crosses generic boundaries.
The biggest "do not panic" lesson: generic methods are usually within a few percent of hand-written code. The simplicity benefits — one implementation, type safety — almost always outweigh the small per-call cost. Optimize only when benchmarks demand it.