Methods on Generic Types — Optimize¶

Table of Contents¶

Receiver kind impact
Method dispatch cost on generic types
Method values and accidental heap allocation
Inlining of generic methods
Embedded methods and indirection
Benchmark patterns for generic methods
Practical optimization checklist
Summary

Receiver kind impact¶

Choosing pointer vs value receiver on a generic type is a performance decision as well as a correctness decision.

Value receiver — copies on every call¶

type Big[T any] struct {
    a, b, c, d, e, f, g, h int
    items                   []T
}

func (b Big[T]) Method() {}

Every call to Method() copies all eight int fields plus the slice header. For tiny types this is cheap; for fat structs it is wasteful.

Pointer receiver — no copy¶

func (b *Big[T]) Method() {}

A pointer receiver passes one machine word. Always cheap — but the caller must have an addressable Big[T] or a pointer.

Rule of thumb¶

Struct size	Receiver
≤ 16 bytes (1-2 words)	Value is fine
> 16 bytes	Pointer
Contains slices/maps/channels	Pointer (almost always)
Mutating	Pointer

This rule is independent of generics. Generics inherit it unchanged.

Cost in numbers¶

type T16 struct { a, b int }                       // 16 bytes
type T64 struct { a, b, c, d int; arr [4]int }    // 64 bytes

Approximate per-call overhead from copying the receiver:

Receiver size	Overhead vs pointer
16 bytes	~0 ns (registers)
32 bytes	~1 ns
64 bytes	~2-3 ns
128 bytes	~4-6 ns
256 bytes	~8-12 ns

Tiny per-call costs that explode in tight loops with millions of calls.

Method dispatch cost on generic types¶

A method on a generic type is dispatched directly — no interface table, no virtual dispatch. The compiler stencils a body per GC shape and the call site goes straight to the right stencil.

Numeric path — essentially free¶

type Counter[T ~int | ~int64] struct{ n T }
func (c *Counter[T]) Inc() { c.n++ }

Calling c.Inc() for *Counter[int] is functionally identical to a hand-written func IncCounter(c *IntCounter) { c.n++ }. The compiler typically inlines both.

Pointer-shaped types — dictionary cost¶

For a generic method that does == or hashing on T, when T is pointer-shaped, the body cannot inline the operation — it must look it up in the runtime dictionary.

func (s *Set[T]) Has(v T) bool {
    _, ok := s.m[v]
    return ok
}

For Set[int], the map lookup uses the inlined integer hash. For Set[*Foo], the map lookup goes through the dictionary's hash function. Cost: a few extra nanoseconds per call.

Interface satisfaction — extra indirection¶

When a generic-instantiated value is held through an interface, the call goes through the interface table, not the direct dispatch:

var p IntPusher = &Stack[int]{}
p.Push(1)   // interface call — slower than s.Push(1)

Inside the method body, generic logic still runs as compiled. The overhead is purely the interface dispatch on the outer call.

Method values and accidental heap allocation¶

A method value captures the receiver. For pointer receivers, this typically forces the receiver onto the heap.

The escape¶

func makePusher() func(int) {
    s := &Stack[int]{}
    return s.Push   // s escapes to the heap
}

s cannot live on the stack — the closure (the method value) outlives makePusher. The compiler allocates s on the heap.

Why it matters¶

In a hot loop:

for i := 0; i < n; i++ {
    push := s.Push       // creates a method value each iteration
    push(i)
}

This allocates a method-value structure per iteration if the compiler cannot eliminate it. Replacement:

for i := 0; i < n; i++ {
    s.Push(i)            // direct call, zero allocation
}

A 100x improvement in the inner loop is typical.

Detection¶

Use go build -gcflags="-m=2" to see escape decisions:

./main.go:12:9: s escapes to heap
./main.go:12:9: s.Push escapes to heap

Look for these in performance-sensitive code.

Mitigation¶

Don't create method values in hot loops — use direct method calls.
Pass pointers explicitly when the method-value pattern is needed across function boundaries.
Prefer free functions when the receiver does not need to be hidden.

Inlining of generic methods¶

The Go compiler inlines small functions including generic-type methods, but the rules are subtle.

When inlining works¶

Method body is small (a few statements)
T is a single concrete shape used at the call site
No type-dependent operations (equality, hashing) inside the body
No defer, no closures, no loops with too many iterations

For most simple methods like Get, Set, Push, Pop, Len, inlining works well.

When inlining fails¶

Method uses generic operations (==, <, range over map[T]V)
Method calls into the runtime dictionary
Method body is too large
Method is exported and may be called from many shapes

Use -gcflags="-m=2" to see inlining decisions:

./stack.go:7:6: can inline (*Stack[T]).Push
./stack.go:7:6: inlining call to (*Stack[int]).Push

PGO (profile-guided optimisation)¶

Go 1.21+ introduced PGO. With a profile, the compiler can specialise hot generic functions and inline more aggressively. For a hot generic method, the speedup is typically 5-15%.

go build -pgo=default.pgo ./...

Embedded methods and indirection¶

When a generic type embeds another generic type, calling promoted methods involves an extra indirection.

type Inner[T any] struct{}
func (Inner[T]) Foo() {}

type Outer[T any] struct{ Inner[T] }

o := Outer[int]{}
o.Foo()   // resolved as o.Inner.Foo()

The compiler's job is to expand o.Foo() to o.Inner.Foo() and inline the call. Most of the time this is free.

The "embedded pointer" pattern¶

A common pattern is to embed a pointer to a generic type:

type Outer[T any] struct{ *Inner[T] }

Now method calls dereference the pointer first. If Inner[T] is large or shared, this saves copying. If Inner[T] is tiny, value embedding is cheaper.

Cost per embedding level¶

Each embedding level adds a tiny offset computation on access. For 1-2 levels, this is invisible. For deeply nested types (5+), call sites can become slower than direct fields.

Benchmark patterns for generic methods¶

Always benchmark before optimizing. Here are useful patterns.

Pattern 1 — Compare with hand-written¶

func BenchmarkPushGeneric(b *testing.B) {
    s := &Stack[int]{}
    for i := 0; i < b.N; i++ {
        s.Push(i)
    }
}

func BenchmarkPushHand(b *testing.B) {
    var s []int
    for i := 0; i < b.N; i++ {
        s = append(s, i)
    }
}

Result: usually within 1-3% on numeric types.

Pattern 2 — Compare with `interface{}` baseline¶

func BenchmarkPushIface(b *testing.B) {
    var s []interface{}
    for i := 0; i < b.N; i++ {
        s = append(s, i)
    }
}

The generic path is typically 5-10x faster than the interface path due to no boxing.

Pattern 3 — Method value vs direct call¶

func BenchmarkMethodValue(b *testing.B) {
    s := &Stack[int]{}
    push := s.Push
    for i := 0; i < b.N; i++ { push(i) }
}

func BenchmarkDirect(b *testing.B) {
    s := &Stack[int]{}
    for i := 0; i < b.N; i++ { s.Push(i) }
}

The method-value version is usually slightly slower due to the captured receiver.

Pattern 4 — Pointer vs value receiver¶

type Big struct { /* 64 bytes */ }
func (b Big) DoVal() {}
func (b *Big) DoPtr() {}

Same struct, different receivers. The pointer version is typically faster for big structs.

Practical setup¶

go test -bench=. -benchmem -count=10 ./...

Run with -count=10 and inspect benchstat output to filter noise.

Practical optimization checklist¶

Before merging generic code that might be performance-sensitive:

Receiver is pointer for any method that mutates state
Receiver is pointer for any struct larger than 16 bytes
Method values are not created inside hot loops
-gcflags="-m=2" shows expected inlining for hot methods
No hidden interface satisfaction adds dispatch overhead
Embedded types are positioned to minimize indirection
Benchmarks compare generic, interface-based, and hand-written variants
PGO is enabled for production builds if hot paths use generics
No accidental escape to heap (check with -gcflags="-m")

Summary¶

Optimizing methods on generic types comes down to the same fundamentals as optimizing classic Go methods, with two generic-specific concerns:

GC shape stenciling can add small dictionary indirection for type-dependent operations on pointer-shaped types.
Method values of generic methods cause the receiver to escape — avoid in hot loops.

The big wins:

Pointer receivers for any non-trivial generic struct.
Direct method calls instead of method values in hot paths.
Free functions for shape-changing operations (no method dispatch overhead).
PGO when the hot path crosses generic boundaries.

The biggest "do not panic" lesson: generic methods are usually within a few percent of hand-written code. The simplicity benefits — one implementation, type safety — almost always outweigh the small per-call cost. Optimize only when benchmarks demand it.