Type Object: Optimize
Ten inefficient implementations, each paired with an optimized version, a rough benchmark figure, and its tradeoffs.
Benchmarks: Apple M2 Pro, single thread. Figures are illustrative orders of magnitude, not lab-grade measurements; the point is the relative cost.
Table of Contents
- Optimization 1: Share one type object instead of copying
- Optimization 2: Intern/cache the type lookup
- Optimization 3: Avoid the map lookup on the hot path
- Optimization 4: Copy-down inheritance instead of delegation chains
- Optimization 5: Struct-of-arrays for instances
- Optimization 6: Integer type IDs instead of string keys
- Optimization 7: Lazy type loading
- Optimization 8: Copy-on-write registry instead of locking reads
- Optimization 9: Cache derived per-type data
- Optimization 10: Validate once at load, not per access
Optimization 1: Share one type object instead of copying
Inefficient
type Monster struct {
    breed  Breed // value: each monster embeds a full copy of the breed
    health int
}
func NewMonster(b Breed) *Monster { return &Monster{breed: b, health: b.MaxHealth} }
Why it's wasteful
The whole point of Type Object is sharing. A by-value field copies the entire Breed struct (its string headers, stats, and every other field) into every instance, so per-monster memory grows with the size of the breed.
Optimized
type Monster struct {
    breed  *Breed // pointer: one shared breed for all monsters of this kind
    health int
}
func NewMonster(b *Breed) *Monster { return &Monster{breed: b, health: b.MaxHealth} }
~15× less memory, and breeds stay in sync: patch the breed and every monster sees the change.
Tradeoff
Shared mutable state is dangerous, so the breed must be treated as immutable. You trade copy-isolation for a discipline: no breed setters after construction.
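The "no setters" discipline can be enforced mechanically rather than by convention. The following is a minimal Python sketch, assuming a hypothetical `Breed` with only `name` and `max_health` fields (neither name comes from the Go code above): a frozen dataclass rejects writes, and every monster of a kind holds a reference to the same object.

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class Breed:
    """Shared, read-only type object (fields are illustrative)."""
    name: str
    max_health: int


class Monster:
    """Per-instance state only; the breed is held by reference."""
    __slots__ = ("breed", "health")

    def __init__(self, breed: Breed) -> None:
        self.breed = breed
        self.health = breed.max_health


goblin = Breed("goblin", 30)
horde = [Monster(goblin) for _ in range(1000)]

# Every monster points at the single shared Breed object.
shared = all(m.breed is goblin for m in horde)

# Attempting to mutate the shared breed fails loudly instead of
# silently changing every monster in the horde.
try:
    goblin.max_health = 99
    mutation_blocked = False
except FrozenInstanceError:
    mutation_blocked = True
```

In Go the equivalent guard is usually unexported fields plus accessor methods, since the language has no frozen-struct feature.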
Optimization 2: Intern/cache the type lookup
Inefficient
def spawn(name: str, registry: dict, world):
    breed = registry[name]  # dict lookup on EVERY spawn
    return breed.new_monster()
# Wave spawner calls spawn("goblin", ...) 5,000 times per wave.
Why it's slow
Looking up "goblin" in the dict 5,000 times per wave is pure overhead when the breed never changes during the wave. (CPython caches a str object's hash, so the repeated cost is the dict probe rather than rehashing, but it is still work the loop does not need.)
Optimized
def spawn_wave(name: str, count: int, registry: dict, world):
    breed = registry[name]  # resolve ONCE, hoist out of the loop
    return [breed.new_monster() for _ in range(count)]
Tradeoff
You must hold the resolved breed reference. That is fine when many instances of one kind are created together; if every spawn is a different kind, there's nothing to hoist.
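One way to check that the hoist did what was intended, without relying on a noisy timer, is to count registry lookups. This is a small sketch; `CountingRegistry` and the toy `Breed` class are hypothetical helpers invented for the illustration, and the unused `world` parameter is dropped.

```python
class CountingRegistry(dict):
    """A dict that counts __getitem__ calls (illustration only)."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.lookups = 0

    def __getitem__(self, key):
        self.lookups += 1
        return super().__getitem__(key)


class Breed:
    def __init__(self, name: str) -> None:
        self.name = name

    def new_monster(self) -> dict:
        return {"breed": self, "health": 10}


def spawn(name, registry):
    return registry[name].new_monster()  # one lookup per spawn


def spawn_wave(name, count, registry):
    breed = registry[name]  # one lookup per wave
    return [breed.new_monster() for _ in range(count)]


naive = CountingRegistry(goblin=Breed("goblin"))
for _ in range(5000):
    spawn("goblin", naive)

hoisted = CountingRegistry(goblin=Breed("goblin"))
wave = spawn_wave("goblin", 5000, hoisted)
```

After running this, `naive.lookups` is 5000 and `hoisted.lookups` is 1, which is the whole optimization stated as a number.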
Optimization 3: Avoid the map lookup on the hot path
Inefficient
// Called every frame for every monster during AI update.
String attackString(Monster m) {
    return registry.get(m.breedName).attackString; // map lookup per monster per frame
}
Why it's slow
The monster stores a breedName string and re-resolves it through the registry on every access. With 10,000 monsters at 60 fps, that's 600,000 map lookups per second for data that never changes.
Optimized
// Resolve the breed reference ONCE, at construction; store the object, not the name.
final class Monster {
    final Breed breed; // direct reference
    Monster(Breed breed) { this.breed = breed; }
}
String attackString(Monster m) { return m.breed.attackString; } // field deref, no map
Tradeoff
The monster now holds an object reference instead of a portable string id. If you serialize monsters, write breed.name and re-resolve it through the registry on load, which is the correct approach anyway because object references do not survive serialization.
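The serialize-by-name, re-resolve-on-load round trip described in the tradeoff can be sketched as follows. The names `save_monster` and `load_monster`, and the dict-based save format, are assumptions made for the example rather than anything from the original code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Breed:
    name: str
    attack_string: str


class Monster:
    def __init__(self, breed: Breed, health: int) -> None:
        self.breed = breed  # direct reference on the hot path
        self.health = health


def save_monster(m: Monster) -> dict:
    # Persist the portable name, never the object reference.
    return {"breed": m.breed.name, "health": m.health}


def load_monster(data: dict, registry: dict) -> Monster:
    # The only registry lookup for this monster happens here, at load.
    return Monster(registry[data["breed"]], data["health"])


registry = {"goblin": Breed("goblin", "The goblin stabs you.")}
original = Monster(registry["goblin"], 7)
saved = save_monster(original)
restored = load_monster(saved, registry)
```

The restored monster points at the same shared `Breed` object as any freshly spawned goblin, so sharing is preserved across a save/load cycle.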
Optimization 4: Copy-down inheritance instead of delegation chains
Inefficient
// Delegation: walk the parent chain on every read.
// `own` is a nullable Integer; null means "not set on this breed".
int maxHealth() {
    return own != null ? own : parent.maxHealth();
}
Why it's slow
A field that is set only at the root of a deep chain (e.g. eliteFireTroll → fireTroll → troll → goblin) visits four objects per read. Reads happen constantly, and the chain length is paid every time.
Optimized
import java.util.Objects;
// Resolve the full chain ONCE at load (copy-down), so reads are plain field access.
final class Breed {
    final int maxHealth; // already includes inherited value
    Breed(Builder b, Breed parent) {
        // A root breed (parent == null) must set the field itself.
        this.maxHealth = b.maxHealth != null
            ? b.maxHealth
            : Objects.requireNonNull(parent, "root breed must set maxHealth").maxHealth;
    }
    int maxHealth() { return maxHealth; } // one field load, no walk
}
Tradeoff
Copy-down uses slightly more memory (each breed stores all inherited fields), and changing a parent means rebuilding its descendants. It's worth it: type objects are built rarely and read constantly.
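The load-time resolution step can be sketched in a few lines. This is a hedged illustration, not the author's loader: the raw-data shape (a dict of specs with an optional `"parent"` key), the `FIELDS` tuple, and `build_breeds` are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Breed:
    name: str
    max_health: int
    attack: str


FIELDS = ("max_health", "attack")  # hypothetical inheritable fields


def build_breeds(raw: dict) -> dict:
    """Resolve every breed's inherited fields once (copy-down)."""
    built: dict = {}

    def build(name: str, seen: tuple = ()) -> Breed:
        if name in built:
            return built[name]
        if name in seen:
            raise ValueError(f"inheritance cycle at {name!r}")
        spec = raw[name]
        parent_name = spec.get("parent")
        parent = build(parent_name, seen + (name,)) if parent_name else None
        values = {}
        for field in FIELDS:
            if field in spec:
                values[field] = spec[field]            # own value wins
            elif parent is not None:
                values[field] = getattr(parent, field)  # copied down
            else:
                raise ValueError(f"root breed {name!r} must set {field!r}")
        built[name] = Breed(name, **values)
        return built[name]

    for name in raw:
        build(name)
    return built


raw = {
    "troll": {"max_health": 48, "attack": "The troll hits you."},
    "fire_troll": {"parent": "troll", "attack": "The fire troll burns you."},
    "elite_fire_troll": {"parent": "fire_troll", "max_health": 90},
}
breeds = build_breeds(raw)
```

Because each parent is fully resolved before its children, a child only ever consults its immediate parent, and no read after load touches the chain at all.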
Optimization 5: Struct-of-arrays for instances
Inefficient
type Monster struct { // array-of-structs
    breed      *Breed
    health     int
    posX, posY float64
    cooldown   float64
}
var monsters []Monster
func tickHealth(ms []Monster) { // needs only `health` and `breed`...
    for i := range ms {
        ms[i].health = min(ms[i].health+1, ms[i].breed.MaxHealth) // built-in min: Go 1.21+
    }
}
Why it's slow
A health-regen pass reads only health and breed.MaxHealth, but the fields of each Monster are interleaved in memory, so the CPU also pulls posX, posY, and cooldown into cache and wastes bandwidth on fields the loop never reads.
Optimized
type Monsters struct { // struct-of-arrays: each field stored contiguously
    breed      []*Breed
    health     []int
    posX, posY []float64
    cooldown   []float64
}
func tickHealth(m *Monsters) {
    for i := range m.health { // tight loop over a contiguous int slice
        if maxHP := m.breed[i].MaxHealth; m.health[i] < maxHP {
            m.health[i]++
        }
    }
}
Tradeoff
SoA complicates code (there is no single Monster object to pass around) and is overkill for small N. It also moves you toward ECS, which is the right destination if you're doing this seriously. Use it for hot bulk passes over many instances.
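The same column layout can be sketched in Python with the standard-library `array` module, which stores raw numbers contiguously. This only illustrates the data layout; an interpreted loop will not show the cache effect the Go version targets. The `breed_id` column and the `max_health_by_breed` table are assumptions added for the example (they replace the per-element pointer with a small integer index).

```python
from array import array


class Monsters:
    """Struct-of-arrays: one column per field; index i is one monster."""

    def __init__(self) -> None:
        self.breed_id = array("i")  # index into a separate breed table
        self.health = array("i")
        self.pos_x = array("d")
        self.pos_y = array("d")
        self.cooldown = array("d")

    def add(self, breed_id: int, health: int, x: float, y: float) -> int:
        self.breed_id.append(breed_id)
        self.health.append(health)
        self.pos_x.append(x)
        self.pos_y.append(y)
        self.cooldown.append(0.0)
        return len(self.health) - 1  # the new monster's index


def tick_health(m: Monsters, max_health_by_breed: list) -> None:
    # Touches only the health and breed_id columns.
    for i in range(len(m.health)):
        if m.health[i] < max_health_by_breed[m.breed_id[i]]:
            m.health[i] += 1


max_health_by_breed = [30, 48]  # hypothetical: 0 = goblin, 1 = troll
ms = Monsters()
ms.add(0, 29, 0.0, 0.0)  # goblin, one point below max
ms.add(0, 30, 1.0, 0.0)  # goblin, already at max
ms.add(1, 10, 2.0, 0.0)  # troll, well below max
tick_health(ms, max_health_by_breed)
```

Note that a "monster" is now just an index into the columns, which is exactly the code-complexity cost named in the tradeoff.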
Optimization 6: Integer type IDs instead of string keys
Inefficient
# Breed referenced everywhere by string name.
registry: dict[str, Breed] = {}
breed = registry["fire_troll"] # string hash on every cross-reference
Why it's slow
String keys must be hashed (a cost proportional to the string's length whenever a freshly parsed string is looked up) and compared on every lookup, and the table stores fat keys. In a system resolving millions of breed references (loading saves, spawn tables, drop tables), that adds up.
Optimized
# Assign each breed a small integer id at load; index a list, not a dict.
breeds: list[Breed] = [] # id == index
name_to_id: dict[str, int] = {} # resolve names → ids ONCE, at load
def breed_id(name: str) -> int: return name_to_id[name] # only at load/deserialize
def get(bid: int) -> Breed: return breeds[bid] # O(1) list index, no hashing
Tradeoff
Integer ids are not human-readable and must stay stable across reloads/saves (or you remap). Keep the name↔id table for tooling and serialization; use ids only on hot internal paths.
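The "or you remap" option can be made concrete: save files carry names, and ids are recomputed against whatever table the current build produces. A minimal sketch, assuming hypothetical `BreedTable`, `save_ids`, and `load_ids` helpers (string placeholders stand in for real breed objects):

```python
class BreedTable:
    """Integer ids on hot paths; names at the save/tooling boundary."""

    def __init__(self) -> None:
        self.breeds: list = []       # id == index
        self.name_to_id: dict = {}
        self.id_to_name: list = []

    def register(self, name: str, breed: object) -> int:
        if name in self.name_to_id:
            raise ValueError(f"duplicate breed {name!r}")
        bid = len(self.breeds)
        self.breeds.append(breed)
        self.name_to_id[name] = bid
        self.id_to_name.append(name)
        return bid

    def get(self, bid: int) -> object:
        return self.breeds[bid]      # list index, no hashing


def save_ids(ids: list, table: BreedTable) -> list:
    # Write stable names so the save does not depend on id assignment.
    return [table.id_to_name[i] for i in ids]


def load_ids(names: list, table: BreedTable) -> list:
    # Remap names to whatever ids the current build assigned.
    return [table.name_to_id[n] for n in names]


old = BreedTable()
old.register("goblin", "GOBLIN")
old.register("fire_troll", "FIRE_TROLL")
saved = save_ids([1, 0, 1], old)

# A later build happens to register breeds in a different order.
new = BreedTable()
new.register("fire_troll", "FIRE_TROLL")
new.register("goblin", "GOBLIN")
remapped = load_ids(saved, new)
```

With this split, ids never need to be stable across builds; only names do.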
Optimization 7: Lazy type loading
Inefficient
// Startup: parse + build EVERY breed, including the 960 that this level never spawns.
void boot() {
    for (var entry : allBreedData.entrySet()) // 1,000 breeds, full parse
        registry.put(entry.getKey(), buildBreed(entry.getValue()));
}
Why it's slow
A game with 1,000 breeds but a level that uses 40 pays the full parse/validate/build cost up front, increasing both load time and resident memory.
Optimized
// Build a breed on first request; cache it. (Validate schema eagerly, build lazily.)
Breed get(String name) {
    return cache.computeIfAbsent(name, n -> buildBreed(rawData.get(n)));
}
Tradeoff
Lazy building hides bad data until first use, so still validate all data eagerly at load (cheap) and defer only the build. The first spawn of a new kind pays a one-time build cost (a possible frame hitch); pre-warm known-needed breeds before a wave.
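The three pieces named in the tradeoff (eager validation, lazy build, pre-warming) fit together roughly as below. This is a single-threaded sketch under stated assumptions: `LazyRegistry`, its `prewarm` method, the one-field validation rule, and the `build_breed` callback are all hypothetical.

```python
class LazyRegistry:
    def __init__(self, raw: dict, build) -> None:
        self._raw = raw
        self._build = build
        self._cache: dict = {}
        self.validate_all()  # eager: bad data fails at load, not mid-game

    def validate_all(self) -> None:
        for name, spec in self._raw.items():
            hp = spec.get("max_health")
            if not isinstance(hp, int) or hp <= 0:
                raise ValueError(f"{name}: max_health must be a positive int")

    def get(self, name: str):
        breed = self._cache.get(name)
        if breed is None:  # first request: build and cache
            breed = self._cache[name] = self._build(name, self._raw[name])
        return breed

    def prewarm(self, names) -> None:
        # Pay the build cost before the wave instead of during it.
        for name in names:
            self.get(name)


built_log = []  # records which breeds were actually built


def build_breed(name, spec):
    built_log.append(name)
    return {"name": name, **spec}


raw = {
    "goblin": {"max_health": 30},
    "troll": {"max_health": 48},
    "dragon": {"max_health": 500},
}
reg = LazyRegistry(raw, build_breed)
built_at_boot = list(built_log)
reg.prewarm(["goblin", "troll"])
first = reg.get("goblin")
second = reg.get("goblin")
```

Here the dragon is validated at boot but never built, because nothing in the level asks for it.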
Optimization 8: Copy-on-write registry instead of locking reads
Inefficient
type Registry struct {
    mu     sync.RWMutex
    breeds map[string]*Breed
}
func (r *Registry) Get(name string) *Breed {
    r.mu.RLock() // lock on EVERY read, even though writes are rare
    defer r.mu.RUnlock()
    return r.breeds[name]
}
Why it's slow
Hot reloads happen seconds apart; reads happen millions of times. Each RLock/RUnlock pair performs atomic updates on the lock's shared state, so all reader goroutines contend on the same memory even when no writer is active.
Optimized
type Registry struct {
    snap atomic.Pointer[map[string]*Breed] // immutable snapshot (Go 1.19+)
}
func (r *Registry) Get(name string) (*Breed, bool) {
    // Assumes snap was initialized with Store() at construction; Load() returns nil otherwise.
    b, ok := (*r.snap.Load())[name] // single atomic load, no lock
    return b, ok
}
// Writers build a new map and Store() it (see find-bug Bug 5).
Tradeoff
Writers do more work (they copy the whole map per registration), and a reader that loaded the pointer just before a swap keeps using the pre-swap snapshot. Both are fine because breeds are immutable and writes are rare. Don't use COW if writes are frequent and the map is huge.
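The writer side, which the Go snippet only mentions in a comment, might look like this in Python. It is a sketch, not a port: the `Registry` class, its `snapshot` method, and the writer lock are assumptions, and it relies on rebinding an attribute being a single reference swap in CPython. `MappingProxyType` gives readers a read-only view so a snapshot cannot be modified by accident.

```python
import threading
from types import MappingProxyType


class Registry:
    def __init__(self) -> None:
        self._snap = MappingProxyType({})    # current immutable snapshot
        self._write_lock = threading.Lock()  # serializes writers only

    def get(self, name: str):
        # Readers take no lock: they read whichever snapshot is current.
        return self._snap.get(name)

    def snapshot(self):
        return self._snap

    def register(self, name: str, breed) -> None:
        with self._write_lock:
            new_map = dict(self._snap)       # copy the whole map
            new_map[name] = breed
            self._snap = MappingProxyType(new_map)  # publish by swapping


reg = Registry()
reg.register("goblin", "GOBLIN")
before = reg.snapshot()
reg.register("troll", "TROLL")
after = reg.snapshot()
```

A reader still holding `before` sees a consistent, goblin-only map; it never observes a half-updated registry.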
Optimization 9: Cache derived per-type data
Inefficient
def loot_weight_total(breed: Breed) -> int:
    return sum(d.weight for d in breed.drops)  # recomputed every time we roll loot
Why it's slow
Every loot roll re-sums the drop weights, even though the drop table is fixed per breed and shared. With heavy loot churn this is repeated work over immutable data.
Optimized
from dataclasses import dataclass
@dataclass(frozen=True)
class Breed:
    name: str
    drops: tuple
    total_weight: int  # precomputed at load, since drops are immutable
def loot_weight_total(breed: Breed) -> int:
    return breed.total_weight  # field read
Tradeoff
This is only valid because the breed (and its drop table) is immutable: derived data can be computed once and never invalidated. If breeds were mutable you'd have a cache-coherence problem. Compute derived fields in the registry's build step.
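A build step that fills in the derived field, plus a weighted roll that consumes it, could look like the following. `Drop`, `build_breed`, and `roll_loot` are hypothetical names for this sketch; the roll takes the random value as a parameter so the behaviour is deterministic and easy to check.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Drop:
    item: str
    weight: int


@dataclass(frozen=True)
class Breed:
    name: str
    drops: tuple
    total_weight: int  # derived once in build_breed


def build_breed(name: str, drops) -> Breed:
    drops = tuple(drops)  # freeze the table before deriving from it
    return Breed(name, drops, sum(d.weight for d in drops))


def roll_loot(breed: Breed, r: int) -> str:
    """Pick a drop for r in [0, total_weight); the caller supplies r."""
    if not 0 <= r < breed.total_weight:
        raise ValueError("r out of range")
    for drop in breed.drops:
        if r < drop.weight:
            return drop.item
        r -= drop.weight
    raise AssertionError("unreachable when total_weight matches drops")


goblin = build_breed(
    "goblin", [Drop("coin", 7), Drop("dagger", 2), Drop("gem", 1)]
)
```

In gameplay code the caller would pass something like `random.randrange(goblin.total_weight)`, which reads the cached total instead of re-summing the table.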
Optimization 10: Validate once at load, not per access
Inefficient
int maxDamage(Monster m) {
    int lo = m.breed.minDamage, hi = m.breed.maxDamage;
    if (lo < 0 || hi < lo) // defensive check on EVERY access
        throw new IllegalStateException("bad damage range for " + m.breed.name);
    return hi;
}
Why it's slow (and wrong-shaped)
The breed is immutable and was already validated at load, so re-checking its invariants on every gameplay access is pure overhead, and it scatters validation logic through the codebase.
Optimized
// Validate ONCE when the breed is built; if a Breed exists, it's valid.
final class Breed {
    final int minDamage, maxDamage;
    Breed(int minDamage, int maxDamage) {
        if (minDamage < 0 || maxDamage < minDamage)
            throw new IllegalArgumentException("bad damage range");
        this.minDamage = minDamage; this.maxDamage = maxDamage;
    }
}
int maxDamage(Monster m) { return m.breed.maxDamage; } // trust the invariant
Tradeoff
Validation must be complete at the boundary (structural + referential + semantic) so gameplay can trust the invariant. That is the professional-level load pipeline: the cost moves from per-access to once-per-load, where it belongs.
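The three layers of boundary validation can be sketched as a single pass over the raw data that collects every error before anything is built. The data shape (a dict of specs with `min_damage`, `max_damage`, and an optional `parent`) and the `validate` function are assumptions for illustration.

```python
def validate(raw: dict) -> list:
    """Return error strings; an empty list means every breed may be built."""
    errors = []
    for name, spec in raw.items():
        # Structural: required keys exist and have the right type.
        structurally_ok = True
        for key in ("min_damage", "max_damage"):
            if not isinstance(spec.get(key), int):
                errors.append(f"{name}: {key} must be an int")
                structurally_ok = False
        # Referential: a named parent must exist in the data set.
        parent = spec.get("parent")
        if parent is not None and parent not in raw:
            errors.append(f"{name}: unknown parent {parent!r}")
        # Semantic: the same range invariant the constructor enforces.
        if structurally_ok and (
            spec["min_damage"] < 0 or spec["max_damage"] < spec["min_damage"]
        ):
            errors.append(f"{name}: bad damage range")
    return errors


good = {
    "troll": {"min_damage": 2, "max_damage": 9},
    "fire_troll": {"parent": "troll", "min_damage": 4, "max_damage": 12},
}
bad = {
    "imp": {"parent": "demon", "min_damage": 5, "max_damage": 1},
    "slime": {"min_damage": "1", "max_damage": 3},
}
good_errors = validate(good)
bad_errors = validate(bad)
```

Collecting all errors instead of raising on the first one is a design choice that suits content pipelines, where a designer wants the full list of broken breeds in one run.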
Summary Table
| # | Optimization | Win | Key tradeoff |
|---|---|---|---|
| 1 | Share by pointer, not by value | ~15× memory | Breed must be immutable |
| 2 | Hoist the type lookup out of loops | ~6× spawn cost | Only helps batched same-kind spawns |
| 3 | Store the breed reference, not its name | ~80× hot-path | Re-resolve name on deserialize |
| 4 | Copy-down inheritance | ~18× read | Rebuild descendants on parent change |
| 5 | Struct-of-arrays for instances | ~3.4× bulk pass | Code complexity; leads toward ECS |
| 6 | Integer type ids | ~13× lookup | Not human-readable; must stay stable |
| 7 | Lazy type building | ~22× load time | Validate eagerly; first-use hitch |
| 8 | Copy-on-write registry | ~9× concurrent read | Writers copy the whole map |
| 9 | Cache derived per-type data | ~20× | Only sound because breeds are immutable |
| 10 | Validate at load, not per access | ~14× | Boundary validation must be complete |
The recurring theme: immutability + share-by-reference + resolve-once-at-load. Almost every optimization is "do the work once when the type object is built, then make the hot path a plain field read."