Organizing Data - Professional Level
Memory layout, field alignment, escape analysis on value objects, and the runtime cost of choosing classes vs. primitives.
Table of Contents
- Object header overhead
- Replace Data Value with Object: the cost of boxing
- Project Valhalla and value classes
- Type code via enum: how the JVM handles it
- Enum vs. polymorphism: dispatch costs
- Encapsulate Field and JIT inlining
- Encapsulate Collection: defensive copy cost
- Reference vs. value: GC and equality cost
- Field alignment in Go and Rust
- Python: dict-based attributes vs. slots
- Review questions
Object header overhead
Every Java object has a header:
- 12 bytes with compressed class pointers (the default on 64-bit HotSpot) or 16 bytes with full pointers.
- It holds the class pointer plus the mark word: identity hash, lock state, and GC mark bits.
Every wrapper (new Email(s), new Money(amount, currency)) therefore costs at least 12 bytes of header per instance, plus the contained fields, plus any alignment padding, plus the reference that points to it.
For 1M instances of a wrapper around a single String, that is about 16 MB of wrapper objects (12-byte header + 4-byte reference each), ignoring the String content itself.
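The arithmetic above can be sketched as a small estimator. It assumes HotSpot defaults (12-byte header, 4-byte compressed references, 8-byte object alignment); HeaderMath and shallowSize are hypothetical names used only for illustration, and real layouts are better measured with a tool such as JOL.

```java
// Rough shallow-size estimate under assumed HotSpot defaults.
// The constants are assumptions, not values read from the running JVM.
final class HeaderMath {
    static final int HEADER_BYTES = 12; // mark word (8) + compressed class pointer (4)
    static final int REF_BYTES = 4;     // compressed reference
    static final int ALIGNMENT = 8;     // objects are padded to a multiple of 8 bytes

    // Header plus field bytes, rounded up to the alignment.
    static long shallowSize(int fieldBytes) {
        long raw = HEADER_BYTES + fieldBytes;
        return (raw + ALIGNMENT - 1) / ALIGNMENT * ALIGNMENT;
    }

    public static void main(String[] args) {
        long wrapper = shallowSize(REF_BYTES); // wrapper holding one String reference
        long totalBytes = wrapper * 1_000_000L;
        System.out.println(wrapper + " bytes per wrapper");
        System.out.println(totalBytes / 1_000_000 + " MB for one million wrappers");
    }
}
```

Note that an object with no fields at all still rounds up to 16 bytes under these assumptions, so the 12-byte header is a lower bound rather than the typical cost.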
Replace Data Value with Object: the cost of boxing
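Compare a Person that holds its email as a plain String (Variant A) with a Person that holds an Email object wrapping that String (Variant B).
Memory, assuming a 12-byte header, 4-byte compressed references, and 8-byte object alignment:
- Variant A: header + 4-byte ref → ~16 bytes/Person.
- Variant B: header + 4-byte ref + Email object (header + 4-byte ref to String) → ~32 bytes/Person.
For 100M instances, that's about 1.6 GB more.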
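The two variants compared above can be sketched as follows. The class names PersonA and PersonB and the validation rule inside Email are assumptions made for illustration; the source only names Person and Email.

```java
// Variant A: the email stays a plain String on the entity.
final class PersonA {
    private final String email;
    PersonA(String email) { this.email = email; }
    String email() { return email; }
}

// Hypothetical value object wrapping the String; the validation rule is illustrative only.
final class Email {
    private final String value;
    Email(String value) {
        if (value == null || !value.contains("@")) {
            throw new IllegalArgumentException("not an email: " + value);
        }
        this.value = value;
    }
    String value() { return value; }
}

// Variant B: the entity references an Email object, which in turn references the String.
final class PersonB {
    private final Email email;
    PersonB(Email email) { this.email = email; }
    Email email() { return email; }
}
```

Variant B buys validation and a richer type at the price of one extra heap object, and one extra pointer hop, per Person.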
When this matters
- Massive in-memory caches.
- Tight numerical loops.
- Mobile / embedded.
Mitigations
- Don't promote in hot containers; keep the primitive on the entity, validate at boundaries.
- Use Lombok @Value types (compile-time only: same memory cost, but less code).
- Wait for Project Valhalla (value classes, no header).
In typical web services with 10K req/s and short-lived data, the cost is invisible. Profile before you optimize.
Project Valhalla and value classes
Project Valhalla is the planned Java evolution that aims to make Replace Data Value with Object essentially free, by letting a wrapper such as Email be declared as a value class.
Email instances would then be:
- Stored inline where the JVM chooses to flatten them (no separate heap allocation, no header).
- Compared by value (== works as you'd expect).
- Compatible with generics.
Until Valhalla ships:
- Records (preview in Java 14, final in Java 16) reduce code; they don't reduce memory.
- Inline classes (preview): the earlier Valhalla name for what are now called value classes.
- Workaround: keep primitives, use type-system tricks (newtype-like wrappers, validation at boundaries).
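To illustrate the point about records, here is a minimal sketch that runs on current Java (16+). EmailAddress and its validation rule are hypothetical; the example only shows that a record gives value-based equals while each instance remains a separate heap object with its own identity.

```java
// A record removes boilerplate but is still an ordinary heap object with identity.
record EmailAddress(String value) {
    EmailAddress {
        // Illustrative validation at the boundary; the rule itself is an assumption.
        if (value == null || !value.contains("@")) {
            throw new IllegalArgumentException("not an email: " + value);
        }
    }
}

final class RecordIdentityDemo {
    public static void main(String[] args) {
        EmailAddress a = new EmailAddress("ann@example.com");
        EmailAddress b = new EmailAddress("ann@example.com");
        System.out.println(a.equals(b)); // value comparison generated by the record
        System.out.println(a == b);      // identity comparison of two separate allocations
    }
}
```

With today's records the two comparisons disagree, which is the gap that value classes are meant to close.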
Type code via enum: how the JVM handles it
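Java enums are full classes. Each constant is a singleton instance. For a Status enum with three constants:
- 3 instances are allocated at class initialization.
- Each is a heap object with a header.
- Status.ACTIVE == status is a pointer compare, which is fast.
- A switch over the enum typically compiles to a tableswitch (jump table) keyed on the constant's ordinal, so dispatch is O(1).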
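A minimal sketch of such an enum follows. Only ACTIVE appears in the text above; INACTIVE and SUSPENDED, along with the StatusDemo helper, are hypothetical names added for illustration.

```java
// Hypothetical three-constant enum; only ACTIVE is named in the surrounding text.
enum Status { ACTIVE, INACTIVE, SUSPENDED }

final class StatusDemo {
    // Identity comparison is safe because each constant is a singleton.
    static boolean isActive(Status status) {
        return Status.ACTIVE == status;
    }

    // A switch over the enum; the compiler dispatches on the constant's ordinal.
    static String label(Status status) {
        switch (status) {
            case ACTIVE:    return "active";
            case INACTIVE:  return "inactive";
            case SUSPENDED: return "suspended";
            default:        throw new IllegalArgumentException("unknown: " + status);
        }
    }
}
```

Running javap -c on the compiled StatusDemo class is one way to inspect which switch instruction the compiler actually emitted.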
Memory
An enum with 100 constants allocates 100 instances. For most apps, negligible. For a per-tenant enum dynamically generated, it could matter (and you'd usually avoid that pattern).
Performance
Enum switch is among the fastest dispatches available — faster than virtual calls in many cases, because the JIT can generate specialized code for each branch.
Enum vs. polymorphism: dispatch costs
Replace Type Code with Subclasses introduces a virtual call:
abstract class Employee { abstract double pay(); }
class Engineer extends Employee { double pay() { return 5000; } }
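The alternatives are an enum whose constants each implement an abstract method, or a plain enum dispatched with a switch.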
Costs
| Variant | Dispatch |
|---|---|
| Subclass virtual call | invokevirtual → vtable lookup (1 virtual call, ~1 ns post-JIT) |
| Enum abstract method | invokevirtual on the enum constant (same dispatch cost) |
| Plain enum + switch | tableswitch (1 jump, no virtual call) |
For most applications, all three are within 10% of each other. For very hot paths, the plain enum + switch can have a slight edge because it doesn't depend on JIT inlining decisions.
Verify with JMH for your workload. Don't decide based on theory.
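The three rows of the table can be written side by side as below. This is a sketch for comparison only: the type names (Worker, RoleWithMethod, Role, PayDispatch) and the manager salary are hypothetical, and the code says nothing about relative speed, which still needs a JMH benchmark.

```java
// 1. Subclass with a virtual call.
abstract class Worker { abstract double pay(); }
final class EngineerWorker extends Worker { double pay() { return 5000; } }
final class ManagerWorker extends Worker { double pay() { return 7000; } }

// 2. Enum with an abstract method: each constant is compiled as its own subclass.
enum RoleWithMethod {
    ENGINEER { double pay() { return 5000; } },
    MANAGER  { double pay() { return 7000; } };
    abstract double pay();
}

// 3. Plain enum dispatched with a switch.
enum Role { ENGINEER, MANAGER }

final class PayDispatch {
    static double pay(Role role) {
        switch (role) {
            case ENGINEER: return 5000;
            case MANAGER:  return 7000;
            default:       throw new IllegalArgumentException("unknown: " + role);
        }
    }
}
```

Variants 1 and 2 keep each behaviour next to its type; variant 3 keeps all behaviour in one method, which is easier to scan but must be updated whenever a constant is added.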
Encapsulate Field and JIT inlining
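With a private field exposed through a trivial accessor such as balance(), the bytecode contains a method call. Post-JIT, the call is inlined to a direct field access, so Encapsulate Field is effectively free at runtime.
The exception: if balance() is overridden in subclasses and the call site is megamorphic (it sees three or more receiver types), the JIT cannot inline it and falls back to a real virtual call.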
For 99% of code, encapsulation has zero cost. Don't avoid it for "performance reasons."
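A minimal sketch of the pattern, assuming a hypothetical Account class whose balance is stored in cents (the source names only balance()):

```java
final class Account {
    private long balanceCents; // encapsulated field; cents are an illustrative choice

    Account(long balanceCents) { this.balanceCents = balanceCents; }

    // Trivial accessor: small enough for the JIT to inline at hot call sites.
    long balance() { return balanceCents; }

    void deposit(long cents) {
        if (cents <= 0) throw new IllegalArgumentException("deposit must be positive");
        balanceCents += cents;
    }
}

final class AccountDemo {
    // The loop only ever sees one receiver class, so the call site stays monomorphic.
    static long total(Account[] accounts) {
        long sum = 0;
        for (Account a : accounts) sum += a.balance();
        return sum;
    }
}
```

Declaring Account final is a design choice here: it rules out overrides, so the accessor can never become megamorphic.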
Encapsulate Collection: defensive copy cost
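Returning a defensive copy, for example List.copyOf(orders), allocates a new list on each call and copies the references. For a list of 1000 elements:
- 1 list object header.
- ~4 KB (compressed references) to ~8 KB (full references) for the underlying array of 1000 references.
If getOrders() is called 1000 times per request, that's roughly 4–8 MB of garbage per request.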
Mitigation
A read-only wrapper such as Collections.unmodifiableList(orders) returns a view with no copy. But the view shares the underlying list, so later mutations of orders are visible through it. Document this carefully.
For most services, the copy is fine. For hot paths, return the view (or a stream).
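The difference between the copy and the view can be seen in a short sketch. OrderBook, ordersCopy, and ordersView are hypothetical names, and String stands in for whatever Order type the real code uses.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class OrderBook {
    private final List<String> orders = new ArrayList<>(); // String stands in for an Order type

    void add(String order) { orders.add(order); }

    // Defensive copy: allocates a new immutable list on every call.
    List<String> ordersCopy() { return List.copyOf(orders); }

    // Read-only view: no copy, but later changes to the backing list show through.
    List<String> ordersView() { return Collections.unmodifiableList(orders); }
}
```

A caller holding the copy sees a frozen snapshot, while a caller holding the view sees every later add; neither can modify the list through the returned reference.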
Streams as encapsulation
Returning orders.stream() makes no copy, and the consumer can't mutate the list. But the caller can't call size() directly and must count or collect the stream.
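A sketch of the stream-returning accessor, using a hypothetical Customer class with String orders. The extra orderCount() method is an assumption added to show one way around the missing size().

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

final class Customer {
    private final List<String> orders = new ArrayList<>();

    void addOrder(String order) { orders.add(order); }

    // Exposes the elements without exposing the list; a fresh stream per call.
    Stream<String> orders() { return orders.stream(); }

    // Cheap size accessor so callers need not consume a stream just to count.
    int orderCount() { return orders.size(); }
}
```

Each stream can be consumed only once, so callers that need several passes must call orders() again or collect the result themselves.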
Reference vs. value: GC and equality cost
Reference equality
a == b is one pointer compare, roughly 1 nanosecond.
Value equality
a.equals(b) calls equals, which usually compares one or more fields: roughly 3–10 nanoseconds.
For 1B comparisons, the difference is single-digit seconds. For most workloads, irrelevant.
Hash-based collections
HashMap<Customer, X> calls hashCode() and equals() on lookup. If Customer.equals is value-based, each operation costs several field reads. For huge maps, this can dominate lookup time.
Mitigations:
- Cache hashCode in a final field.
- Use a primitive id as the map key and store Customer separately.
- Prefer IdentityHashMap if reference semantics are correct.
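The first two mitigations might look like the sketch below. CustomerKey and CustomerIndex are hypothetical names, and the choice of id plus name as the compared fields is an assumption for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

final class CustomerKey {
    private final long id;
    private final String name;
    private final int hash; // computed once; safe because the fields are final

    CustomerKey(long id, String name) {
        this.id = id;
        this.name = Objects.requireNonNull(name);
        this.hash = Objects.hash(id, name);
    }

    long id() { return id; }

    @Override public boolean equals(Object o) {
        if (this == o) return true; // identity shortcut before comparing fields
        if (!(o instanceof CustomerKey)) return false;
        CustomerKey other = (CustomerKey) o;
        return id == other.id && name.equals(other.name);
    }

    @Override public int hashCode() { return hash; }
}

final class CustomerIndex {
    // Alternative: key by the id (boxed to Long by HashMap) and keep the object as the value.
    private final Map<Long, CustomerKey> byId = new HashMap<>();

    void put(CustomerKey c) { byId.put(c.id(), c); }
    CustomerKey get(long id) { return byId.get(id); }
}
```

Caching the hash is only valid for immutable keys; if any compared field could change after construction, the cached value would go stale.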
Field alignment in Go and Rust
Go
Go doesn't reorder struct fields. Alignment matters:
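type Bad struct { // 32 bytes on 64-bit platforms
	a bool  // 1 byte + 7 bytes of padding before b
	b int64
	c bool  // 1 byte + 7 bytes of padding before d
	d int64
}
type Good struct { // 24 bytes: the widest fields come first
	b int64
	d int64
	a bool
	c bool // a and c share the last word, followed by 6 bytes of trailing padding
}
Tools: fieldalignment (an analyzer from golang.org/x/tools, built on the go vet analysis framework but not enabled in go vet by default) finds suboptimal layouts.
For 10M instances, going from 32 to 24 bytes saves 80 MB.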
Rust
Rust is free to reorder struct fields under the default representation, #[repr(Rust)], and in practice does so to reduce padding. For C-compatible layout, use #[repr(C)] and pay attention to alignment.
Implication for Organizing Data
Replace Type Code with Class adds a field. In Go especially, watch alignment — and don't accidentally bloat hot structures.
Python: dict-based attributes vs. slots
CPython instances use a __dict__ for attributes by default, which is flexible but expensive. Rough figures, which vary with the CPython version and the number of attributes:
- ~280 bytes per instance with __dict__.
- ~50–80 bytes per instance with __slots__.
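For Replace Data Value with Object on a hot type (millions of instances), declare __slots__ on the class, listing its attribute names, so that instances are created without a __dict__.
Or use @dataclass(slots=True) (Python 3.10+), typically combined with frozen=True.
frozen=True makes it immutable (value semantics). slots=True makes it small.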
Faster equivalents
- NamedTuple — immutable, tuple-backed.
- attrs library — older but rich.
- Pydantic v2 with frozen models — Rust-backed validation.
When dict overhead is fine
For domain entities (one per request), the overhead is negligible. For caches, batch processing, and columnar data, switch to __slots__, NumPy, Polars, or pandas.
Review questions
- What's the size of a typical Java object header?
- Why does Replace Data Value with Object cost memory in pre-Valhalla Java?
- How will Project Valhalla change the calculus?
- How is enum dispatch implemented in JVM bytecode?
- Compare enum dispatch vs. virtual call vs. switch in terms of cost.
- Is Encapsulate Field a runtime cost?
- What's the cost of List.copyOf(orders) per call?
- When does reference equality vs. value equality matter at scale?
- What is fieldalignment in Go?
- When should a Python class use __slots__?