Simplifying Method Calls — Professional Level¶
Method dispatch costs, exception performance, parameter passing conventions, and how API shape affects JIT.
Table of Contents¶
- Method dispatch revisited
- Exception performance
- Parameter Object: allocation cost
- Builder pattern: builder allocation
- Factory method: cached vs. fresh
- Encapsulate Downcast and JIT
- Varargs and array allocation
- Go: receiver shape and copy cost
- Python: keyword arguments and dict construction
- Review questions
Method dispatch revisited¶
A quick tour, applied to this category:
| Refactoring | Dispatch impact |
|---|---|
| Rename Method | Zero. Bytecode identifier changes; runtime same. |
| Add Parameter | Zero impact on dispatch; small impact on inlining (more bytes). |
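| Hide Method (private) | At most slightly faster. Private methods are statically bound (javac emitted invokespecial instead of invokevirtual before JDK 11), so no override is possible and the JIT can inline them without a type check. |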
| Replace Constructor with Factory Method | Trades new + <init> for invokestatic — comparable cost. |
| Encapsulate Downcast | Pushes a checkcast opcode from N callers to 1 method body — tiny win in code size. |
For 99% of refactorings here, dispatch cost is invisible. The exceptions (parameter object allocation, Builder pattern) are about allocation, not dispatch.
Exception performance¶
Exceptions in Java cost:
- Stack trace capture — the slow part. ~5-50µs per exception.
- Throw + unwind — fast (similar to a return).
- Catch — almost free.
So: throwing exceptions in a hot loop where they're caught is very slow.
Workarounds¶
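// Pre-allocated exception: the stack trace is captured once, at class init, then cleared.
// ParseException is assumed here to be an application-defined type with a no-arg constructor.
private static final ParseException PRE = new ParseException();
static { PRE.setStackTrace(new StackTraceElement[0]); }
// At the failure site:
if (...) throw PRE;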
Or:
public static class FastException extends RuntimeException {
@Override
public Throwable fillInStackTrace() { return this; } // no stack trace
}
Used by parser libraries (ANTLR, fastjson) for known parse errors. Don't do this for unexpected exceptions — you lose debuggability.
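A third variant of the same idea, sketched under the assumption that the exception type is your own: since Java 7, Throwable has a protected four-argument constructor whose last flag, writableStackTrace, disables stack trace capture at construction time. The class names below (StacklessParseException, StacklessDemo) are hypothetical.

```java
/** Hypothetical exception type for expected, high-frequency failures. */
final class StacklessParseException extends RuntimeException {
    StacklessParseException(String message) {
        // cause = null, enableSuppression = false, writableStackTrace = false
        super(message, null, false, false);
    }
}

final class StacklessDemo {
    /** Throws the stackless exception at the first non-digit character (illustrative only). */
    static void requireDigits(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (!Character.isDigit(s.charAt(i))) {
                throw new StacklessParseException("not a digit at index " + i);
            }
        }
    }

    /** Returns how many stack frames the thrown exception recorded, or -1 if nothing was thrown. */
    static int recordedFrames(String s) {
        try {
            requireDigits(s);
            return -1;
        } catch (StacklessParseException e) {
            return e.getStackTrace().length;
        }
    }
}
```

Unlike the shared pre-allocated instance, each throw here still allocates a new exception object, but it carries its own message and skips the stack walk.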
Modern approach: don't throw in hot loops¶
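If errors are expected, return them as values: Optional<T>, a Result<T, E> type (not part of the JDK; from a library or hand-written), or a C#-style boolean tryParse(String s, Out<T> result) with a caller-supplied holder. This saves the throw cost entirely.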
Replace Exception with Test is sometimes a 100× perf win for hot paths.
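As a minimal sketch of returning a failure as a value, the hypothetical helper below parses a non-negative decimal int and returns OptionalInt.empty() on bad input instead of throwing. The names Parsing and tryParseInt are illustrative, not an existing API.

```java
import java.util.OptionalInt;

final class Parsing {
    /** Parses a non-negative decimal int; returns empty instead of throwing on bad input. */
    static OptionalInt tryParseInt(String s) {
        if (s == null || s.isEmpty()) {
            return OptionalInt.empty();
        }
        long value = 0;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c < '0' || c > '9') {
                return OptionalInt.empty(); // not a digit: expected failure, no exception
            }
            value = value * 10 + (c - '0');
            if (value > Integer.MAX_VALUE) {
                return OptionalInt.empty(); // overflow is also reported as "no value"
            }
        }
        return OptionalInt.of((int) value);
    }
}
```

A caller in a hot loop can then branch on isPresent() and never pays for exception construction on malformed input.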
Parameter Object: allocation cost¶
Each call allocates one DateRange object (if constructed at the call site).
For 10K req/s with 5 such calls per request: 50K small allocations/sec. JFR shows the cost; usually irrelevant.
Escape analysis to the rescue¶
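If the DateRange doesn't escape the call (the callee doesn't store it), HotSpot's EA + scalar replacement can eliminate the allocation. Verify by measuring rather than assuming: compare the allocation rate (for example with JFR, or with JMH's -prof gc) when escape analysis is on and when it is disabled with -XX:-DoEscapeAnalysis.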
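The shape that escape analysis tends to handle well looks roughly like the sketch below. The DateRange fields and the Bookings class are assumptions made for illustration; whether the allocation is actually removed depends on the JVM, its flags, and inlining decisions, so treat this as a candidate for scalar replacement rather than a guarantee.

```java
/** Hypothetical Parameter Object holding two day numbers. */
final class DateRange {
    final long startDay;
    final long endDay;

    DateRange(long startDay, long endDay) {
        this.startDay = startDay;
        this.endDay = endDay;
    }
}

final class Bookings {
    /** Small callee that only reads the range; it never stores or returns it. */
    static long nights(DateRange range) {
        return range.endDay - range.startDay;
    }

    /** Hot loop: one short-lived DateRange per iteration, constructed at the call site. */
    static long totalNights(long[] starts, long[] ends) {
        long total = 0;
        for (int i = 0; i < starts.length; i++) {
            total += nights(new DateRange(starts[i], ends[i]));
        }
        return total;
    }
}
```

If nights() were to store the range in a field or a collection, the object would escape and the allocation would have to stay.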
When EA fails¶
- Parameter Object is stored, returned, or captured by a lambda.
- Method body is too large to inline (EA can't see lifecycle).
Fix¶
Hot path: keep primitives, pay the verbose signature. Or wait for Project Valhalla.
Builder pattern: builder allocation¶
A typical builder allocates:
- The Builder object itself.
- A String[] or List<> for headers / collections.
- Possibly intermediate objects for fluent calls.
For one-time configuration (per-application setup): no concern.
For per-request hot paths: profile.
Mitigations¶
- ThreadLocal builder pool. Reuse builders across requests.
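- Skip the builder. Provide a one-shot factory: HttpRequest.of(url, headers, body, options).
- Compile-time builders. Lombok's @Builder (with toBuilder = false, which is its default) generates the builder code at compile time. This removes boilerplate, but the generated builder is still allocated on each use, so measure before counting it as a saving.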
For most cases: don't worry.
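To make the "skip the builder" option concrete, here is a small sketch with a hypothetical SimpleRequest type that offers both a fluent builder and a one-shot factory. The type and its fields are invented for illustration and are unrelated to java.net.http.HttpRequest.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical immutable request type with two construction paths. */
final class SimpleRequest {
    final String url;
    final List<String> headers;
    final String body;

    private SimpleRequest(String url, List<String> headers, String body) {
        this.url = url;
        this.headers = headers;
        this.body = body;
    }

    /** One-shot factory: no intermediate Builder object is allocated. */
    static SimpleRequest of(String url, List<String> headers, String body) {
        return new SimpleRequest(url, Collections.unmodifiableList(new ArrayList<>(headers)), body);
    }

    static Builder builder() {
        return new Builder();
    }

    /** Fluent builder: allocates the Builder plus its own header list. */
    static final class Builder {
        private String url;
        private final List<String> headers = new ArrayList<>();
        private String body = "";

        Builder url(String value) { this.url = value; return this; }
        Builder header(String value) { this.headers.add(value); return this; }
        Builder body(String value) { this.body = value; return this; }

        SimpleRequest build() {
            return SimpleRequest.of(url, headers, body);
        }
    }
}
```

Both paths produce the same object; the factory simply avoids the extra Builder and its list on paths where every field is already known.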
Factory method: cached vs. fresh¶
class Currency {
private static final Map<String, Currency> CACHE = ...;
public static Currency of(String code) {
return CACHE.computeIfAbsent(code, Currency::new);
}
}
Caching factory: zero allocation for repeated lookups.
Fresh-each-call factory: same allocation as new.
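A runnable version of the two factory styles might look like the sketch below. The ConcurrentHashMap and the extra fresh() method are assumptions added for the comparison; the snippet above leaves the map type open.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class Currency {
    // Assumption: a concurrent map, so lookups do not need external locking.
    private static final Map<String, Currency> CACHE = new ConcurrentHashMap<>();

    private final String code;

    private Currency(String code) {
        this.code = code;
    }

    /** Caching factory: repeated lookups for the same code return the same instance. */
    static Currency of(String code) {
        return CACHE.computeIfAbsent(code, Currency::new);
    }

    /** Fresh-each-call factory: allocates exactly like a direct constructor call. */
    static Currency fresh(String code) {
        return new Currency(code);
    }

    String code() {
        return code;
    }
}
```

Note that this cache never evicts, which is fine for a small fixed set such as currency codes and a leak for unbounded keys.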
When caching helps¶
- Small set of distinct values (currencies, statuses).
- Immutable instances.
- Lookup is read-mostly.
When caching hurts¶
- Cached instances pile up over time (memory leak).
- Callers start relying on identity comparison (== instead of equals), which breaks for any instance created outside the cache.
- Cache itself becomes a contention bottleneck (synchronized map).
See Java's Integer cache for an example of how caching factory methods can subtly affect identity semantics.
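The Integer case can be observed with a few lines. Integer.valueOf is documented to cache values from -128 to 127; outside that range the result of == depends on the JVM and its configuration, so the sketch only relies on the guaranteed range. The class name IntegerCacheDemo is illustrative.

```java
final class IntegerCacheDemo {
    /** Returns true when two valueOf calls for the same int yield the same object. */
    static boolean sameInstance(int value) {
        Integer a = Integer.valueOf(value);
        Integer b = Integer.valueOf(value);
        return a == b; // identity comparison on purpose
    }
}
```

Code that passes with == for small values and fails for large ones is the classic symptom; comparing with equals avoids the dependence on caching.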
Encapsulate Downcast and JIT¶
Original (in N callers): each call site performs its own cast, so the bytecode contains N checkcast opcodes.
After Encapsulate Downcast: one checkcast opcode, inside the method. After JIT it is often eliminated entirely (when the JIT can prove the cast always succeeds), whereas before each of the N casts had to be eliminated separately.
Net¶
Tiny code-size win. JIT effectively eliminates downcasts in steady state; main benefit is maintenance, not perf.
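A before/after sketch of the refactoring, using hypothetical Reading and ReadingLog types with untyped legacy storage:

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical value type stored in the log. */
final class Reading {
    final double value;

    Reading(double value) {
        this.value = value;
    }
}

final class ReadingLog {
    // Legacy untyped storage, which is what forces a cast somewhere.
    private final List<Object> entries = new ArrayList<>();

    void add(Reading reading) {
        entries.add(reading);
    }

    /** Before: returns Object, so every caller writes its own (Reading) cast. */
    Object lastRaw() {
        return entries.get(entries.size() - 1);
    }

    /** After Encapsulate Downcast: the single cast lives here and callers get a typed result. */
    Reading last() {
        return (Reading) entries.get(entries.size() - 1);
    }
}
```

Callers of last() no longer mention the cast at all, which is where the maintenance benefit comes from.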
Varargs and array allocation¶
Java varargs:
public void log(String msg, Object... args) { ... }
log("hello {}", name); // allocates new Object[] { name }
Each call allocates an array of arguments.
Cost¶
For a hot logger called millions of times: GC pressure.
Mitigations¶
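- Provide non-varargs overloads: SLF4J does this for the common one- and two-argument cases and falls back to varargs beyond that.
- Defer formatting: log.info("hello {}", name); doesn't format unless info is enabled.
- Conditional: guard the call with log.isInfoEnabled(), so neither the array nor the arguments are built when the level is off.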
Modern approach¶
Java's String.format and MessageFormat allocate too. Use SLF4J / log4j parameterized logging — handles the lazy evaluation.
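The overload mitigation relies on Java preferring a fixed-arity method over a varargs one when both apply. The sketch below uses a hypothetical TinyLogger that only counts which overload was chosen; it does no formatting and is not the SLF4J API.

```java
/** Hypothetical logger that records which overload the compiler selected. */
final class TinyLogger {
    int fixedArityCalls;
    int varargsCalls;

    /** One argument: no array is created at the call site. */
    void log(String format, Object arg) {
        fixedArityCalls++;
    }

    /** Two arguments: still no array. */
    void log(String format, Object arg1, Object arg2) {
        fixedArityCalls++;
    }

    /** Three or more arguments: the compiler wraps them in a new Object[]. */
    void log(String format, Object... args) {
        varargsCalls++;
    }
}
```

With these overloads in place, the common short calls bypass the array allocation without any change at the call sites.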
Go: receiver shape and copy cost¶
Go method on a struct:
// Two alternatives; a type cannot declare both for the same method name.
func (o Order) Total() Money { ... }  // value receiver: copies Order
func (o *Order) Total() Money { ... } // pointer receiver: passes pointer
For a 100-byte Order struct, value receiver = 100-byte memcpy per call.
Default¶
Use pointer receivers for any non-trivial struct. Use value receivers only for very small immutable types (e.g., time.Time, custom enum types).
Implication for refactoring¶
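When applying Replace Constructor with Factory Method or Hide Method, the receiver style is part of the signature. Switching from value to pointer receiver changes the per-call copy cost, and it also changes the type's method set, which affects which interfaces the value type satisfies.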
Python: keyword arguments and dict construction¶
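Python keyword arguments cost more than positional ones: the interpreter has to match each name to a parameter, and a function that accepts **kwargs builds a new dict on every call.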
When this matters¶
- 1M calls/sec in tight loops.
- Heavy use of **kwargs forwarding.
Mitigations¶
- Use positional in hot paths.
- Use __slots__ on classes to reduce attribute access cost.
- Profile with cProfile to find hot calls.
Modern Python¶
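CPython 3.11+ made function calls cheaper in general. PyPy's JIT can remove much of the remaining overhead in hot loops. Cython can compile calls between cdef functions to plain C calls, which skips Python argument parsing.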
Review questions¶
- What's the dispatch cost difference between private and public methods?
- Why is throwing an exception per call slow?
- How can you skip stack trace capture?
- How does escape analysis save Parameter Object's allocation?
- When does Builder pattern allocation matter?
- What's the benefit of a caching factory method?
- What's the JIT effect on Encapsulate Downcast?
- Why are varargs sometimes slow?
- How do Go value vs. pointer receivers affect cost?
- When are Python keyword arguments slower than positional?