Simplifying Conditionals — Optimize
12 cases where refactors are correct but introduce a perf cost.
Optimize 1 — Decompose Conditional adds method-call overhead in CPython hot loop (Python)
For 10M items, that's 10M extra Python method calls.
Cost & Fix
CPython method calls cost roughly 100 ns each, so 10M calls add about 10M × 100 ns = 1 second of overhead.

**Fix options:**

1. Inline the condition in hot inner loops.
2. Use a list comprehension or generator expression (CPython optimizes these).
3. Vectorize with NumPy, pandas, or polars.

Under PyPy or other JIT-compiled runtimes, this overhead is largely irrelevant.

Optimize 2 — Replace Conditional with Polymorphism creates megamorphic site (Java)
abstract class Discount { abstract double rate(); }
class FlatTen extends Discount { double rate() { return 0.10; } }
// ... and 30 other Discount subclasses
In a hot loop:
Cost & Fix
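One way to reduce receiver variety at a hot call site is to carry the rate as data instead of encoding it in a subclass. The sketch below is a minimal illustration, not the original code: `ParamDiscount` and `DiscountTotals` are hypothetical names, and it assumes every discount can be expressed as a plain rate.

```java
import java.util.List;

// Hypothetical replacement for a large Discount hierarchy: one concrete
// class, with the rate stored in a field rather than in a subclass.
final class ParamDiscount {
    private final double rate;

    ParamDiscount(double rate) {
        this.rate = rate;
    }

    double rate() {
        return rate;
    }
}

final class DiscountTotals {
    // Every element has the same concrete class, so the rate() call site
    // below only ever sees a single receiver type.
    static double totalDiscount(List<ParamDiscount> discounts, double price) {
        double total = 0.0;
        for (ParamDiscount d : discounts) {
            total += price * d.rate();
        }
        return total;
    }
}
```

Discounts that need genuinely different behaviour would still need their own types; this only helps when the subclasses differ in a constant.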
If the call site `o.discount().rate()` sees 30+ receiver types, it is megamorphic: the JIT can no longer inline the call and falls back to a vtable lookup, roughly 5-15× slower than a monomorphic call.

**Fix options:**

1. **Reduce variety:** can rates be parameters? `class Discount { final double rate; }` is one type, parameterized.
2. **Specialize hot paths:** if 90% of discounts are `FlatTen`, special-case it.
3. **Use enum + abstract method:** for closed sets, enum dispatch is often faster.

Optimize 3 — Guard clauses + Decompose adds inlining pressure (Java)
// Return type changed to Money so that it matches Money.ZERO.
Money process(Order o) {
    if (isInvalid(o)) return Money.ZERO;
    if (isCancelled(o)) return Money.ZERO;
    if (isOnHold(o)) return Money.ZERO;
    if (isFraud(o)) return Money.ZERO;
    return computePrice(o);
}
Each `is*` check is a separate method.
Cost & Fix
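If profiling shows that a guard-clause method has stopped being inlined into its callers, the guards can be written in place instead of as helpers. A minimal sketch, assuming a hypothetical `GuardedOrder` whose state is exposed as boolean flags and a `double` price standing in for `Money`:

```java
// Hypothetical stand-in for Order; the real type is not shown in the text.
final class GuardedOrder {
    final boolean invalid;
    final boolean cancelled;
    final boolean onHold;
    final boolean fraud;
    final double price;

    GuardedOrder(boolean invalid, boolean cancelled, boolean onHold,
                 boolean fraud, double price) {
        this.invalid = invalid;
        this.cancelled = cancelled;
        this.onHold = onHold;
        this.fraud = fraud;
        this.price = price;
    }
}

final class GuardedPricing {
    // One combined guard instead of four separate is* helper calls.
    static double process(GuardedOrder o) {
        if (o.invalid || o.cancelled || o.onHold || o.fraud) {
            return 0.0; // stands in for Money.ZERO
        }
        return computePrice(o);
    }

    static double computePrice(GuardedOrder o) {
        return o.price;
    }
}
```

This trades some readability for a smaller call graph, so it is only worth doing where a profiler points at this method.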
Each helper is small enough to inline (probably under the 35-byte `MaxInlineSize` bytecode limit), so HotSpot inlines them all into `process`. After inlining, `process` is larger and may exceed HotSpot's limits for being inlined into its own callers (`FreqInlineSize`, 325 bytecode bytes by default, bounds hot callees; `InlineSmallCode` bounds already-compiled ones).

**Fix:** Write the guards inline to keep `process` small if needed, or accept the inlining cliff if the perf cost is minor. Profile before deciding.

Optimize 4 — Pattern matching instanceof chain (Java 21)
return switch (s) {
    case Circle c -> ...;
    case Square sq -> ...;
    case Triangle t -> ...;
    case Pentagon p -> ...;
    case Hexagon h -> ...;
    // ... 20 cases
};
Cost & Fix
For sealed types with N cases, the pattern switch behaves like an `instanceof` chain: cases are tested in order, so a late or unmatched case costs time linear in N. With 5 cases this is fast; with 50 it starts to cost.

**Fix:**

1. Sort cases by frequency, most common first.
2. For very many cases, use a `Map`…

Optimize 5 — Null Object instance allocation in Go (Go)
type NullCustomer struct{}

func (NullCustomer) Name() string { return "guest" }

func GetCustomer(id int) Customer {
	if /* not found */ {
		return NullCustomer{}
	}
	return realCustomer
}
Cost & Fix
Each call to `GetCustomer` that returns `NullCustomer{}` **may** allocate a new value when it is converted to the `Customer` interface. The compiler usually avoids this (`NullCustomer{}` is empty, with no fields), but variants that carry fields may allocate.

**Fix:** Use a singleton: one package-level value returned on every miss. Or use a pointer if the interface dispatch matters.

Optimize 6 — Consolidate Conditional with side effects (Java)
if (auditLog.isEnabled()) auditLog.write(msg); // ❌ extra call
if (cacheStats.shouldUpdate()) cacheStats.bump();
becomes
if (auditEnabled() || cacheUpdateNeeded()) emit(...);
private boolean auditEnabled() { return auditLog.isEnabled(); } // duplicate work
Cost & Fix
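The "consolidation" does not preserve the original behaviour. With `||`, `cacheUpdateNeeded()` is skipped whenever `auditEnabled()` is true, so `emit` either has to re-check each condition to decide which action to run (duplicate work) or runs both actions when only one was enabled.

**Fix:** Don't consolidate ifs with side effects. Keep them separate.

Optimize 7 — Replace Conditional with Polymorphism: extra allocation per call (Java)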
abstract class PaymentMethod { abstract void charge(Money m); }
new CreditCard("4111...").charge(m); // ❌ allocates per call if not memoized
Cost & Fix
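When the payment objects are stateless, a small registry can hand out shared instances instead of constructing one per request. The sketch below is illustrative only: `CardPayment`, `WalletPayment`, `PaymentRegistry`, the string keys, and the `long cents` / `String` receipt signature are all assumptions made so that the example is self-contained.

```java
import java.util.Map;

// Simplified stand-in for the PaymentMethod hierarchy: charge() returns a
// receipt string so that the behaviour is observable in a test.
abstract class PaymentMethod {
    abstract String charge(long cents);
}

final class CardPayment extends PaymentMethod {
    @Override
    String charge(long cents) {
        return "card:" + cents;
    }
}

final class WalletPayment extends PaymentMethod {
    @Override
    String charge(long cents) {
        return "wallet:" + cents;
    }
}

final class PaymentRegistry {
    // Instances are created once at class initialization and then shared.
    private static final Map<String, PaymentMethod> METHODS = Map.of(
        "card", new CardPayment(),
        "wallet", new WalletPayment());

    static PaymentMethod forKind(String kind) {
        PaymentMethod method = METHODS.get(kind);
        if (method == null) {
            throw new IllegalArgumentException("unknown payment kind: " + kind);
        }
        return method;
    }
}
```

Sharing is only safe because these objects hold no per-request state; a method object that carries request data cannot be reused this way.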
If you're constructing a fresh `CreditCard` per request just to dispatch, you've added an allocation.

**Fix:** Reuse instances, typically held in a registry or DI container. If the method object holds per-request parameters, the allocation is necessary; keep the object short-lived and non-escaping so that escape analysis can elide it.

Optimize 8 — Decompose Conditional with virtual call (Java)
abstract class OrderRule { abstract boolean isEligible(Order o); }
List<OrderRule> rules = ...;
for (OrderRule rule : rules) {
    if (rule.isEligible(o)) ...; // ❌ virtual call per iteration
}
Cost & Fix
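When every rule must pass, the rule list can be folded into one predicate ahead of time. This is a minimal sketch using `java.util.function.Predicate`; `RuleCompiler` and `allOf` are hypothetical names, and the all-must-pass semantics is an assumption because the loop body above is elided.

```java
import java.util.List;
import java.util.function.Predicate;

final class RuleCompiler {
    // Folds a list of rules into one predicate that is true only when
    // every rule is true. An empty list yields an always-true predicate.
    static <T> Predicate<T> allOf(List<Predicate<T>> rules) {
        Predicate<T> combined = t -> true;
        for (Predicate<T> rule : rules) {
            combined = combined.and(rule); // short-circuits on the first false
        }
        return combined;
    }
}
```

Build the combined predicate once and reuse it for every order. Note that `Predicate.and` still invokes each underlying rule, so this moves the dispatch behind a single call site rather than removing it; whether the JIT inlines the chain is something to measure.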
Each `rule.isEligible` is a virtual call. For 10K orders × 20 rules, that's 200K virtual calls. If the call site is polymorphic across rule types, the JIT can't inline it.

**Fix options:**

1. **Compile rules to a single predicate:** the combined predicate is *one* method call per order.
2. **For hot paths:** specialize (code-gen the rule check).
3. **Profile:** the cost may be invisible.

Optimize 9 — Remove Control Flag changes loop optimizations (Java)
vs.
Cost & Fix
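To make the two loop shapes concrete, here is a hedged sketch of the same search written with a control flag and with an early return. `FirstNegative` and both method names are hypothetical; the original snippets are not preserved in this section.

```java
final class FirstNegative {
    // Control-flag form: the loop condition carries an extra `!done` test.
    static int withFlag(int[] xs) {
        int index = -1;
        boolean done = false;
        for (int i = 0; i < xs.length && !done; i++) {
            if (xs[i] < 0) {
                index = i;
                done = true;
            }
        }
        return index;
    }

    // Early-return form: the loop condition is only the bounds check.
    static int withReturn(int[] xs) {
        for (int i = 0; i < xs.length; i++) {
            if (xs[i] < 0) {
                return i;
            }
        }
        return -1;
    }
}
```

Both methods return the index of the first negative element, or -1 if there is none. Any speed difference between them depends on the JIT and the loop body, so measure before choosing on performance grounds.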
In tight numeric loops, JIT optimizers (vectorization, loop unrolling) sometimes prefer simpler termination conditions. The `&& !done` form may inhibit some optimizations.

**Fix:** Prefer the `break` form or extract a method that returns. Modern JITs handle both equally for small loops; for hot numerical loops, profile.

Optimize 10 — Introduce Assertion on hot path (Java)
double charge(double amount, double rate) {
    assert amount >= 0;
    assert rate >= 0 && rate < 1;
    return amount * rate;
}
Cost & Fix
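If production-time checks are wanted, one option is to validate once at the boundary with an ordinary exception and keep `assert` for the inner method. The sketch below assumes hypothetical names (`Charges`, `chargeUnchecked`) and uses `IllegalArgumentException` because the arguments are primitives.

```java
final class Charges {
    // Boundary method: always-on validation, independent of -ea.
    // The negated comparisons also reject NaN.
    static double charge(double amount, double rate) {
        if (!(amount >= 0)) {
            throw new IllegalArgumentException("amount must be >= 0: " + amount);
        }
        if (!(rate >= 0 && rate < 1)) {
            throw new IllegalArgumentException("rate must be in [0, 1): " + rate);
        }
        return chargeUnchecked(amount, rate);
    }

    // Inner method: assertions document the contract and only run with -ea.
    static double chargeUnchecked(double amount, double rate) {
        assert amount >= 0;
        assert rate >= 0 && rate < 1;
        return amount * rate;
    }
}
```

Callers inside a hot loop that have already validated their inputs can call `chargeUnchecked` directly and pay nothing when assertions are disabled.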
With `-ea`, both assertions run on every call: three comparisons per call, so up to 3M comparisons of overhead for 1M calls. In production, `-ea` is typically off and the cost is effectively zero.

**Fix:** No fix is needed if production runs without `-ea`. If you want production checks, use an always-on check such as `Objects.requireNonNull` for references or an explicit `IllegalArgumentException` for value ranges, but only at boundaries.

Optimize 11 — Decision table allocates per call (Java)
String tier(double spend, int years) {
    record Rule(BiPredicate<Double, Integer> p, String tier) {}
    return Stream.of(
            new Rule((s, y) -> y > 10 && s > 5000, "GOLD"),
            new Rule((s, y) -> s > 2000, "GOLD"),
            // ... 30 rules
        ).filter(r -> r.p.test(spend, years))
        .findFirst()
        .map(Rule::tier)
        .orElse("NONE");
}
Cost & Fix
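A decision table can be built once and scanned with a plain loop. The following is a sketch under assumptions: `Tiers` and `SpendRule` are hypothetical names, `SpendRule` is a primitive-typed functional interface introduced here to avoid boxing, and only the two rules visible in the snippet above are included.

```java
final class Tiers {
    // Primitive-typed rule so that spend and years are never boxed.
    interface SpendRule {
        boolean test(double spend, int years);
    }

    record Rule(SpendRule p, String tier) {}

    // Built once when the class is initialized. The lambdas capture nothing.
    private static final Rule[] RULES = {
        new Rule((s, y) -> y > 10 && s > 5000, "GOLD"),
        new Rule((s, y) -> s > 2000, "GOLD"),
    };

    // Scans the rules in order and returns the first matching tier.
    static String tier(double spend, int years) {
        for (Rule r : RULES) {
            if (r.p().test(spend, years)) {
                return r.tier();
            }
        }
        return "NONE";
    }
}
```

Iterating over an array with an enhanced `for` loop needs no iterator object, so a call to `tier` allocates nothing on its own.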
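Each call:

- Allocates the array of `Rule` objects.
- Creates a `Stream`.
- Allocates the capturing `filter` lambda and boxes `spend` and `years` for each `test` call (the rule lambdas themselves capture nothing).

At 10K calls/sec this produces garbage on the order of MB/sec.

**Fix:** Hoist the rules to a static field and scan them with an explicit loop, so that nothing is allocated per call.

Optimize 12 — Switch to polymorphism breaks JIT-specialized switch (Java)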
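A dense `switch` compiles to a `tableswitch`, for which the JIT emits a jump-table dispatch costing roughly one cycle. Polymorphism on an enum is also fast: enum constants are singletons, and the call is monomorphic as long as all constants share one class (constants with their own bodies are separate anonymous subclasses).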
Cost & Fix
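For comparison, here is a hedged sketch of both shapes on a small closed set. `PriceTier`, `TierPricing`, and the factor values are hypothetical; the original snippets are not preserved in this section.

```java
// Field-based enum: every constant is an instance of the same class,
// so factor() has a single implementation.
enum PriceTier {
    BASIC(1.0), PLUS(0.9), PREMIUM(0.8);

    private final double factor;

    PriceTier(double factor) {
        this.factor = factor;
    }

    double factor() {
        return factor;
    }
}

final class TierPricing {
    // Switch form: the factors live next to the dispatch.
    static double viaSwitch(PriceTier t, double price) {
        return switch (t) {
            case BASIC -> price * 1.0;
            case PLUS -> price * 0.9;
            case PREMIUM -> price * 0.8;
        };
    }

    // Polymorphic form: the factor lives on the enum constant.
    static double viaField(PriceTier t, double price) {
        return price * t.factor();
    }
}
```

Both forms compute the same result; for a set this small, the choice is mostly about which one reads better.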
Both are fast. The polymorphic version may be slightly slower if the enum's `factor()` body is too large for the JIT to inline.

**Fix:** Keep the simple switch when:

- The set of cases is closed.
- Each case is a one-liner.
- Adding a new case is rare.

Don't introduce polymorphism for clarity if the simple switch is clearer. Refactoring isn't always toward more abstraction.

Patterns
| Refactor | Cost |
|---|---|
| Decompose Conditional in CPython hot loop | Per-iteration call overhead |
| Polymorphism on megamorphic site | Vtable lookup per call |
| Stream-based decision table | Per-call allocation |
| Inlined helpers exceeding inline budget | Caller no longer inlined |
| Assertion in hot path with `-ea` | Comparison per call |