Factory Method — Optimize¶
10 inefficient implementations + benchmarks + optimized version + tradeoffs.
Table of Contents¶
- Optimization 1: Cache the Supplier instead of class lookups
- Optimization 2: Replace Reflection with LambdaMetafactory
- Optimization 3: Avoid megamorphism by splitting hot paths
- Optimization 4: Object pool for expensive products
- Optimization 5: Map.of vs ConcurrentHashMap for read-only registry
- Optimization 6: Cache prepared statements per type
- Optimization 7: Lazy import for heavy plugin chains (Python)
- Optimization 8: Specialize Go interface dispatch
- Optimization 9: Static factory methods for monomorphic call sites
- Optimization 10: Eliminate factory entirely for closed-set variants
Benchmarks: Apple M2 Pro, single thread.
Optimization 1: Cache the Supplier instead of class lookups¶
Slow code (Java)¶
public Object create(String type) throws Exception {
Class<?> c = Class.forName("com.example." + type);
return c.getDeclaredConstructor().newInstance();
}
Benchmark¶
Class.forName is slow on cold cache (~50 µs); even cached, the security and resolution path adds ~200 ns per call.
Optimized — pre-resolved Supplier¶
private static final Map<String, Supplier<Object>> FACTORIES = Map.of(
"Foo", Foo::new,
"Bar", Bar::new,
"Baz", Baz::new
);
public Object create(String type) {
Supplier<Object> s = FACTORIES.get(type);
if (s == null) throw new IllegalArgumentException(type);
return s.get();
}
Benchmark after¶
~60× speedup. Method references are JVM-internal optimizations; Supplier::get after JIT is a direct call.
Tradeoff¶
You must enumerate types at compile time. For genuinely dynamic types (loaded from JARs at runtime), reflection is unavoidable — but cache the Constructor<?> after first lookup.
Optimization 2: Replace Reflection with LambdaMetafactory¶
Slow code (Java)¶
public class DynamicFactory {
private final Constructor<?> ctor;
public DynamicFactory(Class<?> c) throws Exception {
this.ctor = c.getDeclaredConstructor();
}
public Object create() throws Exception {
return ctor.newInstance();
}
}
Benchmark¶
Reflective newInstance has overhead per call.
Optimized — LambdaMetafactory¶
import java.lang.invoke.*;
import java.util.function.Supplier;
public class FastFactory {
private final Supplier<?> supplier;
public FastFactory(Class<?> c) throws Throwable {
MethodHandles.Lookup lookup = MethodHandles.lookup();
MethodHandle ctor = lookup.findConstructor(c, MethodType.methodType(void.class));
CallSite site = LambdaMetafactory.metafactory(
lookup,
"get",
MethodType.methodType(Supplier.class),
MethodType.methodType(Object.class),
ctor,
MethodType.methodType(c)
);
this.supplier = (Supplier<?>) site.getTarget().invoke();
}
public Object create() { return supplier.get(); }
}
Benchmark after¶
~30× speedup. LambdaMetafactory generates an invokedynamic call site that the JIT inlines.
Tradeoff¶
- Setup is more complex.
- API may change between JDK versions (mostly stable since Java 8).
- Worth it for hot paths, not for one-off creations.
Optimization 3: Avoid megamorphism by splitting hot paths¶
Slow code¶
public class HandlerDispatch {
public void dispatch(Request req, HandlerFactory factory) {
Handler h = factory.create(req.type); // 50+ Concrete Creators
h.handle(req);
}
}
JIT sees 50 different Concrete Creators at this site. Cannot inline; falls back to vtable. ~5 ns per dispatch.
Optimized — split hot vs cold paths¶
public class HandlerDispatch {
private final HandlerFactory hotFactory; // 1-2 hot types
private final HandlerFactory coldFactory; // everything else
public void dispatch(Request req) {
Handler h;
if (req.type.equals("/login") || req.type.equals("/api/v1/users")) {
h = hotFactory.create(req.type); // monomorphic call site
} else {
h = coldFactory.create(req.type); // megamorphic, but only 5% of traffic
}
h.handle(req);
}
}
Benchmark¶
| Naive | Split | |
|---|---|---|
| Hot path call | 5 ns | 1 ns |
| Cold path call | 5 ns | 5 ns |
| 95% hot, 5% cold | 5 ns avg | 1.2 ns avg |
~4× speedup on aggregate for skewed traffic.
Tradeoff¶
- Code duplication: two factory paths.
- Profile-driven: requires production data.
- Worth it for genuinely hot, skewed paths.
Optimization 4: Object pool for expensive products¶
Slow code (Java)¶
public class ConnectionFactory {
public Connection create(String dsn) throws SQLException {
return DriverManager.getConnection(dsn); // ~50 ms TCP + auth
}
}
Each request opens a new connection. Average request time: 50 ms baseline.
Optimized — pool¶
public class ConnectionFactory {
private final ArrayBlockingQueue<Connection> pool = new ArrayBlockingQueue<>(20);
private final String dsn;
public ConnectionFactory(String dsn, int prefill) throws SQLException {
this.dsn = dsn;
for (int i = 0; i < prefill; i++) {
pool.offer(DriverManager.getConnection(dsn));
}
}
public Connection borrow() throws SQLException {
Connection c = pool.poll();
return c != null ? c : DriverManager.getConnection(dsn);
}
public void release(Connection c) {
if (!pool.offer(c)) {
try { c.close(); } catch (Exception ignored) {}
}
}
}
Benchmark¶
| New per request | Pooled | |
|---|---|---|
| Average request | 50 ms | 0.1 ms |
| Connection setup amortized | n/a | once per app lifetime |
500× speedup for the connection-establishment phase.
In production, prefer a battle-tested pool: HikariCP (Java), pgxpool (Go), asyncpg.Pool (Python).
Tradeoff¶
- Connection state must be reset between borrows.
- Pool sizing requires tuning.
- Long-idle connections can be dropped by the DB; need health checks.
Optimization 5: Map.of vs ConcurrentHashMap for read-only registry¶
Slow code¶
private static final Map<String, Supplier<Handler>> REG = new ConcurrentHashMap<>();
static {
REG.put("a", AHandler::new);
REG.put("b", BHandler::new);
// ...50 entries
}
ConcurrentHashMap.get() is fast but has internal coordination overhead.
Benchmark¶
Optimized — immutable Map¶
private static final Map<String, Supplier<Handler>> REG = Map.of(
"a", AHandler::new,
"b", BHandler::new,
// ... up to 10 entries; otherwise Map.ofEntries
);
Or for >10:
private static final Map<String, Supplier<Handler>> REG = Map.ofEntries(
Map.entry("a", AHandler::new),
Map.entry("b", BHandler::new),
// ...
);
Benchmark after¶
2× speedup on the lookup. Map.of returns a specialized read-only implementation with no synchronization overhead.
Tradeoff¶
- Cannot register new entries at runtime.
- Compile-time cap of 10 entries for
Map.of(useMap.ofEntriesfor more). - For startup-only registries, this is the right call.
Optimization 6: Cache prepared statements per type¶
Slow code (Python)¶
class QueryFactory:
def create(self, table: str) -> str:
return f"SELECT * FROM {table} WHERE id = ?"
# Caller does:
for tbl in ["users", "orders", "items"] * 1000:
sql = QueryFactory().create(tbl)
cursor.execute(sql, (id,))
Each call rebuilds the f-string and recompiles the SQL.
Optimized — cache¶
from functools import lru_cache
class QueryFactory:
@lru_cache(maxsize=128)
def create(self, table: str) -> str:
return f"SELECT * FROM {table} WHERE id = ?"
Benchmark¶
| No cache | LRU cache | |
|---|---|---|
| Per call | 800 ns | 80 ns |
10× speedup on the factory call.
For SQL: pair with prepared-statement caching at the DB driver level (psycopg's prepared_statement_cache_size).
Tradeoff¶
- Cache size must bound memory.
- Cache invalidation on schema changes.
- Pure functions are easiest to cache.
Optimization 7: Lazy import for heavy plugin chains (Python)¶
Slow code¶
# myapp/plugins/__init__.py
from .csv import CsvPlugin
from .xml import XmlPlugin
from .pdf import PdfPlugin # 200 ms to import (loads reportlab, etc.)
from .video import VideoPlugin # 500 ms to import (loads cv2)
PLUGINS = {
"csv": CsvPlugin,
"xml": XmlPlugin,
"pdf": PdfPlugin,
"video": VideoPlugin,
}
def create(kind: str):
return PLUGINS[kind]()
App startup: 700 ms even for users who only need CSV.
Optimized — lazy factories¶
# myapp/plugins/__init__.py
import importlib
_FACTORIES = {
"csv": ("myapp.plugins.csv", "CsvPlugin"),
"xml": ("myapp.plugins.xml", "XmlPlugin"),
"pdf": ("myapp.plugins.pdf", "PdfPlugin"),
"video": ("myapp.plugins.video", "VideoPlugin"),
}
def create(kind: str):
module_name, class_name = _FACTORIES[kind]
module = importlib.import_module(module_name)
return getattr(module, class_name)()
Benchmark¶
| Eager imports | Lazy imports | |
|---|---|---|
| Startup time | 700 ms | 50 ms |
| First call to PDF plugin | 0 ms | 200 ms |
| First call to CSV plugin | 0 ms | 0 ms (already loaded as part of factory module) |
14× faster startup. Cold-path plugins pay their cost only when used.
Tradeoff¶
- First call latency increases.
- Errors in plugin imports are deferred — may surprise late.
- Worth it for CLI tools and long-running processes alike.
Optimization 8: Specialize Go interface dispatch¶
Slow code¶
type Hasher interface{ Sum(data []byte) []byte }
type sha256H struct{}
func (sha256H) Sum(d []byte) []byte { /* ... */ return nil }
type md5H struct{}
func (md5H) Sum(d []byte) []byte { /* ... */ return nil }
func NewHasher(kind string) Hasher {
if kind == "sha256" { return sha256H{} }
return md5H{}
}
// Hot loop
for _, x := range data {
h := NewHasher("sha256")
h.Sum(x) // interface call, no inlining
}
Each h.Sum(x) goes through itab dispatch.
Optimized — concrete return for hot path¶
// Two functions: one returns concrete, one returns interface.
func NewSHA256() sha256H { return sha256H{} }
// Hot loop uses concrete
h := NewSHA256()
for _, x := range data {
h.Sum(x) // direct call, inlined
}
Benchmark¶
| Interface | Concrete | |
|---|---|---|
h.Sum per call | 3.5 ns | 0.5 ns |
7× speedup on hot loops.
Tradeoff¶
- Loses polymorphism — caller must know the concrete type.
- Acceptable when type is fixed for the loop's duration.
- Don't apply broadly; the compiler can't know which call sites are hot.
Optimization 9: Static factory methods for monomorphic call sites¶
Slow code (Java)¶
abstract class Logger {
abstract LogEntry create(String msg);
}
class StdLogger extends Logger {
LogEntry create(String msg) { return new LogEntry(msg); }
}
// Single-typed call site
Logger logger = new StdLogger();
LogEntry e = logger.create("hi");
Even monomorphic, the dispatch goes through INVOKEVIRTUAL. JIT inlines after warmup, but cold call sites pay full cost.
Optimized — static factory method¶
public final class LogEntry {
public static LogEntry of(String msg) {
return new LogEntry(msg);
}
private LogEntry(String msg) { /* ... */ }
}
// Caller
LogEntry e = LogEntry.of("hi"); // INVOKESTATIC, no vtable
Benchmark¶
| Virtual factory | Static factory | |
|---|---|---|
| Cold call | 5 ns | 1 ns |
| Hot (JIT inlined) | 1 ns | 1 ns |
5× faster on cold paths.
Tradeoff¶
- No polymorphism (cannot override).
- For closed-set products, static factories (
Integer.valueOf,List.of) are ideal. - Don't confuse with GoF Factory Method — static factory is a different pattern.
Optimization 10: Eliminate factory entirely for closed-set variants¶
Slow code (Python)¶
class ColorFactory:
@staticmethod
def create(name: str) -> "Color":
if name == "red": return Red()
if name == "green": return Green()
if name == "blue": return Blue()
raise ValueError(name)
c = ColorFactory.create("red")
For three fixed colors, the factory adds boilerplate without flexibility.
Optimized — direct enum¶
from enum import Enum
class Color(Enum):
RED = (255, 0, 0)
GREEN = (0, 255, 0)
BLUE = (0, 0, 255)
c = Color.RED # zero overhead
Benchmark¶
| Factory call | Enum access | |
|---|---|---|
| Per access | 250 ns | 50 ns |
5× faster. Plus: type-safe at the source level.
Tradeoff¶
- Enums are static — cannot extend at runtime.
- Add new colors = recompile.
- For closed sets, this is the right design.
Optimization Tips¶
How to find Factory Method bottlenecks¶
- Profile.
async-profiler(Java),pprof(Go),py-spy(Python). - Look for time in
Class.forName,newInstance,Constructor.getDeclared*. - JIT warnings: Java's
-XX:+PrintInliningreports failed inlines (megamorphism). - GC pressure: factories that allocate a lot show up in alloc profiles.
- Benchmark before and after. Don't assume.
Optimization checklist¶
- Cache
Supplier/Functioninstead ofClass.forName. - Use
LambdaMetafactoryfor dynamic creation on hot path. - Avoid megamorphic call sites; split hot paths.
- Pool expensive products instead of recreating.
- Use immutable maps (
Map.of) for read-only registries. - Cache pure factories with
lru_cache. - Lazy-import heavy plugins.
- Return concrete types in Go hot loops.
- Static factory methods for monomorphic, closed-set products.
- Replace factory with enum if variants are fixed.
Anti-optimizations¶
- ❌ Caching mutable products. Causes shared-state bugs.
- ❌ Reflection on the hot path. Use MethodHandle / LambdaMetafactory.
- ❌ Premature pooling. Most objects are cheap to construct.
- ❌ Eager initialization for cold paths. Lazy is often better.
- ❌ Switching to interface for hot loops in Go. Concrete is faster.
← Find-Bug · Creational · Roadmap
Factory Method roadmap complete. All 8 files: junior · middle · senior · professional · interview · tasks · find-bug · optimize.
Next pattern: Abstract Factory (pending).