Interview

Interview questions on mental models of systems — the internal representations engineers use to predict behavior, how to build and validate them, and how they fail (drift, divergence, happy-path-only). Answers are short and precise, with the trap and a follow-up where it sharpens the point. These probe reasoning quality, not trivia.

Q1. What is a mental model, and what's the test of whether yours is good?

A mental model is a simplified internal representation of a system that lets you predict its behavior without running it. The test is prediction, not detail: a good model correctly answers "if I do X, what happens?" for the questions you care about. Detail is not the goal — a useful model deliberately omits most of the system, the way a useful map omits most of the terrain.

Trap: candidates equate "good model" with "complete model." Completeness is impossible and useless; predictive accuracy for the relevant questions is the bar.

Q2. There's a common claim: "you debug your model, not the system." Explain it.

When you debug, the running system is doing exactly what it's programmed to do. The bug is the gap between your model's prediction and the system's actual behavior. The feeling of being "lost" while debugging is the symptom of a wrong model — nothing makes sense because you're reasoning from a picture that doesn't match reality. The fix is to rebuild the model until it predicts what you're observing, not to change code at random.

Follow-up — "So how do you get un-lost?" Stop changing things. Re-trace from a point you're certain of and move toward the failure until prediction and reality split — that split is the bug, and it's also the part of your model that was wrong.
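The re-trace can be sketched as a walk over checkpoints, stopping at the first place where prediction and observation split. This is a minimal illustration, not a debugging tool: the checkpoint names and values are hypothetical, and in practice the "observed" column comes from logs, breakpoints, or traces.

```python
from typing import Optional, Sequence, Tuple

# (checkpoint name, what the model predicted, what was actually observed)
# All names and values here are hypothetical.
Checkpoint = Tuple[str, object, object]


def first_divergence(checkpoints: Sequence[Checkpoint]) -> Optional[str]:
    """Return the first checkpoint where prediction and reality differ."""
    for name, predicted, observed in checkpoints:
        if predicted != observed:
            return name
    return None  # the model matched reality at every checkpoint


trace = [
    ("request parsed", "user_id=42", "user_id=42"),
    ("cache lookup", "miss", "miss"),
    ("db query", "1 row", "0 rows"),
    ("response", "200", "404"),
]
print(first_divergence(trace))  # prints "db query"
```

The 404 is where the failure was noticed, but the query step is where the model first went wrong, so that is where to look.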

Q3. How do you build a mental model of an unfamiliar system?

Concrete, evidence-driven moves (this is hypothesis-driven thinking applied to understanding):

Read the code — names lie, code doesn't.
Trace one real request end to end — entry → middleware → controller → data layer → response. Write the path down.
Draw the diagram — boxes and arrows force the fuzzy picture into something checkable; the arrow you can't label is the gap.
Run a small experiment — a log line, a breakpoint, a curl — to test a specific assumption.
Add the failure branches — for each box, ask what happens when it's slow, down, or returns garbage.

Trap: answering "I'd read the docs." Docs drift; the highest-value move is tracing a real request through the actual code.

Q4. State Little's Law and give an engineering application.

L = λ × W: average items in the system = arrival rate × average time in system. Application: a service handles λ = 2000 req/s at average latency W = 50 ms, so in-flight requests L = 2000 × 0.05 = 100 — you need ~100 concurrent slots (threads/connections). Or invert it: a 50-connection pool holding each request 100 ms gives a hard ceiling of λ = L/W = 50/0.1 = 500 req/s.

Trap: forgetting that it relates long-run averages in steady state (long-run inflow = outflow). You can't plug in instantaneous numbers mid-spike while the queue is still growing.
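Both directions of the law fit in two one-line functions. A minimal sketch using the numbers from the answer above; the function names are illustrative, and the results are only meaningful as steady-state averages.

```python
def in_flight(arrival_rate_per_s: float, avg_time_s: float) -> float:
    """L = lambda * W: average number of requests in the system."""
    return arrival_rate_per_s * avg_time_s


def max_throughput(concurrency: float, avg_time_s: float) -> float:
    """lambda = L / W: throughput ceiling for a fixed number of slots."""
    return concurrency / avg_time_s


# 2000 req/s at 50 ms average latency: about 100 requests in flight.
print(in_flight(2000, 0.050))

# 50 connections, each held for 100 ms: ceiling of about 500 req/s.
print(max_throughput(50, 0.100))
```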

Q5. A connection pool has 30 connections; each query holds one for 200 ms. What's the max sustainable throughput, and what happens above it?

λ = L / W = 30 / 0.2 = 150 queries/s. Above 150 q/s, acquisitions outpace releases, so the "waiting requests" stock grows: callers block on pool acquisition, latency climbs, and you eventually see pool-exhausted errors or upstream timeouts. The fix is to raise outflow (faster queries, or more connections if the DB can take it) or lower inflow (shed load, cache).

Follow-up: "Why not just set the pool to 1000?" Because the database has its own L=λW ceiling; a huge pool just moves the queue from your app to the DB and can thrash it.
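To see how fast the waiting stock grows above the ceiling, here is a rough fluid approximation. It assumes a constant arrival rate, a constant hold time, and an initially empty queue; real traffic is burstier, so treat the output as an order-of-magnitude estimate. The function names are illustrative.

```python
def pool_ceiling(pool_size: int, hold_s: float) -> float:
    """Max sustainable queries/s for the pool (Little's Law, lambda = L / W)."""
    return pool_size / hold_s


def backlog_after(arrival_rate: float, pool_size: int,
                  hold_s: float, seconds: float) -> float:
    """Approximate number of callers waiting after `seconds` of constant load."""
    excess = arrival_rate - pool_ceiling(pool_size, hold_s)
    # Below the ceiling there is no sustained backlog in this simple model.
    return max(0.0, excess * seconds)


print(pool_ceiling(30, 0.2))             # the 150 q/s ceiling from the answer
print(backlog_after(180, 30, 0.2, 60))   # 30 q/s over the ceiling for a minute
print(backlog_after(120, 30, 0.2, 60))   # under the ceiling: nothing queues
```

Only 20% over the ceiling, and a minute later roughly 1800 callers are waiting. That is why the symptom shows up as timeouts rather than as a slightly slower service.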

Q6. Explain stocks and flows, and why they predict outages.

From Donella Meadows: a stock accumulates (queue depth, connections in use, bytes in a buffer, consumer lag); a flow is the rate that changes it (req/s in, jobs/s out). The governing rule: the stock's rate of change = inflow − outflow, so over an interval Δt, Δstock = (inflow − outflow) × Δt. Nearly every saturation outage is a stock filling because inflow exceeded outflow for long enough: request queue, disk, Kafka lag, thread pool. Once you see it as a stock, the only fixes are "drain faster" or "fill slower." There is no third option.
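The rule is easy to simulate step by step. This is a minimal sketch with made-up numbers (a queue that drains at 150 items/s and a 4-second spike to 250 items/s); the function name and parameters are illustrative.

```python
from typing import List, Sequence


def simulate_stock(inflows: Sequence[float], outflow_capacity: float,
                   initial: float = 0.0, dt: float = 1.0) -> List[float]:
    """Track a stock over time steps; it can never drop below zero."""
    stock = initial
    history = []
    for inflow in inflows:
        available = stock + inflow * dt
        # Outflow is capped by capacity and by what is there to drain.
        drained = min(available, outflow_capacity * dt)
        stock = available - drained
        history.append(stock)
    return history


# 3 s of normal load, a 4 s spike, then 6 s of normal load again.
inflows = [100] * 3 + [250] * 4 + [100] * 6
history = simulate_stock(inflows, outflow_capacity=150)
print(history)
```

The spike lasts 4 seconds, but the queue is still not empty 6 seconds after it ends: the backlog drains only at the spare capacity (150 − 100 = 50 items/s), which is why recovery usually takes longer than the event that caused it.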

Q7. Why must a mental model include failure behavior, not just the happy path?

Because your model is most needed during an incident, and an incident is by definition off the happy path. A model that only knows "request → data → response" is silent exactly when an outage is happening. For every component you should be able to answer what happens when it's slow, down, or returns bad data — including dynamics, like a slow dependency saturating a pool and triggering a retry storm (a reinforcing feedback loop).

Trap: treating failure as "it throws an error." The interesting failures are propagating ones — cascades, retry amplification, partial degradation.
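Retry amplification can be put into numbers. A rough sketch under simplifying assumptions: every attempt fails independently with the same probability, there is no backoff or retry budget, and in the worst case every layer of the call chain retries on its own. The function names are illustrative.

```python
def attempts_per_request(fail_prob: float, max_retries: int) -> float:
    """Expected attempts per request when each attempt fails with fail_prob.

    Attempt k+1 only happens if the first k attempts all failed,
    which has probability fail_prob ** k.
    """
    return sum(fail_prob ** k for k in range(max_retries + 1))


def worst_case_amplification(max_retries: int, layers: int) -> int:
    """Load multiplier at the bottom layer when everything fails
    and every layer retries independently."""
    return (1 + max_retries) ** layers


print(attempts_per_request(0.5, 2))    # 1 + 0.5 + 0.25 = 1.75
print(worst_case_amplification(3, 3))  # 4 * 4 * 4 = 64
```

Three layers with three retries each can turn one user request into 64 calls on the dependency that is already struggling, which is the reinforcing loop behind a retry storm.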

Q8. What is model drift, and how do you defend against it?

Drift is the gap that opens between your model and the system as the system changes but your model doesn't. Symptoms: stale runbooks, "it used to work that way," wiki diagrams showing an architecture two reorgs old, configs tuned for vanished load profiles. It's dangerous because the stale model keeps giving confident answers — they're just wrong now. Defenses: treat surprises as drift detectors, update the artifact (diagram/runbook) not just your head, prefer self-updating maps (service topology from traces) over hand-drawn ones, and run postmortems as drift audits.
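One way to make "self-updating maps" concrete is to diff the documented call graph against the edges seen in traces. A minimal sketch: the service names are hypothetical, and how the observed edges are extracted depends on your tracing system and is not shown.

```python
from typing import Dict, Set, Tuple

Edge = Tuple[str, str]  # (caller, callee)


def topology_drift(documented: Set[Edge],
                   observed: Set[Edge]) -> Dict[str, Set[Edge]]:
    """Compare the diagram's edges with the edges seen in production."""
    return {
        # Real calls the diagram doesn't know about.
        "undocumented": observed - documented,
        # Diagram edges nothing exercises any more.
        "stale": documented - observed,
    }


# Hypothetical example data.
documented = {("web", "orders"), ("orders", "billing"), ("orders", "legacy-db")}
observed = {("web", "orders"), ("orders", "billing"), ("orders", "cache")}

drift = topology_drift(documented, observed)
print(drift["undocumented"])  # {('orders', 'cache')}
print(drift["stale"])         # {('orders', 'legacy-db')}
```

A stale edge may just mean the path wasn't exercised during the sampling window, so treat the diff as a list of questions to ask, not as the corrected diagram.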

Q9. "The map is not the territory." What's its practical consequence for an engineer?

Your diagram is always a simplification and always a bit out of date, so you must hold it with calibrated confidence: firm enough to make decisions quickly, loose enough to drop it the instant evidence contradicts it. Over-trusting the map means you keep fixing the wrong component; under-trusting it means paralysis. The skill is knowing which parts of your model are well-validated and which are assumptions, and spending verification on the load-bearing assumptions.

Q10. Name reusable mental models every backend engineer should carry.

Memory hierarchy + latency numbers (Jeff Dean): RAM ~100 ns, SSD ~100 µs, cross-continent network ~150 ms — orders of magnitude that explain caching and why N+1 queries kill you.
Little's Law — concurrency, pool sizing, throughput ceilings.
Stocks & flows (Meadows) — why any queue grows or drains.
CAP / PACELC — what you trade under a partition.
USE method (Gregg): Utilization, Saturation, Errors per resource. RED: Rate, Errors, Duration per service.
The end-to-end argument (Saltzer, Reed, Clark) — guarantees belong at the endpoints that care.
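The latency numbers earn their keep in back-of-envelope arithmetic, for example the cost of an N+1 query pattern. A rough sketch: the three constants come from the list above, while the 0.5 ms same-datacenter round trip is an added assumption, and query execution time is ignored entirely.

```python
# Orders of magnitude from the list above, in nanoseconds.
RAM_NS = 100
SSD_NS = 100_000
CROSS_CONTINENT_NS = 150_000_000

# Assumption, not from the list above: roughly 0.5 ms per round trip
# inside one datacenter.
SAME_DC_ROUND_TRIP_NS = 500_000


def n_plus_one_ms(rows: int, round_trip_ns: int) -> float:
    """Network time for 1 list query plus 1 query per row, in milliseconds."""
    return (rows + 1) * round_trip_ns / 1e6


def batched_ms(round_trip_ns: int) -> float:
    """Network time for 1 list query plus 1 batched lookup, in milliseconds."""
    return 2 * round_trip_ns / 1e6


print(n_plus_one_ms(100, SAME_DC_ROUND_TRIP_NS))  # 50.5
print(batched_ms(SAME_DC_ROUND_TRIP_NS))          # 1.0
print(n_plus_one_ms(100, CROSS_CONTINENT_NS))     # 15150.0
```

For 100 rows the round trips alone cost about 50 ms in one datacenter and about 15 seconds across continents, before the database does any work.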

Q11. How do USE and RED function as mental models rather than just metrics?

They're diagnostic checklists encoded as models, so you never stare blankly at a slow system. USE (for resources): check Utilization, Saturation, Errors. A pool at 100% utilization with growing saturation is the bottleneck. RED (for services): Rate, Errors, Duration give a compact picture of health as callers experience it. Carrying them means "the system is slow" has a defined next move instead of a guess.

Follow-up: "USE vs RED — when each?" USE for resources (CPU, disk, pool, NIC); RED for request-serving services. Use both: RED localizes the slow service, USE finds the saturated resource inside it.
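The USE half of that workflow can be sketched as a filter over per-resource readings. This is illustrative only: the resource names and numbers are made up, and the 90% utilization threshold is an assumption, not part of the method.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ResourceUSE:
    name: str
    utilization: float  # fraction busy, 0.0 to 1.0
    saturation: int     # work queued or waiting for the resource
    errors: int


def suspects(resources: List[ResourceUSE],
             util_threshold: float = 0.9) -> List[str]:
    """Names of resources worth a closer look, most saturated first."""
    flagged = [
        r for r in resources
        if r.errors > 0 or r.saturation > 0 or r.utilization >= util_threshold
    ]
    flagged.sort(key=lambda r: (r.saturation, r.utilization), reverse=True)
    return [r.name for r in flagged]


# Hypothetical readings from inside the service RED pointed at.
resources = [
    ResourceUSE("cpu", utilization=0.45, saturation=0, errors=0),
    ResourceUSE("db_pool", utilization=1.0, saturation=37, errors=0),
    ResourceUSE("disk", utilization=0.60, saturation=0, errors=0),
]
print(suspects(resources))  # ['db_pool']
```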

Q12. Two engineers agree in a design review but the design fails in production. What happened, in mental-model terms?

Silent model divergence: they "agreed" while picturing two different systems. The disagreement was real but hidden, surfacing only when code met production. The senior/staff defense is to force models out of heads onto the whiteboard — "draw the failure path," "where does idempotency live?", "what's the L=λW ceiling?" — so divergence is exposed and reconciled when it's cheap, not during an outage.

Q13. The system surprised you. What's the correct response?

A surprise is your model reporting a bug in itself — the most informative event available, because it points exactly where your understanding is wrong. The correct response is curiosity, not annoyance: "my model said X, reality said Y — why?" Then update the model (and the shared artifact if it's a documented model). "That's weird, moving on" wastes the single best learning signal you get.

Q14. How does onboarding relate to mental models?

Onboarding is, almost entirely, transferring the senior engineers' mental models into the new hire's head. The deliverable isn't "read the docs"; it's the context diagram, one real request traced end to end, and the failure table for the top dependencies. A good diagram does in an hour what code-reading does in a week. A new hire is productive precisely when their model becomes accurate enough to predict. Bonus: a new hire's confusion is the cheapest org-wide drift sensor you have, because fresh eyes hit doc-vs-reality gaps first.

Q15. CAP is often called a "mental model." What are its limits, and what do you reach for instead?

CAP is a coarse entry-level frame: under a network partition you choose between consistency and availability. Its limits: it's only about partitions and treats C/A/P as binary. In practice you reason with PACELC (else, when there's no partition, you trade Latency vs Consistency) and with your store's actual documented guarantees (read-your-writes? monotonic reads? bounded staleness?). CAP gets you in the door; precise consistency models make real decisions.

Trap: stating "CAP means pick two." That phrasing is misleading — partition tolerance isn't optional in a distributed system, so the real choice under a partition is C vs A.