Idempotent Operations: Interview
A focused question bank for idempotency in distributed communication: the property, why retries force it on you, HTTP semantics, idempotency keys and dedup stores, natural idempotency, the "effectively-once" identity, concurrency races, and an end-to-end payment/consumer design. Answers are written to be said out loud in an interview.
Table of Contents
- Q1: Define idempotency formally
- Q2: Why does idempotency matter at all?
- Q3: Which HTTP methods are idempotent, and safe?
- Q4: Is POST ever idempotent? How do you make it so?
- Q5: How does an idempotency key + dedup store work?
- Q6: What is natural idempotency? Give examples.
- Q7: Effectively-once vs exactly-once: what is really possible?
- Q8: Business idempotency vs HTTP idempotency
- Q9: How do concurrent duplicates race, and how do you make dedup atomic?
- Q10: How do you handle the in-flight / pending case?
- Q11: TTL / dedup-window trade-offs: how long do you keep keys?
- Q12: Who generates the key, and what should it cover?
- Q13: Idempotency vs commutativity vs associativity
- Q14: Design an idempotent payment endpoint
- Q15: Design an idempotent message consumer (Kafka)
- Q16: Rapid-fire / red flags
Q1: Define idempotency formally
An operation `f` is idempotent if applying it more than once has the same effect on system state as applying it exactly once: `f(f(x)) = f(x)`, and more generally `fⁿ(x) = f(x)` for all `n ≥ 1`. The response may differ (a retry may see "already done"), but the observable state after the second call equals the state after the first.
The distinction that trips people up: idempotency is about state convergence, not about returning the identical bytes. `SET x = 5` is idempotent (state is 5 no matter how many times you run it); `x = x + 5` is not (state grows). `DELETE /user/42` is idempotent (after the first call the user is gone; further calls keep it gone) even though the first returns `200` and the rest return `404`.
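The definition can be spot-checked in a few lines. The sketch below is only an illustration: `set_to_five`, `add_five`, and `is_idempotent_on` are hypothetical names, and checking one sample state is evidence, not a proof.

```python
def set_to_five(state: dict) -> dict:
    """Idempotent: overwrites x with a fixed value."""
    return {**state, "x": 5}


def add_five(state: dict) -> dict:
    """Not idempotent: each call moves x further."""
    return {**state, "x": state["x"] + 5}


def is_idempotent_on(f, state: dict) -> bool:
    """Check f(f(s)) == f(s) for one sample state (a spot check, not a proof)."""
    once = f(state)
    return f(once) == once


start = {"x": 0}
set_is_idempotent = is_idempotent_on(set_to_five, start)
add_is_idempotent = is_idempotent_on(add_five, start)
```

Here `set_is_idempotent` comes out `True` and `add_is_idempotent` comes out `False`, matching the `SET x = 5` versus `x = x + 5` contrast.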
Q2: Why does idempotency matter at all?
Because in any real network you cannot achieve reliable delivery without retries, and retries mean duplicates. A client sends a request; the server processes it; the ACK is lost in the network. The client cannot tell "my request never arrived" from "my request arrived but the reply was lost". This is the Two Generals' Problem in practice. Its only safe move is to retry. So every at-least-once path (HTTP client retry after a timeout or reconnect, message broker redelivery, at-least-once queue) will occasionally deliver the same logical operation twice or more. TCP does not save you here: it discards duplicate segments within one connection, but it cannot tell the application whether a request sent on a connection that then died was processed.
Idempotency is what makes those unavoidable duplicates harmless. If processing a message twice charges a card twice or ships two orders, the system is incorrect under normal, expected failure. If processing is idempotent, duplicates collapse to a single effect and you can retry aggressively and sleep at night.
Q3: Which HTTP methods are idempotent, and safe?
Per the HTTP semantics spec (RFC 9110 §9.2), a method is idempotent if the intended effect of multiple identical requests is the same as a single one, and safe if it is essentially read-only.
| Method | Safe | Idempotent | Note |
|---|---|---|---|
| GET | ✅ | ✅ | read-only; no state change |
| HEAD | ✅ | ✅ | GET without body |
| OPTIONS | ✅ | ✅ | metadata |
| PUT | ❌ | ✅ | full replace → same final state on repeat |
| DELETE | ❌ | ✅ | resource gone after first; repeats keep it gone |
| POST | ❌ | ❌ | "process this" — may create N resources on N calls |
| PATCH | ❌ | ❌ | not guaranteed (e.g. qty += 1); can be made idempotent |
Key nuances an interviewer probes: (1) safe implies idempotent, but not vice versa: PUT/DELETE mutate yet are idempotent. (2) Idempotency is a contract, not something the protocol enforces: a badly written `PUT` handler that does `count++` violates the contract. (3) It's about the effect on the resource, not the status code: DELETE returning `404` on the second call is still idempotent because the resource state is unchanged.
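To make nuance (3) concrete, here is a minimal in-memory sketch. The `users` dict and the `put_user` / `delete_user` handlers are hypothetical stand-ins for a real resource store, and the choice of `201`/`200` for PUT is an assumption for illustration.

```python
users = {42: {"name": "Ada"}}


def put_user(user_id: int, body: dict) -> int:
    """PUT: full replace, so repeating it leaves the same final state."""
    created = user_id not in users
    users[user_id] = dict(body)
    return 201 if created else 200


def delete_user(user_id: int) -> int:
    """DELETE: the first call removes the user, later calls find nothing."""
    if user_id in users:
        del users[user_id]
        return 200
    return 404


first_delete = delete_user(42)
state_after_first = dict(users)
second_delete = delete_user(42)
state_after_second = dict(users)

first_put = put_user(7, {"name": "Grace"})
second_put = put_user(7, {"name": "Grace"})
```

The status codes differ between the first and second call of each method, yet the stored state after the repeat is the same as after the first call, which is all the contract asks for.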
Q4: Is POST ever idempotent? How do you make it so?
By definition POST is not idempotent — its semantics are "the server decides how to process this," and the canonical case is "create a new resource each time," so two POSTs create two resources. But POST is exactly the method you most need to make retry-safe (payments, orders, sign-ups).
You make POST effectively idempotent with an idempotency key: the client attaches a unique key (`Idempotency-Key` header) that identifies the logical operation. The server records processed keys; a retry with the same key returns the original result instead of re-executing. This is how Stripe's API makes retried POST requests safe. The alternative is to redesign the operation as a `PUT` to a client-chosen ID (`PUT /orders/{client_uuid}`), which gets natural idempotency for free, but that only works when the client can own the resource identity.
Q5: How does an idempotency key + dedup store work?
The client generates a unique key per logical operation and sends it (usually a header). The server keeps a dedup store keyed by that value and follows a claim-then-execute protocol:
1. Atomically claim the key (insert `PENDING`, or `SET NX`). If the claim fails, this is a duplicate: go to step 5.
2. Execute the operation (charge, write, publish).
3. Persist the result against the key and mark it `COMPLETED`.
4. Return the result to the caller.
5. On a duplicate: if the stored state is `COMPLETED`, replay the stored response; if it's still `PENDING`, the first request is in flight, so wait/retry or return `409 Conflict`.

The load-bearing detail is step 1: the claim must be a single atomic check-and-insert that happens before the execute+store, otherwise two concurrent duplicates both pass the "have I seen this?" check (see Q9). The dedup store is typically Redis (`SET NX PX`) for speed or a DB table with a `UNIQUE(idempotency_key)` constraint for durability, often both.
Q6: What is natural idempotency? Give examples.
Natural (or intrinsic) idempotency is when the operation is idempotent because of how it's shaped, so you need no separate key or dedup store. You lean on the data model to collapse duplicates for you.
- Upsert / SET semantics: `INSERT ... ON CONFLICT DO UPDATE` or `PUT` with the full desired state. Applying the same desired state twice yields the same row.
- Unique constraint: `UNIQUE(order_ref)` in the DB. The second insert of the same business key fails on the constraint; you catch it and treat it as "already done."
- Absorbing/idempotent updates: `UPDATE ... SET status='shipped' WHERE id=?` (setting, not incrementing); `SADD` to a set; `bitmap.set(userId)`. Re-applying changes nothing.
- Conditional writes / CAS: `UPDATE ... WHERE version = expected` (optimistic concurrency); the second attempt no-ops because the version already moved.
- Content-addressed writes: store under `hash(content)`; the same content maps to the same location, so re-writing is a no-op.

Prefer natural idempotency when you can: it removes an entire moving part (the key store) and its TTL/eviction concerns. Reach for explicit idempotency keys only when the operation has no natural business identity or has external side effects (a real card charge) that a unique constraint alone can't guard.
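Three of these shapes (unique constraint, absorbing update, upsert) can be tried directly in SQLite. A minimal sketch, assuming a hypothetical `orders` table; the function names are illustrative only.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE orders (order_ref TEXT PRIMARY KEY, status TEXT NOT NULL)")


def place_order(order_ref: str) -> bool:
    """Unique constraint: returns False when the order already exists."""
    try:
        with db:
            db.execute(
                "INSERT INTO orders (order_ref, status) VALUES (?, 'placed')",
                (order_ref,),
            )
        return True
    except sqlite3.IntegrityError:
        return False  # constraint hit: treat as "already done"


def mark_shipped(order_ref: str) -> None:
    """Absorbing update: sets a value, so re-applying changes nothing."""
    with db:
        db.execute("UPDATE orders SET status = 'shipped' WHERE order_ref = ?", (order_ref,))


def upsert_order(order_ref: str, status: str) -> None:
    """Upsert: the same desired state always yields the same row."""
    with db:
        db.execute(
            "INSERT INTO orders (order_ref, status) VALUES (?, ?) "
            "ON CONFLICT(order_ref) DO UPDATE SET status = excluded.status",
            (order_ref, status),
        )


placed_first = place_order("ord-1")
placed_again = place_order("ord-1")
mark_shipped("ord-1")
mark_shipped("ord-1")
upsert_order("ord-2", "placed")
upsert_order("ord-2", "placed")
rows = db.execute("SELECT order_ref, status FROM orders ORDER BY order_ref").fetchall()
```

Every operation ran twice, yet the table ends with exactly two rows, one per business key.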
Q7: Effectively-once vs exactly-once: what is really possible?
Exactly-once delivery is impossible over an unreliable network. The sender can never distinguish a lost message from a lost ACK, so it must either risk losing the message (at-most-once) or risk sending it again (at-least-once). There is no third channel-level option — this is the two-generals result.
What you can achieve is effectively-once (a.k.a. exactly-once processing / semantics): effectively-once = at-least-once delivery + idempotent processing.
You accept that the network delivers duplicates, then you make the effect singular on the consumer side using a dedup key or natural idempotency. So-called "exactly-once" features (Kafka transactions + idempotent producer, Flink checkpointed sinks) don't repeal physics. They implement this same recipe: dedup by producer sequence number and atomically commit offsets with output. The interview-winning framing: don't chase exactly-once delivery; engineer at-least-once + idempotent processing.
| Guarantee | What it means | Failure behavior | Requires |
|---|---|---|---|
| At-most-once | deliver ≤ 1 time | may lose messages | fire-and-forget; ACK before process |
| At-least-once | deliver ≥ 1 time | may duplicate | retry until ACKed; process before commit |
| Exactly-once delivery | deliver == 1 | — | impossible over unreliable network |
| Effectively-once | effect applied once | correct | at-least-once + idempotent consumer |
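The last row of the table can be simulated in a few lines. This is a toy sketch: `deliver_at_least_once` fakes a lost ACK by yielding a message twice, and the `Account` class with its in-memory `seen` set is a hypothetical idempotent consumer.

```python
def deliver_at_least_once(messages, ack_lost_on):
    """Yield each message; resend those whose ACK was 'lost' (simulated)."""
    for msg in messages:
        yield msg
        if msg["id"] in ack_lost_on:
            yield msg  # sender saw no ACK, so it retries


class Account:
    """Idempotent consumer: remembers message ids it has already applied."""

    def __init__(self):
        self.balance = 0
        self.seen = set()

    def apply(self, msg) -> None:
        if msg["id"] in self.seen:
            return  # duplicate: the effect was already applied
        self.seen.add(msg["id"])
        self.balance += msg["amount"]


messages = [{"id": "m1", "amount": 10}, {"id": "m2", "amount": 5}]
naive_total = 0
account = Account()
for msg in deliver_at_least_once(messages, ack_lost_on={"m1"}):
    naive_total += msg["amount"]  # non-idempotent processing double counts
    account.apply(msg)
```

With one lost ACK the naive consumer ends at 25 while the idempotent one ends at the correct 15, even though both saw the same three deliveries.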
Q8: Business idempotency vs HTTP idempotency
HTTP idempotency is a protocol-level property of a method on a resource over a single request. Business idempotency is about the domain effect: "this customer's order #123 is placed at most once," regardless of transport, retries, or how many services touch it.
They diverge in practice:
- A method can be HTTP-idempotent yet business-wrong: two `PUT`s with different bodies to the same URL each "succeed" per HTTP, but semantically last-writer-wins may clobber a legitimate concurrent change.
- A method can be non-idempotent in HTTP (`POST /charges`) yet business-idempotent because you added an idempotency key mapping to a business operation.
- Business idempotency often spans multiple services and a whole workflow (order → payment → fulfillment), which no single HTTP method can express. You enforce it with a stable business key (order reference, request id) carried end-to-end and checked at each side-effecting step.
Rule of thumb: HTTP idempotency is necessary hygiene for retry-safe APIs; business idempotency is the actual correctness requirement, and it's usually enforced with a unique business key + dedup, not by the HTTP verb alone.
Q9: How do concurrent duplicates race, and how do you make dedup atomic?
The classic bug is a check-then-act (TOCTOU) race. Two duplicate requests with the same key arrive nearly simultaneously:
```
Req A: SELECT key → not found   ┐ both read "not found"
Req B: SELECT key → not found   ┘
Req A: charge card; INSERT key
Req B: charge card; INSERT key  → DOUBLE CHARGE
```

The read and the write aren't atomic, so both pass the guard. Fixes, in order of preference:
- Atomic claim primitive: `INSERT ... ON CONFLICT DO NOTHING` / rely on `UNIQUE(key)`, or Redis `SET key val NX PX ttl`. Exactly one caller wins the insert; the loser gets a conflict and takes the duplicate path. The DB/Redis does the mutual exclusion.
- Serialize on the key: a row lock (`SELECT ... FOR UPDATE` on the key row) or a distributed lock per key, so duplicates queue instead of interleave.
- Transactionally couple claim + effect: put the `INSERT key` and the state change in one DB transaction so either both land or neither does. If the effect is external (a real charge), you can't put it in the DB transaction, so use claim-first: insert `PENDING` before calling the payment gateway, and reconcile if you crash between.

The unifying rule: let the datastore's atomic operation be the arbiter, never a read-then-write in application code.
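The effect of an atomic claim under contention can be sketched with threads. In this illustration the hypothetical `DedupStore.claim` stands in for Redis `SET NX` or a `UNIQUE` insert: its internal lock plays the role of the datastore's own atomicity, and callers only ever see a single check-and-set call.

```python
import threading


class DedupStore:
    """Stand-in for Redis SET NX or a UNIQUE insert: one atomic check-and-set."""

    def __init__(self):
        self._keys = {}
        self._lock = threading.Lock()

    def claim(self, key: str) -> bool:
        with self._lock:  # check and write happen as one indivisible step
            if key in self._keys:
                return False
            self._keys[key] = "PENDING"
            return True


store = DedupStore()
charges = []
charges_lock = threading.Lock()
barrier = threading.Barrier(8)


def handle(key: str, amount: int) -> None:
    barrier.wait()  # release all duplicates at once to maximize contention
    if store.claim(key):
        with charges_lock:
            charges.append(amount)  # only the claim winner reaches the effect


threads = [threading.Thread(target=handle, args=("key-1", 500)) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Eight concurrent duplicates go in and exactly one charge comes out, because the winner is decided inside `claim` and not by a separate read followed by a write.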
Q10: How do you handle the in-flight / pending case?
This is the subtle part beyond "have I seen this key." When a duplicate arrives while the first request is still executing (status `PENDING`), you must not re-execute and must not return a wrong answer. Options:
- Return `409 Conflict` with a `Retry-After`: tell the client "your operation is being processed, try again shortly." Simple and honest. (Some APIs answer `425 Too Early` instead, but that status was defined for TLS early-data replay, so `409` is the safer default.)
- Block on the lock: the duplicate waits on the same per-key lock and, once the first completes, reads and replays the stored result. Cleaner UX, but ties up a request slot.
- Fingerprint the request body: store a hash of the original payload with the key. If a retry arrives with the same key but a different body, that's a client bug: reject with `422`. This prevents a reused key from silently returning the wrong resource.

You also need a crash-recovery / lease story: if the process holding a `PENDING` claim dies, the key must not stay poisoned forever. Give `PENDING` a lease/TTL and a reconciliation job that checks the downstream (e.g. queries the payment gateway by the idempotency key) to decide whether the effect actually happened before releasing or completing the key.
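The duplicate-handling decisions above fit in one pure function. A sketch under assumptions: the `Record` shape, the 30-second `LEASE_SECONDS`, and the `Retry-After` values are hypothetical, and the expired-lease branch only signals that a reconciler should take over.

```python
from dataclasses import dataclass
from typing import Optional

LEASE_SECONDS = 30  # assumed lease length for a PENDING claim


@dataclass
class Record:
    status: str  # "PENDING" or "COMPLETED"
    payload_hash: str
    claimed_at: float
    response: Optional[dict] = None


def on_duplicate(record: Record, payload_hash: str, now: float):
    """Return (status_code, body, headers) for a request whose key already exists."""
    if record.payload_hash != payload_hash:
        return 422, {"error": "idempotency key reused with a different body"}, {}
    if record.status == "COMPLETED":
        return 200, record.response, {}
    if now - record.claimed_at > LEASE_SECONDS:
        # Lease expired: the original worker probably died. Do not re-execute
        # here; a reconciler must check the downstream system first.
        return 409, {"error": "reconciliation pending"}, {"Retry-After": "5"}
    return 409, {"error": "request in flight"}, {"Retry-After": "1"}


pending = Record(status="PENDING", payload_hash="h1", claimed_at=100.0)
done = Record(status="COMPLETED", payload_hash="h1", claimed_at=100.0, response={"id": "ch_1"})
```

Keeping the decision free of I/O makes each branch (mismatch, replay, in flight, stale lease) easy to test in isolation.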
Q11: TTL / dedup-window trade-offs: how long do you keep keys?
Keys can't live forever — that's an unbounded, ever-growing store. The dedup window is a classic durability-vs-cost trade-off:
| Window too short | Window too long |
|---|---|
| A late retry (client backoff, offline mobile, DLQ replay) arrives after the key expired → treated as new → duplicate effect | Store grows large; higher memory/storage and lookup cost |
| Cheap, small store | Retains PII/business data longer than needed |

Sizing rules:
- The window must exceed your maximum realistic retry horizon: client retry budget + broker redelivery window + max time a message can sit in a queue/DLQ before replay. For a payment API, 24h to 72h is common; for a message consumer, at least the broker's retention + redelivery ceiling.
- Prefer a bounded store: Redis with per-key `PX` TTL, or a DB table with a TTL/partition drop job. Cassandra TTL columns work well for high-volume dedup logs.
- Trade the window against the dedup granularity: for exactly-once stream processing you only need to dedup within the replay window (offset range that can be re-consumed), which is often far smaller than a business-level 72h.

A pragmatic answer: pick the window from the slowest legitimate duplicate you must absorb, add margin, and put a hard TTL so the store stays bounded.
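The sizing rule and the failure mode of a short window can both be shown with a toy store. Everything here is an assumption for illustration: the input durations, the 1.5 margin, and the `TtlKeyStore` class, which takes the current time as an argument so its behavior is deterministic.

```python
from datetime import timedelta


def dedup_window(client_retry_budget, broker_redelivery, max_queue_dwell, margin=1.5):
    """Window = maximum realistic retry horizon, scaled by a safety margin."""
    horizon = client_retry_budget + broker_redelivery + max_queue_dwell
    return horizon * margin


window = dedup_window(
    client_retry_budget=timedelta(hours=24),
    broker_redelivery=timedelta(hours=4),
    max_queue_dwell=timedelta(hours=12),
)


class TtlKeyStore:
    """Bounded dedup store: each key expires ttl_seconds after it is claimed."""

    def __init__(self, ttl_seconds: float):
        self.ttl = ttl_seconds
        self._expiry = {}

    def claim(self, key: str, now: float) -> bool:
        # Drop expired keys first so the store stays bounded.
        for k in [k for k, exp in self._expiry.items() if exp <= now]:
            del self._expiry[k]
        if key in self._expiry:
            return False
        self._expiry[key] = now + self.ttl
        return True


store = TtlKeyStore(ttl_seconds=60)
first = store.claim("key-1", now=0)
retry_in_window = store.claim("key-1", now=30)
retry_after_expiry = store.claim("key-1", now=61)
```

The retry at 30 seconds is rejected as a duplicate, while the one at 61 seconds is accepted as new. That second result is the "window too short" hazard: a late retry would execute again.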
Q12: Who generates the key, and what should it cover?
The client generates the idempotency key — because only the client knows that "this retry is the same logical operation as the previous attempt." If the server generated it, every retry would get a fresh key and dedup would be impossible. A UUIDv4 (or ULID) is typical; it must be globally unique and stable across retries of the same intent, and new for a genuinely new operation.
What the key should be scoped to:
- Per logical operation, per actor: scope keys to the authenticated principal so one tenant can't collide with (or probe) another's keys.
- Bound to the request content: store a hash of the payload alongside the key so a reused key with a different body is caught (Q10), preventing a stale key from returning a mismatched result.
- For server-to-server / event pipelines, the key is often a natural business identity (order id, event id, `producer_id + sequence_number`) rather than a random UUID, which also gives you natural idempotency at the sink.
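A small sketch of key generation, scoping, and fingerprinting. The helper names are hypothetical, and hashing a sorted-key JSON encoding is one possible canonical form chosen here for illustration.

```python
import hashlib
import json
import uuid


def new_idempotency_key() -> str:
    """Client side: one fresh key per logical operation, reused on every retry."""
    return str(uuid.uuid4())


def payload_fingerprint(payload: dict) -> str:
    """Hash a canonical encoding so field order does not change the fingerprint."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def storage_key(principal_id: str, idempotency_key: str) -> tuple:
    """Server side: scope the key to the authenticated principal."""
    return (principal_id, idempotency_key)


def event_key(producer_id: str, sequence_number: int) -> tuple:
    """Pipelines: a natural identity instead of a random UUID."""
    return (producer_id, sequence_number)


key = new_idempotency_key()
same_body = payload_fingerprint({"amount": 500, "currency": "usd"})
reordered = payload_fingerprint({"currency": "usd", "amount": 500})
other_body = payload_fingerprint({"amount": 700, "currency": "usd"})
```

`same_body` and `reordered` match while `other_body` differs, and the same raw key stored for two principals yields two distinct storage keys.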
Q13: Idempotency vs commutativity vs associativity
Interviewers use this to test depth. These are related algebraic properties often needed together in distributed systems (they're the backbone of CRDTs):
- Idempotent: `f(f(x)) = f(x)`. Re-applying the same operation doesn't change state. Defends against duplicates.
- Commutative: `a ∘ b = b ∘ a`. Order of two different operations doesn't matter. Defends against reordering (messages arriving out of order).
- Associative: `(a ∘ b) ∘ c = a ∘ (b ∘ c)`. Grouping doesn't matter; lets you merge in any batching.

A merge operation that is idempotent + commutative + associative (a semilattice join) gives you eventual consistency regardless of duplication, reordering, or batching, which is exactly why G-Set / OR-Set / LWW CRDTs rely on all three. Idempotency alone buys you duplicate-safety; you often need commutativity too when the transport can also reorder.
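Set union, the merge of a G-Set, is the simplest operation with all three properties. The sketch below checks them on sample values and then replays the updates in every order with one duplicate; the replica contents are arbitrary illustrative data.

```python
from itertools import permutations


def merge(a: frozenset, b: frozenset) -> frozenset:
    """G-Set merge: set union, a semilattice join."""
    return a | b


r1 = frozenset({"x"})
r2 = frozenset({"y"})
r3 = frozenset({"x", "z"})

idempotent = merge(r1, r1) == r1
commutative = merge(r1, r2) == merge(r2, r1)
associative = merge(merge(r1, r2), r3) == merge(r1, merge(r2, r3))


def replay(updates) -> frozenset:
    """Fold a sequence of updates into one state."""
    state = frozenset()
    for update in updates:
        state = merge(state, update)
    return state


# Every ordering, each with its first update delivered a second time at the end.
results = {replay(order + order[:1]) for order in permutations([r1, r2, r3])}
```

All six orderings, duplicates included, converge to the single state `{"x", "y", "z"}`.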
Q14: Design an idempotent payment endpoint
Endpoint: `POST /v1/charges` with header `Idempotency-Key: <client-uuid>` and body `{ amount, currency, source, customer_id }`.

Flow:
1. Validate & fingerprint: reject if key missing; compute `payload_hash`.
2. Atomic claim: `INSERT INTO idempotency (key, scope, payload_hash, status) VALUES (?, customer_id, ?, 'PENDING') ON CONFLICT DO NOTHING`. Scope key by `customer_id`.
3. If claim lost (row already exists):
   - `payload_hash` differs → `422` (key reuse with different body).
   - status `COMPLETED` → replay the stored response (same status code + body).
   - status `PENDING` → `409` + `Retry-After` (first attempt in flight).
4. If claim won: call the payment gateway, passing the same idempotency key downstream so the gateway itself dedups (defense in depth). Persist the gateway result against the key and flip status to `COMPLETED`, ideally in one transaction with the ledger write.
5. Crash between charge and store: a `PENDING` lease expires; a reconciler queries the gateway by the idempotency key to learn whether the charge happened, then completes or releases the record. Never re-charge blindly.

Why it's correct: the DB `UNIQUE(key)` constraint is the atomic arbiter (Q9); the same key is propagated to the external system so the whole chain is idempotent; the ledger uses a unique `charge_id` for natural idempotency; the dedup window (say 72h) exceeds the client's retry horizon (Q11).
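The flow, including the crash-and-reconcile path, can be sketched end to end with SQLite and a fake gateway. This is a single-process illustration, not a real integration: `FakeGateway`, the table layouts, the `idem_key` column, and the `crash_before_store` flag are hypothetical, and lease timing is left out.

```python
import sqlite3


class FakeGateway:
    """Hypothetical gateway that dedups by the idempotency key it receives."""

    def __init__(self):
        self.charges = {}

    def charge(self, key: str, amount: int) -> dict:
        new = {"charge_id": f"ch_{len(self.charges) + 1}", "amount": amount}
        return self.charges.setdefault(key, new)  # same key returns the first charge

    def find(self, key: str):
        return self.charges.get(key)


db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE idempotency (scope TEXT, idem_key TEXT, payload_hash TEXT,"
    " status TEXT, charge_id TEXT, PRIMARY KEY (scope, idem_key))"
)
db.execute("CREATE TABLE ledger (charge_id TEXT PRIMARY KEY, customer_id TEXT, amount INTEGER)")
gateway = FakeGateway()


def complete(customer_id: str, key: str, charge: dict) -> None:
    with db:  # ledger write and status flip commit together
        db.execute(
            "INSERT OR IGNORE INTO ledger VALUES (?, ?, ?)",
            (charge["charge_id"], customer_id, charge["amount"]),
        )
        db.execute(
            "UPDATE idempotency SET status = 'COMPLETED', charge_id = ?"
            " WHERE scope = ? AND idem_key = ?",
            (charge["charge_id"], customer_id, key),
        )


def post_charge(customer_id, key, payload_hash, amount, crash_before_store=False):
    with db:  # step 2: atomic claim via the primary key
        won = db.execute(
            "INSERT OR IGNORE INTO idempotency (scope, idem_key, payload_hash, status)"
            " VALUES (?, ?, ?, 'PENDING')",
            (customer_id, key, payload_hash),
        ).rowcount == 1
    if not won:  # step 3: claim lost
        stored_hash, status, charge_id = db.execute(
            "SELECT payload_hash, status, charge_id FROM idempotency"
            " WHERE scope = ? AND idem_key = ?",
            (customer_id, key),
        ).fetchone()
        if stored_hash != payload_hash:
            return 422, None
        return (200, charge_id) if status == "COMPLETED" else (409, None)
    charge = gateway.charge(key, amount)  # step 4: same key goes downstream
    if crash_before_store:
        raise RuntimeError("simulated crash after charge, before store")
    complete(customer_id, key, charge)
    return 200, charge["charge_id"]


def reconcile(customer_id: str, key: str) -> None:
    """Step 5: for a PENDING row with an expired lease, ask the gateway what happened."""
    charge = gateway.find(key)
    if charge is not None:
        complete(customer_id, key, charge)
    else:
        with db:  # nothing was charged: release the key
            db.execute(
                "DELETE FROM idempotency WHERE scope = ? AND idem_key = ? AND status = 'PENDING'",
                (customer_id, key),
            )


first = post_charge("cust_1", "key-1", "hash-a", 500)
retry = post_charge("cust_1", "key-1", "hash-a", 500)
mismatch = post_charge("cust_1", "key-1", "hash-b", 500)
try:
    post_charge("cust_1", "key-2", "hash-c", 700, crash_before_store=True)
except RuntimeError:
    pass
in_flight = post_charge("cust_1", "key-2", "hash-c", 700)
reconcile("cust_1", "key-2")
recovered = post_charge("cust_1", "key-2", "hash-c", 700)
ledger_rows = db.execute("SELECT COUNT(*) FROM ledger").fetchone()[0]
```

After the simulated crash the retry sees `409`, the reconciler learns from the gateway that the charge did happen, and the next retry replays it. The gateway holds two charges and the ledger two rows, one per logical operation.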
Q15: Design an idempotent message consumer (Kafka)
Setup: at-least-once delivery (consumer commits offsets after processing), so on rebalance/crash a batch can be redelivered. Goal = effectively-once processing (Q7).
Strategies (pick per side-effect):
- Natural idempotency at the sink: if the output is an upsert keyed by a business id (`INSERT ... ON CONFLICT DO UPDATE` on `event_id`), duplicates just re-write the same row. This is the cheapest and most robust; prefer it.
- Dedup table: maintain `processed(event_id PRIMARY KEY, ...)`. Wrap "insert into processed" and "apply the effect" in one DB transaction; if the insert conflicts, the event was already handled → skip. The transaction couples the dedup marker to the effect so you can't mark-then-crash-before-effect or vice versa.
- Kafka transactions / EOS: idempotent producer (`enable.idempotence=true`) dedups by `producer_id + sequence` on the broker; transactional `sendOffsetsToTransaction` atomically commits output records + consumer offsets. This is the same at-least-once + dedup recipe, just built into the platform, and only works within Kafka (read-process-write to Kafka).

Key details: use a stable event id carried in the message (or `topic-partition-offset` as a last resort); bound the dedup table with a TTL matching the retention/replay window (Q11); make handlers side-effect-idempotent for external calls (forward an idempotency key to downstream services). The interview trap to avoid: committing the offset before processing (that's at-most-once and drops messages on crash). Always process, then commit.
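The dedup-table strategy can be sketched without a broker by replaying a batch against SQLite. The `processed` and `balances` tables, the event shape, and `handle_event` are hypothetical; a real consumer would call the handler from its poll loop and commit offsets afterwards.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE processed (event_id TEXT PRIMARY KEY)")
db.execute("CREATE TABLE balances (account TEXT PRIMARY KEY, amount INTEGER NOT NULL)")
db.execute("INSERT INTO balances VALUES ('acct-1', 0)")
db.commit()


def handle_event(event: dict) -> bool:
    """Apply the event once; return False when it was already processed."""
    try:
        with db:  # one transaction: marker and effect commit or roll back together
            db.execute("INSERT INTO processed (event_id) VALUES (?)", (event["event_id"],))
            db.execute(
                "UPDATE balances SET amount = amount + ? WHERE account = ?",
                (event["delta"], event["account"]),
            )
        return True
    except sqlite3.IntegrityError:
        return False  # primary key conflict: duplicate delivery, skip


batch = [
    {"event_id": "e1", "account": "acct-1", "delta": 10},
    {"event_id": "e2", "account": "acct-1", "delta": 5},
]
applied = [handle_event(e) for e in batch]
# Simulate a crash before the offset commit: the broker redelivers the batch.
redelivered = [handle_event(e) for e in batch]
balance = db.execute("SELECT amount FROM balances WHERE account = 'acct-1'").fetchone()[0]
```

The redelivered batch is skipped event by event and the balance stays at 15, even though the increment itself (`amount + ?`) is not idempotent.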
Q16: Rapid-fire / red flags
- "We'll use exactly-once delivery." Red flag: it doesn't exist over an unreliable network. Say effectively-once = at-least-once + idempotent processing.
- "POST is idempotent." No. POST is neither safe nor idempotent by default; you add an idempotency key.
- "We check if the key exists, then insert it." Read-then-write TOCTOU race → double effect under concurrency. Use an atomic claim (`SET NX` / `UNIQUE`).
- "Server generates the idempotency key." Then retries get new keys and dedup can't work; the client must own the key.
- "Idempotency = returning the same response." No. It's the same state; the response for a duplicate can legitimately differ (`409`, replayed result).
- "Keys live forever." Unbounded store; set a TTL sized to the max retry/replay window.
- "Commit the Kafka offset first for speed." That's at-most-once; you'll drop messages on crash. Process, then commit.
- "DELETE returning 404 on the second call means it's not idempotent." Wrong. State is unchanged; idempotency is about effect, not status code.