Cheat Sheets — Last-Mile Quick Reference
Dense, scannable reference tables for the night-before review. Read top to bottom in 30–45 minutes. Everything here is approximate by design — interviewers want orders of magnitude and correct reasoning, not memorized decimals.
1. Latency Numbers Every Engineer Should Know
Round figures (modern server, ~2020s hardware). Memorize the ratios, not the digits.
| Operation | Latency | Notes / mental model |
| L1 cache reference | ~1 ns | baseline |
| Branch mispredict | ~3 ns | |
| L2 cache reference | ~4 ns | ~4× L1 |
| Mutex lock/unlock (uncontended) | ~17 ns | |
| L3 cache reference | ~10–20 ns | |
| Main memory (RAM) reference | ~100 ns | ~100× L1 |
| Compress 1 KB (e.g. snappy) | ~2 µs | |
| Read 1 MB sequentially from RAM | ~3 µs | |
| Send 1 KB over 10 Gbps network | ~1 µs | |
| SSD random read (NVMe) | ~16–100 µs | ~1000× RAM |
| Read 1 MB sequentially from SSD | ~50–200 µs | |
| Round trip within same datacenter | ~0.5 ms | 500 µs |
| Read 1 MB sequentially from disk (HDD) | ~5–20 ms | |
| HDD seek | ~10 ms | |
| Round trip CA → Netherlands → CA | ~150 ms | speed of light dominates |
| TLS handshake (new connection, cross-region) | ~250–500 ms | reuse connections! |
Key ratios: RAM is ~100× L1. SSD is ~1000× RAM. Network DC round-trip is ~5× SSD read. Cross-continent is ~300× DC round-trip. Disk seek ≈ cross-continent round-trip.
2. Big-O Complexity — Data Structures & Algorithms
Data structure operations (average / worst)
| Structure | Access | Search | Insert | Delete | Space |
| Array / slice (index) | O(1) | O(n) | O(n) | O(n) | O(n) |
| Slice append (amortized) | — | — | O(1)* | — | O(n) |
| Hash map | — | O(1)/O(n) | O(1)/O(n) | O(1)/O(n) | O(n) |
| Linked list | O(n) | O(n) | O(1)† | O(1)† | O(n) |
| Binary search tree | O(log n)/O(n) | O(log n)/O(n) | O(log n)/O(n) | O(log n)/O(n) | O(n) |
| Balanced BST (RB/AVL) | O(log n) | O(log n) | O(log n) | O(log n) | O(n) |
| Heap (binary) | — | O(n) | O(log n) | O(log n) | O(n) |
| Skip list | O(log n) | O(log n) | O(log n) | O(log n) | O(n) |
| B-tree / B+tree | O(log n) | O(log n) | O(log n) | O(log n) | O(n) |
| Trie | O(k) | O(k) | O(k) | O(k) | O(n·k) |
* amortized; a single append that triggers regrow is O(n). † given the node pointer.
Sorting
| Algorithm | Best | Average | Worst | Space | Stable |
| Quicksort | O(n log n) | O(n log n) | O(n²) | O(log n) | No |
| Mergesort | O(n log n) | O(n log n) | O(n log n) | O(n) | Yes |
| Heapsort | O(n log n) | O(n log n) | O(n log n) | O(1) | No |
| Insertion | O(n) | O(n²) | O(n²) | O(1) | Yes |
| Counting | O(n+k) | O(n+k) | O(n+k) | O(n+k) | Yes |
| Radix | O(nk) | O(nk) | O(nk) | O(n+k) | Yes |
Go standard library specifics
| Operation | Complexity | Notes |
sort.Slice / sort.Sort | O(n log n) | pattern-defeating quicksort (pdqsort), not stable |
sort.Stable | O(n log n) | stable merge-based |
slices.Sort (1.21+) | O(n log n) | pdqsort, generic |
sort.Search (binary search) | O(log n) | on sorted data |
| map read/write/delete | O(1) avg | hashed; iteration order randomized |
slices.Contains | O(n) | linear scan |
append (amortized) | O(1) | doubles capacity (then ~1.25× for large slices) |
copy | O(n) | |
len / cap | O(1) | |
string concat in loop with + | O(n²) | use strings.Builder → O(n) |
3. Capacity Estimation Cheatsheet
Time constants
| Period | Seconds | Round number |
| 1 day | 86,400 | ~10⁵ (~100k) |
| 1 month | ~2.6M | ~2.5 × 10⁶ |
| 1 year | ~31.5M | π × 10⁷ (handy trick) |
QPS math
- Avg QPS = daily requests / 86,400. So 1M req/day ≈ 12 QPS. 1B req/day ≈ 12k QPS.
- Peak QPS ≈ 2–3× average (rule of thumb). Some bursty systems: 10×.
- DAU × actions/user/day → daily requests. e.g. 10M DAU × 20 actions = 200M req/day ≈ 2.3k avg QPS, ~5–7k peak.
Powers of two ↔ storage
| Power | Exact value | Approx | Name |
| 2¹⁰ | 1,024 | ~10³ | KB |
| 2²⁰ | 1,048,576 | ~10⁶ | MB |
| 2³⁰ | ~1.07 × 10⁹ | ~10⁹ | GB |
| 2⁴⁰ | ~1.10 × 10¹² | ~10¹² | TB |
| 2⁵⁰ | ~1.13 × 10¹⁵ | ~10¹⁵ | PB |
Typical record sizes (estimation anchors)
| Item | Size estimate |
| char (ASCII) / byte | 1 B |
| UUID | 16 B (raw) / 36 B (text) |
| int64 / timestamp / pointer | 8 B |
| Short metadata row (id, fks, status, ts) | ~100–200 B |
| Typical DB row (e-commerce order) | ~1 KB |
| User profile JSON | ~1–5 KB |
| Thumbnail image | ~10–50 KB |
| Web page (HTML) | ~100 KB |
| Photo (compressed) | ~1–5 MB |
| Minute of 1080p video | ~10–50 MB |
Worked example: 1B orders/year × 1 KB = 1 TB/year raw. With indexes + replication (×3) + overhead → budget ~5 TB/year. Storage is cheap; design for retention/archival.
4. PostgreSQL Isolation Levels vs Anomalies
✗ = prevented, ✓ = possible. PostgreSQL is stricter than the SQL standard (no dirty reads even at Read Committed; "Repeatable Read" is actually snapshot isolation).
| Level (PostgreSQL) | Dirty read | Non-repeatable read | Phantom read | Serialization anomaly |
| Read Uncommitted* | ✗ | ✓ | ✓ | ✓ |
| Read Committed (default) | ✗ | ✓ | ✓ | ✓ |
| Repeatable Read (snapshot) | ✗ | ✗ | ✗ | ✓ |
| Serializable (SSI) | ✗ | ✗ | ✗ | ✗ |
* PostgreSQL treats Read Uncommitted as Read Committed — dirty reads never occur.
Anomaly definitions: - Dirty read — read uncommitted data from another txn. - Non-repeatable read — re-read a row, value changed (committed update). - Phantom read — re-run a query, new rows appear (committed insert). - Serialization anomaly — concurrent commits produce a result no serial order could.
Senior notes: Repeatable Read & Serializable can throw 40001 serialization_failure → app must retry. Serializable uses SSI (predicate locks), not blocking, so it scales surprisingly well for OLTP but raises abort rate under contention. Read Committed re-reads the latest snapshot per statement — beware lost updates; use SELECT ... FOR UPDATE or INSERT ... ON CONFLICT.
5. Kafka Quick Reference
Producer acks
| acks | Durability | Latency | Risk |
0 | none (fire and forget) | lowest | silent data loss |
1 | leader only | medium | loss if leader dies pre-replication |
all/-1 | all in-sync replicas (ISR) | highest | none, if configured with min ISR ≥ 2 |
Durability config
- ISR (In-Sync Replicas): replicas caught up with the leader. Leader + ISR define the durable set.
min.insync.replicas: with acks=all, write fails if ISR count drops below this. Set to 2 with RF=3 → survive 1 broker loss, still reject writes when too few replicas (no silent under-replication). - Golden durability combo:
replication.factor=3, min.insync.replicas=2, acks=all, unclean.leader.election.enable=false.
Delivery semantics
| Semantic | How |
| At-most-once | commit offset before processing; no producer retries |
| At-least-once | commit after processing + producer retries (default, dups possible) |
| Exactly-once | idempotent producer (enable.idempotence=true) + transactions, or idempotent consumers (dedup by key) |
Partitions & consumer groups
- More partitions = more parallelism, but more open files, more rebalance cost, higher end-to-end latency, more controller load. Tune up when consumer lag grows and consumers are CPU/IO-bound — not preemptively.
- Max useful consumers in a group = partition count. Extra consumers sit idle.
- Ordering is per-partition only. Need ordering? Key by entity id (e.g.
order_id) so all events for one entity land on one partition. - A partition is consumed by exactly one consumer in a group at a time.
- Rebalances pause consumption — prefer cooperative/incremental rebalancing (
CooperativeStickyAssignor).
6. HTTP Status Codes That Matter
| Code | Name | Use when |
| 200 | OK | Successful GET/PUT/PATCH with body |
| 201 | Created | POST created a resource (return Location header) |
| 202 | Accepted | Async accepted, not yet processed (queued) |
| 204 | No Content | Success, no body (DELETE, idempotent PUT) |
| 301/308 | Moved Permanently | Permanent redirect (308 preserves method/body) |
| 302/307 | Found / Temp Redirect | Temporary redirect (307 preserves method/body) |
| 304 | Not Modified | Conditional GET, ETag/If-None-Match matched |
| 400 | Bad Request | Malformed syntax / unparseable body |
| 401 | Unauthorized | Missing/invalid auth (really "unauthenticated") |
| 403 | Forbidden | Authenticated but not permitted |
| 404 | Not Found | Resource absent (or hide existence from unauthorized) |
| 405 | Method Not Allowed | Wrong verb on a valid resource |
| 409 | Conflict | State conflict: duplicate, version mismatch, optimistic-lock fail |
| 410 | Gone | Resource permanently removed |
| 422 | Unprocessable Entity | Syntactically valid but semantically invalid (validation) |
| 429 | Too Many Requests | Rate limited (return Retry-After) |
| 500 | Internal Server Error | Unhandled server fault (don't leak internals) |
| 502 | Bad Gateway | Upstream returned invalid response |
| 503 | Service Unavailable | Overloaded / down / draining (return Retry-After) |
| 504 | Gateway Timeout | Upstream didn't respond in time |
409 vs 422: 409 = conflict with current resource state (versioning, uniqueness). 422 = the payload itself fails business validation. 429 vs 503: 429 = this client exceeded its quota. 503 = the server can't serve anyone right now.
7. HTTP Method Semantics
| Method | Safe | Idempotent | Cacheable | Has body | Typical use |
| GET | Yes | Yes | Yes | No | Read a resource |
| HEAD | Yes | Yes | Yes | No | Headers only |
| OPTIONS | Yes | Yes | No | No | Capabilities / CORS preflight |
| POST | No | No | Rarely | Yes | Create / non-idempotent action |
| PUT | No | Yes | No | Yes | Full replace / upsert at known URI |
| PATCH | No | No* | No | Yes | Partial update |
| DELETE | No | Yes | No | Maybe | Remove resource |
- Safe = no server-state change. Idempotent = N identical calls ≡ 1 call's effect.
- * PATCH is not guaranteed idempotent, but can be designed to be (e.g. set-absolute-value semantics).
- Senior tip: Make POST idempotent with an idempotency key header when retries are possible (payments, order creation).
8. Go Concurrency Quick Patterns
// Worker pool
jobs := make(chan Job); results := make(chan Result)
for i := 0; i < numWorkers; i++ {
go func() { for j := range jobs { results <- process(j) } }()
}
// Fan-out / fan-in
// fan-out: launch N goroutines reading from one input chan
// fan-in: merge N output chans into one via a sync.WaitGroup + single out chan
// errgroup — bounded, first-error-cancels-context
g, ctx := errgroup.WithContext(ctx)
g.SetLimit(8) // bound concurrency (1.23+)
for _, item := range items {
item := item
g.Go(func() error { return work(ctx, item) })
}
err := g.Wait() // returns first non-nil error
// Semaphore (bound concurrency without a pool)
sem := make(chan struct{}, maxConcurrent)
sem <- struct{}{} // acquire
go func() { defer func() { <-sem }(); work() }()
// select with timeout
select {
case res := <-ch: use(res)
case <-time.After(2*time.Second): return ErrTimeout
case <-ctx.Done(): return ctx.Err()
}
// Pipeline cancellation: always pass ctx; check ctx.Done() in long loops.
Rules: the goroutine that creates a channel owns closing it; never close from the receiver side; for range ch exits when the channel is closed; a nil channel blocks forever (useful to disable a select case).
Allocation reduction
- Preallocate:
make([]T, 0, n), make(map[K]V, n). - Reuse buffers with
sync.Pool (for short-lived, frequently-allocated objects). strings.Builder instead of + in loops; bytes.Buffer for []byte. - Pass small structs by value; avoid pointers that force heap escape.
- Avoid
interface{}/boxing in hot paths; watch escape analysis: go build -gcflags=-m. - Prefer slices of structs over slices of pointers (better cache locality, fewer allocs).
pprof commands
go test -bench=. -benchmem -cpuprofile cpu.out -memprofile mem.out
go tool pprof cpu.out # interactive; top, list <fn>, web
go tool pprof -http=:8080 cpu.out
# live server (import _ "net/http/pprof")
go tool pprof http://localhost:6060/debug/pprof/profile?seconds=30 # CPU
go tool pprof http://localhost:6060/debug/pprof/heap # heap
go tool pprof http://localhost:6060/debug/pprof/allocs # all allocs
curl localhost:6060/debug/pprof/goroutine?debug=2 # goroutine dump
go tool trace trace.out # scheduler/GC latency analysis
GC / memory tuning
GOGC (default 100): GC triggers when heap grows 100% since last GC. Higher (e.g. 200) = less CPU, more RAM. GOGC=off disables (dangerous). GOMEMLIMIT (1.19+): soft memory cap, e.g. GOMEMLIMIT=4GiB. Use it to avoid OOM-kills in containers; set it below the container limit (~80–90%). Pair with GOGC=off for a hard-cap-driven GC strategy in memory-constrained pods. GOMAXPROCS: set to container CPU quota (use automaxprocs lib) or you'll oversubscribe and thrash.
10. Redis Data Structures & Use Cases
| Type | Core commands | Use cases |
| String | GET/SET/INCR/SETEX | cache, counters, rate limiter, sessions, distributed lock (SET NX PX) |
| Hash | HSET/HGET/HGETALL | object/record cache (fields), shopping cart |
| List | LPUSH/RPOP/LRANGE/BLPOP | queues, recent items, simple job queue (use Streams instead for reliability) |
| Set | SADD/SISMEMBER/SINTER | tags, unique visitors, relationships, dedup |
| Sorted Set (ZSet) | ZADD/ZRANGE/ZRANGEBYSCORE | leaderboards, priority queues, rate limiting windows, time-ordered feeds |
| Stream | XADD/XREAD/XACK/XAUTOCLAIM | durable event log, consumer groups (Kafka-lite), at-least-once delivery |
| HyperLogLog | PFADD/PFCOUNT | approximate unique counts (cardinality) at ~12 KB fixed |
| Bitmap | SETBIT/BITCOUNT/BITOP | daily active users, feature flags per-user, presence |
| Geo | GEOADD/GEOSEARCH | nearby-search, location features |
Senior notes: Redis is single-threaded for commands — avoid O(n) commands (KEYS, big SMEMBERS) on large keys; use SCAN. Set TTLs to bound memory; pick an eviction policy (allkeys-lru for cache, noeviction for queues). For distributed locks across nodes, plain SET NX is fine for most cases; Redlock only when you truly need multi-master safety (and understand its caveats).
11. AWS Service Cheat-Map
| Category | Need | Service |
| Compute | VMs | EC2 |
| Containers (managed) | ECS (Fargate = serverless), EKS (k8s) |
| Functions | Lambda |
| Storage | Object storage | S3 |
| Block storage (disks) | EBS |
| Shared file system | EFS |
| Archive | S3 Glacier |
| Database | Managed relational (Postgres) | RDS / Aurora (Postgres-compatible) |
| Key-value / wide-column | DynamoDB |
| In-memory cache | ElastiCache (Redis/Memcached) |
| Data warehouse | Redshift |
| Messaging | Pub/sub fan-out | SNS |
| Queue (point-to-point) | SQS (FIFO for ordering/dedup) |
| Managed Kafka | MSK |
| Event bus / routing | EventBridge |
| Streaming | Kinesis Data Streams |
| Networking | DNS | Route 53 |
| CDN | CloudFront |
| Load balancer | ALB (L7) / NLB (L4) |
| Private network | VPC |
| API front door | API Gateway |
| Ops/Security | Secrets | Secrets Manager / SSM Parameter Store |
| Identity/permissions | IAM |
| Metrics/logs/alarms | CloudWatch |
| Tracing | X-Ray |
E-commerce mapping (this role): ALB → ECS/Fargate Go services → RDS Aurora Postgres + ElastiCache Redis; MSK for the event backbone; S3 for assets/exports; SNS+SQS or EventBridge for fan-out notifications; Secrets Manager for credentials; CloudWatch + X-Ray for observability.
12. The Nines (Availability → Downtime)
| Availability | "Nines" | Downtime / year | Downtime / month | Downtime / day |
| 90% | one nine | 36.5 days | ~73 hours | 2.4 hours |
| 99% | two nines | 3.65 days | ~7.2 hours | 14.4 min |
| 99.9% | three nines | 8.76 hours | ~43.8 min | 1.44 min |
| 99.95% | | 4.38 hours | ~21.9 min | 43 sec |
| 99.99% | four nines | 52.6 min | ~4.38 min | 8.6 sec |
| 99.999% | five nines | 5.26 min | ~26.3 sec | 0.86 sec |
Senior note: Each nine roughly costs an order of magnitude more (redundancy, multi-AZ → multi-region, automation). Tie targets to an error budget: 99.9% = ~43 min/month of allowed downtime to spend on risky deploys.
13. gRPC vs REST — Quick Decision
| Dimension | gRPC | REST/JSON over HTTP |
| Wire format | Protobuf (binary, compact) | JSON (text, verbose, human-readable) |
| Transport | HTTP/2 (multiplexed, streaming) | HTTP/1.1 or /2 |
| Contract | Strongly typed .proto, codegen | OpenAPI/Swagger (optional) |
| Streaming | Native (uni + bidirectional) | SSE / WebSocket bolt-ons |
| Browser support | Needs grpc-web proxy | Native everywhere |
| Performance | Higher throughput, lower latency | Higher overhead |
| Debuggability | Needs tooling (grpcurl) | curl / browser friendly |
| Best for | Internal service-to-service, low-latency, polyglot, streaming | Public/external APIs, browser clients, broad compatibility |
Rule of thumb: gRPC inside the mesh (Go service ↔ Go service), REST at the edge (public API, partner integrations, mobile).
14. SQL EXPLAIN — Scan & Join Types (PostgreSQL)
Scan types (cheapest → most expensive, context-dependent)
| Node | Meaning | Good/Bad |
| Index Only Scan | Answered entirely from index (covering index) | Best |
| Index Scan | Walk index, then fetch matching rows from heap | Good for selective predicates |
| Bitmap Index/Heap Scan | Build a bitmap of matches, then fetch heap pages | Good for medium selectivity / multiple indexes |
| Seq Scan | Read the whole table | Fine for small tables / low selectivity; bad if it should've used an index |
Join types
| Node | How it works | Best when |
| Nested Loop | For each outer row, probe inner | Small outer + indexed inner |
| Hash Join | Build hash table on one side, probe with other | Large unsorted equi-joins |
| Merge Join | Sort both inputs, merge | Both inputs already sorted / large |
What to look for in EXPLAIN ANALYZE
rows estimate vs actual rows wildly off → stale stats; run ANALYZE. Seq Scan on a big table with a selective WHERE → missing/unused index. Rows Removed by Filter high → index not selective enough; consider composite/partial index. loops=N large on a Nested Loop → likely the wrong join strategy (bad estimates). - Sort / Hash spilling to disk (
external merge Disk: …kB) → raise work_mem or reduce rows earlier. - Always use
EXPLAIN (ANALYZE, BUFFERS) to see real I/O, and BUFFERS to spot cache misses.