🟡 Middle Level (251–500)

Focus: Real-world system designs (Twitter feed, Instagram, Bitly, Yelp), trade-offs, sharding, replication, MQ patterns, caching strategies, microservices communication, basic distributed concepts.

For whom: 2–5 years of experience, mid-level engineer. Time per question: 30–45 minutes, whiteboard sketch with components.

🧠 Core Distributed Concepts (251–280)

Explain the CAP theorem and the trade-offs among its three properties.
Explain the PACELC theorem.
What is BASE in contrast with ACID?
Explain strong consistency vs eventual consistency with examples.
What is read-your-writes consistency?
What is monotonic read consistency?
What is causal consistency?
What is linearizability?
What is serializability?
What is the difference between linearizability and serializability?
Explain quorum (R + W > N) in distributed databases.
What is consistent hashing and why is it useful?
How do virtual nodes (vnodes) help in consistent hashing?
What is a hot shard and how do you mitigate it?
What is split-brain in distributed systems?
What are the trade-offs of strong vs eventual consistency in a chat app?
What is "exactly-once" semantics and is it achievable?
Explain at-most-once vs at-least-once vs exactly-once.
What is back-pressure?
What is fan-out (read vs write)?
What is the difference between push and pull models?
What is the difference between sync replication and async replication?
What is data locality and why is it important?
What is leader–follower replication?
What is multi-master replication and what conflicts can it cause?
What is replication lag?
What is geo-replication?
What is horizontal vs vertical partitioning?
What is sharding by range vs hash vs directory?
What is a "shard key" and how do you pick one?
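
As one possible illustration of the consistent hashing and virtual node questions above, here is a minimal sketch. The `HashRing` class, the `vnodes=100` default, and the use of MD5 for ring positions are assumptions made for the example, not a reference to any particular database's implementation.

```python
import bisect
import hashlib


def ring_position(key: str) -> int:
    # Stable 64-bit position derived from MD5 (chosen only for illustration).
    return int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")


class HashRing:
    """Hypothetical consistent-hash ring with virtual nodes."""

    def __init__(self, nodes, vnodes=100):
        self.vnodes = vnodes
        self._ring = []  # sorted list of (position, node)
        for node in nodes:
            self.add(node)

    def add(self, node):
        # Each physical node is placed at many positions to smooth the load.
        for i in range(self.vnodes):
            bisect.insort(self._ring, (ring_position(f"{node}#{i}"), node))

    def remove(self, node):
        self._ring = [(pos, n) for pos, n in self._ring if n != node]

    def lookup(self, key):
        if not self._ring:
            raise LookupError("ring is empty")
        # First virtual node at or after the key's position, wrapping around.
        idx = bisect.bisect_left(self._ring, (ring_position(key),))
        return self._ring[idx % len(self._ring)][1]


ring = HashRing(["a", "b", "c"])
owner = ring.lookup("user:42")  # one of "a", "b", "c"
```

The property worth pointing out in an interview is that removing a node only remaps the keys that node owned; keys owned by the surviving nodes stay where they were.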

🛢️ Database & Sharding (281–320)

When would you choose Cassandra over Postgres?
When would you choose MongoDB over MySQL?
When would you use DynamoDB?
When would you use Redis as a primary store vs cache?
What are secondary indexes and what do they cost?
What is a covering index?
What is an index merge?
What is the cost of too many indexes?
What is a write-ahead log (WAL)?
What is MVCC (multi-version concurrency control)?
What are the four standard isolation levels?
What is a phantom read?
What is a non-repeatable read?
What is a dirty read?
How would you migrate a schema with zero downtime?
How does double-write deal with cache–DB consistency?
How do you handle failed cache writes?
What is a read-through cache?
What is a cache stampede and how do you avoid it?
What is request coalescing?
What is a denormalized read model?
What is a materialized view?
What is CDC (change data capture)?
How does CDC help with cache invalidation?
What is the outbox pattern?
What is the saga pattern?
What is two-phase commit and why is it problematic?
What is three-phase commit?
What is eventual consistency through compensating transactions?
How would you shard a "users" table by user_id?
How would you shard a "messages" table for a chat app?
What is co-location of related shards?
What is the "hot key" problem in sharded systems?
How would you re-shard a live database?
What is a global secondary index?
What is a local secondary index?
What is the N+1 query problem and how do you fix it?
What is connection pooling at the proxy level (PgBouncer)?
What is read-replica lag and how to handle it for user-facing reads?
What is the difference between OLTP and OLAP DBs at the architecture level?
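
For the outbox pattern question above, a small sketch using an in-memory SQLite database may help make the idea concrete. The table layout and the `place_order` and `relay_once` helpers are hypothetical; the point is only that the business row and the outbox row are written in one transaction, and a separate relay publishes unpublished rows.

```python
import json
import sqlite3


def create_schema(conn):
    conn.executescript("""
        CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT NOT NULL);
        CREATE TABLE outbox (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            payload TEXT NOT NULL,
            published INTEGER NOT NULL DEFAULT 0
        );
    """)


def place_order(conn, order_id, item):
    # "with conn" commits both inserts together, or rolls both back on error.
    with conn:
        conn.execute("INSERT INTO orders (id, item) VALUES (?, ?)", (order_id, item))
        event = {"type": "order_placed", "order_id": order_id}
        conn.execute("INSERT INTO outbox (payload) VALUES (?)", (json.dumps(event),))


def relay_once(conn, publish):
    """Publish pending outbox rows in insertion order; returns how many were sent."""
    rows = conn.execute(
        "SELECT id, payload FROM outbox WHERE published = 0 ORDER BY id"
    ).fetchall()
    for row_id, payload in rows:
        publish(json.loads(payload))  # e.g. send to a message broker
        with conn:
            conn.execute("UPDATE outbox SET published = 1 WHERE id = ?", (row_id,))
    return len(rows)


conn = sqlite3.connect(":memory:")
create_schema(conn)
place_order(conn, 1, "book")
relay_once(conn, print)
```

If the relay crashes after publishing but before marking the row, the event is sent again on the next run, so this sketch gives at-least-once delivery and assumes idempotent consumers.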

📨 Messaging & Streaming (321–350)

What is Apache Kafka at a high level?
How do partitions work in Kafka?
What is a Kafka consumer group?
What is an offset in Kafka and how is it managed?
How does Kafka guarantee ordering?
What is RabbitMQ and how does it differ from Kafka?
What is an exchange in RabbitMQ (direct, topic, fanout, headers)?
What is a dead-letter queue?
When do you use SQS vs SNS?
When do you use Kafka vs Kinesis?
What is the difference between a queue and a topic?
What is consumer lag and how do you monitor it?
How do you achieve at-least-once delivery?
How do you make a consumer idempotent?
What is the outbox + relay pattern?
How do you design retries with exponential backoff and jitter?
What is a poison pill in queue processing?
How would you handle ordering across partitions?
What is a compacted topic in Kafka?
What is exactly-once semantics in Kafka?
What is Kafka MirrorMaker?
What is stream-table duality?
What is windowing in stream processing?
What is a watermark in stream processing?
What is event time vs processing time?
When do you choose Flink vs Spark Streaming?
What is back-pressure handling in a streaming pipeline?
What is a sidecar proxy in a messaging context?
How do you handle schema evolution (Avro, Protobuf)?
What is the difference between push and pull queue consumers?
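
The retry question above (exponential backoff with jitter) can be sketched as follows. This uses the "full jitter" variant, where each delay is drawn uniformly between zero and an exponentially growing ceiling. The function name `call_with_retries` and its defaults are assumptions for illustration.

```python
import random
import time


def call_with_retries(fn, attempts=5, base=0.1, cap=10.0, sleep=time.sleep):
    """Call fn(); on failure wait a jittered, exponentially growing delay and retry.

    The ceiling for attempt k (starting at 0) is min(cap, base * 2**k) seconds.
    The last failure is re-raised once all attempts are used up.
    """
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise
            ceiling = min(cap, base * 2 ** attempt)
            # Full jitter: spread retries so clients do not retry in lockstep.
            sleep(random.uniform(0, ceiling))
```

In a real consumer you would usually retry only on errors known to be transient and route messages that keep failing to a dead-letter queue; catching every `Exception` here just keeps the sketch short.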

🌐 Microservices & APIs (351–380)

What are the pros and cons of microservices vs a modular monolith?
How do you decide service boundaries (DDD)?
What is a bounded context?
What is an aggregate in DDD?
What is service discovery and why is it needed?
What is client-side vs server-side service discovery?
What is the role of an API gateway in microservices?
What is a service mesh and why use it?
What is the sidecar pattern (Envoy/Istio)?
How do services authenticate to each other (mTLS, JWT)?
What is a circuit breaker and how does it work?
What is the bulkhead pattern?
What is a retry storm and how do you avoid it?
What is a backend-for-frontend (BFF)?
What is the strangler fig pattern in migrations?
How do you handle distributed transactions across services?
How would you implement a saga for "order → payment → shipping"?
What is choreography vs orchestration in a saga?
How would you handle versioning in microservice APIs?
What is contract testing?
What are consumer-driven contracts (Pact)?
How would you design a search service in front of multiple data stores?
What is the read-model / write-model split (CQRS basics)?
What is BFF cache vs origin cache?
How would you migrate from a monolith to microservices step by step?
What is event-carried state transfer?
What is the "shared database anti-pattern"?
What is pipelined gRPC?
What is server streaming vs client streaming vs bi-di in gRPC?
What are the trade-offs between REST, gRPC, and GraphQL?
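
For the "order → payment → shipping" saga question above, an orchestrated saga can be sketched as a list of steps, each paired with a compensating action. Everything here (`run_saga`, the step names, the in-memory `log`) is hypothetical and stands in for real service calls.

```python
def run_saga(steps):
    """Run (name, action, compensation) steps in order.

    If a step fails, the compensations of the already completed steps run in
    reverse order. Returns (True, None) on success or (False, failed_step_name).
    """
    completed = []
    for name, action, compensate in steps:
        try:
            action()
        except Exception:
            for _, undo in reversed(completed):
                undo()
            return False, name
        completed.append((name, compensate))
    return True, None


log = []


def ship():
    # Simulated failure in the last step.
    raise RuntimeError("no courier available")


steps = [
    ("order", lambda: log.append("create_order"), lambda: log.append("cancel_order")),
    ("payment", lambda: log.append("charge_payment"), lambda: log.append("refund_payment")),
    ("shipping", ship, lambda: log.append("cancel_shipment")),
]
outcome = run_saga(steps)
```

A production orchestrator would also persist saga state and retry compensations that fail; this sketch assumes compensations always succeed.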

🌍 Real-World Designs (381–430)

Design Twitter (timeline, tweets, follow).
Design Instagram (photo upload, feed).
Design Facebook News Feed (basic).
Design YouTube (video upload, playback).
Design Netflix (catalog + streaming).
Design Spotify (music streaming).
Design WhatsApp / Messenger (chat).
Design Slack (channels, threads).
Design Discord (servers, voice channels).
Design Zoom (video conferencing).
Design Google Drive / Dropbox.
Design Google Docs (real-time collaborative editing).
Design Pinterest (boards + pins).
Design Reddit (subreddits, voting).
Design Quora / StackOverflow (Q&A platform).
Design Yelp (location-based reviews).
Design Airbnb (booking, search by location).
Design Uber (ride matching, geospatial).
Design DoorDash / UberEats (food delivery).
Design Amazon product page + cart.
Design Shopify-like multi-tenant store.
Design eBay / online auctions.
Design Tinder (matching, swipes).
Design LinkedIn (connections, feed).
Design Medium (publishing platform).
Design GitHub (repos, PRs, issues).
Design Jira (issue tracker, boards).
Design Trello (kanban boards).
Design Notion (blocks-based docs).
Design Figma (collaborative canvas).
Design Twitch (live streaming + chat).
Design TikTok (short-video feed).
Design SoundCloud (audio uploads).
Design a typeahead / autocomplete service.
Design a real-time multiplayer chess server.
Design a basic e-mail service (Gmail-lite).
Design a calendar service with reminders.
Design a video conferencing whiteboard.
Design a real-time stock ticker dashboard.
Design a flight booking system.
Design a hotel booking system.
Design a movie ticket booking system at scale.
Design a parking-lot reservation system at city scale.
Design a ride-share carpool matcher.
Design a coupon / promo code service.
Design a recommendation engine for an e-commerce site.
Design a "people you may know" service.
Design a "who viewed your profile" service.
Design a hashtag trending service.
Design a real-time leaderboard for a global game.

🏗️ Reliability & Performance Patterns (431–470)

What is graceful degradation? Give an example.
What is fail-fast vs fail-safe?
What is retry with backoff and jitter?
What are the circuit breaker states (closed, open, half-open)?
What is timeout cascading and how to prevent it?
What is a hedged request?
What is request collapsing?
What is shadow traffic / dark launching?
What is feature flag rollout?
What is canary deployment with metric guardrails?
What are the trade-offs of blue-green deployment?
What is a rolling deployment?
What is in-place vs immutable deployment?
What is the role of synthetic monitoring?
What is real-user monitoring (RUM)?
What is APM and what tools provide it?
What are the four golden signals (latency, traffic, errors, saturation)?
What is the USE method (Utilization, Saturation, Errors)?
What is the RED method (Rate, Errors, Duration)?
How do you define SLIs, SLOs, and SLAs?
What is an error budget?
What is the role of chaos engineering?
What is fault injection?
What is request tracing and why is it valuable?
What is OpenTelemetry?
What is span vs trace vs context propagation?
What is structured logging?
What is log aggregation (e.g., ELK, Loki)?
What is metric cardinality and why is it dangerous?
What is alert fatigue and how do you avoid it?
What is on-call rotation design?
What is a blameless post-mortem?
What is "graceful shutdown" of a service?
What is connection draining on a load balancer?
What is keep-alive vs idle timeout in connection pools?
What is HTTP connection coalescing?
What is pre-warming a cache?
What is the "thundering herd" problem on a cold cache and how do you fix it?
What is request hedging?
What is bulkhead with thread-pool isolation?
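
The circuit breaker states asked about above can be illustrated with a small state machine. The `CircuitBreaker` class, its thresholds, and the injectable `clock` are assumptions for this sketch; libraries differ in details such as how many trial calls they allow while half-open.

```python
import time


class CircuitBreaker:
    """Hypothetical breaker: closed -> open after N failures -> half-open after a timeout."""

    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    @property
    def state(self):
        if self.opened_at is None:
            return "closed"
        if self.clock() - self.opened_at >= self.reset_timeout:
            return "half-open"
        return "open"

    def call(self, fn):
        state = self.state
        if state == "open":
            # Fail fast instead of waiting on a dependency that is likely down.
            raise RuntimeError("circuit open: failing fast")
        try:
            result = fn()
        except Exception:
            self.failures += 1
            # A failed trial call while half-open reopens the circuit immediately.
            if state == "half-open" or self.failures >= self.failure_threshold:
                self.opened_at = self.clock()
            raise
        # Any success closes the circuit and resets the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

This simplified version is not thread-safe and lets every caller through while half-open; limiting half-open traffic to a single probe request is a common refinement.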

🔧 Caching Strategies (471–490)

What are write-through, write-back, and write-around caching?
What is the cache-aside pattern in detail?
How do you choose TTL for a session cache vs profile cache?
What is negative caching and when to use it?
What is cache stampede protection (mutex / lease)?
How does Redis Cluster sharding work?
What is Redis pub/sub and what are its limitations?
What is Redis Streams?
What is Redis Sentinel vs Redis Cluster?
What is consistent hashing with virtual nodes in Memcached?
How do you cache GraphQL responses?
How do you cache user-specific data efficiently?
What is edge caching and how does CDN cache personalized content?
What is fragment caching (Russian doll caching)?
How do the HTTP cache headers (Cache-Control, ETag, Last-Modified) interact?
What is stale-while-revalidate?
What is private vs shared cache?
What is the Varnish HTTP cache and where does it fit?
What is multi-tier caching (L1 in-process + L2 Redis)?
What are the differences between Redis, ElastiCache, and MemoryDB?
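
To make the cache-aside and stampede protection questions above concrete, here is an in-process sketch. The `CacheAside` class and its `loader` callback are hypothetical; a per-key mutex ensures that only one caller recomputes a missing entry while the others wait and then read the cached value.

```python
import threading
import time


class CacheAside:
    """Hypothetical in-process cache-aside with a per-key lock against stampedes."""

    def __init__(self, loader, ttl=60.0, clock=time.monotonic):
        self.loader = loader  # fetches the value from the source of truth
        self.ttl = ttl
        self.clock = clock
        self._data = {}   # key -> (value, expires_at)
        self._locks = {}  # key -> threading.Lock
        self._guard = threading.Lock()

    def _fresh(self, key):
        entry = self._data.get(key)
        if entry is not None and entry[1] > self.clock():
            return entry
        return None

    def get(self, key):
        entry = self._fresh(key)
        if entry is not None:
            return entry[0]  # cache hit
        with self._guard:
            lock = self._locks.setdefault(key, threading.Lock())
        with lock:
            # Re-check: another thread may have filled the entry while we waited.
            entry = self._fresh(key)
            if entry is not None:
                return entry[0]
            value = self.loader(key)
            self._data[key] = (value, self.clock() + self.ttl)
            return value

    def invalidate(self, key):
        # Called after a write to the source of truth.
        self._data.pop(key, None)
```

Across multiple application servers the same idea needs a distributed lock or lease in the shared cache; this sketch only covers a single process.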

🔍 Search & Indexing (491–500)

How would you build a basic search index for blog posts?
What is an inverted index?
What is TF-IDF and where is it used?
What is BM25?
What is Elasticsearch and how does it shard data?
What is the difference between text and keyword fields in ES?
How do you handle typo tolerance / fuzzy search?
How do you implement autocomplete/typeahead at scale?
How do you sync your primary DB with Elasticsearch?
What are the trade-offs of using Elasticsearch as a primary data store?
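
The inverted index question above can be illustrated with a toy index over blog posts. The `InvertedIndex` class and its lowercase alphanumeric tokenizer are assumptions for the sketch; real engines add stemming, stop words, and relevance scoring such as BM25.

```python
import re
from collections import defaultdict


def tokenize(text):
    # Lowercase and split on anything that is not a letter or digit.
    return re.findall(r"[a-z0-9]+", text.lower())


class InvertedIndex:
    """Maps each term to the set of document ids that contain it."""

    def __init__(self):
        self.postings = defaultdict(set)

    def add(self, doc_id, text):
        for term in tokenize(text):
            self.postings[term].add(doc_id)

    def search(self, query):
        """Return ids of documents containing every query term (AND semantics)."""
        terms = tokenize(query)
        if not terms:
            return set()
        result = None
        for term in terms:
            docs = self.postings.get(term, set())
            result = docs if result is None else result & docs
        return set(result)


index = InvertedIndex()
index.add(1, "Scaling Postgres with read replicas")
index.add(2, "Scaling Kafka consumers")
matches = index.search("scaling postgres")
```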
