Middle Level (251–500)
Focus: Real-world system designs (Twitter feed, Instagram, Bitly, Yelp), trade-offs, sharding, replication, MQ patterns, caching strategies, microservices communication, basic distributed concepts.
For whom: 2–5 years of experience, mid-level engineer. Time per question: 30–45 minutes, whiteboard sketch with components.
Core Distributed Concepts (251–280)
- Explain the CAP theorem and the three trade-offs.
- Explain the PACELC theorem.
- What is BASE in contrast with ACID?
- Explain strong consistency vs eventual consistency with examples.
- What is read-your-writes consistency?
- What is monotonic read consistency?
- What is causal consistency?
- What is linearizability?
- What is serializability?
- What is the difference between linearizability and serializability?
- Explain quorum (R + W > N) in distributed databases.
- What is consistent hashing and why is it useful?
- How do virtual nodes (vnodes) help in consistent hashing?
- What is a hot shard and how do you mitigate it?
- What is split-brain in distributed systems?
- What are the trade-offs of strong vs eventual consistency in a chat app?
- What is "exactly-once" semantics and is it achievable?
- Explain at-most-once vs at-least-once vs exactly-once.
- What is back-pressure?
- What is fan-out (read vs write)?
- What is the difference between push vs pull models?
- What is the difference between sync replication and async replication?
- What is data locality and why is it important?
- What is leader–follower replication?
- What is multi-master replication and what are its conflicts?
- What is replication lag?
- What is geo-replication?
- What is horizontal vs vertical partitioning?
- What is sharding by range vs hash vs directory?
- What is a "shard key" and how do you pick one?
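
The consistent hashing and virtual node questions above are easier to discuss with something concrete. The following is a minimal sketch, not a production design: the `HashRing` class, its method names, and the choice of SHA-256 with 100 virtual nodes per server are all assumptions made for illustration.

```python
import bisect
import hashlib


def ring_position(key: str) -> int:
    """Map a string to a 64-bit position on the ring (SHA-256 is used only for spread)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class HashRing:
    """Hypothetical consistent-hash ring where each node owns `vnodes` positions."""

    def __init__(self, nodes=(), vnodes=100):
        self.vnodes = vnodes
        self._positions = []  # sorted ring positions
        self._owner = {}      # ring position -> physical node
        for node in nodes:
            self.add_node(node)

    def add_node(self, node):
        for i in range(self.vnodes):
            pos = ring_position(f"{node}#{i}")
            if pos in self._owner:
                continue  # position already taken; extremely unlikely with 64 bits
            bisect.insort(self._positions, pos)
            self._owner[pos] = node

    def remove_node(self, node):
        self._positions = [p for p in self._positions if self._owner[p] != node]
        self._owner = {p: n for p, n in self._owner.items() if n != node}

    def get_node(self, key):
        if not self._positions:
            raise LookupError("ring is empty")
        # The first virtual node clockwise from the key's position owns the key.
        idx = bisect.bisect_right(self._positions, ring_position(key))
        return self._owner[self._positions[idx % len(self._positions)]]
```

With this layout, adding a node only moves the keys that now land on the new node's virtual nodes; every other key keeps its owner. More virtual nodes per server generally gives a more even spread at the cost of a larger ring.
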
Database & Sharding (281–320)
- When would you choose Cassandra over Postgres?
- When would you choose MongoDB over MySQL?
- When would you use DynamoDB?
- When would you use Redis as a primary store vs cache?
- What are secondary indexes and what do they cost?
- What is a covering index?
- What is an index merge?
- What is the cost of too many indexes?
- What is a write-ahead log (WAL)?
- What is MVCC (multi-version concurrency control)?
- What are the four standard isolation levels?
- What is a phantom read?
- What is a non-repeatable read?
- What is a dirty read?
- How would you migrate a schema with zero downtime?
- How does double-write deal with cache–DB consistency?
- How do you handle failed cache writes?
- What is a read-through cache?
- What is a cache stampede and how do you avoid it?
- What is request coalescing?
- What is a denormalized read model?
- What is a materialized view?
- What is CDC (change data capture)?
- How does CDC help with cache invalidation?
- What is an outbox pattern?
- What is a saga pattern?
- What is two-phase commit and why is it problematic?
- What is three-phase commit?
- What is eventual consistency through compensating transactions?
- How would you shard a "users" table by user_id?
- How would you shard a "messages" table for a chat app?
- What is co-location of related shards?
- What is the "hot key" problem in sharded systems?
- How would you re-shard a live database?
- What is a global secondary index?
- What is a local secondary index?
- What is the N+1 query problem and how do you fix it?
- What is connection pooling at the proxy level (PgBouncer)?
- What is read-replica lag and how to handle it for user-facing reads?
- What is the difference between OLTP and OLAP DBs at the architecture level?
Messaging & Streaming (321–350)
- What is Apache Kafka at a high level?
- How do partitions work in Kafka?
- What is a Kafka consumer group?
- What is an offset in Kafka and how is it managed?
- How does Kafka guarantee ordering?
- What is RabbitMQ and how does it differ from Kafka?
- What is an exchange in RabbitMQ (direct, topic, fanout, headers)?
- What is a dead-letter queue?
- When do you use SQS vs SNS?
- When do you use Kafka vs Kinesis?
- What is the difference between queue and topic?
- What is consumer lag and how do you monitor it?
- How do you achieve at-least-once delivery?
- How do you make a consumer idempotent?
- What is the outbox + relay pattern?
- How do you design retries with exponential backoff and jitter?
- What is a poison pill in queue processing?
- How would you handle ordering across partitions?
- What is a compacted topic in Kafka?
- What is exactly-once semantics in Kafka?
- What is Kafka MirrorMaker?
- What is stream-table duality?
- What is windowing in stream processing?
- What is a watermark in stream processing?
- What is event time vs processing time?
- When do you choose Flink vs Spark Streaming?
- What is back-pressure handling in a streaming pipeline?
- What is a sidecar proxy in a messaging context?
- How do you handle schema evolution (Avro, Protobuf)?
- What is the difference between push and pull queue consumers?
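
For the question above on retries with exponential backoff and jitter, a minimal sketch may help frame an answer. It uses the "full jitter" variant (a uniform random delay up to an exponentially growing, capped ceiling); the function names, default values, and the choice of retryable exceptions are assumptions for illustration only.

```python
import random
import time


def jittered_delay(attempt, base=0.1, cap=10.0, rng=random.random):
    """Full jitter: a random delay in [0, min(cap, base * 2**attempt))."""
    return rng() * min(cap, base * (2 ** attempt))


def call_with_retries(operation, retryable=(ConnectionError, TimeoutError),
                      max_attempts=5, base=0.1, cap=10.0,
                      sleep=time.sleep, rng=random.random):
    """Call operation(); on a retryable error, wait with backoff and try again."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except retryable:
            if attempt == max_attempts - 1:
                raise  # retries exhausted: surface the last error to the caller
            sleep(jittered_delay(attempt, base, cap, rng))
```

The random component spreads retries from many clients over time, so they do not all hit a recovering service at the same instant. The `sleep` and `rng` parameters exist only to make the sketch easy to test.
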
Microservices & APIs (351–380)
- What are the pros and cons of microservices vs a modular monolith?
- How do you decide service boundaries (DDD)?
- What is a bounded context?
- What is an aggregate in DDD?
- What is service discovery and why is it needed?
- What is client-side vs server-side service discovery?
- What is the role of an API gateway in microservices?
- What is a service mesh and why use it?
- What is the sidecar pattern (Envoy/Istio)?
- How do services authenticate to each other (mTLS, JWT)?
- What is a circuit breaker and how does it work?
- What is a bulkhead pattern?
- What is a retry storm and how do you avoid it?
- What is a backend-for-frontend (BFF)?
- What is the strangler fig pattern in migrations?
- How do you handle distributed transactions across services?
- How would you implement a saga for "order → payment → shipping"?
- What is choreography vs orchestration in a saga?
- How would you handle versioning in microservice APIs?
- What is contract testing?
- What are consumer-driven contracts (Pact)?
- How would you design a search service in front of multiple data stores?
- What is the read-model / write-model split (CQRS basics)?
- What is BFF cache vs origin cache?
- How would you migrate from monolith to microservices step by step?
- What is event-carried state transfer?
- What is the "shared database anti-pattern"?
- What is pipelined gRPC?
- What is server streaming vs client streaming vs bi-di in gRPC?
- What are the trade-offs between REST, gRPC, and GraphQL?
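
As a starting point for the "order → payment → shipping" saga question above, here is a minimal orchestration-style sketch. It runs steps in order and, if one fails, runs the compensations of the already-completed steps in reverse. Everything here (`run_saga`, `SagaFailed`, the `ORDER_SAGA` steps, the `courier_available` flag) is hypothetical, and it assumes compensations themselves never fail, which a real system cannot assume.

```python
class SagaFailed(Exception):
    """Raised after a failed step has triggered compensation."""


def run_saga(steps, context):
    """steps is a list of (name, action, compensation) tuples."""
    completed = []  # compensations for the steps that succeeded so far
    for name, action, compensate in steps:
        try:
            action(context)
        except Exception as exc:
            # Undo completed steps in reverse order, then report the failure.
            for undo in reversed(completed):
                undo(context)
            raise SagaFailed(f"step {name!r} failed: {exc}") from exc
        completed.append(compensate)
    return context


def ship(ctx):
    # Hypothetical failure condition used to demonstrate compensation.
    if not ctx.get("courier_available", True):
        raise RuntimeError("no courier available")
    ctx["log"].append("shipment booked")


ORDER_SAGA = [
    ("order",
     lambda ctx: ctx["log"].append("order created"),
     lambda ctx: ctx["log"].append("order cancelled")),
    ("payment",
     lambda ctx: ctx["log"].append("payment charged"),
     lambda ctx: ctx["log"].append("payment refunded")),
    ("shipping",
     ship,
     lambda ctx: ctx["log"].append("shipment cancelled")),
]
```

Calling `run_saga(ORDER_SAGA, {"log": [], "courier_available": False})` raises `SagaFailed` after refunding the payment and cancelling the order. In a choreography-style saga, the same compensations would instead be triggered by events that each service publishes and consumes.
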
Real-World Designs (381–430)
- Design Twitter (timeline, tweets, follow).
- Design Instagram (photo upload, feed).
- Design Facebook News Feed (basic).
- Design YouTube (video upload, playback).
- Design Netflix (catalog + streaming).
- Design Spotify (music streaming).
- Design WhatsApp / Messenger (chat).
- Design Slack (channels, threads).
- Design Discord (servers, voice channels).
- Design Zoom (video conferencing).
- Design Google Drive / Dropbox.
- Design Google Docs (real-time collaborative editing).
- Design Pinterest (boards + pins).
- Design Reddit (subreddits, voting).
- Design Quora / StackOverflow (Q&A platform).
- Design Yelp (location-based reviews).
- Design Airbnb (booking, search by location).
- Design Uber (ride matching, geospatial).
- Design DoorDash / UberEats (food delivery).
- Design Amazon product page + cart.
- Design Shopify-like multi-tenant store.
- Design eBay / online auctions.
- Design Tinder (matching, swipes).
- Design LinkedIn (connections, feed).
- Design Medium (publishing platform).
- Design GitHub (repos, PRs, issues).
- Design Jira (issue tracker, boards).
- Design Trello (kanban boards).
- Design Notion (blocks-based docs).
- Design Figma (collaborative canvas).
- Design Twitch (live streaming + chat).
- Design TikTok (short-video feed).
- Design SoundCloud (audio uploads).
- Design a typeahead / autocomplete service.
- Design a real-time multiplayer chess server.
- Design a basic e-mail service (Gmail-lite).
- Design a calendar service with reminders.
- Design a video conferencing whiteboard.
- Design a real-time stock ticker dashboard.
- Design a flight booking system.
- Design a hotel booking system.
- Design a movie ticket booking system at scale.
- Design a parking-lot reservation system at city scale.
- Design a ride-share carpool matcher.
- Design a coupon / promo code service.
- Design a recommendation engine for an e-commerce site.
- Design a "people you may know" service.
- Design a "who viewed your profile" service.
- Design a hashtag trending service.
- Design a real-time leaderboard for a global game.
Reliability & Performance Patterns (431–470)
- What is graceful degradation? Give an example.
- What is fail-fast vs fail-safe?
- What is retry with backoff and jitter?
- What are the closed, open, and half-open states of a circuit breaker?
- What is timeout cascading and how to prevent it?
- What is a hedged request?
- What is request collapsing?
- What is shadow traffic / dark launching?
- What is a feature flag rollout?
- What is a canary deployment with metric guardrails?
- What are the trade-offs of blue-green deployment?
- What is a rolling deployment?
- What is in-place vs immutable deployment?
- What is the role of synthetic monitoring?
- What is real-user monitoring (RUM)?
- What is APM and what tools provide it?
- What are the four golden signals (latency, traffic, errors, saturation)?
- What is the USE method (Utilization, Saturation, Errors)?
- What is the RED method (Rate, Errors, Duration)?
- How do you set SLOs, SLAs, and SLIs?
- What is an error budget?
- What is the role of chaos engineering?
- What is fault injection?
- What is request tracing and why is it valuable?
- What is OpenTelemetry?
- What is span vs trace vs context propagation?
- What is structured logging?
- What is log aggregation (e.g., ELK, Loki)?
- What is metric cardinality and why is it dangerous?
- What is alert fatigue and how do you avoid it?
- What is on-call rotation design?
- What is a blameless post-mortem?
- What is "graceful shutdown" of a service?
- What is connection draining on a load balancer?
- What is keep-alive vs idle timeout in connection pools?
- What is HTTP connection coalescing?
- What is pre-warming a cache?
- What is the "thundering herd" problem on a cold cache and how do you fix it?
- What is request hedging?
- What is bulkhead with thread-pool isolation?
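
The circuit breaker states asked about above can be sketched as a small state machine. This is a simplified, single-threaded illustration: the class name, the thresholds, and the rule that any failure in half-open re-opens the circuit are assumptions, and real libraries typically add locking, sliding windows, and limits on concurrent probe requests.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker rejects a call without trying it."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self._clock = clock
        self._failures = 0
        self._opened_at = None  # None means the circuit is closed

    @property
    def state(self):
        if self._opened_at is None:
            return "closed"
        if self._clock() - self._opened_at >= self.reset_timeout:
            return "half-open"  # timeout elapsed: allow a probe call
        return "open"

    def call(self, operation):
        state = self.state
        if state == "open":
            raise CircuitOpenError("failing fast while the circuit is open")
        try:
            result = operation()
        except Exception:
            self._failures += 1
            # A failed probe, or too many consecutive failures, opens the circuit.
            if state == "half-open" or self._failures >= self.failure_threshold:
                self._opened_at = self._clock()
            raise
        # Any success closes the circuit and resets the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

While open, the breaker fails fast instead of tying up threads and connections on a dependency that is likely still down, which is also what keeps retries from piling onto it.
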
Caching Strategies (471–490)
- What is write-through, write-back, write-around caching?
- What is the cache-aside pattern in detail?
- How do you choose TTL for a session cache vs profile cache?
- What is negative caching and when to use it?
- What is cache stampede protection (mutex / lease)?
- How does Redis Cluster sharding work?
- What is Redis pub/sub and what are its limitations?
- What is Redis Streams?
- What is Redis Sentinel vs Redis Cluster?
- What is consistent hashing with virtual nodes in Memcached?
- How do you cache GraphQL responses?
- How do you cache user-specific data efficiently?
- What is edge caching and how does a CDN cache personalized content?
- What is fragment caching (Russian doll caching)?
- How do the HTTP cache headers (Cache-Control, ETag, Last-Modified) interact?
- What is stale-while-revalidate?
- What is private vs shared cache?
- What is Varnish HTTP Cache and where does it fit?
- What is multi-tier caching (L1 in-process + L2 Redis)?
- What is Redis vs ElastiCache vs MemoryDB?
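
The cache-aside and stampede protection (mutex) questions above can be illustrated together. The sketch below is an in-process stand-in: a dictionary plays the role of the cache, `loader` plays the role of the database query, and the `CacheAside` class and its per-key locks are hypothetical names. A shared cache such as Redis would need a distributed lock or lease instead of `threading.Lock`.

```python
import threading
import time


class CacheAside:
    def __init__(self, loader, ttl=60.0, clock=time.monotonic):
        self._loader = loader   # called on a miss, e.g. a database read
        self._ttl = ttl
        self._clock = clock
        self._data = {}         # key -> (value, expires_at)
        self._locks = {}        # key -> lock (never evicted in this sketch)
        self._guard = threading.Lock()

    def _fresh(self, key):
        entry = self._data.get(key)
        if entry is not None and entry[1] > self._clock():
            return entry
        return None

    def get(self, key):
        entry = self._fresh(key)
        if entry is not None:
            return entry[0]  # cache hit
        with self._guard:
            lock = self._locks.setdefault(key, threading.Lock())
        with lock:
            # Re-check: another caller may have filled the cache while we waited.
            entry = self._fresh(key)
            if entry is not None:
                return entry[0]
            value = self._loader(key)
            self._data[key] = (value, self._clock() + self._ttl)
            return value

    def invalidate(self, key):
        """Call after writing to the source of truth so the next read reloads."""
        self._data.pop(key, None)
```

When many requests miss on the same key at once, only the first one runs the loader; the rest block on the per-key lock and then read the freshly cached value.
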
Search & Indexing (491–500)
- How would you build a basic search index for blog posts?
- What is an inverted index?
- What is TF-IDF and where is it used?
- What is BM25?
- What is Elasticsearch and how does it shard data?
- What is the difference between text and keyword fields in ES?
- How do you handle typo tolerance / fuzzy search?
- How do you implement autocomplete/typeahead at scale?
- How do you sync your primary DB with Elasticsearch?
- What are the trade-offs of using Elasticsearch as a primary data store?
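
For the inverted index and basic blog search questions above, a toy version is often enough to anchor the discussion. The sketch below maps each term to the set of document IDs that contain it and answers AND queries by intersecting those sets. The `InvertedIndex` class and the simple lowercase tokenizer are assumptions; it has no ranking (TF-IDF or BM25), stemming, or typo tolerance.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


class InvertedIndex:
    def __init__(self):
        self._postings = defaultdict(set)  # term -> set of document IDs

    def add(self, doc_id, text):
        for term in tokenize(text):
            self._postings[term].add(doc_id)

    def search(self, query):
        """Return the IDs of documents that contain every term in the query."""
        terms = tokenize(query)
        if not terms:
            return set()
        result = None
        for term in terms:
            docs = self._postings.get(term, set())
            result = set(docs) if result is None else result & docs
        return result
```

A search engine such as Elasticsearch builds on the same idea, with postings stored per shard and scored rather than simply intersected.
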