System Design Roadmap¶
- Roadmap: https://roadmap.sh/system-design
Deep-Dive Sections¶
- 01. Capacity Estimation — QPS, storage, bandwidth, latency budgets
- 02. Trade-offs Framework — CAP, PACELC, consistency vs availability
- 03. Classic Problems — URL shortener, Twitter timeline, WhatsApp/chat, YouTube, Uber, Dropbox, Instagram, Stack Overflow, ad counter, payment system, web crawler, recommendation engine, rate limiter service, key-value store
- 04. Real-World Architectures — Google Spanner, Facebook TAO, Amazon DynamoDB, Netflix stack
- 05. Back-of-Envelope — number tables, Fermi estimation
Companion roadmaps: - Distributed Systems — consensus, replication, sharding, sagas, CRDTs - Architecture / DDD — bounded contexts, aggregates - Computer Science — OS, networking, DB internals
1. Introduction¶
- 1.1 What is System Design?
- 1.2 How to approach System Design?
2.3 Performance vs Scalability¶
2.4 Latency vs Throughput¶
2.5 Availability vs Consistency¶
- 2.5.1 CAP Theorem
- 2.5.1.1 AP - Availability + Partition Tolerance
- 2.5.1.2 CP - Consistency + Partition Tolerance
2.6 Consistency Patterns¶
- 2.6.1 Weak Consistency
- 2.6.2 Eventual Consistency
- 2.6.3 Strong Consistency
2.7 Availability Patterns¶
- 2.7.1 Fail-Over
- 2.7.1.1 Active - Active
- 2.7.1.2 Active - Passive
- 2.7.2 Replication
- 2.7.2.1 Master - Slave
- 2.7.2.2 Master - Master
- 2.7.3 Availability in Numbers
- 2.7.3.1 99.9% Availability - three 9s
- 2.7.3.2 99.99% Availability - four 9s
- 2.7.3.3 Availability in Parallel vs Sequence
3. Background Jobs¶
- 3.1 Event-Driven
- 3.2 Schedule Driven
- 3.3 Returning Results
4. Domain Name System¶
5. Content Delivery Networks¶
- 5.1 Pull CDNs
- 5.2 Push CDNs
6. Load Balancers¶
- 6.1 LB vs Reverse Proxy
- 6.2 Load Balancing Algorithms
- 6.3 Layer 7 Load Balancing
- 6.4 Layer 4 Load Balancing
6.5 Horizontal Scaling¶
7. Application Layer¶
- 7.1 Microservices
- 7.2 Service Discovery
8. Databases¶
- 8.1 Key-Value Store
- 8.2 Document Store
- 8.3 Wide Column Store
- 8.4 Graph Databases
- 8.5 NoSQL
- 8.6 RDBMS
- 8.7 Replication
- 8.8 Sharding
- 8.9 Federation
- 8.10 Denormalization
- 8.11 SQL Tuning
8.12 SQL vs NoSQL¶
9. Caching¶
- 9.1 Refresh Ahead
- 9.2 Write-behind
- 9.3 Write-through
- 9.4 Cache Aside
9.5 Strategies¶
9.6 Types of Caching¶
- 9.6.1 Client Caching
- 9.6.2 CDN Caching
- 9.6.3 Web Server Caching
- 9.6.4 Database Caching
- 9.6.5 Application Caching
10. Asynchronism¶
- 10.1 Back Pressure
- 10.2 Task Queues
- 10.3 Message Queues
11. Communication¶
- 11.1 HTTP
- 11.2 TCP
- 11.3 UDP
- 11.4 RPC
- 11.5 gRPC
- 11.6 REST
- 11.7 GraphQL
11.8 Idempotent Operations¶
12. Performance Antipatterns¶
- 12.1 Improper Instantiation
- 12.2 Monolithic Persistence
- 12.3 Noisy Neighbor
- 12.4 Synchronous I/O
- 12.5 Extraneous Fetching
- 12.6 Busy Database
- 12.7 Busy Frontend
- 12.8 Chatty I/O
- 12.9 Retry Storm
- 12.10 No Caching
13. Monitoring¶
- 13.1 Health Monitoring
- 13.2 Availability Monitoring
- 13.3 Performance Monitoring
- 13.4 Security Monitoring
- 13.5 Usage Monitoring
- 13.6 Instrumentation
- 13.7 Visualization & Alerts
14. Cloud Design Patterns¶
14.1 Design & Implementation¶
- 14.1.1 Strangler Fig
- 14.1.2 Sidecar
- 14.1.3 Static Content Hosting
- 14.1.4 Leader Election
- 14.1.5 CQRS
- 14.1.6 Pipes & Filters
- 14.1.7 Ambassador
- 14.1.8 Gateway Routing
- 14.1.9 Gateway Offloading
- 14.1.10 Gateway Aggregation
- 14.1.11 External Config Store
- 14.1.12 Compute Resource Consolidation
- 14.1.13 Backends for Frontend
- 14.1.14 Anti-Corruption Layer
14.2 Data Management¶
- 14.2.1 Valet Key
- 14.2.2 Static Content Hosting
- 14.2.3 Sharding
- 14.2.4 Materialized View
- 14.2.5 Index Table
- 14.2.6 Event Sourcing
- 14.2.7 CQRS
- 14.2.8 Cache-Aside
14.3 Messaging¶
- 14.3.1 Sequential Convoy
- 14.3.2 Scheduling Agent Supervisor
- 14.3.3 Queue-based Load Leveling
- 14.3.4 Publisher/Subscriber
- 14.3.5 Priority Queue
- 14.3.6 Pipes and Filters
- 14.3.7 Competing Consumers
- 14.3.8 Choreography
- 14.3.9 Claim Check
- 14.3.10 Async Request Reply
15. Reliability Patterns¶
15.1 Availability¶
- 15.1.1 Deployment Stamps
- 15.1.2 Geodes
- 15.1.3 Throttling
- 15.1.4 Health Endpoint Monitoring
- 15.1.5 Queue-Based Load Leveling
15.2 High Availability¶
- 15.2.1 Deployment Stamps
- 15.2.2 Geodes
- 15.2.3 Bulkhead
- 15.2.4 Health Endpoint Monitoring
- 15.2.5 Circuit Breaker
15.3 Resiliency¶
- 15.3.1 Bulkhead
- 15.3.2 Circuit Breaker
- 15.3.3 Compensating Transaction
- 15.3.4 Health Endpoint Monitoring
- 15.3.5 Leader Election
- 15.3.6 Queue-Based Load Leveling
- 15.3.7 Retry
- 15.3.8 Scheduler Agent Supervisor
15.4 Security¶
- 15.4.1 Federated Identity
- 15.4.2 Gatekeeper
- 15.4.3 Valet Key