Databases Roadmap¶
The vendor-agnostic concept layer for databases — the one canonical home for the ideas that apply regardless of engine: the relational model, transactions, consistency, indexing, replication, sharding, and the SQL-vs-NoSQL decision.
- Reference roadmaps: roadmap.sh/postgresql-dba · roadmap.sh/sql · roadmap.sh/mongodb · the Databases node of roadmap.sh/computer-science
How this fits with the rest of the repo¶
Databases live at several altitudes. To avoid duplication, each altitude has one home and links to the others:
| Layer | Home | What it covers |
|---|---|---|
| Concepts (this roadmap) | Backend/databases/ | Vendor-agnostic theory: ACID, normalization, CAP, indexing, replication, sharding |
| Specific engines | postgresql-dba · mongodb · redis · elasticsearch | How a concrete engine implements & operates these concepts |
| Engine internals | computer-science/20-database-internals | How storage works inside: LSM vs B-tree, MVCC implementation, WAL, query planner |
| At system-design altitude | system-design/12-databases | Choosing a datastore for a system; sharding/replication as scaling & availability strategies; CAP tradeoffs in distributed design |
| On-disk data structures | DSA · trees · advanced | B-tree, B+ tree, LSM-tree, Merkle tree as data structures |
Rule of thumb: "What is ACID / how does an index work" → here. "Should I shard this service / which DB for this system" → system-design.
Topics¶
Data modeling¶
- Relational Model — relations, keys, constraints, relational algebra
- Normalization & Denormalization — 1NF→BCNF, when to denormalize
- ER Modeling — entities, relationships, cardinality, ER → schema
- SQL: DDL / DML / DQL / DCL — the four sublanguages
- Views — virtual & materialized views
- Stored Procedures & Triggers — server-side logic, pros/cons
Transactions & consistency¶
- Transactions & ACID — atomicity, consistency, isolation, durability
- Isolation Levels — read phenomena, the four SQL levels, snapshot isolation
- Locking & Concurrency Control — pessimistic vs optimistic, deadlocks
- MVCC — multi-version concurrency (concept; impl in engine-internals)
- BASE & Eventual Consistency — the NoSQL relaxation
- CAP Theorem — consistency / availability / partition tolerance
- PACELC — the latency-vs-consistency extension of CAP
Performance¶
- Indexing — B-tree/hash/GIN/GiST concepts, composite indexes, when not to index
- Query Optimization — EXPLAIN, join strategies, statistics
Scaling & distribution¶
- Replication — leader-follower, multi-leader, sync vs async, lag
- Partitioning & Sharding — horizontal/vertical, shard keys, rebalancing
- Database Federation — splitting by function
- SQL vs NoSQL — the decision and its tradeoffs
- NoSQL Data Models — key-value, document, wide-column, graph
Operations & analytics¶
- OLTP vs OLAP & Warehousing — transactional vs analytical, star schema
- Connection Pooling — sizing, PgBouncer/HikariCP
- Caching at the DB Layer — buffer pool, query cache, read-through
- Backup & Recovery — WAL archiving, PITR, RTO/RPO
roadmap.sh coverage (Computer Science → Databases)¶
Every subtopic of the roadmap.sh CS Databases node maps here:
| roadmap.sh | Here |
|---|---|
| SQL vs NoSQL | 19 |
| Normalization / Denormalization | 02 |
| Entity-Relationship Model | 03 |
| DDL, DML, DQL, DCL | 04 |
| Locking | 09 |
| ACID Model | 07 |
| BASE | 11 |
| CAP Theorem | 12 |
| PACELC | 13 |
| Indexes | 14 |
| Views | 05 |
| Transactions | 07 |
| Stored Procedures | 06 |
| Database Federation | 18 |
| Replication | 16 |
| Sharding | 17 |
Status: skeleton — folders created, content to be written.