# Compilers & Interpreters Roadmap
> "A compiler is just a function from strings to strings — but it's the most interesting such function ever written."
This roadmap is about how a programming language is turned into something that runs — lexing, parsing, type-checking, optimization, code generation, and the runtime that supports it all. It's also the gateway to the practical skills every senior engineer needs: writing DSLs, reading compiler errors fluently, understanding why the language behaves the way it does at the edges.
Looking for the CS-foundations angle (formal grammars, automata theory, optimization theory)? See Architecture → CS → Compilers — that section approaches compilers as a CS topic. This roadmap approaches them as a tool a working engineer reaches for (parsing DSLs, building linters, reading codegen output).
Looking for build orchestration (make, bazel, cargo)? See Build Systems.
## Why a Dedicated Roadmap
Most engineers will never write a production compiler — but most engineers will:
- Read a parser/AST library (Tree-sitter, ANTLR, go/ast)
- Write a small DSL or a config-language interpreter
- Debug "why does the compiler emit this weird code?"
- Implement a linter, codemod, or codegen tool
- Read LLVM IR or JVM bytecode to understand a performance issue
This roadmap targets that working-engineer slice — the parts of compilers that show up in everyday senior work.
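To make the "implement a linter" item concrete, here is a minimal sketch using Python's `ast` module (one of the toolkits named under Languages below). The lint rule itself — flagging bare `except:` handlers — is just an illustrative example, not something this roadmap prescribes:

```python
import ast

def find_bare_excepts(source: str) -> list[int]:
    """Return line numbers of bare `except:` handlers, which silently
    swallow every exception (including KeyboardInterrupt)."""
    tree = ast.parse(source)
    return [
        node.lineno
        for node in ast.walk(tree)
        # ExceptHandler.type is None exactly when the clause is bare.
        if isinstance(node, ast.ExceptHandler) and node.type is None
    ]

code = """\
try:
    risky()
except:
    pass
"""
print(find_bare_excepts(code))  # → [3]
```

A real linter adds configuration, fix suggestions, and many rules, but the core loop — parse to an AST, walk it, match node shapes — is exactly this.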
| Roadmap | Question it answers |
|---|---|
| Type Systems | What can the compiler prove about my program? |
| Build Systems | How is source turned into a runnable artifact? |
| Compilers & Interpreters (this) | How does a language actually understand and execute code? |
## Sections
| # | Topic | Focus |
|---|---|---|
| 01 | The Big Picture | Source → tokens → AST → IR → target; the phases and what each does |
| 02 | Lexers / Tokenizers | Regular languages, state machines, hand-written vs generated lexers |
| 03 | Parsers | Grammar classes (LL, LR, PEG), recursive descent, Pratt parsing, error recovery |
| 04 | Abstract Syntax Trees | Designing an AST, visitor pattern, immutable trees, source positions |
| 05 | Semantic Analysis | Name resolution, scopes, type checking, constant folding |
| 06 | Intermediate Representations | Three-address code, SSA form, LLVM IR, JVM bytecode |
| 07 | Optimization | Constant folding, inlining, DCE, escape analysis, loop optimizations |
| 08 | Code Generation | Register allocation, instruction selection, target ABIs |
| 09 | Interpreters | Tree-walking, bytecode VMs, JIT compilers, AOT vs JIT trade-offs |
| 10 | Runtimes | GC, threads, FFI, the line between language and runtime |
| 11 | DSLs in Practice | External vs internal DSLs, when to design one, when to just use a library |
| 12 | Reading Codegen | Inspecting Go's SSA dumps, JVM bytecode (javap), Rust's MIR, LLVM IR |
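As a taste of how sections 01–03 and 09 fit together, here is a toy end-to-end pipeline in Python (a sketch for orientation, not part of the roadmap's content): a regex lexer, a recursive-descent parser for `+`, `*`, and parentheses, and a tree-walking evaluator — skipping the IR, optimization, and codegen stages a real compiler inserts in between:

```python
import re

# source -> tokens -> AST -> result
TOKEN_RE = re.compile(r"\s*(?:(\d+)|(.))")

def tokenize(src):
    tokens = []
    for number, op in TOKEN_RE.findall(src):
        tokens.append(("NUM", int(number)) if number else ("OP", op))
    tokens.append(("EOF", None))
    return tokens

# Recursive descent for:  expr -> term ('+' term)*
#                         term -> atom ('*' atom)*
#                         atom -> NUM | '(' expr ')'
class Parser:
    def __init__(self, tokens):
        self.tokens, self.pos = tokens, 0

    def peek(self):
        return self.tokens[self.pos]

    def advance(self):
        tok = self.tokens[self.pos]
        self.pos += 1
        return tok

    def expr(self):
        node = self.term()
        while self.peek() == ("OP", "+"):
            self.advance()
            node = ("add", node, self.term())
        return node

    def term(self):
        node = self.atom()
        while self.peek() == ("OP", "*"):
            self.advance()
            node = ("mul", node, self.atom())
        return node

    def atom(self):
        kind, value = self.advance()
        if kind == "NUM":
            return ("num", value)
        if (kind, value) == ("OP", "("):
            node = self.expr()
            assert self.advance() == ("OP", ")"), "expected ')'"
            return node
        raise SyntaxError(f"unexpected token {value!r}")

def evaluate(node):
    """Tree-walking interpreter: recurse over the AST directly."""
    op = node[0]
    if op == "num":
        return node[1]
    left, right = evaluate(node[1]), evaluate(node[2])
    return left + right if op == "add" else left * right

tree = Parser(tokenize("2 * (3 + 4)")).expr()
print(tree)            # ('mul', ('num', 2), ('add', ('num', 3), ('num', 4)))
print(evaluate(tree))  # 14
```

Each later section replaces one of these toy pieces with the industrial-strength version: a table-driven or Pratt parser, a real AST with source positions, an IR between the tree and execution, a bytecode VM or JIT instead of direct tree-walking.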
## Languages
The roadmap is not tied to a single language: examples are in Go (writing a small interpreter, using go/ast and go/parser), Java (reading bytecode, ASM), Python (ast module, building DSLs), and Rust (syn, procedural macros) — the four languages most likely to show up in real codegen and tooling work.
## Status
⏳ Structure defined; content pending.
## References
- Compilers: Principles, Techniques, and Tools — Aho, Lam, Sethi, Ullman ("the Dragon Book")
- Engineering a Compiler — Cooper & Torczon
- Crafting Interpreters — Robert Nystrom (free online, the modern hands-on classic)
- Writing an Interpreter in Go / Writing a Compiler in Go — Thorsten Ball
## Project Context
Part of the Senior Project — a personal effort to consolidate the essential knowledge of software engineering in one place.