AI Agents Roadmap

  • Roadmap: https://roadmap.sh/ai-agents

1. Learn the Pre-requisites

  • 1.1 Basic Backend Development
  • 1.2 Git and Terminal Usage
  • 1.3 REST API Knowledge
  • 1.4 Backend Beginner Roadmap
  • 1.5 Git and GitHub Roadmap
  • 1.6 API Design Roadmap

2. LLM Fundamentals

2.1 Understand the Basics

  • 2.1.1 Streamed vs Unstreamed Responses
  • 2.1.2 Reasoning vs Standard Models
  • 2.1.3 Fine-tuning vs Prompt Engineering
  • 2.1.4 Embeddings and Vector Search
  • 2.1.5 Understand the Basics of RAG
  • 2.1.6 Pricing of Common Models

2.2 Model Families and Licences

  • 2.2.1 Open Weight Models
  • 2.2.2 Closed Weight Models

2.3 Transformer Models and LLMs

  • 2.3.1 Model Mechanics
  • 2.3.2 Tokenization
  • 2.3.3 Context Windows
  • 2.3.4 Token Based Pricing

2.4 Generation Controls

  • 2.4.1 Temperature
  • 2.4.2 Top-p
  • 2.4.3 Frequency Penalty
  • 2.4.4 Presence Penalty
  • 2.4.5 Stopping Criteria
  • 2.4.6 Max Length

3. AI Agents 101

3.1 What are AI Agents?

3.2 What are Tools?

3.3 Agent Loop

  • 3.3.1 Perception / User Input
  • 3.3.2 Reason and Plan
  • 3.3.3 Acting / Tool Invocation
  • 3.3.4 Observation & Reflection

3.4 Example Use Cases

  • 3.4.1 Personal assistant
  • 3.4.2 Code generation
  • 3.4.3 Data analysis
  • 3.4.4 Web Scraping / Crawling
  • 3.4.5 NPC / Game AI

4. Prompt Engineering

4.1 What is Prompt Engineering

4.2 Writing Good Prompts

  • 4.2.1 Be specific in what you want
  • 4.2.2 Provide additional context
  • 4.2.3 Use relevant technical terms
  • 4.2.4 Use Examples in your Prompt
  • 4.2.5 Iterate and Test your Prompts
  • 4.2.6 Specify Length, Format, etc.
  • 4.2.7 Prompt Engineering Roadmap

5. Tools / Actions

5.1 Tool Definition

  • 5.1.1 Name and Description
  • 5.1.2 Input / Output Schema
  • 5.1.3 Error Handling
  • 5.1.4 Usage Examples

5.2 Examples of Tools

  • 5.2.1 Web Search
  • 5.2.2 Code Execution / REPL
  • 5.2.3 Database Queries
  • 5.2.4 API Requests
  • 5.2.5 Email / Slack / SMS
  • 5.2.6 File System Access

6. Model Context Protocol (MCP)

6.1 Core Components

  • 6.1.1 MCP Hosts
  • 6.1.2 MCP Client
  • 6.1.3 MCP Servers

6.2 Creating MCP Servers

6.3 Deployment Modes

  • 6.3.1 Local Desktop
  • 6.3.2 Remote / Cloud

7. Agent Memory

7.1 What is Agent Memory?

7.2 Episodic vs Semantic Memory

7.3 Short Term Memory

  • 7.3.1 Within Prompt

7.4 Long Term Memory

  • 7.4.1 Vector DB / SQL / Custom

7.5 Maintaining Memory

  • 7.5.1 RAG and Vector Databases
  • 7.5.2 User Profile Storage
  • 7.5.3 Summarization / Compression
  • 7.5.4 Forgetting / Aging Strategies

8. Agent Architectures

8.1 Common Architectures

  • 8.1.1 RAG Agent
  • 8.1.2 ReAct (Reason + Act)
  • 8.1.3 Chain of Thought (CoT)

8.2 Other Architecture Patterns

  • 8.2.1 Planner Executor
  • 8.2.2 DAG Agents
  • 8.2.3 Tree-of-Thought

9. Building Agents

9.1 Manual (from scratch)

  • 9.1.1 Direct LLM API calls
  • 9.1.2 Implementing the agent loop
  • 9.1.3 Parsing model output
  • 9.1.4 Error & Rate-limit handling

9.2 LLM Native "Function Calling"

  • 9.2.1 OpenAI Function Calling
  • 9.2.2 OpenAI Assistants API
  • 9.2.3 Gemini Function Calling
  • 9.2.4 Anthropic Tool Use

9.3 Building Using Frameworks

  • 9.3.1 LangChain
  • 9.3.2 LlamaIndex
  • 9.3.3 Haystack
  • 9.3.4 AutoGen
  • 9.3.5 CrewAI
  • 9.3.6 Smol Depot

10. Evaluation and Testing

10.1 Metrics to Track

10.2 Unit Testing for Individual Tools

10.3 Integration Testing for Flows

10.4 Human in the Loop Evaluation

10.5 Frameworks

  • 10.5.1 LangSmith
  • 10.5.2 DeepEval
  • 10.5.3 Ragas

11. Debugging and Monitoring

11.1 Structured logging & tracing

11.2 Observability Tools

  • 11.2.1 LangSmith
  • 11.2.2 Helicone
  • 11.2.3 Langfuse
  • 11.2.4 OpenLLMetry

12. Security & Ethics

  • 12.1 Prompt Injection / Jailbreaks
  • 12.2 Tool sandboxing / Permissioning
  • 12.3 Data Privacy + PII Redaction
  • 12.4 Bias & Toxicity Guardrails
  • 12.5 Safety + Red Team Testing