TF
Tech Frontier

Building AI Agents is hard. So I built a 12-step visual guide to make it easy

Mar 12, 2026 986 views

Hey fellow devs! πŸ‘‹

We've all seen the hype around AI Agents (Claude Code, Cursor, OpenClaw, OpenHands, etc.), but when you actually try to build one from scratch, the documentation is scattered and the logic flow between planning, memory, and tool execution can be deeply frustrating.

I spent the last few weeks breaking this down into 12 progressive sessions β€” from a single while-loop to a fully autonomous multi-agent system. Here's the complete roadmap:

🧩 The 12 Sessions

The core pattern is simpler than you think. Every agent starts with this loop:

while True:
response = client.messages.create(messages=messages, tools=tools)
if response.stop_reason != "tool_use":
break
for tool_call in response.content:
result = execute_tool(tool_call.name, tool_call.input)
messages.append(result)

Then you layer complexity on top, one concept at a time:

# Session The Core Idea
01 The Agent Loop The minimal kernel: a while-loop + one tool
02 Tools New tools register into a dispatch map; the loop never changes
03 TodoWrite An agent without a plan drifts β€” list steps first, then execute
04 Subagents Isolated messages[] per subtask keeps the main context clean
05 Skills Inject knowledge via tool_result on-demand, not upfront in system prompt
06 Compact Context will fill up β€” a 3-layer compression strategy enables infinite sessions
07 Tasks File-based task graph with ordering, parallelism, and dependencies
08 Background Tasks Run slow operations async; the agent keeps thinking ahead
09 Agent Teams When one agent can't finish, delegate to persistent teammates via mailboxes
10 Team Protocols One request-response FSM pattern drives all team negotiation
11 Autonomous Agents Teammates scan the board and claim tasks themselves β€” no manual assignment
12 Worktree + Task Isolation Each agent works in its own directory; goals and directories bound by task ID

πŸ›  What makes this different from other tutorials?

Most guides either stay too simple (basic API calls) or jump straight to LangChain abstractions. This takes a build-from-scratch approach:

  • βœ… Full working Python code for every session (not snippets)
  • βœ… Interactive simulators β€” watch the agent loop execute step-by-step
  • βœ… Diff view between sessions β€” see exactly what changed and why
  • βœ… Based on real patterns from production systems like Claude Code

The stack is intentionally minimal: Python + Anthropic API + standard library. No framework magic hiding the important parts. Completely free and MIT licensed.

πŸ‘‰ Full guide: HowToAgent.net

What's the hardest architectural decision you've hit when building agents? For me it was context compression (Session 06) β€” would love to hear what tripped others up.