
A Hands-On Course · 10 lessons

Observability

See what your agent did — tracing, debugging, event sourcing, and evals.

Short lessons (~5–8 min each), each with one tangible win and a retrieval-practice quiz. Built for engineers who already use AI coding tools and want the non-obvious mechanics.

Grounded in the agentpatterns.ai corpus (CC BY 4.0). Keep the Glossary open as you go.

Part 1 · Making Agents Legible

1 Write and Hope
An agent that only reads code and test output is flying blind. Wire in signals it can see, and it stops guessing whether the fix worked.

2 Leaving a Trail
A long-running agent forgets across sessions. A progress file, git commits, and OTel traces give every fresh context window a record of what already happened.

Part 2 · The Source of Truth

3 The Log Is the Truth
Stop letting agents write files directly. Have them emit intentions to an append-only log; a deterministic orchestrator applies the effects, and you can replay the whole thing.

4 The Four Failure Modes
When an agent produces bad output, the bug is almost never the model. Classify which of four layers failed before you change anything.

Part 3 · Stopping the Bleeding

5 Breaking the Loop
Every iteration of a stuck agent looks like progress from the inside. Three layers (edit-count, doom-loop, iteration cap) stop it before the context window burns.

6 Gates That Catch Regressions
Observability tells you what happened. Evals tell you whether a change made things worse. Grade the final state, not the path, and know exactly when an LLM judge is trustworthy.

Part 4 · Many Agents, One Trace

7 Attributing the Context
"82% full" names a symptom, not a cause. Cut the window into the sources you can actually act on (rules, skills, MCP, subagents, history) and you prune the right one instead of compacting on reflex.

8 Catching the Wasted Run
A multi-agent run burns thousands of tokens before the grader ever sees the answer. Six trace signals tell you why a run is failing while budget remains to intervene, not after.

9 One ID Across the Trace
Six signals per agent are useless if you can't ask "show me everything this subagent did." A stable agent_id on every header and span makes the trace queryable by identity, with no tree walk required.

Capstone

10 Seeing the Whole Run
Nine techniques, one job: make the agent and the system it touches legible, whether solo or fanned-out. Here's the decision table for which to reach for, and a mixed review across the whole course.