Verifying Agent Work · ~7 min
"Build passed. Tests green. No issues found." The agent is fluent, confident, and — often enough to matter — wrong. This course is about never taking its word for it.
Agent output is seductively plausible. Well-formatted prose with inline citations looks authoritative; code that compiles looks correct. None of those surface signals correlate reliably with correctness — fluency is a separate objective from accuracy.
The mistake is using output quality as a proxy for accuracy. An agent can produce hallucinated URLs, fabricated statistics, and plausible-but-wrong claims, all in grammatically perfect prose. The danger peaks when the agent is almost right: a fully wrong answer is easy to catch, while a mostly correct answer with one subtle error can propagate undetected.
Users systematically overestimate LLM accuracy when shown default explanations, and longer explanations inflate confidence further without improving the answer (Steyvers et al., 2025, Nature Machine Intelligence). A follow-up found the same in human-AI teams: fluent explanations raised confidence with no accuracy gain beyond the prediction alone. The explanation persuades you; it doesn't verify the work.
"Build passed. Tests green." is a claim made inside the conversation — and the conversation has no way to check itself. The agent may hallucinate that checks passed, skip steps silently, or assert a result without ever running the tool. Practitioners report agents that claim fixes for code that was never changed and insist tests pass when the transcript shows failures.
The fix is not to re-read what the agent wrote; it is to check the work against something outside the agent.
If a property can be checked programmatically, check it automatically. Linters, type checkers, and test suites are verification, not overhead. The rest of this course is how to make that checking mechanical — run by the harness, not hoped for from the model.
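As a rough illustration of checking done by the harness rather than the model, the sketch below runs each check as a subprocess and judges it by exit code alone. The names `CheckResult`, `run_check`, and `verify` are hypothetical and not part of any particular agent framework, and the commands to run are whatever your project already uses. The agent's own claim is accepted as an argument only to show that it plays no part in the verdict.

```python
import subprocess
import sys
from dataclasses import dataclass


@dataclass
class CheckResult:
    name: str
    passed: bool
    output: str


def run_check(name: str, command: list[str], timeout: float = 300.0) -> CheckResult:
    """Run one check as a subprocess and judge it by its exit code alone."""
    try:
        proc = subprocess.run(command, capture_output=True, text=True, timeout=timeout)
    except (OSError, subprocess.TimeoutExpired) as exc:
        # A check that cannot start or does not finish is a failure, never a pass.
        return CheckResult(name, False, str(exc))
    return CheckResult(name, proc.returncode == 0, proc.stdout + proc.stderr)


def verify(agent_claim: str, checks: dict[str, list[str]]) -> bool:
    """Return True only if at least one check ran and every check passed.

    `agent_claim` is deliberately ignored when computing the result:
    the narration is not evidence.
    """
    results = [run_check(name, command) for name, command in checks.items()]
    for result in results:
        status = "PASS" if result.passed else "FAIL"
        print(f"{status}  {result.name}", file=sys.stderr)
    # An empty check list verifies nothing, so it must not count as success.
    return bool(results) and all(r.passed for r in results)
```

A call might look like `verify("Build passed. Tests green.", {"tests": ["pytest", "-q"], "types": ["mypy", "."]})`, assuming those tools are installed in the project. The design choice worth noting is that a missing tool, a timeout, or an empty check list all count as failure, so silence can never be mistaken for a green result.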
Constant verification has a cost. Applying production-auth scrutiny to a throwaway script destroys the productivity benefit. The failure on the other side is verification theater — running tests that don't cover the change, then treating a green suite as ground truth. The fix is calibrated verification: always verify high-stakes, irreversible, security-critical output; spot-check the proven; automate the rest.
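One way to picture calibrated verification is as a small policy function. The sketch below is an assumption-laden illustration, not a prescribed scheme: the `Change` fields, the `Level` names, and the helper `uncovered_lines` are all hypothetical, and how you decide that a change is "proven" or which lines the tests executed is left to your own tooling.

```python
from dataclasses import dataclass
from enum import Enum


class Level(Enum):
    AUTOMATED_ONLY = "automated checks only"
    SPOT_CHECK = "automated checks plus a sampled review"
    FULL_REVIEW = "automated checks plus a full review"


@dataclass(frozen=True)
class Change:
    high_stakes: bool        # e.g. touches production auth
    irreversible: bool       # cannot be rolled back once applied
    security_critical: bool
    proven: bool             # this kind of change has a track record (assumed to be tracked elsewhere)


def verification_level(change: Change) -> Level:
    """Map a change to a verification level: always verify the risky, spot-check the proven."""
    if change.high_stakes or change.irreversible or change.security_critical:
        return Level.FULL_REVIEW
    if change.proven:
        return Level.SPOT_CHECK
    return Level.AUTOMATED_ONLY


def uncovered_lines(changed: set[int], executed: set[int]) -> list[int]:
    """Return changed line numbers that no test executed.

    A non-empty result means a green suite says nothing about those lines,
    which is the verification-theater case.
    """
    return sorted(changed - executed)
```

For example, `verification_level(Change(False, False, False, True))` yields `Level.SPOT_CHECK`, and `uncovered_lines({10, 11, 12}, {1, 2, 10})` yields `[11, 12]`, flagging two changed lines the suite never touched.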
Retrieval practice — recall, don't peek
Question 1: Polished, fluent agent output correlates with…
Question 2: An agent is most dangerous when its answer is…
Question 3: A checkpoint that only reads the agent's "all tests pass" narration is…
Question 4: Longer agent explanations were shown to…
Question 5: To verify a cited claim, the right move is to…