Part 3 · Safety & State

Harness Engineering · ~7 min

Verification Gates

An agent will tell you it's done. The harness's job is to not believe it — and to make "done" mean a deterministic check passed.

Why this, for you: the most common silent failure in autonomous coding is an agent that declares victory on broken code. A verification gate makes "complete" mean something the harness can check — tests pass, build exits zero — so the model can't talk its way to done.

Models skew positive on their own work. Practitioners report agents that claim a fix for code they never changed and insist tests pass when the transcript shows failures. A checkpoint that reads the agent's self-report is not a checkpoint. Anchor "is it done?" to deterministic signals.

1 The completion gate

The fix for premature completion is a single rule the harness enforces: passing tests are the gate, not the agent's say-so. Anthropic's long-running coding harness makes this explicit in the prompt and the loop — a feature is not "done" until its tests pass, and the agent reads git log + a progress file before it's allowed to move on.

Don't gate on narration ("I fixed the bug", "all tests pass"). Gate on outcome evidence: git diff, build exit codes, test output. Cross-reference every claim against that evidence. The signal must be more reliable than the thing it's checking; compilers and tests qualify, an unconstrained model judging itself does not.
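A minimal shell sketch of that rule, assuming the check is any command whose exit code you trust. The name `gate_on_evidence` and its argument layout are hypothetical, not part of any harness API. The agent's narration is accepted only so it can be logged beside the verdict; it never influences the decision.

```shell
# Hypothetical helper: decide "done" from a command's exit code,
# never from what the agent says about its own work.
# Usage: gate_on_evidence "<agent narration>" <command> [args...]
gate_on_evidence() {
  local narration="$1"
  shift
  # Run the evidence command; only its exit status matters here.
  if "$@" >/dev/null 2>&1; then
    echo "PASS"
    return 0
  fi
  # The narration is echoed for the log, but it cannot change the verdict.
  echo "FAIL (agent said: ${narration})"
  return 1
}
```

In a real project the command might be `npm test --silent` or a build step. To catch a claimed fix that touched no files, `git diff --quiet` is a useful probe: it exits 1 when the working tree has changes and 0 when it has none.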

2 Where the gate lives: the Stop hook

You already met the mechanism in Lesson 03. A Stop hook fires when the agent tries to end its turn, which is the exact moment it wants to declare completion. Exit code 2 blocks the stop and feeds the hook's stderr message back to the agent, sending it back to work instead of letting it finish on red.

    # .claude/hooks/require-green-tests.sh: a Stop hook
    if ! npm test --silent; then
      echo "Tests are failing. Fix them before ending the turn." >&2
      exit 2   # blocks Stop; the agent keeps going
    fi
    exit 0     # green: the turn is allowed to end

This is the harness saying "you don't get to call it done while the suite is red," deterministically, every time. The same shape works as a CI gate on the PR and as a SubagentStop gate on delegated work.
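A slightly richer variant of the hook above, as a sketch rather than a reference implementation. It assumes the agent is more likely to fix the failure if the reason it receives includes the tail of the failing output. The function name `stop_gate` and the 20-line tail are illustrative choices.

```shell
# Hypothetical Stop-hook body: run a check and, on failure, return 2 with
# the last lines of its output on stderr so the agent sees why it was blocked.
# Usage: stop_gate <command> [args...]
stop_gate() {
  local log
  log=$(mktemp)
  if "$@" >"$log" 2>&1; then
    rm -f "$log"
    return 0              # green: allow the stop
  fi
  echo "Check failed: $*" >&2
  tail -n 20 "$log" >&2   # tail of the failing output, used as the reason
  rm -f "$log"
  return 2                # the hook script should exit with this code to block
}
```

Inside a hook script this would be called as `stop_gate npm test --silent; exit $?`. Because the check is passed in as arguments, the same function could back a SubagentStop gate with a different command.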

3 Check at each step, not at the end

One big gate at the finish line is too late. An agent that writes 500 lines before any check may have made a wrong assumption at line 10 — everything after is built on it, and unwinding the cascade is expensive.

Error cost grows with distance from the error. A type mismatch caught at the point of introduction is a one-line fix; the same mismatch found after ten functions depend on it means auditing every callsite. Verify after each meaningful unit, not once at the end.

So layer the gates: a fast per-edit check (PostToolUse lint/typecheck), a per-feature completion gate (Stop + tests), and a CI gate on the PR. Each catches what the previous missed.
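One way to picture the layering is a single dispatcher that picks a check set by stage. This is a sketch under assumptions: the stage names, the function `gate_for_stage`, and the default commands are placeholders for a Node/TypeScript project, and `npm run test:matrix` is a hypothetical script name that your project would have to define.

```shell
# Placeholder commands; override these variables for your project.
: "${LINT_CMD:=npx eslint .}"
: "${TYPECHECK_CMD:=npx tsc --noEmit}"
: "${TEST_CMD:=npm test --silent}"
: "${MATRIX_CMD:=npm run test:matrix}"   # hypothetical script name

# Usage: gate_for_stage edit|feature|pr
gate_for_stage() {
  case "$1" in
    # Fast checks, cheap enough to run after every edit.
    edit)    sh -c "$LINT_CMD" && sh -c "$TYPECHECK_CMD" ;;
    # Completion gate for one feature: the test suite.
    feature) sh -c "$TEST_CMD" ;;
    # Merge gate: tests plus the slow matrix.
    pr)      sh -c "$TEST_CMD" && sh -c "$MATRIX_CMD" ;;
    *)       echo "unknown stage: $1" >&2; return 64 ;;
  esac
}
```

A PostToolUse hook would call `gate_for_stage edit`, the Stop hook `gate_for_stage feature`, and the CI job `gate_for_stage pr`, so each layer stays cheap enough for how often it runs.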

Gates aren't free — three ways they backfire

  • Unit too small: checking after every line suppresses exploration.
  • Weak verifier: an LLM judge that hallucinates rejects correct work and blesses wrong work; the gate must be stronger than the generator.
  • Latency drag: if every iteration waits on a multi-minute matrix, agents batch fixes into huge unreviewable diffs. Keep pre-commit checks fast; move the heavy matrix to the merge gate.
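To notice latency drag before it changes agent behavior, a gate can time itself. The sketch below assumes a budget in whole seconds and a hypothetical wrapper named `timed_check`. It never fails a check for being slow; it only warns, so you know which checks to move to the merge gate.

```shell
# Hypothetical wrapper: run a check, preserve its exit code, and warn on
# stderr when it takes longer than the given budget (whole seconds).
# Usage: timed_check <budget_seconds> <command> [args...]
timed_check() {
  local budget="$1"
  shift
  local start end elapsed rc=0
  start=$(date +%s)
  "$@" || rc=$?
  end=$(date +%s)
  elapsed=$((end - start))
  if [ "$elapsed" -gt "$budget" ]; then
    echo "slow check (${elapsed}s > ${budget}s budget): $*" >&2
  fi
  return "$rc"
}
```

For example, `timed_check 10 npm test --silent` keeps the test verdict intact while flagging runs that exceed ten seconds.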

↪ Your win: make "done" a deterministic check

  • Gate completion on tests, not narration — a Stop hook that exits 2 on red sends the agent back to work.
  • Cross-reference every claim against evidence (git diff, exit codes, test output); ignore "I fixed it."
  • Verify per unit, not per session — catch the wrong assumption at line 10, not line 500.
  • Layer fast + slow gates — per-edit lint, per-feature tests, per-PR CI.
  • Keep the verifier stronger than the generator — compilers and tests qualify; a model grading itself doesn't.

Retrieval practice — recall, don't peek

Question 1: A reliable completion gate anchors "done" to…

Question 2: A Stop hook that exits 2 when tests fail will…

Question 3: Verifying at each step beats verifying at the end because…

Question 4: For a verification gate to work, the verifier must be…

Question 5 · spaced recall from Lesson 05: On OverEager-Bench, the biggest driver of overeager actions was…

Next in Part 3: Long-Running Agents, keeping state and recovering across sessions that outlive a context window.