Harness Engineering · ~8 min
Lesson 12 flagged the open problem: an orchestrator can burn ~15× the tokens of a chat, and a stuck agent will loop until the window fills. Here's the harness that caps the spend and trips the stop — the limit nobody taught you to wire in.
A cost control routes work to the cheapest tier that meets the task and caps the budget it can spend. A circuit breaker halts an agent loop when progress stalls — repeated errors, runaway cost, context exhaustion, or circular behavior. One bounds spend by design; the other bounds it by detection.
Model cost scales with tier and token volume. Top-tier models on every task waste compute; cheap models on complex tasks produce rework. The fix is to match capability to the task and pay up only when you must.
| Task | Tier |
|---|---|
| File search, exploration — high volume, low reasoning | Fast (e.g. Haiku) |
| Code implementation — balanced capability and speed | Balanced (e.g. Sonnet) |
| Architecture, complex refactoring — deep reasoning | Powerful (e.g. Opus) |
This rhymes with the reasoning sandwich of Lesson 11: spend the expensive resource where ambiguity is highest. Here the resource is dollars instead of reasoning tokens, but the discipline is identical — escalate on a cheap, deterministic signal, never on habit.
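The routing table and the escalate-on-a-signal rule can be sketched in a few lines. This is a minimal illustration, not any framework's API: the tier names, the task kinds, and the `run` and `passes` callables are all hypothetical placeholders for your own model calls and your own cheap check (a test run, a linter, a schema validation).

```python
from typing import Any, Callable, List, Optional, Tuple

# Hypothetical task kinds and tier labels; map them to real model IDs in your own config.
TIER_BY_TASK = {
    "search": "fast",
    "exploration": "fast",
    "implementation": "balanced",
    "architecture": "powerful",
    "refactor": "powerful",
}
ESCALATION_ORDER = ["fast", "balanced", "powerful"]


def pick_tier(task_kind: str) -> str:
    """Return the starting tier for a task kind; unknown kinds default to balanced."""
    return TIER_BY_TASK.get(task_kind, "balanced")


def run_with_cascade(
    task_kind: str,
    run: Callable[[str], Any],
    passes: Callable[[Any], bool],
) -> Tuple[Optional[Any], List[str]]:
    """Start at the cheapest suitable tier; escalate only when the check fails."""
    start = ESCALATION_ORDER.index(pick_tier(task_kind))
    attempts: List[str] = []
    for tier in ESCALATION_ORDER[start:]:
        result = run(tier)          # one attempt at this tier
        attempts.append(tier)
        if passes(result):          # cheap, deterministic signal decides escalation
            return result, attempts
    return None, attempts           # every tier failed the check
```

The point of the design is that `passes` is deterministic and cheap, so the decision to pay for a higher tier never depends on the model's own opinion of its output.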
Routing bounds the cost of healthy work. A stuck agent isn't healthy — it applies the same failed fix, retries a flaky test twenty times, consumes resources without progress until the window fills or the session is killed. A circuit breaker watches for the stall and halts it.
| Signal | Trips when… |
|---|---|
| Iteration limit | The agent has taken N steps without completing — maxTurns enforces it at the runtime level |
| Repeated failure | The same call fails the same way: after three consecutive HTTP 429 (rate limit) responses, an immediate retry will return 429 again |
| Repetition | The agent re-fetches a URL or re-reads a file with no new information — a stuck loop |
| Context budget | The window approaches the dumb zone of Lesson 17; trip when recall starts to drop, not at a fixed token count |
| Cost threshold | Spend exceeds the expected budget — overrun often correlates with looping |
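As a rough sketch, four of the signals in the table can be tracked by one small object that the harness updates after every step. The class name, field names, and default thresholds below are assumptions chosen for illustration. The context-budget signal is left out because it needs a recall measurement that depends on your setup, and the repetition check here is the naive count-based version.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class CircuitBreaker:
    # Illustrative defaults only; tune them from your own instrumentation.
    max_turns: int = 30          # iteration limit
    max_same_error: int = 3      # repeated failure
    max_same_action: int = 3     # repetition
    max_cost_usd: float = 5.0    # cost threshold
    turns: int = 0
    cost_usd: float = 0.0
    last_error: Optional[str] = None
    error_streak: int = 0
    action_counts: Counter = field(default_factory=Counter)

    def record(self, action: str, cost_usd: float,
               error: Optional[str] = None) -> Optional[str]:
        """Record one agent step; return the tripped signal's name, or None."""
        self.turns += 1
        self.cost_usd += cost_usd
        self.action_counts[action] += 1

        # Count consecutive identical errors; any different outcome resets the streak.
        if error is not None and error == self.last_error:
            self.error_streak += 1
        else:
            self.error_streak = 1 if error is not None else 0
        self.last_error = error

        if self.error_streak >= self.max_same_error:
            return "repeated_failure"
        if self.action_counts[action] >= self.max_same_action:
            return "repetition"
        if self.cost_usd > self.max_cost_usd:
            return "cost_threshold"
        if self.turns >= self.max_turns:
            return "iteration_limit"
        return None
```

A caller would invoke `record(...)` once per tool call and stop issuing new actions as soon as it returns a non-`None` value.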
When a breaker trips, degrade gracefully: stop new actions, return the partial results already completed, explain what triggered the stop and what remains, and escalate to a human if the pipeline has a gate. Partial results are more useful than nothing — never discard completed work.
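One way to make that degradation concrete is to have the harness build a small stop report instead of raising and discarding state. The `StopReport` type and `degrade` helper are hypothetical names for this sketch; the plan is assumed to be an ordered list of step names and the results a dict keyed by completed step.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class StopReport:
    reason: str                       # which signal tripped the breaker
    completed: List[str]              # steps that finished before the stop
    remaining: List[str]              # steps that were never run
    partial_results: Dict[str, Any]   # completed work, preserved as-is
    needs_human: bool                 # True when the pipeline has a human gate

    def summary(self) -> str:
        lines = [
            f"Stopped: {self.reason}",
            f"Completed: {', '.join(self.completed) or 'nothing'}",
            f"Remaining: {', '.join(self.remaining) or 'nothing'}",
        ]
        if self.needs_human:
            lines.append("Escalating to human review.")
        return "\n".join(lines)


def degrade(reason: str, plan: List[str], results: Dict[str, Any],
            has_human_gate: bool) -> StopReport:
    """Keep finished work, list what is left, and flag escalation if a gate exists."""
    completed = [step for step in plan if step in results]
    remaining = [step for step in plan if step not in results]
    return StopReport(reason, completed, remaining, dict(results), has_human_gate)
```

For example, `degrade("cost_threshold", ["scan", "patch", "test"], {"scan": "ok"}, True)` keeps the `scan` result, lists `patch` and `test` as remaining, and marks the report for human review.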
Where the breaker is enforced decides whether it can be ignored. maxTurns and session cost budgets are
enforced at the runtime — the model gets no vote, exactly like the hooks and permission rules of Parts 2–3. Error-rate
and repetition checks written as agent instructions depend on the model reading and obeying its own rules; if it
ignores them mid-reasoning, the stop never fires. Hooks sit in between: deterministic scripts that monitor and trigger.
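The difference between runtime enforcement and instruction can be shown with a toy driver loop. This is not the API of any particular agent runtime: `run_agent` and its `step` callable are assumptions for the sketch, where `step` performs one agent turn and reports whether the task is done and what the turn cost. The limits live in the loop, outside anything the model produces, so nothing the model says can extend them.

```python
from typing import Callable, Tuple


def run_agent(
    step: Callable[[int], Tuple[bool, float]],
    max_turns: int,
    budget_usd: float,
) -> Tuple[str, int, float]:
    """Drive the agent from outside; return (status, turns_used, spend_usd)."""
    spent = 0.0
    for turn in range(1, max_turns + 1):
        done, cost = step(turn)       # one model turn; its output cannot alter the limits
        spent += cost
        if done:
            return "completed", turn, spent
        if spent >= budget_usd:       # cost budget checked by the harness, not the model
            return "budget_exceeded", turn, spent
    return "max_turns", max_turns, spent
```

An instruction such as "stop after three identical errors" placed in the prompt has no equivalent guarantee, because it only fires if the model chooses to follow it.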
This is also the kill path Lesson 16 demanded. A narrowly-scoped agent still accumulates time-integrated damage between detection and shutdown — and a Kiteworks 2026 report found 60% of organizations can't terminate a misbehaving agent at all. A runtime-enforced breaker is the termination path the agent itself cannot block, closing the loop sandboxing left open.
Circuit breakers are failure-mode detectors, not correctness guarantees. A low maxTurns cuts off
legitimate multi-step refactors — production frameworks have open issues where agents halt mid-task on "max
iterations" while still making progress. Naive repetition checks fire on valid re-reads (re-reading a file after an
edit, refetching after a 429 backoff). A hard cost cap trips on a successful exploration run as readily as on a loop
— the signal is cost without progress, not cost alone. The steelman: if your agents already fail gracefully,
another stopping layer mostly adds false positives. Instrument first; add breakers where instrumentation shows real
loops, not prophylactically.
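One way to cut the false positives from valid re-reads is to count a repeat only when the agent gets back exactly what it already had. The sketch below is an assumption about how such a check might look, with a hypothetical `RepeatDetector` class: it hashes the returned content per target, so a re-read after an edit (new content) resets the count, while a re-read that returns identical bytes increments it.

```python
import hashlib
from typing import Dict, Optional, Tuple


class RepeatDetector:
    """Flag a target only after `limit` consecutive reads return identical content."""

    def __init__(self, limit: int = 3) -> None:
        self.limit = limit
        # target -> (digest of last content seen, consecutive identical reads)
        self.seen: Dict[str, Tuple[Optional[str], int]] = {}

    def observe(self, target: str, content: str) -> bool:
        digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
        last_digest, count = self.seen.get(target, (None, 0))
        # New information resets the streak; identical content extends it.
        count = count + 1 if digest == last_digest else 1
        self.seen[target] = (digest, count)
        return count >= self.limit
```

The same idea extends to the cost signal: track spend since the last observable progress (a passing test, a new file, a changed digest) rather than total spend.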
maxTurns and cost budgets the model can't override.
Retrieval practice: recall, don't peek
Question 1: Cost-aware routing sends high-volume, low-reasoning exploration to the…
Question 2: Cascade routing escalates to the capable model only after…
Question 3: Which stop signal is enforced at the runtime, not by instruction?
Question 4: When a circuit breaker trips, the agent should…
Question 5 · spaced recall from Lesson 12: Orchestrator-worker fan-out is worth its ~15× token cost only when subtasks are…
maxTurns with a cost budget for a worker pool? Next, the Capstone:
a symptom→move table that folds all nineteen lessons into one decision tool.