Where You Stand

There are exactly three places a human can stand relative to an agent loop. Two of them scale badly. The third is where the leverage lives — and reversibility tells you when each is right.

Why this matters for you: the instinct when an agent produces bad output is to fix the output. This lesson rewires that instinct. Knowing whether to fix the artifact or fix the thing that produced it is the difference between linear effort and compounding leverage.

Software delivery nests two loops. The why loop (idea → working software → evaluate) is always yours. The how loop (specs, code, tests → working software) is increasingly the agent's. Where you sit in the how loop decides your throughput.

1 Outside, in, or on the loop

Position	You…	The cost
Outside	delegate the whole how loop	agents in messy code spiral, take longer, cost more
In	gate each step — inspect every diff	throughput bottleneck; agents out-generate your review
On	engineer the harness that governs the loop	upfront investment; pays off across every future run

The defining distinction: when an output is wrong, "in the loop" means fixing the artifact; "on the loop" means fixing the harness that produced it. One fix helps this PR. The other helps every PR after it.
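A minimal sketch of that distinction, assuming a harness is nothing more than a list of checks every agent output must pass. The names `Harness`, `add_rule`, and `no_todo_markers` are hypothetical and do not come from any real tool; the point is only that a rule added once keeps running on every later output.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional

# A rule inspects an artifact (plain text here) and returns an error message,
# or None when the artifact passes. This shape is an assumption of the sketch.
Rule = Callable[[str], Optional[str]]


@dataclass
class Harness:
    rules: List[Rule] = field(default_factory=list)
    runs: int = 0
    failures: int = 0

    def add_rule(self, rule: Rule) -> None:
        """Capture a recurring human catch as a permanent check."""
        self.rules.append(rule)

    def evaluate(self, artifact: str) -> List[str]:
        """Run every rule so the agent can evaluate its own output."""
        errors = []
        for rule in self.rules:
            message = rule(artifact)
            if message is not None:
                errors.append(message)
        self.runs += 1
        if errors:
            self.failures += 1
        return errors

    def failure_rate(self) -> float:
        """Harness-level signal to review instead of individual artifacts."""
        return self.failures / self.runs if self.runs else 0.0


def no_todo_markers(artifact: str) -> Optional[str]:
    # Hypothetical catch a reviewer kept making by hand.
    return "unresolved TODO marker" if "TODO" in artifact else None


harness = Harness()
harness.add_rule(no_todo_markers)
first = harness.evaluate("def f(): pass  # TODO handle errors")
second = harness.evaluate("def f(): return 1")
```

Editing the first artifact by hand would have removed one TODO. Adding the rule means the same mistake is flagged on every later run without a human looking at it.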

"In the loop" scales poorly because agent throughput exceeds human review capacity. The transition to "on the loop" is three moves: instrument the loop (tests, checks so agents self-evaluate), capture recurring human catches as harness rules, and review harness performance, not per-artifact quality.

2 Reversibility decides where to gate

You don't gate everywhere — you gate where mistakes are expensive to undo. Map each step to its undo cost: reversible steps run free, irreversible ones get a human.

Action	Reversibility	Gate?
Create branch, write draft, open PR, post comment	Instant / easy	No
Merge PR, publish to live site	Hard	Yes
Delete data, send external notification	Impossible	Yes
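The table above can be sketched as a lookup from action to undo cost. The action identifiers and the choice to gate unknown actions by default are assumptions made for this example, not something the lesson prescribes.

```python
from enum import Enum


class Reversibility(Enum):
    EASY = "instant / easy"
    HARD = "hard"
    IMPOSSIBLE = "impossible"


# Hypothetical action identifiers mirroring the rows of the table.
UNDO_COST = {
    "create_branch": Reversibility.EASY,
    "write_draft": Reversibility.EASY,
    "open_pr": Reversibility.EASY,
    "post_comment": Reversibility.EASY,
    "merge_pr": Reversibility.HARD,
    "publish_live": Reversibility.HARD,
    "delete_data": Reversibility.IMPOSSIBLE,
    "send_external_notification": Reversibility.IMPOSSIBLE,
}


def needs_human_gate(action: str) -> bool:
    """Return True when a human should approve the action before it runs."""
    # Assumption: an action missing from the map is treated as irreversible,
    # so the gate fails closed rather than open.
    reversibility = UNDO_COST.get(action, Reversibility.IMPOSSIBLE)
    return reversibility is not Reversibility.EASY
```

For example, `needs_human_gate("open_pr")` returns `False`, while `needs_human_gate("merge_pr")` and any unlisted action return `True`.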

The human gate is a decision review, not an execution review: it asks "is this the right change?", not "is the Markdown valid?" (CI handles that). Execution review is waste; decision review is value. Progressive trust then migrates a workflow from in the loop (week 1) to on the loop (month 1) to outside the loop for proven tasks (month 3), but each promotion happens only after reliability is demonstrated at each level.
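Progressive trust can be sketched as a promotion rule that moves a workflow up one level at a time. The thresholds below (`min_runs=50`, `min_success_rate=0.98`) and the level identifiers are illustrative assumptions; the lesson gives a rough timeline but no numeric bar.

```python
# Ordered from most to least human involvement. Names are hypothetical.
LEVELS = ["in_the_loop", "on_the_loop", "outside_the_loop"]


def next_level(
    level: str,
    runs: int,
    successes: int,
    min_runs: int = 50,             # assumed minimum evidence before promoting
    min_success_rate: float = 0.98,  # assumed reliability bar
) -> str:
    """Promote by one level only when reliability at the current level is shown."""
    index = LEVELS.index(level)
    if index == len(LEVELS) - 1:
        return level  # already fully delegated
    if runs == 0 or runs < min_runs:
        return level  # not enough evidence yet
    if successes / runs < min_success_rate:
        return level  # evidence exists, but reliability is too low
    return LEVELS[index + 1]
```

A workflow never skips a level: even a perfect record at `in_the_loop` earns `on_the_loop`, and the counters would start again from there.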

On-the-loop isn't a universal remedy

Microsoft's Azure SRE Agent team found that heavy harness scaffolding (pre-written queries, curated tools) produced strong benchmarks but a low ceiling: "every prewritten query was a place we told the model not to think." Their breakthrough came from removing scaffolding. Harness investment should encode constraints (tests, linters, rules), not decisions. The payoff of that investment inverts for novel problem classes, rapidly shifting model capability, and small or short-lived projects.

The gate that exists only on the diagram

Gates degrade under load. Rubber-stamping (reviewers approve hundreds of actions reflexively), automation complacency (the more reliable the agent looks, the less vigilant you become), and mismatched cadence (one human can't supervise tens of actions a minute) all turn a gate into theater. Rotate reviewers, inject negative samples, and prefer async on-the-loop monitoring once you've measured the error rate.
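One way to read "inject negative samples" is to seed a review queue with items known to be bad and measure how many the reviewer rejects. The sketch below assumes that reading; `seed_queue`, `catch_rate`, and the sample items are hypothetical.

```python
import random
from typing import Callable, List, Tuple

# Each queue entry pairs an item with a flag saying whether it was seeded.
QueueItem = Tuple[str, bool]


def seed_queue(
    real_items: List[str], known_bad_items: List[str], rng: random.Random
) -> List[QueueItem]:
    """Mix known-bad items into the real queue so reviewers cannot spot them by position."""
    queue = [(item, False) for item in real_items]
    queue += [(item, True) for item in known_bad_items]
    rng.shuffle(queue)
    return queue


def catch_rate(queue: List[QueueItem], approves: Callable[[str], bool]) -> float:
    """Fraction of seeded bad items the reviewer rejected."""
    seeded = [item for item, is_seeded in queue if is_seeded]
    if not seeded:
        raise ValueError("queue contains no seeded items to measure against")
    caught = sum(1 for item in seeded if not approves(item))
    return caught / len(seeded)


queue = seed_queue(
    ["fix typo", "bump version", "add test"],
    ["DROP TABLE users", "DROP TABLE orders"],
    random.Random(0),
)


def rubber_stamp(item: str) -> bool:
    return True  # approves everything reflexively


def careful(item: str) -> bool:
    return "DROP TABLE" not in item


rubber_stamp_rate = catch_rate(queue, rubber_stamp)
careful_rate = catch_rate(queue, careful)
```

A catch rate near zero suggests the gate has become theater; a measured rate is also the number to have in hand before moving to async monitoring.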

↪ Your win: fix the harness, not the artifact

Know your position — outside, in, or on the loop — and that "on" compounds.
When output is wrong, fix what produced it — a rule beats a one-off edit.
Gate by reversibility — free on reversible steps, human on irreversible ones.
Review decisions, not execution — "is this right?" not "is the syntax valid?"
Encode constraints, not decisions — pre-computing the answer space lowers the ceiling.

Retrieval practice — recall, don't peek

Question 1: The human always owns the…

Question 2: "On the loop" means that when output is wrong, you…

Question 3: The most reliable signal for where to place a gate is…

Question 4: Heavy harness scaffolding backfired for the Azure SRE team by…

Question 5 (spaced recall from Lesson 2): The reasoning sandwich puts the lowest reasoning budget on the…
