Every agent works in its own context window. The handoff between them is the one channel — and the single largest place multi-agent systems lose information.
Why this, for you: the biggest category of multi-agent failure isn't a bad agent — it's
inter-agent misalignment, ambiguity at one handoff that surfaces as a wrong action several agents
downstream. Two contracts fix it: a typed handoff at the boundary, and a plan-approval gate before
any write. Both move the cost of a mistake to the cheapest possible moment.
A research agent's findings don't automatically reach the writer. Each handoff is an information-loss
point: too little context and the next agent guesses wrong; too much and it drowns in noise. The contract is what the
upstream agent writes and the downstream agent reads.
1 Structure the handoff, don't forward the transcript
The receiving agent needs conclusions, not transcripts. Forwarding a raw exploration log inflates
its context with reasoning it didn't produce and can't efficiently parse. Define what each stage produces — common
fields across handoff formats:
What was done — scope of work completed
What was found — conclusions, not raw exploration
What needs attention — items the next agent must address
What is unresolved — open questions or blockers
A structured JSON or markdown schema beats prose notes: field extraction is deterministic
and doesn't depend on the receiver parsing free text. In longer pipelines this compounds — each stage of ambiguity
multiplies, so early structure prevents error propagation across multiple handoffs.
# research → writer handoff: conclusions and open items, not the trace
{ "stage": "research","findings": [ "Tier 1 capped at 50 RPM", "429 carries Retry-After" ],"needs_attention": [ "Verify Tier 4 limits by direct measurement" ],"unresolved": [ "Whether prompt caching affects RPM is undocumented" ] }# writer reads findings for content, unresolved as open Qs — not facts
Persistent media — GitHub issues, PRs, comments — make handoffs durable, reviewable, and survivable across context resets. Labels encode pipeline state.
2 Gate the first write: the plan-approval handshake
In a lead-and-teammates topology, each teammate works in its own window and self-claims tasks. A
plan-approval handshake gates the first write: the teammate produces a plan in read-only mode,
sends it to the lead, and only begins editing after approval. Rejections round-trip with feedback — no code written
during the rejection round.
What makes it a gate, not a suggestion: plan mode is enforced by the permission model,
not prompt discipline. The edit tools are structurally unavailable until the lead's approval changes the mode. A
prompt-level "please get approval first" is routinely violated under load.
Approval criteria live in the lead's prompt — "only approve plans that include test coverage,"
"reject plans that modify the database schema." The unique signal the lead adds over a generic critic is
cross-task coherence: it sees every teammate's plan against the shared task list, the only vantage
point in the team with that view. That's the cheapest intervention against two teammates colliding on the same file.
Both contracts have a backfire mode: ceremony
A schema on a single-stage task adds friction with no boundary to protect. A plan gate on a
one-file edit doubles round-trips to catch what a post-write diff would. And a rigid schema can
hide uncertainty — a well-formatted findings array reads as authoritative even when the upstream
agent was tentative. Protocols pay off only when work genuinely crosses boundaries and a mistake is expensive to undo.
↪ Your win: explicit boundaries, cheap-moment review
Hand off conclusions, not transcripts — what was done / found / needs attention / unresolved.
Prefer a typed schema — deterministic field extraction; ambiguity compounds across stages.
Use durable media for cross-session handoffs — issues and PRs survive context resets.
Gate the first write with plan approval — enforced by the permission model, criteria in the lead's prompt.
Skip both on solo, single-stage, low-stakes work — otherwise the contract is just ceremony.
Retrieval practice — recall, don't peek
Question 1The receiving agent at a handoff should be given…
Question 2A typed schema beats prose at a handoff because field extraction is…
Question 3The plan-approval handshake is a real gate because plan mode is enforced by…
Question 4The unique signal the lead adds over a generic critic is…
Question 5 · spaced recall from Lesson 4A forked subagent is the wrong choice for a child that must…
Ask me anything. Want a handoff schema for a specific pipeline of yours, or approval criteria to
put in a lead's spawn prompt? Next in Part 2: Verify-Gated Completion — who gets to decide the work is
actually done.