Part 2 · Scaling Out

Agentic Workflows · ~7 min

Becoming a Tech Lead

Running several agents at once doesn't make you a faster coder. It makes you a tech lead — and your review bandwidth, not agent speed, becomes the thing that's actually scarce.

Why this, for you: the move from one agent to many is the move from contributor to orchestrator. This lesson is what changes (the bottleneck), what it costs (coordination overhead), and the infrastructure you need before you fan out — so you don't trade typing for chaos.

At peak, OpenAI's Sora team ran simultaneous Codex sessions on playback, search, error handling, and tests. The experience was "uncannily similar to being a tech lead with several new engineers, all making progress, all needing guidance." The human stops producing code and starts producing decisions.

1 The bottleneck moves to you

Agents run in parallel at no attention cost to one another. Humans move between sessions serially, and every switch carries cognitive overhead. So the constraint becomes your capacity for high-quality review and decisions, not the number of sessions available.

Brooks's Law applies to agents: adding sessions increases coordination overhead, and linear speedup is not guaranteed. Beyond a handful of concurrent sessions, the cost of staying current can exceed the parallelism benefit.
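A toy model makes the diminishing return concrete. It is a sketch, not a measurement: it assumes one human reviews everything, that every open session costs a fixed number of minutes per hour just to stay current, and that each finished task needs a fixed review time. The function name and all default numbers are hypothetical.

```python
def net_throughput(sessions: int,
                   tasks_per_session_hour: float = 1.0,
                   review_minutes_per_task: float = 10.0,
                   switch_minutes_per_session: float = 4.0) -> float:
    """Tasks merged per hour when a single human reviews all sessions.

    Every parameter is an illustrative assumption, not a measured value.
    """
    if sessions <= 0:
        return 0.0
    # What the agents can produce in an hour, ignoring the human.
    produced = sessions * tasks_per_session_hour
    # Minutes per hour the human spends just staying current on open sessions.
    overhead = sessions * switch_minutes_per_session
    # Whatever is left of the hour is available for actual review.
    review_budget = max(0.0, 60.0 - overhead)
    reviewable = review_budget / review_minutes_per_task
    # Merged work is capped by the smaller of production and review capacity.
    return min(produced, reviewable)


for n in range(1, 9):
    print(n, round(net_throughput(n), 2))
```

With these assumed numbers, throughput climbs until four sessions, is flat at five, and falls after that, because the hour is increasingly spent staying current rather than reviewing. Different numbers move the peak; the shape is the point.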

The practical responses: design sessions to batch their questions rather than interrupt constantly, structure tasks so a session makes extended independent progress before needing input, and spend your time on decisions only you can make — not on reviewing work the agent handles reliably.
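One way to picture batching is a per-session inbox that holds questions until there are enough to justify an interruption, or until one of them is genuinely blocking. This is a minimal sketch; `SessionInbox`, `batch_size`, and the `blocking` flag are hypothetical names, not part of any agent tool.

```python
from dataclasses import dataclass, field


@dataclass
class SessionInbox:
    """Collects one session's questions and releases them as a batch (hypothetical helper)."""
    name: str
    batch_size: int = 3
    pending: list[str] = field(default_factory=list)

    def ask(self, question: str, blocking: bool = False) -> list[str]:
        """Queue a question; return a batch only when an interruption is justified."""
        self.pending.append(question)
        if blocking or len(self.pending) >= self.batch_size:
            # Hand over everything queued so far and start a fresh batch.
            batch, self.pending = self.pending, []
            return batch
        return []  # nothing surfaced yet; the session keeps working


inbox = SessionInbox("search")
print(inbox.ask("Prefer fuzzy or exact matching?"))
print(inbox.ask("OK to add an index on title?"))
print(inbox.ask("Schema change needs sign-off", blocking=True))
```

The first two calls surface nothing; the blocking question releases all three at once, so the human pays one context switch instead of three.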

2 Assistant model vs. factory model

The shift has a name. In the assistant model, one human watches one agent — the human is the feedback loop, so agent speed is bounded by human response time. In the factory model, one human orchestrates many sessions and automated systems — tests, CI, linters — are the primary feedback. The human reviews asynchronously.

The factory model is infrastructure, not a mindset change. It needs four things: automated feedback loops authoritative enough for agents to self-correct; monitoring that signals blocked, failed, or finished without you watching; task isolation (worktrees, next lesson); and skill libraries that substitute documented conventions for real-time clarification.
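The self-correction loop and the blocked/failed/finished signal can be sketched together. The code below assumes the agent and the gate are plain callables: `attempt` stands in for an agent session and `gate` stands in for tests or CI. Both, along with `run_with_gate` and the `Status` enum, are hypothetical stand-ins rather than any real tool's API.

```python
from enum import Enum
from typing import Callable, Optional


class Status(Enum):
    FINISHED = "finished"
    BLOCKED = "blocked"
    FAILED = "failed"


def run_with_gate(attempt: Callable[[str], str],
                  gate: Callable[[str], Optional[str]],
                  max_rounds: int = 3) -> tuple[Status, str]:
    """Let a session retry against an automated gate before involving a human.

    attempt(feedback) returns a candidate change. gate(candidate) returns None
    on pass or an error message on failure.
    """
    feedback = ""
    for _ in range(max_rounds):
        try:
            candidate = attempt(feedback)
        except Exception as exc:
            # The session itself crashed; report it rather than retrying.
            return Status.FAILED, str(exc)
        error = gate(candidate)
        if error is None:
            return Status.FINISHED, candidate
        feedback = error  # the gate, not the human, drives the next attempt
    # Out of rounds: surface the last gate message for a human decision.
    return Status.BLOCKED, feedback


def toy_attempt(feedback: str) -> str:
    # Pretend agent: fixes the bug once the gate tells it what was expected.
    return "return a + b" if "expected 5" in feedback else "return a - b"


def toy_gate(candidate: str) -> Optional[str]:
    # Pretend test suite with exactly one passing answer.
    return None if candidate == "return a + b" else "add(2, 3): expected 5, got -1"


print(run_with_gate(toy_attempt, toy_gate))
```

The human only sees the returned status. A session that keeps failing the gate comes back as blocked with the last error attached, which is the moment a decision is actually needed.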

The throughput is real — under conditions

Anthropic's multi-agent research system (lead + parallel subagents) outperformed single-agent Claude Opus 4 by 90.2% on internal research evals and cut research time up to 90% for complex queries. The mechanism is removing sequential bottlenecks on independent work with automated feedback — not adding agents to inherently sequential tasks.

Where the factory breaks: the 41–87% failure band

The factory assumes automatable feedback. It fails on exploratory goals, tasks needing frequent guidance, undocumented tacit knowledge, and flaky verification (agents optimize to pass the gate, not to solve the problem). The MAST taxonomy, built from 1,600+ traces across seven frameworks, found per-framework failure rates of 41–87%. Of the failures it categorized, 41.77% traced to specification ambiguity, 36.94% to coordination breakdowns, and 21.30% to verification gaps. Unambiguous specs and deterministic verification are the floor.
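A cheap first check on "deterministic verification" is to run the same gate several times on unchanged input and see whether the verdicts agree. The sketch below assumes the gate can be wrapped as a zero-argument callable returning pass or fail; `is_deterministic` is a hypothetical helper, and agreement across a few runs is evidence of stability, not proof.

```python
from itertools import cycle
from typing import Callable


def is_deterministic(check: Callable[[], bool], runs: int = 5) -> bool:
    """Return True when repeated runs of the same check all give the same verdict."""
    if runs < 2:
        raise ValueError("need at least two runs to compare verdicts")
    # Collect the distinct verdicts; one distinct value means every run agreed.
    verdicts = {check() for _ in range(runs)}
    return len(verdicts) == 1


def stable_check() -> bool:
    return True


_alternating = cycle([True, False])


def flaky_check() -> bool:
    # Simulated flaky gate: passes and fails on alternate runs.
    return next(_alternating)


print(is_deterministic(stable_check))
print(is_deterministic(flaky_check))
```

A gate that flips between runs should be fixed or quarantined before any session is allowed to self-correct against it.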

↪ Your win: orchestrate, don't type

Retrieval practice — recall, don't peek

Question 1: With several parallel sessions, the actual throughput constraint becomes…

Question 2: Brooks's Law applied to agents says that adding sessions…

Question 3: In the factory model, the primary feedback source is…

Question 4: Parallelism produces gains specifically by…

Question 5 · spaced recall from Lesson 3: The human gate at merge should be a…

Ask me anything. Want the scoped task-prompt template that tells a session when to stop and report, or how TeammateIdle hooks let you monitor a fleet without watching? Next in Part 2: Sandboxes for Swarms — git worktrees and the single-branch counterpoint.