Part 1 · When and How

Multi-Agent Systems · ~8 min

When Many Agents Beat One

A second agent feels like more horsepower. Usually it's more coordination cost for no quality return. The first skill is knowing when not to add one.

Why this, for you: the reflex when an agent struggles is to add another agent. This lesson installs the opposite reflex — climb the complexity ladder one rung at a time, and reach for multiple agents only when the task structure forces it. Get this wrong and you pay ~15× the tokens for output a single agent would have matched.

Multi-agent systems are not an upgrade you apply to a hard task. They are a specific tool for a specific shape of problem — and the evidence says they lose more often than they win when that shape isn't present.

1 Climb the ladder, don't jump to the top

Microsoft's orchestration guidance states the rule plainly: "Use the lowest level of complexity that reliably meets your requirements." Anthropic's Building Effective Agents gives the same escalation — "add multi-step agentic systems only when simpler solutions fall short." There are three rungs:

| Rung | What it is | Solves |
| --- | --- | --- |
| Direct model call | One prompt, no tools, no agent loop | Classification, summarization, single-step extraction |
| Single agent + tools | One agent that reasons, calls tools, loops until done | The right default for most tasks |
| Multi-agent | Several agents under an orchestrator or peer protocol | Only when prompt complexity, tool overload, or security boundaries break a single agent |

Each rung adds capability and failure surface. You only earn the next rung when the current one stops being reliable — not when the task merely looks intimidating.
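The escalation rule can be sketched as a small selection function: measure each rung on your own evals, then take the simplest one that clears your reliability bar. This is a minimal sketch, not a prescribed method; the rung identifiers, the `EvalResult` type, the success rates, and the 0.90 threshold are all hypothetical values chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional

# Rungs ordered from least to most complex, matching the ladder above.
RUNGS = ["direct_model_call", "single_agent_with_tools", "multi_agent"]


@dataclass
class EvalResult:
    rung: str
    success_rate: float  # fraction of your eval cases passed (hypothetical numbers below)


def lowest_reliable_rung(results: List[EvalResult], required: float) -> Optional[str]:
    """Return the simplest rung whose measured success rate meets the bar."""
    by_rung = {r.rung: r.success_rate for r in results}
    for rung in RUNGS:  # walk the ladder bottom-up and stop at the first reliable rung
        rate = by_rung.get(rung)
        if rate is not None and rate >= required:
            return rung
    return None  # no rung is reliable yet: revisit the task or the evals


# Hypothetical measurements for one task.
results = [
    EvalResult("direct_model_call", 0.62),
    EvalResult("single_agent_with_tools", 0.94),
    EvalResult("multi_agent", 0.95),
]
print(lowest_reliable_rung(results, required=0.90))
```

With these made-up numbers the multi-agent rung scores marginally higher, but the single agent already clears the bar, so the function stops there. The point of the ordering is that a higher score on a more complex rung is not a reason to climb.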

2 What multi-agent is actually good at

The justification for multiple agents is narrow and specific: a task that needs multiple independent directions at once. A review of 94 multi-agent software-engineering papers confirms parallelism and specialization as the primary rationale for going multi-agent over single-agent.

The common thread is independence. If subtask B needs subtask A's output, that's a chain, not a fan-out — parallelism buys you nothing and the agents just wait on each other.
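One way to test for independence before committing is to write down the subtask dependencies and group them into waves that could run at the same time. The sketch below is an illustration under assumptions: the `parallel_waves` helper and the example task names are hypothetical, and real tasks rarely come with a dependency map this clean.

```python
from typing import Dict, List, Set


def parallel_waves(deps: Dict[str, Set[str]]) -> List[List[str]]:
    """Group subtasks into waves; every task in a wave can run concurrently."""
    remaining = {task: set(d) for task, d in deps.items()}
    done: Set[str] = set()
    waves: List[List[str]] = []
    while remaining:
        # A task is ready once all of its dependencies have finished.
        ready = sorted(t for t, d in remaining.items() if d <= done)
        if not ready:
            raise ValueError("cyclic or unknown dependency")
        waves.append(ready)
        done.update(ready)
        for t in ready:
            del remaining[t]
    return waves


# Hypothetical fan-out: three research directions that do not need each other.
fan_out = {"pricing": set(), "competitors": set(), "regulation": set()}
# Hypothetical chain: B needs A's output, C needs B's output.
chain = {"A": set(), "B": {"A"}, "C": {"B"}}

for name, deps in [("fan_out", fan_out), ("chain", chain)]:
    waves = parallel_waves(deps)
    widest = max(len(w) for w in waves)
    print(name, waves, "max parallel width:", widest)
```

A maximum width of 1 means every step waits on the previous one, so extra agents have nothing to do in parallel. A wide first wave is the shape where a fan-out can pay off.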

The benefit is conditional, not automatic. A protocol-aligned evaluation across ten benchmarks found most multi-agent configurations underperformed a single-agent baseline — only one of six tested workflows beat it. Adding workers to a task that doesn't decompose cleanly buys coordination cost with no quality return.

3 The bill comes due in tokens

Coordination isn't free, and the price tag is large. Anthropic's research-system data reports token use of about 4× for a single agent and about 15× for multi-agent (orchestrator plus workers) relative to a plain chat interaction. In the same analysis, token usage alone explained roughly 80% of performance variance on the BrowseComp evaluation.

The payoff, when the shape is right

On genuinely complex research, Anthropic's internal evals showed Opus orchestrating Sonnet workers outperformed single-agent Opus by 90.2%. The architecture earns its 15× when the task is broad and parallelizable. It wastes 15× when the task is narrow or sequential.

So the decision is economic as much as architectural: the task's value has to justify a 15× compute bill, and that only holds when the work genuinely splits into independent directions.
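The economics can be roughed out before you build anything. The sketch below applies the ~4× and ~15× multipliers quoted above to a chat-sized baseline; the baseline token count and the price per million tokens are hypothetical placeholders, so substitute your own measurements and your provider's actual pricing.

```python
# Multipliers over a plain chat interaction, as quoted in this lesson (approximate).
SINGLE_AGENT_MULTIPLIER = 4.0
MULTI_AGENT_MULTIPLIER = 15.0


def estimated_tokens(chat_tokens: int, multiplier: float) -> int:
    """Scale a chat-sized token count by an architecture multiplier."""
    return round(chat_tokens * multiplier)


def estimated_cost(tokens: int, usd_per_million_tokens: float) -> float:
    """Convert a token count into dollars at a flat blended price."""
    return tokens / 1_000_000 * usd_per_million_tokens


chat_tokens = 2_000  # hypothetical: tokens a plain chat answer would use
price = 10.0         # hypothetical: blended USD per million tokens

single = estimated_tokens(chat_tokens, SINGLE_AGENT_MULTIPLIER)
multi = estimated_tokens(chat_tokens, MULTI_AGENT_MULTIPLIER)

print(f"single agent: {single} tokens, ${estimated_cost(single, price):.2f} per task")
print(f"multi-agent:  {multi} tokens, ${estimated_cost(multi, price):.2f} per task")
print(f"multi-agent vs single agent: {multi / single:.2f}x")
```

Note the last line: relative to a single agent rather than to chat, the step up is about 3.75×. That is the comparison that matters when a single agent is already your working default, and the task's value still has to cover it.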

↪ Your win: a default of one, justified before many

Retrieval practice — recall, don't peek

Question 1: The rule for picking a complexity level is to use the…

Question 2: Multi-agent's primary rationale over a single agent is…

Question 3: In the ten-benchmark evaluation, most multi-agent configurations…

Question 4: A rough token multiplier for orchestrator-plus-workers over chat is…

Question 5: A task where subtask B depends on subtask A's output should be…

Next in Part 1: The Orchestrator and Its Workers (the canonical multi-agent shape, in depth).