Monolith to Sub-Agents

Your agent prototype is one big prompt in a loop. It works on your laptop and fails silently in production. Here is the five-step path out — applied in order, because each step exposes the failure the next one fixes.

Why this, for you: the prototype-to-production gap for agents is its own discipline. This lesson covers the checklist Google's ADK team published after rebuilding a real agent (sequencing, schemas, dynamic context, tracing, and bounded loops), plus the precondition that decides whether the refactor helps or hurts.

A monolithic agent is one linear script calling one LLM with one large prompt. Its primary failure mode is silent collapse: if any sub-task fails (an API timeout, a hallucination), the whole process stalls without reporting which step broke. Because every step shares the same prompt, a step-2 hallucination also quietly corrupts step-5's inputs.

1 The five steps, in order

#	Step	The failure it fixes
1	Replace the loop with sequenced sub-agents, one responsibility each	Silent collapse; the pipeline now surfaces which step failed
2	Push structured outputs into the schema, not the prompt	Fragile JSON parsing; tokens wasted re-stating the format
3	Replace hardcoded context with a dynamic retrieval pipeline	Corpus changes that force a redeploy
4	Add distributed tracing before production, not after	Black-box debugging of the first incident
5	Delegate loop boundaries to the framework's circuit breakers	Retry loops that burn the token budget in minutes

Google's ADK team documented this exact transition rebuilding "Titanium" — a sales-research agent — from a monolithic for loop into a five-node SequentialAgent pipeline: Company Researcher → Search Planner → Case Study Researcher → Selector → Email Drafter. Each boundary is a failure seam: a step succeeds under contract or raises.
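The failure-seam idea can be sketched without any framework. This is a minimal illustration, not ADK's SequentialAgent: `Step`, `StepFailed`, `run_pipeline`, and the stub step functions are hypothetical names, and each `run` callable stands in for what would be one LLM-backed sub-agent.

```python
from dataclasses import dataclass
from typing import Any, Callable


class StepFailed(Exception):
    """Raised when a named step fails, so the failure is attributable."""

    def __init__(self, step: str, cause: Exception):
        super().__init__(f"step '{step}' failed: {cause}")
        self.step = step
        self.cause = cause


@dataclass
class Step:
    name: str
    run: Callable[[Any], Any]  # hypothetical: one sub-agent call in practice


def run_pipeline(steps: list[Step], payload: Any) -> Any:
    """Run steps in order; each step sees only the previous step's output."""
    for step in steps:
        try:
            payload = step.run(payload)
        except Exception as exc:
            # Surface *which* seam broke instead of stalling silently.
            raise StepFailed(step.name, exc) from exc
    return payload


def failing_planner(_: Any) -> Any:
    # Stub that simulates an upstream API timeout.
    raise TimeoutError("search API timed out")


pipeline = [
    Step("company_researcher", lambda company: {"company": company}),
    Step("search_planner", failing_planner),
    Step("email_drafter", lambda plan: f"Draft based on {plan}"),
]

try:
    run_pipeline(pipeline, "Acme")
    failed_step = None
except StepFailed as err:
    failed_step = err.step  # names the step that raised
```

The design choice is that a step either returns a value for the next step or raises with its own name attached; nothing downstream runs on a failed step's output.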

2 Each seam is a contract, each loop is bounded

Move the output shape out of the prompt string and into a typed object the runtime validates (Pydantic, Vertex/Anthropic/OpenAI structured outputs). Wire OpenTelemetry before the first incident — "you cannot put an agent into production without live diagnostics." And don't hand-roll retry logic: every bug in your try/catch/retry handler is its own failure mode.
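To make the schema-as-contract idea concrete, here is a minimal sketch using only the standard library as a stand-in for Pydantic or a provider's structured-output feature. The `CaseStudy` shape, its fields, and `ContractError` are assumptions made up for illustration; a real pipeline would define the schema its own steps need.

```python
import json
from dataclasses import dataclass


class ContractError(ValueError):
    """The model's output did not satisfy the step's schema."""


@dataclass(frozen=True)
class CaseStudy:  # hypothetical output shape for one pipeline step
    title: str
    url: str
    relevance: float  # expected in [0.0, 1.0]

    @classmethod
    def from_json(cls, raw: str) -> "CaseStudy":
        try:
            data = json.loads(raw)
        except json.JSONDecodeError as exc:
            raise ContractError(f"not valid JSON: {exc}") from exc
        if not isinstance(data, dict):
            raise ContractError("expected a JSON object")
        expected = {"title": str, "url": str, "relevance": (int, float)}
        if set(data) != set(expected):
            raise ContractError(
                f"expected keys {sorted(expected)}, got {sorted(data)}"
            )
        for key, typ in expected.items():
            # bool is a subclass of int, so reject it explicitly.
            if isinstance(data[key], bool) or not isinstance(data[key], typ):
                raise ContractError(f"field '{key}' has the wrong type")
        if not 0.0 <= data["relevance"] <= 1.0:
            raise ContractError("relevance must be between 0 and 1")
        return cls(data["title"], data["url"], float(data["relevance"]))


good = CaseStudy.from_json(
    '{"title": "Acme rollout", "url": "https://example.com/acme", "relevance": 0.8}'
)

try:
    CaseStudy.from_json('{"title": "Acme rollout", "relevance": "high"}')
    rejected = False
except ContractError:
    rejected = True  # missing "url" key, so the contract is violated
```

The point of the sketch is where the check lives: the step boundary rejects malformed output before the next step can consume it, rather than a prompt asking politely for the right format.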

"If an agent hits an error and continually retries a prompt without strict boundaries, it will burn through your token budget in minutes." Use the framework's exponential backoff, timeout ceilings, and retry caps — not hand-written loops.
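The sketch below only makes the three bounds concrete (exponential backoff, a timeout ceiling, a retry cap); it is not a suggestion to hand-roll this in production, where the framework's own mechanism should be preferred. `call_with_bounds`, `RetryBudgetExceeded`, and all parameter names and defaults are hypothetical.

```python
import random
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class RetryBudgetExceeded(Exception):
    """All bounded attempts failed; the caller must handle it, not loop."""


def call_with_bounds(
    fn: Callable[[], T],
    max_attempts: int = 3,      # retry cap
    base_delay: float = 0.5,    # first backoff delay, in seconds
    max_delay: float = 8.0,     # ceiling on any single backoff delay
    deadline_s: float = 30.0,   # ceiling on total elapsed time
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    start = time.monotonic()
    last_error: Optional[Exception] = None
    for attempt in range(max_attempts):
        if time.monotonic() - start > deadline_s:
            break  # total time budget spent
        try:
            return fn()
        except Exception as exc:
            last_error = exc
            if attempt + 1 == max_attempts:
                break  # no sleep after the final attempt
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Small jitter so parallel callers do not retry in lockstep.
            sleep(delay + random.uniform(0, delay / 10))
    raise RetryBudgetExceeded(
        f"gave up after at most {max_attempts} attempts"
    ) from last_error


calls = {"n": 0}


def flaky() -> str:
    calls["n"] += 1
    raise TimeoutError("upstream timeout")


try:
    call_with_bounds(flaky, max_attempts=3, sleep=lambda _: None)
except RetryBudgetExceeded:
    pass  # the loop stopped on its own; calls["n"] records the attempts
```

Whatever implements it, the property that matters is that the number of model calls has a hard upper bound the agent cannot talk its way past.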

Decompose only loosely-coupled work

The refactor assumes sub-tasks are loosely coupled and independently verifiable. If the steps share dense mutable state — a coding agent editing interconnected files, a conversational agent whose turns depend on nuanced history — decomposition serializes that state across schemas and loses context the monolith carried implicitly. Cognition's argument against parallel multi-agents applies there.

Unstructured split is worse: the 17.2× error trap

Splitting into sub-agents without a defined topology — sequential, orchestrator-worker, or evaluator — amplifies errors, because each agent's hallucinations feed the next. One analysis measured up to a 17.2× error multiplier in "bag of agents" systems. The topology is the point, not the count.
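A back-of-the-envelope model shows why unchecked hand-offs compound. It does not reproduce the 17.2× figure or that analysis's method; it assumes each step errs independently with the same probability and that nothing between steps catches an error, which is a simplification.

```python
def chain_error_rate(per_step_error: float, steps: int) -> float:
    """Probability that at least one step in a chain errs.

    Assumes independent, equal per-step error rates and no validation
    between steps (illustrative assumptions, not measured values).
    """
    return 1.0 - (1.0 - per_step_error) ** steps


# Hypothetical numbers: a 5% per-step error rate across five chained steps.
single = chain_error_rate(0.05, 1)
chained = chain_error_rate(0.05, 5)
amplification = chained / single  # how much worse the chain is than one step
```

Under these assumptions a five-step chain fails several times as often as a single step, which is the intuition behind validating at every seam rather than only at the end.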

↪ Your win: a production-shaped pipeline

Sequence named sub-agents — one responsibility each, so failures become attributable seams.
Schema, not prompt — a runtime-validated contract kills fragile parsing and token waste.
Dynamic retrieval — the corpus refreshes without redeploying the agent.
Trace first — OpenTelemetry before the first incident; a black-box monolith is debuggable only in hindsight.
Framework circuit breakers — bounded retries beat hand-written loops, where token budgets go to die.

Retrieval practice — recall, don't peek

Question 1: The monolithic agent's primary failure mode is…

Question 2: Structured output belongs in the…

Question 3: Distributed tracing should be wired in…

Question 4: Splitting into sub-agents without a defined topology can produce up to a…

Question 5 · spaced recall from Lesson 8: Snapshot-rollback setup agents convert irreversible system mutations into reversible ones using…

Ask me anything. Want the loosely-vs-tightly-coupled test applied to your own prototype, or which structured-output primitive fits your provider? Next in Part 3: Garbage-Collecting Entropy — scheduled agents that keep a codebase from rotting.