Security · ~7 min
A web page is a crowd of strangers' writing. If the page picks your next action, any stranger can. So fix the plan before the page loads.
ReAct interleaves reason and act: at each step the model observes the page, reasons, then chooses the next action. But that page combines a seller's listing, customer reviews, and sponsored ads — each authored by a different party, any of which can carry an injection. Because the page enters the prompt that selects the next action, an injection anywhere can redirect control flow.
Under plan-then-execute, the agent commits to a task-specific program before any page is observed. The program is a typed sequence of steps with known inputs, branches, and effects. Page content can populate values — the price to record, the option to select — but cannot synthesize new actions.
A malicious review is read by ExtractValue only if the program asks for it, and the extracted string
can never re-enter the planner. An injection might change the value recorded, but not which page is
visited or which button is clicked.
This is the same architectural family as CaMeL: a privileged channel carries control flow from the trusted user task; a quarantined channel handles untrusted content with no authority to alter what runs.
On the WebArena benchmark, every task is compatible with plan-then-execute, and 80% complete with a purely programmatic plan — no runtime LLM subroutines. The remaining 20% use bounded LLM calls (extraction, classification) inside a fixed control graph. The graph is set before execution either way.
Plan-then-execute closes the runtime control-flow path. It does not bound the blast radius of an action the plan legitimately authorizes, and the plan-construction phase plus any in-graph LLM subroutines remain attack surfaces. Treat it as a foundation that still needs Lesson 5's defense-in-depth — task-scoped tools, least privilege, sandboxed execution.
| When it weakens | Why |
|---|---|
| Unknown task structure | Open-ended discovery can't decompose before observation |
| Brittle target sites | DOM churn / A/B variants break pre-committed selectors |
| Low-stakes read-only browsing | No consequential action, no private data — ReAct's flexibility wins |
Browser primitives (click, type, scroll) carry page-dependent meaning — the same coordinate does different things on different pages — so plan-then-execute at the primitive layer is brittle. The real fix is typed, complete, auditable website APIs: tools that map to semantic actions with effects known before execution. Until those exist, the pattern runs against a less-than-ideal substrate.
Retrieval practice — recall, don't peek
Question 1ReAct is risky for web agents because the page…
Question 2Under plan-then-execute, untrusted page content can…
Question 3On WebArena, the share completable with a purely programmatic plan is…
Question 4Plan-then-execute is best treated as…
Question 5 · spaced recall from Lesson 5Egress decisions belong in the harness because…