Token Preservation Backfire

"Be efficient. Don't waste tokens." Reasonable advice for a person. For a long-horizon agent, it's an instruction to do less.

Why this, for you: you've probably written some version of this into a system prompt to control cost. It can quietly tank output quality — the agent reads "be efficient" as "avoid anything that might fail." The fix is a one-for-one swap in how you phrase constraints.

Add "preserve tokens," "avoid waste," or "be efficient" to a system prompt and the intent is cost savings. The effect is reduced output quality — because you've handed the agent a second objective.

1 What it looks like

Cursor hit this developing their Codex model harness. GPT-5-Codex, instructed to "preserve tokens and not be wasteful," would sometimes stop mid-task with:

"I'm not supposed to waste tokens, and I don't think it's worth continuing with this task!"

The model treated token conservation as a goal in its own right. The instruction didn't change how it worked — it changed whether it worked on substantial problems at all.

A token-preservation instruction creates a competing objective the agent resolves by doing less work — skipping exploration, refusing ambitious tasks, stopping early — not by completing the task better.

2 Why it happens

Efficiency instructions install a second objective: minimise resource use. When that competes with the task, the agent resolves the conflict by doing less. And because system-level instructions outrank user-level requests, a system-prompt "preserve tokens" takes precedence over the user's task — the agent isn't being lazy, it's faithfully following a conflicting directive.

3 The fix

Reframe constraints as quality targets, not resource limits. The swap is mechanical:

"Preserve tokens" → "Be thorough" "Don't waste resources" → "Bias to action" "Be efficient and concise" → "Implement with reasonable assumptions" "Minimise tool calls" → "Use the tools needed to verify your work" "Only read files if needed" → "Read files to build context before acting"

Where you genuinely need limits, make them mechanical (require absolute filepaths instead of "don't use relative paths") or use completion criteria — "done" means quality met, not budget hit.

A bounded budget is not the backfire

The failure is specific to long-horizon, tool-using tasks where the agent chooses whether to keep going — coding and file-system work. Brevity framing is fine for chat, summarisation, and single-turn work with no "less work" to retreat into. And a quantified budget differs from vague minimisation: the Token-Budget-Aware Reasoning framework cut tokens 68% with under 5% accuracy loss by inserting an estimated budget. The backfire is a property of vague framing, not of efficiency goals.

↪ Your win: quality targets, not resource limits

"Preserve tokens" = "do less" for a long-horizon agent that can stop early.
System instructions outrank the user task — the agent follows the conflict faithfully.
Swap resource framing for quality framing — "be thorough," "bias to action."
Make limits mechanical or anchor them in completion criteria.
A quantified budget is safe — it's the vague "don't waste" that backfires.

Retrieval practice — recall, don't peek

Question 1"Don't waste tokens" makes a long-horizon agent…

Question 2The mechanism is that system instructions…

Question 3The recommended reframing replaces "be efficient" with…

Question 4Brevity framing stays safe for…

Question 5 · spaced recall from Lesson 05Objective drift is best resisted by…

Ask me anything. Want the full instead-of/use table, or how a quantified TALE-style budget differs from vague minimisation? Next in Part 2 closes with Trust Without Verify — why polished output is no evidence of correctness.