Harness Engineering · ~8 min
Lesson 7 told a long-running agent to resume from a durable log instead of carrying everything in context. This is the move that keeps the in-session context worth carrying — replacing accumulated token mass with a dense summary before quality silently erodes.
Earlier lessons told you to /clear between unrelated tasks, without teaching the in-between move. Compaction is that move: you keep the thread
alive but shrink it, trading a clean wipe for a distilled summary. Get the timing wrong and you either reason in a
degraded window or amputate context you still needed. This lesson is the timing.
Compaction replaces the accumulated conversation history with a dense summary, freeing
the context window while preserving task intent and critical state. It sits between two extremes you already know:
a full /clear (throw everything away) and carrying the whole history (let the window saturate).
Claude Code's auto-compaction fires at roughly 95% of the context window — and that is far too late. The window has a dumb zone: as context fills, pairwise token relationships stretch thin and reasoning degrades. Anthropic calls it "context rot," a gradient rather than a cliff, appearing "across all models."
That gap, between degradation onset and the auto-trigger, is where output silently erodes. Manual compaction closes it by reframing the operation: not memory cleanup, but reasoning-quality preservation.
The signal isn't a token count — it's a transition. Compact at the joints between phases, and right before you ask for the hardest reasoning of the session.
| Compact when… | Don't compact when… |
|---|---|
| Before reasoning-intensive work (architecture, multi-step debugging) | The agent is mid-chain and needs the accumulated context to finish |
| After a bulk read whose details you've extracted | Reference material (schemas, specs) will be re-needed repeatedly |
| At task-type transitions (done searching, now planning) | You're iterating one file where the full edit history informs the next change |
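
The table above can be read as a small decision rule. The sketch below is one hedged way to encode it; the `SessionState` fields and the choice to let any "don't compact" condition veto the "compact" signals are assumptions made for illustration, not something a harness exposes.

```python
from dataclasses import dataclass


@dataclass
class SessionState:
    """Hypothetical snapshot of the session; every field name is illustrative."""
    at_phase_transition: bool = False        # e.g. done searching, now planning
    next_step_reasoning_heavy: bool = False  # architecture, multi-step debugging
    bulk_read_extracted: bool = False        # bulk read finished, details pulled out
    mid_chain: bool = False                  # agent still needs the accumulated context
    reference_needed_again: bool = False     # schemas or specs that will be re-read
    iterating_single_file: bool = False      # edit history informs the next change


def should_compact(state: SessionState) -> bool:
    """Return True when a 'compact' signal applies and no 'don't compact' signal vetoes it."""
    blocked = (
        state.mid_chain
        or state.reference_needed_again
        or state.iterating_single_file
    )
    if blocked:
        return False
    return (
        state.at_phase_transition
        or state.next_step_reasoning_heavy
        or state.bulk_read_extracted
    )
```

Treating the right-hand column as a veto is the conservative choice: when the two columns conflict, the sketch keeps the context rather than risk cutting it mid-task.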
Direct what survives, rather than trusting a blind summary. Use a focus directive, such as "/compact Focus on the API
changes and the test failures", or a persistent CLAUDE.md block that always preserves the task
objective, modified file paths, and unresolved errors. Compaction is lossy; the directive is how you steer the loss
away from what matters.
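As a rough illustration of steering the loss, the helper below assembles a focus directive from the three items named above (task objective, modified file paths, unresolved errors). The function name, argument names, and the example values are hypothetical; only the `/compact Focus on ...` shape comes from the lesson.

```python
def build_compact_directive(
    objective: str,
    modified_files: list[str],
    open_errors: list[str],
) -> str:
    """Assemble a one-line /compact focus directive from the state that must survive."""
    parts = [f"the task objective ({objective})"]
    if modified_files:
        parts.append("modified files: " + ", ".join(modified_files))
    if open_errors:
        parts.append("unresolved errors: " + ", ".join(open_errors))
    # Sections are separated by semicolons so each category stays readable.
    return "/compact Focus on " + "; ".join(parts)
```

For example, `build_compact_directive("migrate auth API", ["api/auth.py"], [])` yields a directive that names the objective and the one touched file, and omits the empty error list.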
Three levers turn compaction from a panic button into a discipline:
Lower the trigger. CLAUDE_AUTOCOMPACT_PCT_OVERRIDE takes a value 1–100. Set it to
50–60% for reasoning-heavy sessions, 70–80% for mixed, and leave the 95% default only for retrieval-heavy work that
tolerates a fuller window.
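A minimal sketch of choosing the override per session type follows. The variable name `CLAUDE_AUTOCOMPACT_PCT_OVERRIDE` and the ranges come from the text above; the session-type labels, the use of range midpoints, and the `autocompact_env` helper are assumptions for illustration.

```python
import os

# Midpoints of the ranges given above; the dictionary keys are illustrative labels.
THRESHOLD_BY_SESSION = {
    "reasoning": 55,  # 50-60% for reasoning-heavy sessions
    "mixed": 75,      # 70-80% for mixed work
    "retrieval": 95,  # the default, for retrieval-heavy work
}


def autocompact_env(session_type: str, base_env=None) -> dict:
    """Return a copy of the environment with the auto-compaction override set.

    Raises KeyError for an unknown session type instead of silently
    falling back to a late trigger.
    """
    pct = THRESHOLD_BY_SESSION[session_type]
    env = dict(os.environ if base_env is None else base_env)
    env["CLAUDE_AUTOCOMPACT_PCT_OVERRIDE"] = str(pct)
    return env
```

The returned dictionary can be handed to whatever process launches the session; how that launch happens is left out here on purpose.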
Graduate the stages. A single binary compaction is a cliff. OPENDEV's five-stage Adaptive Context Compaction degrades incrementally — warn at 70%, mask older observations at 80%, prune at 85%, aggressive-mask at 90%, full LLM summary at 99% — so the agent never hits one moment where the whole history collapses at once.
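The graduated thresholds quoted above can be sketched as a lookup from window usage to stage. The percentages are the ones in the text; the stage names and the `compaction_stage` function are hypothetical labels, not OPENDEV identifiers.

```python
# (threshold percent, stage label), highest first so the first match wins.
STAGES = [
    (99, "full_llm_summary"),
    (90, "aggressive_mask"),
    (85, "prune"),
    (80, "mask_older_observations"),
    (70, "warn"),
]


def compaction_stage(used_tokens: int, window_tokens: int) -> str:
    """Map current window usage to the most severe stage whose threshold is reached."""
    if window_tokens <= 0:
        raise ValueError("window_tokens must be positive")
    pct = 100 * used_tokens / window_tokens
    for threshold, stage in STAGES:
        if pct >= threshold:
            return stage
    return "none"  # below 70%: nothing to do yet
```

Because the stages are ordered from most to least severe, a session that jumps from 60% to 92% in one step lands directly on the aggressive-mask stage rather than replaying the earlier ones.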
Offload, don't just summarize. The strongest pattern pairs summarization with offloading: large tool payloads go to disk, replaced by a reference plus a brief summary, recoverable on demand. That makes the operation selective discarding, not lossy encoding — artifacts stay on disk.
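A minimal sketch of the offloading pattern, assuming a local directory as the store: the payload goes to disk and the context keeps only a reference plus a short summary. The function names, the stub layout, and the truncation used as a stand-in summary are all assumptions; a real harness would likely produce the summary with a model.

```python
import hashlib
from pathlib import Path


def offload_payload(payload: str, store_dir: Path, summary_chars: int = 200) -> dict:
    """Write a large tool payload to disk and return a small stub for the context."""
    store_dir.mkdir(parents=True, exist_ok=True)
    # Content hash as filename, so identical payloads map to the same file.
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()[:16]
    path = store_dir / f"{digest}.txt"
    path.write_text(payload, encoding="utf-8")
    # Placeholder summary: simple truncation stands in for a real summarizer.
    return {"ref": str(path), "summary": payload[:summary_chars], "chars": len(payload)}


def recover_payload(stub: dict) -> str:
    """Reload the full payload on demand; fail loudly if the file was deleted."""
    path = Path(stub["ref"])
    if not path.exists():
        raise FileNotFoundError(f"offloaded payload missing: {path}")
    return path.read_text(encoding="utf-8")
```

The explicit failure in `recover_payload` matters: the pattern is only selective discarding while the files on disk survive, so a missing file should surface immediately rather than degrade silently.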
Compaction has its own failure mode: aggressive summarization drops subtle constraints whose importance only surfaces later — Anthropic warns that "overly aggressive compaction can result in the loss of subtle but critical context." Each cycle adds summarization error; long sessions accumulate drift a single summary can't undo. And a too-low threshold forces lossy summarization while the window is still navigable, risking objective drift if the scope constraint gets omitted. Start from maximum recall and tune toward precision — not the reverse. If offloaded payloads get deleted, recoverability breaks and the whole approach is worse than just holding the context.
Set CLAUDE_AUTOCOMPACT_PCT_OVERRIDE to 50–60% for reasoning-heavy sessions.
Retrieval practice: recall, don't peek
Question 1: Auto-compaction at ~95% fires too late because the agent has already…
Question 2: BABILong found LLMs effectively use only what fraction of a long window for reasoning?
Question 3: The best moment to compact manually is generally…
Question 4: Offloading a large tool payload to disk makes compaction…
Question 5 · spaced recall from Lesson 7: A long-running agent survives session boundaries by moving state into…