Reference · Canonical Language

Context Engineering — Glossary

The working vocabulary for this learning workspace. Once a term lives here, every lesson uses this word for it. Grows as we go.

The Discipline

Context engineering · aliases: context management, context design: The discipline of designing what enters a model's context window, how it is structured, and what is excluded, to maximise output quality and reliability.; Avoid: "prompt engineering" (that is a sub-part — designing individual instructions within the context).; Source: context-engineering.md
Signal density: The figure of merit. Anthropic's framing: "the smallest possible set of high-signal tokens that maximize the likelihood of some desired outcome." Optimise for density, not volume.; Avoid: "more context is better", "fill the window".; Source: Anthropic, Effective Context Engineering
Token economics · opportunity cost: Every token included is a token excluded. Context space is finite, so each inclusion displaces reasoning, instructions, or task-relevant content. Inclusion is exclusion.; Source: context-budget-allocation.md
Context pollution: Irrelevant context that accumulates and competes with relevant content for attention. Diagnostic: "Does this improve output on this specific task?" If no, it is pollution.; Avoid: "noise", "clutter" — pollution is the precise term.; Source: context-engineering.md · session-partitioning.md

Attention & Positioning

Context layer: One tier of the context stack (system prompt → project instructions → skill definitions → conversation history → tool outputs). Each layer has its own persistence and cost.; Source: layered-context-architecture.md
Lost in the middle · U-shaped attention: Attention is strongest at the start and end of the window and weakest in the middle, regardless of content importance. Rules belong at the edges; reference material can sit in the middle.; Avoid: "the model ignores stuff" — the bias is positional and predictable.; Source: lost-in-the-middle.md · Liu et al. 2023
Attention sink: The first tokens in a sequence attract disproportionate attention regardless of their semantic content — position, not importance, drives the weight.; Source: attention-sinks.md
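
The positional bias described in the entries above suggests an assembly order: rules at the start, reference material in the middle, and the task plus a short rule recap at the end. The following is a minimal sketch under that reading, not a method taken from the source files. The `assemble_context` helper, its parameters, and the `docs/schema.md` placeholder are all hypothetical.

```python
def assemble_context(rules: list[str], reference: list[str], task: str) -> str:
    """Place rules at the start, reference in the middle, task and a rule recap at the end."""
    head = "\n".join(rules)
    middle = "\n\n".join(reference)
    # Restating the rules next to the task puts them at the recency edge as well.
    tail = f"{task}\n\nReminder of rules:\n{head}"
    return "\n\n".join([head, middle, tail])


if __name__ == "__main__":
    ctx = assemble_context(
        rules=["Never edit generated files."],
        reference=["<contents of docs/schema.md>"],  # hypothetical reference payload
        task="Add a nullable column to the users table.",
    )
    print(ctx)
```

Repeating the rules costs tokens, so the recap is worth keeping only for rules that must not be missed.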

Degradation & Compaction

Dumb zone · aliases: context rot: The region of context fill where output quality has measurably degraded. It is a gradient, not a cliff — and it starts earlier than most expect.; Avoid: "running out of context" (degradation is about quality, not capacity).; Source: context-window-dumb-zone.md
Effective context: The token length up to which a model actually performs the task well — often far below the advertised window (RULER: 16–50% of claimed). Degradation onset is closer to an absolute token threshold (~32K–100K) than a fixed percentage, and varies by task type.; Avoid: "the context window" when you mean the usable part.; Source: context-window-dumb-zone.md · RULER
Compaction: Lossy summarisation of older turns to reclaim space. Auto-compaction fires at ~95% fill (Claude Code default); manual compaction (/compact) is triggered deliberately, before degradation sets in.; Avoid: "clearing" (that is /clear — discard, not summarise) and "truncation".; Source: manual-compaction-dumb-zone-mitigation.md
Turn-level context decision · the five moves: At every completed turn, one of five choices: continue, rewind, clear, compact, or delegate. Choosing well is the core skill of context management.; Source: turn-level-context-decisions.md
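
The Effective context and Compaction entries together imply that manual compaction should key off an absolute token count rather than waiting for auto-compaction near 95% fill. A rough sketch of that check follows. The function name, the return labels, and the `soft_limit_tokens` default (taken from the low end of the ~32K–100K range quoted above) are assumptions for illustration, not values any tool exposes.

```python
def compaction_advice(
    tokens_used: int,
    window_size: int,
    soft_limit_tokens: int = 32_000,   # assumed onset of degradation; tune per task type
    auto_compact_fill: float = 0.95,   # fill ratio at which auto-compaction is said to fire
) -> str:
    """Return 'continue', 'compact-now', or 'auto-compact-imminent'."""
    if window_size <= 0:
        raise ValueError("window_size must be positive")
    fill = tokens_used / window_size
    if fill >= auto_compact_fill:
        return "auto-compact-imminent"
    if tokens_used >= soft_limit_tokens:
        return "compact-now"
    return "continue"


if __name__ == "__main__":
    print(compaction_advice(40_000, 200_000))  # only 20% full, yet past the soft limit
```

The check uses two inputs because the glossary treats degradation as closer to an absolute threshold, while auto-compaction is defined by fill percentage.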

Discoverability

Non-discoverable context: Information the agent cannot reach with its read/grep/glob tools — rationale ("why X over Y"), constraints not encoded in code, domain rules, conventions present-but-unstated, out-of-band integrations. The only kind that earns a place in an always-loaded instruction file.; Avoid: "documentation" — instruction files are a resource-allocation decision, not docs.; Source: discoverable-vs-nondiscoverable-context.md
Discoverable context: Anything the agent can obtain itself — directory trees, API signatures, dependency versions, config, test patterns, visible conventions. Including it taxes every turn and creates a second source of truth that goes stale. Belongs in the codebase, not the instruction file.; Source: discoverable-vs-nondiscoverable-context.md · Shi et al. 2026
Pointer form: Replacing duplicated discoverable content with a path, not a copy — "use the repository pattern in src/repos/." Gives the agent direction without a stale second copy.; Source: discoverable-vs-nondiscoverable-context.md
Instruction compliance ceiling: Aggregate instruction load — not file count — drives degradation: doubling the rules in scope makes the agent less likely to follow any one of them. The mechanism that makes discoverable bloat actively harmful.; Source: instruction-compliance-ceiling.md

Layering & Assembly

Layer stack · prompt layering: The four sources instructions arrive from, outermost to most specific: system prompt → project instructions (AGENTS.md/CLAUDE.md) → skill content → user message.; Source: prompt-layering.md
Specificity precedence: On a conflict, the layer closest to the task wins (user > skill > project > system). A behavioral tendency, not a rule the model enforces — so contradictions across layers yield unpredictable output.; Avoid: "the model obeys the most important rule" — it's about position/specificity, not importance.; Source: prompt-layering.md
@import composition: Claude Code's @path syntax, expanded verbatim at session start — equivalent to concatenation. Position-bearing (an @file on line 1 lands in primacy) and silently broken if the target moves.; Source: import-composition-pattern.md
Sub-agent context isolation: A sub-agent starts fresh: it inherits none of the parent's project instructions, skills, or history unless they are explicitly passed at invocation. Layering without an injection protocol = no project layer for sub-agents.; Source: prompt-layering.md
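
Because specificity precedence is a tendency rather than an enforced rule, it can help to detect cross-layer contradictions before a session starts instead of relying on the most specific layer winning. The sketch below assumes instructions have already been reduced to key/value settings per layer, which is a simplification: real instruction files are prose. `LAYER_ORDER` and `find_conflicts` are hypothetical names.

```python
# Layers ordered outermost to most specific, as in the Layer stack entry.
LAYER_ORDER = ["system", "project", "skill", "user"]


def find_conflicts(
    layers: dict[str, dict[str, str]],
) -> dict[str, list[tuple[str, str]]]:
    """Map each setting with differing values across layers to its (layer, value) pairs."""
    seen: dict[str, list[tuple[str, str]]] = {}
    for name in LAYER_ORDER:
        for key, value in layers.get(name, {}).items():
            seen.setdefault(key, []).append((name, value))
    # Keep only settings where at least two layers disagree.
    return {
        key: pairs
        for key, pairs in seen.items()
        if len({value for _, value in pairs}) > 1
    }


if __name__ == "__main__":
    layers = {
        "system": {"test_runner": "pytest"},
        "project": {"test_runner": "unittest", "formatter": "black"},
        "user": {"formatter": "black"},
    }
    print(find_conflicts(layers))
```

In this example only `test_runner` is reported, since both layers that mention `formatter` agree.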

Caching & Cost

Immutable prefix · stable prefix / dynamic tail: The cache-efficient context layout: static content (system prompt, tool definitions, project instructions) first and unchanging, variable content (history, latest message) last. Determines whether each turn pays ~10% or 100% of the normal input price for the prefix.; Source: prompt-caching-architectural-discipline.md
Cache-buster: Any change to the cached prefix that forces a full-price re-write: modifying tool definitions, switching models, non-deterministic tool ordering, or injecting volatile state (timestamps, cwd) into the prefix. Misses are silent — no error, just full billing.; Source: static-content-first-caching.md
Static-first ordering: Assemble static sections before any variable content so the byte-identical prefix matches the cache. The same assembly-order lever as attention positioning, optimised for cost instead.; Avoid: conflating with attention order — same lever, different objective.; Source: static-content-first-caching.md
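
Since cache misses are silent, one way to notice a cache-buster is to fingerprint the static prefix locally and compare it across turns. This is a minimal sketch of that idea and does not reproduce any provider's actual cache key. `prefix_fingerprint` and its parameters are hypothetical.

```python
import hashlib


def prefix_fingerprint(
    system_prompt: str,
    tool_definitions: list[str],
    project_instructions: str,
) -> str:
    """Hash the static prefix; tools are sorted so their ordering stays deterministic."""
    parts = [system_prompt, *sorted(tool_definitions), project_instructions]
    return hashlib.sha256("\x00".join(parts).encode("utf-8")).hexdigest()


if __name__ == "__main__":
    stable = prefix_fingerprint("You are a coding agent.", ["grep", "read"], "Use tabs.")
    reordered = prefix_fingerprint("You are a coding agent.", ["read", "grep"], "Use tabs.")
    volatile = prefix_fingerprint(
        "You are a coding agent. Now: 12:00:01", ["grep", "read"], "Use tabs."
    )
    print(stable == reordered)  # tool order no longer changes the prefix
    print(stable == volatile)   # an injected timestamp does
```

Logging the fingerprint each turn shows when the prefix changed, even though the API itself reports no error.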

Density

Semantic density: The ratio of task-relevant tokens to total tokens. Maximise it by cutting zero-density ceremony (filler, boilerplate) while protecting high-density tokens (names, rationale, error messages) the agent would otherwise reconstruct in reasoning.; Avoid: "make it shorter" — the goal is signal per token, not minimum length.; Source: prompt-compression.md · semantic-density-optimization.md
The compression test: Applied to every line: "Can I remove a word — or this whole sentence — without losing a constraint?" If yes, cut it. Convert prose to tables/bullets/rules.; Source: prompt-compression.md
Compliance U-curve: Constraint violations peak at medium compression. Compliance is best at the verbose and the crisp ends and worst when a rule is half-trimmed. Compress decisively to an unambiguous rule, or leave it verbose.; Source: arXiv:2512.17920

Tail Management

Observation masking · tool-output masking: Replacing a processed tool output (file read, search, test log) with a one-line summary before the next inference call — surgically removing single-use bulk while keeping the agent's decisions and reasoning. Finer-grained than compaction.; Avoid: conflating with compaction — compaction summarises everything; masking targets tool outputs.; Source: observation-masking.md
Observation token: A token of tool output in the trajectory — ~84% of SE-agent trajectory content, mostly consumed once. The primary driver of dynamic-tail growth.; Source: arXiv:2508.21433
Offloading: Moving a large payload (file, API response) out of context to disk, leaving a reference + brief summary the agent can re-read on demand. Recoverable, non-lossy — tier 1 of compression.; Avoid: conflating with summarisation — offloading preserves the content; summarisation discards it.; Source: context-compression-strategies.md
Compaction · summarisation: Replacing conversation history with a summary of objective, state, constraints, and next steps. Lossy — tier 2, used after offloading and masking. Preserve "what's next," not just "what happened."; Source: context-compression-strategies.md
JIT context · on-demand retrieval, RAG: Pulling content into context via tool calls at the moment a step needs it, rather than preloading at session start. Startup holds only instructions + tool descriptions. Preserves budget — but only when retrieval is accurate (a noisy retriever distracts).; Avoid: assuming on-demand is always better — it trades preload cost for latency + retrieval-quality risk.; Source: retrieval-augmented-agent-workflows.md
Context priming: The preload counterpart: loading relevant files before a task so the agent pattern-matches against real project conventions. Use for repetitive access and grounding; broad-to-narrow, critical context first.; Source: context-priming.md
Goal recitation · todo.md pattern: Rewriting the objective + task list after each step so it lands in the high-attention recency tail, countering drift in long sessions. Strong elicitation (imperative restatement of the core goal) cuts drift more than a bare list.; Source: goal-recitation.md
Error preservation: Keeping failed actions and error traces in context as negative examples that steer the model off dead ends. Removing reasoning traces dropped performance ~30%. Preserve during recovery; compact after success.; Avoid: "cleaning up" errors mid-task — that's deleting the guardrail.; Source: error-preservation-in-context.md
Doom loop: The same error repeating 3+ times. Signal to stop preserving, compact, and change strategy — not retry.; Source: error-preservation-in-context.md
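
The Observation masking entry above can be sketched as a pass over the message history that stubs out tool outputs the agent has already acted on. The `Message` type, the `processed` flag, and the first-line stub are assumptions for illustration. A real implementation would likely write a proper one-line summary instead of truncating.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Message:
    role: str                # "user", "assistant", or "tool"
    content: str
    processed: bool = False  # hypothetical flag: the agent has already used this output


def mask_observations(history: list[Message], max_chars: int = 80) -> list[Message]:
    """Replace processed tool outputs with a one-line stub; leave everything else untouched."""
    masked = []
    for msg in history:
        if msg.role == "tool" and msg.processed and len(msg.content) > max_chars:
            first_line = msg.content.splitlines()[0][:max_chars]
            stub = f"[masked tool output, {len(msg.content)} chars] {first_line}"
            masked.append(replace(msg, content=stub))
        else:
            # Assistant reasoning, user turns, and unprocessed outputs are kept verbatim.
            masked.append(msg)
    return masked
```

Only tool messages are touched, which is what separates this from compaction: the agent's decisions and reasoning stay in the history unchanged.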

Loading, Assembly & Economics

Orchestrator-worker: A coordinator delegates subtasks to isolated worker sub-agents that each run in their own window and return a condensed summary — keeping the coordinator's context clean. The sub-agents inherit nothing unless passed (see [[sub-agent context isolation]]).; Source: orchestrator-worker.md
Repository map: Structural symbols extracted with tree-sitter, ranked by graph importance and fit to a token budget — gives the agent codebase topology before it reads any file. Navigation over bulk-reading.; Source: repository-map-pattern.md
Seeding agent context · breadcrumbs: Planting discoverable files, comments, and markers that agents find during exploration and use to shape their behaviour — the inverse of preloading.; Source: seeding-agent-context.md
Environment specification: Feeding dependency versions, lock files, and runtime constraints into context so the agent codes against the real environment, not stale training data — closing the version gap that drives environment-blind errors.; Source: environment-specification-as-context.md
Context budget: The finite token pool. Every token preloaded displaces one available for reasoning, tool results, and implementation — the opportunity-cost framing behind [[token economics]]. Allocate by task type.; Source: context-budget-allocation.md
Dynamic system-prompt composition: Building a system prompt from modular, priority-ordered sections (mode variants, cache-friendly ordering) rather than one monolithic static block.; Source: dynamic-system-prompt-composition.md
Prompt chaining: Decomposing a task into a sequence of LLM calls, each processing the previous output, with verification or gate-checks between steps.; Source: prompt-chaining.md
Tokenizer swap tax: When a model upgrade ships a new tokenizer, the same prompt maps to a different token count — shifting effective cost, window headroom, and rate limits before you change a line of code.; Avoid: "shorter is always cheaper" — measure the new tokenizer, don't assume.; Source: tokenizer-swap-tax.md
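
The Prompt chaining entry describes a sequence of calls with gate-checks between steps. The sketch below shows that control flow with plain functions standing in for LLM calls. `run_chain`, `Step`, and `Gate` are hypothetical names, and no real model API is used.

```python
from typing import Callable

Step = Callable[[str], str]   # stand-in for one LLM call
Gate = Callable[[str], bool]  # verification applied to that call's output


def run_chain(initial: str, steps: list[tuple[Step, Gate]]) -> str:
    """Run steps in order; stop with an error as soon as a gate rejects an output."""
    current = initial
    for index, (step, gate) in enumerate(steps):
        current = step(current)
        if not gate(current):
            raise ValueError(f"gate failed after step {index}: {current!r}")
    return current


if __name__ == "__main__":
    outline = lambda text: f"OUTLINE: {text}"
    draft = lambda text: text.replace("OUTLINE", "DRAFT")
    result = run_chain(
        "caching",
        [
            (outline, lambda s: s.startswith("OUTLINE")),
            (draft, lambda s: bool(s)),
        ],
    )
    print(result)
```

Failing fast at a gate keeps a bad intermediate output from being passed to every later step.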

Integrity & Operations

Context poisoning · hallucination cascade: An early hallucination enters context as a "fact"; every later step builds on the false premise while output stays coherent and confident. The reliable fix is a clean, re-anchored session — corrective prompts patch the symptom but leave the poison in context.; Avoid: "the agent will catch its own mistake" — it doesn't hedge.; Source: context-poisoning.md
Prompt injection · indirect injection: Malicious instructions hidden in external content an agent consumes — web pages, repo files, MCP responses — followed as if from the user. Works because the model is provenance-blind: attention treats all tokens uniformly with no origin metadata. Severity scales with agent capability.; Source: prompt-injection-threat-model.md
Defence-in-depth · no single-layer defence: Layering independent controls so no single bypass compromises the agent: model-level injection resistance, infrastructure-level egress controls, and product-level confirmation flows. The strongest are architectural — constrain what the model can do after reading untrusted input, not what it's told to do.; Avoid: "URL allow-listing is enough" — allowed pages still carry injections.; Source: single-layer-injection-defence.md
Context-window diagnostic tooling · per-tool attribution: Commands that attribute token consumption to the specific tool calls, memory files, and outputs responsible (Claude Code's /context) — so you shrink the real culprit instead of pruning blindly. Diagnose before you compress.; Source: context-window-diagnostic-tooling.md
Context priming: Deliberately loading relevant files into context before a task, broad→narrow, so the agent produces project-specific output instead of generic boilerplate — the preload counterpart to just-in-time retrieval.; Avoid: "one-shot context dump" — order within the load still matters.; Source: context-priming.md
Stateful state-carry · remember, don't re-read: Carrying a long loop's record in a typed object outside the prompt, read by tool, instead of replaying the full transcript each turn — converts O(n²) loop token cost to O(n). For short loops, prompt caching hits the same curve for less work.; Source: stateful-iteration-state-carry.md
Context window anxiety · premature task closure: A behavioural shift, not quality decay, where a model rushes to finish, abbreviates reasoning, or summarises early as it perceives the limit approaching, even with capacity remaining. Countered by buffer allocation, counter-prompting, and token-budget transparency.; Avoid: conflating with the [[dumb zone]], which has a distinct mechanism and fix.; Source: context-window-anxiety.md
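
The O(n²) versus O(n) claim in the Stateful state-carry entry can be checked with simple arithmetic. The sketch assumes every turn adds the same number of tokens and that the carried state stays a fixed size; a state object that grows with the loop would not stay linear. Both function names are hypothetical.

```python
def replay_cost(turns: int, tokens_per_turn: int) -> int:
    """Total input tokens when every turn re-sends the full transcript so far."""
    return sum(turn * tokens_per_turn for turn in range(1, turns + 1))


def state_carry_cost(turns: int, tokens_per_turn: int, state_tokens: int) -> int:
    """Total input tokens when each turn sends only its own content plus a fixed-size state."""
    return turns * (tokens_per_turn + state_tokens)


if __name__ == "__main__":
    print(replay_cost(100, 1_000))            # 5050000
    print(state_carry_cost(100, 1_000, 500))  # 150000
```

Under these assumptions a 100-turn loop sends roughly 34 times more input tokens when it replays the transcript than when it carries state.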