Reference · Canonical Language

Context Engineering — Glossary

The working vocabulary for this learning workspace. Once a term lives here, every lesson uses this word for it. Grows as we go.

The Discipline

Context engineering · aliases: context management, context design
The discipline of designing what enters a model's context window, how it is structured, and what is excluded, to maximise output quality and reliability.
Avoid: "prompt engineering" (that is a sub-part — designing individual instructions within the context).
Source: context-engineering.md
Signal density
The figure of merit. Anthropic's framing: "the smallest set of high-signal tokens that maximize the likelihood of your desired outcome." Optimise for density, not volume.
Avoid: "more context is better", "fill the window".
Source: Anthropic — Effective Context Engineering
Token economics · opportunity cost
Every token included is a token excluded. Context space is finite, so each inclusion displaces reasoning, instructions, or task-relevant content. Inclusion is exclusion.
Source: context-budget-allocation.md
Context pollution
Irrelevant context that accumulates and competes with relevant content for attention. Diagnostic: "Does this improve output on this specific task?" If no, it is pollution.
Avoid: "noise", "clutter" — pollution is the precise term.
Source: context-engineering.md · session-partitioning.md

Attention & Positioning

Context layer
One tier of the context stack, each with its own persistence: system prompt → project instructions → skill definitions → conversation history → tool outputs. Persistence and cost differ per layer.
Source: layered-context-architecture.md
Lost in the middle · U-shaped attention
Attention is strongest at the start and end of the window and weakest in the middle, regardless of content importance. Rules belong at the edges; reference material can sit in the middle.
Avoid: "the model ignores stuff" — the bias is positional and predictable.
Source: lost-in-the-middle.md · Liu et al. 2023
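A minimal sketch of the placement rule above, assuming the context is assembled as plain text. The function name `assemble_context` and the section headings are hypothetical, chosen for illustration only.

```python
def assemble_context(rules: list[str], reference: list[str], task: str) -> str:
    """Put rules at both edges and bulk reference material in the middle."""
    head = ["# Rules", *rules]              # start of window: high attention
    middle = ["# Reference", *reference]    # middle: lowest attention, fine for lookup material
    tail = ["# Task", task, "# Reminder", *rules]  # end of window: high attention
    return "\n".join(head + middle + tail)


context = assemble_context(
    rules=["Never push to main."],
    reference=["<api docs>"],
    task="Fix the failing test.",
)
```

Repeating the rules at the tail costs tokens, so this trade only makes sense when the rule list is short.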
Attention sink
The first tokens in a sequence attract disproportionate attention regardless of their semantic content — position, not importance, drives the weight.
Source: attention-sinks.md

Degradation & Compaction

Dumb zone · aliases: context rot
The region of context fill where output quality has measurably degraded. It is a gradient, not a cliff — and it starts earlier than most expect.
Avoid: "running out of context" (degradation is about quality, not capacity).
Source: context-window-dumb-zone.md
Effective context
The token length up to which a model actually performs the task well — often far below the advertised window (RULER: 16–50% of claimed). Degradation onset is closer to an absolute token threshold (~32K–100K) than a fixed percentage, and varies by task type.
Avoid: "the context window" when you mean the usable part.
Source: context-window-dumb-zone.md · RULER
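As a rough worked example of the RULER range quoted above, the sketch below turns an advertised window into a low and high estimate of effective context. The function name and the 200K window are assumptions for illustration; the real figure depends on model and task type.

```python
def effective_window_range(
    advertised_tokens: int, low_pct: int = 16, high_pct: int = 50
) -> tuple[int, int]:
    """Return (low, high) effective-context estimates using integer percentages."""
    low = advertised_tokens * low_pct // 100
    high = advertised_tokens * high_pct // 100
    return low, high


low, high = effective_window_range(200_000)  # (32000, 100000)
```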
Compaction
Lossy summarisation of older turns to reclaim space. Auto-compaction fires at ~95% fill (Claude Code default); manual compaction (/compact) is triggered deliberately, before degradation sets in.
Avoid: "clearing" (that is /clear — discard, not summarise) and "truncation".
Source: manual-compaction-dumb-zone-mitigation.md
Turn-level context decision · the five moves
At every completed turn, one of five choices: continue, rewind, clear, compact, or delegate. Choosing well is the core skill of context management.
Source: turn-level-context-decisions.md

Discoverability

Non-discoverable context
Information the agent cannot reach with its read/grep/glob tools: rationale ("why X over Y"), constraints not encoded in code, domain rules, conventions that are followed but never written down, and out-of-band integrations. The only kind that earns a place in an always-loaded instruction file.
Avoid: "documentation" — instruction files are a resource-allocation decision, not docs.
Source: discoverable-vs-nondiscoverable-context.md
Discoverable context
Anything the agent can obtain itself — directory trees, API signatures, dependency versions, config, test patterns, visible conventions. Including it taxes every turn and creates a second source of truth that goes stale. Belongs in the codebase, not the instruction file.
Source: discoverable-vs-nondiscoverable-context.md · Shi et al. 2026
Pointer form
Replacing duplicated discoverable content with a path, not a copy — "use the repository pattern in src/repos/." Gives the agent direction without a stale second copy.
Source: discoverable-vs-nondiscoverable-context.md
Instruction compliance ceiling
Aggregate instruction load — not file count — drives degradation: doubling the rules in scope makes the agent less likely to follow any one of them. The mechanism that makes discoverable bloat actively harmful.
Source: instruction-compliance-ceiling.md

Layering & Assembly

Layer stack · prompt layering
The four sources instructions arrive from, outermost to most specific: system prompt → project instructions (AGENTS.md/CLAUDE.md) → skill content → user message.
Source: prompt-layering.md
Specificity precedence
On a conflict, the layer closest to the task wins (user > skill > project > system). This is a behavioural tendency, not a rule the model enforces, so contradictions across layers yield unpredictable output.
Avoid: "the model obeys the most important rule" — it's about position/specificity, not importance.
Source: prompt-layering.md
@import composition
Claude Code's @path syntax, expanded verbatim at session start — equivalent to concatenation. Position-bearing (an @file on line 1 lands in primacy) and silently broken if the target moves.
Source: import-composition-pattern.md
Sub-agent context isolation
A sub-agent starts fresh: it inherits none of the parent's project instructions, skills, or history unless they are explicitly passed at invocation. Layering without an injection protocol = no project layer for sub-agents.
Source: prompt-layering.md

Caching & Cost

Immutable prefix · stable prefix / dynamic tail
The cache-efficient context layout: static content (system prompt, tool definitions, project instructions) first and unchanging, variable content (history, latest message) last. Determines whether each turn pays ~10% or 100%.
Source: prompt-caching-architectural-discipline.md
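The "~10% or 100%" claim can be made concrete with a little arithmetic. This sketch assumes the cached prefix is billed at roughly 10% of the normal input rate, as the entry states, and ignores any cache-write surcharge. `turn_cost` and its parameters are hypothetical names, and the result is in full-price token equivalents rather than currency.

```python
def turn_cost(
    prefix_tokens: int, tail_tokens: int, cache_hit: bool, cached_rate: float = 0.10
) -> float:
    """Input cost of one turn, in full-price token equivalents."""
    # On a hit the stable prefix is discounted; on a miss it is billed in full.
    prefix_cost = prefix_tokens * cached_rate if cache_hit else prefix_tokens
    # The dynamic tail (history, latest message) is always billed in full.
    return prefix_cost + tail_tokens


hit = turn_cost(20_000, 2_000, cache_hit=True)    # about 4,000
miss = turn_cost(20_000, 2_000, cache_hit=False)  # 22,000
```

With a 20K prefix and a 2K tail, a miss costs about 5.5 times as much as a hit under these assumptions.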
Cache-buster
Any change to the cached prefix that forces a full-price re-write: modifying tool definitions, switching models, non-deterministic tool ordering, or injecting volatile state (timestamps, cwd) into the prefix. Misses are silent — no error, just full billing.
Source: static-content-first-caching.md
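Because misses are silent, one way to notice a cache-buster is to fingerprint the prefix locally and compare it across turns. The sketch below is an assumption-laden diagnostic, not how any provider matches its cache: `prefix_fingerprint` is a hypothetical helper, and the tool dictionaries are made up for illustration.

```python
import hashlib
import json


def prefix_fingerprint(system_prompt: str, tools: list[dict]) -> str:
    """Hash the static prefix after normalising tool order and key order."""
    canonical = json.dumps(
        {"system": system_prompt, "tools": sorted(tools, key=lambda t: t["name"])},
        sort_keys=True,
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


tools = [
    {"name": "read", "description": "Read a file"},
    {"name": "grep", "description": "Search files"},
]
stable = prefix_fingerprint("You are a coding agent.", tools)
# Same tools in a different order: fingerprint is unchanged after sorting.
reordered = prefix_fingerprint("You are a coding agent.", list(reversed(tools)))
# Volatile state injected into the prefix: fingerprint changes.
busted = prefix_fingerprint("You are a coding agent. Now: 12:00", tools)
```

If the fingerprint changes between two turns that should share a prefix, something volatile has leaked into the static section.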
Static-first ordering
Assemble static sections before any variable content so the byte-identical prefix matches the cache. The same assembly-order lever as attention positioning, optimised for cost instead.
Avoid: conflating with attention order — same lever, different objective.
Source: static-content-first-caching.md

Density

Semantic density
The ratio of task-relevant tokens to total tokens. Maximise it by cutting zero-density ceremony (filler, boilerplate) while protecting high-density tokens (names, rationale, error messages) the agent would otherwise reconstruct in reasoning.
Avoid: "make it shorter" — the goal is signal per token, not minimum length.
Source: prompt-compression.md · semantic-density-optimization.md
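The ratio can be estimated crudely by hand-labelling segments as task-relevant or not. This sketch assumes whitespace-separated words as a stand-in for tokens, which a real tokenizer would count differently; `semantic_density` is a hypothetical name.

```python
def semantic_density(segments: list[tuple[str, bool]]) -> float:
    """Ratio of task-relevant words to total words across labelled segments."""
    total = sum(len(text.split()) for text, _ in segments)
    relevant = sum(len(text.split()) for text, is_relevant in segments if is_relevant)
    return relevant / total if total else 0.0


density = semantic_density(
    [
        ("Please note that, as always,", False),             # ceremony: 5 words
        ("use the repository pattern in src/repos/", True),  # constraint: 6 words
    ]
)
# 6 relevant words out of 11 total
```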
The compression test
Applied to every line: "Can I remove a word — or this whole sentence — without losing a constraint?" If yes, cut it. Convert prose to tables/bullets/rules.
Source: prompt-compression.md
Compliance U-curve
Constraint violations peak at medium compression: compliance is best at the verbose and the crisp ends, and worst when a rule is half-trimmed. Compress decisively to an unambiguous rule, or leave it verbose.
Source: arXiv:2512.17920

Tail Management

Observation masking · tool-output masking
Replacing a processed tool output (file read, search, test log) with a one-line summary before the next inference call — surgically removing single-use bulk while keeping the agent's decisions and reasoning. Finer-grained than compaction.
Avoid: conflating with compaction — compaction summarises everything; masking targets tool outputs.
Source: observation-masking.md
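A minimal sketch of masking, assuming the trajectory is a list of messages and that the harness marks a tool output as processed once the agent has acted on it. The `Message` type, the `processed` flag, and the `first_line` summariser are all hypothetical.

```python
from dataclasses import dataclass, replace
from typing import Callable


@dataclass(frozen=True)
class Message:
    role: str                # "user", "assistant", or "tool"
    content: str
    processed: bool = False  # True once the agent has acted on this tool output


def first_line(text: str, limit: int = 80) -> str:
    """Trivial summariser: keep the first line, truncated to `limit` characters."""
    lines = text.splitlines()
    return lines[0][:limit] if lines else ""


def mask_observations(
    messages: list[Message], summarize: Callable[[str], str]
) -> list[Message]:
    """Replace processed tool outputs with a one-line summary; keep everything else."""
    masked = []
    for msg in messages:
        if msg.role == "tool" and msg.processed:
            masked.append(replace(msg, content=f"[masked] {summarize(msg.content)}"))
        else:
            masked.append(msg)
    return masked
```

User and assistant turns pass through untouched, which is what separates this from compaction.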
Observation token
A token of tool output in the trajectory — ~84% of SE-agent trajectory content, mostly consumed once. The primary driver of dynamic-tail growth.
Source: arXiv:2508.21433
Offloading
Moving a large payload (file, API response) out of context to disk, leaving a reference + brief summary the agent can re-read on demand. Recoverable, non-lossy — tier 1 of compression.
Avoid: conflating with summarisation — offloading preserves the content; summarisation discards it.
Source: context-compression-strategies.md
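A minimal sketch of offloading, assuming the agent has a writable scratch directory. The `offload` helper, the file naming scheme, and the stub format are hypothetical; the point is only that the full payload stays recoverable while context holds a short reference.

```python
import hashlib
import tempfile
from pathlib import Path


def offload(payload: str, store_dir: Path, summary: str) -> tuple[str, Path]:
    """Write the payload to disk and return (context stub, path to full content)."""
    store_dir.mkdir(parents=True, exist_ok=True)
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()[:12]
    path = store_dir / f"{digest}.txt"
    path.write_text(payload, encoding="utf-8")
    return f"[offloaded: {path.name}] {summary}", path


with tempfile.TemporaryDirectory() as tmp:
    payload = "row\n" * 10_000
    stub, path = offload(payload, Path(tmp), "10,000 rows of API output")
    # The agent can re-read the full content on demand, so nothing is lost.
    restored = path.read_text(encoding="utf-8")
```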
Compaction · summarisation
Replacing conversation history with a summary of objective, state, constraints, and next steps. Lossy — tier 2, used after offloading and masking. Preserve "what's next," not just "what happened."
Source: context-compression-strategies.md
JIT context · on-demand retrieval, RAG
Pulling content into context via tool calls at the moment a step needs it, rather than preloading at session start. Startup holds only instructions + tool descriptions. Preserves budget — but only when retrieval is accurate (a noisy retriever distracts).
Avoid: assuming on-demand is always better — it trades preload cost for latency + retrieval-quality risk.
Source: retrieval-augmented-agent-workflows.md
Context priming
The preload counterpart: loading relevant files before a task so the agent pattern-matches against real project conventions. Use for repetitive access and grounding; broad-to-narrow, critical context first.
Source: context-priming.md
Goal recitation · todo.md pattern
Rewriting the objective + task list after each step so it lands in the high-attention recency tail, countering drift in long sessions. Strong elicitation (imperative restatement of the core goal) cuts drift more than a bare list.
Source: goal-recitation.md
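One way to sketch the pattern, assuming history is a list of text turns and the recitation is appended as the newest entry so it sits in the recency tail. `recite_goal` and the checkbox format are hypothetical, and a real harness might replace the previous recitation rather than append a new one.

```python
def recite_goal(
    history: list[str], objective: str, todo: list[tuple[str, bool]]
) -> list[str]:
    """Append an imperative restatement of the goal plus the current task list."""
    lines = [f"OBJECTIVE (do not drift): {objective}"]
    lines += [f"[{'x' if done else ' '}] {item}" for item, done in todo]
    return history + ["\n".join(lines)]


updated = recite_goal(
    ["turn 1", "turn 2"],
    "Migrate auth to OAuth",
    [("Add client", True), ("Swap middleware", False)],
)
```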
Error preservation
Keeping failed actions and error traces in context as negative examples that steer the model off dead ends. Removing reasoning traces dropped performance ~30%. Preserve during recovery; compact after success.
Avoid: "cleaning up" errors mid-task — that's deleting the guardrail.
Source: error-preservation-in-context.md
Doom loop
The same error repeating 3+ times. Signal to stop preserving, compact, and change strategy — not retry.
Source: error-preservation-in-context.md
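The 3+ threshold can be checked mechanically. This sketch assumes errors are compared by exact string match and that "repeating" means consecutive repeats at the end of the log; both are simplifying assumptions, and `in_doom_loop` is a hypothetical name.

```python
def in_doom_loop(errors: list[str], threshold: int = 3) -> bool:
    """True when the last `threshold` recorded errors are identical."""
    if len(errors) < threshold:
        return False
    tail = errors[-threshold:]
    return all(err == tail[0] for err in tail)


looping = in_doom_loop(["ImportError: foo", "TypeError: bar", "TypeError: bar", "TypeError: bar"])
```

Real error messages often differ in line numbers or timestamps, so a practical version would likely normalise them before comparing.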

Loading, Assembly & Economics

Orchestrator-worker
A coordinator delegates subtasks to isolated worker sub-agents that each run in their own window and return a condensed summary — keeping the coordinator's context clean. The sub-agents inherit nothing unless passed (see [[sub-agent context isolation]]).
Source: orchestrator-worker.md
Repository map
Structural symbols extracted with tree-sitter, ranked by graph importance and fit to a token budget — gives the agent codebase topology before it reads any file. Navigation over bulk-reading.
Source: repository-map-pattern.md
Seeding agent context · breadcrumbs
Planting discoverable files, comments, and markers that agents find during exploration and use to shape their behaviour — the inverse of preloading.
Source: seeding-agent-context.md
Environment specification
Feeding dependency versions, lock files, and runtime constraints into context so the agent codes against the real environment, not stale training data — closing the version gap that drives environment-blind errors.
Source: environment-specification-as-context.md
Context budget
The finite token pool. Every token preloaded displaces one available for reasoning, tool results, and implementation — the opportunity-cost framing behind [[token economics]]. Allocate by task type.
Source: context-budget-allocation.md
Dynamic system-prompt composition
Building a system prompt from modular, priority-ordered sections (mode variants, cache-friendly ordering) rather than one monolithic static block.
Source: dynamic-system-prompt-composition.md
Prompt chaining
Decomposing a task into a sequence of LLM calls, each processing the previous output, with verification or gate-checks between steps.
Source: prompt-chaining.md
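A minimal sketch of a chain with gate-checks, where each step is a pair of callables: one that transforms the previous output and one that verifies the result. In practice the first callable would wrap an LLM call; here plain string functions stand in for it, and `run_chain` and `Step` are hypothetical names.

```python
from typing import Callable

# (call, gate): `call` produces the next output, `gate` decides whether to continue.
Step = tuple[Callable[[str], str], Callable[[str], bool]]


def run_chain(initial: str, steps: list[Step]) -> str:
    """Run steps in order, feeding each output forward and stopping on a failed gate."""
    current = initial
    for index, (call, gate) in enumerate(steps):
        current = call(current)
        if not gate(current):
            raise ValueError(f"gate failed after step {index}")
    return current


# Stand-in steps: strip whitespace (must be non-empty), then upper-case (must be upper).
demo_steps: list[Step] = [
    (str.strip, lambda s: len(s) > 0),
    (str.upper, str.isupper),
]
result = run_chain("  draft  ", demo_steps)
```

Failing fast at a gate keeps a bad intermediate output from feeding every later step.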
Tokenizer swap tax
When a model upgrade ships a new tokenizer, the same prompt maps to a different token count — shifting effective cost, window headroom, and rate limits before you change a line of code.
Avoid: "shorter is always cheaper" — measure the new tokenizer, don't assume.
Source: tokenizer-swap-tax.md

Integrity & Operations

Context poisoning · hallucination cascade
An early hallucination enters context as a "fact"; every later step builds on the false premise while output stays coherent and confident. The reliable fix is a clean, re-anchored session — corrective prompts patch the symptom but leave the poison in context.
Avoid: "the agent will catch its own mistake" — it doesn't hedge.
Source: context-poisoning.md
Prompt injection · indirect injection
Malicious instructions hidden in external content an agent consumes — web pages, repo files, MCP responses — followed as if from the user. Works because the model is provenance-blind: attention treats all tokens uniformly with no origin metadata. Severity scales with agent capability.
Source: prompt-injection-threat-model.md
Defence-in-depth · no single-layer defence
Layering independent controls so no single bypass compromises the agent: model-level injection resistance, infrastructure-level egress controls, and product-level confirmation flows. The strongest are architectural — constrain what the model can do after reading untrusted input, not what it's told to do.
Avoid: "URL allow-listing is enough" — allowed pages still carry injections.
Source: single-layer-injection-defence.md
Context-window diagnostic tooling · per-tool attribution
Commands that attribute token consumption to the specific tool calls, memory files, and outputs responsible (Claude Code's /context) — so you shrink the real culprit instead of pruning blindly. Diagnose before you compress.
Source: context-window-diagnostic-tooling.md
Context priming
Deliberately loading relevant files into context before a task, broad→narrow, so the agent produces project-specific output instead of generic boilerplate — the preload counterpart to just-in-time retrieval.
Avoid: "one-shot context dump" — order within the load still matters.
Source: context-priming.md
Stateful state-carry · remember, don't re-read
Carrying a long loop's record in a typed object outside the prompt, read by tool, instead of replaying the full transcript each turn — converts O(n²) loop token cost to O(n). For short loops, prompt caching hits the same curve for less work.
Source: stateful-iteration-state-carry.md
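The O(n²) versus O(n) claim follows from simple arithmetic. This sketch assumes every turn adds the same number of tokens and that the carried state has a fixed size; both are simplifications, and the function names are hypothetical.

```python
def replay_cost(turns: int, tokens_per_turn: int) -> int:
    """Total input tokens when turn k re-sends all k turns so far (quadratic)."""
    return sum(k * tokens_per_turn for k in range(1, turns + 1))


def state_carry_cost(turns: int, tokens_per_turn: int, state_tokens: int) -> int:
    """Total input tokens when each turn sends only itself plus fixed-size state (linear)."""
    return turns * (tokens_per_turn + state_tokens)


replayed = replay_cost(100, 500)            # 500 * (1 + 2 + ... + 100) = 2,525,000
carried = state_carry_cost(100, 500, 200)   # 100 * 700 = 70,000
```

At 100 turns the replayed transcript costs about 36 times more input tokens than the carried state under these assumptions, and the gap widens as the loop grows.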
Context window anxiety · premature task closure
A behavioural shift — not quality decay — where a model rushes to finish, abbreviates reasoning, or summarises early as it perceives the limit approaching, even with capacity remaining. Countered by buffer allocation, counter-prompting, and token-budget transparency.
Avoid: conflating with the [[dumb zone]]; the mechanism and the fix are distinct.
Source: context-window-anxiety.md