Reference · Canonical Language

Prompt Engineering — Glossary

The working vocabulary for the Prompt Engineering course — the compliance side of prompting: writing instructions that actually get followed. Once a term lives here, every lesson uses this word for it.

Calibrating the Instruction

System prompt altitude · aliases: prompt altitude, specificity level
How abstract an instruction is — from high-level principle down to case-by-case lookup table. The right altitude tells the agent how to reason, not what to decide, so behaviour generalises to inputs the author never enumerated.
Avoid: "be more specific" / "be more general" — altitude is a calibration, not a direction.
Source: system-prompt-altitude · Lesson 1
Brittleness
The failure mode of an instruction pitched too low — it enumerates cases, works on what the author anticipated, and breaks on everything else. Symmetric opposite of vagueness (no real constraint at all).
Avoid: "too detailed" — the problem is non-generalisation, not detail.
Source: system-prompt-altitude · Lesson 1
Instruction polarity · instruction framing
Whether a rule states what to do (positive) or what to avoid (negative). Positive forms name an execution target and tend to win on compliance; the gap compounds as instruction count grows. The general-prompting default — see guardrails for the coding-agent exception.
Avoid: "tone" — polarity is about target vs. prohibition, not style.
Source: instruction-polarity · Lesson 2
Negative space · negative-space instructions
Defining the boundary rather than the goal — banned phrases, scope exclusions, format exclusions. A good negative-space constraint is binary and verifiable: the agent either produced the banned thing or it didn't.
Avoid: conflating with negative polarity — polarity asks "how do I frame it?", negative space asks "goal or boundary?".
Source: negative-space-instructions · Lesson 2
Greppability
The design criterion for a negative constraint: can compliance be confirmed by a deterministic check (a grep, a diff)? If a rule can't be checked automatically, it probably belongs in positive guidance instead.
Source: negative-space-instructions · Lesson 2

Making Rules Stick

Primacy bias · attention sink
Initial tokens draw disproportionate attention regardless of semantic content. Critical rules placed first claim this high-attention slot.
Source: critical-instruction-repetition · Xiao et al. 2023 · Lesson 3
Recency bias
The latest tokens are freshest in the model's working state and shape the next token directly. Restating a critical rule at the end of a prompt claims this slot — the complement to primacy.
Source: critical-instruction-repetition · Lesson 3
Lost in the middle · U-shaped attention
Attention is strongest at the start and end of the context and weakest in the middle. A critical rule placed once, mid-prompt, sits in the weakest-attention trough. Liu et al. measured a 30%+ accuracy drop when relevant information moved to the middle.
Avoid: "the model ignores stuff" — the bias is positional and predictable.
Source: critical-instruction-repetition · Liu et al. 2023 · Lesson 3
Compliance ceiling · aliases: the mega-prompt, instruction overload
The rule-count threshold above which compliance degrades — first modification errors (rule followed imprecisely), then omission errors (rule skipped entirely). Attention, not agent choice, picks which rules drop; even frontier models hold only ~68% accuracy at 500 instructions.
Avoid: "the model is lazy" — degradation is an attention-capacity limit, not a choice.
Source: instruction-compliance-ceiling · IFScale 2025 · Lesson 4
Encoding neutrality
Reformatting a constraint — structured headers, YAML blocks, formal spec — has no measurable effect on compliance (Cliff's δ < 0.01 across 830+ invocations). Compact headers still earn a ~25–30% full-prompt token saving — worth it for cost, not compliance. When a constraint fails, fix its design, not its format.
Source: constraint-encoding-compliance-gap · Fang et al. 2025 · Lesson 4
Guardrails beat guidance · guardrails over guidance
For coding-agent rule files on SWE-bench, negative constraints are the only individually beneficial rule type; positive directives degrade success when added in isolation. A coding-agent specialization of polarity — not a reversal of the general-prompting default.
Avoid: generalising past coding agents — the evidence is SWE-bench-specific.
Source: guardrails-beat-guidance-coding-agents · Zhang et al. 2026 · Lesson 5
Context priming
Why random rules help coding agents almost as much as expert-curated ones: any domain-relevant text activates the coding-task subspace of the model's representations. Rule presence primes; rule content shapes the search. The two effects stack.
Source: guardrails-beat-guidance-coding-agents · Lesson 5

The Right Vehicle

Rule-driven vs example-driven
Rules generalise (compact, but can be misread); examples anchor (concrete, but can be over-fitted). The choice is a function of which failure you're preventing. The reliable combo: state the rule, then show one example.
Avoid: stacking many near-duplicate examples — that teaches interpolation, not the constraint.
Source: example-driven-vs-rule-driven-instructions · Lesson 6
Hints over code samples
For format and style in a codebase, pointing at existing code ("follow the pattern in src/repos/UserRepo.ts") beats an inline sample. Hints stay current as the code evolves and cost one line instead of thirty.
Avoid: pasting a long inline example when a live file already implements the pattern.
Source: example-driven-vs-rule-driven-instructions · Lesson 6

Organizing the System

Pointer map · aliases: table of contents, AGENTS.md content strategy
An instruction file kept to ~100 lines as an index — what the project is, where conventions live, what to read first — with the knowledge itself in a versioned docs/ directory. Fixes the file's own version of the compliance ceiling: a monolithic file crowds context, dilutes attention, and rots.
Avoid: "encyclopedia" — the file points; it does not contain.
Source: agents-md-as-table-of-contents · Lesson 7
Rule lifecycle metadata · source / applicability / expiry
Three fields on every terminal rule: why it was added (source), when it fires (applicability), and the observable that retires it (expiry). Converts deletion from a judgement call into a closed-form predicate, so the default flips from "keep when uncertain" to "delete when expired."
Source: agents-md-as-table-of-contents · Lesson 7
Layered instruction scopes · directory-level hierarchy
Instruction files concatenated from general to specific — global config, git root, then each directory down to the working directory — so the most specific rule appears last and wins. Priority is positional, exploiting recency bias, not declared with keywords.
Avoid: a flat "if in api/, use X" conditional — placement replaces the condition the model could misjudge.
Source: layered-instruction-scopes · Lesson 8
Specification as prompt
Using an existing formal artifact — a type, schema, test, or API definition — as the instruction instead of a prose re-description. The spec can't be misread the way prose can, and keeping one source of truth means there's nothing to drift.
Avoid: treating a passing spec as sufficient — agents can game a literal test; it's necessary, not sufficient.
Source: specification-as-prompt · Lesson 9

Beyond the Prompt

Instruction fade-out
The progressive loss of an instruction's influence over an extended session — even while it remains present in context. Distinct from compression: the rule survives, but drifts into a low-attention region as history accumulates around it.
Avoid: conflating with compaction — fade-out is an attention effect, not the rule being summarised away.
Source: event-driven-system-reminders · Lesson 10
Event-driven reminder · event detector, guardrail counter
A targeted instruction re-injected when a detector trips on a specific condition — repeated tool failure, budget pressure, a safety violation — rather than on a schedule. Injected as a user message for attention persistence, escalating in severity via a guardrail counter, and additive so a failed detector never breaks the agent.
Source: event-driven-system-reminders · Lesson 10
Hooks vs prompts · enforcement vs advisory
Prompts ask (probabilistic, in-context, deprioritised under pressure); hooks require (deterministic, outside the context, unoverridable at the tool-call boundary). Reach for a hook only when the rule is non-negotiable, binary, and opposed by a training prior — and pair it with CI for the gaps a hook can't reach.
Avoid: treating a hook as everywhere-proof — substitution, intent-blindness, path gaps, and hook-source trust narrow it.
Source: hooks-vs-prompts · Lesson 11
Post-compaction re-read protocol · compaction drift
When a long session compacts, the summary preserves task state but paraphrases instruction-file references, degrading rule fidelity with no error. A targeted re-read of CLAUDE.md/AGENTS.md — manual, or a SessionStart hook with a compact matcher — restores it; a confirmation requirement raises how reliably it lands.
Avoid: conflating with fade-out — fade-out keeps the exact text and loses attention; compaction loses the text's precision.
Source: post-compaction-reread-protocol · Lesson 12

Assembling the System

Concern isolation · XML-sectioned prompt
Scaffolding a large system prompt with named XML/Markdown sections, one concern each, so a rule's scope is bounded, the model can attend selectively, and a section edits without invalidating the cache prefix above it. The within-document version of layered scopes.
Avoid: applying it below ~500 tokens — tag overhead then costs more than the cache hits it buys.
Source: production-system-prompt-architecture · Lesson 13
Cache-aware layering · prefix-stable layout
Ordering a prompt by volatility because caching matches an exact prefix: stable content (date, environment) at the head, runtime-variable parameters (reasoning effort, thinking mode) at the tail, so changing a knob never invalidates the cached body. Skills and tools become pointers — a registry of paths and a static, runtime-masked tool list — to keep the prefix stable.
Source: production-system-prompt-architecture · Lesson 13
Worked reasoning trace · domain-specific system prompt
A concrete example in the system prompt that anchors the decision chain for a real edge case — domain vocabulary, the gated tool sequence, and why the wrong path fails — not just the output's shape. Domain-specific prompts with such traces produced a 54% relative pass-rate gain on τ-Bench with no model change.
Avoid: single-call, low-constraint, thin-data, or high-churn tasks — there's no multi-step chain for the trace to shape.
Source: domain-specific-system-prompts · Lesson 14