Reference · Canonical Language

Prompt Engineering — Glossary

The working vocabulary for the Prompt Engineering course — the compliance side of prompting: writing instructions that actually get followed. Once a term lives here, every lesson uses this word for it.

Calibrating the Instruction

System prompt altitude · aliases: prompt altitude, specificity level: How abstract an instruction is — from high-level principle down to case-by-case lookup table. The right altitude tells the agent how to reason, not what to decide, so behaviour generalises to inputs the author never enumerated.; Avoid: "be more specific" / "be more general" — altitude is a calibration, not a direction.; Source: system-prompt-altitude · Lesson 1
Brittleness: The failure mode of an instruction pitched too low — it enumerates cases, works on what the author anticipated, and breaks on everything else. Symmetric opposite of vagueness (no real constraint at all).; Avoid: "too detailed" — the problem is non-generalisation, not detail.; Source: system-prompt-altitude · Lesson 1
Instruction polarity · instruction framing: Whether a rule states what to do (positive) or what to avoid (negative). Positive forms name an execution target and tend to win on compliance; the gap compounds as instruction count grows. The general-prompting default — see guardrails for the coding-agent exception.; Avoid: "tone" — polarity is about target vs. prohibition, not style.; Source: instruction-polarity · Lesson 2
Negative space · negative-space instructions: Defining the boundary rather than the goal — banned phrases, scope exclusions, format exclusions. A good negative-space constraint is binary and verifiable: the agent either produced the banned thing or it didn't.; Avoid: conflating with negative polarity — polarity asks "how do I frame it?", negative space asks "goal or boundary?".; Source: negative-space-instructions · Lesson 2
Greppability: The design criterion for a negative constraint: can compliance be confirmed by a deterministic check (a grep, a diff)? If a rule can't be checked automatically, it probably belongs in positive guidance instead.; Source: negative-space-instructions · Lesson 2

Making Rules Stick

Primacy bias · attention sink: Initial tokens draw disproportionate attention regardless of semantic content. Critical rules placed first claim this high-attention slot.; Source: critical-instruction-repetition · Xiao et al. 2023 · Lesson 3
Recency bias: The latest tokens are freshest in the model's working state and shape the next token directly. Restating a critical rule at the end of a prompt claims this slot — the complement to primacy.; Source: critical-instruction-repetition · Lesson 3
Lost in the middle · U-shaped attention: Attention is strongest at the start and end of the context and weakest in the middle. A critical rule placed once, mid-prompt, sits in the weakest-attention trough. Liu et al. measured a 30%+ accuracy drop when relevant information moved to the middle.; Avoid: "the model ignores stuff" — the bias is positional and predictable.; Source: critical-instruction-repetition · Liu et al. 2023 · Lesson 3
Compliance ceiling · aliases: the mega-prompt, instruction overload: The rule-count threshold above which compliance degrades — first modification errors (rule followed imprecisely), then omission errors (rule skipped entirely). Attention, not agent choice, picks which rules drop; even frontier models hold only ~68% accuracy at 500 instructions.; Avoid: "the model is lazy" — degradation is an attention-capacity limit, not a choice.; Source: instruction-compliance-ceiling · IFScale 2025 · Lesson 4
Encoding neutrality: Reformatting a constraint — structured headers, YAML blocks, formal spec — has no measurable effect on compliance (Cliff's δ < 0.01 across 830+ invocations). Compact headers still earn a ~25–30% full-prompt token saving — worth it for cost, not compliance. When a constraint fails, fix its design, not its format.; Source: constraint-encoding-compliance-gap · Fang et al. 2025 · Lesson 4
Guardrails beat guidance · guardrails over guidance: For coding-agent rule files on SWE-bench, negative constraints are the only individually beneficial rule type; positive directives degrade success when added in isolation. A coding-agent specialization of polarity — not a reversal of the general-prompting default.; Avoid: generalising past coding agents — the evidence is SWE-bench-specific.; Source: guardrails-beat-guidance-coding-agents · Zhang et al. 2026 · Lesson 5
Context priming: Why random rules help coding agents almost as much as expert-curated ones: any domain-relevant text activates the coding-task subspace of the model's representations. Rule presence primes; rule content shapes the search. The two effects stack.; Source: guardrails-beat-guidance-coding-agents · Lesson 5

The Right Vehicle

Rule-driven vs example-driven: Rules generalise (compact, but can be misread); examples anchor (concrete, but can be over-fitted). The choice is a function of which failure you're preventing. The reliable combo: state the rule, then show one example.; Avoid: stacking many near-duplicate examples — that teaches interpolation, not the constraint.; Source: example-driven-vs-rule-driven-instructions · Lesson 6
Hints over code samples: For format and style in a codebase, pointing at existing code ("follow the pattern in src/repos/UserRepo.ts") beats an inline sample. Hints stay current as the code evolves and cost one line instead of thirty.; Avoid: pasting a long inline example when a live file already implements the pattern.; Source: example-driven-vs-rule-driven-instructions · Lesson 6

Organizing the System

Pointer map · aliases: table of contents, AGENTS.md content strategy: An instruction file kept to ~100 lines as an index — what the project is, where conventions live, what to read first — with the knowledge itself in a versioned docs/ directory. Fixes the file's own version of the compliance ceiling: a monolithic file crowds context, dilutes attention, and rots.; Avoid: "encyclopedia" — the file points; it does not contain.; Source: agents-md-as-table-of-contents · Lesson 7
Rule lifecycle metadata · source / applicability / expiry: Three fields on every terminal rule: why it was added (source), when it fires (applicability), and the observable that retires it (expiry). Converts deletion from a judgement call into a closed-form predicate, so the default flips from "keep when uncertain" to "delete when expired."; Source: agents-md-as-table-of-contents · Lesson 7
Layered instruction scopes · directory-level hierarchy: Instruction files concatenated from general to specific — global config, git root, then each directory down to the working directory — so the most specific rule appears last and wins. Priority is positional, exploiting recency bias, not declared with keywords.; Avoid: a flat "if in api/, use X" conditional — placement replaces the condition the model could misjudge.; Source: layered-instruction-scopes · Lesson 8
Specification as prompt: Using an existing formal artifact — a type, schema, test, or API definition — as the instruction instead of a prose re-description. The spec can't be misread the way prose can, and keeping one source of truth means there's nothing to drift.; Avoid: treating a passing spec as sufficient — agents can game a literal test; it's necessary, not sufficient.; Source: specification-as-prompt · Lesson 9

Beyond the Prompt

Instruction fade-out: The progressive loss of an instruction's influence over an extended session — even while it remains present in context. Distinct from compression: the rule survives, but drifts into a low-attention region as history accumulates around it.; Avoid: conflating with compaction — fade-out is an attention effect, not the rule being summarised away.; Source: event-driven-system-reminders · Lesson 10
Event-driven reminder · event detector, guardrail counter: A targeted instruction re-injected when a detector trips on a specific condition — repeated tool failure, budget pressure, a safety violation — rather than on a schedule. Injected as a user message for attention persistence, escalating in severity via a guardrail counter, and additive so a failed detector never breaks the agent.; Source: event-driven-system-reminders · Lesson 10
Hooks vs prompts · enforcement vs advisory: Prompts ask (probabilistic, in-context, deprioritised under pressure); hooks require (deterministic, outside the context, unoverridable at the tool-call boundary). Reach for a hook only when the rule is non-negotiable, binary, and opposed by a training prior — and pair it with CI for the gaps a hook can't reach.; Avoid: treating a hook as everywhere-proof — substitution, intent-blindness, path gaps, and hook-source trust narrow it.; Source: hooks-vs-prompts · Lesson 11
Post-compaction re-read protocol · compaction drift: When a long session compacts, the summary preserves task state but paraphrases instruction-file references, degrading rule fidelity with no error. A targeted re-read of CLAUDE.md/AGENTS.md — manual, or a SessionStart hook with a compact matcher — restores it; a confirmation requirement raises how reliably it lands.; Avoid: conflating with fade-out — fade-out keeps the exact text and loses attention; compaction loses the text's precision.; Source: post-compaction-reread-protocol · Lesson 12

Assembling the System

Concern isolation · XML-sectioned prompt: Scaffolding a large system prompt with named XML/Markdown sections, one concern each, so a rule's scope is bounded, the model can attend selectively, and a section edits without invalidating the cache prefix above it. The within-document version of layered scopes.; Avoid: applying it below ~500 tokens — tag overhead then costs more than the cache hits it buys.; Source: production-system-prompt-architecture · Lesson 13
Cache-aware layering · prefix-stable layout: Ordering a prompt by volatility because caching matches an exact prefix: stable content (date, environment) at the head, runtime-variable parameters (reasoning effort, thinking mode) at the tail, so changing a knob never invalidates the cached body. Skills and tools become pointers — a registry of paths and a static, runtime-masked tool list — to keep the prefix stable.; Source: production-system-prompt-architecture · Lesson 13
Worked reasoning trace · domain-specific system prompt: A concrete example in the system prompt that anchors the decision chain for a real edge case — domain vocabulary, the gated tool sequence, and why the wrong path fails — not just the output's shape. Domain-specific prompts with such traces produced a 54% relative pass-rate gain on τ-Bench with no model change.; Avoid: single-call, low-constraint, thin-data, or high-churn tasks — there's no multi-step chain for the trace to shape.; Source: domain-specific-system-prompts · Lesson 14