Capstone

Prompt Engineering · ~8 min

The Compliance Stack

Fourteen levers, one decision. Given a rule that isn't being followed, this is the order you reach for them — and when to stop reaching for prompts entirely.

Why this, for you: the techniques in this course aren't a menu — they're a system that interacts. Altitude sets what kind of rule you write; polarity and negative space set how you frame it; position and the ceiling govern whether it survives attention; rules-vs-examples picks the vehicle; the file's shape, its scope, and the specs you point at decide how it's organised; reminders, the re-read protocol, and hooks carry it past the prompt's reach; and architecture and worked traces decide how the whole document is assembled. This capstone wires them into one decision procedure you can run on any failing instruction.

Every lesson answered a different question about the same rule. Put in order, they form a diagnostic: when a constraint isn't being followed, walk the stack top to bottom and stop at the first lever that fits.

1 The decision table

Start from the symptom. Each row points at the lesson that owns the fix.

| Symptom | Reach for | From |
| --- | --- | --- |
| Rule breaks on cases you didn't list | Raise the altitude: write the principle, not the lookup table | L1 |
| Rule is vague; any output "satisfies" it | Lower altitude / add a greppable boundary | L1–2 |
| Negative rule ignored under a big rule set | Reframe positive: name the target behaviour | L2 |
| "Concise / clean / good" can't be checked | Add a negative-space constraint (greppable) | L2 |
| One critical rule keeps getting dropped | Top-and-tail it: state first and last | L3 |
| Lots of rules, compliance falling overall | You're over the ceiling: cut, scope, modularize | L4 |
| Coding agent ignores "follow style" | Rewrite as a negative guardrail | L5 |
| Format/schema produced wrong | Show one example, or point at existing code | L6 |
| Instruction file has grown unmanageable | Index, don't embed: pointer map + tagged rules | L7 |
| One file is a tangle of "if in X…" rules | Layer by scope: most specific wins by position | L8 |
| Prose re-describes a type / schema / test | Point at the spec; keep one source of truth | L9 |
| Rule obeyed early, dropped late in a run | Fade-out: re-inject on the relevant event | L10 |
| Must never fail | Stop prompting: move it to a hook / CI gate | L11 |
| Rules drift after a long session compacts | Re-read the instruction file and confirm | L12 |
| One file has outgrown a single concern | Split into named XML sections, cache-ordered | L13 |
| Reasoning looks fine but misses real cases | Add a worked reasoning trace from production | L14 |
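The "walk the stack top to bottom and stop at the first lever that fits" procedure can be sketched as an ordered lookup. This is a minimal illustration, not part of the course material: the `Lever` type, the `diagnose` function, and the shortened symptom strings are hypothetical names, and only four of the sixteen rows are included.

```python
from typing import NamedTuple, Optional


class Lever(NamedTuple):
    symptom: str  # what you observe
    fix: str      # the lever to reach for
    lesson: str   # the lesson that owns the fix


# Hypothetical, abbreviated copy of the decision table, kept in stack order.
STACK = [
    Lever("breaks on unlisted cases", "raise the altitude", "L1"),
    Lever("critical rule keeps getting dropped", "top-and-tail it", "L3"),
    Lever("obeyed early, dropped late", "re-inject on the relevant event", "L10"),
    Lever("must never fail", "move it to a hook / CI gate", "L11"),
]


def diagnose(observed: set[str]) -> Optional[Lever]:
    """Walk the stack top to bottom; return the first lever whose symptom was observed."""
    for lever in STACK:
        if lever.symptom in observed:
            return lever
    return None
```

Calling `diagnose({"obeyed early, dropped late", "breaks on unlisted cases"})` returns the L1 row, because the walk stops at the first match in stack order rather than collecting every match.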

2 Mixed review: the load-bearing distinctions

Five places where two lessons look like they contradict, and don't:

Polarity vs. guardrails. General prompting favours positive (L2). Coding agents on SWE-bench favour negative guardrails (L5). Not a reversal — a specialization: negative constraints remove infeasible branches, while positive directives compete with the model's training priors on a coding task.
Negative polarity vs. negative space. Polarity asks "how do I frame this rule?" Negative space asks "do I state the goal or the boundary?" You can write a positively-framed rule that still defines a boundary — they're different axes (L2).
Encoding vs. design. Reformatting a constraint into tidy headers does nothing for compliance (L4). The levers that do move it are altitude, polarity, position, count, example-anchoring, the file's structure and scope — the design levers, not the formatting.
Reminder vs. re-read vs. hook. All three fire outside the static prompt. A reminder re-asks an in-context rule that faded (L10). A re-read restores a rule whose text was paraphrased by compaction (L12). The fixes differ because fade-out keeps the exact words in context while compaction loses them. A hook requires what a prompt can only ask, and it does so deterministically (L11). Reach for a hook only when the rule is non-negotiable, binary, and opposed by the model's priors.
Format example vs. reasoning trace. One example anchors the output's shape (L6); a worked trace anchors the decision chain behind it (L14). Same mechanism — an exemplar steers the path — aimed at different targets. And neither is encoding (L4): a trace lifts compliance because it changes what the model reasons, not how the prompt is formatted.
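To make the negative-space idea concrete: a goal such as "be concise" cannot be checked, but a boundary can be grepped. The sketch below is one possible check under stated assumptions. The word cap, the banned phrases, and the `violations` function are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical boundary standing in for the uncheckable goal "be concise".
MAX_WORDS = 120
BANNED = re.compile(r"\b(in conclusion|it is worth noting)\b", re.IGNORECASE)


def violations(output: str) -> list[str]:
    """Return the boundary violations found in a model output (empty list means pass)."""
    found = []
    word_count = len(output.split())
    if word_count > MAX_WORDS:
        found.append(f"too long: {word_count} words > {MAX_WORDS}")
    for match in BANNED.finditer(output):
        found.append(f"banned phrase: {match.group(0).lower()}")
    return found
```

A rule written this way either passes or fails on inspection, which is what lets you tell whether a rewrite of the instruction actually moved compliance.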

3 Where prompting ends

The whole stack is probabilistic. Every lever raises the odds of compliance; none guarantees it. The most important judgement in the course is knowing when the odds aren't enough.

# probabilistic: an instruction ASKS
"Never commit directly to main."

# deterministic: a hook REQUIRES
pre-commit: reject commit to protected branch → exit 1

If forgetting a rule causes a security, correctness, or safety failure, it should not live in the prompt at all. Instructions are for what tolerates the occasional miss; hooks, linters, and CI gates are for what doesn't.
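As a rough sketch of what the deterministic side might look like, the script below rejects commits made on a protected branch. It assumes a Git repository and a Python interpreter available to the hook. The branch names and the function names are illustrative, not taken from the course.

```python
import subprocess
import sys

# Assumption: illustrative branch names; substitute your repository's protected set.
PROTECTED = frozenset({"main", "master"})


def is_blocked(branch: str, protected: frozenset = PROTECTED) -> bool:
    """True when a commit on this branch should be rejected."""
    return branch in protected


def current_branch() -> str:
    """Ask git for the checked-out branch name ("HEAD" when detached)."""
    result = subprocess.run(
        ["git", "rev-parse", "--abbrev-ref", "HEAD"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def main() -> int:
    branch = current_branch()
    if is_blocked(branch):
        print(f"pre-commit: direct commits to '{branch}' are rejected", file=sys.stderr)
        return 1  # a non-zero exit status makes git abort the commit
    return 0
```

To use it as a hook, the file would be saved as `.git/hooks/pre-commit`, made executable, and end with `sys.exit(main())`. Unlike the instruction, this check does not depend on the model remembering anything. Note that a local hook can be skipped with `git commit --no-verify`, so a server-side branch protection rule or CI gate is the stronger form.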

The assembled prompt is small

Run the whole stack and the output is the same shape every time: a short instruction set, at the right altitude per section, positively framed where it helps, top-and-tailed for the one rule that matters, anchored by a single example where format is precise, organised as a pointer map scoped to where work happens, well under the ceiling — refreshed by a reminder when a long run lets it fade, and backed by hooks for anything that must never fail.
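One way to picture that shape is a small assembler that top-and-tails the critical rule, refuses to exceed a rule ceiling, and carries at most one example. This is a sketch only: `assemble_prompt`, its parameters, and the default ceiling of 10 are hypothetical, and the course does not prescribe a specific number here.

```python
def assemble_prompt(
    critical_rule: str,
    rules: list[str],
    example: str = "",
    ceiling: int = 10,  # hypothetical limit; tune to your own measurements
) -> str:
    """Build a short instruction set with the critical rule stated first and last."""
    if len(rules) > ceiling:
        raise ValueError(
            f"{len(rules)} rules exceeds the ceiling of {ceiling}: cut, scope, or modularize"
        )
    parts = [critical_rule]                      # head: state the critical rule first
    parts.extend(f"- {rule}" for rule in rules)  # the remaining, lower-stakes rules
    if example:
        parts.append(f"Example:\n{example}")     # a single example anchors the format
    parts.append(critical_rule)                  # tail: restate the critical rule last
    return "\n".join(parts)
```

The assembler only covers what lives in the static prompt. Reminders and hooks sit outside it by design.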

↪ Your win: a diagnostic, not a checklist

Mixed review — recall across the whole course

Question 1: A rule that breaks on every case the author didn't list needs you to…

Question 2: The positive-default (general) vs. negative-guardrail (coding) split is best called…

Question 3: When two layered instruction files conflict, the winner is the one that is…

Question 4: A constraint whose failure is unacceptable belongs…

Question 5 · spaced recall from Lesson 14: A worked reasoning trace beats a bare format example when the goal is to shape the agent's…

Ask me anything. Want to run the decision table against your own system prompt or AGENTS.md, or pull the whole course into a one-page checklist? That's the course — fourteen levers and the judgement to know which one a failing rule actually needs.