Part 2 · Making Rules Stick

Prompt Engineering · ~7 min

The Ceiling

Every instruction set has a point past which adding rules subtracts compliance. The fix isn't better writing — it's fewer, better-placed rules.

Why this, for you: the instinct after every failure is to append one more rule. That instinct is the bug. Beyond a certain count, rules don't stack — they compete, and the model silently drops some. Knowing where the ceiling sits is what stops your instruction file from quietly rotting.

Stating a rule does not guarantee it is followed. Instruction sets have a compliance ceiling: below it, agents follow rules with reasonable precision; above it, compliance falls apart in a predictable order — and no amount of rewording brings it back.

1 How compliance breaks down

Transformer models process instructions and task context through the same attention mechanism, so every rule competes for attention weight at every output token. As the set grows, two failure modes appear in sequence:

Attention distribution — not agent choice — picks which rules get dropped. Even frontier models reach only 68% accuracy at 500 instructions (IFScale, 2025). Past the ceiling, more rules make no difference — the set has exceeded reliable capacity.

2 The mega-prompt

The canonical anti-pattern is a monolithic file — a 1500-line AGENTS.md covering coding standards, Git conventions, deployment, and style. Every incident appends another rule. The file grows; compliance shrinks. Rules silently conflict, and the agent resolves them unpredictably. Position compounds it: from Lesson 3, instructions near the top get more reliable attention regardless of stated importance — so place critical rules first, and don't rely on the agent finding a key rule at line 150.

3 The response is structural

The ceiling is a design constraint, not a writing problem. You don't write the rules better — you architect the set so each context stays below the ceiling.

LayerWhat belongs there
AGENTS.mdProject identity, stack, 5–10 conventions that apply to every task
SkillsTask-specific procedures and templates — loaded on demand
HooksAnything that must be enforced deterministically

Modularize task-specific rules into skills loaded only when relevant — a docs task doesn't need Git workflow rules. Scope AGENTS.md to conventions that apply to every task. Move enforcement to hooks for anything that must never fail — a linter, pre-commit hook, or CI gate, not an instruction subject to attention decay. And audit the total count: if you can't read your instruction file in under two minutes, it's too long.

Encoding form is the wrong lever

When a rule gets missed, engineers reach for formatting — restructure bullets, add headers, switch to YAML blocks. The evidence says don't. Across 11 models and 830+ invocations, encoding form had no measurable effect on compliance (Fang et al., 2025; Cliff's δ < 0.01). Compact headers do cut the constraint portion by ~71% in tokens — worth it for cost, not compliance. When a constraint fails, fix its design: simplify, cut the count, drop counter-intuitive requirements, or enforce mechanically.

When decomposition backfires

Layering has its own failure modes. On-demand skills are invisible to teammates who don't know they exist (the discovery gap); loading the wrong skill removes its rules entirely — a silent failure; and splitting tightly coupled rules across files creates boundary conflicts. For a solo dev, a 50-rule AGENTS.md that's already below the ceiling needs no decomposition at all. Reducing total rule count is often the simpler fix.

↪ Your win: stay below the ceiling by design

Retrieval practice — recall, don't peek

Question 1As an instruction set grows past the ceiling, the first failures are…

Question 2What decides which rules get dropped above the ceiling?

Question 3The architectural fix for a bloated instruction file is to…

Question 4Reformatting constraints into compact headers mainly…

Question 5 · spaced recall from Lesson 3To exploit both primacy and recency, place a critical rule…

Ask me anything. Want help splitting an overgrown AGENTS.md into layers, or a count of where your current set sits against the ceiling? Next in Part 2: Guardrails Beat Guidance — the surprising rule polarity that actually helps coding agents.
✎ Feedback