Configuration Smells in AGENTS.md

Your AGENTS.md loads at every session start — every byte trades against the task budget. Six named defects are quietly taxing 91 of 100 popular repos.

Why this, for you: Part 1 was the context you load per-task; this is the context that loads every task, forever. A named, greppable checklist turns "our CLAUDE.md is messy" into six audit items with known fixes. Run it on your own file today.

AGENTS.md and CLAUDE.md are always-on context. dos Santos et al. (June 2026) ran the first empirical mining study across 100 popular open-source repos and found 91 carried at least one of six recurring defects — three of which frequently co-occur.

1 The six smells

Lint Leakage (62%): rules a linter/formatter already enforces
Context Bloat (42%): bigger than the agent reliably honours (≥200 lines)
Skill Leakage (35%): task-specific instructions in the always-loaded file
Conflicting Instructions (28%): contradictory rules (57% detector precision; flag, don't auto-fix)
Init Fossilization (24%): /init output, never modified since (single-commit history)
Blind References (16%): bare paths with no pitch on when/why to read

Real examples from the paper: javascript-obfuscator's CLAUDE.md ran 1,477 lines across 27 sections; inkline told the agent components live in two different directories; 24 actively-developed projects had zero edits to their AGENTS.md.

Five of the six are signal-to-token defects — they cut the useful fraction of the always-loaded context. The sixth, Conflicting Instructions, cuts its resolvability. Every byte is loaded on every turn.
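The smell definitions above can be turned into rough first-pass checks. The sketch below is a minimal illustration, not the paper's detector: the 200-line threshold and the single-commit rule come from the definitions above, while the function name `audit`, the keyword list `LINT_HINTS`, and the `BARE_PATH` pattern are hypothetical stand-ins chosen for this example. The commit count is passed in as a plain number so the sketch stays independent of any particular git tooling.

```python
import re

# From the smell table: Context Bloat is flagged at >= 200 lines.
BLOAT_LINE_THRESHOLD = 200

# Hypothetical keyword list for style rules a linter would normally own.
LINT_HINTS = re.compile(
    r"\b(indent|semicolon|trailing whitespace|line length|quotes)\b",
    re.IGNORECASE,
)

# Hypothetical pattern: a bullet whose entire content is a file path,
# with no text after it saying when or why to read the file.
BARE_PATH = re.compile(r"^\s*[-*]\s*`?[\w./-]+\.\w+`?\s*$")


def audit(text: str, commit_count: int) -> dict:
    """Return rough flags for four of the six smells."""
    lines = text.splitlines()
    return {
        "context_bloat": len(lines) >= BLOAT_LINE_THRESHOLD,
        # 1-based line numbers of lines that look like style rules
        "lint_leakage": [n for n, line in enumerate(lines, 1)
                         if LINT_HINTS.search(line)],
        # 1-based line numbers of bare path bullets
        "blind_references": [n for n, line in enumerate(lines, 1)
                             if BARE_PATH.match(line)],
        # single-commit history: the file was never edited after creation
        "init_fossilization": commit_count == 1,
    }


sample = "\n".join([
    "# Project guide",
    "- Use 4-space indent everywhere",
    "- docs/architecture.md",
    "- docs/testing.md: read before touching CI fixtures",
])
report = audit(sample, commit_count=1)
print(report)
```

Skill Leakage and Conflicting Instructions are left out on purpose: both need judgment about meaning, which a keyword or regex check like this one cannot supply.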

2 Why it matters

Independent benchmark work converges on the same mechanism: Gloaguen et al. measured −3% task success and +20% inference cost for LLM-generated context files, and only +4% success at +19% cost for human-written ones. Context files cost without proportional gains. Naming each defect makes the fix concrete — extract style to the linter, split rare sections into on-demand skills, resolve contradictions, refresh fossils, pitch every reference.
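One way to read those benchmark figures is as a success-per-cost ratio. The sketch below assumes the quoted percentages are relative changes against a no-context-file baseline, which the text above does not state; the ratio itself and the name `success_per_cost` are illustrative choices for this example, not a metric reported by Gloaguen et al.

```python
def success_per_cost(success_delta: float, cost_delta: float) -> float:
    """Relative task success divided by relative inference cost.

    1.0 means success and cost moved together; below 1.0 means cost
    grew faster than success. Deltas are fractions (0.20 for +20%).
    """
    return (1 + success_delta) / (1 + cost_delta)


# Figures quoted above, read as relative changes (an assumption).
llm_generated = success_per_cost(-0.03, 0.20)
human_written = success_per_cost(0.04, 0.19)

print(round(llm_generated, 3), round(human_written, 3))
```

Under that reading both ratios fall below 1.0, which is the "cost without proportional gains" point in numeric form.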

3 The fix

Lint Leakage → move style rules to ruff / black / eslint
Context Bloat → cut to a pointer-map; product docs to docs/
Skill Leakage → split rare instructions into on-demand skills
Conflicting Instructions → resolve to one source of truth
Init Fossilization → update content to match the current code
Blind References → add a one-line pitch: when and why to read it
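The smell-to-fix mapping above is small enough to keep as a lookup table next to whatever audit you run. The sketch below is one possible shape, assuming the smells have already been flagged by some other means; the names `Fix`, `FIXES`, and `todo_list` are hypothetical. The review flag on Conflicting Instructions reflects the 57% detector precision noted earlier.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fix:
    action: str
    needs_human_review: bool = False


# Fix text copied from the mapping above.
FIXES = {
    "Lint Leakage": Fix("move style rules to ruff / black / eslint"),
    "Context Bloat": Fix("cut to a pointer-map; product docs to docs/"),
    "Skill Leakage": Fix("split rare instructions into on-demand skills"),
    # Detector precision is 57%, so a flag here is a prompt to look,
    # not a trigger for an automatic rewrite.
    "Conflicting Instructions": Fix("resolve to one source of truth",
                                    needs_human_review=True),
    "Init Fossilization": Fix("update content to match the current code"),
    "Blind References": Fix("add a one-line pitch: when and why to read it"),
}


def todo_list(flagged: list[str]) -> list[str]:
    """Turn flagged smell names into audit to-do items."""
    items = []
    for smell in flagged:
        fix = FIXES[smell]
        prefix = "[review first] " if fix.needs_human_review else ""
        items.append(f"{prefix}{smell}: {fix.action}")
    return items


for item in todo_list(["Lint Leakage", "Conflicting Instructions"]):
    print(item)
```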

When the catalog earns less

The catalog is calibrated against active multi-file repos. It pays off less in tiny utilities already under 200 lines, projects already on a strict pointer-map regime, and single-author prototypes where the developer holds the whole project in their head. And treat the 57%-precision Conflicting-Instructions detector as a flag for human review, not an auto-fix trigger.

↪ Your win: a greppable audit, not a vibe

91 of 100 repos carry a smell — this is the modal state, not a fringe failure.
Lint Leakage (62%) is most common; Bloat, Skill Leakage, and Conflicts co-occur.
Always-on context costs every turn — benchmarks show cost without proportional gains.
Each smell has a known fix — linter, skill split, resolve, refresh, pitch.
Conflicting Instructions is a human-review flag, not an auto-fix.

Retrieval practice — recall, don't peek

Question 1: The most prevalent smell, Lint Leakage, is fixed by…

Question 2: Context Bloat is flagged at roughly…

Question 3: Skill Leakage is fixed by moving instructions into…

Question 4: The Conflicting-Instructions detector should be treated as…

Question 5 · spaced recall from Lesson 07: The reliable way to verify a cited claim is to…

Next in Part 3: Single-Layer Injection Defence, when one safeguard isn't a security boundary.