Just-in-time retrieval pulls context when the agent needs it. Priming is the opposite move: deliberately loading the right files before you ask. Both are valid; the trick is knowing which one a given task calls for.
Why this, for you: a cold prompt forces the agent to guess at your patterns, conventions, and
architecture — and it guesses generic. Priming is the cheapest way to make a coding agent write code that
fits your codebase instead of framework boilerplate you then rewrite. It's the deliberate-preload
counterpart to the JIT pull you learned earlier.
Agents start with no knowledge of your project. They work with whatever is in the window
at the moment they generate. "Add authentication to the API," cold, makes the agent guess. Have it read the
middleware, the auth config, and the user model first, and the same request produces output that fits.
1 Why order, not just content, matters
Priming isn't a bulk transfer — it's a loading sequence. Transformers attend more reliably to the start and
end of the window than to the middle (the lost-in-the-middle effect from Lesson 02), so the order you load in shapes
what the model actually leans on.
Load broad → narrow: architecture overview (AGENTS.md, README, top-level
structure), then the relevant module, then the specific files to modify. Building understanding incrementally beats
dumping everything at once — and it keeps the most critical framing at the attention-favoured start of context rather
than buried in detail.
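The broad → narrow order can be sketched as a small sort. This is a minimal illustration, not any agent's real API: the `Tier` labels, `PrimingFile`, and `loadOrder` are hypothetical names, and which tier a file belongs to is your call.

```typescript
// Hypothetical tiers mirroring the broad → narrow order described above.
type Tier = "architecture" | "module" | "target";

interface PrimingFile {
  path: string;
  tier: Tier;
}

// Lower rank loads earlier, so the broadest framing sits at the start of context.
const TIER_RANK: Record<Tier, number> = { architecture: 0, module: 1, target: 2 };

function loadOrder(files: PrimingFile[]): string[] {
  return files
    // Remember each file's original position so ties keep the caller's order.
    .map((file, index) => ({ file, index }))
    .sort(
      (a, b) =>
        TIER_RANK[a.file.tier] - TIER_RANK[b.file.tier] || a.index - b.index
    )
    .map((entry) => entry.file.path);
}
```

Feeding the returned paths to the agent in that order keeps AGENTS.md and README ahead of the module and target files, whatever order you listed them in.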
2 Four priming moves
The mechanics are simple and stack well:
Read before write — have the agent read the files it'll touch and the adjacent ones before changing anything.
Progressive loading — architecture, then subsystem, then target file; broad to narrow.
Explore before implement — a read-only exploration phase before switching to edit mode.
Use plan mode — force a plan step so the agent surfaces its understanding; you correct misreads before they cost a rewrite.
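If you drive an agent from your own harness, "explore before implement" and "plan mode" amount to a phase gate. The sketch below is an assumption-laden illustration: the phase names, tool names, and both functions are hypothetical and do not correspond to any particular agent's settings.

```typescript
// Hypothetical phases and tool names, for illustration only.
type AgentPhase = "explore" | "plan" | "implement";
type AgentTool = "read" | "search" | "write";

// Exploration and planning are read-only; writes unlock only in implement.
const ALLOWED_TOOLS: Record<AgentPhase, AgentTool[]> = {
  explore: ["read", "search"],
  plan: ["read", "search"],
  implement: ["read", "search", "write"],
};

function canUse(phase: AgentPhase, tool: AgentTool): boolean {
  return ALLOWED_TOOLS[phase].indexOf(tool) !== -1;
}

// The session stays in plan until a human approves the stated plan,
// which is where misreads get corrected before any edit happens.
function nextPhase(phase: AgentPhase, planApproved: boolean): AgentPhase {
  if (phase === "explore") return "plan";
  if (phase === "plan") return planApproved ? "implement" : "plan";
  return "implement";
}
```

The design choice is that approval, not elapsed turns, moves the session forward: an unapproved plan keeps the agent read-only.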
A primed session reads its way in, then the ask can be tight:
# broad → narrow, before any edits
cat AGENTS.md README.md
cat src/middleware/auth.ts src/routes/auth/login.ts
cat src/models/user.ts src/config/jwt.ts
# now the prompt can assume the patterns are in context
Add a POST /auth/refresh endpoint. Follow login.ts.
Use the refreshToken field on User; sign with jwtConfig.secret.
Cold, the same agent falls back to generic Express boilerplate that needs rework to match the real middleware signature,
and it likely misses the refreshToken field entirely.
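When you call a model through an API instead of an interactive agent, the same read-then-ask shape can be assembled by hand. This is a rough sketch under stated assumptions: `LoadedFile`, `buildPrimedPrompt`, and the `--- path ---` delimiter are all invented for illustration, and the file contents are assumed to be already read and already in broad → narrow order.

```typescript
interface LoadedFile {
  path: string;
  content: string;
}

// Emits each file as a labelled section in the order given, then the ask last,
// so the instruction sits at the end of the window rather than in the middle.
function buildPrimedPrompt(files: LoadedFile[], ask: string): string {
  const sections = files.map(
    (file) => "--- " + file.path + " ---\n" + file.content.trim()
  );
  return sections.concat(["--- task ---\n" + ask.trim()]).join("\n\n");
}
```

Because the context carries the patterns, the ask itself can stay as short as the two-line request shown above.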
3 Why it works — and where it backfires
The mechanism is the same one behind few-shot prompting: in-context examples shift the output distribution without
any weight update. Your actual middleware signature and naming conventions, sitting in context, make project-specific
outputs more probable and boilerplate less probable — repository-level conditioning measurably improves output fit.
Priming is not "load everything"
Push it too far and it inverts:
Window saturation: pre-loading large files pushes your instructions toward the attention-poor middle. Trim or summarise first.
Low-precision context: loosely related files add noise that competes with signal.
Short self-contained tasks: a pure utility or a format conversion gains nothing; priming just adds latency and token cost.
Stale context: files that don't reflect a post-refactor codebase anchor the agent on the wrong patterns.
Prime selectively, with current, high-precision files.
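Those four failure modes translate into a selection filter. The sketch below is one possible heuristic, not a recommended standard: the relevance score, the token estimates, and every name in it (`PrimingCandidate`, `PrimingOptions`, `selectForPriming`) are hypothetical, and "stale" is approximated as "the copy you hold was read before the file last changed".

```typescript
interface PrimingCandidate {
  path: string;
  tokens: number;     // estimated size in tokens (however you choose to count)
  relevance: number;  // 0 to 1, assumed to come from your own scoring
  readAt: number;     // epoch ms when this copy of the file was captured
  modifiedAt: number; // epoch ms when the file last changed on disk
}

interface PrimingOptions {
  tokenBudget: number;          // guards against window saturation
  minRelevance: number;         // guards against low-precision context
  taskIsSelfContained: boolean; // short, codebase-independent tasks skip priming
}

function selectForPriming(
  candidates: PrimingCandidate[],
  options: PrimingOptions
): PrimingCandidate[] {
  if (options.taskIsSelfContained) return [];

  // Drop stale copies and loosely related files.
  const usable = candidates.filter(
    (c) => c.readAt >= c.modifiedAt && c.relevance >= options.minRelevance
  );

  // Greedy fill: most relevant first, skipping anything that would bust the budget.
  const ranked = usable.slice().sort((a, b) => b.relevance - a.relevance);
  const picked: PrimingCandidate[] = [];
  let used = 0;
  ranked.forEach((candidate) => {
    if (used + candidate.tokens <= options.tokenBudget) {
      picked.push(candidate);
      used += candidate.tokens;
    }
  });
  return picked;
}
```

The result is ranked by relevance, not by load order; an oversized file is skipped outright here, where in practice you might summarise it instead.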
↪ Your win: read your way in before you ask
Read before write — load the target files and their neighbours first.
Broad → narrow — architecture, then module, then file; keep framing at the front.
Plan mode for verification — make the agent state its understanding; correct it for free.
Prime selectively — skip it for short, codebase-independent tasks.
Keep primed files current — stale context anchors on patterns that no longer exist.
Retrieval practice — recall, don't peek
Question 1: Context priming differs from just-in-time retrieval in that it…
Question 2: The recommended loading order for priming is…
Question 3: Priming makes project-specific output more probable because in-context examples…
Question 4: Priming backfires most clearly when you…
Question 5 · spaced recall from Lesson 20: You should run context diagnostics…
Next: Remember, Don't Re-Read (carrying typed state across a long loop so you stop re-billing the whole transcript every turn).