Part 4 · Integrity & Operations

Context Engineering · ~6 min

Prime the Pump

Just-in-time retrieval pulls context when the agent needs it. Priming is the opposite move: deliberately loading the right files before you ask. Both have their place; the trick is knowing which one a given task calls for.

Why this, for you: a cold prompt forces the agent to guess at your patterns, conventions, and architecture, and its guesses are generic. Priming is the cheapest way to make a coding agent write code that fits your codebase instead of framework boilerplate you then rewrite. It's the deliberate-preload counterpart to the JIT pull you learned earlier.

Agents don't start with any knowledge of your project. They work with whatever is in the window at the moment they generate. Asked cold to "Add authentication to the API," the agent has to guess. Have it read the middleware, the auth config, and the user model first, and the same request produces output that fits.

1 Why order, not just content, matters

Priming isn't a bulk transfer — it's a loading sequence. Transformers attend more reliably to the start and end of the window than to the middle (the lost-in-the-middle effect from Lesson 02), so the order you load in shapes what the model actually leans on.

Load broad → narrow: architecture overview (AGENTS.md, README, top-level structure), then the relevant module, then the specific files to modify. Building understanding incrementally beats dumping everything at once — and it keeps the most critical framing at the attention-favoured start of context rather than buried in detail.
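The broad → narrow order can be sketched as a small sort. This is a minimal illustration, not a prescribed tool: the `Tier` labels, the `PrimingFile` shape, and the `loadOrder` function are hypothetical names introduced here, and you would assign tiers yourself.

```typescript
// Hypothetical tiers matching the broad → narrow order described above.
type Tier = "overview" | "module" | "target";

interface PrimingFile {
  path: string;
  tier: Tier;
}

// Lower rank loads earlier: overview first, files to modify last.
const TIER_RANK: Record<Tier, number> = { overview: 0, module: 1, target: 2 };

// Returns a new array ordered broad → narrow.
// Files in the same tier keep their input order (index is the tie-breaker).
function loadOrder(files: PrimingFile[]): PrimingFile[] {
  return files
    .map((file, index) => ({ file, index }))
    .sort(
      (a, b) =>
        TIER_RANK[a.file.tier] - TIER_RANK[b.file.tier] || a.index - b.index
    )
    .map(({ file }) => file);
}

// Example: an unordered pick list becomes a loading sequence.
const ordered = loadOrder([
  { path: "src/routes/auth/login.ts", tier: "target" },
  { path: "AGENTS.md", tier: "overview" },
  { path: "src/middleware/auth.ts", tier: "module" },
  { path: "README.md", tier: "overview" },
]);
```

Here `ordered` lists AGENTS.md and README.md first, then the middleware, then login.ts, so the files closest to the change are read last, just before the ask.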

2 Four priming moves

The mechanics are simple and stack well:

A primed session reads its way in, then the ask can be tight:

    # broad → narrow, before any edits
    cat AGENTS.md README.md
    cat src/middleware/auth.ts src/routes/auth/login.ts
    cat src/models/user.ts src/config/jwt.ts

    # now the prompt can assume the patterns are in context
    Add a POST /auth/refresh endpoint. Follow login.ts.
    Use the refreshToken field on User; sign with jwtConfig.secret.

Cold, the same agent falls back to generic Express boilerplate, needs rework to match the real middleware signature, and likely misses the refreshToken field entirely.
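If you assemble the prompt yourself rather than letting the agent read files through its tools, the same shape can be sketched in code. This is an illustration under assumptions: `LoadedFile`, `buildPrimedPrompt`, and the `--- path ---` delimiters are hypothetical choices, not a format any particular agent requires. The files are assumed to be already ordered broad → narrow.

```typescript
interface LoadedFile {
  path: string;
  content: string;
}

// Concatenates primed files in the given order and places the ask last,
// so the request sits at the end of the window instead of the middle.
function buildPrimedPrompt(files: LoadedFile[], ask: string): string {
  const sections = files.map(
    (f) => `--- ${f.path} ---\n${f.content.trim()}`
  );
  return [...sections, `--- task ---\n${ask.trim()}`].join("\n\n");
}

// Example with placeholder file contents (not real project code).
const prompt = buildPrimedPrompt(
  [
    { path: "AGENTS.md", content: "Project conventions go here." },
    { path: "src/routes/auth/login.ts", content: "// login handler source" },
  ],
  "Add a POST /auth/refresh endpoint. Follow login.ts."
);
```

Keeping the ask as the final section means a tight request can lean on everything loaded before it.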

3 Why it works — and where it backfires

The mechanism is the same one behind few-shot prompting: in-context examples shift the output distribution without any weight update. Your actual middleware signature and naming conventions, sitting in context, make project-specific outputs more probable and boilerplate less probable, which is why conditioning on repository context tends to improve how well the output fits.

Priming is not "load everything"

Push it too far and it inverts:

Window saturation: pre-loading large files pushes your instructions toward the attention-poor middle. Trim or summarise first.

Low-precision context: loosely related files add noise that competes with signal.

Short self-contained tasks: a pure utility or a format conversion gains nothing; priming just adds latency and token cost.

Stale context: files that don't reflect a post-refactor codebase anchor the agent on the wrong patterns.

Prime selectively, with current, high-precision files.
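Selective priming can be sketched as a filter over candidate files. Everything here is an assumption for illustration: the `Candidate` and `PrimingPolicy` shapes, the relevance score (however you choose to compute it), the rough token counts, and the rule that a copy captured before the last refactor counts as stale.

```typescript
interface Candidate {
  path: string;
  tokens: number;     // rough size estimate; how you count is up to you
  relevance: number;  // 0..1, hypothetical score for "related to this task"
  capturedAt: number; // epoch ms when this copy of the file was read
}

interface PrimingPolicy {
  tokenBudget: number;  // cap on primed tokens, to avoid window saturation
  minRelevance: number; // below this, a file is treated as noise
  refactoredAt: number; // copies captured before this are treated as stale
}

// Keeps current, high-precision files, most relevant first, within budget.
function selectPrimingFiles(
  candidates: Candidate[],
  policy: PrimingPolicy
): Candidate[] {
  const eligible = candidates
    .filter((c) => c.relevance >= policy.minRelevance) // low-precision context
    .filter((c) => c.capturedAt >= policy.refactoredAt) // stale context
    .sort((a, b) => b.relevance - a.relevance);

  const selected: Candidate[] = [];
  let used = 0;
  for (const c of eligible) {
    // Skip anything that would push the total past the budget.
    if (used + c.tokens > policy.tokenBudget) continue;
    selected.push(c);
    used += c.tokens;
  }
  return selected;
}
```

An empty result is a useful signal too: for a short self-contained task, nothing may clear the relevance threshold, and the cheapest move is to skip priming entirely.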

↪ Your win: read your way in before you ask

Retrieval practice — recall, don't peek

Question 1: Context priming differs from just-in-time retrieval in that it…

Question 2: The recommended loading order for priming is…

Question 3: Priming makes project-specific output more probable because in-context examples…

Question 4: Priming backfires most clearly when you…

Question 5 · spaced recall from Lesson 20: You should run context diagnostics…

Next: Remember, Don't Re-Read. Carrying typed state across a long loop so you stop re-billing the whole transcript every turn.