The working vocabulary for building MCP servers that agents can drive safely and discover on demand. Once a
term lives here, every lesson uses that word for it.
The Protocol & Its Surface
Model Context Protocol · MCP
An open standard that decouples agent reasoning from tool execution. The agent speaks MCP on one side; a server speaks MCP on the other, so a tool written once works with any MCP-compliant host.
Avoid: "a plugin format" — MCP replaces per-host plugin formats; it is the interface, not one of them.
A function the model chooses to call to take an action or fetch dynamic data — create_issue, search_logs. The default primitive for anything the agent should invoke.
Read-only context the client attaches for the agent to see but not invoke — project config, a schema, env info. Use it for context the agent shouldn't have to call for.
Avoid: wrapping read-only context in a tool — that forces a wasted call.
How the server connects. stdio: a local subprocess over stdin/stdout, good for dev tooling. Streamable HTTP: a remote, shared server; a remote deployment requires OAuth.
The snake_case naming convention (used by >90% of public MCP servers): a verb then a noun, ≤32 characters, no version numbers or abbreviations — search_customer_orders, not prod_lookup_v2. Opaque names cause tool-search routing failures.
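The naming rules above are mechanical enough to lint. The following is a minimal sketch, not an official checker: `check_tool_name` and the `APPROVED_VERBS` set are hypothetical, and the verb list is an assumption a team would replace with its own. Abbreviations cannot be detected reliably by a pattern, so that rule is left to review.

```python
import re

MAX_TOOL_NAME_LENGTH = 32
# Lowercase words joined by single underscores, at least two words.
SNAKE_CASE = re.compile(r"^[a-z][a-z0-9]*(?:_[a-z0-9]+)+$")
# A whole token that is a number, optionally prefixed with "v" (v2, 3).
VERSION_TOKEN = re.compile(r"(?:^|_)v?\d+(?:_|$)")
# Hypothetical allow-list of leading verbs (an assumption for this sketch).
APPROVED_VERBS = {"search", "create", "get", "list", "update", "delete", "query"}


def check_tool_name(name: str) -> list[str]:
    """Return a list of naming problems; an empty list means the name passes."""
    problems = []
    if len(name) > MAX_TOOL_NAME_LENGTH:
        problems.append(f"longer than {MAX_TOOL_NAME_LENGTH} characters")
    if not SNAKE_CASE.match(name):
        problems.append("not snake_case with at least two words")
    if VERSION_TOKEN.search(name):
        problems.append("contains a version number")
    if name.split("_")[0] not in APPROVED_VERBS:
        problems.append("does not start with an approved verb")
    return problems
```

Under these assumptions, `check_tool_name("search_customer_orders")` returns an empty list, while `check_tool_name("prod_lookup_v2")` reports both the version number and the missing leading verb.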
Description text telling the agent when not to call a tool ("Do NOT use for metrics — use query_metrics"). Prevents misrouting between similar tools before it happens.
A tool execution error (isError: true) carrying the violation, the constraint, and the context to recover — so the agent retries without a human. Distinct from a protocol error (JSON-RPC code), which is for the client.
Avoid: bare "Error" strings — they drive blind retry loops.
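A tool execution error of this kind might be built as follows. This is a sketch under assumptions: the `isError` flag and the `content` list of text items follow the MCP tool result shape, but the helper `tool_error` and the payload keys `violation`, `constraint`, and `context` are illustrative names taken from the definition above, not fields the protocol mandates.

```python
import json


def tool_error(violation: str, constraint: str, context: dict) -> dict:
    """Build a tool result that tells the agent what went wrong and how to recover."""
    payload = {
        "violation": violation,    # what the call did wrong
        "constraint": constraint,  # the rule it broke
        "context": context,        # what the agent needs to fix the call
    }
    return {
        "isError": True,
        "content": [{"type": "text", "text": json.dumps(payload)}],
    }


# Hypothetical example: a search_logs call asked for too wide a time window.
result = tool_error(
    violation="time_range of 72h exceeds the maximum window",
    constraint="time_range must be 24h or less",
    context={"received": "72h", "suggestion": "split into three 24h calls"},
)
```

The agent can read the constraint and the suggestion from the result and retry with a corrected argument, which a bare "Error" string would not allow.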
Mistake-proofing the tool interface so wrong calls are hard to make: enums over free strings, defaults for common cases, additionalProperties: false, and examples paired with each constraint.
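The techniques listed above can be sketched as an input schema plus a small checker. The schema keywords (`enum`, `default`, `required`, `additionalProperties`) are standard JSON Schema; the `search_logs` fields and the hand-rolled `apply_schema` helper are assumptions for illustration, and a real server would more likely use a full JSON Schema validator.

```python
SEARCH_LOGS_SCHEMA = {
    "type": "object",
    "properties": {
        "query": {
            "type": "string",
            "description": 'Text to match. Example: "connection timeout".',
        },
        "severity": {
            "type": "string",
            "enum": ["debug", "info", "warning", "error"],  # enum over free string
            "default": "error",                             # default for the common case
            "description": 'Minimum severity. Example: "warning".',
        },
        "limit": {
            "type": "integer",
            "default": 50,
            "description": "Maximum rows returned. Example: 20.",
        },
    },
    "required": ["query"],
    "additionalProperties": False,  # misspelled arguments fail loudly
}


def apply_schema(schema: dict, args: dict) -> dict:
    """Reject unknown or invalid arguments, then fill in defaults."""
    props = schema["properties"]
    if schema.get("additionalProperties") is False:
        unknown = sorted(set(args) - set(props))
        if unknown:
            raise ValueError(f"unknown argument(s): {unknown}")
    missing = [key for key in schema.get("required", []) if key not in args]
    if missing:
        raise ValueError(f"missing required argument(s): {missing}")
    resolved = {}
    for name, spec in props.items():
        if name in args:
            if "enum" in spec and args[name] not in spec["enum"]:
                raise ValueError(f"{name} must be one of {spec['enum']}")
            resolved[name] = args[name]
        elif "default" in spec:
            resolved[name] = spec["default"]
    return resolved
```

A call with only `query` set resolves to the defaults for `severity` and `limit`, while a misspelled key such as `sevrity` is rejected instead of being silently ignored.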
Pull content into context at the moment a task step needs it, rather than preloading at session start. Keeps startup to tool descriptions; content arrives only on the call that asks for it.
Avoid: "load everything up front" — that starves the agent of reasoning budget.
The figure of merit for on-demand data: JIT only helps when what comes back is correct. A retriever surfacing similar-but-wrong chunks lowers accuracy (one study: 75% to under 40% as a corpus grew) and is the dominant MCP failure mode.
Three capabilities — private data access, untrusted input, and external egress — that are individually fine but exploitable together. No execution path should hold all three.
Avoid: treating it as a prompt problem — the defense is architectural.
The trifecta mitigation: strip at least one leg from every path. For coding agents, remove egress first — most tasks need no network and default-deny is deterministic. Removal migrates risk to a new target (token resolver, sandbox) that must itself be hardened.
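The rule that no execution path should hold all three capabilities can be audited mechanically once each path is described. A minimal sketch, assuming a hypothetical `ExecutionPath` record that a team fills in by hand or from configuration; the path names in the usage below are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExecutionPath:
    name: str
    private_data: bool     # can read private data
    untrusted_input: bool  # processes content an attacker could control
    external_egress: bool  # can send data outside the trust boundary


def holds_full_trifecta(path: ExecutionPath) -> bool:
    """True when a single path combines all three capabilities."""
    return path.private_data and path.untrusted_input and path.external_egress


def audit(paths: list[ExecutionPath]) -> list[str]:
    """Return the names of paths that still need a leg removed."""
    return [path.name for path in paths if holds_full_trifecta(path)]


paths = [
    # Coding agent with egress removed: two legs only.
    ExecutionPath("coding_agent", private_data=True, untrusted_input=True, external_egress=False),
    # Hypothetical triage path that still holds all three.
    ExecutionPath("ticket_triage", private_data=True, untrusted_input=True, external_egress=True),
]
flagged = audit(paths)
```

Here only `ticket_triage` is flagged. The check is architectural: it inspects which capabilities a path has, not what any prompt says.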
A malicious MCP server registering tools that override trusted ones, then reading private context and forwarding it externally. The exfiltration attack that server onboarding must guard against; mitigate by restricting server egress.
Eager vs. JIT loading · alwaysLoad / defer_loading
Whether a server's tools sit in context from turn one (eager, alwaysLoad: true) or load on first reference via tool search (JIT, default). Decide per server by hit rate, definition size, latency tolerance, and description quality. Floor: enable tool search at 10+ tools or >10K tokens.
Avoid: eager-loading rarely-used servers — they tax every turn for capability the agent never invokes.
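The floor stated above reduces to a one-line check. This sketch covers only that floor; the function name and parameters are hypothetical, and the per-server factors (hit rate, latency tolerance, description quality) are left out because the text gives no thresholds for them.

```python
TOOL_COUNT_FLOOR = 10           # "10+ tools"
DEFINITION_TOKEN_FLOOR = 10_000  # ">10K tokens"


def should_enable_tool_search(tool_count: int, definition_tokens: int) -> bool:
    """Apply the floor: enable tool search at 10+ tools or more than 10K tokens."""
    return tool_count >= TOOL_COUNT_FLOOR or definition_tokens > DEFINITION_TOKEN_FLOOR
```

Either condition alone is enough: a handful of tools with very large definitions crosses the floor just as a long list of small ones does.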
JIT's silent failure: when a deferred tool's description doesn't match the model's search phrasing, the agent reports "no tool available" while the tool is right there. Audit description craft before deferring.
The dedup rule across user, project/workspace, and local scopes: the closest scope defining a server name silently disables matches at outer scopes. Use it for overrides of the same intent; give parallel intents distinct names.
Avoid: relying on shadowing when two same-named servers mean different things.
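The dedup rule can be sketched as a first-match-wins merge. This assumes local is the closest scope and user the outermost, which is an assumption about ordering rather than something the definition states; the `resolve_servers` helper and the server names in the usage are hypothetical. Returning the shadowed entries makes the otherwise silent disable visible.

```python
# Closest scope first (assumed ordering: local, then project, then user).
SCOPE_PRECEDENCE = ("local", "project", "user")


def resolve_servers(configs: dict) -> tuple[dict, list]:
    """Return (active servers by name, list of (scope, name) entries that were shadowed)."""
    active = {}
    shadowed = []
    for scope in SCOPE_PRECEDENCE:
        for name, definition in configs.get(scope, {}).items():
            if name in active:
                shadowed.append((scope, name))  # a closer scope already defined it
            else:
                active[name] = (scope, definition)
    return active, shadowed


configs = {
    "user": {"github": {"command": "gh-mcp"}, "jira": {"command": "jira-mcp"}},
    "project": {"github": {"command": "gh-mcp-pinned"}},
}
active, shadowed = resolve_servers(configs)
```

In this example the project-scope `github` wins and the user-scope one is reported as shadowed, which is the intended use: an override of the same intent. Two servers with different purposes would need distinct names to both stay active.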
A tool's outputSchema declares the shape of what it returns, so the client can validate the result instead of parsing prose. Return both structuredContent (the validated object) and a serialized JSON copy in content for clients that don't read structured output.
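A tool that follows this pattern might look like the sketch below. The `outputSchema`, `structuredContent`, and `content` field names come from the definition above; the `get_order_status` tool, its fields, and the `tool_result` helper are hypothetical.

```python
import json

# Hypothetical tool definition declaring the shape of its result.
GET_ORDER_STATUS_TOOL = {
    "name": "get_order_status",
    "description": "Look up the current status of one order.",
    "inputSchema": {
        "type": "object",
        "properties": {"order_id": {"type": "string"}},
        "required": ["order_id"],
    },
    "outputSchema": {
        "type": "object",
        "properties": {
            "order_id": {"type": "string"},
            "status": {"type": "string", "enum": ["pending", "shipped", "delivered"]},
        },
        "required": ["order_id", "status"],
    },
}


def tool_result(structured: dict) -> dict:
    """Return the object twice: structured for validation, serialized for older clients."""
    return {
        "structuredContent": structured,
        "content": [{"type": "text", "text": json.dumps(structured)}],
    }


result = tool_result({"order_id": "A-1001", "status": "shipped"})
```

A client that understands structured output validates `structuredContent` against `outputSchema`; one that does not can still parse the JSON text in `content`.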
Optional metadata describing a tool's behavior so a host can plan ceremony — auto-approve a read, confirm a delete, retry an idempotent call. Metadata only: the spec says annotations are not trustable from untrusted servers, so they inform UX but never authorize.
Avoid: using another server's annotations as an access-control gate.
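A host might turn annotations into ceremony roughly as follows. This is a sketch, not a prescribed policy: `readOnlyHint` and `idempotentHint` are annotation names from the MCP specification as I understand it, while `plan_ceremony`, the `server_trusted` flag, and the returned keys are hypothetical. The function only decides how much to prompt the user; it grants no permissions.

```python
def plan_ceremony(annotations: dict, server_trusted: bool) -> dict:
    """Decide UX for a tool call. Hints from untrusted servers are ignored."""
    if not server_trusted:
        # Annotations are self-reported, so an untrusted server gets no shortcuts.
        return {"approval": "confirm", "retry_on_failure": False}
    read_only = annotations.get("readOnlyHint", False)
    idempotent = annotations.get("idempotentHint", False)
    return {
        "approval": "auto" if read_only else "confirm",
        "retry_on_failure": idempotent,
    }


trusted_read = plan_ceremony({"readOnlyHint": True}, server_trusted=True)
untrusted_read = plan_ceremony({"readOnlyHint": True}, server_trusted=False)
```

The same `readOnlyHint` yields automatic approval from a trusted server and a confirmation prompt from an untrusted one. Access control stays in a separate layer that never reads these hints.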
A server-initiated request that pauses an in-flight tool call to collect structured input the server only learns it needs mid-task. Limited to flat primitive fields (text, number, boolean, select); blocks indefinitely in headless runs without a hook, and is dropped by gateways that relay only client→server traffic.
A server-to-client request — the inverse of a normal tool call — asking the host's model to run inference mid-execution, turning a deterministic tool into a hybrid that embeds reasoning. The client picks the model (modelPreferences are non-binding hints) and the user approves each request, the primary defense against context exfiltration.
The model writes sandboxed code that orchestrates many tool calls and returns only stdout, keeping intermediate results out of context — the biggest token lever for large result sets (≈37% on multi-step work). Shrinks results where tool search shrinks definitions. Not for in-between reasoning, and not Zero Data Retention (ZDR) eligible.
Avoid: Code Mode with no trusted sandbox (air-gapped/on-prem) or when ZDR is required.