Capstone

MCP Server Design · ~8 min

Ship a Server Agents Can Drive

Eight lessons, one decision sequence. Here's the whole course as a table you run top to bottom, plus a mixed review that pulls from every part.

Why this, for you: each individual choice has a defensible answer; the production question is the sequence, because each decision constrains the ones after it. This capstone gives the order to resolve them in, so a real server lands safe, discoverable, and cheap to keep installed.

A well-designed MCP server makes the right tool call obvious, costs little to keep loaded, and never lets a single execution path hold all three legs of the trifecta. Those three properties come from resolving the decisions in order, because each one narrows the option space for the next. The first six shape the core surface; the last three are the deeper protocol features you reach for once the server scales.
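The third property can be checked mechanically. The following is a minimal sketch, not a definitive audit: it assumes the three trifecta legs are access to private data, exposure to untrusted content, and an outbound (egress) channel, which this capstone does not spell out. The `Leg` enum and the example path names are hypothetical.

```python
from enum import Flag, auto


class Leg(Flag):
    """The three legs as assumed here; the lesson text does not enumerate them."""
    PRIVATE_DATA = auto()       # secrets, env vars, internal records
    UNTRUSTED_CONTENT = auto()  # text an attacker could have written
    EGRESS = auto()             # any channel that sends data out


TRIFECTA = Leg.PRIVATE_DATA | Leg.UNTRUSTED_CONTENT | Leg.EGRESS


def holds_trifecta(legs: Leg) -> bool:
    """True when one execution path combines all three legs."""
    return (legs & TRIFECTA) == TRIFECTA


def audit(paths: dict[str, Leg]) -> list[str]:
    """Return the names of execution paths that hold all three legs."""
    return [name for name, legs in paths.items() if holds_trifecta(legs)]


# Hypothetical execution paths, for illustration only.
paths = {
    "summarize_ticket": Leg.PRIVATE_DATA | Leg.UNTRUSTED_CONTENT,
    "deploy_and_notify": Leg.PRIVATE_DATA | Leg.UNTRUSTED_CONTENT | Leg.EGRESS,
}
flagged = audit(paths)
```

Any path in `flagged` needs a leg removed before shipping; the decision table suggests starting with egress.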

1 The decision table

| # | Decision | Resolve it by… | From |
|---|----------|----------------|------|
| 1 | Primitive | Tool (model acts), resource (client attaches read-only context), or prompt (user triggers workflow) | L1 |
| 2 | Transport | stdio for local dev tooling; Streamable HTTP for shared/remote (remote forces OAuth) | L1 |
| 3 | Data exposure | Expose corpora behind search/read tools; return only what the next step needs; guard retrieval quality | L2 |
| 4 | Tool craft | verb_noun ≤32 chars; enums + defaults + examples; negative guidance; actionable errors | L3 |
| 5 | Security | Audit each path for the trifecta; remove a leg (egress first); scope credentials; protect config | L4 |
| 6 | Load & discovery | Keep catalog <15 tools; eager-load below 10/10K, else defer; search-friendly descriptions; name for intent | L5 |
| 7 | Output & hints | Declare outputSchema + return structuredContent; annotate behavior, but don't trust hints from untrusted servers | L6 |
| 8 | Server-initiated | Elicit mid-call inputs (flat fields); sample for model reasoning (the client picks the model, the user approves each request) | L7 |
| 9 | Result orchestration | Code Mode for data-heavy chains, where only stdout returns; not for in-between reasoning, and not ZDR-eligible | L8 |
The reference extreme: Cloudflare exposes ~2,500 API endpoints through two tools — search and execute — in roughly 1K tokens. Every layer lines up: remote server → intent grouping at its limit → schemas needn't be deferred → programmatic calling for large results → OAuth on auth.
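Rows 4 and 7 of the table are easier to judge against a concrete tool. The sketch below is one way they might look together, under stated assumptions: `search_orders`, the `get_order` tool named in its description, the order data, and the helper functions are all hypothetical. Only the field names (`name`, `description`, `inputSchema`, `outputSchema`, `annotations`, `structuredContent`, `isError`) are meant to follow the MCP tool and tool-result shapes.

```python
import json

# Hypothetical tool definition, invented for illustration.
SEARCH_ORDERS = {
    "name": "search_orders",  # verb_noun, snake_case, well under 32 chars
    "description": (
        "Search orders by status. Do NOT use this to fetch one known order; "
        "use get_order for that."  # negative guidance
    ),
    "inputSchema": {
        "type": "object",
        "properties": {
            "status": {
                "type": "string",
                "enum": ["open", "shipped", "cancelled"],
                "default": "open",
                "description": "Order status to filter on, e.g. 'shipped'.",
            },
            "limit": {
                "type": "integer", "minimum": 1, "maximum": 50, "default": 10,
                "description": "Maximum rows to return (1-50), e.g. 10.",
            },
        },
        "additionalProperties": False,
    },
    "outputSchema": {
        "type": "object",
        "properties": {
            "orders": {"type": "array", "items": {"type": "object"}},
            "count": {"type": "integer"},
        },
        "required": ["orders", "count"],
    },
    "annotations": {"readOnlyHint": True},  # a hint, only as honest as its author
}

# Hypothetical in-memory data standing in for a real backend.
_ORDERS = [
    {"id": "A1", "status": "open"},
    {"id": "A2", "status": "shipped"},
    {"id": "A3", "status": "open"},
]


def ok_result(data: dict) -> dict:
    """Return structuredContent plus a serialized JSON copy as text."""
    return {
        "content": [{"type": "text", "text": json.dumps(data)}],
        "structuredContent": data,
    }


def error_result(violation: str, constraint: str, recovery: str) -> dict:
    """Actionable error: what was violated, the constraint, how to recover."""
    text = f"{violation} Constraint: {constraint} Recovery: {recovery}"
    return {"isError": True, "content": [{"type": "text", "text": text}]}


def search_orders(args: dict) -> dict:
    """Handler sketch: apply defaults, check the enum, return a tool result."""
    props = SEARCH_ORDERS["inputSchema"]["properties"]
    status = args.get("status", props["status"]["default"])
    allowed = props["status"]["enum"]
    if status not in allowed:
        return error_result(
            violation=f"status={status!r} is not a valid status.",
            constraint=f"status must be one of {allowed}.",
            recovery="Retry with a listed value, or omit status to use 'open'.",
        )
    limit = args.get("limit", props["limit"]["default"])
    orders = [o for o in _ORDERS if o["status"] == status][:limit]
    return ok_result({"orders": orders, "count": len(orders)})
```

The error path carries enough for an agent to retry without a human: the offending value, the allowed set, and a concrete next step.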

2 The author's checklist

# run this before you ship
[ ] Each tool is verb_noun snake_case, ≤32 chars, no versions
[ ] Every param has a description with constraints and examples
[ ] Enums + defaults + additionalProperties:false used where possible
[ ] Descriptions say when NOT to use the tool
[ ] Errors carry the violation, the constraint, and recovery context
[ ] Read-only context is a resource, not a tool
[ ] Tool list is under 15 tools per server
[ ] Responses return only what the agent needs next
[ ] Clear server instructions so tool search finds you
[ ] outputSchema declared; structuredContent returned with a JSON copy
[ ] Annotations honest; never trusted from an untrusted server
[ ] Elicitation/sampling used only where needed, behind their gates
[ ] Code Mode considered for data-heavy chains (sandbox available)
[ ] No execution path holds all three trifecta legs
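A few of these items are mechanical enough to lint. The sketch below covers only the naming, parameter-description, and catalog-size checks; the function names are hypothetical, the "no versions" rule is read here as "no `_v2`-style suffix" (an assumption), and a regex cannot confirm that the first word is actually a verb.

```python
import re

MAX_NAME_LEN = 32  # from the checklist
MAX_TOOLS = 15     # from the checklist: "under 15 tools per server"

# snake_case with at least two words; cannot verify the first word is a verb.
SNAKE_VERB_NOUN = re.compile(r"^[a-z]+(_[a-z0-9]+)+$")
# Assumed reading of "no versions": reject suffixes such as _v2 or _2.
VERSION_SUFFIX = re.compile(r"_v?\d+$")


def lint_tool_name(name: str) -> list[str]:
    """Return naming problems for one tool name (empty list means clean)."""
    problems = []
    if len(name) > MAX_NAME_LEN:
        problems.append(f"{name}: longer than {MAX_NAME_LEN} chars")
    if not SNAKE_VERB_NOUN.match(name):
        problems.append(f"{name}: not verb_noun snake_case")
    if VERSION_SUFFIX.search(name):
        problems.append(f"{name}: carries a version suffix")
    return problems


def lint_params(tool: dict) -> list[str]:
    """Flag input parameters that have no description."""
    props = tool.get("inputSchema", {}).get("properties", {})
    return [
        f"{tool['name']}.{param}: missing description"
        for param, spec in props.items()
        if not spec.get("description")
    ]


def lint_catalog(tools: list[dict]) -> list[str]:
    """Run every check over a server's tool list."""
    problems = []
    for tool in tools:
        problems += lint_tool_name(tool["name"])
        problems += lint_params(tool)
    if len(tools) >= MAX_TOOLS:
        problems.append(f"catalog has {len(tools)} tools; keep it under {MAX_TOOLS}")
    return problems


# Hypothetical catalog, for illustration only.
catalog = [
    {"name": "search_orders",
     "inputSchema": {"properties": {"status": {"description": "Order status."}}}},
    {"name": "getOrder",
     "inputSchema": {"properties": {"id": {}}}},
]
report = lint_catalog(catalog)
```

The remaining items (negative guidance, honest annotations, the trifecta audit) still need a human read of each tool and execution path.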

3 Where the checklist inverts

The checklist assumes a stable, internally-owned API. Know the conditions that flip each rule:

↪ Your win: the whole discipline, in order

Mixed review — every part in play

Question 1 · L1: A reusable multi-step workflow a user triggers is best modeled as a…

Question 2 · L3: Which error message lets an agent self-correct without a human?

Question 3 · L4: A deploy agent holds env vars, reads repo config, and posts externally. It is…

Question 4 · L6: A readOnlyHint from an untrusted MCP server should be treated as…

Question 5 · L8 · spaced recall: In Code Mode, what reaches the model's context is…

Ask me anything. Bring a server you're designing and we'll run the decisions top to bottom — primitive, transport, data exposure, tool craft, trifecta audit, load policy, output and annotations, server-initiated calls, and result orchestration — and produce the author's checklist filled in for your case.