Eight lessons, one decision sequence. Here's the whole course as a table you run top to bottom, plus a
mixed review that pulls from every part.
Why this, for you: the individual choices each have a defensible answer — the production question is
the sequence, because each decision forecloses the next. This capstone is the order to resolve them in, so a
real server lands safe, discoverable, and cheap to keep installed.
A well-designed MCP server makes the right tool call obvious, costs little to keep loaded, and can't be
onboarded into a trifecta. Those three properties come from resolving the decisions in order, because each one
narrows the options for the next. The first six shape the core surface; the last three are the deeper protocol
features you reach for once the server scales.
Audit each path for the trifecta; remove a leg (egress first); scope credentials; protect config · L4
6 · Load & discovery · Keep catalog <15 tools; eager-load below 10/10K, else defer; search-friendly descriptions; name for intent · L5
7 · Output & hints · Declare outputSchema + return structuredContent; annotate behavior, but don't trust hints from untrusted servers · L6
8 · Server-initiated · Elicit mid-call inputs (flat fields); sample for model reasoning (client picks the model, user approves each request) · L7
9 · Result orchestration · Code Mode for data-heavy chains: only stdout returns; not for in-between reasoning, and not ZDR-eligible · L8
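The "Output & hints" rule can be sketched in a few lines. This is a minimal illustration, not a full server: the `get_invoice` tool, its fields, and the `build_result` helper are hypothetical names invented for the example. Only the `inputSchema`, `outputSchema`, `annotations`, `content`, and `structuredContent` keys come from the MCP tool and result shapes.

```python
import json

# Hypothetical tool definition; the tool name and its fields are illustrative only.
GET_INVOICE_TOOL = {
    "name": "get_invoice",
    "description": "Fetch one invoice by ID. Do not use for listing or searching invoices.",
    "inputSchema": {
        "type": "object",
        "properties": {
            "invoice_id": {"type": "string", "description": "Invoice ID, e.g. 'inv_123'."}
        },
        "required": ["invoice_id"],
        "additionalProperties": False,
    },
    "outputSchema": {
        "type": "object",
        "properties": {
            "invoice_id": {"type": "string"},
            "status": {"type": "string"},
            "amount_cents": {"type": "integer"},
        },
        "required": ["invoice_id", "status", "amount_cents"],
    },
    # A hint for clients, not a guarantee: clients should still gate writes and egress.
    "annotations": {"readOnlyHint": True},
}


def build_result(payload: dict) -> dict:
    """Return a tool result with structuredContent plus a serialized JSON text copy."""
    required = GET_INVOICE_TOOL["outputSchema"]["required"]
    missing = [key for key in required if key not in payload]
    if missing:
        raise ValueError(f"payload is missing required output fields: {missing}")
    return {
        # Text copy for clients that do not read structuredContent.
        "content": [{"type": "text", "text": json.dumps(payload)}],
        # Typed copy that matches the declared outputSchema.
        "structuredContent": payload,
    }


result = build_result({"invoice_id": "inv_123", "status": "paid", "amount_cents": 4200})
```

The required-field check here is deliberately shallow; a real server would more likely validate the payload against the full schema with a JSON Schema validator.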
The reference extreme: Cloudflare exposes ~2,500 API endpoints through two tools —
search and execute — in roughly 1K tokens. Every layer lines up: remote server → intent
grouping at its limit → schemas needn't be deferred → programmatic calling for large results → OAuth on auth.
2 The author's checklist
# run this before you ship
[ ] Each tool is verb_noun snake_case, ≤32 chars, no versions
[ ] Every param has a description with constraints and examples
[ ] Enums + defaults + additionalProperties:false used where possible
[ ] Descriptions say when NOT to use the tool
[ ] Errors carry the violation, the constraint, and recovery context
[ ] Read-only context is a resource, not a tool
[ ] Tool list is under 15 tools per server
[ ] Responses return only what the agent needs next
[ ] Clear server instructions so tool search finds you
[ ] outputSchema declared; structuredContent returned alongside a serialized JSON text copy
[ ] Annotations honest; never trusted from an untrusted server
[ ] Elicitation/sampling used only where needed, behind their gates
[ ] Code Mode considered for data-heavy chains (sandbox available)
[ ] No execution path holds all three trifecta legs
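Several of the checklist items are mechanical enough to lint. The sketch below is one possible way to do that, assuming tool definitions are available as plain dicts with `name` and `inputSchema` keys; `lint_tool` and both regular expressions are hypothetical helpers written for this example. It checks only the shape of a name (it cannot tell whether the first word is really a verb), and the judgment items, such as whether a description says when not to use the tool, still need a human read.

```python
import re

# Shape check only: lowercase words joined by underscores, at least two words.
NAME_RE = re.compile(r"^[a-z]+(_[a-z]+)+$")
# Flags version markers such as "_v2" or "v1_" inside a name.
VERSION_RE = re.compile(r"(^|_)v\d+($|_)")


def lint_tool(tool: dict) -> list[str]:
    """Return a list of checklist violations for one tool definition."""
    problems = []
    name = tool.get("name", "")
    if len(name) > 32:
        problems.append("name longer than 32 chars")
    if VERSION_RE.search(name):
        problems.append("name contains a version")
    elif not NAME_RE.match(name):
        problems.append("name is not verb_noun snake_case")

    schema = tool.get("inputSchema", {})
    if schema.get("additionalProperties") is not False:
        problems.append("inputSchema does not set additionalProperties: false")
    for param, spec in schema.get("properties", {}).items():
        if not spec.get("description"):
            problems.append(f"param '{param}' has no description")
    return problems


good_tool = {
    "name": "get_invoice",
    "inputSchema": {
        "type": "object",
        "properties": {
            "invoice_id": {"type": "string", "description": "Invoice ID, e.g. 'inv_123'."}
        },
        "additionalProperties": False,
    },
}
bad_tool = {
    "name": "getInvoice_v2",
    "inputSchema": {"type": "object", "properties": {"id": {"type": "string"}}},
}

good_problems = lint_tool(good_tool)
bad_problems = lint_tool(bad_tool)
```

Running a check like this in CI catches naming and schema drift early, which leaves review time for the items that need judgment.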
3 Where the checklist inverts
The checklist assumes a stable, internally-owned API. Know the conditions that flip each rule:
Enums vs. evolving upstreams: a plain string type is more durable when the upstream adds values often.
Schemas don't sanitize input: a stdio server can still run commands built from injected arguments; the fix is argument sanitization, not a richer schema.
Over-consolidation hurts routing: one polymorphic tool pushes disambiguation into the schema; the ceiling depends on description distinctness, not raw count.
An unavoidable trifecta: when a leg can't be removed, add compensating controls such as output scanning, rate-limiting, and egress anomaly detection.
Annotations as a security control: a readOnlyHint from an untrusted server is a claim, not a sandbox; gate writes and egress regardless.
Code Mode without a sandbox: it does nothing in air-gapped or on-prem deployments, loses in-between reasoning, and is not ZDR-eligible; keep the round-trip loop in those cases.
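The trifecta audit itself is simple set arithmetic once each tool is labeled with the legs it contributes. The following is a rough sketch under stated assumptions: the three leg names (private-data access, exposure to untrusted content, external egress) are this example's reading of the trifecta, and the `Tool` class, the tool names, and `audit_path` are all hypothetical.

```python
from dataclasses import dataclass

# Assumed leg names for the trifecta; adjust to match your own threat model.
LEGS = ("private_data", "untrusted_input", "egress")


@dataclass(frozen=True)
class Tool:
    name: str
    legs: frozenset  # subset of LEGS that this tool contributes


def audit_path(path: list) -> dict:
    """Report which legs one execution path holds and whether it is a full trifecta."""
    held = set()
    for tool in path:
        held |= tool.legs
    missing = [leg for leg in LEGS if leg not in held]
    return {
        "held": sorted(held),
        "complete": not missing,
        # Following the "egress first" rule: suggest dropping egress on a full trifecta.
        "suggest_remove": "egress" if not missing else None,
    }


# Hypothetical tools, labeled by hand.
read_records = Tool("read_customer_records", frozenset({"private_data"}))
fetch_page = Tool("fetch_web_page", frozenset({"untrusted_input"}))
send_email = Tool("send_email", frozenset({"egress"}))

full_path = audit_path([read_records, fetch_page, send_email])
safe_path = audit_path([read_records, fetch_page])
```

The labeling is the hard part and stays manual here; the function only makes it cheap to re-run the audit for every path whenever a tool is added.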
↪ Your win: the whole discipline, in order
Primitive → transport → data → tool craft → security → discovery — resolve in sequence.
Make the right call obvious with names, schemas, examples, and negative guidance.
Gate onboarding — no path holds all three legs; remove egress first.
Stay affordable — small catalog, eager-vs-JIT by hit rate, search-friendly prose.
Type and gate the deeper protocol — output schemas, honest hints, gated server-initiated calls, Code Mode for result bloat.
Know where the rules invert before you trust the checklist blindly.
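The load rule from the decision table ("keep catalog <15 tools; eager-load below 10/10K, else defer") can be written as a small decision function. This is a sketch that rests on one loud assumption: "10/10K" is read here as 10 tools and 10,000 tokens of tool schema, which the text does not spell out. The constant names and `load_policy` are hypothetical.

```python
# Assumption: "10/10K" means 10 tools / 10,000 schema tokens. Adjust if your reading differs.
MAX_CATALOG_TOOLS = 15
EAGER_MAX_TOOLS = 10
EAGER_MAX_TOKENS = 10_000


def load_policy(tool_count: int, schema_tokens: int) -> dict:
    """Pick eager vs. deferred loading and flag an oversized catalog."""
    eager = tool_count < EAGER_MAX_TOOLS and schema_tokens < EAGER_MAX_TOKENS
    return {
        "mode": "eager" if eager else "defer",
        "catalog_too_large": tool_count >= MAX_CATALOG_TOOLS,
    }


small = load_policy(tool_count=6, schema_tokens=4_000)
medium = load_policy(tool_count=12, schema_tokens=4_000)
large = load_policy(tool_count=20, schema_tokens=30_000)
```

Thresholds like these are starting points; the "by hit rate" guidance above suggests tuning them against how often each tool is actually called.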
Mixed review — every part in play
Question 1 · L1 · A reusable multi-step workflow a user triggers is best modeled as a…
Question 2 · L3 · Which error message lets an agent self-correct without a human?
Question 3 · L4 · A deploy agent holds env vars, reads repo config, and posts externally. It is…
Question 4 · L6 · A readOnlyHint from an untrusted MCP server should be treated as…
Question 5 · L8 · spaced recall · In Code Mode, what reaches the model's context is…
Ask me anything. Bring a server you're designing and we'll run the decisions top to bottom —
primitive, transport, data exposure, tool craft, trifecta audit, load policy, output and annotations,
server-initiated calls, and result orchestration — and produce the author's checklist filled in for your case.