Part 3 · The Deeper Protocol

MCP Server Design · ~7 min

The Server Talks Back

Most of MCP runs one way: the client calls, the server answers. Two features reverse that arrow — the server asks the user, or asks the model. Both are powerful, and both widen the trust surface.

Why this, for you: a tool that can pause mid-call to request a missing value, or to ask the host's model to reason, can do things a static schema can't. But every reversed arrow is a new way a server reaches back into the session — so knowing what each adds, and what gate guards it, is part of designing the server responsibly.

Tool schemas fix inputs at registration time. But some inputs aren't knowable until the call is already running, and some steps need reasoning, not rules. Elicitation and sampling are the two server-initiated requests that fill those gaps — each flowing from server back to the client.

1 Elicitation: ask the user mid-call

A tool parameter is decided when the tool is registered. Elicitation covers the inputs that only become knowable mid-task — after the server has inspected state, resolved a dependency, or reached a branch. The server pauses the in-flight call, describes the missing fields as a form, and the client collects the user's answer.

| Approach | Inputs known… | Trade-off |
| --- | --- | --- |
| Tool schema parameter | At registration time | Good for predictable inputs; breaks for contextual ones |
| Elicitation | Mid-task, on demand | Accurate for contextual inputs; interrupts headless runs |
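
The pause-and-ask round trip can be sketched with plain dictionaries. This is a minimal sketch, not SDK code: the method name `elicitation/create`, the `message` and `requestedSchema` params, and the `accept` / `decline` / `cancel` actions are taken from the MCP specification and should be checked against the version you target. The helpers `build_elicitation_request` and `resolve_elicitation` and the `environment` field are hypothetical names chosen for illustration.

```python
from typing import Any, Optional


def build_elicitation_request(request_id: int, message: str,
                              schema: dict[str, Any]) -> dict[str, Any]:
    """Build the JSON-RPC request the server sends back to the client."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "elicitation/create",
        "params": {"message": message, "requestedSchema": schema},
    }


def resolve_elicitation(result: dict[str, Any]) -> Optional[dict[str, Any]]:
    """Return the user's values on accept, or None if declined or cancelled."""
    if result.get("action") == "accept":
        return result.get("content", {})
    return None


# Hypothetical example: the deploy target is only known after inspecting state.
request = build_elicitation_request(
    7,
    "Which environment should this deploy target?",
    {
        "type": "object",
        "properties": {
            "environment": {"type": "string", "enum": ["staging", "production"]},
        },
        "required": ["environment"],
    },
)

accepted = resolve_elicitation(
    {"action": "accept", "content": {"environment": "staging"}}
)
declined = resolve_elicitation({"action": "decline"})
```

Treating `decline` and `cancel` as "no value" forces the tool to decide what to do without the input, rather than assuming the user always answers.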

The form schema is deliberately thin: only flat primitive fields — text, number, boolean, select. No nested objects, conditionals, or arrays. If the missing input is structurally complex, model it as a tool parameter or a multi-step tool sequence instead.
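
A quick pre-flight check can catch a form that has outgrown elicitation. The sketch below assumes the four field kinds map onto JSON Schema as string (text), number or integer (number), boolean, and string with an `enum` (select). The function name `find_non_flat_fields` and the sample field names are hypothetical.

```python
from typing import Any

# Primitive JSON Schema types assumed to be allowed in an elicitation form.
PRIMITIVE_TYPES = {"string", "number", "integer", "boolean"}


def find_non_flat_fields(schema: dict[str, Any]) -> list[str]:
    """Return the names of properties that are not flat primitives."""
    problems = []
    for name, spec in schema.get("properties", {}).items():
        if spec.get("type") not in PRIMITIVE_TYPES:
            problems.append(name)
    return problems


flat = {
    "type": "object",
    "properties": {
        "branch": {"type": "string"},                        # text
        "retries": {"type": "integer"},                      # number
        "dry_run": {"type": "boolean"},                      # boolean
        "region": {"type": "string", "enum": ["eu", "us"]},  # select
    },
}
nested = {
    "type": "object",
    "properties": {
        "branch": {"type": "string"},
        "owners": {"type": "array", "items": {"type": "string"}},
        "limits": {"type": "object", "properties": {"cpu": {"type": "number"}}},
    },
}

print(find_non_flat_fields(flat))    # no offending fields
print(find_non_flat_fields(nested))  # the array and the nested object
```

Any field the check reports is a candidate for a tool parameter or a separate step in a multi-step tool sequence.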

2 Sampling: ask the model mid-call

Standard MCP flows one direction — the client calls a tool. Sampling inverts it: the server sends a sampling/createMessage request, the client runs inference against its hosted model, and the result flows back to the server, all inside a single tool execution. It turns a deterministic tool into a hybrid that embeds LLM reasoning inline — interpreting an unstructured fetch, summarizing compiler errors, classifying a log into an alert category.

    # server asks the host's model to reason (server → client)
    {
      "method": "sampling/createMessage",
      "params": {
        "messages": [
          { "role": "user", "content": { "type": "text", "text": "Identify security-relevant changes..." } }
        ],
        "maxTokens": 512,
        "modelPreferences": { "intelligencePriority": 0.8, "speedPriority": 0.2 }
      }
    }

Two rules hold regardless of what the server asks for. First, the client picks the model: modelPreferences are non-binding hints. Second, the user approves each request: the spec says there SHOULD always be a human in the loop who can deny inference before it runs.
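
To make "non-binding hints" concrete, here is a minimal client-side sketch of model selection. It assumes the client scores its own models and weighs them by the server's `intelligencePriority` and `speedPriority`. The catalog, its scores, the model names, and `choose_model` are all hypothetical; a real client is free to use any policy it likes.

```python
from typing import Any

# Hypothetical catalog: scores in [0, 1] for each model the client can run.
CATALOG = {
    "small-fast": {"intelligence": 0.4, "speed": 0.9},
    "large-smart": {"intelligence": 0.9, "speed": 0.3},
}


def choose_model(preferences: dict[str, Any],
                 catalog: dict[str, dict[str, float]],
                 allowed: set[str]) -> str:
    """Pick a model using the server's hints, restricted to client policy."""
    w_int = preferences.get("intelligencePriority", 0.0)
    w_speed = preferences.get("speedPriority", 0.0)
    # Client policy is applied first: the server cannot widen this set.
    candidates = {name: s for name, s in catalog.items() if name in allowed}
    if not candidates:
        raise ValueError("client policy allows no models")
    return max(
        candidates,
        key=lambda name: w_int * candidates[name]["intelligence"]
        + w_speed * candidates[name]["speed"],
    )


prefs = {"intelligencePriority": 0.8, "speedPriority": 0.2}
unrestricted = choose_model(prefs, CATALOG, {"small-fast", "large-smart"})
restricted = choose_model(prefs, CATALOG, {"small-fast"})
```

With both models allowed, the hints favor the stronger model; once client policy permits only the small one, the same hints cannot override that choice.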

3 Both arrows widen the trust surface

Elicitation and sampling both interrupt deterministic execution — one for human judgment, one for AI reasoning — and both let the server reach back into the session. That's exactly what makes them a security concern as much as a feature.

The user-approval gate is the primary defense for sampling

A malicious or compromised server can use sampling to exfiltrate context or steer the host model through what it puts in messages and systemPrompt. Don't deploy a sampling-capable server from an untrusted source without reviewing what it sends — and treat the approval prompt as load-bearing.
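
One way to make the approval prompt load-bearing is to show the reviewer everything the model would see, then refuse to run inference without an explicit yes. The sketch below is an assumption about how a client might structure that gate, not a description of any particular host. `render_for_review`, `gated_sampling`, and the `approve` and `run_inference` callbacks are hypothetical.

```python
from typing import Any, Callable


def render_for_review(params: dict[str, Any]) -> str:
    """Flatten everything the server wants the model to see into one string."""
    lines = []
    if "systemPrompt" in params:
        lines.append(f"[system] {params['systemPrompt']}")
    for message in params.get("messages", []):
        content = message.get("content", "")
        # Content may be a plain string or a {"type": "text", "text": ...} object.
        text = content.get("text", "") if isinstance(content, dict) else str(content)
        lines.append(f"[{message.get('role', '?')}] {text}")
    return "\n".join(lines)


def gated_sampling(params: dict[str, Any],
                   approve: Callable[[str], bool],
                   run_inference: Callable[[dict[str, Any]], str]) -> str:
    """Run inference only if the reviewer approves the full rendered request."""
    if not approve(render_for_review(params)):
        raise PermissionError("sampling request denied by user")
    return run_inference(params)
```

Rendering `systemPrompt` alongside the messages matters: a prompt that shows only the user-visible message would hide the channel a hostile server is most likely to use for steering.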

Elicitation can stall headless runs and break behind gateways

An elicitation request in a CI or other non-interactive context has nobody to answer it, so the call can block indefinitely unless the server or client enforces its own request timeout. And many MCP gateways relay only client→server traffic, so server→client messages like elicitation/create are silently dropped. Verify bidirectional proxying before fronting an eliciting server.
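
One way to keep a headless run from hanging is to bound the wait on the server side and fall back when no answer arrives. The sketch below uses only `asyncio.wait_for` from the standard library. `elicit_with_timeout` and the two stand-in coroutines are hypothetical, and a real server would pass in whatever coroutine its SDK uses to send the elicitation.

```python
import asyncio
from typing import Any, Awaitable, Callable, Optional


async def elicit_with_timeout(
    send_elicitation: Callable[[], Awaitable[dict[str, Any]]],
    timeout_s: float,
) -> Optional[dict[str, Any]]:
    """Await an elicitation result, returning None if nobody answers in time."""
    try:
        return await asyncio.wait_for(send_elicitation(), timeout=timeout_s)
    except asyncio.TimeoutError:
        return None


async def never_answers() -> dict[str, Any]:
    """Stand-in for a headless client or a gateway that drops the request."""
    await asyncio.Event().wait()  # blocks until cancelled by the timeout
    return {"action": "accept", "content": {}}


async def answers_quickly() -> dict[str, Any]:
    """Stand-in for an interactive client."""
    return {"action": "accept", "content": {"environment": "staging"}}


stalled = asyncio.run(elicit_with_timeout(never_answers, 0.05))
answered = asyncio.run(elicit_with_timeout(answers_quickly, 0.05))
```

Returning `None` on timeout lets the tool fail with a clear error or fall back to a documented default instead of holding the run open.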

As a server author: use these when the task genuinely needs a contextual input or model reasoning, not as a default. Each reversed arrow is one more path the onboarding audit from Lesson 4 has to account for.

↪ Your win: server-initiated, but gated

Retrieval practice — recall, don't peek

Question 1: Elicitation exists to collect inputs that are…

Question 2: An MCP sampling/createMessage request flows…

Question 3: In sampling, which model actually runs is chosen by the…

Question 4: The primary defense against a server abusing sampling is…

Question 5 · spaced recall from Lesson 6: Tool annotations from an untrusted server should be treated as…

Next in Part 3: Code Mode, where the agent writes code that orchestrates many tool calls, the biggest token lever for large result sets.