MCP Server Design · ~7 min
Most of MCP runs one way: the client calls, the server answers. Two features reverse that arrow — the server asks the user, or asks the model. Both are powerful, and both widen the trust surface.
Tool schemas fix inputs at registration time. But some inputs aren't knowable until the call is already running, and some steps need reasoning, not rules. Elicitation and sampling are the two server-initiated requests that fill those gaps — each flowing from server back to the client.
A tool parameter is decided when the tool is registered. Elicitation covers the inputs that only become knowable mid-task — after the server has inspected state, resolved a dependency, or reached a branch. The server pauses the in-flight call, describes the missing fields as a form, and the client collects the user's answer.
| Approach | Inputs known… | Trade-off |
|---|---|---|
| Tool schema parameter | At registration time | Good for predictable inputs; breaks for contextual ones |
| Elicitation | Mid-task, on demand | Accurate for contextual inputs; interrupts headless runs |
The form schema is deliberately thin: only flat primitive fields — text, number,
boolean, select. No nested objects, conditionals, or arrays. If the missing input is
structurally complex, model it as a tool parameter or a multi-step tool sequence instead.
Standard MCP flows one direction — the client calls a tool. Sampling inverts it: the server sends a
sampling/createMessage request, the client runs inference against its hosted model, and the result flows
back to the server, all inside a single tool execution. It turns a deterministic tool into a hybrid that embeds LLM
reasoning inline — interpreting an unstructured fetch, summarizing compiler errors, classifying a log into an alert
category.
modelPreferences are non-binding hints), and the user approves each request — a
spec-level SHOULD that keeps a human in the loop able to deny inference before it runs.Elicitation and sampling both interrupt deterministic execution — one for human judgment, one for AI reasoning — and both let the server reach back into the session. That's exactly what makes them a security concern as much as a feature.
A malicious or compromised server can use sampling to exfiltrate context or steer the
host model through what it puts in messages and systemPrompt. Don't deploy a sampling-capable
server from an untrusted source without reviewing what it sends — and treat the approval prompt as load-bearing.
An elicitation request in a CI or non-interactive context blocks indefinitely — it does not time
out. And many MCP gateways relay only client→server traffic, so server→client messages like
elicitation/create are silently dropped. Verify bidirectional proxying before fronting an eliciting server.
As a server author: use these when the task genuinely needs a contextual input or model reasoning, not as a default. Each reversed arrow is one more path the onboarding audit from Lesson 4 has to account for.
modelPreferences are hints, not commands.Retrieval practice — recall, don't peek
Question 1Elicitation exists to collect inputs that are…
Question 2An MCP sampling/createMessage request flows…
Question 3In sampling, which model actually runs is chosen by the…
Question 4The primary defense against a server abusing sampling is…
Question 5 · spaced recall from Lesson 6Tool annotations from an untrusted server should be treated as…