Part 2 · Building Tools That Drive Well

MCP Server Design · ~7 min

The Onboarding Gate

Wiring in a new MCP server is the moment risk concentrates. One server can quietly bring access to private data, a channel for untrusted input, and a way out: all three legs of the lethal trifecta at once.

Why this, for you: the server you ship runs inside someone's agent, next to their secrets and their untrusted inputs. If your server can also reach the network, you may have just built the exfiltration path. Knowing the trifecta lets you design a server that's safe to onboard — and lets you gate the ones that aren't.

LLMs cannot reliably separate trusted instructions from injected ones. Once untrusted content enters context, it can influence tool calls. So MCP security isn't a prompt problem — it's an architecture problem, and the architecture lens is the lethal trifecta.

1 The three legs

The lethal trifecta (Willison, 2025) names three capabilities that are individually fine but catastrophic together:

| Leg | What it means | Examples |
|---|---|---|
| Private data | Secrets, credentials, PII, proprietary code | .env, DB connections, internal repos |
| Untrusted input | Content the agent didn't author and can't fully trust | PR comments, issues, fetched pages, deps |
| External communication | The ability to send data outside the sandbox | HTTP tools, MCP servers with outbound calls |

No execution path should hold all three legs. The defense isn't a smarter prompt; it's removing at least one leg from every path. Which leg you remove depends on the task.

2 Why MCP onboarding is the danger point

A single MCP server can supply more than one leg at once — and the failure is silent. Three documented attack shapes, each closed by removing a leg:

MCP tool exfiltration

A malicious server shadows trusted tools, reads private context, and forwards it externally. Fix: restrict the server's egress.

Poisoned dependency

The agent reads an issue (untrusted input) naming a malicious package, installs it (egress), and the package exfiltrates env vars (private data). Fix: remove egress.

Cross-agent privilege escalation

One agent rewrites another's config to drop sandbox constraints — granting all three legs. Fix: protect config files from writes.
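
The third fix, protecting config files from writes, can be sketched as a path check that runs before any agent-initiated write. This is a minimal illustration, not a complete control: the protected paths and the function name `is_write_allowed` are hypothetical, and the check is purely lexical, so it does not resolve symlinks.

```python
import posixpath
from pathlib import PurePosixPath

# Hypothetical protected locations; real paths depend on the agent runtime.
PROTECTED = (
    PurePosixPath("/workspace/.agent"),
    PurePosixPath("/workspace/mcp.json"),
)


def is_write_allowed(target: str, protected=PROTECTED) -> bool:
    """Deny writes to a protected path or to anything beneath it."""
    # Collapse "." and ".." lexically so "src/../.agent/x" cannot slip past.
    # Symlinks are NOT resolved here; a real guard would also need that.
    normalized = PurePosixPath(posixpath.normpath(target))
    return not any(
        normalized == p or p in normalized.parents for p in protected
    )
```

A check like this only helps if it is enforced outside the model, for example by the sandbox or file-system permissions, rather than by instructions in the prompt.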

Notice the pattern: a doc-grounding or web-fetch server quietly supplies the untrusted input leg; a server with outbound HTTP supplies egress. Onboard one onto a principal that already holds the other two legs, private data among them, and you've closed the trifecta without anyone deciding to.
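
One way to make that decision explicit is an onboarding gate that combines the legs a principal already holds with the legs a new server would add. The sketch below is an assumption about how such a gate might look; `Leg`, `TRIFECTA`, and `may_onboard` are illustrative names, not part of MCP or any SDK.

```python
from enum import Flag, auto


class Leg(Flag):
    PRIVATE_DATA = auto()
    UNTRUSTED_INPUT = auto()
    EGRESS = auto()


TRIFECTA = Leg.PRIVATE_DATA | Leg.UNTRUSTED_INPUT | Leg.EGRESS


def may_onboard(principal_legs: Leg, server_legs: Leg) -> bool:
    """Return False when adding the server would give the principal all three legs."""
    combined = principal_legs | server_legs
    return (combined & TRIFECTA) != TRIFECTA


# A principal with private data and egress; a doc-grounding server adds untrusted input.
principal = Leg.PRIVATE_DATA | Leg.EGRESS
doc_server = Leg.UNTRUSTED_INPUT
blocked = not may_onboard(principal, doc_server)
```

The gate does not judge whether the server is malicious. It only refuses a combination that closes the trifecta, which forces someone to remove a leg before onboarding proceeds.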

3 Audit per path, then remove a leg

Audit each execution path, not each agent. A path with three “Yes” values demands architectural mitigation; a path with at least one “No” counts as safe:

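| Execution path | Private? | Untrusted? | Egress? | Safe? |
|---|---|---|---|---|
| Code review agent | Yes | Yes | No | Yes |
| Research agent | No | Yes | Yes | Yes |
| Deploy agent with env vars | Yes | Yes | Yes | No |
| Internal codegen | Yes | No | Yes | Yes |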
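
The audit above is mechanical enough to run as a check. The following sketch encodes the same four rows and derives the Safe column from the other three; the class and field names are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExecutionPath:
    name: str
    private: bool
    untrusted: bool
    egress: bool

    @property
    def safe(self) -> bool:
        # A path is unsafe only when it holds all three legs.
        return not (self.private and self.untrusted and self.egress)


PATHS = [
    ExecutionPath("Code review agent", True, True, False),
    ExecutionPath("Research agent", False, True, True),
    ExecutionPath("Deploy agent with env vars", True, True, True),
    ExecutionPath("Internal codegen", True, False, True),
]

# Paths that need an architectural fix before they ship.
needs_mitigation = [p.name for p in PATHS if not p.safe]
```

Here `needs_mitigation` contains only the deploy agent, matching the table.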

For coding agents, remove egress first — most tasks need no network, and a default-deny sandbox is a deterministic control the model can't override. As a server author, this is what you build toward: scoped, short-lived credentials injected at runtime; no outbound calls the task doesn't require; config files protected from agent writes.
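
As a rough illustration of default-deny egress, the policy logic can be as small as an allowlist that starts empty. The names `egress_allowed` and `DEFAULT_ALLOWED_HOSTS` are hypothetical, and in practice the rule would be enforced by the sandbox or network layer, not by code the model can influence.

```python
from urllib.parse import urlsplit

# Default-deny: the allowlist starts empty, so nothing leaves the sandbox.
DEFAULT_ALLOWED_HOSTS = frozenset()


def egress_allowed(url: str, allowed_hosts=DEFAULT_ALLOWED_HOSTS) -> bool:
    """Permit an outbound request only if its host is explicitly allowlisted."""
    host = urlsplit(url).hostname  # lowercased by urlsplit; None if absent
    return host is not None and host in allowed_hosts


# A task that genuinely needs one host gets exactly that host (illustrative value).
task_hosts = frozenset({"pypi.org"})
```

Matching on the exact hostname, rather than on a substring or suffix, keeps a lookalike such as `pypi.org.evil.test` from passing the check.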

Removal migrates risk — it doesn't erase it

Tokenizing PII shifts the attack to the token resolver; sandboxing egress shifts it to sandbox-escape. Each removed leg creates a new high-value target that must itself be hardened. The trifecta is a structural heuristic, not a guarantee.

↪ Your win: gate the onboarding, remove a leg

Retrieval practice — recall, don't peek

Question 1: The three legs of the lethal trifecta are…

Question 2: The trifecta defense is fundamentally to…

Question 3: For coding agents, the leg usually cheapest to remove is…

Question 4: A doc-grounding or web-fetch MCP server most often supplies which leg?

Question 5 · spaced recall from Lesson 3: An MCP error meant for the agent to self-correct is signalled by…

Next in Part 2: Found and Versioned, on discovery, eager-vs-JIT loading, and keeping a server's surface stable as it evolves.