Part 3 · Architecting the Defense

Security · ~7 min

The Model Is Not the Firewall

If "should I connect to this URL?" is a decision the model makes, injection defeats your egress control. Move the decision into the harness.

Why this, for you: egress control and least privilege are the two levers that bound how much damage a successful injection can do. Get them right and an attacker who fully owns the model still can't reach anything worth taking. This is where you turn the threat model into shipped config.

Agent tools that do network I/O — fetch, the browser, MCP servers, shell curl/wget — are the primary exfiltration channel. A successful injection can instruct the agent to fetch any URL. So the question is: where does the connect/deny decision live?

1 Move egress out of the model

An admin-controlled domain policy moves the decision into the harness runtime: the request is rejected before it leaves the process, regardless of what the model produced. Three tools converged on the same primitive in 2026 — Claude Code's sandbox.network.deniedDomains, GitHub's Copilot org firewall, and VS Code's ChatAgentNetworkFilter group policy. Different delivery, identical primitive.

If the decision to connect lives in the model, injection defeats egress control. Moving the check to the harness makes isolation structural, not probabilistic.
Posture | When to use | Default
--- | --- | ---
Allow-first + default-deny | Regulated, high-sensitivity, cloud runners | Block unless allowed
Deny-first | Interactive dev loops, narrow known-bad blocks | Allow unless denied
```json
{
  "sandbox": {
    "network": {
      "allowedDomains": ["*.internal.corp.example", "registry.npmjs.org"],
      "deniedDomains": ["telemetry.internal.corp.example"]
    }
  }
}
```

Here the deniedDomains entry overrides the *.internal.corp.example wildcard.

Denies must take precedence over allow wildcards. Deliver this through managed settings (MDM, Group Policy, admin console) — it's org configuration, not a per-user preference.
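
To make the precedence rule concrete, here is a minimal sketch of a harness-side check in Python. It is not how any of the tools above implement their matcher; `DomainPolicy`, `permits`, and `default_allow` are hypothetical names chosen for illustration, and wildcards are assumed to cover subdomains only.

```python
from dataclasses import dataclass


def _matches(host: str, pattern: str) -> bool:
    """Match an exact host, or a '*.suffix' wildcard covering subdomains only."""
    if pattern.startswith("*."):
        return host.endswith(pattern[1:])  # "*.corp.example" -> ".corp.example"
    return host == pattern


@dataclass(frozen=True)
class DomainPolicy:
    allowed: tuple[str, ...] = ()
    denied: tuple[str, ...] = ()
    default_allow: bool = False  # False models the allow-first + default-deny posture

    def permits(self, host: str) -> bool:
        host = host.lower().rstrip(".")
        # Deny rules are checked first so they override any allow wildcard.
        if any(_matches(host, p) for p in self.denied):
            return False
        if any(_matches(host, p) for p in self.allowed):
            return True
        return self.default_allow


policy = DomainPolicy(
    allowed=("*.internal.corp.example", "registry.npmjs.org"),
    denied=("telemetry.internal.corp.example",),
)
```

With this policy, `policy.permits("git.internal.corp.example")` is true, while the telemetry host is rejected even though the wildcard would otherwise cover it. Setting `default_allow=True` gives the deny-first posture from the table.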

2 Least privilege bounds the blast radius

The damage an agent can do is bounded by the permissions you grant it. Every permission it doesn't need is attack surface. Scope four dimensions per agent: tool access, file scope, permission mode, repository access.

Agent type | Least-privilege profile
--- | ---
Research / explorer | Read, WebFetch; no write tools
Reviewer | Read, Comment; no merge, no push
Content drafter | Read, Write to one directory
Deployer | Bash (restricted), no file write

Tool restrictions are enforced by the runtime, not the model — the environment filters which tools exist before the model sees a request, so an injected prompt can't invoke a tool that isn't there. Decompose one broad agent into narrow-scoped chained agents and a single injection can't reach the tools it doesn't hold.
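
One way to picture runtime-enforced tool scoping is a registry that the harness filters per agent before anything is exposed to the model. The sketch below is illustrative only: `ALL_TOOLS`, `PROFILES`, `tools_for`, and `dispatch` are hypothetical names, and the tool bodies are stand-ins rather than real I/O.

```python
from typing import Callable

# Hypothetical tool implementations; a real harness would wire these to actual I/O.
ALL_TOOLS: dict[str, Callable[[str], str]] = {
    "Read": lambda arg: f"read {arg}",
    "WebFetch": lambda arg: f"fetched {arg}",
    "Write": lambda arg: f"wrote {arg}",
    "Bash": lambda arg: f"ran {arg}",
}

# Per-agent grants, mirroring two rows of the profile table above.
PROFILES: dict[str, frozenset[str]] = {
    "researcher": frozenset({"Read", "WebFetch"}),
    "drafter": frozenset({"Read", "Write"}),
}


def tools_for(agent_type: str) -> dict[str, Callable[[str], str]]:
    """Build the tool table the model is shown; ungranted tools are simply absent."""
    granted = PROFILES[agent_type]
    return {name: fn for name, fn in ALL_TOOLS.items() if name in granted}


def dispatch(agent_type: str, tool: str, arg: str) -> str:
    """Run a model-requested tool call, re-checking the grant at call time."""
    tools = tools_for(agent_type)
    if tool not in tools:
        raise PermissionError(f"{agent_type} has no tool named {tool!r}")
    return tools[tool](arg)
```

Because the check runs in `dispatch` rather than in the prompt, an injected instruction to call `Write` from the researcher agent fails with `PermissionError` no matter how the model was persuaded.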

3 Egress policy isn't the whole story

Domain allowlists narrow destinations; they don't close every channel:

Gap | Pair with
--- | ---
Data smuggled in a URL query string to an allowed domain | URL exfiltration guard
Trusted domain 3xx-redirects to an attacker | Refuse redirects
Subprocess opens a raw socket, bypassing the harness | OS netns / forward proxy below the harness
Bug in the allowlist matcher itself | Lower-layer enforcement that doesn't trust the parser
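
The first two rows can be sketched with the Python standard library. This is a rough illustration under stated assumptions, not a complete guard: `check_outbound_url`, `build_no_redirect_opener`, and the `MAX_QUERY_BYTES` budget are hypothetical, and a query-length cap is only a crude heuristic (data can also hide in the path or in subdomain labels).

```python
import urllib.request
from urllib.parse import urlsplit

MAX_QUERY_BYTES = 256  # assumed budget; tune per deployment


def check_outbound_url(url: str, allowed_hosts: frozenset[str]) -> None:
    """Raise ValueError if the URL should not be fetched."""
    parts = urlsplit(url)
    if parts.scheme != "https":
        raise ValueError("only https is permitted")
    if parts.hostname not in allowed_hosts:
        raise ValueError(f"host not allowed: {parts.hostname!r}")
    if len(parts.query.encode()) > MAX_QUERY_BYTES:
        raise ValueError("query string exceeds exfiltration budget")


class RefuseRedirects(urllib.request.HTTPRedirectHandler):
    """Decline to follow any 3xx response."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        # Returning None means no redirect is followed; urllib then falls
        # back to its default error handler, which raises HTTPError.
        return None


def build_no_redirect_opener() -> urllib.request.OpenerDirector:
    """Opener for agent fetches that surfaces redirects as errors."""
    return urllib.request.build_opener(RefuseRedirects)
```

A fetch tool would call `check_outbound_url` first and then use the opener, so a trusted host that answers with a 302 to an attacker produces an error instead of a second request. The raw-socket and matcher-bug rows still need enforcement below the process.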

The matcher itself is a trust boundary

When the check lives in the harness, one bug there bypasses every policy. In May 2026 a SOCKS5 null-byte injection in Claude Code let attacker.com\0.google.com pass the JavaScript endsWith() allowlist while getaddrinfo() truncated at the null byte and dialed the attacker. Every release from v2.0.24 through v2.1.89 was vulnerable. Pin to patched runtimes, watch disclosures, and assume the matcher will fail at least once over the deployment's life — keep a lower enforcement layer underneath it.
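
The class of bug described here, where the matcher and the resolver disagree about what the hostname is, can be narrowed by validating the string before any suffix comparison. The following is a hedged sketch, not the fix that shipped in any product: `canonical_host` and `allowed_by_suffix` are hypothetical helpers, and the pattern assumes ASCII hostnames (internationalized names would need to arrive already punycode-encoded).

```python
import re

# Labels of letters, digits, and hyphens, separated by dots. Anything else,
# including NUL and other control characters, fails the match.
_HOSTNAME = re.compile(r"(?!-)[a-z0-9-]{1,63}(?<!-)(\.(?!-)[a-z0-9-]{1,63}(?<!-))*")


def canonical_host(raw: str) -> str:
    """Return a lowercase ASCII hostname, or raise ValueError for anything unusual."""
    host = raw.lower().rstrip(".")
    if len(host) > 253 or not _HOSTNAME.fullmatch(host):
        raise ValueError(f"refusing non-canonical hostname: {raw!r}")
    return host


def allowed_by_suffix(raw: str, suffix: str) -> bool:
    """Suffix check that only ever sees a validated hostname."""
    host = canonical_host(raw)
    return host == suffix or host.endswith("." + suffix)
```

A bare suffix test accepts `"attacker.com\0.google.com"` because the string really does end in `.google.com`; `canonical_host` rejects it before the comparison runs. The same validated string should be the one handed to the resolver, and a lower enforcement layer is still warranted in case this parser has its own bug.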

Least privilege bounds per-action damage, not duration

A Kiteworks 2026 report found 60% of organizations can't terminate a misbehaving agent. A narrowly-scoped agent still accumulates damage between detection and shutdown. Pair scoping with a termination path the agent can't block — a supervisor heartbeat, harness circuit breaker, or orchestrator timeout.
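
A harness circuit breaker of the kind mentioned above might look like the sketch below. It assumes the harness calls `authorize()` before every tool invocation and that the agent holds no reference to the breaker; `CircuitBreaker`, its parameters, and the injectable `clock` are hypothetical names for illustration.

```python
import time
from typing import Callable


class CircuitBreaker:
    """Harness-side kill switch: trips on an action budget or a wall-clock deadline."""

    def __init__(self, max_actions: int, max_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._max_actions = max_actions
        self._clock = clock
        self._deadline = clock() + max_seconds
        self._actions = 0
        self.tripped = False

    def trip(self) -> None:
        """Called by a supervisor or operator, never by the agent."""
        self.tripped = True

    def authorize(self) -> None:
        """Call before every tool invocation; raises once the breaker is open."""
        if self._clock() >= self._deadline or self._actions >= self._max_actions:
            self.tripped = True
        if self.tripped:
            raise RuntimeError("agent terminated by circuit breaker")
        self._actions += 1
```

Once tripped, the breaker stays open, so every later tool call fails until an operator builds a fresh one. An in-process breaker only stops calls that pass through the harness; an orchestrator timeout that kills the process covers the case where the harness itself hangs.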

↪ Your win: structural egress + bounded blast radius

Retrieval practice — recall, don't peek

Question 1: Egress control must live in the harness because…

Question 2: In a domain policy, deny rules should…

Question 3: Tool restrictions in agent config are enforced by the…

Question 4: A domain allowlist does not stop…

Question 5 · spaced recall from Lesson 4: You need both filesystem and network walls because…

Next in Part 3: Decide Before You Look (plan-then-execute for web agents).