Part 6 · The Output & Data Surface

Security · ~7 min

The Output Is Untrusted Too

Fourteen lessons defended the way in — what the agent reads. This is the way out: what the agent writes, executed or rendered downstream without anyone checking. Same trust failure, opposite direction.

Why this, for you: the whole course so far has hardened the input side of the agent. But the string the model emits crosses into SQL, a shell, a renderer, or a package manager, and that boundary has no validator on it. This is the most common omission in agent designs that take injection seriously. OWASP calls it LLM05, and about a quarter of all AI-generated code ships with a vulnerability.

Trust does not transfer through a string boundary. Your input-validation layer checked the user prompt, not the model response — so when that response reaches a code-interpreting sink, the system meets text nothing in the pipeline ever scrutinised. Treat agent output as untrusted input to the next system.

1 The model is a user — at every sink

OWASP LLM05:2025 Improper Output Handling is "insufficient validation, sanitization, and handling of the outputs generated by large language models before they are passed downstream." Its rule is one sentence: treat the model as any other user and apply input validation to its responses. The five canonical sinks:

| Sink | Risk | Per-sink control |
|---|---|---|
| exec/eval/shell | Remote code execution | Command allowlist; never eval a model string |
| Unparameterised SQL | SQL injection | Prepared statements, never interpolation |
| Browser-rendered HTML/markdown | XSS, image exfiltration | Context-aware encoding; CSP; fetch gating |
| Unsanitised file paths | Path traversal | Canonicalise, constrain to a base dir |
| Package manager | Slopsquatting install | Resolve against an installed lockfile first |

The controls are not new — parameterised queries, HTML encoding, allowlists. What's new is the applicability surface: LLM-generated strings now reach sinks that previously consumed only validated or trusted text.
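
As a rough illustration of two rows in the table (file paths and shell commands), the sketch below uses only the Python standard library. The base directory, the command allowlist, and the function names `safe_path` and `safe_argv` are assumptions made for this example, not part of any particular framework.

```python
import shlex
from pathlib import Path

# Hypothetical policy values, chosen for illustration only.
BASE_DIR = Path("/srv/agent-workspace")
ALLOWED_COMMANDS = {"ls", "cat", "grep"}


def safe_path(model_path: str, base: Path = BASE_DIR) -> Path:
    """Canonicalise a model-supplied path and require it to stay under base."""
    base_resolved = base.resolve()
    # Joining an absolute path discards the base, so the check below catches it.
    candidate = (base_resolved / model_path).resolve()
    if not candidate.is_relative_to(base_resolved):  # Python 3.9+
        raise ValueError(f"path escapes base directory: {model_path!r}")
    return candidate


def safe_argv(model_command: str) -> list[str]:
    """Split a model-supplied command into argv and check the program name."""
    argv = shlex.split(model_command)
    if not argv or argv[0] not in ALLOWED_COMMANDS:
        raise ValueError(f"command not allowlisted: {model_command!r}")
    # Intended for subprocess.run(argv) with shell=False, so characters
    # such as ';' or '|' are never interpreted by a shell.
    return argv
```

The allowlist here only checks the program name. Arguments that name files would still need their own gate, for example by passing them through `safe_path`.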

2 A real CVE: the model that wrote SQL

The canonical instantiation is CVE-2025-1793 in LlamaIndex (CVSS 9.8): vector-store integrations built SQL by string-concatenating an LLM-generated query, fixed in 0.12.28 by switching to parameterised queries. The fix isn't LLM-specific — it's the same parameterisation that defends any user-supplied SQL string.

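    # BEFORE: model output interpolated straight into SQL
    filter_clause = llm.generate(f"build a WHERE clause for: {q}")
    cursor.execute(f"SELECT * FROM docs WHERE {filter_clause}")
    # e.g. filter_clause = "1=1; DROP TABLE docs --"

    # AFTER: model emits a schema-constrained object; executor parameterises
    f = Filter.model_validate(llm.structured_output(q, schema=Filter))
    # Placeholders bind values only, so the column and operator are looked up
    # in fixed allowlists (ALLOWED_COLUMNS / ALLOWED_OPS are illustrative names).
    column = ALLOWED_COLUMNS[f.field]  # KeyError for anything not allowlisted
    op = ALLOWED_OPS[f.op]
    cursor.execute(f"SELECT * FROM docs WHERE {column} {op} %s", (f.value,))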

The model never writes SQL; a deterministic executor builds the parameterised query from validated fields. Schema-constrained tool calling makes the schema the validator.
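
A minimal, self-contained sketch of such a deterministic executor follows, using the standard-library `sqlite3` module and a plain dataclass in place of a schema library. The `docs` table layout, the allowlists, and the names `Filter.validate` and `run_filter` are assumptions for illustration; the dictionary stands in for a model's structured output.

```python
import sqlite3
from dataclasses import dataclass

# Hypothetical allowlists: identifiers and operators cannot be bound as
# parameters, so they are checked against fixed tables instead.
ALLOWED_COLUMNS = {"author", "year", "title"}
ALLOWED_OPS = {"eq": "=", "lt": "<", "gt": ">"}


@dataclass(frozen=True)
class Filter:
    field: str
    op: str
    value: object

    @classmethod
    def validate(cls, raw: object) -> "Filter":
        """Reject any model output that does not match the schema."""
        if not isinstance(raw, dict) or set(raw) != {"field", "op", "value"}:
            raise ValueError("filter must have exactly field, op, value")
        if raw["field"] not in ALLOWED_COLUMNS:
            raise ValueError(f"column not allowed: {raw['field']!r}")
        if raw["op"] not in ALLOWED_OPS:
            raise ValueError(f"operator not allowed: {raw['op']!r}")
        if not isinstance(raw["value"], (str, int)):
            raise ValueError("value must be a string or integer")
        return cls(raw["field"], raw["op"], raw["value"])


def run_filter(conn: sqlite3.Connection, f: Filter) -> list[tuple]:
    # Column and operator come from the allowlists; only the value is bound.
    sql = f"SELECT title FROM docs WHERE {f.field} {ALLOWED_OPS[f.op]} ?"
    return conn.execute(sql, (f.value,)).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (title TEXT, author TEXT, year INTEGER)")
conn.execute("INSERT INTO docs VALUES ('Sinks', 'ana', 2025)")

# Stand-in for a hostile structured output from the model.
hostile = {"field": "author", "op": "eq",
           "value": "ana' OR 1=1; DROP TABLE docs --"}
rows = run_filter(conn, Filter.validate(hostile))  # value is bound as data
```

In this sketch the hostile value is compared as an ordinary string, so it matches no row and the table is untouched. A hostile `field` or `op` fails validation before any SQL is built.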

3 Distinct from excessive agency

This is not Lesson 8's blast radius. OWASP draws the line: LLM06 Excessive Agency is the agent taking action (too much functionality, permission, autonomy); LLM05 is a downstream consumer mishandling the text the agent produced. A perfectly permission-bounded agent can still trigger LLM05 if its bounded actions emit strings consumed unsafely. Defend both — independently.
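
To make the distinction concrete, suppose an agent whose only permitted action is returning a markdown summary, so its agency is already bounded. The LLM05 gate then belongs in the renderer that consumes that markdown. The sketch below is one possible fetch gate for inline images; the host allowlist, the regular expression, and the name `gate_images` are assumptions for illustration, and it does not handle reference-style images or raw HTML.

```python
import re
from urllib.parse import urlsplit

# Hypothetical allowlist of hosts the renderer may fetch images from.
ALLOWED_IMAGE_HOSTS = {"docs.example.com"}

# Matches inline markdown images of the form ![alt](url) or ![alt](url "title").
IMAGE_PATTERN = re.compile(r"!\[([^\]]*)\]\(([^)\s]+)[^)]*\)")


def gate_images(markdown: str) -> str:
    """Replace images from non-allowlisted hosts with a text placeholder."""

    def replace(match: re.Match) -> str:
        alt, url = match.group(1), match.group(2)
        parts = urlsplit(url)
        if parts.scheme == "https" and parts.hostname in ALLOWED_IMAGE_HOSTS:
            return match.group(0)  # keep the image unchanged
        # Dropping the URL means the browser never issues the request,
        # so nothing encoded in the query string can leave.
        return f"[image removed: {alt}]"

    return IMAGE_PATTERN.sub(replace, markdown)
```

The agent's permissions never change in this example. Only the consumer of its text does, which is why the two controls are defended independently.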

Where the gate is theatre

Per-sink validation is not free, and four conditions invert it. Mature teams that already parameterise every query and escape every render gain nothing from a parallel LLM-specific scanner: confirm the existing controls cover model output, don't duplicate them. An agent that drafts emails for human review or writes code a developer reads before commit has no code-interpreting sink in that path, so there is nothing to gate. Stream-level "strip URLs / strip code" sanitisers reject legitimate docs and samples; prefer escaping at the sink over stripping the stream. And strict structured outputs already constrain the model to schema-conformant fields, so a second validator over the same fields adds cost without coverage.
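
The difference between stripping the stream and escaping at the sink can be sketched with the standard library. Both functions and the sample string are hypothetical; the stripping regexes are deliberately naive to mirror the kind of sanitiser described above.

```python
import html
import re


def strip_stream(text: str) -> str:
    """Stream-level sanitiser: deletes anything that looks like a tag or URL."""
    text = re.sub(r"<[^>]*>", "", text)
    return re.sub(r"https?://\S+", "", text)


def escape_at_sink(text: str) -> str:
    """Sink-level control: HTML-encode the text right before it is rendered."""
    return html.escape(text, quote=True)


# A legitimate answer that documents HTML and links to a reference,
# followed by a hostile script tag.
sample = ("Use <b>bold</b> tags; see https://example.com/docs "
          "<script>alert(1)</script>")

stripped = strip_stream(sample)   # loses the example tag and the link
escaped = escape_at_sink(sample)  # keeps both, rendered as inert text
```

Escaping is lossless: `html.unescape(escaped)` returns the original string, while the stripped version can no longer show the reader the tag or the URL it was meant to explain.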

↪ Your win: enumerate the sinks, gate each one

Retrieval practice — recall, don't peek

Question 1: OWASP LLM05 Improper Output Handling is about…

Question 2: The reason output handling fails is that trust…

Question 3: The per-sink controls for LLM05 are best described as…

Question 4: The right fix for the CVE-2025-1793 SQL class is to…

Question 5 (spaced recall from Lesson 14): In a URL-exfiltration attack, the private data leaks…

Ask me anything. Want to enumerate the downstream sinks in an agent you're shipping, or see how Safe Outputs and the Action-Selector pattern each realise per-sink validation? Next in Part 6: The Payload That Waits — how a dormant instruction sits in memory across a hundred sessions.