Agent Anti-Patterns · ~7 min
"AI reviewer in GitHub Actions" sounds like a productivity win. In its default shape it's a CVSS 9.4 critical — and the model is not the gate.
The default shape of an AI reviewer ingests PR titles, issue bodies, and comments — all
attacker-writable on a public repo — while the same runtime holds GITHUB_TOKEN, pipeline
secrets, and write tools. That is the lethal trifecta, closed on every workflow run.
GitInject (Isbarov et al. 2026) provisioned ephemeral repos against four AI providers in their default configs and found every provider susceptible to at least one attack class. The exploit:
Anthropic rated the Claude Code Security Review variant CVSS 9.4 Critical after a malicious PR
title broke context and dumped env to a public comment.
The mechanism is provenance-blindness: transformer attention has no channel separating a system prompt from a PR title that just entered context. Once the attacker's text lands in the runtime holding repo credentials, the model's compliance is enough. Microsoft attributes the class to "untrusted GitHub data flowing into an AI agent that holds production secrets and unrestricted tool access in the same runtime."
Close one leg of the trifecta on every path. The decisive move is architectural separation:
The reviewer never touches the credentialed actor's context; the actor never reads attacker-controlled bytes. Measured attack-success drops to 0.31% under two-agent isolation and 0% with full read/write separation — a 323x reduction. Treat PR titles, issue bodies, and comments as adversarial input at the boundary.
Blanket hardening isn't always proportional. The thesis narrows for private repos with vetted
contributors (the untrusted leg closes at access control), pure read-only agents with no
write tooling (the egress leg closes at the allowlist), and runtimes with no production secrets.
Where two-agent separation is impractical, defence-in-depth — output secret scanning, a mandatory human merge
gate, scoped GITHUB_TOKEN — covers the realistic surface.
Retrieval practice — recall, don't peek
Question 1The default AI-reviewer shape is dangerous because it…
Question 2The model follows a PR title as instructions because of…
Question 3The decisive fix is to…
Question 4Two-agent isolation drops the attack-success rate by about…
Question 5 · spaced recall from Lesson 09The fix for single-layer injection defence is…