Garbage-Collecting Entropy

Codebases rot between commits — docs drift, deprecated patterns spread, conventions decay in corners no one watches. Scheduled agents catch that decay on a cadence and hand you one-minute PRs.

Why this, for you: the velocity an agent gives you fades in months while the quality debt compounds indefinitely. This lesson is the proactive counter — turn your taste into machine-checkable rules once, then let an agent enforce them on every line, continuously.

Entropy reduction agents are scheduled background processes that scan for violations of encoded standards and open targeted PRs for human review. They run on a cadence whether or not anyone pushes — catching decay that reactive CI misses. OpenAI's harness team calls this "garbage collection" of technical debt.

1 Encode taste once, enforce it continuously

The pattern has three mechanisms: encode golden principles as mechanical constraints (lint rules, architectural tests, agent instructions); run background agents on a cadence scanning for deviations; and auto-generate targeted refactoring PRs reviewable in under one minute.

The core design principle: "Human taste is captured once, then enforced continuously on every line of code." Encoding a vague principle as a precise rule — "all retry logic must use retry_with_backoff" — is itself the win: vague principles can't be enforced; precise ones survive team turnover.
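The retry rule quoted above can be made mechanical with a small check. What follows is a minimal sketch, assuming a Python codebase and assuming that "retry logic" can be approximated as any loop that both catches an exception and sleeps. The function name find_handrolled_retries and the heuristic itself are illustrative, not taken from the source.

```python
import ast

RULE = "all retry logic must use retry_with_backoff"


def find_handrolled_retries(source: str, filename: str = "<memory>") -> list[tuple[str, int]]:
    """Return (filename, line) for each loop that contains both a try block and a sleep call.

    Heuristic (an assumption): a loop that catches an exception and then sleeps
    is treated as hand-rolled retry logic that should call retry_with_backoff.
    Nested loops may be reported more than once.
    """
    violations = []
    for node in ast.walk(ast.parse(source, filename=filename)):
        if not isinstance(node, (ast.For, ast.While)):
            continue
        inner = list(ast.walk(node))
        catches = any(isinstance(n, ast.Try) for n in inner)
        sleeps = any(
            isinstance(n, ast.Call)
            and isinstance(n.func, ast.Attribute)
            and n.func.attr == "sleep"
            for n in inner
        )
        if catches and sleeps:
            violations.append((filename, node.lineno))
    return violations


# Hypothetical file contents used only to exercise the check.
SAMPLE = '''
import time

def fetch(client):
    for attempt in range(3):
        try:
            return client.get()
        except IOError:
            time.sleep(2 ** attempt)

def fetch_ok(client):
    return retry_with_backoff(client.get)
'''

print(find_handrolled_retries(SAMPLE, "sample.py"))  # [('sample.py', 5)]
```

A check like this is deliberately narrow. It will miss retries that do not sleep and may flag polling loops, which is the kind of false positive worth tuning out before the scan runs on a schedule.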

Why it works: entropy accumulates because the cost of noticing each violation is high — nobody's paid to scan the whole codebase weekly. Entropy agents eliminate the noticing cost. Caught on a short cadence, each violation is small and isolated, so the fix PR is small and reviewable in under a minute. The same debt caught quarterly has compounded into a risky change.

2 CI is reactive; this is proactive

Dimension	Traditional CI	Entropy reduction
Trigger	Code push / PR	Schedule (nightly, weekly)
Posture	Reactive	Proactive
Scope	Changed files	Entire codebase
Output	Pass / fail	Refactoring PR

The two are complementary. Deterministic linters catch rule-expressible violations; LLM agents handle judgment-heavy ones. Start minimal — one golden principle, a tech-debt-tracker.md the agent reads and updates, one periodic scan prompt — and graduate from weekly manual runs to nightly once false positives are rare. OpenAI runs Codex overnight; every morning, fixes are already waiting.
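The tracker the agent reads and updates can be very small. The sketch below assumes a format the source does not specify: tech-debt-tracker.md holds Markdown checklist lines, "- [ ]" for open items and "- [x]" for resolved ones. The helper names open_items, mark_resolved, and add_items are hypothetical.

```python
import re

# Assumed line format: "- [ ] <item>" (open) or "- [x] <item>" (resolved).
ENTRY = re.compile(r"^- \[(?P<state>[ x])\] (?P<item>.+)$")


def _entries(text: str):
    """Yield (line_index, resolved, item) for every checklist line."""
    for index, line in enumerate(text.splitlines()):
        match = ENTRY.match(line)
        if match:
            yield index, match.group("state") == "x", match.group("item")


def open_items(text: str) -> list[str]:
    """Items the agent should still act on."""
    return [item for _, done, item in _entries(text) if not done]


def mark_resolved(text: str, item: str) -> str:
    """Tick the checkbox for an item once its fix has merged."""
    lines = text.splitlines()
    for index, done, found in _entries(text):
        if found == item and not done:
            lines[index] = f"- [x] {item}"
    return "\n".join(lines) + "\n"


def add_items(text: str, findings: list[str]) -> str:
    """Append findings from a scan, skipping anything already tracked."""
    known = {item for _, _, item in _entries(text)}
    lines = text.splitlines()
    for finding in findings:
        if finding not in known:
            lines.append(f"- [ ] {finding}")
            known.add(finding)
    return "\n".join(lines) + "\n"
```

Used this way, the file is both input and output for the agent: a scan calls add_items, a merged fix calls mark_resolved, and the next run only sees what open_items returns.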

Unsupervised refactors break code ~⅔ of the time

CodeScene data shows AI breaks code in roughly two-thirds of refactoring attempts without proper validation. So three guardrails apply: human review stays non-negotiable, the existing tests run against each fix before its PR is opened, and each PR is scoped to one violation. Poorly specified principles produce high false-positive rates, reviewers start ignoring PRs, and the pattern collapses into noise, so tune the rule before scaling the cadence.
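The two mechanical guardrails (tests first, one violation per PR) can be sketched as a planning step that runs before anything is opened. This is an illustration under stated assumptions: Violation, PullRequestPlan, and plan_prs are hypothetical names, and tests_pass stands in for whatever applies a candidate fix on a branch and runs the project's existing test suite.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class Violation:
    path: str
    line: int
    rule: str


@dataclass(frozen=True)
class PullRequestPlan:
    title: str
    violation: Violation  # exactly one violation per planned PR


def plan_prs(
    violations: Iterable[Violation],
    tests_pass: Callable[[Violation], bool],
) -> list[PullRequestPlan]:
    """Plan one PR per violation, keeping only fixes whose test run passed.

    tests_pass is an injected callable (an assumption of this sketch): it is
    expected to apply the fix for a single violation and report whether the
    existing tests still pass.
    """
    plans = []
    for violation in violations:
        if not tests_pass(violation):
            continue  # a fix that breaks tests never reaches a reviewer
        title = f"Enforce {violation.rule} at {violation.path}:{violation.line}"
        plans.append(PullRequestPlan(title=title, violation=violation))
    return plans
```

Every plan that survives still goes to a human reviewer. The gate only filters out fixes that are already known to break tests, so reviewers spend their minute on changes that have a chance of being correct.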

Don't paper over a slop factory

This is not a substitute for fixing the root-cause process that generates debt. If agents produce entropy faster than scheduled cleanup can clear it, fix the upstream problem first. The AI velocity spike (+281% lines in month 1) fades by month 3 while the increases in static warnings (+30%) and complexity (+42%) persist. Entropy reduction is one lever; scaled QA is the system.

↪ Your win: a codebase that cleans itself on a schedule

Encode one golden principle as a machine-checkable rule — the precision is the durable value.
Schedule the scan — proactive, whole-codebase, on a cadence CI never reaches.
One violation per PR — small, reviewable in under a minute, trivial to revert.
Gate on tests + human review — two-thirds of unsupervised refactors break code.
Keep the tracker current — a stale tech-debt-tracker.md re-raises resolved issues.

Retrieval practice — recall, don't peek

Question 1: Entropy reduction agents differ from CI because they are…

Question 2: The design principle behind the pattern is that human taste is…

Question 3: Each generated PR should be scoped to…

Question 4: CodeScene found unsupervised AI refactors break code in roughly…

Question 5 · spaced recall from Lesson 9: In the monolith-to-sub-agents refactor, each sequential boundary acts as a…

Ask me anything. Want the weekly GitHub Actions cron workflow that opens one-violation PRs, or how the tech-debt tracker doubles as agent-readable input and agent-writable output? Next in Part 3: Define "Done" First — eval-driven development and the self-improving loop.