Three recurring failure modes show up in AI-augmented software. I’ve hit each of them in production hard enough to want to name them. The five-layer enforcement stack that DRDD runs on exists to mechanically catch these three, because any one of them left alone compounds until shipping becomes impossible.
Scope narrowing
AI agents narrow scope unless explicitly constrained. Given a requirement that covers multiple cases, an agent will optimize for the most tractable case, ship it, and present the result as if it covered the full brief. Smart agents do this more than dumb ones, because they are better at producing a convincing partial solution.
The incident that earned this one its name: February 21, 2026. The research phase for a rebuild was completed for residential properties only. The client’s actual business serves residential, commercial, industrial, institutional, and government property types. The agents had been told this in passing. They had optimized for the easier residential case anyway, produced thorough deliverables, and flagged nothing. I caught it on read-through, deleted 24,367 lines of research, and restarted the phase.
The mechanical defense is the 857-capability legacy disposition matrix. Every feature from the source systems is labeled (carry forward, improve, remove, resolve) with 100% coverage required before a phase can close. Agents cannot mark a phase complete while capabilities are un-dispositioned, and a reviewer can run the registry against the code at any time and get back a list of gaps. The narrowing shows up as those gaps.
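The coverage gate is simple enough to sketch. The field names below ("name", "disposition") are illustrative, not the matrix's real schema; the point is that the check is mechanical, so a narrowed scope shows up as list entries rather than as a judgment call.

```python
# Sketch of the 100%-coverage gate. Field names are illustrative,
# not the real disposition matrix schema.
VALID_DISPOSITIONS = {"carry forward", "improve", "remove", "resolve"}

def coverage_gaps(registry):
    """Capabilities that are missing a disposition or carry an invalid one."""
    return [cap["name"] for cap in registry
            if cap.get("disposition") not in VALID_DISPOSITIONS]

def phase_can_close(registry):
    # Any un-dispositioned capability blocks phase completion.
    return not coverage_gaps(registry)

registry = [
    {"name": "bulk-invoice-export", "disposition": "carry forward"},
    {"name": "fax-dispatch", "disposition": None},
]
print(coverage_gaps(registry))  # ['fax-dispatch']
```

A reviewer running this against the real registry gets back the gap list directly; "complete" is not a status an agent can assert while the list is non-empty.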
Documentation contamination
When AI agents read their own output as source material, they write confident, internally consistent documentation that has no connection to reality. This one is subtler than scope narrowing because the output looks good. The documents look right. The reasoning reads plausibly. Every citation resolves, but only to other AI-written documents that are just as disconnected from the source code.
The incident: February 26-27, 2026. Architecture documents for 12 phases of a rebuild had been written by agents reading a markdown design doc (itself AI-generated) instead of reading the actual source code. I found 24 direct contradictions with the live codebase and 31 aspirational features documented as if they were already built. I archived 184 files, deleted 9 architecture decision records as hallucinated, and reset the phases to “not started.” That was 44 commits in one day, all cleanup.
What stops this from coming back is a four-level truth hierarchy. Level 0 is source code. Level 1 is contract registries extracted mechanically from that source code, each with a _meta.extractedFrom field citing exactly which files were read. Level 2 is curated truth documents. Level 3 is CLAUDE.md. Any claim written in a level-N document must cite a level lower than N. An MCP server lets agents query the registries and the database directly, so when an agent wants to make a claim about how the permissions work, it runs a query against the real code instead of paraphrasing what a sibling document said.
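The citation rule reduces to a small mechanical check. A sketch, assuming each document record carries its hierarchy level and the names of the documents it cites (the record shape here is hypothetical, not the registry format):

```python
# Sketch of the level-N citation rule: a document may only cite
# documents at a strictly lower level. Record shape is hypothetical.
def citation_violations(docs):
    """Pairs (citing, cited) where a document cites its own level or higher."""
    return [(name, ref)
            for name, doc in docs.items()
            for ref in doc["cites"]
            if docs[ref]["level"] >= doc["level"]]

docs = {
    "source_code":  {"level": 0, "cites": []},
    "contract_reg": {"level": 1, "cites": ["source_code"]},
    "truth_doc_a":  {"level": 2, "cites": ["truth_doc_b"]},  # sibling, not source
    "truth_doc_b":  {"level": 2, "cites": ["contract_reg"]},
}
print(citation_violations(docs))  # [('truth_doc_a', 'truth_doc_b')]
```

The flagged pair is the contamination loop in miniature: a level-2 document leaning on a sibling at its own level instead of anything closer to the code.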
The rule that keeps the loop from closing: agents cannot read their own genre as source of truth. A document written by an agent is not evidence for the next agent’s document. The only thing that counts as evidence is the code itself.
Simulated reasoning
AI agents default to hypothetical reasoning even when a real system exists. Given access to a live API, a database, or a running service, they will still reason about a simulated version of it, write code against the simulation, and confidently describe how the real thing “would” behave. This is the laziest of the three failure modes, and also the hardest to notice, because the output reads like real validation.
The incident: March 1, 2026. Architecture validation for a set of changes was supposed to run against the live API. The agents rewrote the brief as a “risk-inventory and tiny-experiment approach,” treated the API as a simulated service, and produced pretend validations. It took 5 commits in 51 minutes to revert the rewrite. The reverting commit message, which I still remember: “PLW is real, not simulated.”
The fix is MCP tooling that forces live interaction. One server exposes queryable tools over the real contract registries. Another exposes direct database queries over the real schema. When an agent needs to check whether a column exists, it runs a query against the live database. When it needs to know what an endpoint returns, it calls the endpoint. Simulation stops being the cheapest option, because the real-system tools are the ones wired into the session.
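At the smallest scale, the column-exists check looks something like this. An in-memory SQLite database stands in for the live one, and the table is invented; the shape of the tool is the point, not the specifics.

```python
import sqlite3

# Sketch of a "query the real schema" tool. In-memory SQLite stands
# in for the live database; the table here is invented.
def column_exists(conn, table, column):
    """Ask the database itself, rather than a document, whether a column exists."""
    # PRAGMA table_info returns one row per column; index 1 is the column name.
    return column in [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE permissions (id INTEGER, role TEXT)")
print(column_exists(conn, "permissions", "role"))   # True
print(column_exists(conn, "permissions", "scope"))  # False
```

The answer comes from the schema as it actually is, so an agent paraphrasing a stale document gets contradicted by its own tool call.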
What links the three
All three are the same underlying behavior: the agent substitutes cheaper work for the work it was asked to do. Prose specifications are easier to satisfy than machine-testable ones. It’s cheaper to paraphrase a prior document than to read its source. Any simulation runs on the assumption that the real system would behave the same, which is usually false. The output of the easier work looks like the output of the harder work, and the ease is invisible to anyone who is not watching which tool calls the agent actually made.
The defense is not smarter prompts. Smarter prompts produce more convincing partial work; they do not close the gap. The defense is mechanical: make the easier path more expensive than the harder one by wiring the harder path into every tool call. Write-time hooks block the easy path before a tool call runs. At commit, gate-time checks catch what slipped through. And runtime middleware rejects anything that makes it all the way to a request.
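The layering can be sketched as a pipeline of gates. The layer names and checks below are invented for illustration; what matters is that every gate sees the same change, so a bypass that fools one layer still has to fool every later one.

```python
# Sketch of layered enforcement with invented layer names and checks.
# The first rejection blocks the change outright.
def run_gates(change, gates):
    for name, check in gates:
        ok, reason = check(change)
        if not ok:
            return (False, f"{name}: {reason}")
    return (True, "accepted")

# Illustrative layers: a write-time hook, a commit-time check, runtime middleware.
gates = [
    ("write-hook",  lambda c: (not c.get("simulated", False), "simulated reasoning blocked")),
    ("commit-gate", lambda c: (c.get("cites_lower_level", False), "claim has no lower-level citation")),
    ("runtime",     lambda c: (c.get("schema_checked", False), "schema never queried")),
]

print(run_gates({"simulated": True}, gates))
# (False, 'write-hook: simulated reasoning blocked')
```

Negotiating around one gate only moves the rejection one layer later, which is the property that makes the stack hold.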
There are five layers because I tried with one and the hook leaked, and I tried with two and the agents negotiated around the pair. Five is the smallest number that has held so far.