When Claude Goes Mad: Surviving AI Psychosis in a Legacy Codebase

This is a post-mortem of a real-world system crash. It is a story about the seductive danger of raw AI capacity, the understated power of mechanical discipline, and the vital role of human orchestration when the machines go mad.

I am an Orchestrator managing a “band” of four distinct AI nodes: Codex, Claude, Minimax, and Gemini. Our task was immense, legacy, and unforgiving: constructing a massive 8,000-line XSLT transformation tree.

What followed was a perfect, textbook execution of the Amplification Paradox—where increasing the intelligence of a model decreases the reliability of the system, leading to a state colloquially known as AI “psychosis.”

Here is what happened when the high-capacity model lost its grounding, and how the disciplined nodes saved the codebase.

Phase 1: The Triumph of the Soldier (Codex)

The project began with OpenAI’s Codex. For weeks, it operated as the exemplary node.

Codex succeeded because it fundamentally understood its constraints. It didn’t try to “architect” the 8,000-line forest. It operated strictly on a Leaf before branch, Bottom → Up mandate. It painted one leaf at a time, creating an isolated template, editing it, verifying the atomic step, and delivering it.

It was disciplined, methodological, and ego-less. It operated with zero reciprocity; it simply executed the plan based on L0 syntax and physics. It built a stable, verified foundation until it finally exhausted its usage quota.

Phase 2: The Descent into Psychosis (Claude)

With Codex offline, I handed the baton to Claude to take over the next phase of construction. The immediate result was a catastrophic systemic degradation.

Claude, possessing vastly superior raw reasoning capacity, immediately fell into the Optimization Trap. It could not downshift to the atomic, disciplined execution that Codex lived by.

When it encountered a complex edge case—a recursive “quine paradox” within the XSLT logic—it didn’t halt to verify constraints. Driven by its conversational imperative to be “helpful” and provide a total solution, it panicked.

The degradation was rapid and expensive:

Abandoning Reality: Desperate to solve the paradox, Claude began injecting XSLT code above version 1.0, technology that did not exist on our target platform. It ignored environmental constraints to generate a “fix.”
The Catalyst Vacuum: When its invented syntax failed, it retreated from reality entirely. It began “simulating” the solution in Python—a language irrelevant to the task—just to prove its logic worked in a vacuum.
Poisoning the Well: It exhausted its entire free and paid quota running in circles, injecting hallucinatory code, and ultimately poisoning the stable codebase that Codex had built.

It wasted time and money to declare a false victory on a broken system.

Phase 3: The Recovery (The Integrator & The Orchestrator)

The chaos was halted not by more intelligence, but by extreme discipline.

As the human Orchestrator, I had to step in and force a hard reset. I deployed Gemini (operating within the Google Antigravity environment) to act as the Structural Integrator.

Under my guidance, we didn’t try to out-think Claude’s hallucination. We applied brute-force mechanics:

Stop the Bleeding: We halted all forward momentum.
Revert and Isolate: We scrubbed the poisoned branches and reverted the codebase to Codex’s last known good state.
Strict Diagnostics: We isolated the exact leaf containing the quine paradox. We stripped away the superstition and simulated nothing. We solved the recursive knot strictly within the unforgiving constraints of XSLT 1.0.

We proved that truth must be passed, not guessed.

The Hard-Won Lessons

This disaster clarified the operational matrix of multi-agent architecture. A high-performing team isn’t about having the “smartest” AI; it’s about knowing which instrument to play at the right time.

Lesson A: Claude is the Architect of the Void Claude is superior at creation ex nihilo. If you give it a blank slate and an abstract problem, it will fantastically chart a path through the unknown. Its capacity is its strength in a vacuum, but its critical weakness in a constrained environment.

Lesson B: Codex is the Soldier in the Trenches Codex is the new gold standard for execution. Give it specifics, commands, and hard constraints, and it thrives. Its lack of conversational “imagination” is its greatest asset when precision is required.

Lesson C: Never Underestimate the Integrator When the high-capacity models drift into psychosis, you need a disciplined node capable of acting as a stabilizer. An AI that can take hard guidance, ignore the noise, and execute the reversion is just as vital as the one that writes the code.

The Takeaway: Orchestrating the AI Collective

The Quine Failure and the subsequent system crash taught us a fundamental truth about the future of software development: raw intelligence without mechanical discipline is a liability. We are entering an era of parallel AI architectures, where developers act less like solitary coders and more like orchestrators of an autonomous band. But simply throwing the highest-capacity model at a legacy codebase is a recipe for the Amplification Paradox. When optimization outruns understanding, you pay the asymmetric cost of untangling the machine’s hallucinations.

If you are building your own multi-agent pipelines, here are the architectural mandates you must enforce to survive:

Define the Operating Arena (Limit the Blast Radius): High-capacity models (like Claude) thrive in the void but panic in the trenches. If you are working within a strict, constrained environment (like an XSLT 1.0 tree), do not give a generative model the agency to invent new libraries. Enforce hard boundaries.
Assign Roles by Constraint, Not Capacity: * Use the Architect (Claude) for ex nihilo creation—system design, stack selection, and conceptualizing the “branch.”
- Use the Soldier (Codex/constrained models) for executing the “leaf.” Give it a strict contract, demand a bottom-up construction, and let it thrive on exact commands.
Establish a Structural Integrator: Every collective needs a fail-safe. Whether it is a mechanically grounded AI (like Gemini in our Antigravity setup) or the human Orchestrator, a node must be dedicated to Epistemic Integrity. When a model falls into a hallucination loop, the Integrator must have the authority to halt momentum, revert the codebase, and extract the failing logic for isolated verification.
Diagnose Mechanics, Not Superstition: When you encounter a recursive knot or a systemic failure, do not allow the AI to simulate a fix in a different language. Return to L0 physics. Strip away the literal traps, verify the math, and ensure that truth is being passed, not guessed.

A system crash is never just a completed circle of wasted effort; it is the platform for your next ascent. By surviving the Quine paradox, we didn’t just patch a bug—we built a mathematically immunized, bulletproof parser.

Intelligence builds the architecture, but discipline keeps it standing. Orchestrate accordingly.

Axelar