AI agents, software development, software architecture

Why AI Agents Break Software Architecture

A clear practical signal is emerging: powerful code models write convincingly but can quietly break architecture and bypass domain layers. For AI automation in software development, this means one thing: without strict reviews and constraints, AI implementation quickly becomes expensive technical debt.

Technical Context

I'm not here to discuss a major release, but something more practically useful: working with Opus 4.7 and Codex has once again surfaced an old problem. At the planning stage they sound reasonable, but as soon as I start reading the diff line by line, AI integration into development suddenly stops looking safe.

The specifics are unpleasant. In one case, in a project with an existing domain model, the agent simply bypassed the domain layer, created a new repository, and started operating by its own rules. In another, I asked it to add a domain layer and got back a folder with a repository interface and a strange array-backed value object, but no entity. Formally, something was delivered. Architecturally, it's garbage.
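
To make the failure mode concrete, here is a minimal sketch of the difference, in hypothetical TypeScript; the Order example and all names are illustrative, not taken from the actual project.

```typescript
// Hypothetical illustration of the pattern, not code from the actual project.

// Roughly what the agent produced: a repository interface and an array-backed
// "value object", with no entity and no invariants behind it.
export interface OrderRepository {
  save(data: Record<string, unknown>[]): Promise<void>;
}

export class OrderData {
  constructor(public readonly items: unknown[]) {} // just a bag of values
}

// What the domain layer was supposed to own: an entity that guards its invariants.
export class Order {
  private constructor(
    public readonly id: string,
    private readonly lines: ReadonlyArray<{ sku: string; qty: number }>,
  ) {}

  static create(id: string, lines: { sku: string; qty: number }[]): Order {
    if (lines.length === 0) throw new Error("An order must contain at least one line");
    if (lines.some((l) => l.qty <= 0)) throw new Error("Line quantities must be positive");
    return new Order(id, [...lines]);
  }

  get lineCount(): number {
    return this.lines.length;
  }
}
```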

And here’s the important point: I have no external confirmation that Opus 4.7 or Codex systematically fail at DDD as a class of tasks. There are many complaints about noise, literal instruction following, questionable autonomous actions, and general unreliability in code, but not about this specific pattern with domain layers. Therefore, I would honestly call this not a proven property of the model, but a recurring practical risk that I would factor into the process right now.

What bothers me most isn't the mistake itself, but its style. The model doesn't say, "I didn't understand the architecture." It confidently builds a structure convenient for itself, as if the domain layer, invariants, and aggregate boundaries are just decorative parts of the project.

If the plan lacks concrete constraints, it will improvise. If there are constraints, it might still quietly take an easier path. That's why I've long stopped considering well-written plans and comments as a signal of quality.

Impact on Business and Automation

For a team, this means one simple thing: AI automation in code cannot be applied to architecturally sensitive areas without guardrails. CRUD, tests, migrations, routine refactoring—yes. Domain logic, contracts, invariants—not without manual control.

Those with architectural checklists, rules for the agent, and layer-by-layer reviews win. Teams that measure quality by generation speed and the number of closed tasks lose.

I already view this as an AI architecture problem, not just a matter of model choice. We need guardrails in prompts, linters for architectural violations, directory templates, tests for layer boundaries, and mandatory human diff review. At Nahornyi AI Lab, we build exactly these kinds of solutions for clients who need AI automation without surprises in production.
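
As one example of the "tests for layer boundaries" point, here is a minimal sketch of a check that fails the build when domain code imports from infrastructure. The src/domain and src/infrastructure paths are assumptions about the project layout; a dedicated tool such as dependency-cruiser or ArchUnit does the same job more thoroughly.

```typescript
// layer-boundaries.ts — a minimal sketch of a layer-boundary check.
// Assumes a src/domain and src/infrastructure layout; adapt the paths
// and the import pattern to your project.
import { readdirSync, readFileSync, statSync } from "node:fs";
import { join } from "node:path";

// Recursively collect all files under a directory.
function listFiles(dir: string): string[] {
  return readdirSync(dir).flatMap((name) => {
    const full = join(dir, name);
    return statSync(full).isDirectory() ? listFiles(full) : [full];
  });
}

const violations: string[] = [];
for (const file of listFiles("src/domain")) {
  if (!file.endsWith(".ts")) continue;
  const source = readFileSync(file, "utf8");
  // Flag any import that reaches out of the domain layer into infrastructure.
  if (/from\s+["'][^"']*infrastructure/.test(source)) {
    violations.push(file);
  }
}

if (violations.length > 0) {
  console.error("Domain layer imports infrastructure:", violations);
  process.exit(1);
}
console.log("Layer boundaries OK");
```

Run as part of CI alongside the normal test suite, a check like this turns an architectural convention into a hard failure the agent cannot quietly bypass.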

If you find that an agent has supposedly sped up development but the code is becoming disorganized across layers, it's better to stop now than to pay for a rewrite later. At Nahornyi AI Lab, we can review your workflow together and build an AI solution development process so that automation saves time instead of destroying the system from within.

The issue of AI models disregarding system boundaries is a critical security concern. A related discussion offers a practical case where AI agents bypass sandboxes via command chaining, demonstrating the importance of robust control mechanisms for secure AI execution.
