Technical Context
What caught my eye wasn't just another framework, but a very down-to-earth workflow: write spec → plan → subagent-driven development. The idea is simple and, frankly, painfully familiar: if code generation goes off the rails, the fix isn't a new prompt but a living spec you can open and use to rebuild the solution almost from scratch.
This case features a change_spec.md as a core file that lives alongside the code. It's not a box-ticking document but the source of truth: what we're changing, what the constraints are, and what counts as a successful result. When the codebase gets messy or the agents start pulling the solution in the wrong direction, I can return to the spec and regenerate the branch without guesswork.
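For concreteness, a change_spec.md along these lines might look like the sketch below. The section names are my own illustration of the "what, constraints, success" structure, not a prescribed format from the workflow:

```markdown
# change_spec: <short title of the change>

## What we're changing
One paragraph describing the change in plain language.

## Constraints
- Interfaces and behavior that must not break
- Performance, security, and data limits
- Explicitly out of scope

## Success criteria
- [ ] Observable, testable outcomes that define "done"
- [ ] Commands or checks a reviewer can actually run
```

Because every section is answerable before any code exists, regenerating the branch from this file is a mechanical exercise rather than archaeology.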
I especially liked the time estimate: a large task takes 2-3 days for the spec and only 0.5-1 day for code generation. That means 70-80% of the effort goes not into programming but into formalizing the task. It sounds almost insulting until you've watched a multi-agent system confidently and rapidly doing the wrong thing.
Technically, this is very similar to a proper hierarchy. One agent or person forms the spec through brainstorming, a planner then breaks it down into steps, and sub-agents take on narrow pieces: backend, frontend, tests, validation. I consider this approach more mature than trying to cram an entire project into one long context and hoping the model will 'figure it out'.
Another strong point here is decomposition. If a spec becomes too bloated, you shouldn't keep adding to it indefinitely but rather break it down along natural boundaries: intake, analysis, execution, validation. This is where a multi-agent setup starts providing real value instead of just burning tokens for a nice-looking diagram.
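As a toy illustration of that hierarchy and decomposition, here is a minimal Python sketch: a planner splits a spec along the four boundaries named above, and narrow sub-agents (stubbed here as plain functions) each handle one piece. All names are hypothetical, not an API from the workflow being discussed:

```python
from dataclasses import dataclass

@dataclass
class Step:
    stage: str   # one of the natural boundaries
    goal: str    # the narrow piece handed to a sub-agent

def plan(spec: str) -> list[Step]:
    # Hypothetical planner: decompose along natural boundaries
    # instead of cramming the whole project into one long context.
    return [Step(stage, f"{stage} for: {spec}")
            for stage in ("intake", "analysis", "execution", "validation")]

# Stub sub-agents; in a real system each would be an isolated agent
# with its own narrow context and tools.
SUBAGENTS = {
    "intake":     lambda step: f"collected inputs ({step.goal})",
    "analysis":   lambda step: f"analyzed constraints ({step.goal})",
    "execution":  lambda step: f"produced artifact ({step.goal})",
    "validation": lambda step: f"checked result ({step.goal})",
}

def run(spec: str) -> dict[str, str]:
    # Each sub-agent sees only its own Step; the spec stays the
    # single shared source of truth.
    artifacts = {}
    for step in plan(spec):
        artifacts[step.stage] = SUBAGENTS[step.stage](step)
    return artifacts
```

The point of the sketch is the shape, not the stubs: the spec enters once at the top, and every executor receives only the narrow slice it needs.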
What This Changes for Business and Automation
For businesses, the takeaway is stark: the main uncertainty in AI projects lies not in the model, but in the task definition. If you don't have a living specification, no AI-powered automation will be stable. You'll get a series of demos, after which the team will quietly despise the word 'agent'.
I see this in projects where clients ask us to create an AI agent for internal processes. Almost always, the first real breakthrough happens not after changing the model, but after we extract the success criteria, exceptions, roles, data constraints, and escalation rules from people's heads. Once this is laid out in a proper spec structure, the AI solution architecture becomes much more manageable.
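One way to make "extracted from people's heads" concrete is a checklist-style structure. The dataclass below is my own illustration of such a spec skeleton, with hypothetical field names matching the items listed above:

```python
from dataclasses import dataclass, field

@dataclass
class ProcessSpec:
    """Knowledge usually trapped in people's heads, as explicit fields."""
    success_criteria: list[str] = field(default_factory=list)
    exceptions: list[str] = field(default_factory=list)        # known edge cases
    roles: list[str] = field(default_factory=list)             # who does or approves what
    data_constraints: list[str] = field(default_factory=list)  # PII, retention, access
    escalation_rules: list[str] = field(default_factory=list)  # when humans take over

    def missing_sections(self) -> list[str]:
        # A spec with empty sections is a demo waiting to break.
        return [name for name, value in vars(self).items() if not value]
```

A pre-flight check like `spec.missing_sections()` makes the gaps visible before any agent runs, which is usually when the real conversation with the client starts.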
Who wins? Teams with long R&D cycles, a lot of ambiguity, and costly mistakes: product development, internal platforms, complex integrations, enterprise processes. Who loses? Those who expect instant magic and consider specifications to be bureaucracy. Multi-agent systems without a good spec can easily become an expensive way to parallelize confusion.
I wouldn't recommend shoehorning a multi-agent setup into every case. If a single agent with proper tool calls can handle the task, that's often enough. But when the context swells, there are many stages, and the result needs to be rebuildable, a spec-first approach starts paying off very quickly.
At Nahornyi AI Lab, we typically build such systems using a practical combination: a living specification, a planner, isolated executors, artifact control, and a clear human review loop. This is no longer 'playing with LLMs' but a proper AI implementation in development and operational processes where repeatability matters.
I'm Vadym Nahornyi from Nahornyi AI Lab, and I look at these patterns not as an observer, but as someone who regularly builds AI automations, agentic pipelines, and custom AI architecture for real-world tasks. If you want to discuss your case, order AI automation, custom AI agent development, or n8n automation, get in touch. We'll figure out where you truly need a swarm of agents and where a single clear spec and a solid build will suffice.