Technical Context
I carefully went through Anthropic's material on recursive self-improvement, and my conclusion is simple: it is not AGI yet, but it is no longer just a demo toy. What caught my eye was something else: steps of the AI R&D cycle that used to depend entirely on humans are now being handled by models themselves. And this is already very close to what I see in real AI automation projects.
Anthropic's main point is not that the model suddenly 'started building itself.' They honestly state: the main problem right now is not execution, but judgment—selecting directions, goals, and research priorities. This is an important caveat, because without it, it's easy to fall into clickbait like 'AGI is already here.'
The numbers show an interesting picture. Claude, according to their data, reached a 76% success rate on the most open-ended tasks by May 2026, and in optimizing the experimental workflow, the acceleration grew from about 3x to 52x in less than a year. Another marker: in 'where to dig next' tasks, the model guided research in a productive direction 64% of the time, compared to 51% for humans.
And here I remembered an old experiment from the community with micromorph: a self-improving agent in a couple of hundred lines of Python that could expand its own functionality. Not magic, but a standard cycle: plan, modify code, run, verify, repeat. Once given access and a goal, the agent set up Telegram communication for itself within a few minutes. This is not recursive self-awareness, but an engineering pattern—yet these are the very patterns from which practical AI implementation is built.
Where would I immediately put the brakes? On rollback, health checks, and tool constraints. Without these, any 'self-improving' agent will very quickly turn into an agent that carefully breaks itself.
What This Changes for Business and Automation
For business, those who have a lot of repetitive engineering routine will win: integrations, internal bots, test pipelines, and minor API adjustments. There, AI solution development can already be structured as a self-verifying cycle, rather than a one-off chat request.
Teams that confuse autonomy with a lack of control will lose. If you let an agent write code, touch production, and give it no boundaries, it won't get smarter from freedom; it will just become more expensive to maintain.
Right now, I would look at this not as 'AGI is about to be born,' but as a new layer of AI architecture: the agent can not only perform a task but also build tools for itself to solve it. At Nahornyi AI Lab, we solve exactly these kinds of challenges for clients: where you don't need a chat just for the sake of chatting, but a working automation with AI featuring tests, access rights, and a clear economic effect. If manual technical tasks are piling up in your processes, we can easily break them down together and build an AI agent without extra fantasy, but with real utility.