OpenAI and Claude: Limits, Hype, and the Real Cost

In May 2026, there's a lot of talk about "cheap tokens" for OpenAI and Claude, but the reality is mixed. OpenAI has a temporary Codex boost, while Claude's limits seem tighter. This matters for AI automation because it directly affects budgets, speed, and how you calculate your workload and infrastructure costs.

The Technical Context

I dug into the current discussions around OpenAI and Claude because, in AI implementation work, these things quickly go from memes to infrastructure bills. And here's where I hit the brakes: there's a lot of talk about "2x tokens," but far fewer confirmed facts.

For OpenAI, what I see today is different: the $100 ChatGPT Pro plan includes a temporary promotional boost for Codex, running until May 31, 2026. This isn't a universal doubling of everything, but a significantly expanded limit specifically for coding scenarios, one that is expected to revert once the promotion ends. That is what has led some people to feel the limits have "almost disappeared."

With Claude, the picture isn't about generosity at all. What I see from public data looks more like throttling during peak hours and more aggressive consumption, especially for those who use Claude Code all day. Plus, they have Max 5x ($100) and Max 20x ($200) plans, but the consumption mechanics have become less pleasant than at the beginning of the year.

Now for the main point of confusion. When people say "it got cheaper," they often mix up three different layers: subscription limits, API economics, and the subjective feeling of speed. If a model responds faster, you can burn through your weekly limit faster, and that's not a discount; it's just different throughput.
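A quick back-of-the-envelope calculation makes this concrete. Here's a minimal Python sketch; the weekly limit, output speeds, and active time are illustrative assumptions, not published plan numbers:

```python
# Minimal sketch: faster output is throughput, not a discount.
# All numbers below are illustrative assumptions, not published plan limits.

WEEKLY_LIMIT_TOKENS = 5_000_000  # hypothetical weekly allowance on a subscription

def hours_until_limit(tokens_per_second: float, active_seconds_per_hour: int = 1200) -> float:
    """How many working hours a weekly allowance lasts at a given output speed.

    Assumes the model is actively generating for `active_seconds_per_hour`
    out of every hour (the rest is reading, editing, waiting).
    """
    tokens_per_hour = tokens_per_second * active_seconds_per_hour
    return WEEKLY_LIMIT_TOKENS / tokens_per_hour

for speed in (50, 100, 200):  # tokens/second: "normal" vs "fast" modes
    print(f"{speed:>3} tok/s -> limit lasts ~{hours_until_limit(speed):.0f} hours")

# Doubling the speed halves how long the same limit lasts:
#  50 tok/s -> ~83 hours
# 100 tok/s -> ~42 hours
# 200 tok/s -> ~21 hours
```

Same limit, same price; the only thing that changed is how quickly you can hit the wall.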

I also find the reports about how token-hungry multi-session work is credible. If you have an orchestrator and 20-30 sub-agents, as in real pipelines, limits don't disappear linearly; they disappear almost imperceptibly fast. I see this in client scenarios too: a single "smart" agent looks cheap, but a proper AI integration with parallel branches already requires cold calculation.
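To see why the fan-out hurts, here's a minimal sketch of the arithmetic; every number is an assumption about a hypothetical pipeline, not a measured figure:

```python
# Minimal sketch: why agent fan-out burns limits "almost imperceptibly fast".
# Every number here is an illustrative assumption about a hypothetical pipeline.

ORCHESTRATOR_TOKENS_PER_TASK = 8_000   # planning, routing, aggregating results
SUBAGENT_TOKENS_PER_CALL = 12_000      # each sub-agent's context + output
CALLS_PER_SUBAGENT = 3                 # retries and refinement rounds

def tokens_per_task(num_subagents: int) -> int:
    """Total tokens one task costs with a given number of parallel sub-agents."""
    return (ORCHESTRATOR_TOKENS_PER_TASK
            + num_subagents * CALLS_PER_SUBAGENT * SUBAGENT_TOKENS_PER_CALL)

for n in (1, 5, 20, 30):
    print(f"{n:>2} sub-agents -> {tokens_per_task(n):>9,} tokens per task")

#  1 sub-agent  ->    44,000 tokens per task
#  5 sub-agents ->   188,000
# 20 sub-agents ->   728,000
# 30 sub-agents -> 1,088,000  (~25x the single-agent cost)
```

The orchestrator itself stays cheap; it's the multiplication of sub-agent calls that quietly eats the budget.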

What This Changes for Business and Automation

The winners are teams that code a lot, test hypotheses, and maintain a short “idea -> run -> fix” cycle. For them, the current OpenAI boost can genuinely make development temporarily cheaper and accelerate AI automation.

The losers are those who only look at the plan price. If your architecture is agent-based, with long runs, browsing, and lots of parallel calls, a monthly subscription stops being a clear unit of budget.

Right now, I wouldn't build processes on the feeling that “tokens are almost free.” I would build them on measurements: where to use a subscription, where to use an API, where to cache, where a fast mode is needed, and where it’s just a nice illusion of speed.
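As a starting point for that kind of measurement, here's a minimal break-even sketch comparing a flat subscription to pay-as-you-go API pricing; the prices and token volumes are placeholders you'd replace with your provider's real rates and your own usage logs:

```python
# Minimal sketch: measure before choosing subscription vs API.
# Prices and volumes are placeholders; plug in your provider's real rates.

SUBSCRIPTION_USD_PER_MONTH = 100.0
API_USD_PER_1M_INPUT = 3.0    # assumed API input price per million tokens
API_USD_PER_1M_OUTPUT = 15.0  # assumed API output price per million tokens

def api_cost(input_tokens: int, output_tokens: int) -> float:
    """Monthly API cost for a measured token volume."""
    return (input_tokens / 1e6 * API_USD_PER_1M_INPUT
            + output_tokens / 1e6 * API_USD_PER_1M_OUTPUT)

# Example: a measured month of usage taken from your own logs.
measured_input, measured_output = 20_000_000, 4_000_000
cost = api_cost(measured_input, measured_output)
print(f"API: ${cost:.0f}/mo vs subscription: ${SUBSCRIPTION_USD_PER_MONTH:.0f}/mo")
# API: $120/mo vs subscription: $100/mo -> at this volume the subscription wins,
# but only if its limits actually cover the same volume without throttling.
```

The caveat in the last comment is the whole point: a subscription is only cheaper if its limits genuinely cover your measured volume.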

If you're starting to get confused by limits, agents, and bills, we can analyze your stack together. At Nahornyi AI Lab, we build AI solutions for business so that automation with AI doesn’t just look cheap on a screenshot but actually holds up in production and within budget.

Understanding how to optimize token consumption is becoming critical as new models demand more resources. We've already looked at how Cloudflare Markdown for Agents can significantly reduce token usage, affecting the overall economics of working with LLMs.
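If you want to quantify that effect on your own pages, a minimal sketch with the tiktoken tokenizer does the job; the HTML and Markdown snippets here are illustrative stand-ins for real page content:

```python
# Minimal sketch: quantify the HTML -> Markdown savings on your own pages.
# Uses the tiktoken library; the sample strings are illustrative.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

html = '<div class="card"><h2><a href="/post">Title</a></h2><p>Short teaser.</p></div>'
markdown = "## [Title](/post)\n\nShort teaser."

html_tokens = len(enc.encode(html))
md_tokens = len(enc.encode(markdown))
print(f"HTML: {html_tokens} tokens, Markdown: {md_tokens} tokens "
      f"({(1 - md_tokens / html_tokens):.0%} saved)")
```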
