Uber Cools the Hype Around LLM Spending

Uber's COO publicly stated a simple truth: rising token spend on large language models is increasingly hard to justify without direct business returns. That is an important signal for anyone implementing AI. The era of open-ended experimentation is ending, and companies now prioritize measurable results over novelty.

Technical Context

What caught my attention was not the headline but the phrasing. Uber's COO Andrew Macdonald essentially said: we are spending more tokens, but a clear increase in product value is nowhere to be seen. To me, this is a very familiar picture from real AI automation projects, where a team easily scales up model calls, yet the link to business metrics quickly blurs.

The primary source here is an interview recapped by Business Insider. One revealing detail surfaced there: internally, Uber discussed that its Claude Code budget for 2026 had already been depleted. That was the moment people stopped treating LLMs as almost-free magic. And rightly so: a single prompt costs an individual employee pennies, but across a whole company those prompts add up to an architectural decision with a very tangible bill.

What really strikes me here isn't the amount itself, but the lack of a direct line between input and output. If I cannot show that more tokens resulted in noticeably faster releases, better support quality, or more automated operations, then I don't have AI integration. I have an expensive habit.

The news is fresh (May 2026), so this is not a retrospective. It is the market's new tone: token counters first, polished demos second.

Impact on Business and Automation

I see three practical takeaways here. First: companies will not cut AI itself. They will cut unsystematic model consumption: calls made without routing, caching, or limits, and without anyone evaluating where an expensive LLM is truly needed and where a simpler combination would suffice.
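To make "routing, caching, limits" concrete, here is a minimal sketch of a gateway that sits in front of model calls. Everything in it is an assumption for illustration: the tier names, the length-based routing rule, the 4-characters-per-token estimate, and the `call_model` callback stand in for whatever provider and policy a real team would use.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Callable, Dict

# Hypothetical model tiers; the names are placeholders, not real products.
CHEAP, EXPENSIVE = "small-model", "large-model"


class BudgetExceeded(RuntimeError):
    """Raised when a call would push estimated usage past the limit."""


@dataclass
class LLMGateway:
    call_model: Callable[[str, str], str]  # (model_name, prompt) -> response
    token_limit: int                       # hard cap on estimated tokens
    tokens_used: int = 0
    cache: Dict[str, str] = field(default_factory=dict)

    @staticmethod
    def estimate_tokens(text: str) -> int:
        # Crude assumption: roughly 4 characters per token.
        return max(1, len(text) // 4)

    @staticmethod
    def route(prompt: str) -> str:
        # Toy heuristic: short prompts go to the cheap tier.
        return CHEAP if len(prompt) < 200 else EXPENSIVE

    def ask(self, prompt: str) -> str:
        key = hashlib.sha256(prompt.encode("utf-8")).hexdigest()
        if key in self.cache:
            # Caching: a repeated prompt is answered without a model call.
            return self.cache[key]
        cost = self.estimate_tokens(prompt)
        if self.tokens_used + cost > self.token_limit:
            # Limits: refuse instead of silently overspending.
            raise BudgetExceeded(f"limit of {self.token_limit} tokens reached")
        # Routing: pick the tier, then make the actual call.
        response = self.call_model(self.route(prompt), prompt)
        self.tokens_used += cost + self.estimate_tokens(response)
        self.cache[key] = response
        return response
```

The point of the design is that every call passes through one place where it can be counted, deduplicated, and downgraded, which is exactly what uncontrolled per-employee usage lacks.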

Second: the winners will be teams that calculate unit economics at the scenario level. Not "we implemented AI", but "this agent reduced ticket resolution time by 42% and pays for itself in a quarter". This is exactly what proper AI solution development looks like, rather than simply buying access to another model.
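A scenario-level calculation of "pays for itself in a quarter" can be sketched in a few lines. The function name and all figures below are made up for illustration; they are not Uber's numbers or anyone's real data.

```python
def payback_months(monthly_benefit: float,
                   monthly_token_cost: float,
                   build_cost: float) -> float:
    """Months until one scenario recovers its one-off build cost.

    monthly_benefit    -- value the scenario produces per month
                          (for example, support hours saved times hourly cost)
    monthly_token_cost -- model spend attributable to this scenario
    build_cost         -- one-off cost of building and integrating it
    """
    net = monthly_benefit - monthly_token_cost
    if net <= 0:
        # The scenario never pays back: tokens cost more than they return.
        return float("inf")
    return build_cost / net


# Illustrative numbers only.
print(payback_months(monthly_benefit=14_000,
                     monthly_token_cost=4_000,
                     build_cost=30_000))  # 3.0, i.e. one quarter
```

The useful part is the `inf` branch: a scenario whose token bill exceeds its measurable benefit is the "expensive habit" case, and it only becomes visible when costs are attributed per scenario rather than per company.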

Third: the losers will be those who built internal processes on an uncontrolled copilot without thinking about AI architecture. I see this regularly: as soon as real limits are set, half the chains suddenly turn out to be redundant.

If you are in a similar situation and model costs are already competing with hiring budgets, let's look at it soberly. At Nahornyi AI Lab, we usually start not with a new model, but with a process map. From there, we can assemble AI automation so that the business pays for results, not for the spectacular burning of tokens.

We previously explored technical ways to radically reduce token consumption, such as passing lightweight Markdown syntax to AI agents instead of heavy HTML. Such architectural optimizations are becoming vital right now, as large enterprises begin to seriously doubt the profitability of generative models.
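As a rough illustration of the Markdown-instead-of-HTML idea, the sketch below strips a page fragment down to lightweight Markdown using only Python's standard `html.parser`. The `MarkdownLite` class, the handful of tags it handles, and the 4-characters-per-token estimate are assumptions for the example; a production converter would cover far more cases.

```python
import re
from html.parser import HTMLParser


class MarkdownLite(HTMLParser):
    """Keeps text, headings, and list items; drops attributes and scripts."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1
        elif tag in ("h1", "h2", "h3"):
            self.parts.append("\n" + "#" * int(tag[1]) + " ")
        elif tag == "li":
            self.parts.append("\n- ")
        elif tag == "p":
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            # Collapse whitespace runs so HTML indentation is not kept.
            self.parts.append(re.sub(r"\s+", " ", data))


def html_to_markdown(html: str) -> str:
    parser = MarkdownLite()
    parser.feed(html)
    parser.close()
    lines = (" ".join(line.split()) for line in "".join(parser.parts).split("\n"))
    return "\n".join(line for line in lines if line)


def estimate_tokens(text: str) -> int:
    # Crude assumption: roughly 4 characters per token.
    return max(1, len(text) // 4)


html = ('<div class="card"><h2 class="title">Pricing</h2><ul>'
        '<li><span>Basic</span> plan</li><li>Pro plan</li></ul>'
        '<script>track()</script></div>')
markdown = html_to_markdown(html)
print(markdown)
print(estimate_tokens(html), "->", estimate_tokens(markdown))
```

The saving comes from dropping markup the model does not need (class attributes, wrapper elements, scripts) while keeping the structure it does need (headings and list items).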
