The AI honeymoon is almost over
For the past two years, AI has felt like magic.
- Costs looked manageable.
- Capabilities kept improving.
And the narrative was simple: build with AI, and the economics will figure themselves out.
That narrative is starting to break.
Not because the models stopped improving.
But because the economics underneath them never worked to begin with.
The problem isn’t the model
The AI industry keeps trying to solve a cost problem by changing models:
- switch providers
- fine-tune smaller models
- optimize prompts
- reduce context
But none of these address the real issue.
The problem is not the model.
The problem is the unit used to measure cost.
Tokens were never the right unit
Today, AI is still billed primarily by the token.
Tokens are easy to count. But they fail in two structural ways:
- Semantically blind: The same number of tokens does not represent the same amount of work. A trivial rewrite and a complex reasoning task can generate similar token counts while consuming vastly different compute.
- Operationally blind: Agentic workflows perform substantial work off-screen (retrieval, tool calls, loops, retries, background execution). Much of that work never becomes tokens at all.
So what looks like “usage” is only a partial view of what is actually happening.
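To make both kinds of blindness concrete, here is a minimal Python sketch. Every number in it is invented, and the 2 × params × tokens figure is only a standard rough estimate of transformer forward-pass FLOPs:

```python
# Two requests with identical billable tokens but very different compute.
# All numbers are invented; 2 * params * tokens is a standard rough
# estimate of transformer forward-pass FLOPs.

PARAMS = 70e9  # hypothetical 70B-parameter model


def forward_flops(tokens: int) -> float:
    """Rough FLOPs for one forward pass over `tokens` tokens."""
    return 2 * PARAMS * tokens


# Request A: a trivial rewrite. One pass, 500 billable tokens.
a_compute = forward_flops(500)

# Request B: an agentic task. The same 500 billable tokens, plus
# 8 hidden passes (retrieval, tool calls, retries) of ~2,000 tokens
# each that never appear on the bill.
b_compute = forward_flops(500) + 8 * forward_flops(2_000)

print(f"A: {a_compute:.1e} FLOPs")
print(f"B: {b_compute:.1e} FLOPs ({b_compute / a_compute:.0f}x A)")
```

Both requests produce the same bill. Request B performs roughly 33 times the compute.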
The hidden gap
If cost is driven by compute, and revenue is tied to tokens, then there is a gap.
Some portion of real work is not being properly measured or priced.
That gap doesn’t disappear.
It gets absorbed.
Right now, it is being absorbed by the providers.
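A back-of-the-envelope sketch of how that absorbed gap scales, with both per-request figures invented for illustration:

```python
# Back-of-the-envelope: how an absorbed per-request gap scales.
# Both per-request dollar figures are invented for illustration.
token_revenue_per_req = 0.002  # what token billing captures
compute_cost_per_req = 0.005   # what the compute actually costs

gap = compute_cost_per_req - token_revenue_per_req  # absorbed per request

for monthly_requests in (1e6, 1e8, 1e10):
    absorbed = gap * monthly_requests
    print(f"{monthly_requests:.0e} requests/month: ${absorbed:,.0f} absorbed")
```

Per request, the gap is pocket change. At platform volume, it compounds into real money.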
The honeymoon phase
This is the honeymoon.
- generous pricing
- free tiers
- predictable bills
- rapid adoption
The goal is clear: get developers building.
And it’s working.
AI usage is exploding. Companies are investing hundreds of billions of dollars in infrastructure to support it.
But the underlying economics are under strain.
Costs are rising with usage — even as per-token prices fall.
And companies are already reporting that AI spend is growing faster than revenue.
Why this doesn’t hold
AI doesn’t behave like SaaS. It behaves like a utility: every request carries real marginal compute cost, so spend scales with usage instead of amortizing away.
Not amortized.
Not flattened.
Compounded.
And as agentic systems expand, more work happens off-screen — outside the token boundary.
Which means the gap between measured usage and actual cost grows wider.
At scale, that gap becomes impossible to ignore.
What happens next
When the economics don’t hold, the system adjusts.
We’ve seen this before:
- limits get introduced
- pricing gets restructured
- access gets constrained
- “fair use” policies tighten
Not because companies want to restrict usage.
Because they have to.
Don’t change the model. Change the unit of measure.
The industry is trying to fix this by changing models.
That’s the wrong layer.
You don’t need a new model to fix AI economics.
You need a new meter.
FLOP-Based Metering (FBM) does exactly that.
It does not require changing the LLM.
It sits between the user and the model as a measurement and control layer.
Instead of counting tokens, it measures the compute work actually performed.
Including the work tokens never see.
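The post doesn’t spell out an FBM interface, so the following is a hypothetical sketch under assumptions: a meter that charges every model pass, on-screen or off, to a compute ledger, using ~2 FLOPs per parameter per token as a rough forward-pass estimate.

```python
# A hypothetical sketch of an FBM-style meter. The interface and the
# per-pass estimate are assumptions, not a published FBM specification.


class FlopMeter:
    """Charges every model pass, visible or hidden, to a compute ledger."""

    def __init__(self, params: float):
        self.params = params  # size of the model being metered
        self.used = 0.0       # running total of estimated FLOPs

    def record(self, tokens: int) -> float:
        # ~2 FLOPs per parameter per token: a rough transformer
        # forward-pass estimate. Hidden passes are charged identically.
        flops = 2 * self.params * tokens
        self.used += flops
        return flops


meter = FlopMeter(params=70e9)
# One visible pass plus three off-screen passes (retrieval, tool
# calls, a retry), each metered the same way.
for step_tokens in (500, 2_000, 2_000, 1_200):
    meter.record(step_tokens)
print(f"{meter.used:.2e} FLOPs of work actually performed")
```

The point is placement: because the meter wraps every pass, the off-screen work that never becomes tokens still gets measured.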
The real shift
This is not a pricing tweak.
It is a foundational change.
Because once the unit of measure changes:
- cost becomes predictable
- budgets become enforceable (see the sketch below)
- governance becomes possible
- pricing aligns with actual work
And the economics finally make sense.
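As one illustration of the budget point above, here is a hedged sketch of enforcement (hypothetical interface, invented numbers): an allowance denominated in FLOPs, checked before each pass runs.

```python
# A hypothetical sketch of budget enforcement on a FLOP meter.
# The interface and all figures are invented for illustration.


class FlopBudget:
    """Refuses a model pass once an estimated-compute allowance is spent."""

    def __init__(self, params: float, budget_flops: float):
        self.params = params
        self.remaining = budget_flops

    def charge(self, tokens: int) -> None:
        flops = 2 * self.params * tokens  # rough forward-pass estimate
        if flops > self.remaining:
            raise RuntimeError("FLOP budget exhausted: pass refused")
        self.remaining -= flops


budget = FlopBudget(params=70e9, budget_flops=1e15)
try:
    for step_tokens in (2_000, 2_000, 2_000, 2_000):
        budget.charge(step_tokens)
except RuntimeError as err:
    print(err)  # the fourth pass is refused up front, not absorbed
```

Because the check precedes the pass, the overrun is refused up front rather than absorbed and reconciled later.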
The bottom line
The current system works because the gap is being absorbed.
That is the honeymoon.
But as usage scales, that gap grows.
And when it becomes too large to ignore, the system will be forced to change.
Not the model.
The unit of measure.
– Published on Saturday, March 28, 2026