Who wins AI? The first to cut its cost.
By Alan Jacobson, AI Economics Strategist
AI must work as a business. Today’s models don’t.
Better models don’t fix the economics. Models only improve output — not cost structure.
And neither does scale. Unlike search, social, ecommerce, and cloud, AI costs grow with usage instead of compressing with it: each query adds directly to operating cost.
So there are only three ways to make AI work:
- Cut cost
- Bill for compute
- Give users control
1. Cut cost
Manage compute, shrink context window size without losing context, and eliminate feature bloat.
2. Bill for compute
Charge for compute — not flat-rate, not seats, not tokens.
3. Give users control
Users must control how AI behaves for it to be trusted and adopted at scale.
Every technology had to evolve to make money. AI is no different. This is AI 2.0.
1. How to cut cost
AI cost comes from three sources, and all three must be addressed:
- Managing compute
- Shrinking the context window size without losing context
- Eliminating feature bloat
Managing compute begins with governed, pre-execution provisioning: inference runs only when predefined criteria are met, and only the compute the task requires is allocated, eliminating waste.
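The pre-execution gate described above can be sketched as a check that runs before any compute is allocated. This is a minimal illustration, not a real provisioning system: the policy fields, task types, and FLOP estimates are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class ProvisioningPolicy:
    """Hypothetical pre-execution policy; field names are illustrative."""
    max_tokens_per_request: int
    daily_compute_budget_flops: float
    allowed_task_types: frozenset

def approve_inference(task_type: str, est_tokens: int, est_flops: float,
                      used_flops_today: float, policy: ProvisioningPolicy) -> bool:
    """Gate inference before any compute is allocated."""
    if task_type not in policy.allowed_task_types:
        return False  # task class not provisioned at all
    if est_tokens > policy.max_tokens_per_request:
        return False  # request exceeds the per-call ceiling
    if used_flops_today + est_flops > policy.daily_compute_budget_flops:
        return False  # would exceed the daily compute budget
    return True
```

The point of the check is ordering: the criteria run first, so a request that fails never touches the model and never spends a FLOP.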
Shrinking the context window without losing context is possible. The industry is moving in the opposite direction — expanding context windows to preserve context — but this comes at significant computational expense. Smaller context windows reduce cost, latency and error when paired with 100% lossless memory, preserving context without carrying unnecessary tokens.
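One way to read "smaller window paired with memory" is retrieval: keep the full history in an external store and pull back only the snippets a query actually needs, so the window stays small. The sketch below is an assumption about the mechanism, using naive word overlap where a production system would use embeddings; all names are illustrative.

```python
def build_context(query: str, memory: list[str], budget_tokens: int) -> list[str]:
    """Select the most relevant stored snippets instead of replaying
    the full history. Relevance here is naive word overlap; a real
    system would use embedding similarity."""
    qwords = set(query.lower().split())
    scored = sorted(memory,
                    key=lambda s: len(qwords & set(s.lower().split())),
                    reverse=True)
    context, used = [], 0
    for snippet in scored:
        cost = len(snippet.split())  # crude token estimate
        if used + cost > budget_tokens:
            continue  # skip anything that would overflow the window
        context.append(snippet)
        used += cost
    return context
```

The budget is enforced unconditionally, so the prompt sent to the model never grows with history length; only the memory store does.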
Eliminating feature bloat is becoming necessary. Even OpenAI is pulling back from an “everything everywhere all at once” strategy, acknowledging that side efforts were becoming a distraction. Adding features has not improved adoption — it has increased cost. More features mean more compute and more complexity. Leaner, purpose-built systems deliver more accurate results at lower cost.
2. How to bill for compute
Cost is compute — not users, not seats, not licenses, and not tokens, which are only an imprecise proxy. The only way to align cost, usage, and revenue is to bill for compute. FLOP-based compute metering ties billing directly to actual compute, enabling fair, defensible pricing. Paired with customer-controlled usage limits, it prevents unpredictable spend and makes usage-based pricing viable for enterprises.
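A minimal sketch of FLOP metering with a customer-set cap, under the common rule of thumb that a transformer forward pass costs roughly 2 FLOPs per parameter per token. The price, cap, and class names are hypothetical, not any provider's actual billing scheme.

```python
def estimate_inference_flops(n_params: float, n_tokens: int) -> float:
    # Rule of thumb: a forward pass costs ~2 FLOPs per parameter per token.
    return 2.0 * n_params * n_tokens

def bill(flops: float, price_per_teraflop: float) -> float:
    # Convert raw FLOPs to teraflops, then apply the metered rate.
    return (flops / 1e12) * price_per_teraflop

class MeteredAccount:
    """FLOP metering with a customer-set usage cap (illustrative)."""
    def __init__(self, monthly_flop_cap: float):
        self.cap = monthly_flop_cap
        self.used = 0.0

    def charge(self, flops: float, price_per_teraflop: float) -> float:
        if self.used + flops > self.cap:
            # Customer-controlled limit: spend is bounded by design.
            raise RuntimeError("usage cap reached; customer must raise the limit")
        self.used += flops
        return bill(flops, price_per_teraflop)
```

Under these assumptions a 7B-parameter model answering with 1,000 tokens meters at 1.4e13 FLOPs, i.e. 14 teraflops, and the cap turns "unpredictable spend" into a hard error the customer chose.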
3. How to give users control
AI adoption has plateaued because users don’t trust it. No one hesitated to adopt Facebook, Amazon, YouTube or TikTok, but AI feels dangerous. Even massive marketing hasn’t changed that. Super Bowl ads from Amazon, Anthropic and Ring failed to reassure users, prompting The New York Times and New York to call AI “creepy.” Users and enterprises will not adopt AI at scale until they trust it, and they will not trust it until they can control it.
Scaling AI requires user-controlled governance within a safe, enforced boundary. Providers must continue to enforce non-negotiable rules around harm, illegality and abuse, but inside that boundary, users need control over how the system behaves. Preferences—style, tone, reasoning, constraints—should persist and be honored consistently. User-controlled governance creates a safe envelope with adjustable tuning inside it, allowing personalization without sacrificing safety. That is what makes AI predictable, usable, and scalable.
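The safe envelope with adjustable tuning inside it can be sketched as a settings merge in which provider hard rules always override user preferences. The rule names and defaults below are illustrative, not any vendor's actual policy schema.

```python
# Non-negotiable provider boundary: users cannot change these.
PROVIDER_HARD_RULES = {"allow_harmful_instructions": False,
                       "allow_illegal_content": False}

# User-tunable preferences that persist across sessions.
DEFAULT_PREFS = {"tone": "neutral",
                 "verbosity": "medium",
                 "show_reasoning": False}

def effective_settings(user_prefs: dict) -> dict:
    """Merge user preferences inside the safety envelope: any attempt
    to override a provider hard rule is silently ignored."""
    tunable = {k: v for k, v in user_prefs.items()
               if k not in PROVIDER_HARD_RULES}
    settings = {**DEFAULT_PREFS, **tunable}
    settings.update(PROVIDER_HARD_RULES)  # hard rules always win
    return settings
```

The design choice is that safety is not a preference: the hard rules are applied last, so no combination of user settings can reach outside the envelope, while everything inside it persists and is honored.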
Where is cost control already separating winners from everyone else?
Send signal: signal@revenuemodel.ai
I read every signal.