Approved at $0.12. Actual cost: roughly 5× higher
A company deploys an AI agent for customer support.
Based on token pricing:
$0.12 per interaction
Finance signs off.
It looks predictable.
Then it runs.
Each interaction triggers…
- 3 retries
- 2 tool calls
- 1 retrieval pass
- a branching decision path
The system doesn’t fail.
It works exactly as designed.
But the execution path expands.
Actual cost:
4–6× higher than expected
Not because of scale.
Not because of misuse.
Because the system did more work than tokens could see.
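A rough sketch of how the gap opens up. Only the $0.12 baseline and the step counts come from the scenario above; the per-step weights are illustrative assumptions, not measurements, and the branching path is left out because its cost depends on which branch runs.

```python
# Illustrative only: STEP_WEIGHT values are assumptions, not data from the scenario.
BASELINE = 0.12  # approved cost per interaction, derived from token pricing

# Hypothetical cost of each extra step, as a fraction of one baseline call.
STEP_WEIGHT = {"retry": 1.0, "tool_call": 0.4, "retrieval": 0.3}

# Extra steps each interaction triggers, as listed in the scenario.
STEPS = {"retry": 3, "tool_call": 2, "retrieval": 1}


def actual_cost(baseline, steps, weights):
    """Baseline call plus every extra step the execution path triggers."""
    extra = sum(count * weights[name] for name, count in steps.items())
    return baseline * (1 + extra)


cost = actual_cost(BASELINE, STEPS, STEP_WEIGHT)
multiplier = cost / BASELINE
print(f"approved ${BASELINE:.2f}, actual ${cost:.2f} ({multiplier:.1f}x)")
```

With these assumed weights the multiplier lands inside the 4–6× range. Different weights would move it, which is the point: the token estimate never included these steps at all.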
Here’s the problem
Finance had:
- no visibility before execution
- no way to constrain behavior
- no unit that reflected actual work
They’re accountable for ROI without control over cost.
This is where the model breaks
Tokens don’t map to compute.
And if the unit doesn’t map to cost,
it can’t be used to control it.
AI economics only works when…
- cost is estimated before execution
- execution is bounded before it runs
- usage is measured in compute, not text
That’s what puts Finance in control
before cost is incurred.
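One way to picture those three conditions together is a budget that is checked before each step runs. This is a minimal sketch under stated assumptions: `ComputeBudget`, `reserve`, the unit values, and the cap are all hypothetical names and numbers, not an existing product or API.

```python
class BudgetExceeded(Exception):
    """Raised before a step runs if it would push usage past the cap."""


class ComputeBudget:
    """Hypothetical per-interaction cap, measured in compute units rather than tokens."""

    def __init__(self, cap_units):
        self.cap_units = cap_units
        self.used_units = 0.0

    def reserve(self, step, estimated_units):
        # The check happens BEFORE execution, so cost is never committed first.
        if self.used_units + estimated_units > self.cap_units:
            raise BudgetExceeded(f"{step} would exceed cap of {self.cap_units} units")
        self.used_units += estimated_units


def run_interaction(plan, budget):
    """Run steps in order, stopping at the first one the budget refuses."""
    completed = []
    for step, estimated_units in plan:
        try:
            budget.reserve(step, estimated_units)
        except BudgetExceeded:
            break  # bounded: the step never executes
        completed.append(step)  # a real system would execute the step here
    return completed


# Assumed estimates per step, in hypothetical compute units.
plan = [("model_call", 1.0), ("retry", 1.0), ("tool_call", 0.4), ("retry", 1.0)]
budget = ComputeBudget(cap_units=2.5)
completed = run_interaction(plan, budget)
print(completed, budget.used_units)
```

The second retry is refused because its estimate would cross the cap, so spend stays bounded no matter how the execution path branches.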
Why this can’t be fixed with existing approaches
- Tokens cannot see execution → measurement is wrong
- Observability happens after execution → control is too late
- Pricing sits on top of both → inherits the error
If cost is committed before it’s controlled, no pricing model can fix it.
What’s missing isn’t visibility after the fact. It’s control before execution:
- FLOP-based metering
- Customer-controlled governance
- Pre-execution provisioning
- Bounded AI Pricing
- Normalized Compute Units
– Published on Monday, March 30, 2026