Approved at $0.12. Actual cost: 5× higher

A company deploys an AI agent for customer support.

Based on token pricing:
$0.12 per interaction

Finance signs off.
It looks predictable.

Then it runs.

Each interaction triggers…

  • 3 retries
  • 2 tool calls
  • 1 retrieval pass
  • a branching decision path

The system doesn’t fail.
It works exactly as designed.

But the execution path expands.

Actual cost:
4–6× higher than expected

Not because of scale.
Not because of misuse.

Because the system did more work than the token-based estimate could see.
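
To make the arithmetic concrete, here is a rough sketch of how one interaction could land in the 4–6× range. Only the $0.12 baseline and the step counts come from the scenario above. The per-step costs for tool calls and retrieval (`tool_cost`, `retrieval_cost`) are hypothetical, and so is the assumption that every retry repeats the full model call. The branching decision path is not modeled.

```python
# Only BASELINE and the step counts come from the scenario; everything else is assumed.
BASELINE = 0.12  # approved cost of one model call, in dollars


def interaction_cost(retries, tool_calls, retrieval_passes,
                     tool_cost=0.03, retrieval_cost=0.04):
    """Cost of one interaction, assuming each retry repeats the full model call.

    tool_cost and retrieval_cost are hypothetical per-step prices.
    """
    model_calls = 1 + retries  # the original call plus every retry
    return (model_calls * BASELINE
            + tool_calls * tool_cost
            + retrieval_passes * retrieval_cost)


actual = interaction_cost(retries=3, tool_calls=2, retrieval_passes=1)
multiplier = actual / BASELINE
print(f"approved ${BASELINE:.2f}, actual ${actual:.2f}, {multiplier:.1f}x")
```

Under these assumptions the retries alone account for 4× the approved figure, before any tool or retrieval cost is added.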

Here’s the problem

Finance had:

  • no visibility before execution
  • no way to constrain behavior
  • no unit that reflected actual work

They’re accountable for ROI without control over cost.

This is where the model breaks

Tokens don’t map to compute.
And if the unit doesn’t map to cost,
it can’t be used to control it.

AI economics only works when…

  • cost is estimated before execution
  • execution is bounded before it runs
  • usage is measured in compute, not text

That’s what puts Finance in control
before cost is incurred.
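
A minimal sketch of what "bounded before it runs" could look like in practice. The names here (`BoundedRun`, `authorize`, `BudgetExceeded`) and the compute-unit values are hypothetical, chosen only to illustrate the idea: every step is checked against an approved ceiling before it executes, not reported afterwards.

```python
from dataclasses import dataclass


class BudgetExceeded(Exception):
    """Raised when a step would push the run past its approved ceiling."""


@dataclass
class BoundedRun:
    ceiling: float      # compute units approved before execution (hypothetical unit)
    spent: float = 0.0  # compute units authorized so far

    def authorize(self, step: str, estimated_units: float) -> None:
        """Check the estimate against the remaining budget before the step runs."""
        if self.spent + estimated_units > self.ceiling:
            raise BudgetExceeded(
                f"{step} needs {estimated_units}, "
                f"only {self.ceiling - self.spent} left"
            )
        self.spent += estimated_units


run = BoundedRun(ceiling=5.0)
# Hypothetical execution path: one call, retrieval, a tool call, then repeated retries.
steps = [("model_call", 1.0), ("retrieval", 0.5), ("tool_call", 0.5),
         ("retry", 1.0), ("retry", 1.0), ("retry", 1.0), ("retry", 1.0)]

executed = []
for name, units in steps:
    try:
        run.authorize(name, units)
    except BudgetExceeded:
        break  # stop before the cost is incurred
    executed.append(name)

print(executed, run.spent)
```

In this sketch the fourth retry is refused because it would exceed the ceiling, so the run stops at the approved amount instead of expanding past it.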

Why this can’t be fixed with existing approaches

  • Tokens cannot see execution → measurement is wrong
  • Observability happens after execution → control is too late
  • Pricing sits on top of both → inherits the error

If cost is committed before it’s controlled, no pricing model can fix it.

What’s missing isn’t visibility after the fact. It’s control before execution:

  1. FLOP-based metering
  2. Customer-controlled governance
  3. Pre-execution provisioning
  4. Bounded AI Pricing
  5. Normalized Compute Units

– Published on Monday, March 30, 2026



© 2025 BrassTacksDesign, LLC