FLOP-Based Metering (FBM)
Definition.
FLOP-Based Metering (FBM) measures AI usage based on compute work performed rather than tokens generated. It treats inference as a computational process whose cost is determined by underlying operations, not textual output.
In other words, FBM measures what actually drives cost.
Why this exists.
Current AI billing models rely on tokens, which serve as a proxy for usage but do not reflect actual compute.
This creates four structural problems:
- Mismatch: identical token counts can represent vastly different compute workloads
- Invisibility: agentic and background processes may consume compute without generating tokens
- Lack of control: tokens are measured after execution, so cost cannot be bounded in advance
- Semantic blindness: token-based systems give every unit of text equal weight, even though computational effort varies with context, structure, and meaning
FBM replaces tokens with a unit that reflects the real economic driver: compute.
What “FLOP-based” means.
FLOP-based refers to measuring the underlying computational work required to perform inference, counted in floating-point operations (FLOPs). This is a count of operations, not FLOPS (floating-point operations per second), which is a rate.
Rather than counting words or characters, FBM measures the work required to execute a request. The result is a measurement more closely aligned with actual resource consumption.
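To make the unit concrete, here is a minimal sketch of a FLOP estimate for a dense decoder-only transformer. It is not FBM's calculation, which is not public. It assumes a common rule of thumb: roughly 2 FLOPs per parameter per processed token, plus an attention term that grows with sequence position. The function name, the model shape, and the two workloads are illustrative assumptions.

```python
def estimate_forward_flops(n_params: int, n_layers: int, d_model: int, total_tokens: int) -> int:
    """Rough forward-pass FLOPs for one request; a rule of thumb, not FBM's method."""
    # About 2 FLOPs (one multiply, one add) per parameter for every processed token.
    weight_flops = 2 * n_params * total_tokens
    # Attention term: the token at position p attends to p tokens, costing about
    # 2 * n_layers * p * d_model FLOPs. Summed over p = 1..T this gives
    # n_layers * d_model * T * (T + 1).
    attention_flops = n_layers * d_model * total_tokens * (total_tokens + 1)
    return weight_flops + attention_flops


# Hypothetical model shape (assumed values, not taken from the source).
N_PARAMS, N_LAYERS, D_MODEL = 7_000_000_000, 32, 4096

# Two workloads with the same total token count: 32,000 tokens each.
one_long = estimate_forward_flops(N_PARAMS, N_LAYERS, D_MODEL, 32_000)
many_short = 32 * estimate_forward_flops(N_PARAMS, N_LAYERS, D_MODEL, 1_000)

ratio = one_long / many_short
print(f"one long request:    {one_long:.3e} FLOPs")
print(f"32 short requests:   {many_short:.3e} FLOPs")
print(f"ratio at equal token count: {ratio:.2f}")
```

Under these assumptions the single long request needs about 1.3 times the compute of the 32 short ones, even though a token counter would report the same 32,000 tokens for both. The gap comes entirely from the attention term, so it widens as contexts grow.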
FBM’s defining characteristic: compute alignment.
FBM aligns measurement with the cost of execution, which enables pre-execution control. This differs from token-based systems, where:
- measurement is indirect
- cost is inferred from output
- variability is hidden
In FBM, measurement reflects the work performed, not the artifact produced.
How it works.
For each request, the system evaluates the projected computational workload associated with the execution.
From this, the system derives a compute-based measure that can be evaluated prior to, during and after execution.
The public description is intentionally abstract; the specific calculation methods are not disclosed.
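Because the calculation itself is not disclosed, the sketch below only illustrates the lifecycle described above: a projection before execution, a running total during execution, and a final figure afterward. The `RequestMeter` class, its fields, and the FLOP values are hypothetical, and the projection is simply passed in as a number.

```python
from dataclasses import dataclass


@dataclass
class RequestMeter:
    """Tracks one request's compute, in FLOPs, across its lifecycle."""
    projected_flops: int      # estimate made before execution (source is assumed)
    consumed_flops: int = 0   # running total updated during execution
    finished: bool = False

    def record(self, flops: int) -> None:
        """Add the work done by one execution step."""
        if self.finished:
            raise RuntimeError("request already finalized")
        self.consumed_flops += flops

    def finalize(self) -> int:
        """Close the request and return actual minus projected FLOPs."""
        self.finished = True
        return self.consumed_flops - self.projected_flops


# Before execution: a projection is attached to the request.
meter = RequestMeter(projected_flops=5_000_000_000_000)

# During execution: each step reports the work it performed.
for step_flops in (2_000_000_000_000, 2_500_000_000_000):
    meter.record(step_flops)

# After execution: the final total and its drift from the projection.
drift = meter.finalize()
print(meter.consumed_flops, drift)
```

Keeping the projection and the running total in the same unit is what allows them to be compared at any point in the request.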
What measurement means in FBM.
Measurement in FBM represents the economic cost of execution.
It is not:
- a proxy for text length
- a billing artifact
- a post-hoc approximation
It is a representation of the compute required to produce a result.
Difference from token-based metering.
Token-based systems count units of generated text. FBM measures execution cost.
Tokens answer: how much text was written?
FBM answers: how much work was done?
This distinction becomes critical in agentic systems, where work may occur without generating proportional text output.
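A small tally illustrates the point. The step names, FLOP figures, and token counts below are invented for illustration; the only assumption that matters is that some steps of an agentic run do work without returning text to the user.

```python
# Each tuple: (step name, estimated FLOPs, tokens returned to the user).
# All values are hypothetical.
steps = [
    ("plan", 3_000_000_000_000, 0),
    ("tool_call_and_parse", 1_000_000_000_000, 0),
    ("self_check", 4_000_000_000_000, 0),
    ("final_answer", 2_000_000_000_000, 120),
]

visible_tokens = sum(tokens for _, _, tokens in steps)
total_flops = sum(flops for _, flops, _ in steps)
hidden_flops = sum(flops for _, flops, tokens in steps if tokens == 0)

print(f"visible tokens: {visible_tokens}")
print(f"share of compute with no visible output: {hidden_flops / total_flops:.0%}")
```

In this made-up run, a count of visible output sees only the final step, while most of the estimated work happened in the steps before it.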
Difference from infrastructure-level metering.
Infrastructure metering tracks system-level resource usage (e.g., GPU time, utilization).
FBM operates at the level of the individual request, enabling:
- Per-request cost visibility
- Pre-execution estimation
- Integration with governance layers
Cost discipline.
FBM enables cost discipline by making compute measurable in a way that aligns with actual usage.
When combined with pre-execution controls, it allows:
- estimation before execution
- comparison across requests
- enforcement of cost boundaries
Without a compute-aligned unit, cost cannot be reliably controlled.
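As a minimal sketch of what a pre-execution cost boundary could look like, assuming a projected FLOP figure is already available for each request: the `admit` function, the request names, and the budget are hypothetical, not a description of any specific control mechanism.

```python
def admit(projected_flops: int, budget_flops: int) -> bool:
    """Return True if the projected work fits within the budget."""
    return projected_flops <= budget_flops


# Hypothetical projections for two pending requests, in FLOPs.
projections = {
    "summarize_email": 3_000_000_000_000,
    "deep_research": 90_000_000_000_000,
}
BUDGET_FLOPS = 10_000_000_000_000  # assumed per-request ceiling

# Comparison across requests: order them by projected cost.
ranked = sorted(projections, key=projections.get)

# Enforcement: decide before anything runs.
decisions = {name: admit(projections[name], BUDGET_FLOPS) for name in ranked}
for name, allowed in decisions.items():
    print(f"{name}: {'run' if allowed else 'reject'}")
```

The decision is made before any work is done, which is only possible because the projection and the budget share one unit.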
Current state of AI measurement.
Most public AI systems continue to rely on token-based metering, which was sufficient, though inherently imprecise, for single-turn, text-based interactions.
As systems evolve toward multi-step workflows, agentic execution and tool integration, token-based measurement becomes increasingly disconnected from cost.
Origin and engagement.
FLOP-Based Metering is part of a broader architectural approach to aligning AI cost, billing and control with execution.
Organizations exploring alternatives to token-based billing typically evaluate FBM in conjunction with:
- control mechanisms
- normalized measurement units
- governance frameworks
Public discussion is intentionally incomplete; failure modes only become clear at the architectural level. Confidential architectural review available upon request.