Memory, governance and revenue remain “Over the Rainbow” for OpenAI, Google and Anthropic
And now…for all you nerds out there…
GPT-5. Gemini 3. Opus 4.5. Every few months the labels change, the charts refresh, and the story is the same: bigger models, longer context, more “magic.”
But if you look under the hood, the real race isn’t about squeezing out one more benchmark win. It’s about three very unsexy pieces that determine whether any of this will actually work at scale — the foundation of AI 2.0.
And those three pieces are the ones everyone knows, but no one has actually shipped:
- memory
- governance
- revenue
On all three, the biggest players are racing forward. They are also leaving giant gaps.
And yes, I’m saying that as someone who has already filed for the things they don’t have yet.
1. Memory: long context vs real memory
OpenAI, Anthropic, and Google now all advertise some version of AI “memory.”
- OpenAI’s ChatGPT can remember your name, projects, and preferences across chats. You can open a panel, see what it remembers, and ask it to forget.
- Anthropic’s Claude offers project-scoped memory. It will remember your current coding project or client, and keep different workstreams in separate “spaces.”
- Google’s Gemini leans on brute force: million-token context windows and deep integration with your Google account so it can pull from mail, docs, calendar, and search.
All of that is progress. But it is not the same as real memory.
Every one of these systems is still fundamentally lossy:
- they summarize
- they forget
- they prioritize “useful” details and drop the rest
Ask power users of any of these products and you’ll hear the same story: now and then the AI just “forgets” a project, loses the thread, or contradicts itself in the same session.
What I’ve filed is different in kind, not just degree. The design assumes:
- memory is lossless by default
- every interaction that matters can be replayed, not just vaguely recalled
- memory sits under explicit governance, not as a side effect of system logs
You don’t have to believe me on the technical details yet. The point is simpler: the majors are trying to retrofit memory onto architectures that were never designed to remember. I started from the opposite premise: if you don’t solve memory properly, nothing else scales.
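To make the contrast concrete, here is a minimal sketch in Python of what “lossless and replayable” means in principle. The class names and fields are my own illustration, not the design in the filings; the point is only that an append-only log you can replay verbatim behaves very differently from a summary the model keeps rewriting.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Callable


@dataclass(frozen=True)
class MemoryEvent:
    """One interaction, stored verbatim rather than summarized."""
    timestamp: datetime
    role: str      # "user" or "assistant"
    content: str   # the full text, not a compressed gist


@dataclass
class LosslessMemory:
    """Append-only log: nothing is dropped, everything can be replayed."""
    events: list[MemoryEvent] = field(default_factory=list)

    def record(self, role: str, content: str) -> None:
        self.events.append(MemoryEvent(datetime.now(timezone.utc), role, content))

    def replay(self, keep: Callable[[MemoryEvent], bool]) -> list[MemoryEvent]:
        """Return the original events, filtered but never paraphrased."""
        return [e for e in self.events if keep(e)]


# Usage: recall every exchange that mentioned a project, word for word.
mem = LosslessMemory()
mem.record("user", "The Q3 launch for Project Falcon slips to October.")
mem.record("assistant", "Noted: Project Falcon launch moved to October.")
falcon_history = mem.replay(lambda e: "Falcon" in e.content)
```

A summary-based memory has to decide up front which of those details are worth keeping; a log like this defers that decision until you ask.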
2. Governance: who really has their hand on the brake?
If you read the blog posts, everyone sounds responsible.
OpenAI has safety teams, policy documents, and a growing collection of open-weight safety models. Anthropic has Constitutional AI, Responsible Scaling Policies, and an entire Transparency Hub devoted to governance. Google has a Frontier Safety Framework and API-level safety_settings that let you tune content filters up or down.
This is all real work. It’s also one-sided.
Today’s governance story is basically:
- the company defines the rules
- the company tunes the filters
- the company owns the kill switch
As a user, you get some settings:
- toggle memory on or off
- choose stricter filters for teens
- dial safety thresholds for certain APIs
Important, yes. But at the end of the day, the provider can still change the rules unilaterally — and has every incentive to design governance that protects the brand first and the user second.
Meanwhile, jailbreak researchers keep demonstrating “universal bypass” attacks that walk straight past system prompts and alignment training. That should tell you how fragile the current guardrails are.
The architecture I’ve filed for takes a harder line:
- governance is a shared control system, not a one-way policy
- the provider has a key
- the user has a key
- neither side can silently overrule the other
- every allow and deny produces a tamper-resistant receipt that can be audited later
In other words: a real kill switch with receipts, not just a promise in the terms of service.
The big labs are still treating governance as a combination of policy PDFs, internal flags, and a few sliders in the UI. I’m treating it as a technical system with two hands on the brake.
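Here is a toy sketch of what “two hands on the brake” could mean mechanically, again in Python. The class name, the dual-approval rule, and the hash chain are illustrative assumptions on my part, not the mechanism in the filings; they just show that “neither side can silently overrule the other” and “tamper-resistant receipt” are implementable ideas, not slogans.

```python
import hashlib
import json
from datetime import datetime, timezone


class DualKeyGovernor:
    """Two-party control: an action runs only if both the provider and the
    user approve, and every decision is appended to a hash-chained receipt
    log that either side can audit later."""

    def __init__(self) -> None:
        self.receipts: list[dict] = []
        self._prev_hash = "genesis"

    def decide(self, action: str, provider_ok: bool, user_ok: bool) -> bool:
        allowed = provider_ok and user_ok   # neither key is enough on its own
        receipt = {
            "time": datetime.now(timezone.utc).isoformat(),
            "action": action,
            "provider_ok": provider_ok,
            "user_ok": user_ok,
            "allowed": allowed,
            "prev_hash": self._prev_hash,   # link to the previous receipt
        }
        # Hash the receipt so any later tampering breaks the chain.
        self._prev_hash = hashlib.sha256(
            json.dumps(receipt, sort_keys=True).encode()
        ).hexdigest()
        receipt["hash"] = self._prev_hash
        self.receipts.append(receipt)
        return allowed


# Usage: the provider alone cannot quietly push an action through.
gov = DualKeyGovernor()
gov.decide("export_user_memory", provider_ok=True, user_ok=False)  # denied, receipted
gov.decide("delete_conversation", provider_ok=True, user_ok=True)  # allowed, receipted
```

An auditor can recompute the chain and spot any receipt that was edited or removed after the fact.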
3. Revenue: metering words vs metering reality
For all the talk about “AGI” and “frontier models,” the revenue story is comically simple.
Everyone is metering tokens.
- OpenAI charges per million tokens on the API, and per seat on ChatGPT Plus, Pro, Team, and Enterprise.
- Anthropic charges per million tokens for Claude, plus monthly plans that hide token limits behind “fair use” and caps when inference whales get too expensive.
- Google charges per million tokens on Gemini APIs, then bundles Gemini into cloud and Workspace deals where the metering is buried in enterprise pricing.
Inside the companies, everyone knows what really drives cost: FLOPs — the actual floating point operations required to answer a question. You can see it in technical blogs and investor memos. FLOPs show up in estimates of training cost, energy use, and model scaling.
But if you’re a customer, you almost never see FLOPs. You see:
- input tokens
- output tokens
- clever product names
You have very little say in how much compute the model spends on your problem. You can’t say “for this query, spend 10 times the FLOPs because the stakes are high” — or “for this one, give me the cheapest approximation.” You’re billed by the inch of text.
The filings I’ve made assume a different world: usage is metered in both tokens and FLOPs.
Instead of pretending every token is equal, the system says the quiet part out loud: when you pay for intelligence, you are paying for compute.
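Here is a toy example, in the same spirit, of what dual metering could look like. The field names and rates below are made up for illustration; the only point is that the bill reflects compute actually spent, under a per-request budget the customer sets.

```python
from dataclasses import dataclass


@dataclass
class UsageRecord:
    """One request, metered in both text and compute."""
    input_tokens: int
    output_tokens: int
    flops: float        # floating point operations actually spent
    flop_budget: float  # ceiling the customer set for this request


# Illustrative rates only; real pricing would be set by the provider.
PRICE_PER_MILLION_TOKENS = 3.00  # USD
PRICE_PER_PETAFLOP = 0.02        # USD


def bill(record: UsageRecord) -> float:
    """Price a request on tokens *and* FLOPs instead of tokens alone."""
    assert record.flops <= record.flop_budget, "spent more compute than the customer allowed"
    token_cost = (record.input_tokens + record.output_tokens) / 1e6 * PRICE_PER_MILLION_TOKENS
    flop_cost = record.flops / 1e15 * PRICE_PER_PETAFLOP
    return round(token_cost + flop_cost, 6)


# Usage: same prompt, but the customer allowed 10x the compute for the high-stakes run.
routine = UsageRecord(input_tokens=1_200, output_tokens=800, flops=2e14, flop_budget=2e14)
high_stakes = UsageRecord(input_tokens=1_200, output_tokens=800, flops=2e15, flop_budget=2e15)
print(bill(routine), bill(high_stakes))  # identical token bill, very different compute bill
```

Under token-only pricing those two requests look the same; under dual metering the extra thinking shows up on the invoice, where it belongs.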
What this really means
You don’t have to believe that my ideas are “better” than what OpenAI, Anthropic, or Google are doing.
You just need to notice one thing:
- The majors are trying to bolt memory, governance, and pricing logic onto architectures that were never designed for any of those things.
- I started from the opposite direction: treat memory, governance, and revenue as the spine of the system, then hang models off of that.
So when you see the next benchmark chart or breathless coverage of GPT-5 and Gemini 3, remember the three boring questions that actually matter:
- What does this system remember, and who decides?
- Who really has their hand on the brake, and where are the receipts?
- Are we metering words, or are we finally honest about compute?
Until those questions are answered in hardware, software, and contracts — not just in blog posts — AI will keep feeling like a flashy demo sitting on top of a very rickety foundation.