The 100 Trillion Token Mirage: What OpenRouter's AI Report Actually Reveals

OpenRouter's 100 trillion token study was supposed to prove AI is transforming everything. The data shows something else: half of open-source usage is roleplay, enterprise adoption is thin, and one account caused a 20-point spike in the metrics.

What 100 Trillion AI Tokens Actually Reveal About Usage

One hundred trillion tokens. It's a number designed to shut down argument, the kind of figure that makes executives sit up and investors reach for the checkbook. OpenRouter and a16z released their "State of AI" report this week, billing it as an empirical study of real-world LLM usage patterns. The dataset is genuinely massive. The methodology is transparent. And the conclusions, if you read them carefully, quietly undermine the narrative that AI adoption is accelerating everywhere, across all industries, transforming work as we know it.

The report's most striking finding isn't what the authors emphasize. It's what the data reveals about who actually uses AI inference APIs, and what they use them for. More than half of open-source model usage goes to roleplay and entertainment. Programming dominates the rest. Everything else, the legal research, the medical queries, the financial analysis that supposedly justify trillion-dollar valuations, barely registers.

The Breakdown

• OpenRouter's dataset captures indie developers and researchers, not enterprise. Banks and Fortune 500 companies route through other channels.

• Roleplay accounts for 52% of open-source model usage. Programming takes most of the rest. Enterprise verticals barely register.

• Singapore ranks second by billing location, likely reflecting Chinese VPN traffic rather than genuine Southeast Asian adoption.

• One account caused a 20-percentage-point spike in tool usage, suggesting extreme concentration behind aggregate numbers.

The Selection Problem Nobody Wants to Discuss

OpenRouter functions as an aggregator, a middleman that routes API calls to various model providers for a 5.5% fee. Its value proposition appeals to a specific demographic: developers who want to experiment across multiple models without managing separate accounts, researchers comparing performance, indie builders who need flexibility more than enterprise-grade reliability.
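Mechanically, routing through OpenRouter looks like any OpenAI-style chat completion call pointed at a different base URL. A minimal sketch of that request shape in Python, with a placeholder API key and model slug (neither comes from the report):

```python
import json

# Minimal sketch of a request to OpenRouter's OpenAI-compatible
# chat completions endpoint. Key and model slug are placeholders.
OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions"

def build_request(api_key: str, model: str, prompt: str) -> dict:
    """Assemble the HTTP pieces for a single chat completion call."""
    return {
        "url": OPENROUTER_URL,
        "headers": {
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        "body": {
            "model": model,  # e.g. an open-source model slug
            "messages": [{"role": "user", "content": prompt}],
        },
    }

req = build_request("sk-or-placeholder", "some-provider/some-model", "Hello")
print(json.dumps(req["body"], indent=2))
# To actually send it, POST with any HTTP client, e.g.:
#   requests.post(req["url"], headers=req["headers"], json=req["body"])
```

The appeal is exactly this uniformity: swap the `model` string and the same request hits a different provider, with OpenRouter taking its cut on the way through.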

This demographic is not representative of AI usage broadly. Banks route their inference through Azure. Hospitals deploy on-premises. Fortune 500 companies with compliance requirements use direct API contracts with Anthropic or OpenAI. None of that traffic appears in OpenRouter's dataset.

The Hacker News discussion surfaced this immediately. As one commenter noted, "Most of the high volume enterprise use cases use their cloud providers. What we have here is mostly from smaller players." Another pointed out that small models, the category the report claims is declining, are precisely those that can be self-hosted. OpenRouter would never see that usage.

The report does acknowledge these limitations, but acknowledgment is not adjustment. The authors buried the caveats in the methodology section, trusting that headlines would treat OpenRouter's slice as the whole pie.

Roleplay Is Eating the Open-Source World

Here's where the data gets uncomfortable for the productivity narrative. Among open-source models on OpenRouter, 52% of all tokens go to roleplay. Not coding assistance. Not document analysis. Not the enterprise use cases that dominate investor presentations. Roleplay.

Dig into the subcategories and the picture sharpens. Nearly 60% of roleplay tokens fall under "Games/Roleplaying Games." Another 15.6% goes to "Writers Resources." And 15.4%, the report delicately notes, falls under "Adult content." People use open-source models for fantasy, interactive fiction, and the kind of filth that commercial APIs refuse to generate.

The reason is obvious. Open-source models can be run without content filters. They don't cut off responses mid-sentence because an algorithm flagged the word "knife." For users seeking creative freedom, or freedom of another sort entirely, these models offer something proprietary systems won't.

This isn't a small niche. It's the majority of open-source usage. And it raises a question the AI industry would rather not answer: what happens to the "AI will transform work" thesis when the actual usage data shows people primarily want AI for entertainment and companionship?

The report frames this positively, noting that "roleplay tasks require flexible responses, context retention, and emotional nuance." True enough. But flexible responses and emotional nuance aren't what's driving $100 billion infrastructure buildouts. Those investments assume AI will automate legal research, accelerate drug discovery, write corporate reports. The data suggests consumers have different priorities.

Programming Dominates Everything Else

Strip out roleplay, and programming absorbs nearly all remaining serious usage. The report shows programming queries growing from roughly 11% of total tokens in early 2025 to over 50% by November. Anthropic's Claude handles more than 60% of programming traffic. OpenAI and Google split most of the rest.

This concentration carries implications. If programming represents the bulk of non-entertainment AI usage, then AI adoption outside the tech industry is far thinner than commonly assumed. The report's category breakdown confirms this: finance, legal, and healthcare all appear in the "fragmented" lower-volume segments. Science queries are 80% about "Machine Learning & AI" itself, not biology or physics or chemistry. Users are asking AI about AI.

One Hacker News commenter nailed the problem: "I haven't seen many obvious signs of AI adoption around me once I leave the office. Microsoft has been struggling to sell its Copilot offerings to ordinary MS Office users, who apparently aren't that keen."

The data supports this observation. Outside of coding and entertainment, growth looks flat. The category charts show seasonal patterns and gradual increases, not the exponential curves that justify current valuations. Enterprise adoption of AI for non-programming tasks may be happening, but it's happening elsewhere, through channels this dataset cannot see.

Geographic Data That Doesn't Add Up

Singapore ranks second globally in token volume, behind only the United States. Germany sits third. China fourth.

Singapore. A city-state of 6 million people, generating more AI inference traffic than Germany, the UK, Japan, and India combined.

The report determines geography through billing location rather than IP addresses. The authors frame this as a privacy-preserving methodology. What it actually creates is an obvious blind spot. Chinese users facing restrictions on Western AI services can route traffic through Singaporean bank accounts and VPNs. The billing data would show Singapore; the actual usage originates elsewhere.

Multiple Hacker News commenters flagged this immediately. "Almost certainly VPN traffic," one wrote. "Most major LLMs block both China and Hong Kong, so Singapore ends up being the fastest nearby endpoint that isn't restricted."

The authors don't address this possibility. Asia's share of usage reportedly doubled from 13% to 31% over the study period. How much of that growth represents genuine Asian adoption versus Chinese users circumventing blocks? The methodology can't tell us. The authors don't seem curious.

Concentration Risks and the One-Account Spike

Buried in the tool-usage analysis sits an extraordinary admission. In May 2025, tool invocations spiked by roughly 20 percentage points. The cause? "One sizable account whose activity briefly lifted overall volumes."

One account. Twenty percentage points. On a platform processing trillions of tokens.

This suggests usage concentration far more extreme than the aggregate numbers imply. If a single heavy user can visibly move platform-wide metrics, then the "100 trillion tokens" figure may represent fewer independent users than it appears. The long tail of casual experimenters might generate noise while a small cohort of power users generates most actual traffic.

The report doesn't disclose user concentration metrics. We don't know whether 1% of accounts generate 50% of tokens, or 90%. Without that information, interpreting the aggregate statistics is guesswork dressed up in charts.
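The missing metric is cheap to compute if you hold per-account token counts. A toy Python sketch with invented numbers (not the report's data) shows how skewed a distribution can sit behind a big aggregate:

```python
def top_share(tokens_per_account, fraction=0.01):
    """Share of all tokens generated by the top `fraction` of accounts.

    `tokens_per_account` is a list of token counts, one per account.
    The data below is hypothetical, not figures from the report.
    """
    ranked = sorted(tokens_per_account, reverse=True)
    k = max(1, int(len(ranked) * fraction))
    return sum(ranked[:k]) / sum(ranked)

# A skewed toy distribution: one whale, a few mid-size users, a long tail.
accounts = [1_000_000] + [10_000] * 9 + [100] * 990
print(f"top 1% of accounts: {top_share(accounts):.0%} of tokens")
```

On this invented distribution the top 1% of accounts produce roughly 92% of the tokens, the kind of figure that would reframe every aggregate statistic in the report.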

The Glass Slipper Finding Worth Remembering

Not everything in the report is methodological quicksand. The retention analysis surfaces a genuinely useful pattern the authors call the "Cinderella Glass Slipper" effect.

When a new model launches, it briefly has an opportunity to become the first solution that actually works for a particular user's problem. If the model fits, users stick. Their workflows calcify around that model's capabilities and quirks. Switching becomes costly. The foundational cohort, users who found their fit during that narrow window, shows dramatically higher retention than later adopters.

OpenAI's GPT-4o Mini demonstrates this starkly. The July 2024 launch cohort retained at levels far above every subsequent cohort. First-mover advantage in AI isn't about being first to market. It's about being first to solve a specific pain point that previously had no solution.
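The underlying measurement is standard cohort retention: group users by the month they first appear, then track what fraction of each cohort remains active in later months. A minimal sketch on toy usage records (nothing here comes from OpenRouter's logs):

```python
from collections import defaultdict

def cohort_retention(events):
    """Fraction of each first-month cohort still active in later months.

    `events` is a list of (user_id, month_index) usage records --
    toy data here, not the report's actual logs.
    """
    first_seen = {}
    active = defaultdict(set)                 # month -> users seen that month
    for user, month in sorted(events, key=lambda e: e[1]):
        first_seen.setdefault(user, month)
        active[month].add(user)

    cohorts = defaultdict(set)                # first month -> cohort members
    for user, month in first_seen.items():
        cohorts[month].add(user)

    return {
        start: [
            len(members & active[m]) / len(members)
            for m in sorted(active)
            if m >= start
        ]
        for start, members in cohorts.items()
    }

events = [("a", 0), ("b", 0), ("a", 1), ("a", 2), ("c", 1), ("c", 2)]
print(cohort_retention(events))
# → {0: [1.0, 0.5, 0.5], 1: [1.0, 1.0]}
```

In this toy data the month-0 cohort halves after launch while the month-1 cohort holds steady; the report's finding is the reverse pattern at scale, with launch-window cohorts dramatically out-retaining later ones.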

For model developers, this implies that timing matters enormously. Launch when your model represents a genuine capability jump, and you capture sticky users. Launch into a crowded field of comparable options, and you compete purely on price.

What the Data Actually Shows

Strip away the impressive token counts and the data tells a simpler story. OpenRouter serves a specific community: indie developers, researchers, and a substantial population of roleplay enthusiasts. Within that community, programming and entertainment dominate. Enterprise usage of AI for finance, legal, and healthcare either happens elsewhere or hasn't materialized at the scale hype suggests.

Growth appears linear, not exponential. The charts show seasonal dips and recoveries, not hockey sticks. Outside programming, adoption looks mature rather than accelerating.

The open-source ecosystem has genuinely diversified. DeepSeek no longer dominates. Chinese models have captured meaningful share. Medium-sized models are finding product-market fit. These are real structural shifts.

But they're shifts within a particular segment of the market, one that may not represent where the real money flows or where the transformative use cases, if they exist, will emerge.

Why This Matters

• For investors: The "AI is transforming everything" thesis needs stress-testing. This data suggests consumer entertainment and developer productivity drive actual usage, with other verticals lagging far behind. Valuations premised on universal enterprise adoption may be premature.

• For model developers: First-to-solve creates durable advantages. The window to capture foundational users is narrow and tied to genuine capability jumps, not incremental improvements.

• For enterprise buyers: The absence of your industry from this dataset isn't necessarily bad news. It might mean serious players are routing through enterprise channels with better security and compliance, exactly where they should be.

❓ Frequently Asked Questions

Q: What is OpenRouter and why does it matter for this study?

A: OpenRouter is an API aggregator that routes requests to various AI model providers (OpenAI, Anthropic, Google, open-source models) for a 5.5% fee. Users pay one provider instead of managing multiple accounts. This matters because OpenRouter's customer base skews toward indie developers and researchers, not enterprise users, which shapes what the data can and cannot tell us.

Q: Does the "small models are declining" finding hold up?

A: Probably not in the broader market. Small models (under 15 billion parameters) are exactly those that can be self-hosted on consumer GPUs. OpenRouter would never see that traffic. A developer running Llama 7B locally doesn't show up in these statistics. The decline likely reflects users moving to self-hosting, not abandoning small models entirely.

Q: How did they determine what people are using AI for?

A: OpenRouter samples 0.25% of prompts and runs them through Google Cloud's Natural Language classifier, which assigns categories like "Programming," "Roleplay," or "Finance." The researchers only see the category labels, not the actual prompts. This means classification depends on Google's taxonomy, and some categories may be misattributed, particularly "roleplay" which could capture legitimate "act as an expert" prompts.
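As described, the pipeline is easy to sketch: sample a small fraction of prompts, send each to an external classifier, and keep only the returned label. In this Python sketch the `classify` function is a crude keyword stand-in for Google's classifier, and the prompts are invented:

```python
import random
from collections import Counter

SAMPLE_RATE = 0.0025  # the 0.25% sampling rate described in the report

def classify(prompt: str) -> str:
    """Crude stand-in for an external topic classifier (the real
    pipeline calls Google Cloud's Natural Language API)."""
    return "Roleplay" if "act as" in prompt.lower() else "Other"

def category_shares(prompts, seed=0):
    """Sample prompts, classify each, and return per-category shares."""
    rng = random.Random(seed)
    sampled = [p for p in prompts if rng.random() < SAMPLE_RATE]
    labels = Counter(classify(p) for p in sampled)
    total = sum(labels.values()) or 1  # guard against an empty sample
    return {cat: n / total for cat, n in labels.items()}

prompts = ["act as a wizard"] * 20_000 + ["write a SQL query"] * 20_000
print(category_shares(prompts))
```

Note how a keyword stand-in like this would tag every "act as an expert" prompt as roleplay, which is exactly the misattribution risk: the results inherit whatever boundaries the classifier draws.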

Q: What are "reasoning models" and why did their share grow so fast?

A: Reasoning models like OpenAI's o1 and o3 perform multi-step deliberation before generating output, rather than producing text in a single pass. Their share grew from negligible in early 2025 to over 50% of all tokens by late 2025. This reflects both new model releases (GPT-5, Claude 4.5, Gemini 3) and user preference for models that can handle complex, multi-step tasks.

Q: What's the "Glass Slipper" effect and why should developers care?

A: When a model launches and happens to be the first that solves a user's specific problem, those users stick around far longer than later adopters. GPT-4o Mini's July 2024 cohort retained at dramatically higher rates than any subsequent cohort. For developers, this means timing a launch to coincide with a genuine capability jump matters more than incremental improvements in a crowded field.
