What GPT-5 Actually Costs: Breaking Down OpenAI's Complex Pricing Structure

GPT-5's $1.25 per million tokens looks cheap until reasoning kicks in. A simple coding task can cost 5x more than expected. The three-tier system and hidden thinking tokens turn OpenAI's pricing into a puzzle where only $200/month guarantees access.

GPT-5's Pricing Puzzle: The Real Cost of Running OpenAI's Latest Model

💡 TL;DR - The 30-Second Version

💰 GPT-5 costs $1.25 per million input tokens—half of GPT-4o's price—but invisible reasoning tokens at $10/million can multiply bills by 5x.

📊 Three tiers offer different accuracy: GPT-5 fixes 74.9% of bugs, mini gets 71%, nano manages 54.7%—cheaper models need more retries.

🎚️ Raising reasoning from "minimal" to "high" can push a 5,000-token task from roughly $20 to $120 per thousand requests: the same answers, just with more thinking.

💳 ChatGPT auto-switches models based on subscription: free users drop to mini quickly, Plus ($20) gets higher limits, only Pro ($200) guarantees GPT-5.

📚 Processing a 200,000-word novel costs about $0.34 per pass in input alone ($340 per thousand passes), with retrieval accuracy dropping from 95% to 87% at maximum length.

⚠️ Businesses can't predict AI costs when models switch automatically and reasoning multipliers stay hidden until the bill arrives.

OpenAI priced GPT-5 to kill the competition. At $1.25 per million input tokens, it costs half what GPT-4o charges and undercuts Claude Sonnet 4 by 58%. But the real bill tells a different story.

The sticker price hides a web of multipliers. GPT-5 runs at four reasoning levels. Pick "high" for complex tasks and those invisible thinking tokens pile up at $10 per million. A coding task that looks cheap at first can cost five times more than expected. You pay for every step the model thinks through, but OpenAI won't show you what you bought.

The Three-Tier Trap

GPT-5 comes in three versions. The main model costs $1.25 per million input tokens and $10 per million output. Mini drops to $0.25 input and $2 output. Nano bottoms out at $0.05 input and $0.40 output.
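
Those rates are easy to drop into a quick calculator. The sketch below uses only the per-million-token prices quoted above; the model names in the table are labels for this example, not official SDK identifiers, and caching discounts are ignored.

```python
# Per-million-token rates quoted in this article (USD). Illustrative only;
# check OpenAI's pricing page before relying on these numbers.
PRICES = {
    "gpt-5":      {"input": 1.25, "output": 10.00},
    "gpt-5-mini": {"input": 0.25, "output": 2.00},
    "gpt-5-nano": {"input": 0.05, "output": 0.40},
}

def call_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Rough cost of one API call, ignoring caching discounts."""
    rate = PRICES[model]
    return (input_tokens * rate["input"] + output_tokens * rate["output"]) / 1_000_000

# Example: a 5,000-token prompt that returns 2,000 tokens.
for model in PRICES:
    print(f"{model}: ${call_cost(model, 5_000, 2_000):.4f}")
```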

The performance gaps match the price cuts. GPT-5 fixes software bugs correctly 74.9% of the time. Mini manages 71%. Nano drops to 54.7%. That 20-point difference between GPT-5 and nano means you'll run prompts multiple times to get working code. The cheaper model becomes expensive through repetition.

Here's what catches developers: ChatGPT automatically switches between these models based on your subscription and usage. Free users start with GPT-5 but quickly drop to mini versions after hitting undisclosed limits. The $20 Plus tier gets "significantly higher" limits—OpenAI won't say what that means. Only the $200 Pro subscription guarantees consistent GPT-5 access.

The Reasoning Tax

The reasoning parameter changes everything. Set it to "minimal" and tokens stream back fast with basic processing. Bump it to "high" and the model burns through thinking tokens before producing output. Those thinking tokens cost the same $10 per million as regular output.

A 5,000-token coding problem at minimal reasoning might return about 2,000 output tokens. The same problem at high reasoning could generate 10,000 tokens of thinking plus 2,000 tokens of actual code. Across a thousand such requests, the output bill alone jumps from $20 to $120 for the same answers, just with more invisible processing; input adds only about $6 either way.
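
Spelled out in code, assuming the token counts above and that hidden thinking tokens bill at the $10-per-million output rate, as the article describes:

```python
INPUT_RATE = 1.25 / 1_000_000    # USD per input token
OUTPUT_RATE = 10.00 / 1_000_000  # USD per output token (thinking tokens bill at this rate)

def task_cost(input_tokens: int, visible_output: int, thinking_tokens: int) -> float:
    """Cost of one call: input plus all output-billed tokens, visible or not."""
    return input_tokens * INPUT_RATE + (visible_output + thinking_tokens) * OUTPUT_RATE

minimal = task_cost(5_000, 2_000, 0)        # ~$0.026 per call
high = task_cost(5_000, 2_000, 10_000)      # ~$0.126 per call

print(f"minimal: ${minimal:.3f}/call, ${minimal * 1_000:.0f} per 1,000 calls")
print(f"high:    ${high:.3f}/call, ${high * 1_000:.0f} per 1,000 calls")
```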

Simon Willison discovered this during his preview testing. The API gives no visibility into reasoning tokens until the bill arrives. Developers face an absurd choice: pay for thinking they can't see or turn reasoning down and lose accuracy.

Comparing the Competition

GPT-5's $1.25 per million input tokens beats most rivals. Claude Opus 4 charges $15—twelve times more. Gemini 2.5 Pro ranges from $1.25 to $2.50 depending on context length. Only Amazon's Nova models price lower, with Nova Micro at $0.035 per million.

But raw price comparisons miss the point. GPT-5 scores 96.7% on tool-calling benchmarks where Gemini barely breaks 85%. For complex tasks requiring multiple API calls, GPT-5's efficiency saves money despite higher per-token costs. It uses 22% fewer output tokens and 45% fewer tool calls than o3 for the same bug fixes.

The 90% discount on cached input tokens matters for chat applications. Build a conversation interface and recently repeated context costs one-tenth the normal rate. A 50-message conversation whose input would otherwise cost about $5 drops to under $1 with caching.
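
A rough sketch of where that saving comes from, assuming a context that grows by about 3,000 tokens per message and is replayed in full each turn; those growth numbers are illustrative assumptions, not measurements, and output costs are ignored here.

```python
# Input-cost sketch for a 50-message chat, with and without prompt caching.
# Assumptions (illustrative, not OpenAI's): context grows ~3,000 tokens per
# message, the full prior context is replayed every turn, and cached input
# bills at one-tenth of the $1.25/M rate. Output tokens are ignored.
FULL_RATE = 1.25 / 1_000_000
CACHED_RATE = FULL_RATE / 10
TOKENS_PER_TURN = 3_000
MESSAGES = 50

uncached = cached = 0.0
for turn in range(1, MESSAGES + 1):
    context = TOKENS_PER_TURN * turn  # total input sent this turn
    uncached += context * FULL_RATE
    # With caching, only the newest turn pays the full rate; the replayed
    # prefix bills at the discounted rate.
    cached += TOKENS_PER_TURN * FULL_RATE + (context - TOKENS_PER_TURN) * CACHED_RATE

print(f"without caching: ${uncached:.2f}")  # ~$4.78
print(f"with caching:    ${cached:.2f}")    # ~$0.65
```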

The Long Context Problem

GPT-5 handles 272,000 input tokens—about 200,000 words. Feed it a full novel and the input alone runs about $0.34 per pass, or $340 per thousand passes, before any reasoning and output tokens are added.

Accuracy drops with length too. At 128,000 tokens, GPT-5 retrieves information correctly 95% of the time. Push to maximum length and accuracy falls to 87%. You pay full price for degraded performance.

The Bottom Line

GPT-5's pricing works like a mobile phone plan. The advertised rate looks great until you count the overages, add-ons, and fine print. A task that seems to cost pennies can hit dollars once reasoning kicks in.

For simple queries, nano at $0.05 per million input tokens is about as cheap as the market gets; only Amazon's Nova Micro undercuts it. For mission-critical code where accuracy matters, paying GPT-5's premium saves money by avoiding mistakes. The challenge lies in predicting which model and reasoning level each task needs—a calculation OpenAI leaves entirely to you.

Why this matters:

• The real cost of GPT-5 depends more on reasoning settings than token prices—developers must run expensive experiments to find the right balance for each use case

• ChatGPT's automatic model switching means businesses can't predict or control their AI costs—only the $200 Pro tier guarantees consistent model access

❓ Frequently Asked Questions

Q: How does the 90% cached token discount work?

A: Input tokens reused within the past few minutes cost one-tenth the normal rate. A chat conversation replaying the same 10,000-token context pays about 1.25 cents for it on the first request, then about 0.125 cents on each subsequent message once cached. The discount applies to input tokens only.

Q: How do I estimate my costs before running a prompt?

A: Count your input tokens (roughly 1.3 tokens per word), multiply by the model's input rate, then estimate output at the output rate. At high reasoning, expect 3-5x more thinking tokens than visible output, all billed at the output rate. A 1,000-word prompt generating 500 words at high reasoning works out to roughly 3-4 cents on GPT-5 and well under a cent on nano.
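
As a sketch, that rule of thumb looks like this; the 1.3 tokens-per-word ratio and the thinking multiplier are rough heuristics from this FAQ, not API guarantees.

```python
def estimate_cost(prompt_words: int, output_words: int,
                  input_rate: float, output_rate: float,
                  thinking_multiplier: float = 0.0) -> float:
    """Rough per-call cost in USD. Rates are per million tokens; the thinking
    multiplier scales visible output to approximate hidden reasoning tokens
    (e.g. 4.0 for high reasoning)."""
    input_tokens = prompt_words * 1.3
    output_tokens = output_words * 1.3 * (1 + thinking_multiplier)
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000

# 1,000-word prompt, 500-word answer, high reasoning (~4x thinking tokens)
print(f"GPT-5: ${estimate_cost(1000, 500, 1.25, 10.00, 4.0):.3f}")  # ~$0.034
print(f"nano:  ${estimate_cost(1000, 500, 0.05, 0.40, 4.0):.4f}")   # ~$0.0014
```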

Q: What happens when I hit my ChatGPT usage limit?

A: Free users drop to mini versions immediately after hitting undisclosed caps. Plus subscribers fall back to mini only after exhausting their higher limits. The interface doesn't warn you when the switch happens—your next response just comes from a weaker model. Only Pro users never hit limits.

Q: Is Amazon Nova really cheaper than GPT-5 nano?

A: On input, yes: Nova Micro costs $0.035 per million tokens versus nano's $0.05, about 30% cheaper, and Nova's output rates are lower too (Nova Lite charges $0.24 per million against nano's $0.40). But Amazon hasn't published comparable benchmarks, and GPT-5 nano likely performs better on complex tasks.

Q: Which reasoning level should I use for different tasks?

A: Use minimal for simple queries and chat. Low works for basic coding and writing. Medium handles most professional tasks. High suits complex debugging, math proofs, and multi-step analysis. Visual tasks barely improve past low. Start low and increase only if results disappoint.
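
For API users, the effort level is set per request. The sketch below uses the OpenAI Python SDK's Responses endpoint with the reasoning-effort parameter as documented around GPT-5's launch; treat the exact parameter shape as something to verify against current docs, and the prompts are placeholders.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Start cheap: "minimal" effort for a routine query. Raise the effort only
# if the answer disappoints, since thinking tokens bill at the output rate.
response = client.responses.create(
    model="gpt-5",
    reasoning={"effort": "minimal"},
    input="Summarize this changelog in three bullet points: ...",
)
print(response.output_text)

# Same interface at "high" effort: potentially several times the
# output-token charge once hidden reasoning is counted.
response = client.responses.create(
    model="gpt-5",
    reasoning={"effort": "high"},
    input="Find the race condition in this code and propose a fix: ...",
)
print(response.output_text)
```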

Q: How does ChatGPT decide which model to route my query to?

A: OpenAI's router analyzes conversation type, complexity, tool needs, and keywords like "think hard about this." Simple questions go to fast models. Complex queries trigger reasoning models. The router's logic stays secret, making costs unpredictable. API users choose models manually, avoiding this uncertainty.

Q: Can I see how many reasoning tokens I'm being charged for?

A: No. The API returns total tokens used but doesn't break down thinking versus output. ChatGPT shows thinking summaries but not token counts. You discover the real cost only when the bill arrives. This makes budgeting impossible for reasoning-heavy tasks.

Your billing was not updated.