Anthropic released Claude Sonnet 5 on June 30 with a simple pitch: most of Opus 4.8's capability at well under half the cost. On the company's own agent benchmarks, the model trails its flagship by a few points on coding and slips narrowly ahead on knowledge work. The pitch rests entirely on that pairing, near-Opus performance at a mid-tier price. Whether it pays off comes down to which tasks a team runs against the model, and to a few catches Anthropic lists in its own disclosures.

Key Takeaways

AI-generated summary, reviewed by an editor. More on our AI guidelines.

Where Sonnet 5 is the right default

Anthropic and its early access partners point to the same kind of workload: agents that run constantly and cannot afford to stall. Daniel Shepard, a senior engineer at Zapier, handed the model a two-part job: update Salesforce account tiers, then send a launch announcement to enterprise contacts. It finished end to end. "That used to stall halfway," Shepard said. "For day-to-day automation, it's a no-brainer." Cursor co-founder Sualeh Asif said Sonnet 5 agents "stay on plan, follow our conventions, and ship clean multi-step changes, all at an efficient cost," and Cursor's own CursorBench score moved from 49% on Sonnet 4.6 to 57% on Sonnet 5. Zimu Li, a member of technical staff at Factory, called the model "a strong execution layer for multi-step software engineering work," singling out sustained coding, tool use, and debugging "across messy technical contexts" and workflows "where follow-through and technical grounding matter."

That reliability gain matters more than the raw benchmark gap for three categories of work. Long-horizon coding and refactors benefit from a model that finishes rather than one that scores marginally higher but stalls partway; AWS says the model is "designed to navigate real codebases, land multi-file changes, and carry longer debugging and refactoring tasks through to completion." Browser and terminal automation, the category AWS highlights for financial services clients running spreadsheet modeling and self-auditing reporting agents, rewards completion consistency over peak intelligence. And knowledge work, such as synthesizing a long research document into a brief, is the one category where Anthropic's own numbers put Sonnet 5 ahead of Opus 4.8 rather than behind it.

The token math, and where it actually saves money

A team paying Opus 4.8's $5 input and $25 output rate for an agent that does not need Opus-level judgment can move to Sonnet 5 at $2 and $10, a nominal cut of roughly 60% through August and closer to 40% once standard pricing takes effect in September. The saving is real for high-volume, low-judgment agent loops. Picture a customer-service bot that fires hundreds of times a day and kicks only the hardest cases up to Opus. That is the fit.
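
The percentages above can be checked with a few lines of arithmetic. The sketch below uses the per-million-token prices quoted in this article; the monthly workload (400 million input tokens, 80 million output tokens) is a hypothetical figure chosen only for illustration, and it assumes both models consume the same number of tokens for the same job.

```python
# Prices in USD per million tokens, as quoted in the article.
OPUS_4_8 = {"input": 5.00, "output": 25.00}
SONNET_5_INTRO = {"input": 2.00, "output": 10.00}     # through August 31
SONNET_5_STANDARD = {"input": 3.00, "output": 15.00}  # from September 1


def percent_cut(baseline: float, new: float) -> float:
    """Percentage reduction when moving from `baseline` to `new`."""
    return (baseline - new) / baseline * 100


def monthly_cost(prices: dict, input_mtok: float, output_mtok: float) -> float:
    """Spend for a workload measured in millions of input and output tokens."""
    return prices["input"] * input_mtok + prices["output"] * output_mtok


# Hypothetical workload: 400M input tokens and 80M output tokens per month.
INPUT_MTOK, OUTPUT_MTOK = 400, 80
opus_cost = monthly_cost(OPUS_4_8, INPUT_MTOK, OUTPUT_MTOK)

for label, prices in [
    ("Opus 4.8", OPUS_4_8),
    ("Sonnet 5 (intro)", SONNET_5_INTRO),
    ("Sonnet 5 (standard)", SONNET_5_STANDARD),
]:
    cost = monthly_cost(prices, INPUT_MTOK, OUTPUT_MTOK)
    print(f"{label}: ${cost:,.0f}/month, {percent_cut(opus_cost, cost):.0f}% below Opus 4.8")
```

Because Sonnet 5's input and output prices sit at the same ratio to Opus 4.8's, the nominal cut does not depend on the input/output mix. What the sketch cannot capture is whether the two models actually spend the same number of tokens on a task.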

Teams migrating from Sonnet 4.6 face a catch here. The new tokenizer, shared with Opus 4.7, generates about 30% more tokens for the same input text, and as much as 35% more (1.35 times the count) on some content. Anthropic set the introductory price to absorb that increase relative to Sonnet 4.6. It did not calibrate against Opus 4.8, so the per-token comparison to Opus still holds. The introductory rate is also temporary. The $2/$10 pricing expires September 1, when input pricing rises 50% to $3 and output pricing rises 50% to $15, regardless of how usage changes.
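
A rough way to see what the tokenizer change does to a Sonnet 4.6 bill is to multiply the list price by the token inflation. The sketch below does that for input pricing only, using the figures quoted above. The function names are illustrative, and treating the multiplier as uniform across a whole workload is an assumption, since real inflation varies by content.

```python
SONNET_4_6_PRICE = 3.00         # USD per million input tokens (article's figure)
SONNET_5_INTRO_PRICE = 2.00     # through August 31
SONNET_5_STANDARD_PRICE = 3.00  # from September 1


def effective_price(list_price: float, token_multiplier: float) -> float:
    """Cost of the text that used to fit in one million old-tokenizer tokens."""
    return list_price * token_multiplier


def change_vs_old(new_effective: float, old_price: float) -> float:
    """Percentage change in cost for the same text; negative means cheaper."""
    return (new_effective - old_price) / old_price * 100


# 1.30 and 1.35 are the typical and upper-end multipliers quoted in the article.
for multiplier in (1.30, 1.35):
    intro = effective_price(SONNET_5_INTRO_PRICE, multiplier)
    standard = effective_price(SONNET_5_STANDARD_PRICE, multiplier)
    print(
        f"x{multiplier:.2f} tokens: intro ${intro:.2f} "
        f"({change_vs_old(intro, SONNET_4_6_PRICE):+.0f}%), "
        f"standard ${standard:.2f} "
        f"({change_vs_old(standard, SONNET_4_6_PRICE):+.0f}%)"
    )
```

Under these assumptions the introductory rate stays below Sonnet 4.6's $3 even at the 1.35 multiplier, while the standard rate, applied to the larger token count, would cost more than Sonnet 4.6 did for the same text.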

Anthropic's own effort-level dial points to the test worth running. A team can run the same workload on both models and compare completion rate and total token spend against its real prompts rather than published benchmarks, then decide whether the mid-tier savings survive its specific pipeline.
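
One way to score such a side-by-side run is sketched below. It assumes the team has already logged, for each task, which model ran it, whether it completed, and how many tokens it used. The `RunRecord` type, the model labels in `PRICES`, and the `summarize` helper are hypothetical names for this illustration, not part of any Anthropic SDK, and the prices are the introductory and Opus rates quoted in this article.

```python
from dataclasses import dataclass


@dataclass
class RunRecord:
    """One logged task attempt (hypothetical schema)."""
    model: str
    completed: bool
    input_tokens: int
    output_tokens: int


# (input, output) USD per million tokens; keys are labels, not API model strings.
PRICES = {"sonnet-5": (2.00, 10.00), "opus-4.8": (5.00, 25.00)}


def summarize(records: list[RunRecord], model: str) -> dict:
    """Completion rate, total spend, and spend per completed task for one model."""
    runs = [r for r in records if r.model == model]
    if not runs:
        raise ValueError(f"no runs logged for {model}")
    in_price, out_price = PRICES[model]
    total_cost = sum(
        r.input_tokens * in_price + r.output_tokens * out_price for r in runs
    ) / 1_000_000
    completed = sum(1 for r in runs if r.completed)
    return {
        "completion_rate": completed / len(runs),
        "total_cost": total_cost,
        # Failed runs still cost money, so divide by completions, not attempts.
        "cost_per_completed": total_cost / completed if completed else float("inf"),
    }
```

Cost per completed task is the figure to watch: a cheaper model that fails more often can narrow or erase its per-token advantage once retries and escalations are counted.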

How it stacks up

| Model | Price (in/out per 1M tokens) | Best for | Weak point |
| --- | --- | --- | --- |
| Claude Sonnet 5 | $2/$10 intro to Aug 31, then $3/$15 | High-volume agents, long-horizon coding, knowledge work | Trails Opus on the hardest coding; capped by design on cyber tasks |
| Claude Opus 4.8 | $5/$25 | Highest-accuracy coding and judgment calls | More than double Sonnet 5's cost for agent loops that don't need it |
| Claude Sonnet 4.6 | $3/$15 | Existing deployments not yet migrated | 58.1% agentic coding vs. Sonnet 5's 63.2%; no cyber safeguards |
| GPT-5.5 | $5/$30 | Teams already standardized on OpenAI's stack | Priciest of the group on output; ties Opus 4.8 on input |
| Gemini 3.1 Pro | Higher than Sonnet 5 | Google Cloud-native workflows | Undercut on price by Sonnet 5's intro rate |
| Gemini 3.5 Flash | Below Sonnet 5 | Cost-floor workloads where full agentic capability isn't needed | No published head-to-head benchmark against Sonnet 5 in current reporting |

Where it falls short

Three limits show up in Anthropic's own disclosures, not in outside testing. Opus 4.8 still leads on the hardest coding, by six points on agentic coding and by a narrower 2.3 points on Terminal-Bench 2.1, 82.7% to Sonnet 5's 80.4%. A team whose agent handles the top tenth of difficulty, the bugs that stump Sonnet 4.6 and Sonnet 5 alike, still needs Opus in the loop, at least as an escalation path.
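
What an escalation path does to the bill can be estimated with a simple expected-cost formula. The sketch below assumes every task is first attempted on Sonnet 5, that a fixed fraction is then re-run on Opus 4.8, and that each model has a known average cost per task. The function names and the $0.40 and $1.00 per-task costs are hypothetical, chosen to mirror the 2.5x introductory price gap.

```python
def blended_cost_per_task(sonnet_cost: float, opus_cost: float,
                          escalation_rate: float) -> float:
    """Expected cost per task when Sonnet runs first and a fraction escalates.

    Escalated tasks pay for both the Sonnet attempt and the Opus re-run.
    """
    if not 0.0 <= escalation_rate <= 1.0:
        raise ValueError("escalation_rate must be between 0 and 1")
    return sonnet_cost + escalation_rate * opus_cost


def break_even_escalation_rate(sonnet_cost: float, opus_cost: float) -> float:
    """Escalation rate at which Sonnet-first costs the same as Opus-only."""
    return (opus_cost - sonnet_cost) / opus_cost


# Hypothetical average costs per task, in USD.
SONNET_TASK, OPUS_TASK = 0.40, 1.00

print(blended_cost_per_task(SONNET_TASK, OPUS_TASK, escalation_rate=0.10))
print(break_even_escalation_rate(SONNET_TASK, OPUS_TASK))
```

With these illustrative numbers, escalating a tenth of tasks leaves the blended cost at about half of an Opus-only pipeline, and the Sonnet-first design only stops saving money once roughly 60% of tasks need Opus. Real break-even points depend on measured per-task costs, not list prices alone.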

Dangerous cyber work is capped by design, even though the model still handles routine, non-harmful cyber tasks. Anthropic says it did not train Sonnet 5 on cyber tasks, and on a Firefox 147 exploit-development evaluation built with Mozilla, the model never produced a working exploit, a 0% score against Opus 4.8's 68.8% and Mythos 5's 88.4%. Sonnet 5 ships with the same real-time cybersecurity safeguards as Opus 4.7 and 4.8, and Anthropic says requests involving prohibited or high-risk cybersecurity topics may be refused. Anthropic built that ceiling deliberately, and its own guidance recommends Opus 4.8 instead for cybersecurity work that requires reduced guardrails.

The price has an expiration date baked in. The introductory rate expires September 1, and Priority Tier, an option available on other current Claude models, is not offered for Sonnet 5 at all. A migration planned entirely around June's headline price will look different on a September invoice.

The verdict

Sonnet 5 fits the agent that runs all day and mostly succeeds. It can absorb much of what is currently routed to Opus 4.8, such as a long refactor or an always-on reporting agent. The test is cheap. Swap the model string, point it at real prompts for a week, and check whether the completion rate holds before assuming the savings do. It is a poor fit for the hardest coding problems, for anything touching offensive security, and for any budget that treats the $2/$10 rate as permanent, since that rate expires September 1.

Frequently Asked Questions

What does Claude Sonnet 5 cost?

$2 per million input tokens and $10 per million output tokens through August 31, 2026, then $3 and $15 per million tokens after that, according to Anthropic.

Is Sonnet 5 better than Opus 4.8?

Not on the hardest coding and reasoning tasks, where Opus 4.8 leads by about six points on Anthropic's agentic-coding benchmark. Anthropic says Sonnet 5 slightly edges Opus on one knowledge-work benchmark, GDPval-AA v2.

Can Sonnet 5 be used for cybersecurity or offensive-security work?

Anthropic did not train it for cyber tasks, and it never produced a working exploit in a Firefox 147 test built with Mozilla. The company recommends Opus 4.8 instead for cybersecurity work that needs reduced guardrails.

Will Sonnet 5 stay this cheap?

No. The $2/$10 introductory rate is temporary and resets to $3/$15 per million tokens on September 1, 2026, a 50% increase on both input and output pricing.

Editor-in-Chief and founder of Implicator.ai. Former ARD correspondent and senior broadcast journalist with 10+ years covering tech. Writes daily briefings on policy and market developments. Based in San Francisco. E-mail: editor@implicator.ai