Liang Wenfeng's team removed the strikethroughs from DeepSeek's API pricing page on Saturday. The promotional 75 percent discount on V4 Pro, set to expire May 31 at 15:59 UTC, was now permanent. Where the page had listed $1.74 per million uncached input tokens, struck through in favor of $0.435, it now showed only the lower number. Output tokens fell from a struck-through $3.48 to $0.87.
The promotional price is now the standard price, and the standard runs on Huawei chips. DeepSeek can hold it because it trains and serves V4 on domestic silicon, carries no IPO clock, and treats inference as a commodity input rather than a premium product. Western labs cannot match $0.87 per million output tokens without rewriting the revenue assumptions their valuations depend on.
Key Takeaways
- DeepSeek made its 75% V4 Pro discount permanent Saturday, locking output tokens at $0.87/M — 34.5× cheaper than GPT-5.5
- The price is sustainable because DeepSeek runs on Huawei Ascend 950 chips, with 750K units shipping in 2026 and expanding domestic supply
- Enterprise buyers face a new benchmark: good-enough performance at a fraction of premium pricing, though geopolitical risk limits regulated adoption
- Anthropic accuses DeepSeek of distillation attacks on Claude; if substantiated, the price gap reflects IP arbitrage, not engineering efficiency
AI-generated summary, reviewed by an editor. More on our AI guidelines.
The math that just got locked in
The numbers, laid side by side, look like a pricing error:
API Price · USD per 1M tokens
| Model | Input | Output | vs. V4 Pro |
|---|---|---|---|
| DeepSeek V4 Pro | $0.435 | $0.87 | baseline |
| DeepSeek V4 Flash | $0.14 | $0.28 | 3.1× cheaper |
| Google Gemini 3.5 Flash | $0.15 | $0.60 | 1.5× cheaper |
| OpenAI GPT-5.5 standard | $5.00 | $30.00 | 34.5× more |
| OpenAI GPT-5.5 ≥272K context | $10.00 | $45.00 | 51.7× more |
| Anthropic Claude Opus 4.7 | $5.00 | $25.00 | 28.7× more |
Output-token comparison against DeepSeek V4 Pro. Source: published API rate cards, May 24, 2026.
Raw per-token pricing is not the whole picture, because token consumption per task varies between models. Anthropic's Opus 4.7 uses more tokens than its predecessor while GPT-5.5 uses fewer than GPT-5.4, yet both ended up 30 to 90 percent more expensive than the models they replaced. DeepSeek V4 Pro, a 1.6 trillion-parameter mixture-of-experts model that activates 49 billion parameters at inference, trails both on raw benchmarks. That performance gap would have to be enormous to justify a price gap now running from 28 to 51 times, depending on the comparison.
Artificial Analysis ran the numbers on Saturday. Running its full Intelligence Index benchmark costs $268 on V4 Pro. The same test costs about 12 times more on GPT-5.5 and 19 times more on Opus 4.7. DeepSeek said at V4's launch that the model would welcome "the era of cost-effective 1M context length." Saturday's update applied that line to the published rate card and removed the expiration date that had qualified it.
Get the edge in AI business strategy
Strategic AI news from San Francisco. No hype, no "AI will change everything" throat clearing. Just what moved, who won, and why it matters. Daily at 6am PST.
No spam. Unsubscribe anytime.
The chip that makes the price possible
When DeepSeek launched V4 last month, it said the Pro version would cost up to 12 times more than the Flash version because of "constraints in high-end compute capacity." A month later that gap has closed, and the company has not said what changed.
The Huawei Ascend 950 is the most likely answer. With V4, DeepSeek built its first model family tuned for Huawei's Ascend accelerators rather than Nvidia hardware, and Huawei aims to ship around 750,000 Ascend 950PR units during 2026, according to The Tech Portal. Tencent, Alibaba and ByteDance are reportedly competing for the same supply. How fast Huawei can scale that output is still constrained by export controls on advanced chipmaking equipment, even as its 2026 targets climb.
DeepSeek did not say whether the permanent cut was driven by cheaper chip supply, and it does not have to: a price cut is not a margin sacrifice when the underlying compute cost is falling. The company is only now entering its first funding round, and it carries nowhere near the revenue pressure that OpenAI, valued at $852 billion, and Anthropic, whose annualized revenue surged from $9 billion to $30 billion between late 2025 and April 2026, take into their approaching IPOs. Saturday's change reset where DeepSeek sits in the market rather than defending a margin.
What enterprise buyers just saw
Salesforce projects $300 million in Anthropic token spending this year. At DeepSeek's new permanent pricing, an equivalent volume costs a fraction of that figure. For enterprise accounts consuming millions of tokens daily, across agentic coding runs, document processing, and long-context reasoning, the savings are material.
The catch is the one that always applied to DeepSeek. A bank, a drugmaker or a federal agency does not pick an AI model on the output-token line alone; procurement runs through data handling, uptime, auditability, compliance posture and vendor risk before price enters the decision. Routing sensitive workloads through a Chinese provider also carries geopolitical and technical exposure that no per-token rate can offset.
Even so, the negotiation has shifted. If V4 Pro delivers good-enough performance at a fraction of premium pricing, enterprise buyers will cite it as a benchmark even when they never route production traffic through it. Every Western lab's procurement conversation now starts from a reference figure 34 times below the one on its own rate card.
Switching is easier than it sounds, because DeepSeek supports both OpenAI and Anthropic API formats. V4 Pro offers a one-million-token context window and up to 384,000 output tokens, competitive with any Western model on spec-sheet dimensions. Its main constraint is throughput: V4 Pro caps concurrency at 500 requests, against 2,500 for V4 Flash.
The accusation that shadows the price
Anthropic has publicly accused DeepSeek of "distillation attacks," improperly training on Claude's responses to improve its own models. The allegation, first made in February, remains unresolved. The Register reported this month that Anthropic claims Chinese labs "illicitly extract the innovations of American companies" through the practice. If substantiated, the finding would mean some of DeepSeek's capability advantage was built on Anthropic's research investment. The price differential would then reflect intellectual property arbitrage rather than engineering efficiency.
Anthropic has also urged Washington to tighten chip and model controls on China, warning of the consequences if "authoritarian governments" take the lead in AI. In Anthropic's telling, the distillation claim and the export-control lobbying make one case: DeepSeek's prices reflect borrowed research and protected domestic chips rather than a market it won on efficiency, and removing either advantage would push the price back up.
DeepSeek has not addressed the accusation in detail. It has instead locked in the lower prices and opened its first funding round, and Liang Wenfeng told investors this month that the company would keep developing open-source models while pursuing artificial general intelligence, according to Bloomberg. Making the discount permanent is the harder of those moves to reverse, because a published rate cut sets an expectation the company would have to unwind in public.
The 75 percent discount that was set to expire May 31 is now the standing price. For the past year Western labs charged premium rates for frontier capability and treated the gap with Chinese models as a temporary lead. Saturday's change removed the expiration date, and with it the assumption that the cheaper number was a promotion buyers could afford to wait out.
Frequently Asked Questions
What did DeepSeek announce?
DeepSeek made permanent the 75% discount on its flagship V4 Pro model on Saturday, May 23. Output tokens now cost $0.87 per million, down from $3.48. The promotion had been set to expire May 31.
How much cheaper is V4 Pro than Western models?
V4 Pro is 34.5× cheaper than GPT-5.5 on output tokens and 28.7× cheaper than Claude Opus 4.7. For long-context workloads above 272K tokens, the gap stretches past 50× against GPT-5.5.
Why can DeepSeek sustain these prices?
DeepSeek V4 runs on Huawei Ascend 950 accelerators rather than Nvidia hardware. Huawei aims to ship 750,000 Ascend 950PR units in 2026. DeepSeek also faces no IPO pressure, unlike OpenAI and Anthropic.
What are the risks for enterprise buyers?
Routing sensitive workloads through a Chinese AI provider carries geopolitical, compliance, and data-handling risks. Regulated industries like banking and pharma may not switch on price alone. DeepSeek also faces unresolved distillation accusations from Anthropic.
What is the distillation accusation?
Anthropic has publicly accused DeepSeek of improperly training on Claude responses. If substantiated, it means some of DeepSeek capability was built on Anthropic research investment — making the price gap reflect IP arbitrage rather than efficiency.
AI-generated summary, reviewed by an editor. More on our AI guidelines.



IMPLICATOR