Liang Wenfeng's team removed the strikethroughs from DeepSeek's API pricing page on Saturday. The promotional 75 percent discount on V4 Pro, set to expire May 31 at 15:59 UTC, was now permanent. Where the page had listed $1.74 per million uncached input tokens, struck through in favor of $0.435, it now showed only the lower number. Output tokens fell from a struck-through $3.48 to $0.87.

The promotional price is now the standard price, and the standard runs on Huawei chips. DeepSeek can hold it because it trains and serves V4 on domestic silicon, carries no IPO clock, and treats inference as a commodity input rather than a premium product. Western labs cannot match $0.87 per million output tokens without rewriting the revenue assumptions their valuations depend on.

Key Takeaways

AI-generated summary, reviewed by an editor. More on our AI guidelines.

The math that just got locked in

The numbers, laid side by side, look like a pricing error:

API Price · USD per 1M tokens

Model Input Output vs. V4 Pro
DeepSeek V4 Pro $0.435 $0.87 baseline
DeepSeek V4 Flash $0.14 $0.28 3.1× cheaper
Google Gemini 3.5 Flash $0.15 $0.60 1.5× cheaper
OpenAI GPT-5.5 standard $5.00 $30.00 34.5× more
OpenAI GPT-5.5 ≥272K context $10.00 $45.00 51.7× more
Anthropic Claude Opus 4.7 $5.00 $25.00 28.7× more

Output-token comparison against DeepSeek V4 Pro. Source: published API rate cards, May 24, 2026.

Raw per-token pricing is not the whole picture, because token consumption per task varies between models. Anthropic's Opus 4.7 uses more tokens than its predecessor while GPT-5.5 uses fewer than GPT-5.4, yet both ended up 30 to 90 percent more expensive than the models they replaced. DeepSeek V4 Pro, a 1.6 trillion-parameter mixture-of-experts model that activates 49 billion parameters at inference, trails both on raw benchmarks. That performance gap would have to be enormous to justify a price gap now running from 28 to 51 times, depending on the comparison.

Artificial Analysis ran the numbers on Saturday. Running its full Intelligence Index benchmark costs $268 on V4 Pro. The same test costs about 12 times more on GPT-5.5 and 19 times more on Opus 4.7. DeepSeek said at V4's launch that the model would welcome "the era of cost-effective 1M context length." Saturday's update applied that line to the published rate card and removed the expiration date that had qualified it.

The chip that makes the price possible

When DeepSeek launched V4 last month, it said the Pro version would cost up to 12 times more than the Flash version because of "constraints in high-end compute capacity." A month later that gap has closed, and the company has not said what changed.

The Huawei Ascend 950 is the most likely answer. With V4, DeepSeek built its first model family tuned for Huawei's Ascend accelerators rather than Nvidia hardware, and Huawei aims to ship around 750,000 Ascend 950PR units during 2026, according to The Tech Portal. Tencent, Alibaba and ByteDance are reportedly competing for the same supply. How fast Huawei can scale that output is still constrained by export controls on advanced chipmaking equipment, even as its 2026 targets climb.

DeepSeek did not say whether the permanent cut was driven by cheaper chip supply, and it does not have to: a price cut is not a margin sacrifice when the underlying compute cost is falling. The company is only now entering its first funding round, and it carries nowhere near the revenue pressure that OpenAI, valued at $852 billion, and Anthropic, whose annualized revenue surged from $9 billion to $30 billion between late 2025 and April 2026, take into their approaching IPOs. Saturday's change reset where DeepSeek sits in the market rather than defending a margin.

What enterprise buyers just saw

Salesforce projects $300 million in Anthropic token spending this year. At DeepSeek's new permanent pricing, an equivalent volume costs a fraction of that figure. For enterprise accounts consuming millions of tokens daily, across agentic coding runs, document processing, and long-context reasoning, the savings are material.

The catch is the one that always applied to DeepSeek. A bank, a drugmaker or a federal agency does not pick an AI model on the output-token line alone; procurement runs through data handling, uptime, auditability, compliance posture and vendor risk before price enters the decision. Routing sensitive workloads through a Chinese provider also carries geopolitical and technical exposure that no per-token rate can offset.

Even so, the negotiation has shifted. If V4 Pro delivers good-enough performance at a fraction of premium pricing, enterprise buyers will cite it as a benchmark even when they never route production traffic through it. Every Western lab's procurement conversation now starts from a reference figure 34 times below the one on its own rate card.

Switching is easier than it sounds, because DeepSeek supports both OpenAI and Anthropic API formats. V4 Pro offers a one-million-token context window and up to 384,000 output tokens, competitive with any Western model on spec-sheet dimensions. Its main constraint is throughput: V4 Pro caps concurrency at 500 requests, against 2,500 for V4 Flash.

The accusation that shadows the price

Anthropic has publicly accused DeepSeek of "distillation attacks," improperly training on Claude's responses to improve its own models. The allegation, first made in February, remains unresolved. The Register reported this month that Anthropic claims Chinese labs "illicitly extract the innovations of American companies" through the practice. If substantiated, the finding would mean some of DeepSeek's capability advantage was built on Anthropic's research investment. The price differential would then reflect intellectual property arbitrage rather than engineering efficiency.

Anthropic has also urged Washington to tighten chip and model controls on China, warning of the consequences if "authoritarian governments" take the lead in AI. In Anthropic's telling, the distillation claim and the export-control lobbying make one case: DeepSeek's prices reflect borrowed research and protected domestic chips rather than a market it won on efficiency, and removing either advantage would push the price back up.

DeepSeek has not addressed the accusation in detail. It has instead locked in the lower prices and opened its first funding round, and Liang Wenfeng told investors this month that the company would keep developing open-source models while pursuing artificial general intelligence, according to Bloomberg. Making the discount permanent is the harder of those moves to reverse, because a published rate cut sets an expectation the company would have to unwind in public.

The 75 percent discount that was set to expire May 31 is now the standing price. For the past year Western labs charged premium rates for frontier capability and treated the gap with Chinese models as a temporary lead. Saturday's change removed the expiration date, and with it the assumption that the cheaper number was a promotion buyers could afford to wait out.

Frequently Asked Questions

What did DeepSeek announce?

DeepSeek made permanent the 75% discount on its flagship V4 Pro model on Saturday, May 23. Output tokens now cost $0.87 per million, down from $3.48. The promotion had been set to expire May 31.

How much cheaper is V4 Pro than Western models?

V4 Pro is 34.5× cheaper than GPT-5.5 on output tokens and 28.7× cheaper than Claude Opus 4.7. For long-context workloads above 272K tokens, the gap stretches past 50× against GPT-5.5.

Why can DeepSeek sustain these prices?

DeepSeek V4 runs on Huawei Ascend 950 accelerators rather than Nvidia hardware. Huawei aims to ship 750,000 Ascend 950PR units in 2026. DeepSeek also faces no IPO pressure, unlike OpenAI and Anthropic.

What are the risks for enterprise buyers?

Routing sensitive workloads through a Chinese AI provider carries geopolitical, compliance, and data-handling risks. Regulated industries like banking and pharma may not switch on price alone. DeepSeek also faces unresolved distillation accusations from Anthropic.

What is the distillation accusation?

Anthropic has publicly accused DeepSeek of improperly training on Claude responses. If substantiated, it means some of DeepSeek capability was built on Anthropic research investment — making the price gap reflect IP arbitrage rather than efficiency.

AI-generated summary, reviewed by an editor. More on our AI guidelines.

DeepSeek Releases V4 Pro and Flash, Undercutting OpenAI Pricing by Up to 10x
Chinese AI startup DeepSeek on Friday released preview versions of two new flagship models, V4-Pro and V4-Flash, at prices that run roughly 3 to 10 times below comparable US frontier systems, accordin
California Says Amazon Pressured Levi's, Hanes to Raise Rival Retail Prices
California accused Amazon on Monday of pressuring brands including Levi's and Hanes to ask rival retailers to raise prices, according to a newly unsealed filing and a state evidence release in Califor
China's Budget Phone Era Is Over. AI and a Dutch Courtroom Killed It.
At Mobile World Congress in Barcelona last week, phone manufacturers showed off new devices without naming prices. Not because they hadn't decided. Because they couldn't. Memory costs were shifting to
Analysis

San Francisco

Editor-in-Chief and founder of Implicator.ai. Former ARD correspondent and senior broadcast journalist with 10+ years covering tech. Writes daily briefings on policy and market developments. Based in San Francisco. E-mail: [email protected]