MiniMax released M3 on Monday and said model weights and a technical report will follow within 10 days, leaving developers with API access before local inspection. The Shanghai company is offering M3 through MiniMax Code, token plans and an API, while its M3 model page lists a 1M-token context window and a guaranteed 512,000-token minimum for API use. In MiniMax's description, MiniMax Sparse Attention, or MSA, uses a pre-filtering step to pick relevant key-value blocks before the full attention calculation.
Key Takeaways
- MiniMax released M3 with a 1M-token context window and native multimodal input.
- The company reports 59.0% on SWE-Bench Pro and 74.2% on MCP Atlas.
- Standard API pricing lists $0.60 per million input and $2.40 per million output.
- MarketWatch reported a 16% Hong Kong share drop after the STAR Market disclosure.
AI-generated summary, reviewed by an editor. More on our AI guidelines.
The release is API-first
MiniMax's final section says the report and weights will be released over the next 10 days on Hugging Face and GitHub.
MiniMax calls M3 the first open-weight model to combine frontier coding, native multimodality and a 1M-token context window. By Monday afternoon, the public material consisted of the launch post, model page and API documentation, not a downloadable M3 checkpoint. The methodology notes also cite internal infrastructure, Mini-SWE-Agent, Claude Code scaffolding and MiniMax scoring choices for different tests.
Benchmark claims start with coding
MiniMax listed 59.0% on SWE-Bench Pro, 66.0% on Terminal-Bench 2.1 and 74.2% on MCP Atlas in its launch material. In a company-run Hopper test, M3 worked on FP8 matrix multiplication for about 24 hours, made 147 benchmark submissions and 1,959 tool calls, then raised hardware use from 7.6% to 71.3%.
MSA handles long context
According to the launch post, MSA partitions cached keys and values into blocks and reads each selected block once through a "KV outer gather Q" operator design. MiniMax says that design cuts per-token compute at a 1-million-token context length to one-twentieth of its previous-generation model.
The API page lists Anthropic-compatible and OpenAI-compatible endpoints. It also lists image and video inputs with text output, which puts M3 in the same product lane as coding agents that need repository text, screenshots, diagrams and long tool histories in one session.
Track the AI model race daily
Strategic AI news from San Francisco. No hype, no "AI will change everything" throat clearing. Just what moved, who won, and why it matters. Daily at 6am PST.
No spam. Unsubscribe anytime.
Pricing meets STAR filing
MiniMax lists standard API pricing up to 512,000 input tokens at $0.60 per million input and $2.40 per million output. VentureBeat cited lower first-week rates from MiniMax and platform partners, while MiniMax's own subscription plans start at $20 a month for about 1.7 billion M3 tokens.
Know someone who'd find this useful? ✉️ Email it to a friend in one click, or they can subscribe free here.
MarketWatch wrote Monday that MiniMax shares fell 16% in Hong Kong after the company released M3 and disclosed plans for a Shanghai STAR Market listing. The report cited a listing-guidance agreement with Citic Securities and a filing with the Shanghai bureau of the China Securities Regulatory Commission.
Weights remain pending
South China Morning Post wrote that MiniMax did not disclose M3's model size or the compute infrastructure used for training. The missing figures limit comparison with models whose parameter counts, chip clusters and training budgets are public.
Developers can test M3 now through MiniMax Code, subscription plans and API access. The next dated check is the 10-day release window MiniMax set for the technical report and weights, which points to roughly June 11.
Frequently Asked Questions
What is MiniMax M3?
MiniMax M3 is the Shanghai company's new model for coding agents, long-context work and multimodal inputs. MiniMax says it supports a 1M-token context window and image and video input with text output.
Is MiniMax M3 open-weight now?
MiniMax markets M3 as open-weight, but the draft treats that as a pledge. The company says it will publish model weights and a technical report within 10 days of launch.
What is MiniMax Sparse Attention?
MiniMax Sparse Attention is the architecture MiniMax says lets M3 handle long context efficiently. It filters relevant key-value blocks and avoids full attention across every context token.
How much does MiniMax M3 cost?
MiniMax lists standard pricing up to 512,000 input tokens at $0.60 per million input and $2.40 per million output. Subscription plans start at $20 a month.
Why did MiniMax shares fall?
MarketWatch reported that MiniMax shares fell 16% in Hong Kong after the M3 release and disclosure of plans for a Shanghai STAR Market listing.
AI-generated summary, reviewed by an editor. More on our AI guidelines.



IMPLICATOR