MiniMax M3 is MiniMax's open-weight flagship, a Mixture-of-Experts model with roughly 428B total parameters and about 23B active per token. It is natively multimodal, accepting text, image, and video input, and supports a 1M-token context window. The model is built on MiniMax Sparse Attention (MSA), which the team reports delivers more than 9x faster prefill and more than 15x faster decoding at 1M context versus M2. On agentic and coding benchmarks it scores 59.0% on SWE-Bench Pro, 66.0% on Terminal-Bench 2.1, 74.2% on MCP Atlas, 34.8% on SWE-fficiency, and 28.8% on KernelBench Hard.
A workable 428B-parameter MoE language model from MiniMax. Its standout result is graduate-level reasoning (GPQA, 93/100), so reach for it when that is the dimension that matters. It is newly released, so production-readiness is still being shaken out.
Generated from this model’s benchmarks and ranking signals. Editor reviews refine it over time.
See how different quantization levels affect VRAM requirements and quality for this model.
| Format | VRAM Required | Quality |
|---|---|---|
| Q2_K | 193.0 GB | Low |
| Q4_K_M (Recommended) | 197.8 GB | Good |
| Q5_K_M | 200.1 GB | Very Good |
| Q6_K | 202.9 GB | Excellent |
| Q8_0 | 208.7 GB | Near Perfect |
| FP16 | 230.5 GB | Full |
See which devices can run this model and at what quality level.
| Device | Grade | Speed | VRAM Required |
|---|---|---|---|
| | SS | 32.6 tok/s | 197.8 GB |
| | AA | 24.4 tok/s | 197.8 GB |
| | AA | 28.9 tok/s | 197.8 GB |
| | AA | 28.9 tok/s | 197.8 GB |
| | AA | 28.9 tok/s | 197.8 GB |
| | AA | 28.9 tok/s | 197.8 GB |
| SuperMicro Super AI Station (SuperMicro) | AA | 28.9 tok/s | 197.8 GB |
| Gigabyte W775-V10-L01 (Gigabyte) | AA | 28.9 tok/s | 197.8 GB |
| | BB | 3.3 tok/s | 197.8 GB |
| | BB | 3.3 tok/s | 197.8 GB |
| NVIDIA B200 GPU (NVIDIA) | BB | 32.6 tok/s | 197.8 GB |
| Google TPU v7 (Ironwood) (Google) | BB | 30.0 tok/s | 197.8 GB |
| | CC | 21.6 tok/s | 197.8 GB |
| | DD | 3.3 tok/s | 197.8 GB |
| | FF | 2.1 tok/s | 197.8 GB |
| | FF | 1.1 tok/s | 197.8 GB |
| | FF | 1.2 tok/s | 197.8 GB |
| | FF | 1.8 tok/s | 197.8 GB |
| | FF | 2.5 tok/s | 197.8 GB |
| | FF | 3.3 tok/s | 197.8 GB |
| | FF | 3.9 tok/s | 197.8 GB |
| | FF | 2.6 tok/s | 197.8 GB |
| | FF | 2.6 tok/s | 197.8 GB |
| Apple M4 (Apple) | FF | 0.5 tok/s | 197.8 GB |
| | FF | 2.2 tok/s | 197.8 GB |
Energy cost on AMD Instinct MI300X (~22 tok/s, Q4_K_M) vs flagship API pricing.
| Source | Details | Cost per 1M tokens |
|---|---|---|
| Local (energy only) | MiniMax M3 on AMD Instinct MI300X · ~22 tok/s · 750 W | $1.16 |
| GPT-5.5 | OpenAI · in $5.00 · out $30.00 | $12.50 |
| Claude Opus 4.7 Thinking | Anthropic · in $5.00 · out $25.00 | $11.00 |
| Gemini 3.5 Flash | Google · in $1.50 · out $9.00 | $3.75 |
| Grok 4.3 | xAI · in $1.25 · out $2.50 | $1.63 |
API prices blended at 70% input / 30% output.
Hardware amortisation not included. Run the full ROI calculator for payback math.
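
The figures in the cost table can be approximated with a few lines of arithmetic. The sketch below applies the stated 70% input / 30% output blend to the listed API prices, and estimates the local energy cost from throughput and power draw. The electricity rate is not given on this page, so `ASSUMED_USD_PER_KWH = 0.12` is an assumption chosen for illustration, and the function names are hypothetical. With the rounded ~22 tok/s figure the energy estimate lands close to, but not exactly on, the listed $1.16.

```python
# Sketch of the cost-per-1M-token arithmetic. Prices, throughput and wattage
# come from the table above; the electricity rate is an ASSUMPTION.

INPUT_SHARE = 0.70   # stated blend: 70% input tokens
OUTPUT_SHARE = 0.30  # stated blend: 30% output tokens
ASSUMED_USD_PER_KWH = 0.12  # assumption, not stated in the source


def blended_price(input_usd: float, output_usd: float) -> float:
    """Blend per-1M-token input and output prices at 70/30."""
    return INPUT_SHARE * input_usd + OUTPUT_SHARE * output_usd


def local_energy_cost(tokens_per_second: float, watts: float,
                      usd_per_kwh: float, tokens: int = 1_000_000) -> float:
    """Energy-only cost of generating `tokens` tokens at a steady rate."""
    hours = tokens / tokens_per_second / 3600
    kwh = hours * watts / 1000
    return kwh * usd_per_kwh


# (input $/1M, output $/1M) as listed in the table
API_PRICES = {
    "GPT-5.5": (5.00, 30.00),
    "Claude Opus 4.7 Thinking": (5.00, 25.00),
    "Gemini 3.5 Flash": (1.50, 9.00),
    "Grok 4.3": (1.25, 2.50),
}

if __name__ == "__main__":
    for name, (inp, out) in API_PRICES.items():
        print(f"{name}: ${blended_price(inp, out):.2f} per 1M tokens")
    local = local_energy_cost(22, 750, ASSUMED_USD_PER_KWH)
    print(f"Local (energy only): ${local:.2f} per 1M tokens")
```

Changing `ASSUMED_USD_PER_KWH` to a local tariff is the quickest way to see how sensitive the local figure is to electricity price. Hardware cost is deliberately left out, matching the note above.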
Cheapest current cloud rentals with at least 198 GB VRAM, refreshed hourly.
| GPU | Provider | Tier | VRAM | Cost / GPU-hour |
|---|---|---|---|---|
| NVIDIA B300 | Vast.ai | Spot | 288 GB | $3.50 |
| NVIDIA B300 | Vast.ai | On-Demand | 288 GB | $3.75 |
| NVIDIA B300 | RunPod | Community | 288 GB | $6.94 |
| NVIDIA B300 | RunPod | Spot | 288 GB | $6.94 |
| NVIDIA B300 | RunPod | Secure | 288 GB | $7.39 |
Per-GPU rate across RunPod and the Vast.ai marketplace.
Spot tier is interruptible. Plan for restarts when comparing against on-demand prices.
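
To compare a rental against the per-token prices earlier on the page, the hourly rate has to be converted into a cost per 1M tokens. The sketch below picks the cheapest listed option that meets the 198 GB VRAM floor, optionally excluding interruptible spot tiers, and converts an hourly rate to a per-token cost. The `Rental` class and function names are hypothetical, and the page gives no throughput for the B300, so `HYPOTHETICAL_TOK_PER_S` is a placeholder rather than a measurement.

```python
# Sketch: choose a rental and convert $/GPU-hour into $/1M tokens.
# Rates and VRAM come from the table above; throughput is a PLACEHOLDER.
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rental:
    gpu: str
    provider: str
    tier: str
    vram_gb: int
    usd_per_gpu_hour: float


RENTALS = [
    Rental("NVIDIA B300", "Vast.ai", "Spot", 288, 3.50),
    Rental("NVIDIA B300", "Vast.ai", "On-Demand", 288, 3.75),
    Rental("NVIDIA B300", "RunPod", "Community", 288, 6.94),
    Rental("NVIDIA B300", "RunPod", "Spot", 288, 6.94),
    Rental("NVIDIA B300", "RunPod", "Secure", 288, 7.39),
]

MIN_VRAM_GB = 198              # VRAM floor stated above
HYPOTHETICAL_TOK_PER_S = 30.0  # placeholder; not a figure from this page


def cheapest(rentals: list[Rental], min_vram_gb: int,
             allow_spot: bool = True) -> Optional[Rental]:
    """Return the lowest-priced rental with enough VRAM, or None."""
    candidates = [
        r for r in rentals
        if r.vram_gb >= min_vram_gb and (allow_spot or r.tier != "Spot")
    ]
    return min(candidates, key=lambda r: r.usd_per_gpu_hour, default=None)


def usd_per_million_tokens(usd_per_gpu_hour: float,
                           tokens_per_second: float) -> float:
    """Convert an hourly rate into a cost per 1M generated tokens."""
    tokens_per_hour = tokens_per_second * 3600
    return usd_per_gpu_hour / tokens_per_hour * 1_000_000


if __name__ == "__main__":
    for allow_spot in (True, False):
        pick = cheapest(RENTALS, MIN_VRAM_GB, allow_spot=allow_spot)
        if pick is not None:
            cost = usd_per_million_tokens(pick.usd_per_gpu_hour,
                                          HYPOTHETICAL_TOK_PER_S)
            print(f"{pick.provider} {pick.tier}: "
                  f"${pick.usd_per_gpu_hour:.2f}/h, ${cost:.2f} per 1M tokens")
```

The conversion assumes a single request stream running at a steady rate with no idle time. Batching, restarts on spot instances, and idle hours would all shift the effective per-token cost.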
