Copilot+ AI PC mini desktop with AMD Ryzen AI 9 HX 370 (Strix Point), Radeon 890M iGPU, and a 50 TOPS XDNA 2 NPU. 32GB DDR5, 1TB SSD, OCuLink for eGPU expansion. Always-on local AI sidekick.
Good balance for indie developers running local copilots and chat. 30B+ models are reachable but only with aggressive quantization and short context.
The MINISFORUM AI X1 Pro-370 is a Copilot+ certified mini desktop that delivers 50 TOPS from its AMD XDNA 2 NPU, backed by a Ryzen AI 9 HX 370 processor and Radeon 890M iGPU. At $1,099 (32GB/1TB), it occupies the prosumer sweet spot—more capable than a typical office mini PC, but at a fraction of the cost and power draw of a workstation-class machine. For AI engineers and local-inference practitioners, the X1 Pro-370 offers a balance of on-device AI acceleration, expandability via OCuLink, and a form factor that fits on a desk or in a rack.
MINISFORUM positions this as the value leader in its X1 Pro lineup, trading the marginal NPU boost of the 470 model (86 vs 80 combined TOPS) for a lower price without sacrificing core AI capability. It competes directly with other Strix Point mini PCs like the ASUS NUC 14 Pro+ and the Geekom A8, but stands out by including an OCuLink port for external GPU expansion—critical for AI workloads that exceed the iGPU’s shared memory budget.
---
The HX 370’s 12-core CPU (4 Zen 5 + 8 Zen 5c) provides ample headroom for pre-processing and orchestration, but the AI story lives in the iGPU and NPU. The Radeon 890M (RDNA 3.5, 16 CUs) delivers roughly 30 TOPS of FP16/INT8 compute, and the XDNA 2 NPU adds another 50 dedicated TOPS for sustained inference. Combined platform AI throughput is 80 TOPS, enough to run Copilot+ features and small-to-medium LLMs entirely on-device.
The critical constraint is memory. The system ships with 32GB DDR5-5600 in dual-channel, delivering ~90 GB/s bandwidth. That bandwidth is shared between CPU and iGPU—there is no dedicated VRAM. For inference, the iGPU can address up to 16GB (configurable in BIOS), but that allocation comes from system memory. This limits model size and token generation speed relative to a dedicated GPU with high-bandwidth memory.
| Metric | Value | Impact on AI |
|---|---|---|
| VRAM (iGPU allocatable) | 16 GB | 13B at Q4 fits; 8B at Q4 leaves headroom for long context |
| Memory bandwidth | ~90 GB/s | ~20-30 tok/s for 8B Llama 3.1 at Q4; ~10-15 tok/s for 13B |
| INT8 TOPS (NPU) | 50 | Efficient for small models (<3B), vision transformers |
| TDP | 28W base / 54W configurable | Excellent energy efficiency for always-on agents |
| Expansion | OCuLink | Enables 32B at Q3 via eGPU (e.g., RTX 4060 or A-series) |
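The bandwidth-bound token rates above follow from a back-of-envelope rule: during decoding, essentially all of the quantized weights stream through memory once per token, so tok/s is roughly usable bandwidth divided by model size. A minimal sketch of that arithmetic, where the GGUF file sizes and the ~65% bandwidth-efficiency factor are assumptions rather than measurements:

```python
# Back-of-envelope: single-stream decode is roughly memory-bandwidth-bound.
# Each generated token streams ~all model weights once, so
# tok/s <= usable_bandwidth / model_bytes.

PEAK_BW_GBPS = 90        # dual-channel DDR5-5600, theoretical peak
EFFICIENCY = 0.65        # assumed fraction the iGPU actually achieves

# Approximate Q4_K_M GGUF sizes in GB (assumed, not measured on this unit).
MODEL_GB = {
    "llama-3.1-8b-q4_k_m": 4.9,
    "llama-2-13b-q4_k_m": 7.9,
}

def est_tokens_per_sec(model_gb: float, efficiency: float = EFFICIENCY) -> float:
    """Estimate the bandwidth-limited decode speed for a quantized model."""
    return PEAK_BW_GBPS * efficiency / model_gb

for name, size_gb in MODEL_GB.items():
    ceiling = est_tokens_per_sec(size_gb, efficiency=1.0)
    realistic = est_tokens_per_sec(size_gb)
    print(f"{name}: ceiling ~{ceiling:.0f} tok/s, realistic ~{realistic:.0f} tok/s")
```

Actual throughput depends heavily on the backend (Vulkan vs. ROCm vs. CPU) and context length, so treat these as ceilings, not promises.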
At 54W peak (135W adapter), the X1 Pro-370 sips power compared to a desktop GPU solution. For always-on local agents or edge deployment scenarios, this is a key advantage: noise is low, heat is minimal, and electricity costs are negligible. The compact chassis and external 135W power brick make it suitable for co-location or remote sites.
---
The X1 Pro-370 is a capable local inference machine for models up to about 13B parameters, with a clear trade-off between quality and speed. The benchmark table at the end of this review shows what you can expect for popular open-weight LLMs.
Plug in an external GPU over OCuLink (e.g., an RTX 4060 with 12GB VRAM, or an RTX 3090 with 24GB) and the ceiling rises: 13B models run entirely from dedicated VRAM at full speed, and 32B-class models become usable at Q3.
The Radeon 890M can run LLaVA-NeXT 7B (Q4) at ~15 tok/s and smaller vision transformers (e.g., CLIP, SigLIP) efficiently via the NPU. Stable Diffusion XL 1.0 base runs at ~1-2 it/s on the iGPU—usable for single-image generation but slow for batch work. OCuLink dramatically improves this.
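For image generation, the usual route on this class of hardware is the Hugging Face diffusers pipeline; a minimal sketch is below. Whether the 890M is driven through ROCm or DirectML depends on your PyTorch build, so the `"cuda"` device string here is an assumption, not a guarantee.

```python
import torch
from diffusers import StableDiffusionXLPipeline

# Load SDXL base in fp16 to keep the footprint inside the iGPU's shared-memory budget.
pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16,
)
# Assumption: a ROCm-enabled PyTorch build exposes the 890M as "cuda";
# with DirectML or a CPU-only build the device string (and speed) will differ.
pipe = pipe.to("cuda")

image = pipe(
    "a product shot of a compact mini PC on a desk",
    num_inference_steps=30,
).images[0]
image.save("out.png")
```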
For daily driver local LLM inference, target 7B-8B models at Q4_K_M. They fit comfortably, deliver interactive speeds, and the 32GB system RAM allows multiple model swaps or large context windows (up to 32K tokens). If you need higher quality from a 13B or MoE model, use the OCuLink eGPU path—or accept slower single-stream output.
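As a concrete starting point, here is a minimal llama-cpp-python sketch for an 8B model at Q4_K_M. The model path is a placeholder, and offloading every layer to the iGPU assumes a Vulkan- or ROCm-enabled build of llama.cpp.

```python
from llama_cpp import Llama

# Placeholder path: any 7B-8B Q4_K_M GGUF downloaded locally.
llm = Llama(
    model_path="./models/llama-3.1-8b-instruct.Q4_K_M.gguf",
    n_ctx=8192,        # generous context still fits alongside the Q4 weights
    n_gpu_layers=-1,   # offload all layers to the 890M (Vulkan/ROCm build assumed)
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Summarize OCuLink in two sentences."}],
    max_tokens=128,
)
print(out["choices"][0]["message"]["content"])
```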
---
The X1 Pro-370 is ideal for developers prototyping agentic workflows, local RAG pipelines, or testing model quantization. The NPU offloads small inference tasks (classification, embedding, function calling) while the CPU/GPU handles heavier loads. The upgradeable SO-DIMM RAM (up to 128GB) and three M.2 slots mean you can expand storage and memory without replacing the unit.
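For a local RAG prototype, a small embedding model plus brute-force cosine similarity is enough at this scale. The sketch below uses sentence-transformers with an assumed model name and a toy corpus; generation is left to whichever local LLM you serve.

```python
import numpy as np
from sentence_transformers import SentenceTransformer

# Small embedding model; runs comfortably on CPU (the model choice is an assumption).
embedder = SentenceTransformer("all-MiniLM-L6-v2")

corpus = [
    "OCuLink carries PCIe x4 out of the chassis to an external GPU enclosure.",
    "The XDNA 2 NPU provides 50 TOPS for sustained low-power inference.",
    "Dual-channel DDR5-5600 gives roughly 90 GB/s of shared bandwidth.",
]
corpus_emb = embedder.encode(corpus, normalize_embeddings=True)

def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the k corpus chunks most similar to the query (cosine similarity)."""
    q = embedder.encode([query], normalize_embeddings=True)[0]
    scores = corpus_emb @ q
    return [corpus[i] for i in np.argsort(-scores)[:k]]

print(retrieve("How do I add a bigger GPU later?"))
# The retrieved chunks would then be stuffed into the prompt of your local LLM.
```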
At 54W, this machine is a solid candidate for edge inference boxes, kiosks, or always-on local AI assistants. Dual 2.5GbE and WiFi 7 provide robust networking for multi-node setups. The OCuLink port offers a future path to higher compute without replacing the base unit.
If you need a low-power, quiet server for serving small models to a few concurrent users, the X1 Pro-370 works well with llama.cpp, Ollama, or vLLM. Within a homelab or small office, it can handle 2-3 concurrent 7B Q4 streams at 15-20 tok/s each.
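Both Ollama and the llama.cpp server expose an OpenAI-compatible endpoint, so a thin client is all that a handful of users needs. The sketch below assumes Ollama on its default port with the model tag already pulled; adjust both for your setup.

```python
import requests

# Ollama's OpenAI-compatible endpoint (default port); model tag assumed to be pulled.
URL = "http://localhost:11434/v1/chat/completions"

def ask(prompt: str, model: str = "llama3.1:8b") -> str:
    """Send one chat turn to the local server and return the reply text."""
    resp = requests.post(URL, json={
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 256,
    }, timeout=120)
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]["content"]

print(ask("In one sentence, why does shared memory limit token speed?"))
```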
The shared memory and modest iGPU make fine-tuning impractical beyond lightweight LoRA adapters with batch size 1. For training, look to a dedicated GPU workstation or cloud instance.
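If you do experiment within those limits, a low-rank adapter on a small base model is the realistic ceiling. A hedged sketch with the peft library follows; the base model, rank, and target modules are all illustrative choices, not recommendations measured on this hardware.

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

# Small base model so adapter + activations stay inside shared memory (illustrative choice).
base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-1.5B-Instruct")

lora_cfg = LoraConfig(
    r=8,                                   # low rank keeps trainable params small
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],   # attention projections only (assumption)
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_cfg)
model.print_trainable_parameters()
# The training loop itself (batch size 1, gradient accumulation) is omitted;
# anything heavier belongs on a dedicated GPU or a cloud instance.
```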
---
The NUC 14 Pro+ offers 34 TOPS NPU (Intel AI Boost) and up to 96GB RAM, but no OCuLink. Its iGPU (Arc Xe-LPG) lags behind the Radeon 890M for LLM inference. The X1 Pro-370 wins on raw NPU TOPS and the expansion path for eGPU. Choose the NUC if you need more CPU memory bandwidth (LPDDR5X-7467) for CPU-based inference.
The Geekom A8 uses the previous-gen Zen 4 architecture with a 16 TOPS NPU and Radeon 780M iGPU. While cheaper (~$799), it lacks both the dedicated NPU throughput and OCuLink for future upgrades. For AI workloads, the X1 Pro-370 is the clear step-up—more than double the NPU performance and real expandability.
When to pick the MINISFORUM AI X1 Pro-370: You need local AI inference today, want the flexibility of eGPU expansion without committing to a full desktop build, and value energy efficiency and a compact footprint. For under $1,100, it delivers the best combination of on-device AI and upgrade path in a mini desktop form factor.
---
| Model | Publisher | Parameters | Grade | Speed | Est. memory |
|---|---|---|---|---|---|
| Qwen3-30B-A3B | Alibaba Cloud (Qwen) | 30B (3B active) | B | 13.5 tok/s | 5.4 GB |
| Qwen3.6 35B-A3B | Alibaba Cloud | 35B (3B active) | B | 8.5 tok/s | 8.5 GB |
| Qwen3.5-35B-A3B | Alibaba Cloud (Qwen) | 35B (3B active) | B | 8.5 tok/s | 8.5 GB |
| Mixtral 8x7B Instruct | Mistral AI | 46.7B (12.9B active) | B | 6.4 tok/s | 11.4 GB |
| Gemma 4 26B-A4B IT | Google | 26B (4B active) | B | 6.6 tok/s | 11.0 GB |
| Llama 2 13B Chat | Meta | 13B | B | 8.6 tok/s | 8.5 GB |
| — | — | 8B | B | 12.8 tok/s | 5.7 GB |
| — | — | 9B | B | 12.0 tok/s | 6.0 GB |
| Llama 2 7B Chat | Meta | 7B | B | 15.1 tok/s | 4.8 GB |
| Gemma 4 E4B IT | Google | 4B | B | 10.5 tok/s | 6.9 GB |
| Gemma 3 4B IT | Google | 4B | B | 10.5 tok/s | 6.9 GB |
| Mistral 7B Instruct | Mistral AI | 7B | B | 11.3 tok/s | 6.4 GB |
| Gemma 4 E2B IT | Google | 2B | B | 19.5 tok/s | 3.7 GB |
| — | — | 8B | C | 5.4 tok/s | 13.3 GB |
| Qwen3.5-9B | Alibaba Cloud (Qwen) | 9B | F | 2.9 tok/s | 24.6 GB |
| Mistral Small 3 24B | Mistral AI | 24B | F | 1.9 tok/s | 39.0 GB |
| Qwen3.6-27B | Alibaba Cloud | 27B | F | 1.0 tok/s | 72.8 GB |
| Gemma 3 27B IT | Google | 27B | F | 1.7 tok/s | 43.8 GB |
| Qwen3.5-27B | Alibaba Cloud (Qwen) | 27B | F | 1.0 tok/s | 72.8 GB |
| Gemma 4 31B IT | Google | 31B | F | 0.9 tok/s | 82.0 GB |
| Qwen3-32B | Alibaba Cloud (Qwen) | 32.8B | F | 1.3 tok/s | 53.9 GB |
| Falcon 40B Instruct | Technology Innovation Institute | 40B | F | 3.0 tok/s | 24.4 GB |
| LLaMA 65B | Meta | 65B | F | 1.8 tok/s | 39.3 GB |
| Llama 2 70B Chat | Meta | 70B | F | 1.7 tok/s | 43.4 GB |
| — | — | 70B | F | 1.6 tok/s | 45.7 GB |