Business and education HTPC variant of the Reatan Gorgon Point platform: 48GB DDR5-5600, a 2TB M.2 2230 SSD, HDMI 2.1, and USB4 with 8K display output in a fleet-friendly chassis without OCuLink.
Good balance for indie developers running local copilots and chat. 30B+ models are reachable but only with aggressive quantization and short context.
The Reatan HTPC (Ryzen AI 9 HX 470 48GB) is a compact mini-desktop that bridges the gap between a home theater PC and a capable edge AI workstation. Built on AMD’s Gorgon Point platform, this system is a fleet-friendly variant of the Reatan Gorgon Point line, targeting business, education, and home environments that need local AI inference without the power draw or footprint of a full tower.
At $899, it sits at the prosumer-to-enterprise boundary. It competes with other mini PCs like the Minisforum AI X1 Pro, but differentiates itself with 48GB of dual-channel DDR5-5600 memory and a deliberate omission of OCuLink (keeping the chassis slim and deployment-ready for IT-managed fleets). For AI practitioners, the key draw is the integrated AMD Radeon 890M GPU with 16 RDNA 3.5 compute units and the 55 TOPS XDNA 2 NPU, giving the system a combined platform AI performance of 86 TOPS. This makes it a viable option for running moderate-sized LLMs locally, especially in scenarios where low power consumption and a small physical footprint are non-negotiable.
The specs that matter for inference workloads are straightforward. The system uses unified memory architecture: the 48GB of DDR5-5600 is shared between CPU and GPU. The GPU can allocate up to 16GB of that as VRAM. This is important because it determines the maximum model size you can load without swapping to system memory.
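As a rough illustration (not a vendor formula), a model's weight footprint is approximately parameter count times bits per weight at the chosen quantization; the bits-per-weight values below are common approximations for GGUF quant levels, and the 2 GB headroom figure is an assumption for this sketch:

```python
# Rough fit check: weight bytes ~= params x bits-per-weight / 8.
# Bits-per-weight values are approximations for common GGUF quant levels.
BITS_PER_WEIGHT = {"Q4_K_M": 4.8, "Q5_K_M": 5.7, "Q8_0": 8.5, "FP16": 16.0}
VRAM_CAP_GB = 16.0  # maximum GPU-allocatable share of the 48 GB pool

def weight_gb(params_b: float, quant: str) -> float:
    """Approximate in-VRAM size of model weights in GB."""
    return params_b * 1e9 * BITS_PER_WEIGHT[quant] / 8 / 1e9

def fits(params_b: float, quant: str, headroom_gb: float = 2.0) -> bool:
    """Leave ~2 GB headroom for KV cache and runtime buffers (assumed)."""
    return weight_gb(params_b, quant) + headroom_gb <= VRAM_CAP_GB

print(round(weight_gb(13, "Q4_K_M"), 1))  # 7.8 GB, matching the 7-8 GB figure below
print(fits(13, "Q4_K_M"))                 # True
print(fits(30, "Q4_K_M"))                 # False: a dense 30B at Q4 exceeds the cap
```

This is why dense 30B+ models only work here with aggressive quantization or CPU offload, while 7B-13B models fit comfortably.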
| Spec | Value |
|---|---|
| VRAM (allocatable) | 16 GB |
| Memory Bandwidth | 90 GB/s (dual-channel DDR5-5600) |
| INT8 Performance (NPU) | 55 TOPS |
| INT8 Performance (GPU) | ~12-15 TOPS (estimated, RDNA 3.5) |
| Combined Platform AI | 86 TOPS |
| TDP | 54 W (configurable; 28 W base) |
| CPU | AMD Ryzen AI 9 HX 470 (12C/24T, up to 5.2 GHz) |
| GPU | AMD Radeon 890M (16 CUs, RDNA 3.5) |
| NPU | XDNA 2 (55 TOPS) |
Memory bandwidth at 90 GB/s is the bottleneck for token generation on integrated GPUs. For comparison, a dedicated RTX 4060 laptop GPU offers ~256 GB/s, but costs more and draws 115W. The Reatan HTPC’s 54W TDP makes it extremely efficient for always-on inference servers or edge deployments where thermal and power constraints are tight.
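Since each decoded token must stream the full set of (active) weights from memory, a crude upper bound on generation speed is bandwidth divided by model size. A sketch under that assumption (real throughput lands well below this ceiling due to compute and cache overheads):

```python
# Crude roofline for token generation: every decoded token reads all
# (active) weights once, so tok/s <= bandwidth / bytes-of-weights.
def peak_bandwidth_gbs(mt_per_s: int, channels: int) -> float:
    # DDR5: 8 bytes per transfer per 64-bit channel.
    return mt_per_s * 8 * channels / 1000

def max_tokens_per_s(model_gb: float, bandwidth_gbs: float) -> float:
    return bandwidth_gbs / model_gb

dual = peak_bandwidth_gbs(5600, 2)    # 89.6 GB/s: the quoted ~90 GB/s
single = peak_bandwidth_gbs(5600, 1)  # 44.8 GB/s: why single-channel halves throughput

print(round(dual, 1))                         # 89.6
print(round(max_tokens_per_s(7.8, dual), 1))  # ~11.5 tok/s ceiling for a 13B @ Q4
```

The same arithmetic explains the dual- vs. single-channel gap discussed in the Minisforum comparison below: halving the channel count halves the ceiling.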
The 55 TOPS NPU is a dedicated accelerator for lightweight AI tasks like speech-to-text, image classification, or small LLM inference (e.g., Phi-3, Gemma 2B). For larger models, you’ll rely on the Radeon 890M via ROCm or DirectML. Note that AMD’s ROCm support for integrated GPUs is still maturing; Windows users will primarily use DirectML or ONNX Runtime with the NPU fallback.
This is the practical question. With 16GB of allocatable VRAM, the Reatan HTPC can comfortably run 7B-13B parameter models at 4-bit or 5-bit quantization. At Q4, a 13B model uses roughly 7-8 GB, leaving headroom for context and system processes. At Q5, the same model uses ~9-10 GB. The sweet spot for quality-to-speed is Q4_K_M on a 7B-8B model such as Llama 3.1 8B or Qwen 2.5 7B.
Expected tokens/second (using llama.cpp with a Vulkan or DirectML backend), for both comfortable models and larger run-with-patience ones, are listed in the benchmark table at the end of this page.
Long-context tasks: The 48GB of system memory allows large context windows (e.g., 128K tokens at 4-bit), but GPU VRAM is the bottleneck. For long contexts, you'll need to offload layers to the CPU, which tanks throughput. This machine is better for interactive chat (2K-8K context) than document-level analysis.
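To see why long contexts overflow the 16 GB cap, note that the KV cache grows linearly with context: 2 (K and V) x layers x KV heads x head dim x bytes per element, per token. A sketch using illustrative Llama-2-13B-like shapes (the layer/head counts are assumptions for the example, and that architecture has no grouped-query attention to shrink the cache):

```python
# KV cache bytes per token = 2 (K and V) * layers * kv_heads * head_dim * bytes/elem.
def kv_cache_gb(ctx_tokens: int, layers: int, kv_heads: int,
                head_dim: int, bytes_per_elem: int = 2) -> float:
    per_token = 2 * layers * kv_heads * head_dim * bytes_per_elem
    return ctx_tokens * per_token / 1e9

# Llama-2-13B-like shape (40 layers, 40 KV heads, head_dim 128, fp16 cache):
print(round(kv_cache_gb(8_192, 40, 40, 128), 1))    # 6.7 GB: tight but workable
print(round(kv_cache_gb(131_072, 40, 40, 128), 1))  # 107.4 GB: far beyond 16 GB VRAM
```

Even quantizing the cache to 4-bit only divides these figures by four, so a 128K context on a model of this shape still spills well past the GPU's allocatable VRAM into system memory.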
NPU-accelerated models: The XDNA 2 NPU can handle small on-device models (e.g., Whisper base, Phi-3-mini) at low latency. It’s useful for always-on voice assistants or real-time transcription without GPU overhead.
This system is not for training. It’s for inference, and specifically for deployment scenarios where you need a small, low-power, always-on box that can run local LLMs, agents, or RAG pipelines.
Who should buy it:
- Indie developers running local copilots and chat models (7B-13B at Q4/Q5)
- IT-managed fleets in business and education that want a slim, self-contained chassis
- Always-on, low-power inference servers and edge deployments with tight thermal budgets
Not for:
- Training or fine-tuning workloads
- Long-context, document-level analysis at usable throughput
- Anyone who needs eGPU expansion via OCuLink
vs. Minisforum AI X1 Pro (Ryzen AI 9 HX 470, 32GB)
The Minisforum AI X1 Pro is a direct competitor but comes with only 32GB of single-channel RAM in many configurations. The Reatan HTPC’s 48GB dual-channel setup gives it a significant memory bandwidth advantage (90 GB/s vs. ~45 GB/s if single-channel). For AI inference, dual-channel is critical — single-channel cuts token throughput by 30-50%. The Reatan also has a larger SSD (2TB vs. 1TB often). However, the Minisforum offers OCuLink for eGPU expansion, which the Reatan omits. If you need external GPU support, the Minisforum is the better pick. If you want a self-contained, out-of-the-box inference machine with more VRAM headroom, the Reatan wins.
vs. Apple Mac Mini M4 Pro (24GB unified memory)
The Mac Mini M4 Pro offers higher memory bandwidth (~200 GB/s) and better GPU compute for ML (Metal Performance Shaders). It can run 13B models at Q4 faster (~30-40 tokens/s). But it starts at $1,399 and maxes out at 48GB (which costs $1,899). The Reatan HTPC at $899 is half the price for comparable VRAM capacity. The tradeoff is slower inference and less mature software ecosystem (ROCm vs. Metal). If you’re already in the Apple ecosystem and need maximum performance per watt, the Mac Mini is superior. If you need Windows compatibility, lower cost, and fleet manageability, the Reatan is the pragmatic choice.
When to pick the Reatan HTPC:
- You need Windows compatibility and fleet manageability at a lower price than a Mac Mini
- 16GB of allocatable VRAM and 48GB of system memory matter more than raw token speed
- Power and thermal constraints are tight (54W TDP, always-on operation)
| Model | Maker | Params | Rating | Speed | Memory |
|---|---|---|---|---|---|
| Qwen3-30B-A3B | Alibaba | 30B (3B active) | B | 13.5 tok/s | 5.4 GB |
| Qwen3.6 35B-A3B | Alibaba | 35B (3B active) | B | 8.5 tok/s | 8.5 GB |
| Qwen3.5-35B-A3B | Alibaba | 35B (3B active) | B | 8.5 tok/s | 8.5 GB |
| Mixtral 8x7B Instruct | Mistral AI | 46.7B (12.9B active) | B | 6.4 tok/s | 11.4 GB |
| Gemma 4 26B-A4B IT | Google | 26B (4B active) | B | 6.6 tok/s | 11.0 GB |
| Llama 2 13B Chat | Meta | 13B | B | 8.6 tok/s | 8.5 GB |
| | | 8B | B | 12.8 tok/s | 5.7 GB |
| | | 9B | B | 12.0 tok/s | 6.0 GB |
| Llama 2 7B Chat | Meta | 7B | B | 15.1 tok/s | 4.8 GB |
| Gemma 4 E4B IT | Google | 4B | B | 10.5 tok/s | 6.9 GB |
| Gemma 3 4B IT | Google | 4B | B | 10.5 tok/s | 6.9 GB |
| Mistral 7B Instruct | Mistral AI | 7B | B | 11.3 tok/s | 6.4 GB |
| Gemma 4 E2B IT | Google | 2B | B | 19.5 tok/s | 3.7 GB |
| | | 8B | C | 5.4 tok/s | 13.3 GB |
| Qwen3.5-9B | Alibaba | 9B | F | 2.9 tok/s | 24.6 GB |
| Mistral Small 3 24B | Mistral AI | 24B | F | 1.9 tok/s | 39.0 GB |
| Qwen3.6-27B | Alibaba | 27B | F | 1.0 tok/s | 72.8 GB |
| Gemma 3 27B IT | Google | 27B | F | 1.7 tok/s | 43.8 GB |
| Qwen3.5-27B | Alibaba | 27B | F | 1.0 tok/s | 72.8 GB |
| Gemma 4 31B IT | Google | 31B | F | 0.9 tok/s | 82.0 GB |
| Qwen3-32B | Alibaba | 32.8B | F | 1.3 tok/s | 53.9 GB |
| Falcon 40B Instruct | Technology Innovation Institute | 40B | F | 3.0 tok/s | 24.4 GB |
| LLaMA 65B | Meta | 65B | F | 1.8 tok/s | 39.3 GB |
| Llama 2 70B Chat | Meta | 70B | F | 1.7 tok/s | 43.4 GB |
| | | 70B | F | 1.6 tok/s | 45.7 GB |