Maxed-out Arrow Lake-H NUC-class mini PC with Intel Core Ultra 9 285H, Arc 140T iGPU, 32GB DDR5, 2TB NVMe. 99 TOPS combined platform AI for 8B-class on-device inference.
The 32 GB of shared DDR5 comfortably runs 7B–8B Q4 quants and most embedding models, but the KV cache budget tightens at long contexts and 13B-class models are marginal. Better as a capable small-model workstation than a long-term home for heavy AI work.
The GEEKOM IT15 (Ultra 9 285H) is a NUC-class mini PC that packs Intel’s Arrow Lake-H flagship mobile silicon into a 4.4” x 4.6” chassis. At $1,399 (MSRP, though street pricing often lands ~$1,200), it targets the prosumer edge of the AI inference market—a bridge between consumer-grade laptops and workstation-class machines. GEEKOM has been a reliable builder of compact systems, and with this model they’ve pushed the combined platform AI performance to 99 TOPS, leveraging the CPU, Intel Arc 140T iGPU, and NPU simultaneously.
For practitioners running 8B-class models locally, this hardware hits a practical sweet spot. It’s not a data-center GPU, but it doesn’t need to be; the shared 32GB DDR5 system memory (90 GB/s bandwidth) acts as unified VRAM for the iGPU, enabling 8B parameter models at Q4–Q5 quantizations with workable token rates. The 65W sustained TDP (peaking around 100W briefly) means it’s viable for always-on inference servers, field deployments, or desktop AI workstations that don’t demand a rack.
What sets the IT15 apart is its efficiency per watt. You can run local LLMs, multimodal models, and agent pipelines without a dedicated GPU, at a power draw that won’t spike your utility bill. It competes directly with high-end mini PCs from Minisforum and ASUS (e.g., NUC 14 Pro+), and with lower-TDP laptops that don’t have the same memory bandwidth.
The shared memory architecture means model fitting depends on both weight size and context length. Here’s what works:
llama.cpp with CPU/iGPU layer offload: an 8B model at Q4_K_S and 4K context yields ~13–15 tok/s, a workable balance of quality and speed, acceptable for chat, summarization, and code generation. For coding or structured tasks, Q8_0 on a 3B model (e.g., Phi-3-mini) runs at 25+ tok/s, great for autocomplete and agent scripts.
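A back-of-envelope estimate shows why those configurations fit. This sketch assumes a hypothetical Llama-3-8B-like architecture (32 layers, 8 KV heads, head dim 128, fp16 KV cache) and ~4.5 bits per weight for Q4_K_S; the exact numbers vary by model.

```python
# Rough memory footprint of a quantized LLM in shared RAM:
# weights (quantized) + KV cache (grows linearly with context length).

def weights_gb(n_params: float, bits_per_weight: float = 4.5) -> float:
    """Approximate size of the quantized weights in GB (Q4_K_S ~= 4.5 bits)."""
    return n_params * bits_per_weight / 8 / 1e9

def kv_cache_gb(ctx: int, n_layers: int = 32, n_kv_heads: int = 8,
                head_dim: int = 128, bytes_per_elem: int = 2) -> float:
    """KV cache = 2 (K and V) x layers x kv_heads x head_dim x context x fp16."""
    return 2 * n_layers * n_kv_heads * head_dim * ctx * bytes_per_elem / 1e9

total = weights_gb(8e9) + kv_cache_gb(4096)
print(f"~{total:.1f} GB")  # roughly 5 GB for an 8B Q4 model at 4K context
```

Doubling the context to 8K only adds ~0.5 GB here, but dense 13B+ weights alone push past what the iGPU can service at interactive speeds.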
If you want a dedicated always-on machine for ChatGPT-style queries, local RAG, or proof-of-concept agents, the IT15 eliminates GPU complexity. Plug in, install LM Studio or Ollama, and start inferring. The 2TB SSD holds dozens of quantized models.
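Querying an Ollama instance on the box takes a few lines over its local HTTP API (default port 11434). A minimal stdlib-only sketch, assuming Ollama is installed and a model has been pulled; the model name `llama3.1:8b` is illustrative:

```python
# Minimal client for a local Ollama server's /api/generate endpoint.
import json
import urllib.request

def build_request(prompt: str, model: str = "llama3.1:8b") -> dict:
    """Non-streaming generation payload for Ollama's /api/generate."""
    return {"model": model, "prompt": prompt, "stream": False}

def generate(prompt: str, host: str = "http://localhost:11434") -> str:
    """POST the prompt to a running Ollama daemon and return the reply text."""
    body = json.dumps(build_request(prompt)).encode()
    req = urllib.request.Request(f"{host}/api/generate", data=body,
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

# Inspect the payload without needing a live daemon:
print(json.dumps(build_request("Summarize this document.")))
```

Calling `generate()` requires the Ollama daemon to be running; point `host` at the mini PC's LAN address to use it as a shared endpoint.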
The 65W sustained power, compact size (less than 2” tall), and passive capability under light load make it viable for remote edge nodes—digital signage with AI, on-device transcription, or low-volume inference at a retail location. Pair with a battery UPS for mobile use.
Test agents that chain calls to local models, or run a local inference server for a small team (1–3 concurrent users). The USB4 port supports an external GPU if you later need more VRAM. The IT15 is a flexible dev sandbox.
The shared memory and iGPU compute (2.7 TFLOPS FP16) are insufficient for any meaningful training or fine-tuning. Stick to inference.
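The reason TFLOPS matter so little for this workload: token-by-token decode is memory-bandwidth bound, since each generated token streams essentially the whole quantized weight set from RAM. A rough ceiling, using the 90 GB/s figure quoted above:

```python
# Bandwidth-bound decode ceiling: tok/s ~= bandwidth / resident model size.

def decode_ceiling_tok_s(bandwidth_gb_s: float, model_gb: float) -> float:
    """Upper bound on decode speed when every token reads all weights."""
    return bandwidth_gb_s / model_gb

# A 7B model at Q4 occupies ~4.8 GB on 90 GB/s DDR5:
ceiling = decode_ceiling_tok_s(90, 4.8)
print(f"theoretical ceiling: {ceiling:.1f} tok/s")  # ~18.8 tok/s
```

The measured 15.1 tok/s for Llama 2 7B in the table below lands at roughly 80% of that bound, which is typical efficiency for llama.cpp on shared-memory iGPUs.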
| Model | Maker | Params | Grade | Speed | Memory used |
|---|---|---|---|---|---|
| Qwen3-30B-A3B | Alibaba Cloud (Qwen) | 30B (3B active) | A | 13.5 tok/s | 5.4 GB |
| | | 8B | B | 12.8 tok/s | 5.7 GB |
| | | 9B | B | 12.0 tok/s | 6.0 GB |
| Llama 2 7B Chat | Meta | 7B | B | 15.1 tok/s | 4.8 GB |
| Gemma 4 E2B IT | Google | 2B | B | 19.5 tok/s | 3.7 GB |
| Mistral 7B Instruct | Mistral AI | 7B | B | 11.3 tok/s | 6.4 GB |
| Gemma 4 E4B IT | Google | 4B | C | 10.5 tok/s | 6.9 GB |
| Gemma 3 4B IT | Google | 4B | C | 10.5 tok/s | 6.9 GB |
| Qwen3.6 35B-A3B | Alibaba Cloud | 35B (3B active) | D | 8.5 tok/s | 8.5 GB |
| Qwen3.5-35B-A3B | Alibaba Cloud (Qwen) | 35B (3B active) | D | 8.5 tok/s | 8.5 GB |
| Llama 2 13B Chat | Meta | 13B | D | 8.6 tok/s | 8.5 GB |
| | | 8B | F | 5.4 tok/s | 13.3 GB |
| Qwen3.5-9B | Alibaba Cloud (Qwen) | 9B | F | 2.9 tok/s | 24.6 GB |
| Mistral Small 3 24B | Mistral AI | 24B | F | 1.9 tok/s | 39.0 GB |
| Gemma 4 26B-A4B IT | Google | 26B (4B active) | F | 6.6 tok/s | 11.0 GB |
| Qwen3.6-27B | Alibaba Cloud | 27B | F | 1.0 tok/s | 72.8 GB |
| Gemma 3 27B IT | Google | 27B | F | 1.7 tok/s | 43.8 GB |
| Qwen3.5-27B | Alibaba Cloud (Qwen) | 27B | F | 1.0 tok/s | 72.8 GB |
| Gemma 4 31B IT | Google | 31B | F | 0.9 tok/s | 82.0 GB |
| Qwen3-32B | Alibaba Cloud (Qwen) | 32.8B | F | 1.3 tok/s | 53.9 GB |
| Falcon 40B Instruct | Technology Innovation Institute | 40B | F | 3.0 tok/s | 24.4 GB |
| Mixtral 8x7B Instruct | Mistral AI | 46.7B (12.9B active) | F | 6.4 tok/s | 11.4 GB |
| LLaMA 65B | Meta | 65B | F | 1.8 tok/s | 39.3 GB |
| Llama 2 70B Chat | Meta | 70B | F | 1.7 tok/s | 43.4 GB |
| | | 70B | F | 1.6 tok/s | 45.7 GB |