
A corporate-targeted GB10 mini-tower equipped with a 1TB SSD to provide a cost-optimized platform for localized AI development.
Sized for production serving of 70B–200B class models at full or lightly-quantized precision. Overkill for a homelab; right call when the workload pays for itself in token volume.
Generated from this product’s spec sheet. Editor reviews refine it over time.
The Lenovo ThinkStation PGX - 1TB is a specialized, small-form-factor workstation designed to bridge the gap between consumer-grade workstations and enterprise-grade data center infrastructure. Built around the NVIDIA GB10 Grace Blackwell architecture, this mini-tower is a dedicated platform for local AI development, fine-tuning, and inference. Unlike traditional workstations that rely on x86 architectures, the PGX utilizes a 20-core Arm-based CPU (Cortex-X925 and Cortex-A725) paired with Blackwell-generation Tensor cores, providing a high-efficiency environment for running large language models (LLMs) and agentic workflows at the edge.
For AI engineers and researchers, the ThinkStation PGX represents a move toward decentralized AI. It offers a "sandbox" environment that mirrors the NVIDIA software stack found in DGX systems but at a $4,100 MSRP. This makes it a primary contender for organizations that need to keep sensitive data local while maintaining the performance required for modern, high-parameter models. In the market for AI PCs and laptops, the PGX stands out by prioritizing VRAM capacity and unified memory over traditional desktop versatility.
The core value proposition of the Lenovo ThinkStation PGX - 1TB for AI is its 128GB of unified LPDDR5x memory. In the context of local AI, VRAM is the primary bottleneck; without enough of it, large models simply will not load or will fall back to system RAM, causing performance to crater.
The 273 GB/s memory bandwidth is a critical spec for token generation, because each generated token requires streaming the model's active weights from memory. While far lower than high-end H100 or B200 data center GPUs, it is higher than the system memory bandwidth of most consumer laptops and many high-end desktop configurations. Because bandwidth sets the ceiling on decode speed, throughput on the PGX is predictable from model size and quantization, although it still falls as context windows grow. Furthermore, the 140W TDP is remarkably efficient for a machine capable of 250 TOPS, making it suitable for continuous edge deployment where power and heat management are concerns.
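As a rough illustration of why bandwidth matters for token generation, the sketch below estimates an upper bound on decode speed by dividing memory bandwidth by the bytes of weights read per token. It assumes generation is purely bandwidth-bound and ignores KV-cache reads, compute time, and framework overhead, so real throughput will be lower. The function name and the example model sizes are illustrative, not taken from the spec sheet; only the 273 GB/s figure is.

```python
def decode_tokens_per_second(bandwidth_gb_s, active_params_billion, bits_per_weight):
    """Upper bound on decode speed, assuming each token reads all active weights once."""
    # Billions of parameters * bits per weight / 8 = gigabytes read per generated token.
    gb_read_per_token = active_params_billion * bits_per_weight / 8
    return bandwidth_gb_s / gb_read_per_token


PGX_BANDWIDTH_GB_S = 273  # memory bandwidth quoted for the ThinkStation PGX

# Hypothetical dense 70B model at 4 bits per weight: about 35 GB read per token.
dense_70b = decode_tokens_per_second(PGX_BANDWIDTH_GB_S, 70, 4)

# Hypothetical dense 8B model at 8 bits per weight: about 8 GB read per token.
small_8b = decode_tokens_per_second(PGX_BANDWIDTH_GB_S, 8, 8)

print(f"70B @ 4-bit ceiling: {dense_70b:.1f} tok/s")  # 7.8
print(f"8B  @ 8-bit ceiling: {small_8b:.1f} tok/s")   # 34.1
```

For mixture-of-experts models, only the active parameters per token enter the calculation, which is why a sparse model can decode faster than a dense model of the same total size.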
The ThinkStation PGX is specifically advertised as hardware for running 200B-parameter models. This is made possible by the 128GB unified memory pool combined with 4-bit or 5-bit quantization (GGUF, EXL2, or AWQ formats).
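The arithmetic behind the 200B claim can be sketched as follows. This is a back-of-the-envelope estimate that counts weights only; real quantized formats store extra scale data, and the KV cache, activations, and operating system also need memory, so the headroom shown is optimistic. The helper names are illustrative.

```python
TOTAL_MEMORY_GB = 128  # unified memory on the ThinkStation PGX


def weight_footprint_gb(params_billion, bits_per_weight):
    """Approximate size of the model weights alone, in GB."""
    return params_billion * bits_per_weight / 8


def headroom_gb(params_billion, bits_per_weight, total_gb=TOTAL_MEMORY_GB):
    """Memory left over after loading the weights (negative means it does not fit)."""
    return total_gb - weight_footprint_gb(params_billion, bits_per_weight)


for bits in (16, 5, 4):
    size = weight_footprint_gb(200, bits)
    spare = headroom_gb(200, bits)
    verdict = "fits" if spare >= 0 else "does not fit"
    print(f"200B @ {bits}-bit: {size:.0f} GB weights, {spare:.0f} GB headroom ({verdict})")
```

Under these assumptions a 200B model needs about 400 GB at 16-bit, 125 GB at 5-bit, and 100 GB at 4-bit, so only the quantized variants fit, and 5-bit leaves very little room for context.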
While actual throughput depends on the specific quantization and optimization (TensorRT-LLM vs. llama.cpp), users can expect:
The Lenovo ThinkStation PGX - 1TB is not a general-purpose gaming rig or a standard office PC. It is a specialized tool for:
Developers building agentic workflows using frameworks like LangChain, CrewAI, or AutoGPT need reliable, local inference to iterate quickly without incurring cloud API costs or latency. The 128GB VRAM allows for running multiple models simultaneously (e.g., a primary reasoning model like DeepSeek-R1 alongside a smaller embedding model).
For industries like healthcare, finance, or defense, the PGX acts as a secure node for local LLM deployment. Its small form factor (150mm x 150mm) allows it to be tucked away in server closets or integrated into medical imaging carts to provide real-time data synthesis without sending data to the cloud.
While the 29.71 TFLOPS of FP16 performance is modest compared to a full DGX H100, the 128GB VRAM makes it an excellent "sandbox" for Parameter-Efficient Fine-Tuning (PEFT) techniques like LoRA and QLoRA. Researchers can prototype fine-tuning runs on large models locally before scaling to a cluster.
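To illustrate why LoRA keeps fine-tuning within reach of modest compute, the sketch below counts trainable parameters. LoRA replaces the update to a `d_out x d_in` weight matrix with two low-rank factors of shapes `d_out x r` and `r x d_in`, so each adapted matrix adds `r * (d_in + d_out)` trainable parameters. The model dimensions used here (hidden size 8192, 80 layers, two adapted square projections per layer, rank 16) are hypothetical and chosen only to make the arithmetic concrete.

```python
def lora_trainable_params(d_in, d_out, rank):
    """Trainable parameters LoRA adds for one adapted weight matrix."""
    return rank * (d_in + d_out)


# Hypothetical model shape, for illustration only.
HIDDEN = 8192
LAYERS = 80
ADAPTED_MATRICES_PER_LAYER = 2  # e.g. two square attention projections
RANK = 16

n_matrices = LAYERS * ADAPTED_MATRICES_PER_LAYER
lora_params = n_matrices * lora_trainable_params(HIDDEN, HIDDEN, RANK)
full_params = n_matrices * HIDDEN * HIDDEN  # parameters in the same matrices, fully trained

print(f"LoRA trainable params: {lora_params:,}")                 # 41,943,040
print(f"Same matrices, full:   {full_params:,}")                 # 10,737,418,240
print(f"Fraction trained:      {lora_params / full_params:.4%}")  # 0.3906%
```

With these assumed shapes, the adapters train well under 1% of the parameters in the matrices they modify, which shrinks the gradient and optimizer state accordingly. QLoRA additionally keeps the frozen base weights in 4-bit form, which is what lets large base models share the 128GB pool with the training state.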
When evaluating the Lenovo ThinkStation PGX - 1TB against competitors, the primary alternatives are the Apple Mac Studio (M2/M3 Ultra) and custom multi-GPU Linux desktops (dual RTX 3090s or 4090s).
The Mac Studio is the closest competitor in terms of a compact, high-VRAM "AI PC."
A custom PC with two RTX 4090s provides 48GB of VRAM and significantly higher raw compute (TFLOPS).
The Lenovo ThinkStation PGX - 1TB is best suited to local deployment when the priority is model size and ecosystem compatibility over raw floating-point speed. It is a purpose-built "inference appliance" that simplifies the path from development to local production.
| Model | Developer | Parameters | Rating | Speed | Memory |
| --- | --- | --- | --- | --- | --- |
| Qwen3-30B-A3B | Alibaba | 30B (3B active) | SS | 40.8 tok/s | 5.4 GB |
| | | 8B | AA | 38.8 tok/s | 5.7 GB |
| | | 9B | AA | 36.5 tok/s | 6.0 GB |
| Llama 2 7B Chat | Meta | 7B | AA | 45.9 tok/s | 4.8 GB |
| Gemma 4 E2B IT | Google | 2B | AA | 59.3 tok/s | 3.7 GB |
| Qwen3.6 35B-A3B | Alibaba | 35B (3B active) | AA | 25.8 tok/s | 8.5 GB |
| Qwen3.5-35B-A3B | Alibaba | 35B (3B active) | AA | 25.8 tok/s | 8.5 GB |
| Mistral 7B Instruct | Mistral AI | 7B | AA | 34.4 tok/s | 6.4 GB |
| Llama 2 13B Chat | Meta | 13B | AA | 26.0 tok/s | 8.5 GB |
| Gemma 4 E4B IT | Google | 4B | AA | 31.8 tok/s | 6.9 GB |
| Gemma 3 4B IT | Google | 4B | AA | 31.8 tok/s | 6.9 GB |
| Mixtral 8x7B Instruct | Mistral AI | 46.7B (12.9B active) | BB | 19.3 tok/s | 11.4 GB |
| Gemma 4 26B-A4B IT | Google | 26B (4B active) | BB | 20.0 tok/s | 11.0 GB |
| Mistral Large 3 675B | Mistral AI | 675B (41B active) | BB | 3.3 tok/s | 66.3 GB |
| GLM-4.6 | Z.ai | 355B (32B active) | BB | 3.1 tok/s | 70.3 GB |
| DeepSeek-V3 | DeepSeek | 671B (37B active) | BB | 3.7 tok/s | 59.8 GB |
| DeepSeek-R1 | DeepSeek | 671B (37B active) | BB | 3.7 tok/s | 59.8 GB |
| DeepSeek-V3.1 | DeepSeek | 671B (37B active) | BB | 3.7 tok/s | 59.8 GB |
| DeepSeek-V3.2 | DeepSeek | 685B (37B active) | BB | 3.7 tok/s | 59.8 GB |
| Kimi K2 Instruct 0905 | Moonshot AI | 1000B (32B active) | BB | 2.6 tok/s | 84.6 GB |
| Kimi K2 Thinking | Moonshot AI | 1000B (32B active) | BB | 2.6 tok/s | 84.6 GB |
| Kimi K2.5 | Moonshot AI | 1000B (32B active) | BB | 2.6 tok/s | 84.6 GB |
| GLM-5 | Z.ai | 744B (40B active) | BB | 2.5 tok/s | 87.7 GB |
| GLM-5.1 | Z.ai | 744B (40B active) | BB | 2.5 tok/s | 87.7 GB |
| Kimi K2.6 | Moonshot AI | 1000B (32B active) | BB | 2.6 tok/s | 86.2 GB |


