
An $85,000 liquid-cooled GB300 monolith delivering 20 PFLOPS, engineered with dual 400GbE ports and PCIe 5.0 expansion slots for discrete GPUs.
The MSI XpertStation WS300 is a liquid-cooled, deskside AI supercomputer designed to bridge the gap between traditional workstations and data center racks. Built on the NVIDIA DGX Station architecture and powered by the GB300 Grace Blackwell Ultra Desktop Superchip, this "monolith" is engineered for organizations that require massive local compute without the overhead of a dedicated server room. At an MSRP of $85,000, it targets enterprise R&D teams, private AI labs, and developers building sovereign AI infrastructure.
While marketed under MSI’s workstation line, the WS300 is functionally a localized node of the Blackwell architecture. It competes directly with high-end multi-GPU H100/H200 configurations and specialized hardware like the Apple Mac Studio (M2/M3 Ultra) or Lambda Labs Vector workstations. However, the WS300 differentiates itself through its unified memory architecture and massive 7.1 TB/s bandwidth, which eliminates the PCIe bottlenecks common in standard multi-GPU setups.
For AI engineers, the most critical metric of the WS300 is its 748 GB of unified VRAM. Unlike traditional setups where model weights must be sharded across multiple discrete GPUs, the Blackwell Ultra Superchip allows the entire memory pool to be addressed coherently. This is vital for maintaining high throughput as the KV cache expands with long context windows.
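To make that KV cache pressure concrete, here is a rough sizing sketch. The layer count, KV head count, and head dimension are the published Llama 3 70B architecture figures; the context length and batch size are illustrative assumptions:

```python
# Rough KV cache sizing for a Llama-3-70B-class transformer.
# Architecture figures (layers, KV heads, head_dim) are the published
# Llama 3 70B values; context length and batch size are assumptions.

N_LAYERS = 80     # transformer blocks
N_KV_HEADS = 8    # grouped-query attention KV heads
HEAD_DIM = 128    # per-head dimension
BYTES = 2         # FP16/BF16 cache entries

def kv_cache_gb(context_len: int, batch: int = 1) -> float:
    """GB needed for K and V across all layers, tokens, and sequences."""
    per_token = 2 * N_LAYERS * N_KV_HEADS * HEAD_DIM * BYTES  # K + V
    return per_token * context_len * batch / 1e9

# A single 128K-token sequence already needs ~42 GB of cache,
# on top of the model weights themselves.
print(f"{kv_cache_gb(128_000):.1f} GB")           # ~41.9 GB
print(f"{kv_cache_gb(128_000, batch=8):.1f} GB")  # ~335.5 GB for 8 concurrent sequences
```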
The 7.1 TB/s (7,100 GB/s) memory bandwidth is the standout specification. For local LLM inference, performance is almost always memory-bandwidth limited rather than compute-limited. This bandwidth allows the WS300 to achieve token generation speeds that dwarf standard RTX 4090 or even A100-based workstations. Furthermore, the inclusion of dual 400GbE ports ensures that this machine can function as a high-speed node in a larger cluster, capable of feeding data at the rates required for real-time RAG (Retrieval-Augmented Generation) at scale.
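As a first-order illustration of why bandwidth dominates: during autoregressive decoding, every generated token requires streaming the active weights through the memory system once, so throughput is bounded above by bandwidth divided by bytes read per token. The sketch below applies that roofline to a hypothetical dense 70B model at FP8; the RTX 4090 and A100 bandwidth figures are their published specs, and real-world throughput lands below these ceilings:

```python
# Bandwidth roofline for single-stream decoding: each token requires
# streaming the (active) model weights once, so
#   tokens/s <= memory_bandwidth / bytes_read_per_token.

PARAMS = 70e9          # hypothetical dense 70B model
BYTES_PER_PARAM = 1.0  # FP8 weights

weights_gb = PARAMS * BYTES_PER_PARAM / 1e9  # 70 GB read per token

for name, bw_gbps in [
    ("RTX 4090 (1,008 GB/s)", 1_008),   # published spec
    ("A100 80GB (2,039 GB/s)", 2_039),  # published spec (SXM)
    ("WS300     (7,100 GB/s)", 7_100),
]:
    print(f"{name}: <= {bw_gbps / weights_gb:.0f} tok/s")

# RTX 4090: <= 14 tok/s  (and 70 GB would not even fit in 24 GB of VRAM)
# A100:     <= 29 tok/s
# WS300:    <= 101 tok/s
```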
The MSI XpertStation WS300 is specifically designed for 1T (one trillion) parameter models. While most consumer and prosumer hardware struggles even with 70B models, let alone 400B ones, the WS300 provides enough headroom to run the world's most capable open-weight models at high precision.
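A back-of-the-envelope footprint check makes that headroom concrete. The sketch below is weights-only arithmetic at common quantization levels; KV cache, activations, and runtime overhead come on top:

```python
# Approximate weight footprint of a 1T-parameter model at common precisions.
# Weights only; KV cache and runtime overhead are extra.

def weights_gb(params: float, bits: int) -> float:
    return params * bits / 8 / 1e9

for bits in (16, 8, 4):
    print(f"1T params @ {bits}-bit: {weights_gb(1e12, bits):,.0f} GB")

# 16-bit: 2,000 GB  -> does not fit in 748 GB
#  8-bit: 1,000 GB  -> does not fit
#  4-bit:   500 GB  -> fits, with ~248 GB left over for KV cache
```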
On a Llama 3.1 70B model, the WS300 is capable of saturating the inference pipeline, delivering roughly 50 tok/s at high precision and over 125 tok/s when quantized (see the benchmark table below). For the 405B variant, the 7.1 TB/s of bandwidth keeps the user experience fluid, avoiding the "word-at-a-time" crawl seen on lesser hardware.
The MSI XpertStation WS300 is not a general-purpose workstation; it is a dedicated AI development and inference environment.
The MSI XpertStation WS300 sits in a unique price-to-performance bracket.
The WS300 is the direct spiritual successor to NVIDIA's DGX Station A100. While the older A100 stations provided 320 GB of VRAM, the WS300 more than doubles this to 748 GB. More importantly, the jump to the Blackwell architecture provides a massive increase in FP16 compute, making the WS300 significantly more "future-proof" for the next three years of model scaling.
A custom-built server with eight RTX 6000 Ada GPUs provides 384 GB of VRAM and costs roughly $60,000–$70,000. While cheaper, that setup suffers from PCIe scaling issues and lacks the unified memory pool of the Grace Blackwell Superchip. Its aggregate bandwidth looks comparable on paper (~7.7 TB/s, but split across eight separate 960 GB/s buses), so no single model shard is ever read faster than 960 GB/s, whereas the WS300's full 7.1 TB/s serves one coherent pool. With nearly double the VRAM, the WS300 is the superior choice for single-model inference of massive 400B+ parameter sets.
Choose the WS300 if your workload requires running models larger than 400B parameters locally with maximum throughput. If you are building an "AI-first" office where multiple engineers need to hit a local inference API simultaneously, the WS300 is the most compact and powerful way to deliver that capability without a server rack.
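As an illustration of that shared-endpoint pattern, the following sketch assumes an OpenAI-compatible inference server (vLLM, llama.cpp's server, or similar) is already running on the WS300; the hostname, port, and model name are placeholders, not part of MSI's software stack:

```python
# Query a local OpenAI-compatible endpoint served from the WS300.
# Hostname, port, and model name below are illustrative placeholders.
from openai import OpenAI

client = OpenAI(
    base_url="http://ws300.internal:8000/v1",  # hypothetical LAN address
    api_key="not-needed-for-local",            # local servers typically ignore this
)

response = client.chat.completions.create(
    model="deepseek-v3",  # whichever model the server has loaded
    messages=[{"role": "user", "content": "Summarize our Q3 eval results."}],
)
print(response.choices[0].message.content)
```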
Benchmark estimates for popular open-weight models on the WS300:

| Model | Developer | Parameters (active) | Mode | Throughput | Memory |
|---|---|---|---|---|---|
| Llama 4 Maverick | Meta | 400B (17B active) | SS | 39.1 tok/s | 146.4 GB |
| | | 70B | SS | 50.7 tok/s | 112.8 GB |
| Nvidia Nemotron 3 Super | NVIDIA | 120B (12B active) | SS | 55.2 tok/s | 103.5 GB |
| GLM-5 | Z.ai | 744B (40B active) | SS | 65.2 tok/s | 87.7 GB |
| GLM-5.1 | Z.ai | 744B (40B active) | SS | 65.2 tok/s | 87.7 GB |
| Kimi K2.6 | Moonshot AI | 1000B (32B active) | SS | 66.3 tok/s | 86.2 GB |
| Kimi K2 Instruct 0905 | Moonshot AI | 1000B (32B active) | SS | 67.6 tok/s | 84.6 GB |
| Kimi K2 Thinking | Moonshot AI | 1000B (32B active) | SS | 67.6 tok/s | 84.6 GB |
| Kimi K2.5 | Moonshot AI | 1000B (32B active) | SS | 67.6 tok/s | 84.6 GB |
| GLM-4.6 | Z.ai | 355B (32B active) | SS | 81.3 tok/s | 70.3 GB |
| Mistral Large 3 675B | Mistral AI | 675B (41B active) | SS | 86.3 tok/s | 66.3 GB |
| DeepSeek-V3 | DeepSeek | 671B (37B active) | SS | 95.5 tok/s | 59.8 GB |
| DeepSeek-R1 | DeepSeek | 671B (37B active) | SS | 95.5 tok/s | 59.8 GB |
| DeepSeek-V3.1 | DeepSeek | 671B (37B active) | SS | 95.5 tok/s | 59.8 GB |
| DeepSeek-V3.2 | DeepSeek | 685B (37B active) | SS | 95.5 tok/s | 59.8 GB |
| GLM-4.5 | Z.ai | 355B (32B active) | SS | 110.3 tok/s | 51.8 GB |
| GLM-4.7 | Z.ai | 358B (32B active) | SS | 108.6 tok/s | 52.6 GB |
| Kimi K2 Instruct | Moonshot AI | 1000B (32B active) | SS | 110.3 tok/s | 51.8 GB |
| | | 70B | SS | 125.1 tok/s | 45.7 GB |
| Qwen3.5-397B-A17B | Alibaba Cloud (Qwen) | 397B (17B active) | SS | 124.2 tok/s | 46.0 GB |
| Llama 2 70B Chat | Meta | 70B | SS | 131.7 tok/s | 43.4 GB |
| Mixtral 8x22B Instruct | Mistral AI | 141B (39B active) | SS | 131.2 tok/s | 43.6 GB |
| Qwen 3.5 Omni | Alibaba Cloud | 397B (17B active) | SS | 126.5 tok/s | 45.2 GB |
| Qwen3-235B-A22B | Alibaba Cloud (Qwen) | 235B (22B active) | SS | 157.3 tok/s | 36.3 GB |
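A note on how to read these figures: they track a simple bandwidth roofline. Dividing the 7,100 GB/s of bandwidth by each row's memory footprint and applying a constant ~80% efficiency factor reproduces the listed throughput almost exactly, consistent with decoding that streams the full quantized weight set once per token. A minimal check (the 0.805 efficiency factor is fitted to the table itself, not a published number):

```python
# Check the table's throughput figures against a bandwidth roofline:
#   tok/s ~= efficiency * bandwidth / memory_footprint.
# The ~0.8 efficiency factor is inferred by fitting the table, not published.

BANDWIDTH_GBPS = 7_100
EFFICIENCY = 0.805  # empirical fit to the rows above

rows = [  # (model, memory_GB, listed_tok_s) -- taken from the table
    ("DeepSeek-V3", 59.8, 95.5),
    ("GLM-4.5", 51.8, 110.3),
    ("Kimi K2 Thinking", 84.6, 67.6),
    ("Qwen3-235B-A22B", 36.3, 157.3),
]

for model, mem_gb, listed in rows:
    predicted = EFFICIENCY * BANDWIDTH_GBPS / mem_gb
    print(f"{model}: predicted {predicted:.1f} tok/s vs listed {listed}")
# Each prediction lands within ~1% of the listed figure.
```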