
A 5U enterprise-grade tower delivering 20 PFLOPS of AI compute, featuring closed-loop liquid cooling and dedicated BMC management.
The SuperMicro Super AI Station is a 5U enterprise-grade tower designed to bridge the gap between consumer-grade workstations and rack-mounted data center infrastructure. Built on the NVIDIA Blackwell Ultra (B300) architecture, this system is engineered specifically for the development and deployment of autonomous agents and frontier-class models. By packaging the GB300 Grace Blackwell Ultra Superchip into a deskside form factor, SuperMicro is targeting AI engineers and researchers who require data-center-level performance without the infrastructure requirements of a traditional server room.
Unlike standard workstations that rely on PCIe-based GPU expansion, the Super AI Station operates as a unified "AI Factory" in a box. It is positioned as the premier hardware for local AI agents, offering the compute density required for long-running autonomous workflows. While it competes with high-end Mac Studio configurations for developer mindshare, its raw throughput and 784GB of unified memory place it in a category of its own, outclassing even multi-GPU RTX 6000 Ada builds in memory bandwidth and total parameter capacity.
For AI inference, the critical bottlenecks are usually memory bandwidth and memory capacity. The Super AI Station addresses both directly with 784GB of high-bandwidth unified memory and a staggering 7100 GB/s of memory bandwidth. To put this in perspective, that is roughly 7x the bandwidth of a top-tier consumer GPU, which translates directly into superior tokens-per-second (TPS) for large-batch inference and high-concurrency agentic workloads.
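To see why bandwidth dominates token generation, here is a first-order model (a sketch under simplifying assumptions, not vendor math): a bandwidth-bound decoder must stream every active weight from memory for each generated token, so tokens-per-second is roughly effective bandwidth divided by bytes moved per token.

```python
# Back-of-the-envelope decode throughput for bandwidth-bound inference.
# Assumption: generating one token streams the active weights from
# memory once; real systems add KV-cache traffic and rarely sustain
# more than ~80% of peak bandwidth.

def est_tokens_per_sec(active_params_b: float, bytes_per_param: float,
                       bandwidth_gbs: float, efficiency: float = 0.8) -> float:
    gb_per_token = active_params_b * bytes_per_param   # GB moved per token
    return bandwidth_gbs * efficiency / gb_per_token

# 70B dense model at FP8 (1 byte/param) at 7100 GB/s:
print(est_tokens_per_sec(70, 1.0, 7100))   # ~81 tok/s
# Same model on a ~1000 GB/s consumer GPU (if the weights even fit):
print(est_tokens_per_sec(70, 1.0, 1000))   # ~11 tok/s
```

The same arithmetic explains why Mixture-of-Experts models are attractive on this hardware: only the active parameters cross the memory bus per token, so a 1T-parameter MoE can decode faster than a much smaller dense model.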
The Super AI Station features a 1600W Titanium Level power supply with 94% efficiency, allowing it to run on a conventional 20A circuit. The integrated closed-loop liquid cooling system is a critical design choice for practitioners; it enables the system to maintain peak performance during sustained training or long-context inference while remaining quiet enough for a shared office environment. Additionally, the inclusion of a dedicated BMC (Baseboard Management Controller) allows for enterprise-level remote management, a feature typically absent from consumer AI PCs.
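In practice, BMC access means the station can be racked in a closet and administered like any server. Supermicro BMCs speak the standard DMTF Redfish REST API alongside IPMI; the sketch below shows a health check over Redfish, where the host address, credentials, and resource path are illustrative placeholders to be checked against the board's BMC documentation.

```python
# Query power state and health from the BMC over Redfish.
# Address and credentials are hypothetical placeholders.
import requests

BMC = "https://10.0.0.50"          # hypothetical BMC address
AUTH = ("ADMIN", "change-me")      # hypothetical credentials

# verify=False skips TLS validation for a self-signed BMC cert;
# install the BMC certificate instead in production.
resp = requests.get(f"{BMC}/redfish/v1/Systems/1",
                    auth=AUTH, verify=False, timeout=10)
resp.raise_for_status()
system = resp.json()
print(system["PowerState"])        # e.g. "On"
print(system["Status"]["Health"])  # e.g. "OK"
```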
The 784GB memory capacity makes the Super AI Station one of the few deskside solutions capable of running 1-trillion parameter models locally. This is a significant milestone for privacy-conscious enterprises and researchers who cannot risk sending proprietary data to cloud-based APIs.
The massive memory headroom is particularly beneficial for long-context tasks (e.g., analyzing 100k+ token documents). While consumer cards like the RTX 4090 (24GB) struggle as the context window grows, the Super AI Station accommodates massive context windows in models like Qwen 2.5 or Mixtral 8x22B without hitting OOM (Out of Memory) errors. For multimodal workloads, such as local video-to-text or high-resolution image generation (Stable Diffusion 3 / Flux.1), the 5000 TFLOPS of FP16 performance delivers near-instantaneous generation.
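The reason long contexts punish small cards is the KV cache, which grows linearly with context length on top of the static weights. A minimal sizing sketch, assuming a Llama-3-70B-style geometry (80 layers, 8 GQA key/value heads, head dim 128, FP16 cache); other architectures will differ:

```python
# KV-cache sizing: the memory that grows with context length.
# Geometry matches a Llama-3-70B-like model; adjust per model.

def kv_cache_gb(tokens: int, layers: int = 80, kv_heads: int = 8,
                head_dim: int = 128, bytes_per_val: int = 2) -> float:
    per_token = 2 * layers * kv_heads * head_dim * bytes_per_val  # K and V
    return tokens * per_token / 1e9

print(kv_cache_gb(8_000))     # ~2.6 GB -- manageable on a 24 GB card
print(kv_cache_gb(128_000))   # ~42 GB  -- alone exceeds an RTX 4090
```

At 128k tokens the cache alone outgrows any consumer card before a single weight is loaded, while it consumes only a few percent of this machine's capacity.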
The SuperMicro Super AI Station is not a general-purpose workstation; it is a specialized tool for high-throughput AI production.
The primary use case for this hardware is running autonomous agents. Using frameworks like NVIDIA NemoClaw, developers can deploy agents that operate 24/7. The high memory bandwidth allows the system to handle the rapid-fire reasoning loops required for agents to browse the web, write code, and execute tasks without the latency of cloud round-trips.
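Framework aside, the core of such an agent is a tight generate-act-observe loop against a locally served model. A minimal, framework-agnostic sketch, assuming an OpenAI-compatible local endpoint (which common local servers such as vLLM expose); the endpoint, model name, tool handling, and stop condition are placeholders:

```python
# Minimal agent loop against a locally served model.
import requests

ENDPOINT = "http://localhost:8000/v1/chat/completions"  # hypothetical
history = [{"role": "system", "content": "You are a coding agent."},
           {"role": "user", "content": "Summarize open TODOs in ./src"}]

for step in range(10):                        # bounded reasoning loop
    r = requests.post(ENDPOINT, json={"model": "local-model",
                                      "messages": history}, timeout=120)
    reply = r.json()["choices"][0]["message"]["content"]
    history.append({"role": "assistant", "content": reply})
    if "DONE" in reply:                       # toy stop condition
        break
    # Parse tool calls from `reply`, execute them locally, and feed
    # the observation back as the next turn (elided for brevity).
    history.append({"role": "user", "content": "observation: ..."})
```

Because every round-trip is local, each reasoning step costs milliseconds of network latency instead of the hundreds of milliseconds a cloud API adds, which compounds quickly over thousands of loop iterations.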
While primarily marketed for inference, the 5000 TFLOPS of FP16 compute makes this an exceptional machine for fine-tuning. ML researchers can perform full-parameter fine-tuning on 70B models or extensive LoRA training on 400B+ models locally. This is a game-changer for teams working with sensitive datasets in healthcare, finance, or defense.
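Whether a given fine-tune fits is simple arithmetic over optimizer state. As a sketch: standard mixed-precision Adam costs roughly 16 bytes per parameter (BF16 weights and gradients plus FP32 master weights and two moment tensors), so a 70B full fine-tune only fits here with memory-saving optimizer states or offloading:

```python
# Optimizer-state arithmetic for full-parameter fine-tuning
# (weights + grads + optimizer state only; activations are extra).

def train_memory_gb(params_b: float, bytes_per_param: float) -> float:
    return params_b * bytes_per_param

# Mixed-precision Adam: BF16 weights (2) + BF16 grads (2)
# + FP32 master copy (4) + two FP32 moments (8) = 16 B/param.
print(train_memory_gb(70, 16))   # 1120 GB -- over budget even here
# BF16 weights/grads + 8-bit optimizer states = ~6 B/param.
print(train_memory_gb(70, 6))    # 420 GB -- fits, with room for activations
```

LoRA flips the equation: only the adapter parameters carry optimizer state, so a 400B+ base model in low precision plus adapters stays comfortably within the unified pool.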
For small AI startups or research labs, the Super AI Station can serve as a centralized inference server. The dedicated BMC and enterprise networking capabilities allow it to be partitioned or shared across a team, providing a "private cloud" experience where multiple developers can run inference against shared local weights.
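From a teammate's laptop, the station then looks like any hosted API. A sketch of the client side, assuming the box runs an OpenAI-compatible server; the hostname, port, and model name are placeholders:

```python
# Consuming the shared station with a standard OpenAI-compatible client.
from openai import OpenAI

client = OpenAI(base_url="http://ai-station.local:8000/v1",  # hypothetical host
                api_key="not-needed-locally")

out = client.chat.completions.create(
    model="local-model",
    messages=[{"role": "user", "content": "ping"}],
)
print(out.choices[0].message.content)
```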
When evaluating the Super AI Station, practitioners typically look at three alternatives: high-end Mac Studios, DIY multi-GPU builds, and enterprise rack servers.
The Mac Studio is a popular choice for local LLMs due to its unified memory (up to 192GB). However, the Super AI Station offers roughly 4x the memory capacity and significantly higher compute throughput. While the Mac is a silent consumer device, the Super AI Station is a production-grade machine capable of running 1T models that the Mac simply cannot fit in memory.
A workstation with four RTX 6000 Ada cards provides 192GB of VRAM. To match the 784GB of the Super AI Station, you would need multiple linked workstations, which introduces severe communication bottlenecks: the RTX 6000 Ada lacks NVLink, leaving only PCIe and network links between shards. The Super AI Station’s Blackwell Ultra architecture provides a unified memory pool and bandwidth that a multi-GPU PCIe setup cannot replicate, making it the superior choice for high-throughput AI inference and large-scale model development.
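The gap is easy to quantify with nominal link rates (a sketch, not measured throughput): tensor-parallel shards exchange activations at every layer, and a PCIe 5.0 x16 link moves roughly 64 GB/s per direction versus multiple terabytes per second for on-package memory.

```python
# Nominal link rates only; real-world throughput is lower still.
LOCAL_HBM = 7100     # GB/s, on-package memory (Super AI Station spec)
PCIE5_X16 = 64       # GB/s per direction, PCIe 5.0 x16

print(f"Interconnect is ~{LOCAL_HBM / PCIE5_X16:.0f}x slower than local memory")
# => ~111x; any layer that waits on the link stalls the whole pipeline.
```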
The Super AI Station brings DGX-level performance to a deskside form factor. While a DGX is designed for a data center rack with 3-phase power and industrial cooling, the Super AI Station is optimized for the "Edge" and "Prosumer" markets, offering a plug-and-play experience on standard NEMA 5-20 outlets. It is the logical choice for teams that need data center power but lack the facilities to house a traditional server.
The table below summarizes per-model inference benchmarks on the Super AI Station:

| Model | Developer | Parameters (Active) | Mode | Throughput | Memory |
|---|---|---|---|---|---|
| Llama 4 Maverick | Meta | 400B (17B active) | SS | 39.1 tok/s | 146.4 GB |
| — | — | 70B | SS | 50.7 tok/s | 112.8 GB |
| Nvidia Nemotron 3 Super | NVIDIA | 120B (12B active) | SS | 55.2 tok/s | 103.5 GB |
| GLM-5 | Z.ai | 744B (40B active) | SS | 65.2 tok/s | 87.7 GB |
| GLM-5.1 | Z.ai | 744B (40B active) | SS | 65.2 tok/s | 87.7 GB |
| Kimi K2.6 | Moonshot AI | 1000B (32B active) | SS | 66.3 tok/s | 86.2 GB |
| Kimi K2 Instruct 0905 | Moonshot AI | 1000B (32B active) | SS | 67.6 tok/s | 84.6 GB |
| Kimi K2 Thinking | Moonshot AI | 1000B (32B active) | SS | 67.6 tok/s | 84.6 GB |
| Kimi K2.5 | Moonshot AI | 1000B (32B active) | SS | 67.6 tok/s | 84.6 GB |
| GLM-4.6 | Z.ai | 355B (32B active) | SS | 81.3 tok/s | 70.3 GB |
| Mistral Large 3 675B | Mistral AI | 675B (41B active) | SS | 86.3 tok/s | 66.3 GB |
| DeepSeek-V3 | DeepSeek | 671B (37B active) | SS | 95.5 tok/s | 59.8 GB |
| DeepSeek-R1 | DeepSeek | 671B (37B active) | SS | 95.5 tok/s | 59.8 GB |
| DeepSeek-V3.1 | DeepSeek | 671B (37B active) | SS | 95.5 tok/s | 59.8 GB |
| DeepSeek-V3.2 | DeepSeek | 685B (37B active) | SS | 95.5 tok/s | 59.8 GB |
| GLM-4.5 | Z.ai | 355B (32B active) | SS | 110.3 tok/s | 51.8 GB |
| GLM-4.7 | Z.ai | 358B (32B active) | SS | 108.6 tok/s | 52.6 GB |
| Kimi K2 Instruct | Moonshot AI | 1000B (32B active) | SS | 110.3 tok/s | 51.8 GB |
| — | — | 70B | SS | 125.1 tok/s | 45.7 GB |
| Qwen3.5-397B-A17B | Alibaba Cloud (Qwen) | 397B (17B active) | SS | 124.2 tok/s | 46.0 GB |
| Llama 2 70B Chat | Meta | 70B | SS | 131.7 tok/s | 43.4 GB |
| Mixtral 8x22B Instruct | Mistral AI | 141B (39B active) | SS | 131.2 tok/s | 43.6 GB |
| Qwen 3.5 Omni | Alibaba Cloud | 397B (17B active) | SS | 126.5 tok/s | 45.2 GB |
| Qwen3-235B-A22B | Alibaba Cloud (Qwen) | 235B (22B active) | SS | 157.3 tok/s | 36.3 GB |
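One sanity check worth noting (our reading of the figures, not a vendor claim): if the memory column is interpreted as data moved per generated token, throughput times memory is nearly constant across every row, which is exactly the signature of bandwidth-bound decoding at an effective ~5.7 TB/s, about 80% of the 7,100 GB/s peak.

```python
# Check the inverse relationship between the throughput and memory
# columns above (values copied from four sample rows of the table).
rows = [(39.1, 146.4), (66.3, 86.2), (95.5, 59.8), (157.3, 36.3)]
for tps, gb in rows:
    print(f"{tps * gb:,.0f} GB/s effective")   # ~5,715 GB/s in every row
```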