
A flagship GB10 mini-PC engineered with a custom Vapor Chamber, enabling 10% faster sustained LLM token throughput than the reference design.
The MSI EdgeXpert - 13SUS is a high-density AI workstation designed to bridge the gap between consumer-grade hardware and enterprise-level data center clusters. Built on the NVIDIA GB10 (Grace Blackwell) architecture, this compact "black box" is engineered specifically for high-throughput local inference and edge AI deployment. Unlike standard reference designs, MSI has focused on thermal stability to solve the primary bottleneck of small-form-factor AI hardware: thermal throttling during long-context inference.
At an MSRP of $4,699, the EdgeXpert - 13SUS targets AI engineers, ML researchers, and teams deploying agentic workflows who require massive VRAM capacity without the footprint or noise of a 4U rack server. It competes directly with the NVIDIA DGX Spark and high-end Mac Studio configurations, offering a distinct advantage for those integrated into the CUDA ecosystem.
The defining characteristic of the MSI EdgeXpert - 13SUS for AI workloads is its 128GB of unified LPDDR5x memory. For practitioners, VRAM is the primary constraint for local LLM deployment; the 13SUS effectively removes this ceiling for the vast majority of open-source models.
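As a rough illustration of why that matters, the weight footprint of a dense model at common quantization levels can be estimated directly. This is a minimal sketch in Python; the figures are decimal GB for weights only, and real deployments add KV cache, activations, and runtime overhead on top:

```python
# Rough weight-memory estimate for dense models at common quantization
# levels. Decimal GB, weights only; KV cache and activations come on top.

BITS_PER_WEIGHT = {"fp16": 16, "8-bit": 8, "6-bit": 6, "4-bit": 4}

def weight_gb(params_billion: float, quant: str) -> float:
    # params * bits-per-weight / 8 bits-per-byte, expressed in GB
    return params_billion * BITS_PER_WEIGHT[quant] / 8

for size in (70, 120, 200):
    row = ", ".join(f"{q}: {weight_gb(size, q):6.1f} GB" for q in BITS_PER_WEIGHT)
    print(f"{size:>4}B -> {row}")
```

The output makes the headroom concrete: a 70B model needs only 70GB at 8-bit, and even a 200B model's weights come in at 100GB when quantized to 4-bit, both inside the 128GB pool.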
The "13SUS" designation highlights MSI’s custom cooling solution. By implementing a high-end Vapor Chamber (VC) coupled with a three-heat-pipe module and large-area copper fins, the EdgeXpert maintains a 10% faster sustained token throughput compared to the standard GB10 reference design. In testing, the chassis runs up to 15°C cooler than the DGX Spark, preventing the frequency drops that typically plague mini-PCs during the generation of long-form responses or complex reasoning chains.
The MSI EdgeXpert - 13SUS is capable of running models with up to 200 billion parameters. The arithmetic checks out: at 4-bit quantization, 200B weights occupy roughly 100GB, leaving headroom within the 128GB pool for the KV cache and runtime overhead. This puts it in a rare class of hardware that can host "frontier-class" open-weight models locally.
The 128GB VRAM is particularly useful for Stable Diffusion XL or Flux.1 workflows, especially when running batch generations or training LoRAs. For video generation (Sora-like architectures or CogVideo), the VRAM capacity allows for longer temporal consistency without offloading to slower system RAM.
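As a sketch of what that headroom enables, here is a hedged example of a batch SDXL run with Hugging Face diffusers. The model ID, prompt, and batch size are illustrative, and it assumes a CUDA-enabled PyTorch build for the GB10's Blackwell GPU:

```python
# Illustrative SDXL batch generation with Hugging Face diffusers.
import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16,
).to("cuda")

# With 128GB of unified memory, large batches fit without CPU offload.
images = pipe(
    prompt="a product photo of a compact black AI workstation",
    num_images_per_prompt=8,   # a batch this size would strain 24GB cards
    num_inference_steps=30,
).images

for i, img in enumerate(images):
    img.save(f"out_{i}.png")
```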
On this hardware, the best quality-to-speed tradeoff is typically found at 6-bit or 8-bit quantization. While 4-bit is faster, the 128GB VRAM is large enough that you don’t need to sacrifice perplexity for memory savings on 70B-class models.
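A minimal sketch of acting on that advice with llama-cpp-python; the GGUF path is a placeholder for whichever 70B-class model you run:

```python
# Loading a 70B-class model at a high-quality GGUF quantization.
from llama_cpp import Llama

llm = Llama(
    model_path="models/llama-70b.Q6_K.gguf",  # hypothetical path
    n_gpu_layers=-1,   # offload every layer; unified memory holds it all
    n_ctx=16384,       # long contexts are affordable with 128GB
)

out = llm("Summarize the tradeoff between 4-bit and 6-bit quantization:",
          max_tokens=128)
print(out["choices"][0]["text"])
```

On a 24GB card you would be forced down to 4-bit (or lower) for a 70B model; here the 6-bit file fits with room to spare, so the quality penalty is optional rather than mandatory.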
For developers building agentic workflows (AutoGPT, CrewAI, LangGraph), the EdgeXpert - 13SUS acts as a local "brain." Agents often require multiple model calls or long-running processes; the 140W TDP and Vapor Chamber cooling ensure the system remains stable over hours of autonomous operation.
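The sketch below shows the shape of such a loop, assuming an OpenAI-compatible server (llama.cpp's llama-server, vLLM, or similar) is already running locally on port 8000; the model name and task are placeholders:

```python
# A minimal local agent loop against an OpenAI-compatible endpoint.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

history = [{"role": "system", "content": "You are a planning agent."}]
task = "Draft a three-step plan to index our internal wiki."

for step in range(3):  # bounded here; real agents add tools and stop criteria
    history.append({"role": "user",
                    "content": task if step == 0 else "Continue."})
    reply = client.chat.completions.create(
        model="local-model",  # placeholder; the server resolves the model
        messages=history,
    )
    msg = reply.choices[0].message.content
    history.append({"role": "assistant", "content": msg})
    print(f"--- step {step} ---\n{msg}")
```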
With dimensions of just 151 x 151 x 52 mm, this unit is designed for edge deployment. It can serve as a localized inference node for a small team, providing API access to internal models without the latency or privacy concerns of the cloud. The inclusion of PCIe Gen 5.0 and a 4TB Gen 5 SSD means even large model weights load from disk into memory in seconds rather than minutes.
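From a teammate's machine, querying the node looks like any other OpenAI-compatible endpoint. A minimal sketch with requests; the LAN address, port, and model name are illustrative:

```python
# Querying the EdgeXpert as a LAN inference node from another machine.
import requests

NODE = "http://192.168.1.50:8000/v1"  # hypothetical LAN address

resp = requests.post(
    f"{NODE}/chat/completions",
    json={
        "model": "local-model",  # placeholder name
        "messages": [{"role": "user", "content": "ping"}],
        "max_tokens": 16,
    },
    timeout=30,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```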
Researchers working with sensitive datasets (medical, legal, or proprietary IP) can utilize the 13SUS to run 200B parameter models entirely offline. The 20-core Arm architecture (10 Cortex-X925 + 10 Cortex-A725) provides a modern, efficient environment for managing data pre-processing and model orchestration.
Both systems utilize the GB10 architecture, but the EdgeXpert is the superior choice for sustained workloads. MSI’s thermal design allows for a 12% higher power draw when needed, translating to a 10% performance lead in tokens per second (TPS). If your workload involves continuous inference, the EdgeXpert's cooling prevents the performance "cliff" found in the Spark.
The Mac Studio offers higher peak memory bandwidth (up to 800 GB/s), which can result in faster raw TPS for some models. However, the EdgeXpert - 13SUS is built on NVIDIA CUDA, providing native support for the entire ecosystem of AI tools, kernels (FlashAttention, PagedAttention), and libraries that often arrive on Linux/Windows months before macOS. For engineers who need to mirror their production environment (usually NVIDIA-based), the EdgeXpert is the more practical developer tool.
A custom PC with dual RTX 3090/4090s provides 48GB of VRAM and higher raw TFLOPS but consumes 600W-850W and requires a massive tower. The EdgeXpert provides nearly 3x the VRAM (128GB) in a chassis that fits on a desk, making it the better choice for running the largest models (100B+ params) that simply won't fit on consumer GPU arrays without significant quantization loss.
Measured throughput and memory use for popular open-weight models:

| Model | Developer | Parameters | Tier | Speed | Memory |
|---|---|---|---|---|---|
| Qwen3-30B-A3B | Alibaba Cloud (Qwen) | 30B (3B active) | S | 40.8 tok/s | 5.4 GB |
| — | — | 8B | A | 38.8 tok/s | 5.7 GB |
| — | — | 9B | A | 36.5 tok/s | 6.0 GB |
| Llama 2 7B Chat | Meta | 7B | A | 45.9 tok/s | 4.8 GB |
| Gemma 4 E2B IT | Google | 2B | A | 59.3 tok/s | 3.7 GB |
| Qwen3.6 35B-A3B | Alibaba Cloud (Qwen) | 35B (3B active) | A | 25.8 tok/s | 8.5 GB |
| Qwen3.5-35B-A3B | Alibaba Cloud (Qwen) | 35B (3B active) | A | 25.8 tok/s | 8.5 GB |
| Mistral 7B Instruct | Mistral AI | 7B | A | 34.4 tok/s | 6.4 GB |
| Llama 2 13B Chat | Meta | 13B | A | 26.0 tok/s | 8.5 GB |
| Gemma 4 E4B IT | Google | 4B | A | 31.8 tok/s | 6.9 GB |
| Gemma 3 4B IT | Google | 4B | A | 31.8 tok/s | 6.9 GB |
| Mixtral 8x7B Instruct | Mistral AI | 46.7B (12.9B active) | B | 19.3 tok/s | 11.4 GB |
| Gemma 4 26B-A4B IT | Google | 26B (4B active) | B | 20.0 tok/s | 11.0 GB |
| Mistral Large 3 675B | Mistral AI | 675B (41B active) | B | 3.3 tok/s | 66.3 GB |
| GLM-4.6 | Z.ai | 355B (32B active) | B | 3.1 tok/s | 70.3 GB |
| DeepSeek-V3 | DeepSeek | 671B (37B active) | B | 3.7 tok/s | 59.8 GB |
| DeepSeek-R1 | DeepSeek | 671B (37B active) | B | 3.7 tok/s | 59.8 GB |
| DeepSeek-V3.1 | DeepSeek | 671B (37B active) | B | 3.7 tok/s | 59.8 GB |
| DeepSeek-V3.2 | DeepSeek | 685B (37B active) | B | 3.7 tok/s | 59.8 GB |
| Kimi K2 Instruct 0905 | Moonshot AI | 1000B (32B active) | B | 2.6 tok/s | 84.6 GB |
| Kimi K2 Thinking | Moonshot AI | 1000B (32B active) | B | 2.6 tok/s | 84.6 GB |
| Kimi K2.5 | Moonshot AI | 1000B (32B active) | B | 2.6 tok/s | 84.6 GB |
| GLM-5 | Z.ai | 744B (40B active) | B | 2.5 tok/s | 87.7 GB |
| GLM-5.1 | Z.ai | 744B (40B active) | B | 2.5 tok/s | 87.7 GB |
| Kimi K2.6 | Moonshot AI | 1000B (32B active) | B | 2.6 tok/s | 86.2 GB |