
A compact AI edge device featuring a 4TB PCIe Gen 5.0 SSD, highly optimized for thermal efficiency and dual-system stacking.
The ASUS Ascent GX10 - 4TB is a high-density, small-form-factor AI workstation designed to bridge the gap between consumer desktops and enterprise-grade rack servers. Built around the NVIDIA Grace Blackwell (GB10) Superchip, this device integrates an Armv9.2-A CPU with a Blackwell-architecture GPU. For AI engineers and researchers, the standout feature of this 4TB SKU is its PCIe Gen 5.0 x4 SSD, which provides the I/O throughput needed to load large model weights into memory rapidly.
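To make the storage claim concrete, here is a back-of-envelope sketch of checkpoint load times. The bandwidth figures (~12 GB/s sequential read for a Gen 5.0 x4 drive, ~0.5 GB/s for SATA) are illustrative assumptions, not measured GX10 numbers:

```python
# Back-of-envelope: time to stream model weights from SSD into memory.
# Bandwidth figures below are illustrative assumptions, not GX10 measurements.

def load_time_s(params_b: float, bits_per_weight: float, read_gb_s: float) -> float:
    """Seconds to read a checkpoint of `params_b` billion weights from disk."""
    size_gb = params_b * bits_per_weight / 8  # on-disk size in GB (ignoring metadata)
    return size_gb / read_gb_s

# A 70B model quantized to 4 bits is ~35 GB on disk.
print(f"Gen 5.0 x4 NVMe (~12 GB/s): {load_time_s(70, 4, 12.0):.1f} s")
print(f"SATA SSD (~0.5 GB/s):       {load_time_s(70, 4, 0.5):.1f} s")
```

The gap matters most for workflows that swap models frequently: at Gen 5 speeds a quantized 70B checkpoint streams in seconds rather than minutes.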
Positioned as a "Desktop AI Supercomputer," the GX10 competes directly with the Mac Studio (M2/M3 Ultra) and specialized edge AI boxes from vendors like GIGABYTE and Dell. However, the GX10 distinguishes itself through its thermal design—utilizing a dual vapor chamber and a three-fan array—and its focus on "dual-system stacking," allowing teams to scale local compute by physically stacking units with coherent interconnects.
For AI workloads, the hardware's value is defined by its memory architecture and compute throughput. The GX10 delivers 250 INT8 TOPS, making it well suited to quantized inference at the edge.
The 128 GB of unified memory is the primary selling point of the ASUS Ascent GX10 - 4TB: because the Grace CPU and Blackwell GPU share a single coherent pool, practitioners can run models that are typically reserved for data-center hardware.
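A quick way to see what 128 GB buys you is to compute raw weight footprints at common quantization levels. The sketch below counts weights only; KV cache, activations, and runtime overhead (which it ignores) consume additional gigabytes in practice:

```python
# Rough weight-memory footprint at common quantization levels.
# Counts weights only; KV cache and runtime overhead are ignored.

def weights_gb(params_b: float, bits_per_weight: float) -> float:
    """GB of memory occupied by `params_b` billion weights at the given precision."""
    return params_b * bits_per_weight / 8

for bits in (16, 8, 4):
    print(f"70B model @ {bits}-bit: {weights_gb(70, bits):.0f} GB of weights")
```

The arithmetic explains the 70B+ claim: at 8-bit a 70B model needs 70 GB of weights and fits in 128 GB with headroom for KV cache, while at full 16-bit precision (140 GB) it does not.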
While exact tokens per second depend on the quantization method (K-Quants vs. AWQ), the benchmark table below gives representative throughput figures.
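As a rough guide, single-stream decode speed on hardware like this is bounded by memory bandwidth divided by the bytes read per generated token. The sketch below assumes a 273 GB/s memory bus and 60% achieved efficiency; both figures are assumptions for illustration, not GX10 specifications:

```python
# Upper-bound decode throughput: each generated token reads all active weights once.
# The 273 GB/s bandwidth and 0.6 efficiency are assumptions, not GX10 specs.

def max_tok_s(active_params_b: float, bits_per_weight: float,
              bandwidth_gb_s: float, efficiency: float = 0.6) -> float:
    """Estimated tokens/s when decoding is memory-bandwidth bound."""
    gb_read_per_token = active_params_b * bits_per_weight / 8
    return bandwidth_gb_s * efficiency / gb_read_per_token

# A dense 70B model vs. a 30B MoE with only 3B active parameters, both at 4-bit.
print(f"70B dense @ 4-bit:          {max_tok_s(70, 4, 273):.1f} tok/s")
print(f"30B MoE (3B active) @ 4-bit: {max_tok_s(3, 4, 273):.1f} tok/s")
```

This is also why mixture-of-experts models punch above their weight on this class of hardware: throughput scales with *active* parameters per token, not total parameters.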
The ASUS Ascent GX10 - 4TB is not a gaming machine or a general-purpose office PC; it is a specialized tool for the following personas:
When evaluating the ASUS Ascent GX10 - 4TB, practitioners typically look at two main alternatives:
The Mac Studio is the GX10's closest competitor in the prosumer space.
The ASUS Ascent GX10 - 4TB is a purpose-built inference engine. For teams that need to run 70B+ parameter models locally with a native NVIDIA software stack, but cannot justify the cost or power requirements of an H100, the GX10 is the current benchmark for compact AI compute.
| Model | Developer | Parameters | Rating | Throughput | Memory |
|---|---|---|---|---|---|
| Qwen3-30B-A3B | Alibaba Cloud (Qwen) | 30B (3B active) | SS | 40.8 tok/s | 5.4 GB |
| BAGEL-7B-MoT | ByteDance | 14B (7B active) | AA | 45.9 tok/s | 4.8 GB |
| Stable Diffusion 3.5 Large | Stability AI | 8.1B | AA | 40.2 tok/s | 5.5 GB |
| e5-mistral-7b-instruct | intfloat (Microsoft Research) | 7.1B | AA | 45.9 tok/s | 4.8 GB |
| SFR-Embedding-Mistral | Salesforce | 7.1B | AA | 45.9 tok/s | 4.8 GB |
| Linq-Embed-Mistral | Linq AI Research | 7.1B | AA | 45.9 tok/s | 4.8 GB |
| GritLM-7B | GritLM (Contextual AI) | 7.2B | AA | 45.3 tok/s | 4.9 GB |
| llama-embed-nemotron-8b | NVIDIA | 7.5B | AA | 45.9 tok/s | 4.8 GB |
| F2LLM-v2-8B | CodeFuse-AI (Ant Group) | 7.6B | AA | 46.5 tok/s | 4.7 GB |
| Octen-Embedding-8B | Octen AI | 7.6B | AA | 46.5 tok/s | 4.7 GB |
| Qwen3-Embedding-8B | Qwen/Alibaba | 7.6B | AA | 46.5 tok/s | 4.7 GB |
| gte-Qwen2-7B-instruct | Alibaba-NLP (Tongyi Lab) | 7.1B | AA | 49.0 tok/s | 4.5 GB |
| | | 8B | AA | 38.8 tok/s | 5.7 GB |
| | | 9B | AA | 36.5 tok/s | 6.0 GB |
| FLUX.2 [klein] 9B | Black Forest Labs | 9B | AA | 36.5 tok/s | 6.0 GB |
| | | 9B | AA | 36.5 tok/s | 6.0 GB |
| Llama 2 7B Chat | Meta | 7B | AA | 45.9 tok/s | 4.8 GB |
| Phi-4-multimodal-instruct | Microsoft | 5.6B | AA | 55.9 tok/s | 3.9 GB |
| Z-Image-Turbo | Alibaba | 6B | AA | 52.6 tok/s | 4.2 GB |
| BOOM_4B_v1 | ICT-CAS TIME / Querit | 4B | AA | 81.2 tok/s | 2.7 GB |
| F2LLM-v2-4B | CodeFuse-AI (Ant Group) | 4B | AA | 81.2 tok/s | 2.7 GB |
| Qwen3-Embedding-4B | Qwen/Alibaba | 4B | AA | 81.2 tok/s | 2.7 GB |
| FLUX.2 [klein] 4B | Black Forest Labs | 4B | AA | 74.5 tok/s | 3.0 GB |
| Mochi 1 Preview | Genmo AI | 10B | AA | 33.2 tok/s | 6.6 GB |
| | | 11.8B | AA | 30.9 tok/s | 7.1 GB |