
An affordable entry point into the GB10 ecosystem, using a 1TB PCIe Gen 4.0 SSD to bring the MSRP down.
The ASUS Ascent GX10 - 1TB is a compact, high-density AI workstation built around the NVIDIA GB10 Superchip. By pairing a 20-core Armv9.2-A CPU with an integrated Blackwell GPU and 128GB of unified LPDDR5x memory, ASUS has created a "desktop supercomputer" that fits within a 150 x 150 x 51 mm footprint. This 1TB configuration is the entry-level SKU of the GX10 lineup: it uses a PCIe Gen 4.0 SSD to lower the barrier of entry into the Grace Blackwell ecosystem while retaining the same 128GB VRAM pool as the higher-end variants.
For AI engineers and researchers, the Ascent GX10 represents a shift away from traditional x86-plus-discrete-GPU setups toward a unified memory architecture. This design eliminates the PCIe bottleneck between CPU and GPU by using NVLink-C2C to provide coherent memory access. In the AI PC market, the GX10 competes directly with the Apple Mac Studio (M2/M3 Ultra) and specialized edge AI boxes from vendors like GIGABYTE and Dell. Its primary advantage, however, is the native NVIDIA software stack (CUDA, TensorRT, and NIM) running on a system with enough VRAM to handle large-scale LLMs that would typically require dual A6000 or H100 configurations.
The defining feature of the ASUS Ascent GX10 - 1TB is its 128GB of unified VRAM. Unlike consumer GPUs, which typically top out at 16GB or 24GB, this 128GB pool is shared across the entire system, allowing massive model weights to be loaded without resorting to a multi-GPU cluster.
The ASUS Ascent GX10 - 1TB is specifically engineered for running models up to roughly 200B parameters (quantized, e.g., to 4-bit precision). Because the VRAM is unified, nearly the entire 128GB pool can be allocated to the model weights, leaving modest headroom for the OS and KV cache.
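As a rough sanity check, here is the arithmetic behind that claim (a minimal sketch in Python; the OS overhead figure is an illustrative assumption, not a measured value):

```python
# Back-of-the-envelope memory budget for a 200B-parameter model
# on the GX10's 128GB unified pool. Figures are illustrative.
params = 200e9            # 200B parameters
bytes_per_param = 0.5     # 4-bit (FP4/INT4) quantized weights
weights_gb = params * bytes_per_param / 1e9   # = 100 GB

total_gb = 128            # unified VRAM pool
os_overhead_gb = 8        # assumed OS + runtime reservation
kv_cache_gb = total_gb - weights_gb - os_overhead_gb

print(f"Weights:  {weights_gb:.0f} GB")   # 100 GB
print(f"KV cache: {kv_cache_gb:.0f} GB")  # ~20 GB of headroom
```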
The ASUS Ascent GX10 - 1TB is a specialized tool for practitioners who have outgrown consumer hardware but do not have the budget or infrastructure for a liquid-cooled H100 rack.
When evaluating the ASUS Ascent GX10 - 1TB, it is helpful to look at it against its closest competitors in the high-VRAM workstation space.
The Mac Studio offers up to 192GB of unified memory on the M2 Ultra (the M3 Ultra can be configured with even more) and is a popular choice for local LLMs. However, the GX10 wins on software compatibility. Because the GX10 uses an NVIDIA Blackwell GPU, it supports the full CUDA ecosystem natively. For developers working with DeepSpeed, FlashAttention-2, or specific TensorRT kernels, the GX10 offers a "path of least resistance" compared to the Metal-based optimization required for Apple Silicon.
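A minimal sketch of what that looks like in practice, using standard PyTorch CUDA APIs (the reported device name and visible memory will vary with the driver and PyTorch build):

```python
import torch

# On the GX10 the integrated Blackwell GPU appears as an ordinary
# CUDA device, so the standard stack runs unmodified on aarch64.
assert torch.cuda.is_available(), "no CUDA device found"

props = torch.cuda.get_device_properties(0)
print(props.name)                                    # GPU name
print(f"{props.total_memory / 1e9:.0f} GB visible")  # unified pool
print(torch.version.cuda)                            # CUDA toolkit version
```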
A custom-built PC with two 24GB GPUs provides only 48GB of VRAM (even with NVLink on older 3090s). To match the 128GB VRAM of the GX10, you would need six RTX 3090s, requiring a massive chassis, a 1600W+ power supply, and complex multi-GPU orchestration. The GX10 provides nearly 3x the VRAM of a dual-4090 setup in a fraction of the space and power.
The Jetson AGX Orin is a fantastic edge device but is limited to 64GB of memory and a significantly lower compute ceiling. The GX10 is the logical "next step" for those who find the Jetson AGX Orin insufficient for 70B+ parameter models or high-throughput production inference.
The 1TB storage configuration of the GX10 is the most cost-effective way to access the Blackwell GB10 architecture. While the PCIe 4.0 interface is technically slower than the 5.0 interface found in the 4TB model, the impact on LLM inference is negligible, as the bottleneck for token generation is memory bandwidth (273 GB/s), not disk read speed. For teams looking for the best AI chip for local deployment on a budget, the Ascent GX10 - 1TB is the most efficient entry point into the 128GB VRAM tier.
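To see why, consider a rough upper bound for single-stream decoding, where each generated token streams the active weights through memory once (the model size and quantization below are illustrative assumptions):

```python
# Memory-bandwidth-bound ceiling for single-stream token generation:
# each token must read (roughly) all active weights from memory.
bandwidth_gb_s = 273       # GB10 LPDDR5x memory bandwidth
active_params = 70e9       # assumed: a 70B dense model
bytes_per_param = 0.5      # assumed: 4-bit quantization

gb_per_token = active_params * bytes_per_param / 1e9  # 35 GB read/token
max_tok_s = bandwidth_gb_s / gb_per_token
print(f"~{max_tok_s:.1f} tok/s upper bound")          # ~7.8 tok/s
```

Faster storage only shortens the initial load from disk; once the weights are resident, steady-state generation never touches the SSD.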
| Model | Developer | Parameters | Grade | Throughput | Memory |
| --- | --- | --- | --- | --- | --- |
| Qwen3-30B-A3B | Alibaba Cloud (Qwen) | 30B (3B active) | SS | 40.8 tok/s | 5.4 GB |
| BAGEL-7B-MoT | ByteDance | 14B (7B active) | AA | 45.9 tok/s | 4.8 GB |
| Stable Diffusion 3.5 Large | Stability AI | 8.1B | AA | 40.2 tok/s | 5.5 GB |
| e5-mistral-7b-instruct | intfloat (Microsoft Research) | 7.1B | AA | 45.9 tok/s | 4.8 GB |
| SFR-Embedding-Mistral | Salesforce | 7.1B | AA | 45.9 tok/s | 4.8 GB |
| Linq-Embed-Mistral | Linq AI Research | 7.1B | AA | 45.9 tok/s | 4.8 GB |
| GritLM-7B | GritLM (Contextual AI) | 7.2B | AA | 45.3 tok/s | 4.9 GB |
| llama-embed-nemotron-8b | NVIDIA | 7.5B | AA | 45.9 tok/s | 4.8 GB |
| F2LLM-v2-8B | CodeFuse-AI (Ant Group) | 7.6B | AA | 46.5 tok/s | 4.7 GB |
| Octen-Embedding-8B | Octen AI | 7.6B | AA | 46.5 tok/s | 4.7 GB |
| Qwen3-Embedding-8B | Qwen/Alibaba | 7.6B | AA | 46.5 tok/s | 4.7 GB |
| gte-Qwen2-7B-instruct | Alibaba-NLP (Tongyi Lab) | 7.1B | AA | 49.0 tok/s | 4.5 GB |
|  |  | 8B | AA | 38.8 tok/s | 5.7 GB |
|  |  | 9B | AA | 36.5 tok/s | 6.0 GB |
| FLUX.2 [klein] 9B | Black Forest Labs | 9B | AA | 36.5 tok/s | 6.0 GB |
|  |  | 9B | AA | 36.5 tok/s | 6.0 GB |
| Llama 2 7B Chat | Meta | 7B | AA | 45.9 tok/s | 4.8 GB |
| Phi-4-multimodal-instruct | Microsoft | 5.6B | AA | 55.9 tok/s | 3.9 GB |
| Z-Image-Turbo | Alibaba | 6B | AA | 52.6 tok/s | 4.2 GB |
| BOOM_4B_v1 | ICT-CAS TIME / Querit | 4B | AA | 81.2 tok/s | 2.7 GB |
| F2LLM-v2-4B | CodeFuse-AI (Ant Group) | 4B | AA | 81.2 tok/s | 2.7 GB |
| Qwen3-Embedding-4B | Qwen/Alibaba | 4B | AA | 81.2 tok/s | 2.7 GB |
| FLUX.2 [klein] 4B | Black Forest Labs | 4B | AA | 74.5 tok/s | 3.0 GB |
| Mochi 1 Preview | Genmo AI | 10B | AA | 33.2 tok/s | 6.6 GB |
|  |  | 11.8B | AA | 30.9 tok/s | 7.1 GB |