NVIDIA's most powerful embedded AI platform with 800 TOPS, 128GB LPDDR5X, and Blackwell GPU. Designed for humanoid robotics, autonomous vehicles, and safety-critical AI systems.
The NVIDIA Jetson AGX Thor Developer Kit represents the pinnacle of edge computing, specifically engineered to bridge the gap between data-center-class performance and embedded power constraints. As the successor to the AGX Orin, Thor utilizes the Blackwell GPU architecture to deliver a massive leap in compute density. For engineers building autonomous systems, humanoid robotics, or high-throughput agentic workflows at the edge, this is currently the highest-performing silicon available in a compact form factor.
Positioned as a premium, production-ready development platform, the AGX Thor is not a consumer-grade toy. It is a specialized tool for ML researchers and robotics engineers who require massive INT8 throughput and significant VRAM overhead for multi-modal sensor fusion and local LLM reasoning. While it competes loosely with high-end desktop GPUs like the RTX 4090 or specialized Mac Studio configurations, its true competition lies in the industrial sector—outperforming the previous Jetson AGX Orin by 7.5x in AI performance and offering a 3.5x improvement in efficiency.
The defining characteristic of the NVIDIA Jetson AGX Thor Developer Kit for AI is its Blackwell-based architecture. This isn't just a marginal upgrade; it introduces FP4 precision support, which is critical for the next generation of quantized model deployment.
With 800 TOPS of INT8 performance and 2,070 FP4 TFLOPS, Thor provides the raw compute necessary for real-time vision transformers and high-speed LLM inference. For practitioners, this means the ability to run complex perception pipelines alongside large language models without hitting the compute ceiling that plagues smaller edge devices.
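Before tuning anything, it is worth confirming what the software stack actually sees. A minimal sanity check, assuming a CUDA-enabled PyTorch build for JetPack (aarch64) is installed; the printed values will vary by unit and driver:

```python
import torch

# Query the integrated GPU that PyTorch sees on a JetPack install.
assert torch.cuda.is_available(), "No CUDA device visible; check your JetPack/PyTorch install"

props = torch.cuda.get_device_properties(0)
print(f"Device:             {props.name}")
print(f"Compute capability: {props.major}.{props.minor}")
print(f"Total memory:       {props.total_memory / 1024**3:.1f} GiB")  # unified LPDDR5X pool
print(f"SM count:           {props.multi_processor_count}")
```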
The 128GB LPDDR5X memory is a game-changer for local AI agents. In the edge AI space, VRAM is the primary bottleneck for model size. With 128GB of unified memory, Thor effectively functions as a high-capacity GPU for AI, allowing developers to load massive weights that would typically require a multi-GPU server rack. The 273 GB/s memory bandwidth ensures that while it may not match the 1TB/s+ speeds of an H100, it provides sufficient data movement to keep token generation fluid for most real-time applications.
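That bandwidth figure translates into a hard ceiling on decode speed: generating one token streams every active weight through the memory bus once, so throughput is bounded by bandwidth divided by active-weight bytes. A back-of-the-envelope sketch (weights only; KV-cache and activation traffic push real numbers lower):

```python
def decode_ceiling_tok_s(active_weight_bytes: float, bandwidth_gb_s: float = 273.0) -> float:
    """Upper bound on decode speed for a memory-bandwidth-bound LLM.

    Each generated token must read every active weight once, so
    tokens/s <= bandwidth / bytes_of_active_weights. Real results land
    below this because of KV-cache reads, activations, and overhead.
    """
    return bandwidth_gb_s * 1e9 / active_weight_bytes

# 70B dense model at ~4.5 bits/weight (Q4_K_M) is roughly 40 GB of weights.
print(f"70B dense @ Q4:       {decode_ceiling_tok_s(40e9):.1f} tok/s ceiling")   # ~6.8 tok/s

# MoE model with 37B active parameters at the same quantization touches ~21 GB per token.
print(f"MoE, 37B active @ Q4: {decode_ceiling_tok_s(21e9):.1f} tok/s ceiling")   # ~13.0 tok/s
```

The roughly 5 tok/s figures for 70B-class dense models in the benchmark table below sit just under this ceiling, which is consistent with a bandwidth-bound decode.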
NVIDIA has optimized Thor for power-constrained environments where traditional 450W desktop cards are non-viable. The 3.5x efficiency gain over Orin allows for higher sustained clock speeds during long-running inference tasks, making it the best AI chip for local deployment in rugged or mobile environments like autonomous vehicles.
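On Jetson platforms, the power envelope is managed through the stock JetPack tools `nvpmodel` and `jetson_clocks`. A hedged sketch for pinning a profile before a long-running job; mode numbering is platform-specific, so treat mode 0 as an assumption and verify against `nvpmodel -q` output on your own unit:

```python
import subprocess

def run(cmd: list[str]) -> str:
    # Run a shell command and return its stdout; raises on failure.
    return subprocess.run(cmd, check=True, capture_output=True, text=True).stdout

print(run(["sudo", "nvpmodel", "-q"]))   # show the active power mode
run(["sudo", "nvpmodel", "-m", "0"])     # mode 0 is typically the max-power profile (assumption)
run(["sudo", "jetson_clocks"])           # lock clocks at the selected mode's maximums
```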
The NVIDIA Jetson AGX Thor Developer Kit's 128GB memory pool opens doors for large language models that were previously closed to edge devices. The hardware is sized for running 70B+ parameter models at Q4 quantization and beyond.
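As a concrete starting point, here is a minimal sketch using llama-cpp-python; the model path and context size are illustrative, and a Q4_K_M 70B GGUF weighs roughly 40 GB, leaving ample headroom in 128 GB:

```python
from llama_cpp import Llama  # pip install llama-cpp-python (built with CUDA support)

# Load a Q4-quantized 70B-class GGUF entirely into Thor's unified memory.
llm = Llama(
    model_path="./models/llama-2-70b-chat.Q4_K_M.gguf",  # illustrative path
    n_gpu_layers=-1,   # offload every layer to the integrated GPU
    n_ctx=8192,        # context window; raise as memory allows
)

out = llm(
    "Q: What sensors does a humanoid robot need for indoor navigation? A:",
    max_tokens=128,
)
print(out["choices"][0]["text"])
```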
Thor's primary design intent is humanoid robotics, so it excels at running segmentation (SAM), object detection (YOLOv11), and vision-language models (VILA, LLaVA) simultaneously. In an autonomous workflow, Thor can process multiple 4K camera streams while running a local LLM that makes navigational decisions based on visual input.
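A simplified sketch of that pattern: several capture threads feed one shared detector. Device indices, queue sizing, and the YOLO checkpoint are assumptions, and a production pipeline would typically use DeepStream or GStreamer rather than raw OpenCV:

```python
import threading
import queue

import cv2
from ultralytics import YOLO  # pip install ultralytics

# Bounded queue gives backpressure; a real pipeline would drop stale frames instead.
frames = queue.Queue(maxsize=8)

def capture(cam_id: int) -> None:
    # GMSL2 cameras typically surface as /dev/video* nodes via the JetPack camera stack.
    cap = cv2.VideoCapture(cam_id)
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        frames.put((cam_id, frame))

for cam_id in (0, 1, 2, 3):  # illustrative device indices
    threading.Thread(target=capture, args=(cam_id,), daemon=True).start()

model = YOLO("yolo11n.pt")  # nano checkpoint; swap in a larger variant as needed
while True:
    cam_id, frame = frames.get()
    results = model(frame, verbose=False)  # single shared GPU context
    print(f"cam {cam_id}: {len(results[0].boxes)} detections")
```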
Exact numbers vary with the inference stack (TensorRT-LLM vs. llama.cpp) and quantization level, but the benchmark table at the end of this section summarizes the AI inference performance users can expect from the NVIDIA Jetson AGX Thor Developer Kit.
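To produce a comparable number on your own unit, a minimal timing harness (again using llama-cpp-python; the model file and prompt are placeholders, and a single run is indicative rather than definitive):

```python
import time
from llama_cpp import Llama

llm = Llama(model_path="./models/mistral-7b-instruct.Q4_K_M.gguf", n_gpu_layers=-1)

start = time.perf_counter()
out = llm("Explain sensor fusion in one paragraph.", max_tokens=256)
elapsed = time.perf_counter() - start

# Count tokens actually generated (the model may stop early at an EOS token).
n_tokens = out["usage"]["completion_tokens"]
print(f"{n_tokens / elapsed:.1f} tok/s (prefill + decode averaged)")
```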
The AGX Thor is the best edge device for autonomous workflows where cloud latency is unacceptable and data privacy is paramount.
When evaluating the NVIDIA Jetson AGX Thor Developer Kit vs. alternatives, it is important to distinguish between "raw desktop power" and "edge-integrated power."
The Orin was the previous gold standard with 275 TOPS. Thor provides a 7.5x increase in AI compute. If your workload involves LLMs larger than 13B or requires real-time 3D world-model generation, the upgrade to Thor is mandatory. Orin remains a viable mid-tier option for simpler CV tasks, but Thor is the clear choice for "Agentic" edge AI.
The Mac Studio is often cited as the best hardware for local AI agents in 2025 due to its unified memory (up to 192GB). However, the Mac Studio lacks the industrial I/O, GMSL2 camera inputs, and ruggedized power delivery required for edge deployment. While the Mac is a superior desktop development environment, the AGX Thor is the superior deployment platform for robotics and field-based AI.
An RTX 6000 Ada offers 48GB of VRAM and higher raw TFLOPS, but at a significantly higher power draw and without the ARM-based embedded ecosystem. For practitioners who need more than 48GB of VRAM in a single, efficient package without building a multi-GPU tower, the Thor’s 128GB unified memory pool is a more elegant and power-efficient solution for large-scale model inference.
LLM inference benchmarks for the NVIDIA Jetson AGX Thor Developer Kit:

| Model | Developer | Parameters | Tier | Throughput | Memory |
|---|---|---|---|---|---|
| Qwen3-30B-A3B | Alibaba Cloud (Qwen) | 30B (3B active) | S | 40.8 tok/s | 5.4 GB |
| | | 8B | A | 38.8 tok/s | 5.7 GB |
| Llama 2 7B Chat | Meta | 7B | A | 45.9 tok/s | 4.8 GB |
| Gemma 4 E2B IT | Google | 2B | A | 59.3 tok/s | 3.7 GB |
| Qwen3.5-35B-A3B | Alibaba Cloud (Qwen) | 35B (3B active) | A | 25.8 tok/s | 8.5 GB |
| Mistral 7B Instruct | Mistral AI | 7B | A | 34.4 tok/s | 6.4 GB |
| Llama 2 13B Chat | Meta | 13B | A | 26.0 tok/s | 8.5 GB |
| Gemma 4 E4B IT | Google | 4B | A | 31.8 tok/s | 6.9 GB |
| Gemma 3 4B IT | Google | 4B | A | 31.8 tok/s | 6.9 GB |
| Mixtral 8x7B Instruct | Mistral AI | 46.7B (12.9B active) | B | 19.3 tok/s | 11.4 GB |
| Gemma 4 26B-A4B IT | Google | 26B (4B active) | B | 20.0 tok/s | 11.0 GB |
| Mistral Large 3 675B | Mistral AI | 675B (41B active) | B | 3.3 tok/s | 66.3 GB |
| DeepSeek-V3 | DeepSeek | 671B (37B active) | B | 3.7 tok/s | 59.8 GB |
| DeepSeek-R1 | DeepSeek | 671B (37B active) | B | 3.7 tok/s | 59.8 GB |
| DeepSeek-V3.1 | DeepSeek | 671B (37B active) | B | 3.7 tok/s | 59.8 GB |
| DeepSeek-V3.2 | DeepSeek | 685B (37B active) | B | 3.7 tok/s | 59.8 GB |
| Kimi K2 Instruct 0905 | Moonshot AI | 1000B (32B active) | B | 2.6 tok/s | 84.6 GB |
| Kimi K2 Thinking | Moonshot AI | 1000B (32B active) | B | 2.6 tok/s | 84.6 GB |
| Kimi K2.5 | Moonshot AI | 1000B (32B active) | B | 2.6 tok/s | 84.6 GB |
| Kimi K2 Instruct | Moonshot AI | 1000B (32B active) | B | 4.2 tok/s | 51.8 GB |
| Qwen3.5-27B | Alibaba Cloud (Qwen) | 27B | B | 3.0 tok/s | 72.8 GB |
| Gemma 4 31B IT | Google | 31B | B | 2.7 tok/s | 82.0 GB |
| | | 70B | B | 4.8 tok/s | 45.7 GB |
| Qwen3.5-397B-A17B | Alibaba Cloud (Qwen) | 397B (17B active) | B | 4.8 tok/s | 46.0 GB |
| Llama 2 70B Chat | Meta | 70B | B | 5.1 tok/s | 43.4 GB |