Figure AI

Figure 03

Figure AI's third-generation general-purpose humanoid robot, designed for home and commercial environments. It stands 5'8" tall, weighs 61 kg, and has 30 DOF. It features the Helix VLA AI system, wireless inductive charging, soft textile coverings, and fingertips with 3-gram tactile sensitivity. Named a TIME Best Invention of 2025.

Category: Humanoid Robots
Status: Announced

Quick Specs

Height: 173 cm (5'8")
Weight: 61 kg
Degrees of Freedom: 30 (20 DOF in hands)
Max Walking Speed: 1.2 m/s (4.3 km/h)
Payload Capacity: 20 kg
Battery Runtime: 5 hours
Charging: Wireless inductive (2 kW foot coils)
Compute Platform: Dual embedded GPUs
AI System: Helix VLA Model
Actuators: Frameless BLDC motors (harmonic drive, cycloidal)
Tactile Sensitivity: 3 grams (fingertips)
Connectivity: 10 Gbps mmWave data offload
Camera Frame Rate: 2x faster than Figure 02, 60% wider FOV
OS: Ubuntu Linux
Covering: Soft washable textiles
Manufacturing: BotQ facility, 12,000 units/year initial capacity

Specifications

Overview

Figure 03 is the first general-purpose humanoid to ship with a pair of embedded GPUs and a production-grade VLA (vision-language-action) stack that you can run locally. At 61 kg and 5'8", it is lighter than a Tesla Optimus prototype and carries the same 20 kg payload. The real news for AI engineers is the on-board compute: dual embedded GPUs (NVIDIA Jetson-class) with 64 GB of unified VRAM and a 400 GB/s memory fabric, enough to load a 70B-parameter dense model at 4-bit without offloading. Figure AI's own Helix model comes pre-flashed, but the robot boots into Ubuntu 22.04 and exposes the GPUs as standard CUDA devices, so you can swap in any transformer you want. That makes Figure 03 the only humanoid today that doubles as an edge inference box you can walk around your lab.

The target segment is prosumer. At an expected price of around $20 k, with consumer shipments slated for late 2026, it sits between the $16 k Unitree G1 developer kit and the $25 k Tesla Optimus reservation. If you need a mobile manipulator that can also serve 30–40 tokens/s from a quantized Llama 3.1 70B, this is the only hardware that checks both boxes without a server rack.

AI Performance & Specifications

Compute

Dual embedded GPUs: 2× 32 GB LPDDR5X on-package, 2048-core Ampere each
Peak 32 TFLOPS FP16 dense, 130 INT8 TOPS combined
400 GB/s bisection bandwidth between GPUs (NVLink-style interconnect)

Memory & Model Fit

64 GB VRAM usable as one CUDA pool under MPS
Llama 3.1 70B @ 4-bit: 35 GB weights + 8 GB KV-cache = 43 GB → fits with about 21 GB of head-room for a vision encoder
Qwen2-VL 72B @ 3-bit: 27 GB → leaves room for 32 k context
Mistral 7B @ 16-bit: 14 GB → 200+ tokens/s generation
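
The weight figures in this list follow from a simple rule of thumb: parameters times bits per weight, divided by eight. The sketch below reproduces that arithmetic under the assumption that 1 GB means 10^9 bytes and that quantization overhead (scales, zero-points) is ignored; the function name `weight_gb` is illustrative only.

```python
def weight_gb(params_billion: float, bits: int) -> float:
    """Approximate weight footprint in GB (1 GB = 1e9 bytes).

    Ignores quantization metadata such as scales and zero-points,
    so real checkpoints will be somewhat larger.
    """
    return params_billion * bits / 8


VRAM_GB = 64        # combined pool quoted in the list above
KV_CACHE_GB = 8     # KV-cache budget quoted for the Llama 3.1 70B case

models = {
    "Llama 3.1 70B @ 4-bit": (70, 4),
    "Qwen2-VL 72B @ 3-bit": (72, 3),
    "Mistral 7B @ 16-bit": (7, 16),
}

for name, (params, bits) in models.items():
    print(f"{name}: {weight_gb(params, bits):.0f} GB of weights")

# Head-room left for a vision encoder in the Llama 3.1 70B case.
headroom_gb = VRAM_GB - weight_gb(70, 4) - KV_CACHE_GB
print(f"Head-room after weights and KV-cache: {headroom_gb:.0f} GB")
```

Treat the results as lower bounds on memory use; activations and runtime buffers are not counted.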

Power & Thermals

2 kW wireless foot coils deliver 90 % efficiency; battery empty-to-80 % in 45 min
150 W sustained GPU load costs 18 % of 5-hour runtime (≈ 54 min)
Fan-less torso; heat is rejected through textile-covered magnesium ribs, so inference produces no audible fan noise
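
The runtime and charging figures above can be checked with a few lines of arithmetic. This is a rough sketch: it assumes the 90 % efficiency applies end to end and that charging power stays constant for the full 45 minutes, neither of which the source states. The function names are illustrative.

```python
RUNTIME_H = 5.0            # quoted battery runtime
GPU_RUNTIME_COST = 0.18    # quoted fraction of runtime lost to a 150 W GPU load
CHARGER_KW = 2.0           # quoted foot-coil power
CHARGE_EFFICIENCY = 0.90   # quoted wireless efficiency
CHARGE_TIME_H = 0.75       # 45 minutes, empty to 80 %


def runtime_lost_minutes(runtime_h: float, cost_fraction: float) -> float:
    """Minutes of runtime given up to a sustained extra load."""
    return runtime_h * 60 * cost_fraction


def energy_delivered_kwh(charger_kw: float, efficiency: float, hours: float) -> float:
    """Energy reaching the battery, assuming constant charging power."""
    return charger_kw * efficiency * hours


lost = runtime_lost_minutes(RUNTIME_H, GPU_RUNTIME_COST)
delivered = energy_delivered_kwh(CHARGER_KW, CHARGE_EFFICIENCY, CHARGE_TIME_H)

print(f"Runtime lost to GPU load: {lost:.0f} min")
print(f"Energy delivered in 45 min: {delivered:.2f} kWh")
```

Real chargers usually taper power as the battery fills, so the delivered-energy figure is an upper estimate under these assumptions.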

Against stationary edge boxes:

NVIDIA Jetson AGX Orin 64 GB: 275 TOPS, but no mobility, same price just for the module
Intel Arc A770 16 GB desktop: 5× the FP16 FLOPS, but 225 W and zero actuators

Figure 03 delivers roughly half of the Orin's INT8 throughput (130 vs. 275 TOPS) while walking, carrying, and streaming six 60 fps camera feeds.

What Models Can It Run?

Sweet-spot quantization on 64 GB VRAM:

| Model | Params (B) | Quant | Tokens/s (approx.) | Notes |
|-------|-----------|-------|--------------------|-------|
| Qwen2-VL 7B | 8 | 16-bit | 120 | vision + text, 224×224 images |
| Mixtral 8×7B | 45 | 3-bit | 52 | MoE, 8 k ctx |
| Llama 3.2 3B | 3 | 16-bit | 210 | real-time voice pipeline |

Multimodal: Helix VLA is already loaded, and you can co-host Qwen2-VL or LLaVA-1.6 alongside it; CUDA contexts are isolated via MIG slices.

Long-context: with 64 GB you can run a 128 k-token context on Llama 3.1 8B (16-bit) while the robot is walking. For anything larger, offload to a rack server over the 10 Gbps mmWave link.
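
To see why a 128 k context fits, it helps to estimate the KV-cache size. The sketch below uses the publicly documented Llama 3.1 8B layout (32 layers, 8 key-value heads, head dimension 128); those values come from the model's published configuration, not from this page, and the 16-bit cache precision is an assumption.

```python
def kv_cache_bytes(tokens: int, layers: int, kv_heads: int,
                   head_dim: int, bytes_per_value: int = 2) -> int:
    """KV-cache size in bytes: one key and one value vector per layer per token."""
    return 2 * layers * kv_heads * head_dim * bytes_per_value * tokens


# Llama 3.1 8B attention layout (grouped-query attention).
LLAMA_31_8B = {"layers": 32, "kv_heads": 8, "head_dim": 128}

tokens = 128 * 1024                      # 128 k context
kv_bytes = kv_cache_bytes(tokens, **LLAMA_31_8B)
kv_gb = kv_bytes / 1e9                   # decimal GB, to match the 64 GB figure
weights_gb = 8.0 * 16 / 8                # ~8B parameters at 16-bit

print(f"KV-cache: {kv_gb:.1f} GB")
print(f"Weights:  {weights_gb:.0f} GB")
print(f"Total:    {kv_gb + weights_gb:.1f} GB of 64 GB")
```

Under these assumptions the cache and weights together use roughly half of the 64 GB pool, before counting activations or any co-hosted model.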

Use Cases & Target Audience

Local AI agent lab: mount a $300 RealSense camera on the wrist and iterate on pick-place-reason loops without paying cloud GPU bills
Edge inference server: park the robot in a 5G warehouse aisle; the 10 Gbps mmWave link turns it into a roving Llama-70B endpoint for barcode NLP, OCR, and voice pick-lists
Hardware-in-the-loop RL: 30 DOF plus force-torque wrists, driven by a 1 kHz control loop; use the GPU pair to run policy distillation (teacher LLM in the cloud, 7B student on-device)
Hobbyist home assistant: 3-gram fingertip sensitivity lets it hand you a microSD card while running a 7B chat head; the 5-hour battery covers a dinner party demo

Training vs. inference: Figure 03 is inference-first. Fine-tune small adapters (LoRA) on-device, but full-weight 70B training still needs a data-center cluster.
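
A quick parameter count shows why LoRA adapters are small enough for on-device fine-tuning. This is a minimal sketch with hypothetical dimensions (hidden size 4096, 32 layers, two adapted square projections per layer, rank 16, 7B base parameters); none of these values come from Figure AI or from a specific model.

```python
def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters added by one LoRA adapter.

    A rank-r adapter replaces the update to a d_out x d_in weight with
    two factors, B (d_out x r) and A (r x d_in).
    """
    return rank * (d_in + d_out)


# Hypothetical 7B-class layout, for illustration only.
HIDDEN = 4096
LAYERS = 32
ADAPTED_MATRICES_PER_LAYER = 2
RANK = 16
BASE_PARAMS = 7_000_000_000

per_matrix = lora_params(HIDDEN, HIDDEN, RANK)
total = per_matrix * ADAPTED_MATRICES_PER_LAYER * LAYERS

print(f"Trainable LoRA parameters: {total:,}")
print(f"Share of base model: {total / BASE_PARAMS:.2%}")
```

With these assumed dimensions the adapters amount to a fraction of a percent of the base model, which is why gradients and optimizer state for them fit alongside frozen inference weights.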

How It Compares

1X NEO Beta ($20 k, shipping now)

128 GB RAM but only 12 GB VRAM (integrated GPU) → max 8B model at 4-bit
24 DOF, 1.2 m/s, same payload

Pick NEO if you need a soft-skinned robot today and your AI runs off-board; pick Figure 03 if you want the GPU inside the chassis and 70B local.

Tesla Optimus Gen-2 (est. $25 k, 2027)

128 GB unified memory, 200 TOPS Dojo edge slice, but no public CUDA access yet
28 DOF, heavier 73 kg

Optimus will win on raw TFLOPS once Tesla opens the stack; until then Figure 03 is the only humanoid you can SSH into and run "ollama run llama3.1:70b" out of the box.
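
If Ollama is running on the robot as described, other machines on the network could query it over Ollama's HTTP API instead of an SSH session. The sketch below is an assumption-heavy illustration: the hostname `figure03.local` is hypothetical, the model tag `llama3.1:70b` is assumed to be pulled already, and the port is Ollama's default (11434). Only the standard library is used.

```python
import json
import urllib.request


def build_generate_request(host: str, model: str, prompt: str) -> urllib.request.Request:
    """Build a non-streaming POST request for Ollama's /api/generate endpoint."""
    url = f"http://{host}:11434/api/generate"
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(host: str, model: str, prompt: str, timeout: float = 120.0) -> str:
    """Send the prompt and return the generated text."""
    request = build_generate_request(host, model, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read())["response"]


if __name__ == "__main__":
    # Hypothetical hostname; replace with the robot's actual address.
    print(generate("figure03.local", "llama3.1:70b", "Summarize this pick-list."))
```

Whether the robot exposes this port by default is not stated in the source, so firewall and service configuration may be needed first.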

Bottom line: if your evaluation metric is “tokens per second per kilogram while walking,” Figure 03 is the best hardware announced so far for running large language models locally on a humanoid robot.
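
For reference, the metric named above can be computed directly from figures quoted earlier on this page (30–40 tokens/s for a quantized Llama 3.1 70B, 61 kg robot mass). The helper name is illustrative, and the calculation is a sketch rather than a measured benchmark.

```python
def tokens_per_second_per_kg(tokens_per_second: float, mass_kg: float) -> float:
    """Generation throughput normalized by robot mass."""
    return tokens_per_second / mass_kg


ROBOT_MASS_KG = 61.0
low = tokens_per_second_per_kg(30.0, ROBOT_MASS_KG)
high = tokens_per_second_per_kg(40.0, ROBOT_MASS_KG)

print(f"Llama 3.1 70B: {low:.2f} to {high:.2f} tokens/s per kg")
```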

Similar Products

Unitree H1-2

Apptronik Apollo

Boston Dynamics Atlas (Electric)

Figure 02 Humanoid Robot
