NVIDIA's bidirectional multilingual text embedder based on Llama-3.1-8B; ranked #1 by MMTEB Borda count at its October 2025 release.
NVIDIA's open-weights universal text embedding model, fine-tuned from Llama-3.1-8B with causal attention replaced by bidirectional self-attention. Trained on 16.4M query-document pairs (8M public + 8.4M synthetic) produced with a novel synthetic-data pipeline; instruction-aware and optimized for multilingual and cross-lingual retrieval. Released for research and non-commercial use.
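Because the model is bidirectional, every token attends to the whole input, so a pooled embedding can be computed over all token states. The sketch below illustrates the typical downstream flow for an instruction-aware embedder: format the query with a task instruction, mean-pool the masked token states, then L2-normalize. The template string, pooling choice, and tensor shapes are illustrative assumptions (with random data standing in for model outputs), not this model's documented interface:

```python
import numpy as np

def format_query(instruction: str, query: str) -> str:
    # Hypothetical instruction template; the real template is defined
    # by the model card, not reproduced here.
    return f"Instruct: {instruction}\nQuery: {query}"

def mean_pool(hidden_states: np.ndarray, attention_mask: np.ndarray) -> np.ndarray:
    # hidden_states: (seq_len, dim); attention_mask: (seq_len,) of 0/1.
    # Average only over non-padding positions.
    mask = attention_mask[:, None].astype(hidden_states.dtype)
    summed = (hidden_states * mask).sum(axis=0)
    count = mask.sum()
    return summed / np.maximum(count, 1e-9)

def normalize(v: np.ndarray) -> np.ndarray:
    # L2-normalize so cosine similarity reduces to a dot product.
    return v / np.linalg.norm(v)

# Stand-in for the encoder's final hidden states on a 6-token input
# (last two positions are padding); dim=8 for illustration only.
rng = np.random.default_rng(0)
hidden = rng.normal(size=(6, 8))
mask = np.array([1, 1, 1, 1, 0, 0])

emb = normalize(mean_pool(hidden, mask))
print(round(float(np.linalg.norm(emb)), 6))  # prints 1.0 (unit norm)
```

In retrieval use, queries are wrapped with the task instruction while documents are typically embedded as-is; ranking is then a dot product between the normalized vectors.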