Foundational 7B LLM-based embedder fine-tuned with GPT-4-synthesized instruction data.
A solid 7.1B-parameter dense embedding model from intfloat (Microsoft Research). Treat the modality benchmarks above as the leading indicator of fit — composite scoring across modalities is still maturing.
Generated from this model’s benchmarks and ranking signals. Editor reviews refine it over time.
Access model weights, configuration files, and documentation.
See which devices can run this model and at what quality level.
A 7B-parameter embedding model from intfloat (Microsoft Research) derived from Mistral-7B-v0.1 via LoRA fine-tuning, introduced in "Improving Text Embeddings with Large Language Models" (Wang et al., 2024). It pioneered the use of GPT-4-generated synthetic instruction data spanning hundreds of embedding tasks and 93 languages, achieved then-state-of-the-art results on the MTEB benchmark, and serves as the base for many follow-ups, including SFR-Embedding-Mistral, Linq-Embed-Mistral, and GritLM-7B.
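As an instruction-tuned embedder, the model expects queries to carry a task instruction prefix (documents are embedded without one) and pools the hidden state of the final non-padding token into the sequence embedding. A minimal sketch of those two pieces, using NumPy stand-ins for the model's hidden states (the task string and toy tensors here are illustrative, not from the model card):

```python
import numpy as np

def format_query(task: str, query: str) -> str:
    # E5-Mistral-style query formatting: a one-line task instruction,
    # then the query itself. Documents get no prefix.
    return f"Instruct: {task}\nQuery: {query}"

def last_token_pool(hidden_states: np.ndarray, attention_mask: np.ndarray) -> np.ndarray:
    # hidden_states: (batch, seq_len, dim); attention_mask: (batch, seq_len).
    # Last-token pooling: take the hidden state at the last non-padding
    # position of each sequence as its embedding.
    last_idx = attention_mask.sum(axis=1) - 1
    return hidden_states[np.arange(hidden_states.shape[0]), last_idx]

# Toy batch: 2 sequences, max length 4, hidden dim 3.
h = np.arange(24, dtype=float).reshape(2, 4, 3)
mask = np.array([[1, 1, 1, 0],   # 3 real tokens -> pool position 2
                 [1, 1, 1, 1]])  # 4 real tokens -> pool position 3
emb = last_token_pool(h, mask)   # shape (2, 3)
q = format_query("Given a web search query, retrieve relevant passages", "how to bake bread")
```

In practice the same pooling is applied to the hidden states returned by the fine-tuned Mistral backbone, followed by L2 normalization before computing cosine similarities.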