Popular ~100-language instruction-tuned embedding model built on XLM-RoBERTa-large.
A 560M-parameter multilingual embedding model from Microsoft Research's intfloat team, initialized from XLM-RoBERTa-large and trained in two stages: weakly-supervised contrastive pre-training on roughly 1 billion text pairs, followed by supervised instruction-tuned fine-tuning using the same synthetic data as E5-Mistral. Supports ~100 languages and uses natural-language task instructions to customize query embeddings.
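The E5-instruct family prepends a natural-language task instruction to each query (documents are embedded without an instruction). A minimal sketch of that query-side formatting, following the convention published for these models; the task description shown is just an illustrative example:

```python
def get_detailed_instruct(task_description: str, query: str) -> str:
    """Wrap a raw query with a task instruction, per the E5-instruct convention."""
    return f"Instruct: {task_description}\nQuery: {query}"

# Example task description (illustrative; any natural-language instruction works)
task = "Given a web search query, retrieve relevant passages that answer the query"
formatted = get_detailed_instruct(task, "how much protein should a female eat")
print(formatted)
# Documents/passages are passed to the encoder as-is, with no instruction prefix.
```

The formatted string is then what gets encoded (e.g. via `sentence-transformers` or a plain `transformers` forward pass with pooling); only the instruction wrapper is shown here since the encoding step depends on your serving setup.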