Mixture-of-Experts unified embedding+generation model fine-tuned from Mixtral-8x7B; best open generative model at release, competitive on the MTEB embedding benchmark.
Access model weights, configuration files, and documentation.
See which devices can run this model and at what quality level.
The scaled-up Mixture-of-Experts variant of GritLM, fine-tuned from Mixtral-8x7B with Generative Representational Instruction Tuning (GRIT) to unify generation and embedding in a single sparse ~47B-parameter model. At release it outperformed all open generative LMs the authors evaluated while still ranking among the strongest embedding models, demonstrating that GRIT scales to MoE architectures.
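As a rough illustration of what a unified embedding+generation checkpoint looks like in practice, the sketch below loads one model with Hugging Face transformers and uses it both to produce mean-pooled embeddings and to generate text. The Hub ID, pooling choice, and prompt handling are assumptions for illustration only; the GritLM repository ships its own `gritlm` wrapper with a dedicated embedding prompt format, which this sketch does not reproduce.

```python
# Minimal sketch, not the official GritLM API: one checkpoint serves both
# an embedding path and a generation path. Model ID and pooling are assumed.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "GritLM/GritLM-8x7B"  # assumed Hub ID for the MoE variant
tokenizer = AutoTokenizer.from_pretrained(model_id)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token  # common fallback for padding
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

def embed(texts):
    """Embedding path: mean-pool the last hidden layer over non-padding tokens.
    (The GRIT embedding objective uses its own prompt format and attention
    handling; plain mean pooling here is a simplification.)"""
    batch = tokenizer(texts, padding=True, return_tensors="pt").to(model.device)
    with torch.no_grad():
        hidden = model(**batch, output_hidden_states=True).hidden_states[-1]
    mask = batch["attention_mask"].unsqueeze(-1)
    return (hidden * mask).sum(dim=1) / mask.sum(dim=1)

def generate(prompt, max_new_tokens=128):
    """Generation path: the standard causal-LM head of the same weights."""
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(out[0], skip_special_tokens=True)
```

Because one set of weights handles both tasks, retrieval-augmented pipelines can keep a single model resident in memory for query embedding, document embedding, and answer generation.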