Mid-size 4B Qwen3 embedding model balancing quality and efficiency.
Copy and paste this command to start running the model locally.
ollama run qwen3-embedding:4b
Access model weights, configuration files, and documentation.
See which devices can run this model and at what quality level.
The mid-size variant of the Qwen3 Embedding series, fine-tuned from Qwen3-4B-Base using the same three-stage recipe as the 8B model: contrastive pre-training, supervised fine-tuning, and model merging. It supports 100+ languages, a 32K-token context, instruction-aware embeddings, and Matryoshka output dimensions (embeddings can be truncated to a shorter length), offering a strong balance of quality and inference cost.
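As a rough illustration of how the instruction-aware and Matryoshka features might be used from a local Ollama install, here is a minimal Python sketch. It rests on several assumptions that are not stated on this page: that Ollama is serving its REST API at the default `http://localhost:11434/api/embed` endpoint, that the `Instruct: ...` / `Query:` prompt template applies to this model, and that truncating a vector to its leading components and re-normalizing is an acceptable way to get a shorter embedding. The helper names (`format_query`, `truncate_embedding`, `build_embed_request`, `embed`) are hypothetical.

```python
import json
import math
import urllib.request

# Assumption: default local Ollama endpoint; adjust if your server differs.
OLLAMA_EMBED_URL = "http://localhost:11434/api/embed"
MODEL = "qwen3-embedding:4b"


def format_query(task: str, query: str) -> str:
    """Prefix a query with a task instruction (assumed prompt template)."""
    return f"Instruct: {task}\nQuery:{query}"


def truncate_embedding(vec, dim):
    """Keep the first `dim` components and L2-normalize the result."""
    if not 0 < dim <= len(vec):
        raise ValueError("dim must be between 1 and len(vec)")
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head] if norm > 0 else list(head)


def build_embed_request(texts):
    """Build (but do not send) a POST request for a batch of texts."""
    payload = json.dumps({"model": MODEL, "input": texts}).encode("utf-8")
    return urllib.request.Request(
        OLLAMA_EMBED_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
    )


def embed(texts):
    """Send the request to a running Ollama server and return the vectors."""
    with urllib.request.urlopen(build_embed_request(texts)) as resp:
        return json.load(resp)["embeddings"]


if __name__ == "__main__":
    # Requires a running Ollama server with the model already pulled.
    task = "Given a web search query, retrieve relevant passages"
    query_vec = embed([format_query(task, "what is an embedding model?")])[0]
    short_vec = truncate_embedding(query_vec, 256)
    print(len(query_vec), len(short_vec))
```

In this sketch only queries carry the instruction prefix, while documents would be embedded as plain text. If you truncate, apply the same target dimension to both queries and documents so their vectors remain comparable.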