A 1.7B-parameter bidirectional encoder adapted from the causal Qwen3-1.7B via masked next-token prediction and contrastive training.
The largest text-only model in the BidirLM series. It is derived from Qwen3-1.7B by switching the attention mask from causal to bidirectional, applying a Masked Next-Token Prediction (MNTP) stage, and finishing with InfoNCE contrastive training on a multilingual, multi-domain corpus. The model establishes a new open-data Pareto frontier on MTEB and an XTREME-augmented benchmark, demonstrating that adapted causal LLMs can serve as strong general-purpose encoders.
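
The sketch below illustrates the two training objectives named above in minimal PyTorch, assuming a Hugging Face-style decoder backbone whose hidden states are mean-pooled into sentence embeddings. The function names, the temperature value, and the pooling choice are illustrative assumptions, not the released training code.

```python
import torch
import torch.nn.functional as F


def mntp_loss(logits: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    """Masked Next-Token Prediction: a masked token is predicted from the
    hidden state at the *preceding* position (same shift as causal LM
    training), but only masked positions contribute to the loss.

    `logits` is (batch, seq, vocab); `labels` is (batch, seq) holding the
    original token id at masked positions and -100 everywhere else.
    """
    shift_logits = logits[:, :-1, :].contiguous()  # position i predicts i+1
    shift_labels = labels[:, 1:].contiguous()
    return F.cross_entropy(
        shift_logits.view(-1, shift_logits.size(-1)),
        shift_labels.view(-1),
        ignore_index=-100,
    )


def mean_pool(hidden: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
    """Average the final hidden states over non-padding positions."""
    mask = attention_mask.unsqueeze(-1).float()
    return (hidden * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1e-9)


def info_nce_loss(
    queries: torch.Tensor,
    positives: torch.Tensor,
    temperature: float = 0.05,  # illustrative value, not the paper's setting
) -> torch.Tensor:
    """InfoNCE with in-batch negatives: each query's positive is the
    matching row of `positives`; every other row in the batch serves as a
    negative.
    """
    q = F.normalize(queries, dim=-1)
    p = F.normalize(positives, dim=-1)
    logits = q @ p.T / temperature                  # (B, B) cosine similarities
    targets = torch.arange(q.size(0), device=q.device)  # diagonal = positives
    return F.cross_entropy(logits, targets)
```

In a setup like this, the bidirectional switch itself requires no new loss: the causal mask is simply replaced by a full (padding-only) attention mask, after which the MNTP stage adapts the weights to bidirectional context and the InfoNCE stage shapes the pooled embeddings for retrieval-style use.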