Alibaba Qwen's flagship 1.7B-parameter ASR model. It supports 52 languages and dialects, reports state-of-the-art results among open-source ASR models, and is competitive with top proprietary APIs.
Qwen3-ASR-1.7B is the flagship model in the Qwen3-ASR family from Alibaba Cloud's Qwen team. It is a Large Audio-Language Model (LALM) post-trained from the Qwen3-Omni foundation model, pairing an audio encoder with a Qwen3 transformer decoder via a learned projector.
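The encoder, projector, and decoder pairing described above can be illustrated with a small shape-level sketch. This is not the model's actual code: the dimensions are made up, the projector is assumed to be a single linear layer, and the audio-then-prompt ordering is an assumption. The function names project_audio and build_decoder_inputs are hypothetical. The sketch only shows how audio features could be mapped into the decoder's embedding space and joined with text embeddings into one input sequence.

```python
import random

random.seed(0)

# Hypothetical sizes for illustration only; not the real model's dimensions.
N_AUDIO_FRAMES = 50    # frames emitted by the audio encoder
D_AUDIO = 32           # audio encoder hidden size
D_MODEL = 64           # decoder embedding size
N_PROMPT_TOKENS = 4    # text prompt tokens


def random_matrix(rows, cols, scale=1.0):
    """Stand-in for learned weights or encoder outputs."""
    return [[random.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


def project_audio(features, weight, bias):
    """Assumed linear projector: map each D_AUDIO frame to a D_MODEL vector."""
    columns = list(zip(*weight))  # D_MODEL columns, each of length D_AUDIO
    return [
        [sum(f * w for f, w in zip(frame, col)) + b for col, b in zip(columns, bias)]
        for frame in features
    ]


def build_decoder_inputs(audio_embeds, prompt_embeds):
    """Join projected audio and prompt embeddings along the sequence axis."""
    return audio_embeds + prompt_embeds


audio_features = random_matrix(N_AUDIO_FRAMES, D_AUDIO)   # pretend encoder output
weight = random_matrix(D_AUDIO, D_MODEL, scale=0.02)      # pretend projector weights
bias = [0.0] * D_MODEL
prompt_embeds = random_matrix(N_PROMPT_TOKENS, D_MODEL)   # pretend text embeddings

audio_embeds = project_audio(audio_features, weight, bias)
decoder_inputs = build_decoder_inputs(audio_embeds, prompt_embeds)

print(len(decoder_inputs), len(decoder_inputs[0]))  # 54 64
```

After projection, audio frames and text tokens share the same embedding width, so the decoder can attend over them as one sequence of 54 positions in this toy setup.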
It ships with the qwen-asr PyPI package, a vLLM backend for batch, asynchronous, and streaming serving, and Docker images.
Typical use cases include meeting transcription, podcast and video subtitling, multilingual voice agents, call-center QA, media monitoring, and research on large audio-language models.