Qwen3-ASR-1.7B - Core AI

Qwen3-ASR-1.7B speech-to-text, converted for Apple Core AI and running on-device (iPhone + Mac). It is the zoo's first ASR model: an AuT audio encoder feeds a Qwen3 decoder on the pipelined engine (audio embeds are bound to one static input buffer; output has the form {lang}<asr_text>{text}). Clips of ≤30 s, 52 languages, automatic language detection.
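As an illustration of the {lang}<asr_text>{text} output format, here is a minimal sketch of how a raw decoder string could be split into its two parts. The function name `splitASROutput` is hypothetical and not part of CoreAIKit, and the exact spelling of the language label the model emits is an assumption; KitASRModel already returns the parsed pair, so this is only meant to make the format concrete.

```swift
import Foundation

/// Hypothetical helper (not a CoreAIKit API): splits a raw string of the form
/// `{lang}<asr_text>{text}` into its language and transcript parts.
/// Returns nil when the `<asr_text>` marker is missing.
func splitASROutput(_ raw: String) -> (language: String, text: String)? {
    let marker = "<asr_text>"
    guard let range = raw.range(of: marker) else { return nil }
    // Everything before the marker is the language label (assumed plain text).
    let language = raw[..<range.lowerBound]
        .trimmingCharacters(in: .whitespacesAndNewlines)
    // Everything after the marker is the transcript.
    let text = raw[range.upperBound...]
        .trimmingCharacters(in: .whitespacesAndNewlines)
    return (language, text)
}
```

Splitting on the first occurrence of the marker keeps any later literal "<asr_text>" inside the transcript, which seems the safer default for a sketch like this.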

Driven by CoreAIKit KitASRModel:

let asr = try await KitASRModel(model: .qwen3ASR1_7B)
let r = try await asr.transcribe(samples: pcm16kMono)   // -> (language, text)
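Because clips are limited to 30 s, longer recordings need to be cut before transcription. The following is a minimal sketch of one way to do that, assuming 16 kHz mono audio (suggested by the name `pcm16kMono`) held as `[Float]`; the element type and the helper name `chunkForASR` are assumptions, not part of CoreAIKit.

```swift
/// Hypothetical helper (not a CoreAIKit API): splits mono samples into
/// consecutive windows of at most `maxSeconds` seconds each.
/// Assumes 16 kHz audio by default; the last window may be shorter.
func chunkForASR(_ samples: [Float],
                 sampleRate: Int = 16_000,
                 maxSeconds: Int = 30) -> [[Float]] {
    let maxCount = sampleRate * maxSeconds   // 480_000 samples at the defaults
    guard maxCount > 0, !samples.isEmpty else { return [] }
    return stride(from: 0, to: samples.count, by: maxCount).map { start in
        // Clamp the end index so the final window does not run past the buffer.
        Array(samples[start..<min(start + maxCount, samples.count)])
    }
}
```

Each window could then be passed to `asr.transcribe(samples:)` in turn, provided that method accepts this sample type. Hard cuts like these can split a word across two windows, so cutting at silences would likely give cleaner transcripts.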

Layout: gpu-pipelined/ holds the decoder bundle (*_decode_int8hu_n390_s1, int8) + the paired AuT encoder (*_audio_encoder_fp16_k30, fp16). Same bundles on iOS and macOS.

App: coreai-audio (Transcribe tab: pick Qwen3-ASR or Whisper large-v3-turbo). Card: zoo/qwen3-asr.md.
