Cohere Transcribe 2B (CoreML + ONNX Hybrid)
Hybrid CoreML encoder + ONNX decoder for Cohere Transcribe, optimized for inference on Apple Silicon. The encoder runs on the Neural Engine via CoreML; the decoder runs on the CPU via ONNX Runtime with a KV cache.
Cohere Transcribe is a 2B-parameter encoder-decoder ASR model that ranks #1 on the Open ASR Leaderboard with 5.42% average WER, ahead of Whisper Large v3, ElevenLabs Scribe v2, and Qwen3-ASR-1.7B.
Features
- 14 languages: English, French, German, Spanish, Italian, Portuguese, Dutch, Polish, Greek, Arabic, Japanese, Chinese, Vietnamese, Korean
- #1 accuracy on Open ASR Leaderboard (5.42% WER)
- ANE-accelerated encoder via CoreML (1.9B params on Neural Engine)
- KV-cached decoder via ONNX Runtime (153M params, q4f16 quantized)
- Long-form audio: native 35-second chunking
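As a rough illustration of what 35-second chunking means at the 16 kHz sample rate used by the preprocessor, the sketch below splits a sample buffer into consecutive chunks. The function name `split_into_chunks` is hypothetical, and the non-overlapping, unpadded split is an assumption; the model's actual chunking may overlap or pad chunks.

```python
SAMPLE_RATE = 16_000   # Hz, the rate stated for the mel preprocessing
CHUNK_SECONDS = 35     # chunk length stated in the feature list


def split_into_chunks(samples, sample_rate=SAMPLE_RATE, chunk_seconds=CHUNK_SECONDS):
    """Split a sequence of samples into consecutive, non-overlapping chunks.

    Assumption: no overlap and no padding; the last chunk may be shorter.
    """
    chunk_len = sample_rate * chunk_seconds
    return [samples[i:i + chunk_len] for i in range(0, len(samples), chunk_len)]


# 80 seconds of silence splits into chunks of 35 s, 35 s and 10 s.
audio = [0.0] * (80 * SAMPLE_RATE)
chunks = split_into_chunks(audio)
print([len(c) / SAMPLE_RATE for c in chunks])  # [35.0, 35.0, 10.0]
```

Each chunk would then go through the encoder and decoder independently, with the resulting text joined afterwards.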
Specifications
| Property | Value |
|---|---|
| Total Parameters | 2.07B |
| Encoder | CoreML FP16 (3.5 GB) |
| Decoder | ONNX q4f16 (98 MB) |
| Projection | Float32 (5 MB) |
| Total Download | ~3.6 GB |
| License | Apache 2.0 |
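The sizes in the table can be sanity-checked with back-of-the-envelope arithmetic. This sketch assumes 2 bytes per FP16 parameter, 4 bytes per float32 value, the rounded 1.9B encoder parameter count from the feature list, and one bias value per projection output; it is an estimate, not a measurement of the actual files.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2

# Encoder: ~1.9B parameters stored as FP16 (2 bytes each).
encoder_params = 1.9e9
encoder_gib = encoder_params * 2 / GIB
print(f"encoder FP16: ~{encoder_gib:.1f} GiB")             # ~3.5 GiB

# Projection: a 1280 -> 1024 weight matrix plus an assumed 1024-element bias,
# stored as float32 (4 bytes each).
proj_values = 1280 * 1024 + 1024
proj_mib = proj_values * 4 / MIB
print(f"projection float32: ~{proj_mib:.1f} MiB")          # ~5.0 MiB
```

Both estimates line up with the table if its GB and MB figures are read as binary units (GiB, MiB).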
Files
coreml/
  cohere_encoder.mlmodelc/               # CoreML encoder (ANE-optimized, FP16)
onnx/
  decoder_model_merged_q4f16.onnx        # ONNX decoder header
  decoder_model_merged_q4f16.onnx_data   # ONNX decoder weights (q4f16)
config.json
generation_config.json
preprocessor_config.json
tokenizer.json
tokenizer_config.json
decoder_proj_weight.bin # Encoder→decoder projection (1280→1024)
decoder_proj_bias.bin
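The two projection `.bin` files are listed without a format description. The sketch below shows one way to read such a file, assuming it holds raw little-endian float32 values with no header; that layout and the helper name `load_f32` are assumptions, not something the repository documents.

```python
import os
import sys
from array import array


def load_f32(path, expected_count):
    """Read `expected_count` raw float32 values from `path`.

    Assumption: headerless, little-endian float32 data.
    """
    size = os.path.getsize(path)
    if size != expected_count * 4:
        raise ValueError(f"{path}: expected {expected_count * 4} bytes, found {size}")
    values = array("f")
    with open(path, "rb") as f:
        values.fromfile(f, expected_count)
    if sys.byteorder != "little":
        values.byteswap()  # convert to native order on big-endian hosts
    return values


# Hypothetical usage, given the 1280 -> 1024 projection:
# weight = load_f32("decoder_proj_weight.bin", 1280 * 1024)
# bias = load_f32("decoder_proj_bias.bin", 1024)
```

The size check fails early if the file has a header or a different dtype, which is a cheap way to test the layout assumption.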
Usage
This model is designed for use with Petal, a macOS menu bar app for local-first audio transcription.
Architecture:
- Audio → mel spectrogram (128 bins, 16 kHz, NeMo-style preprocessing)
- CoreML encoder (Fast-Conformer, 48 layers, d=1280) → encoder hidden states
- Projection layer (1280→1024) applied in Swift via Accelerate
- ONNX decoder (Transformer, 8 layers, d=1024) with KV cache → token IDs
- SentencePiece tokenizer → text
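The projection step in the pipeline above is an affine map from the encoder width (1280) to the decoder width (1024), applied to every encoder frame. Petal does this in Swift via Accelerate; the plain-Python sketch below only illustrates the arithmetic on a tiny example. The function name `project` and the `[out][in]` weight layout are assumptions.

```python
def project(frames, weight, bias):
    """Apply y = W x + b to each encoder frame.

    frames: list of frames, each with d_in floats
    weight: d_out rows of d_in floats (assumed [out][in] layout)
    bias:   d_out floats
    """
    return [
        [sum(w * x for w, x in zip(row, frame)) + b for row, b in zip(weight, bias)]
        for frame in frames
    ]


# Tiny demo with d_in = 3 and d_out = 2 instead of 1280 and 1024.
weight = [[1.0, 0.0, 0.0],
          [0.0, 1.0, 1.0]]
bias = [0.5, -1.0]
frames = [[1.0, 2.0, 3.0],
          [4.0, 5.0, 6.0]]
print(project(frames, weight, bias))  # [[1.5, 4.0], [4.5, 10.0]]
```

The number of frames is unchanged by the projection; only the feature dimension shrinks, so the decoder attends over the same time steps the encoder produced.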
Performance on Apple Silicon:
- Model warmup: ~0.2 s (cached CoreML)
- Transcription: ~2 s for 4 s of audio
- Encoder runs on ANE, decoder on CPU with KV cache
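The transcription figure above can be expressed as a real-time factor (processing time divided by audio duration). This is a minimal calculation using the approximate numbers quoted here; the helper name `real_time_factor` is illustrative, and the result excludes warmup.

```python
def real_time_factor(processing_seconds, audio_seconds):
    """Return processing time per second of audio (lower is faster)."""
    return processing_seconds / audio_seconds


rtf = real_time_factor(2.0, 4.0)  # ~2 s to transcribe ~4 s of audio
print(f"RTF = {rtf:.2f}, speed = {1 / rtf:.1f}x real time")  # RTF = 0.50, speed = 2.0x real time
```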
License
Apache 2.0 — original model by Cohere.
Model tree for Aayush9029/cohere-transcribe-2b-coreml-onnx
Base model
CohereLabs/cohere-transcribe-03-2026