Cohere Transcribe 2B (CoreML + ONNX Hybrid)

Hybrid CoreML encoder + ONNX decoder for Cohere Transcribe, optimized for Apple Silicon inference. The encoder runs on the Neural Engine via CoreML; the decoder runs on the CPU via ONNX Runtime with a KV cache.

Cohere Transcribe is a 2B-parameter encoder-decoder ASR model that ranks #1 on the Open ASR Leaderboard with a 5.42% average WER, ahead of Whisper Large v3, ElevenLabs Scribe v2, and Qwen3-ASR-1.7B.

Features

14 languages: English, French, German, Spanish, Italian, Portuguese, Dutch, Polish, Greek, Arabic, Japanese, Chinese, Vietnamese, Korean
#1 accuracy on the Open ASR Leaderboard (5.42% average WER)
ANE-accelerated encoder via CoreML (1.9B params on Neural Engine)
KV-cached decoder via ONNX Runtime (153M params, q4f16 quantized)
Long-form audio: native 35-second chunking
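
As a rough illustration of the 35-second chunking listed above, the sketch below splits a 16 kHz mono waveform into consecutive windows. The function name `chunk_audio`, the non-overlapping split, and the unpadded final chunk are assumptions made for illustration; the model's actual chunking logic (overlap, padding, boundary handling) is not documented here.

```python
from collections.abc import Sequence

SAMPLE_RATE = 16_000   # Hz, the sample rate this model card lists for preprocessing
CHUNK_SECONDS = 35     # chunk length stated in the feature list


def chunk_audio(samples: Sequence[float],
                sample_rate: int = SAMPLE_RATE,
                chunk_seconds: int = CHUNK_SECONDS) -> list:
    """Split a mono waveform into consecutive chunks of at most chunk_seconds.

    Assumption: chunks do not overlap and the last chunk is left shorter
    rather than padded. The real pipeline may behave differently.
    """
    chunk_len = sample_rate * chunk_seconds
    return [samples[start:start + chunk_len]
            for start in range(0, len(samples), chunk_len)]


# Example: 80 seconds of silence is split into chunks of 35 s, 35 s, and 10 s.
audio = [0.0] * (80 * SAMPLE_RATE)
chunks = chunk_audio(audio)
chunk_durations = [len(c) / SAMPLE_RATE for c in chunks]
```

Each chunk would then be passed through the encoder and decoder independently, with the resulting text segments joined in order.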

Specifications

Property	Value
Total Parameters	2.07B
Encoder	CoreML FP16 (3.5 GB)
Decoder	ONNX q4f16 (98 MB)
Projection	Float32 (5 MB)
Total Download	~3.6 GB
License	Apache 2.0

Files

coreml/
  cohere_encoder.mlmodelc/    # CoreML encoder (ANE-optimized, FP16)
onnx/
  decoder_model_merged_q4f16.onnx       # ONNX decoder graph (weights stored in the .onnx_data file)
  decoder_model_merged_q4f16.onnx_data  # ONNX decoder weights (q4f16)
config.json
generation_config.json
preprocessor_config.json
tokenizer.json
tokenizer_config.json
decoder_proj_weight.bin     # Encoder→decoder projection (1280→1024)
decoder_proj_bias.bin

Usage

This model is designed for use with Petal, a macOS menu bar app for local-first audio transcription.

Architecture:

Audio → mel spectrogram (128 bins, 16kHz, NeMo-style preprocessing)
CoreML encoder (Fast-Conformer, 48 layers, d=1280) → encoder hidden states
Projection layer (1280→1024) applied in Swift via Accelerate
ONNX decoder (Transformer, 8 layers, d=1024) with KV cache → token IDs
SentencePiece tokenizer → text
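
The projection step in the pipeline above amounts to a plain linear layer. The snippet below sketches it in Python with NumPy purely for illustration; the app itself applies the projection in Swift via Accelerate. The file layout is an assumption: it treats `decoder_proj_weight.bin` as raw little-endian float32 in row-major `(1024, 1280)` order and `decoder_proj_bias.bin` as 1024 float32 values, so check the actual files before relying on it. `load_projection` and `project` are hypothetical helper names.

```python
import numpy as np

ENCODER_DIM = 1280  # encoder hidden size (d=1280)
DECODER_DIM = 1024  # decoder hidden size (d=1024)


def load_projection(weight_path: str, bias_path: str):
    """Read the projection from raw float32 files.

    Assumption: little-endian float32, weight stored row-major as
    (DECODER_DIM, ENCODER_DIM). The real layout may be transposed.
    """
    weight = np.fromfile(weight_path, dtype="<f4").reshape(DECODER_DIM, ENCODER_DIM)
    bias = np.fromfile(bias_path, dtype="<f4")
    if bias.shape != (DECODER_DIM,):
        raise ValueError(f"unexpected bias shape {bias.shape}")
    return weight, bias


def project(hidden: np.ndarray, weight: np.ndarray, bias: np.ndarray) -> np.ndarray:
    """Map encoder states (frames, 1280) to decoder inputs (frames, 1024)."""
    return hidden @ weight.T + bias


# Shape check with random stand-in data (not real model weights).
rng = np.random.default_rng(0)
weight = rng.standard_normal((DECODER_DIM, ENCODER_DIM)).astype(np.float32)
bias = np.zeros(DECODER_DIM, dtype=np.float32)
hidden = rng.standard_normal((50, ENCODER_DIM)).astype(np.float32)
projected = project(hidden, weight, bias)

# Size of a float32 weight plus bias: (1280 * 1024 + 1024) * 4 bytes.
projection_bytes = (ENCODER_DIM * DECODER_DIM + DECODER_DIM) * 4
```

Under these assumptions the weight and bias together occupy roughly 5.2 MB, in line with the 5 MB listed for the projection under Specifications.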

Performance on Apple Silicon:

Model warmup: ~0.2s (cached CoreML)
Transcription: ~2s for 4s audio
Encoder runs on ANE, decoder on CPU with KV cache
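
The transcription figure above can be expressed as a real-time factor (processing time divided by audio duration). The helper below is a hypothetical name used for illustration, and the inputs are the approximate numbers quoted above, with warmup excluded. The result should not be extrapolated to longer audio, since fixed per-call costs are not reported separately.

```python
def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """Processing time divided by audio duration; below 1.0 is faster than real time."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


# Approximate figures quoted above: ~2 s to transcribe ~4 s of audio.
rtf = real_time_factor(2.0, 4.0)
```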

License

Apache 2.0 — original model by Cohere.

Base model

CohereLabs/cohere-transcribe-03-2026 (Aayush9029/cohere-transcribe-2b-coreml-onnx is a quantized variant of it)