Parakeet-TDT-ExecuTorch-Metal
Pre-exported ExecuTorch .pte file for Parakeet TDT 0.6B with the Metal backend (Apple GPU) and fpa4w quantization (4-bit weights, floating-point activations).
Fast speech-to-text with word-level timestamps and GPU acceleration on macOS Apple Silicon.
For the XNNPACK (CPU) variant, see Parakeet-TDT-ExecuTorch-XNNPACK.
Installation
git clone https://github.com/pytorch/executorch/ ~/executorch
cd ~/executorch && EXECUTORCH_BUILD_KERNELS_TORCHAO=1 TORCHAO_BUILD_EXPERIMENTAL_MPS=1 ./install_executorch.sh
make parakeet-metal
Download
pip install huggingface_hub
huggingface-cli download younghan-meta/Parakeet-TDT-ExecuTorch-Metal --local-dir ~/parakeet_metal
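Before running, it can help to confirm that the download directory contains the files the Run step refers to. The following is a minimal sketch, assuming model.pte and tokenizer.model sit at the top level of the download directory, as the paths in the run command suggest. check_parakeet_files is a hypothetical helper name, not part of ExecuTorch or huggingface_hub.

```shell
# Hypothetical helper: verify that the files used by the run command exist.
# Assumes model.pte and tokenizer.model are at the top level of the directory.
check_parakeet_files() {
  dir="${1:-$HOME/parakeet_metal}"
  missing=0
  for f in model.pte tokenizer.model; do
    if [ ! -f "$dir/$f" ]; then
      echo "missing: $dir/$f" >&2
      missing=1
    fi
  done
  # Returns 0 when both files are present, 1 otherwise.
  return "$missing"
}
```

For example, `check_parakeet_files ~/parakeet_metal && echo "ready"` prints "ready" only when both files are present.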
Run
Run from the ~/executorch checkout, since the runner path below is relative to it:
DYLD_LIBRARY_PATH=/usr/lib:$(brew --prefix libomp)/lib \
cmake-out/examples/models/parakeet/parakeet_runner \
--model_path ~/parakeet_metal/model.pte \
--tokenizer_path ~/parakeet_metal/tokenizer.model \
--audio_path ~/parakeet_metal/poem.wav
Optional flags:
--timestamps segment    Timestamp granularity: none | token | word | segment | all (default: segment)
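Because --timestamps accepts a fixed set of values, a small wrapper can reject a typo before the model is loaded. The following is a sketch only: valid_timestamps and run_parakeet are hypothetical helper names, and the wrapper assumes it is called from ~/executorch with the model downloaded to ~/parakeet_metal, as in the commands above.

```shell
# Hypothetical helper: succeed only for the granularities listed above.
valid_timestamps() {
  case "$1" in
    none|token|word|segment|all) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical wrapper around the run command shown earlier.
# Usage: run_parakeet AUDIO.wav [GRANULARITY]
run_parakeet() {
  audio="$1"
  granularity="${2:-segment}"
  if ! valid_timestamps "$granularity"; then
    echo "unsupported --timestamps value: $granularity" >&2
    return 2
  fi
  # Same invocation as in the Run section, with the granularity passed through.
  DYLD_LIBRARY_PATH=/usr/lib:$(brew --prefix libomp)/lib \
  cmake-out/examples/models/parakeet/parakeet_runner \
    --model_path ~/parakeet_metal/model.pte \
    --tokenizer_path ~/parakeet_metal/tokenizer.model \
    --audio_path "$audio" \
    --timestamps "$granularity"
}
```

For example, `run_parakeet ~/parakeet_metal/poem.wav word` requests word-level timestamps, while an unlisted value such as `sentence` is rejected with exit status 2 before the runner starts.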
Export Command
pip install "nemo_toolkit[asr]"
python examples/models/parakeet/export_parakeet_tdt.py \
--backend metal \
--qlinear_encoder fpa4w --qlinear_encoder_group_size 32 \
--qlinear fpa4w --qlinear_group_size 32 \
--output-dir ./parakeet_metal_quantized
Metal fpa4w quantization requires torchao built with experimental MPS ops:
EXECUTORCH_BUILD_KERNELS_TORCHAO=1 TORCHAO_BUILD_EXPERIMENTAL_MPS=1 ./install_executorch.sh
More Info
Base model: nvidia/parakeet-tdt-0.6b-v2