Qwen3 Voice Embedding Collection Standalone ECAPA-TDNN x-vector speaker encoders extracted from Qwen3-TTS. 1024-dim (0.6B) and 2048-dim (1.7B). • 4 items • Updated Feb 27 • 29
Parakeet ASR Collection NeMo Parakeet ASR Models attain strong speech recognition accuracy while being efficient for inference. Available in CTC and RNN-Transducer variants. • 16 items • Updated 5 days ago • 64
VibeVoice Collection Frontier Text-to-Speech Models https://microsoft.github.io/VibeVoice/ • 8 items • Updated Mar 2 • 230
Qwen2.5 Collection Qwen2.5 language models, including pretrained and instruction-tuned models of 7 sizes, including 0.5B, 1.5B, 3B, 7B, 14B, 32B, and 72B. • 43 items • Updated Mar 2 • 713
F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching Paper • 2410.06885 • Published Oct 9, 2024 • 47
Tiny Series Collection Tiny datasets that empower the foundation of Small Language Model! • 11 items • Updated Jan 26, 2024 • 43
Language Models are Super Mario: Absorbing Abilities from Homologous Models as a Free Lunch Paper • 2311.03099 • Published Nov 6, 2023 • 32