Molmo2: Open Weights and Data for Vision-Language Models with Video Understanding and Grounding Paper • 2601.10611 • Published Jan 15 • 29
OCRVerse: Towards Holistic OCR in End-to-End Vision-Language Models Paper • 2601.21639 • Published 27 days ago • 50
Qwen3-TTS Demo 🎙 • Running on Zero • Featured • 1.54k • Generate custom speech from text, voice descriptions, or samples
LightOnOCR-2 🦉 Collection LightOnOCR-2-1B: a lightweight high-performance end-to-end OCR model family • 12 items • Updated 6 days ago • 22