CohereLabs/cohere-transcribe-03-2026 Automatic Speech Recognition • Updated about 13 hours ago • 241k • 892
Running on Zero • Agents • Fibo-Edit-RMBG Background Removal • 25 • Background removal with Fibo-Edit-RMBG
Jackrong/Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled Image-Text-to-Text • 28B • Updated 14 days ago • 573k • 2.73k
tarn59/book_flatten_and_crop_qwen_image_edit_2509 Image-to-Image • Updated Nov 18, 2025 • 11 • 39
Running on Zero • Agents • Featured • ReconViaGen • 174 • High-fidelity 3D Geometry Generation from multi-view images
VibeVoice Collection Frontier Text-to-Speech Models https://microsoft.github.io/VibeVoice/ • 8 items • Updated Mar 2 • 230
Spatial-MLLM: Boosting MLLM Capabilities in Visual-based Spatial Intelligence Paper • 2505.23747 • Published May 29, 2025 • 69
Distilling LLM Agent into Small Models with Retrieval and Code Tools Paper • 2505.17612 • Published May 23, 2025 • 81
Runtime error • Agents • TRELLIS - Multiple Imagen a 3D • 61 • Scalable and Versatile 3D Generation from images
Running • Featured • Qwen3 WebGPU • 103 • A hybrid reasoning model that runs locally in your browser.
docling-project/SmolDocling-256M-preview Image-Text-to-Text • Updated Sep 17, 2025 • 37.5k • 1.61k
Article • Llama can now see and run on your device - welcome Llama 3.2 • +5 • Sep 25, 2024 • 191
Article • SmolVLM Grows Smaller – Introducing the 256M & 500M Models! • +1 • Jan 23, 2025 • 192
meta-llama/Llama-3.2-11B-Vision-Instruct Image-Text-to-Text • 11B • Updated Dec 4, 2024 • 195k • 1.58k