Article ViDoRe V3: a comprehensive evaluation of retrieval for enterprise use-cases QuentinJG • Nov 5, 2025 • 66
From Seeing to Thinking: Decoupling Perception and Reasoning Improves Post-Training of Vision-Language Models Paper • 2605.20177 • Published 20 days ago • 10
Encoder-Free Human Motion Understanding via Structured Motion Descriptions Paper • 2604.21668 • Published Apr 23 • 3
Benchmarking and Mechanistic Analysis of Vision-Language Models for Cross-Depiction Assembly Instruction Alignment Paper • 2604.00913 • Published Apr 1 • 4
NanoVDR: Distilling a 2B Vision-Language Retriever into a 70M Text-Only Encoder for Visual Document Retrieval Paper • 2603.12824 • Published Mar 13 • 5
Article NanoVDR: A 70M Text-Only Model That Retrieves Visual Documents as Well as a 2B VLM Ryenhails • Mar 16 • 3
Article LightOnOCR-2-1B: a lightweight high-performance end-to-end OCR model family lightonai • Jan 19 • 95
Article How We Built a Semantic Highlight Model To Save Token Cost for RAG zilliz • Jan 15 • 67
UniGenBench++: A Unified Semantic Evaluation Benchmark for Text-to-Image Generation Paper • 2510.18701 • Published Oct 21, 2025 • 68