Watch Before You Answer: Learning from Visually Grounded Post-Training Paper • 2604.05117 • Published 14 days ago • 35
SWE-Next: Scalable Real-World Software Engineering Tasks for Agents Paper • 2603.20691 • Published about 1 month ago • 10
ImagenWorld: Stress-Testing Image Generation Models with Explainable Human Evaluation on Open-ended Real-World Tasks Paper • 2603.27862 • Published 21 days ago • 30
ImagenWorld: Stress-Testing Image Generation Models with Explainable Human Evaluation on Open-ended Real-World Tasks Paper • 2603.27862 • Published 21 days ago • 30
ImagenWorld: Stress-Testing Image Generation Models with Explainable Human Evaluation on Open-ended Real-World Tasks Paper • 2603.27862 • Published 21 days ago • 30
OpenResearcher: A Fully Open Pipeline for Long-Horizon Deep Research Trajectory Synthesis Paper • 2603.20278 • Published Mar 17 • 94
Running on Zero Agents Featured 2.34k Qwen Image Multiple Angles 3D Camera 🎥 2.34k Edit image camera angle with interactive 3D controls
VisPhyWorld: Probing Physical Reasoning via Code-Driven Video Reconstruction Paper • 2602.13294 • Published Feb 9 • 13
VisPhyWorld: Probing Physical Reasoning via Code-Driven Video Reconstruction Paper • 2602.13294 • Published Feb 9 • 13
Context Forcing: Consistent Autoregressive Video Generation with Long Context Paper • 2602.06028 • Published Feb 5 • 36
TUNA: Taming Unified Visual Representations for Native Unified Multimodal Models Paper • 2512.02014 • Published Dec 1, 2025 • 74
EditReward: A Human-Aligned Reward Model for Instruction-Guided Image Editing Paper • 2509.26346 • Published Sep 30, 2025 • 19
BrowserAgent: Building Web Agents with Human-Inspired Web Browsing Actions Paper • 2510.10666 • Published Oct 12, 2025 • 28