SWITCH: Benchmarking Modeling and Handling of Tangible Interfaces in Long-horizon Embodied Scenarios Paper • 2511.17649 • Published Nov 20, 2025 • 1
Does Your Reasoning Model Implicitly Know When to Stop Thinking? Paper • 2602.08354 • Published 18 days ago • 211
GEBench: Benchmarking Image Generation Models as GUI Environments Paper • 2602.09007 • Published 18 days ago • 39
MemGUI-Bench: Benchmarking Memory of Mobile GUI Agents in Dynamic Environments Paper • 2602.06075 • Published 24 days ago • 13
DreamDojo: A Generalist Robot World Model from Large-Scale Human Videos Paper • 2602.06949 • Published 21 days ago • 35
Accurate Failure Prediction in Agents Does Not Imply Effective Failure Prevention Paper • 2602.03338 • Published 24 days ago • 26
InterPrior: Scaling Generative Control for Physics-Based Human-Object Interactions Paper • 2602.06035 • Published 22 days ago • 23
Reinforcement World Model Learning for LLM-based Agents Paper • 2602.05842 • Published 22 days ago • 27
MemSkill: Learning and Evolving Memory Skills for Self-Evolving Agents Paper • 2602.02474 • Published 25 days ago • 56
ProAct: Agentic Lookahead in Interactive Environments Paper • 2602.05327 • Published 23 days ago • 25
RANGER: A Monocular Zero-Shot Semantic Navigation Framework through Contextual Adaptation Paper • 2512.24212 • Published Dec 30, 2025 • 1