Policy and World Modeling Co-Training for Language Agents Paper • 2606.02388 • Published 5 days ago • 11
Is PRM Necessary? Problem-Solving RL Implicitly Induces PRM Capability in LLMs Paper • 2505.11227 • Published May 16, 2025
VL-RouterBench: A Benchmark for Vision-Language Model Routing Paper • 2512.23562 • Published Dec 29, 2025 • 1
Large Language Models can be Guided to Evade AI-Generated Text Detection Paper • 2305.10847 • Published May 18, 2023
Train at Moving Edge: Online-Verified Prompt Selection for Efficient RL Training of Large Reasoning Model Paper • 2603.25184 • Published Mar 26
AHD Agent: Agentic Reinforcement Learning for Automatic Heuristic Design Paper • 2605.08756 • Published 28 days ago • 23
Agentic World Modeling: Foundations, Capabilities, Laws, and Beyond Paper • 2604.22748 • Published Apr 24 • 227
V-ReasonBench: Toward Unified Reasoning Benchmark Suite for Video Generation Models Paper • 2511.16668 • Published Nov 20, 2025 • 56