Multimodal Fact-Level Attribution for Verifiable Reasoning • Paper • 2602.11509 • Published 5 days ago • 4
P-GenRM: Personalized Generative Reward Model with Test-time User-based Scaling • Paper • 2602.12116 • Published 4 days ago • 4
Sci-CoE: Co-evolving Scientific Reasoning LLMs via Geometric Consensus with Sparse Supervision • Paper • 2602.12164 • Published 4 days ago • 4
Gaia2: Benchmarking LLM Agents on Dynamic and Asynchronous Environments • Paper • 2602.11964 • Published 4 days ago • 11
LawThinker: A Deep Research Legal Agent in Dynamic Environments • Paper • 2602.12056 • Published 4 days ago • 32
The Devil Behind Moltbook: Anthropic Safety is Always Vanishing in Self-Evolving AI Societies • Paper • 2602.09877 • Published 6 days ago • 185
On Robustness and Chain-of-Thought Consistency of RL-Finetuned VLMs • Paper • 2602.12506 • Published 4 days ago • 3
SciAgentGym: Benchmarking Multi-Step Scientific Tool-use in LLM Agents • Paper • 2602.12984 • Published 3 days ago • 4
What does RL improve for Visual Reasoning? A Frankenstein-Style Analysis • Paper • 2602.12395 • Published 4 days ago • 12
Reasoning Cache: Continual Improvement Over Long Horizons via Short-Horizon RL • Paper • 2602.03773 • Published 13 days ago • 9
Blockwise Advantage Estimation for Multi-Objective RL with Verifiable Rewards • Paper • 2602.10231 • Published 6 days ago • 12
InternAgent-1.5: A Unified Agentic Framework for Long-Horizon Autonomous Scientific Discovery • Paper • 2602.08990 • Published 7 days ago • 68
AIRS-Bench: a Suite of Tasks for Frontier AI Research Science Agents • Paper • 2602.06855 • Published 10 days ago • 70
Weak-Driven Learning: How Weak Agents make Strong Agents Stronger • Paper • 2602.08222 • Published 8 days ago • 253
P1-VL: Bridging Visual Perception and Scientific Reasoning in Physics Olympiads • Paper • 2602.09443 • Published 7 days ago • 57
SkillRL: Evolving Agents via Recursive Skill-Augmented Reinforcement Learning • Paper • 2602.08234 • Published 8 days ago • 64
Agent World Model: Infinity Synthetic Environments for Agentic Reinforcement Learning • Paper • 2602.10090 • Published 6 days ago • 48