Learning to Foresee: Unveiling the Unlocking Efficiency of On-Policy Distillation Paper • 2605.11739 • Published 7 days ago • 51
AgentLens: Revealing The Lucky Pass Problem in SWE-Agent Evaluation Paper • 2605.12925 • Published 7 days ago • 3
StraTA: Incentivizing Agentic Reinforcement Learning with Strategic Trajectory Abstraction Paper • 2605.06642 • Published 13 days ago • 27
SymptomAI: Towards a Conversational AI Agent for Everyday Symptom Assessment Paper • 2605.04012 • Published 15 days ago • 11
ExoActor: Exocentric Video Generation as Generalizable Interactive Humanoid Control Paper • 2604.27711 • Published 20 days ago • 41
LLaDA2.0-Uni: Unifying Multimodal Understanding and Generation with Diffusion Large Language Model Paper • 2604.20796 • Published 28 days ago • 240
DiPO: Disentangled Perplexity Policy Optimization for Fine-grained Exploration-Exploitation Trade-Off Paper • 2604.13902 • Published Apr 15 • 62
DCAgent2/dev_set_v2_d1_hardened_top4_seq_glm47_20260413_230905 Viewer • Updated Apr 14 • 292 • 14 • 1
Adam's Law: Textual Frequency Law on Large Language Models Paper • 2604.02176 • Published Apr 2 • 503
Rethinking Generalization in Reasoning SFT: A Conditional Analysis on Optimization, Data, and Model Capability Paper • 2604.06628 • Published Apr 8 • 324