GrandCode: Achieving Grandmaster Level in Competitive Programming via Agentic Reinforcement Learning • Paper • 2604.02721 • Published 12 days ago • 352
SkillX: Automatically Constructing Skill Knowledge Bases for Agents • Paper • 2604.04804 • Published 9 days ago • 31
Squeez: Task-Conditioned Tool-Output Pruning for Coding Agents • Paper • 2604.04979 • Published 11 days ago • 10
A Frame is Worth One Token: Efficient Generative World Modeling with Delta Tokens • Paper • 2604.04913 • Published 9 days ago • 9
INSPATIO-WORLD: A Real-Time 4D World Simulator via Spatiotemporal Autoregressive Modeling • Paper • 2604.07209 • Published 7 days ago • 35
GameWorld: Towards Standardized and Verifiable Evaluation of Multimodal Game Agents • Paper • 2604.07429 • Published 7 days ago • 16
ClawBench: Can AI Agents Complete Everyday Online Tasks? • Paper • 2604.08523 • Published 6 days ago • 253
SkillClaw: Let Skills Evolve Collectively with Agentic Evolver • Paper • 2604.08377 • Published 6 days ago • 273
ThinkTwice: Jointly Optimizing Large Language Models for Reasoning and Self-Refinement • Paper • 2604.01591 • Published 13 days ago • 40
Beyond Accuracy: Unveiling Inefficiency Patterns in Tool-Integrated Reasoning • Paper • 2604.05404 • Published 8 days ago • 41
MegaTrain: Full Precision Training of 100B+ Parameter Large Language Models on a Single GPU • Paper • 2604.05091 • Published 9 days ago • 44
Claw-Eval: Toward Trustworthy Evaluation of Autonomous Agents • Paper • 2604.06132 • Published 8 days ago • 114
TriAttention: Efficient Long Reasoning with Trigonometric KV Compression • Paper • 2604.04921 • Published 9 days ago • 107
Adam's Law: Textual Frequency Law on Large Language Models • Paper • 2604.02176 • Published 13 days ago • 468
DataFlex: A Unified Framework for Data-Centric Dynamic Training of Large Language Models • Paper • 2603.26164 • Published 19 days ago • 351
Proactive Agent Research Environment: Simulating Active Users to Evaluate Proactive Assistants • Paper • 2604.00842 • Published 14 days ago • 13