Prompt-Level Distillation: A Non-Parametric Alternative to Model Fine-Tuning for Efficient Reasoning Paper • 2602.21103 • Published 25 days ago • 8
Anticipate and Learn: Unleashing Idle-Time Compute in Proactive Agents Paper • 2605.25971 • Published May 25 • 16
DelTA: Discriminative Token Credit Assignment for Reinforcement Learning from Verifiable Rewards Paper • 2605.21467 • Published May 20 • 207
Auditing Multimodal LLM Raters: Central Tendency Bias in Clinical Ordinal Scoring Paper • 2605.16386 • Published May 11 • 3
Leveraging Verifier-Based Reinforcement Learning in Image Editing Paper • 2604.27505 • Published Apr 30 • 59
Rethinking Generalization in Reasoning SFT: A Conditional Analysis on Optimization, Data, and Model Capability Paper • 2604.06628 • Published Apr 8 • 329
Adam's Law: Textual Frequency Law on Large Language Models Paper • 2604.02176 • Published Apr 2 • 509
GrandCode: Achieving Grandmaster Level in Competitive Programming via Agentic Reinforcement Learning Paper • 2604.02721 • Published Apr 3 • 638
MultiGen: Level-Design for Editable Multiplayer Worlds in Diffusion Game Engines Paper • 2603.06679 • Published Mar 30 • 7
SQuTR: A Robustness Benchmark for Spoken Query to Text Retrieval under Acoustic Noise Paper • 2602.12783 • Published Feb 13 • 246
ClawKeeper: Comprehensive Safety Protection for OpenClaw Agents Through Skills, Plugins, and Watchers Paper • 2603.24414 • Published Mar 25 • 183
From Blind Spots to Gains: Diagnostic-Driven Iterative Training for Large Multimodal Models Paper • 2602.22859 • Published Feb 26 • 150