Macaron-A2UI: A Model for Generative UI in Personal Agents • Paper • 2605.24830 • Published May 24 • 83
SOD: Step-wise On-policy Distillation for Small Language Model Agents • Paper • 2605.07725 • Published May 8 • 25
DelTA: Discriminative Token Credit Assignment for Reinforcement Learning from Verifiable Rewards • Paper • 2605.21467 • Published May 20 • 207
Babsie/Qwen3.6-27B-Heretic2-Uncensored-Finetune-Thinking • Image-Text-to-Text • 27B • Updated May 21 • 5 • 2
CiteVQA: Benchmarking Evidence Attribution for Trustworthy Document Intelligence • Paper • 2605.12882 • Published May 13 • 274
Squeez: Task-Conditioned Tool-Output Pruning for Coding Agents • Paper • 2604.04979 • Published Apr 4 • 11