Ko-WideSearch: A Korean Breadth-Search Benchmark for Exhaustive Set Enumeration by Web Agents Paper • 2606.27595 • Published 8 days ago • 6
OpenThoughts-Agent: Data Recipes for Agentic Models Paper • 2606.24855 • Published 10 days ago • 46
Geometric Action Model for Robot Policy Learning Paper • 2606.17046 • Published 18 days ago • 117
When Choices Become Priors: Contrastive Decoding for Scientific Figure Multiple-Choice QA Paper • 2603.28026 • Published Mar 30 • 1
ArcANE: Do Role-Playing Language Agents Stay in Character at the Right Time? Paper • 2606.05553 • Published 29 days ago • 50
TIDE: Proactive Multi-Problem Discovery via Template-Guided Iteration Paper • 2606.04743 • Published about 1 month ago • 47
CLAG: Adaptive Memory Organization via Agent-Driven Clustering for Small Language Model Agents Paper • 2603.15421 • Published Apr 20 • 24
MolDeTox: Evaluating Language Model's Stepwise Fragment Editing for Molecular Detoxification Paper • 2605.12181 • Published May 12 • 8
ASGuard: Activation-Scaling Guard to Mitigate Targeted Jailbreaking Attack Paper • 2509.25843 • Published Apr 14 • 20
Lost in the Noise: How Reasoning Models Fail with Contextual Distractors Paper • 2601.07226 • Published Jan 12 • 33
User-Oriented Multi-Turn Dialogue Generation with Tool Use at scale Paper • 2601.08225 • Published Jan 13 • 53
The Curious Case of Analogies: Investigating Analogical Reasoning in Large Language Models Paper • 2511.20344 • Published Nov 25, 2025 • 14
Med-PRM: Medical Reasoning Models with Stepwise, Guideline-verified Process Rewards Paper • 2506.11474 • Published Jun 13, 2025 • 18