SpargeAttention2: Trainable Sparse Attention via Hybrid Top-k+Top-p Masking and Distillation Fine-Tuning Paper • 2602.13515 • Published 12 days ago • 43
Rethinking the Trust Region in LLM Reinforcement Learning Paper • 2602.04879 • Published 21 days ago • 34
Stabilizing Reinforcement Learning with LLMs: Formulation and Practices Paper • 2512.01374 • Published Dec 1, 2025 • 105
Imperceptible Jailbreaking against Large Language Models Paper • 2510.05025 • Published Oct 6, 2025 • 34
MCPMark: A Benchmark for Stress-Testing Realistic and Comprehensive MCP Use Paper • 2509.24002 • Published Sep 28, 2025 • 176
Language Models Can Learn from Verbal Feedback Without Scalar Rewards Paper • 2509.22638 • Published Sep 26, 2025 • 70
SimpleTIR: End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning Paper • 2509.02479 • Published Sep 2, 2025 • 84
VerlTool: Towards Holistic Agentic Reinforcement Learning with Tool Use Paper • 2509.01055 • Published Sep 1, 2025 • 79
DINOv3 Collection DINOv3: foundation models producing excellent dense features, outperforming SotA w/o fine-tuning - https://arxiv.org/abs/2508.10104 • 13 items • Updated Aug 21, 2025 • 515
Adversarial Attacks against Closed-Source MLLMs via Feature Optimal Alignment Paper • 2505.21494 • Published May 27, 2025 • 8
BanditSpec: Adaptive Speculative Decoding via Bandit Algorithms Paper • 2505.15141 • Published May 21, 2025 • 4
QuickVideo: Real-Time Long Video Understanding with System Algorithm Co-Design Paper • 2505.16175 • Published May 22, 2025 • 42
Optimizing Anytime Reasoning via Budget Relative Policy Optimization Paper • 2505.13438 • Published May 19, 2025 • 36