HippoCamp: Benchmarking Contextual Agents on Personal Computers Paper • 2604.01221 • Published 9 days ago • 27
PerceptionComp: A Video Benchmark for Complex Perception-Centric Reasoning Paper • 2603.26653 • Published 14 days ago • 18
Supervised Fine-Tuning versus Reinforcement Learning: A Study of Post-Training Methods for Large Language Models Paper • 2603.13985 • Published 27 days ago • 10
Supervised Fine-Tuning versus Reinforcement Learning: A Study of Post-Training Methods for Large Language Models Paper • 2603.13985 • Published 27 days ago • 10