MobileGym: A Verifiable and Highly Parallel Simulation Platform for Mobile GUI Agent Research • Paper • 2605.26114 • Published 8 days ago • 59 • 3
SpatialBench: Is Your Spatial Foundation Model an All-Round Player? • Paper • 2605.27367 • Published 7 days ago • 68 • 4
EvalVerse: Pipeline-Aware and Expert-Calibrated Benchmarking for Professional Cinematic Video Generation • Paper • 2605.23271 • Published 11 days ago • 78 • 3
LocateAnything: Fast and High-Quality Vision-Language Grounding with Parallel Box Decoding • Paper • 2605.27365 • Published 7 days ago • 130 • 4
Seeing the Needle in the Haystack: Towards Weakly-Supervised Log Instance Anomaly Localization via Counterfactual Perturbation • Paper • 2605.10988 • Published 24 days ago • 3 • 4
Decoding the Critique Mechanism in Large Reasoning Models • Paper • 2603.16331 • Published 11 days ago • 3
ClaimDiff-RL: Fine-Grained Caption Reinforcement Learning through Visual Claim Comparison • Paper • 2605.20278 • Published 9 days ago • 1 • 3
Pixel-Level Pavement Distress Assessment Using Instance Segmentation • Paper • 2605.26095 • Published 8 days ago • 2
Decoupling Communication from Policy: Robust MARL under Bandwidth Constraints • Paper • 2605.21085 • Published 13 days ago • 4 • 3
Broadening Access to Transportation Safety Data with Generative AI: A Schema-Grounded Framework for Spatial Natural Language Queries • Paper • 2605.21712 • Published 13 days ago • 4 • 4
RankJudge: A Multi-Turn LLM-as-a-Judge Synthetic Benchmark Generator • Paper • 2605.21748 • Published 13 days ago • 15 • 3
Representation over Routing: Overcoming Surrogate Hacking in Multi-Timescale PPO • Paper • 2604.13517 • Published 12 days ago • 5 • 4
HorizonStream: Long-Horizon Attention for Streaming 3D Reconstruction • Paper • 2605.23889 • Published 11 days ago • 4 • 3
SimuWoB: Simulating Real-World Mobile Apps for Fast and Faithful GUI Agent Benchmarking • Paper • 2605.25160 • Published 9 days ago • 4 • 2
Directional Alignment Mitigates Reward Hacking in Reinforcement Learning for Language Models • Paper • 2605.25189 • Published 9 days ago • 4 • 3
SemBridge: Language Transfer in Sparse Encoders via Multilingual Semantic Bridges • Paper • 2605.26002 • Published 8 days ago • 3 • 1
CoSPlay: Cooperative Self-Play at Test-Time with Self-Generated Code and Unit Test • Paper • 2605.23491 • Published 11 days ago • 9 • 3
Reinforcing Few-step Generators via Reward-Tilted Distribution Matching • Paper • 2605.26108 • Published 8 days ago • 5 • 4
Coloring the Noise: Adversarial Sobolev Alignment for Faithful Image Super Resolution • Paper • 2605.23264 • Published 11 days ago • 7 • 3