Lost in Translation? Exploring the Shift in Grammatical Gender from Latin to Occitan Paper • 2605.09156 • Published 8 days ago • 1
Adapting Multilingual Embedding Models to Turkish via Cross-Lingual Tokenizer Surgery and Offline Distillation Paper • 2605.29992 • Published 6 days ago • 4
Not only where, But when: Temporal Scheduling for RLVR Paper • 2605.25381 • Published 9 days ago • 4
Off-the-Shelf LLMs as Process Scorers: Training-Free Alternative to PRMs for Mathematical Reasoning Paper • 2606.01682 • Published 1 day ago • 4
Measuring the Depth of LLM Unlearning via Activation Patching Paper • 2605.24614 • Published 11 days ago • 5
Speculative Pipeline Decoding: Higher-Accuracy and Zero-Bubble Speculation via Pipeline Parallelism Paper • 2605.30852 • Published 5 days ago • 8
ESPO: Early-Stopping Proximal Policy Optimization Paper • 2605.29860 • Published 6 days ago • 13
Which Pretraining Paradigm Better Serves Spatial Intelligence? An Empirical Comparison of Vision-Language and Video Generation Models Paper • 2605.28132 • Published 7 days ago • 17
Where to Look: Can Foundation Models Reach a Target Viewpoint Through Active Exploration? Paper • 2606.01247 • Published 3 days ago • 19
NITP: Next Implicit Token Prediction for LLM Pre-training Paper • 2605.24956 • Published 10 days ago • 22
Linear Ensembles Wash Away Watermarks: On the Fragility of Distributional Perturbations in LLMs Paper • 2605.30501 • Published 6 days ago • 23
VLMs are Good Teachers for Video Reasoning via Adaptive Test-Time Optimization Paper • 2606.02564 • Published 1 day ago • 22
Draft-OPD: On-Policy Distillation for Speculative Draft Models Paper • 2605.29343 • Published 6 days ago • 25
Domino: Decoupling Causal Modeling from Autoregressive Drafting in Speculative Decoding Paper • 2605.29707 • Published 6 days ago • 26
On the Scaling of PEFT: Towards Million Personal Models of Trillion Parameters Paper • 2606.02437 • Published 1 day ago • 55
One Click per Cell Type Suffices: Training-free Group Interaction for Cell Instance Segmentation Paper • 2605.29429 • Published 6 days ago • 3
GDSD: Reinforcement Learning as Guided Denoiser Self-Distillation for Diffusion Language Models Paper • 2605.29398 • Published 6 days ago • 4
FRAPPE: Full Input, Residual Output Autoencoding with Projection Pursuit Encoder Paper • 2605.28992 • Published 7 days ago • 5
VisualThink-VLA: Visual Intermediate Reasoning for Effective and Low-Latency Vision-Language-Action Policies Paper • 2605.30011 • Published 6 days ago • 8
When Confidence Misleads: Suffix Anchoring and Anchor-Proximity Confidence Modulation for Diffusion Language Models Paper • 2605.28181 • Published 7 days ago • 3