Lost in Translation? Exploring the Shift in Grammatical Gender from Latin to Occitan Paper • 2605.09156 • Published 8 days ago • 1
Adapting Multilingual Embedding Models to Turkish via Cross-Lingual Tokenizer Surgery and Offline Distillation Paper • 2605.29992 • Published 6 days ago • 4
Not only where, But when: Temporal Scheduling for RLVR Paper • 2605.25381 • Published 9 days ago • 4
Off-the-Shelf LLMs as Process Scorers: Training-Free Alternative to PRMs for Mathematical Reasoning Paper • 2606.01682 • Published 2 days ago • 5
Measuring the Depth of LLM Unlearning via Activation Patching Paper • 2605.24614 • Published 11 days ago • 5
Speculative Pipeline Decoding: Higher-Accuracy and Zero-Bubble Speculation via Pipeline Parallelism Paper • 2605.30852 • Published 5 days ago • 8
WTF GENIUS PAPERS Collection Papers that made me appreciate my major and my life a little more. obs=Observation, innov=Innovation. Most papers are abt improving tiny models. • 162 items • Updated about 8 hours ago • 31
ESPO: Early-Stopping Proximal Policy Optimization Paper • 2605.29860 • Published 6 days ago • 14
Which Pretraining Paradigm Better Serves Spatial Intelligence? An Empirical Comparison of Vision-Language and Video Generation Models Paper • 2605.28132 • Published 7 days ago • 17
Where to Look: Can Foundation Models Reach a Target Viewpoint Through Active Exploration? Paper • 2606.01247 • Published 3 days ago • 20
NITP: Next Implicit Token Prediction for LLM Pre-training Paper • 2605.24956 • Published 10 days ago • 23
Linear Ensembles Wash Away Watermarks: On the Fragility of Distributional Perturbations in LLMs Paper • 2605.30501 • Published 6 days ago • 24
VLMs are Good Teachers for Video Reasoning via Adaptive Test-Time Optimization Paper • 2606.02564 • Published 2 days ago • 22
Draft-OPD: On-Policy Distillation for Speculative Draft Models Paper • 2605.29343 • Published 6 days ago • 26
Domino: Decoupling Causal Modeling from Autoregressive Drafting in Speculative Decoding Paper • 2605.29707 • Published 6 days ago • 26
On the Scaling of PEFT: Towards Million Personal Models of Trillion Parameters Paper • 2606.02437 • Published 2 days ago • 57
One Click per Cell Type Suffices: Training-free Group Interaction for Cell Instance Segmentation Paper • 2605.29429 • Published 6 days ago • 5
GDSD: Reinforcement Learning as Guided Denoiser Self-Distillation for Diffusion Language Models Paper • 2605.29398 • Published 6 days ago • 4
FRAPPE: Full Input, Residual Output Autoencoding with Projection Pursuit Encoder Paper • 2605.28992 • Published 7 days ago • 5