Train
Grokking in the Wild: Data Augmentation for Real-World Multi-Hop Reasoning with Transformers
Paper • 2504.20752 • Published • 94
Phi-4-Mini-Reasoning: Exploring the Limits of Small Reasoning Language Models in Math
Paper • 2504.21233 • Published • 49
AF Adapter: Continual Pretraining for Building Chinese Biomedical Language Model
Paper • 2211.11363 • Published • 1
MoRA: High-Rank Updating for Parameter-Efficient Fine-Tuning
Paper • 2405.12130 • Published • 50
LISA: Layerwise Importance Sampling for Memory-Efficient Large Language Model Fine-Tuning
Paper • 2403.17919 • Published • 16
GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection
Paper • 2403.03507 • Published • 189
ALoRA: Allocating Low-Rank Adaptation for Fine-tuning Large Language Models
Paper • 2403.16187 • Published
Absolute Zero: Reinforced Self-play Reasoning with Zero Data
Paper • 2505.03335 • Published • 189
AceReason-Nemotron: Advancing Math and Code Reasoning through Reinforcement Learning
Paper • 2505.16400 • Published • 36
Model Merging in Pre-training of Large Language Models
Paper • 2505.12082 • Published • 40
A Token is Worth over 1,000 Tokens: Efficient Knowledge Distillation through Low-Rank Clone
Paper • 2505.12781 • Published • 2
Paper • 2505.09388 • Published • 335
Soft Thinking: Unlocking the Reasoning Potential of LLMs in Continuous Concept Space
Paper • 2505.15778 • Published • 19
AceReason-Nemotron 1.1: Advancing Math and Code Reasoning through SFT and RL Synergy
Paper • 2506.13284 • Published • 26
Paper • 2506.11305 • Published • 8
Reinforcement Pre-Training
Paper • 2506.08007 • Published • 263
SlimMoE: Structured Compression of Large MoE Models via Expert Slimming and Distillation
Paper • 2506.18349 • Published • 13
TPTT: Transforming Pretrained Transformer into Titans
Paper • 2506.17671 • Published • 5
PosterGen: Aesthetic-Aware Paper-to-Poster Generation via Multi-Agent LLMs
Paper • 2508.17188 • Published • 17
No Label Left Behind: A Unified Surface Defect Detection Model for all Supervision Regimes
Paper • 2508.19060 • Published • 12
Set Block Decoding is a Language Model Inference Accelerator
Paper • 2509.04185 • Published • 54
QeRL: Beyond Efficiency -- Quantization-enhanced Reinforcement Learning for LLMs
Paper • 2510.11696 • Published • 181
OPUS: Towards Efficient and Principled Data Selection in Large Language Model Pre-training in Every Iteration
Paper • 2602.05400 • Published • 336