Train
Grokking in the Wild: Data Augmentation for Real-World Multi-Hop Reasoning with Transformers
Paper • 2504.20752 • Published • 94
Phi-4-Mini-Reasoning: Exploring the Limits of Small Reasoning Language Models in Math
Paper • 2504.21233 • Published • 49
AF Adapter: Continual Pretraining for Building Chinese Biomedical Language Model
Paper • 2211.11363 • Published • 1
MoRA: High-Rank Updating for Parameter-Efficient Fine-Tuning
Paper • 2405.12130 • Published • 50
LISA: Layerwise Importance Sampling for Memory-Efficient Large Language Model Fine-Tuning
Paper • 2403.17919 • Published • 16
GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection
Paper • 2403.03507 • Published • 189
ALoRA: Allocating Low-Rank Adaptation for Fine-tuning Large Language Models
Paper • 2403.16187 • Published
Absolute Zero: Reinforced Self-play Reasoning with Zero Data
Paper • 2505.03335 • Published • 189
AceReason-Nemotron: Advancing Math and Code Reasoning through Reinforcement Learning
Paper • 2505.16400 • Published • 36
Model Merging in Pre-training of Large Language Models
Paper • 2505.12082 • Published • 40
A Token is Worth over 1,000 Tokens: Efficient Knowledge Distillation through Low-Rank Clone
Paper • 2505.12781 • Published • 2
Paper • 2505.09388 • Published • 335
Soft Thinking: Unlocking the Reasoning Potential of LLMs in Continuous Concept Space
Paper • 2505.15778 • Published • 19
AceReason-Nemotron 1.1: Advancing Math and Code Reasoning through SFT and RL Synergy
Paper • 2506.13284 • Published • 26
Paper • 2506.11305 • Published • 8
Reinforcement Pre-Training
Paper • 2506.08007 • Published • 263
SlimMoE: Structured Compression of Large MoE Models via Expert Slimming and Distillation
Paper • 2506.18349 • Published • 13
TPTT: Transforming Pretrained Transformer into Titans
Paper • 2506.17671 • Published • 5
PosterGen: Aesthetic-Aware Paper-to-Poster Generation via Multi-Agent LLMs
Paper • 2508.17188 • Published • 17
No Label Left Behind: A Unified Surface Defect Detection Model for all Supervision Regimes
Paper • 2508.19060 • Published • 12
Set Block Decoding is a Language Model Inference Accelerator
Paper • 2509.04185 • Published • 54
QeRL: Beyond Efficiency -- Quantization-enhanced Reinforcement Learning for LLMs
Paper • 2510.11696 • Published • 181
OPUS: Towards Efficient and Principled Data Selection in Large Language Model Pre-training in Every Iteration
Paper • 2602.05400 • Published • 336