Deep Think
DeepSeek-R1 Thoughtology: Let's <think> about LLM Reasoning
Paper
• 2504.07128
• Published
• 87
BM25S: Orders of magnitude faster lexical search via eager sparse scoring
Paper
• 2407.03618
• Published
• 14
Deep Think with Confidence
Paper
• 2508.15260
• Published
• 90
R-Zero: Self-Evolving Reasoning LLM from Zero Data
Paper
• 2508.05004
• Published
• 130
Omni-Thinker: Scaling Cross-Domain Generalization in LLMs via Multi-Task RL with Hybrid Rewards
Paper
• 2507.14783
• Published
• 4
GHPO: Adaptive Guidance for Stable and Efficient LLM Reinforcement Learning
Paper
• 2507.10628
• Published
• 2
A Survey of Reinforcement Learning for Large Reasoning Models
Paper
• 2509.08827
• Published
• 190
Why Language Models Hallucinate
Paper
• 2509.04664
• Published
• 196
Reverse-Engineered Reasoning for Open-Ended Generation
Paper
• 2509.06160
• Published
• 149
Reinforcement Learning Foundations for Deep Research Systems: A Survey
Paper
• 2509.06733
• Published
• 32
Emergent Hierarchical Reasoning in LLMs through Reinforcement Learning
Paper
• 2509.03646
• Published
• 33
Staying in the Sweet Spot: Responsive Reasoning Evolution via Capability-Adaptive Hint Scaffolding
Paper
• 2509.06923
• Published
• 22
Sharing is Caring: Efficient LM Post-Training with Collective RL Experience Sharing
Paper
• 2509.08721
• Published
• 662
Inverse IFEval: Can LLMs Unlearn Stubborn Training Conventions to Follow Real Instructions?
Paper
• 2509.04292
• Published
• 58
Easy Dataset: A Unified and Extensible Framework for Synthesizing LLM Fine-Tuning Data from Unstructured Documents
Paper
• 2507.04009
• Published
• 54
Scaling Agents via Continual Pre-training
Paper
• 2509.13310
• Published
• 117
QuantAgent: Price-Driven Multi-Agent LLMs for High-Frequency Trading
Paper
• 2509.09995
• Published
• 16
ReSum: Unlocking Long-Horizon Search Intelligence via Context Summarization
Paper
• 2509.13313
• Published
• 80
PromptCoT 2.0: Scaling Prompt Synthesis for Large Language Model Reasoning
Paper
• 2509.19894
• Published
• 34
When Does Reasoning Matter? A Controlled Study of Reasoning's Contribution to Model Performance
Paper
• 2509.22193
• Published
• 38