Collections
Discover the best community collections!
Collections including paper arxiv:2603.25040
- RynnBrain: Open Embodied Foundation Models
  Paper • 2602.14979 • Published • 45
- InCoder-32B: Code Foundation Model for Industrial Scenarios
  Paper • 2603.16790 • Published • 307
- Intern-S1-Pro: Scientific Multimodal Foundation Model at Trillion Scale
  Paper • 2603.25040 • Published • 125
- The Geometric Alignment Tax: Tokenization vs. Continuous Geometry in Scientific Foundation Models
  Paper • 2604.04155 • Published • 3

- Thoughts Are All Over the Place: On the Underthinking of o1-Like LLMs
  Paper • 2501.18585 • Published • 61
- LLMs Can Easily Learn to Reason from Demonstrations; Structure, not content, is what matters!
  Paper • 2502.07374 • Published • 40
- Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling
  Paper • 2502.06703 • Published • 153
- S*: Test Time Scaling for Code Generation
  Paper • 2502.14382 • Published • 63

- BitNet: Scaling 1-bit Transformers for Large Language Models
  Paper • 2310.11453 • Published • 107
- Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection
  Paper • 2310.11511 • Published • 79
- In-Context Learning Creates Task Vectors
  Paper • 2310.15916 • Published • 43
- Matryoshka Diffusion Models
  Paper • 2310.15111 • Published • 45

- MACRO: Advancing Multi-Reference Image Generation with Structured Long-Context Data
  Paper • 2603.25319 • Published • 32
- Intern-S1-Pro: Scientific Multimodal Foundation Model at Trillion Scale
  Paper • 2603.25040 • Published • 125
- MinerU-Diffusion: Rethinking Document OCR as Inverse Rendering via Diffusion Decoding
  Paper • 2603.22458 • Published • 132
- Speed by Simplicity: A Single-Stream Architecture for Fast Audio-Video Generative Foundation Model
  Paper • 2603.21986 • Published • 121

- Can Large Language Models Understand Context?
  Paper • 2402.00858 • Published • 24
- OLMo: Accelerating the Science of Language Models
  Paper • 2402.00838 • Published • 85
- Self-Rewarding Language Models
  Paper • 2401.10020 • Published • 153
- SemScore: Automated Evaluation of Instruction-Tuned LLMs based on Semantic Textual Similarity
  Paper • 2401.17072 • Published • 25