Collections
Collections including paper arxiv:2412.15115
- Attention Is All You Need
  Paper • 1706.03762 • Published • 109
- Language Models are Few-Shot Learners
  Paper • 2005.14165 • Published • 19
- LLaMA: Open and Efficient Foundation Language Models
  Paper • 2302.13971 • Published • 20
- Llama 2: Open Foundation and Fine-Tuned Chat Models
  Paper • 2307.09288 • Published • 248

- Qwen2.5 Technical Report
  Paper • 2412.15115 • Published • 376
- The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
  Paper • 2402.17764 • Published • 627
- meta-llama/Llama-4-Scout-17B-16E-Instruct
  Image-to-Text • 109B • Updated • 189k • 1.2k
- keras-io/GauGAN-Image-generation
  Updated • 6 • 4

- Z-Image: An Efficient Image Generation Foundation Model with Single-Stream Diffusion Transformer
  Paper • 2511.22699 • Published • 228
- A Survey on Diffusion Language Models
  Paper • 2508.10875 • Published • 34
- Scalable Diffusion Models with Transformers
  Paper • 2212.09748 • Published • 18
- Scaling Rectified Flow Transformers for High-Resolution Image Synthesis
  Paper • 2403.03206 • Published • 71

- MiniMax-01: Scaling Foundation Models with Lightning Attention
  Paper • 2501.08313 • Published • 300
- DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
  Paper • 2501.12948 • Published • 436
- Qwen2.5 Technical Report
  Paper • 2412.15115 • Published • 376
- Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone
  Paper • 2404.14219 • Published • 259