view article Article Train AI models with Unsloth and Hugging Face Jobs for FREE +4 burtenshaw, danielhanchen, shimmyshimmer, mlabonne, davanstrien, evalstate • Feb 20 • 101
view article Article We Got Claude to Build CUDA Kernels and teach open models! +2 burtenshaw, evalstate, merve, pcuenq • Jan 28 • 156
view article Article How We Use Claude Code Skills to Run 1,000+ ML Experiments a Day sionic-ai • Dec 8, 2025 • 57
view article Article We Got Claude to Fine-Tune an Open Source LLM burtenshaw, evalstate • Dec 4, 2025 • 624
view article Article Australian-made LLM beats OpenAI and Google at legal retrieval isaacus • Oct 23, 2025 • 27
view article Article There is no such thing as a tokenizer-free lunch catherinearnett • Sep 25, 2025 • 98
The Majority is not always right: RL training for solution aggregation Paper • 2509.06870 • Published Sep 8, 2025 • 15
Reinforcement Learning Finetunes Small Subnetworks in Large Language Models Paper • 2505.11711 • Published May 16, 2025 • 11
view article Article nanoVLM: The simplest repository to train your VLM in pure PyTorch +5 ariG23498, lusxvr, andito, sergiopaniego, merve, pcuenq, reach-vb • May 21, 2025 • 258
view article Article Tiny Agents: an MCP-powered agent in 50 lines of code julien-c • Apr 25, 2025 • 308
Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model? Paper • 2504.13837 • Published Apr 18, 2025 • 141
view article Article Gotchas in Tokenizer Behavior Every Developer Should Know qgallouedec • Apr 18, 2025 • 72
Gemma 3 Collection All versions of Google's new multimodal models including QAT in 1B, 4B, 12B, and 27B sizes. In GGUF, dynamic 4-bit and 16-bit formats. • 54 items • Updated 24 days ago • 115
view article Article Open-R1: a fully open reproduction of DeepSeek-R1 +1 eliebak, lvwerra, lewtun • Jan 28, 2025 • 889