view article Article Custom Kernels for All from Codex and Claude +2 burtenshaw, sayakpaul, ariG23498, evalstate • Feb 13 • 80
view article Article We Got Claude to Build CUDA Kernels and teach open models! +2 burtenshaw, evalstate, merve, pcuenq • Jan 28 • 156
view article Article Understanding NDCG@k (Normalized Discounted Cumulative Gain) charchits7 • Dec 3, 2025 • 4
view article Article Diffusers welcomes FLUX-2 +6 YiYiXu, dg845, sayakpaul, OzzyGT, dn6, ariG23498, linoyts, multimodalart • Nov 25, 2025 • 191
view article Article Vision Language Models (Better, faster, stronger) +3 merve, sergiopaniego, ariG23498, pcuenq, andito • May 12, 2025 • 613
view article Article makeMoE: Implement a Sparse Mixture of Experts Language Model from Scratch AviSoori1x • May 7, 2024 • 121
Streaming Long Video Understanding with Large Language Models Paper • 2405.16009 • Published May 25, 2024 • 1
view article Article Mixture of Experts Explained +4 osanseviero, lewtun, philschmid, smangrul, ybelkada, pcuenq • Dec 11, 2023 • 1.13k
The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery Paper • 2408.06292 • Published Aug 12, 2024 • 128