Fixing Gradient Accumulation • lysandre, ArthurZ, muellerzr, ydshieh, BenjaminB, pcuenq • Oct 16, 2024 • 66
From DeepSpeed to FSDP and Back Again with Hugging Face Accelerate • mirinflim, aldopareja, muellerzr, stas • Jun 13, 2024 • 62
GaLore: Advancing Large Model Training on Consumer-grade Hardware • Titus-von-Koeller, jiaweizhao, mdouglas, hiyouga, ybelkada, muellerzr, amyeroberts, smangrul, BenjaminB • Mar 20, 2024 • 32
From PyTorch DDP to Accelerate to Trainer, mastery of distributed training with ease • muellerzr • Oct 21, 2022 • 44
CO2 Emissions and the 🤗 Hub: Leading the Charge • sasha, muellerzr, nateraw • Apr 22, 2022 • 23