Keep the Tokens Flowing: Lessons from 16 Open-Source RL Libraries (3 days ago)
Ulysses Sequence Parallelism: Training with Million-Token Contexts (4 days ago)
Preference Tuning LLMs with Direct Preference Optimization Methods (Jan 18, 2024)