AlphaStar Unplugged: Large-Scale Offline Reinforcement Learning Paper • 2308.03526 • Published Aug 7, 2023 • 29
Simple synthetic data reduces sycophancy in large language models Paper • 2308.03958 • Published Aug 7, 2023 • 23