SAC-GLAM: Improving Online RL for LLM agents with Soft Actor-Critic and Hindsight Relabeling Paper • 2410.12481 • Published Oct 16, 2024
MAGELLAN: Metacognitive predictions of learning progress guide autotelic LLM agents in large goal spaces Paper • 2502.07709 • Published Feb 11, 2025
Reinforcement Learning for Aligning Large Language Models Agents with Interactive Environments: Quantifying and Mitigating Prompt Overfitting Paper • 2410.19920 • Published Oct 25, 2024
SmolVLM: Redefining small and efficient multimodal models Paper • 2504.05299 • Published Apr 7, 2025 • 207
The Ultra-Scale Playbook 🌌 • The ultimate guide to training LLMs on large GPU clusters • 3.79k
Jack of All Trades, Master of Some, a Multi-Purpose Transformer Agent Paper • 2402.09844 • Published Feb 15, 2024 • 21
Jack of All Trades, Master of Some, a Multi-Purpose Transformer Agent Article • Apr 22, 2024 • 81