LLM
TravelPlanner: A Benchmark for Real-World Planning with Language Agents Paper • 2402.01622 • Published Feb 2, 2024 • 38
User-LLM: Efficient LLM Contextualization with User Embeddings Paper • 2402.13598 • Published Feb 21, 2024 • 22
Leaderboards
Image Arena Leaderboard 📊 • Image Generation and Image Editing Arena & Leaderboard • Running • Featured • 599
MTEB Leaderboard 🥇 • Embedding Leaderboard • Running on CPU Upgrade • 7.44k
Open LLM Leaderboard 🏆 • Track, rank and evaluate open LLMs and chatbots • Running on CPU Upgrade • 14k
Arena Leaderboard 🏆 • View the LMArena leaderboard in full-screen • Running • 4.9k