Leaderboards and Evaluations
The Hub contains leaderboards and evaluations for machine learning models, including LLMs, chatbots, and more. There are three types of leaderboards:
- Eval Results from official benchmark datasets like GPQA, MMLU-Pro, or other datasets used in academic papers. When results are published in model repositories, the scores are shown on the model page.
- Community Managed Leaderboards live on Spaces and are managed by the community for specific use cases.
- Open LLM Leaderboard was a project curated by the Hugging Face team to evaluate and rank open source LLMs and chatbots, and provide reproducible scores separating marketing fluff from actual progress in the field.
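Eval Results published in a model repository are stored as `model-index` metadata in the repository's README front matter, where each entry ties a task and dataset to one or more metric scores. A minimal sketch of that structure, built as a plain Python dictionary (the model name, dataset, and score below are illustrative placeholders, not results from a real model):

```python
import json

# Sketch of one "model-index" eval-result entry as it would appear
# (in YAML form) in a model repo's README metadata. All values here
# are hypothetical examples.
model_index = [
    {
        "name": "my-model",  # hypothetical model name
        "results": [
            {
                # the task the model was evaluated on
                "task": {"type": "text-generation"},
                # the benchmark dataset the score comes from
                "dataset": {"type": "mmlu", "name": "MMLU"},
                # one or more metric scores for that task/dataset pair
                "metrics": [{"type": "accuracy", "value": 0.65}],
            }
        ],
    }
]

print(json.dumps(model_index, indent=2))
```

Entries in this shape are what the Hub reads to render scores on the model page; in an actual repository the same structure is written as YAML under a `model-index:` key rather than as Python.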
