Lost in Sampling: Assessing Lexical Reachability in LLMs via the Word Coverage Score (WCS) Paper • 2605.27268 • Published 11 days ago • 13
Artificial Intelligence and Misinformation in Art: Can Vision Language Models Judge the Hand or the Machine Behind the Canvas? Paper • 2508.01408 • Published Aug 2, 2025 • 10
The Generative Energy Arena (GEA): Incorporating Energy Awareness in Large Language Model (LLM) Human Evaluations Paper • 2507.13302 • Published Jul 17, 2025 • 6
Psycholinguistic Word Features: a New Approach for the Evaluation of LLMs Alignment with Humans Paper • 2506.22439 • Published May 29, 2025 • 3
Is There a Case for Conversation Optimized Tokenizers in Large Language Models? Paper • 2506.18674 • Published Jun 23, 2025 • 8
Recursive Inpainting 🧟: Apply recursive inpainting to an image and see the changes. Space • Running • Agents • 10
Multiple Choice Questions: Reasoning Makes Large Language Models (LLMs) More Self-Confident Even When They Are Wrong Paper • 2501.09775 • Published Jan 16, 2025 • 32
La Leaderboard 🌸: Evaluate open LLMs in the languages of LATAM and Spain. Space • Running on CPU Upgrade • Agents • 76
How Stable is Stable Diffusion under Recursive InPainting (RIP)? Paper • 2407.09549 • Published Jun 27, 2024 • 2