SpatialWorld: Benchmarking Interactive Spatial Reasoning of Multimodal Agents in Real-World Tasks Paper • 2606.09669 • Published 4 days ago • 41
ResearchClawBench: A Benchmark for End-to-End Autonomous Scientific Research Paper • 2606.07591 • Published 15 days ago • 85
Beyond Accuracy: Unveiling Inefficiency Patterns in Tool-Integrated Reasoning Paper • 2604.05404 • Published Apr 7 • 43
Does Your Reasoning Model Implicitly Know When to Stop Thinking? Paper • 2602.08354 • Published Feb 9 • 266
SciEvalKit: An Open-source Evaluation Toolkit for Scientific General Intelligence Paper • 2512.22334 • Published Dec 26, 2025 • 36
Probing Scientific General Intelligence of LLMs with Scientist-Aligned Workflows Paper • 2512.16969 • Published Dec 18, 2025 • 120
pyannote/speaker-diarization-3.1 Automatic Speech Recognition • Updated May 10, 2024 • 8.18M • 2.27k
Stand-In: A Lightweight and Plug-and-Play Identity Control for Video Generation Paper • 2508.07901 • Published Aug 11, 2025 • 40
LLM Embeddings Explained: A Visual and Intuitive Guide • Running • 350 • How Language Models Turn Text into Meaning, From Traditional