GeoBrowse: A Geolocation Benchmark for Agentic Tool Use with Expert-Annotated Reasoning Traces Paper • 2604.04017 • Published Apr 5 • 4
TokenHD Collection Token-level hallucination detectors trained with the TokenHD pipeline. Models range from 0.6B to 8B parameters, all based on Qwen3 backbones. • 7 items • Updated May 14