CodeCircuit: Toward Inferring LLM-Generated Code Correctness via Attribution Graphs • Paper • 2602.07080 • Published 8 days ago • 6
Verifying Chain-of-Thought Reasoning via Its Computational Graph • Paper • 2510.09312 • Published Oct 10, 2025 • 1
PersonaLens: A Benchmark for Personalization Evaluation in Conversational AI Assistants • Paper • 2506.09902 • Published Jun 11, 2025 • 2
PiCSAR: Probabilistic Confidence Selection And Ranking • Paper • 2508.21787 • Published Aug 29, 2025 • 4