SCOPE: Signal-Calibrated On-Policy Distillation Enhancement with Dual-Path Adaptive Weighting • Paper • 2604.10688 • Published 3 days ago • 5
Skywork-SWE: Unveiling Data Scaling Laws for Software Engineering in LLMs • Paper • 2506.19290 • Published Jun 24, 2025 • 53
CSVQA: A Chinese Multimodal Benchmark for Evaluating STEM Reasoning Capabilities of VLMs • Paper • 2505.24120 • Published May 30, 2025 • 50
CSVQA • Collection • A Chinese Multimodal Benchmark for Evaluating STEM Reasoning Capabilities of VLMs • 2 items • Updated Aug 13, 2025 • 6