RyanYr/pg_sais-dapo_shuffled-offline-grpo_qwen2.5-math-1.5B_kl_bl0_matheval Updated about 1 month ago • 49
RyanYr/pg_sais-dapo_shuffled-offline-grpo_qwen2.5-math-1.5B_kl_bl0_matheval Updated about 1 month ago • 49
RyanYr/pg_sais-dapo_shuffled-0_offline-grpo_qwen2.5-math-1.5B_kl_matheval Updated about 1 month ago • 53
RyanYr/pg_sais-dapo_shuffled-0_offline-grpo_qwen2.5-math-1.5B_kl_matheval Updated about 1 month ago • 53
RyanYr/pg-dapo_shuffled-10_offline-grpo_qwen2.5-math-1.5B_piref_nokl_matheval Viewer • Updated May 5 • 1.55k • 34
RyanYr/pg-dapo_shuffled-10_offline-grpo_qwen2.5-math-1.5B_piref_nokl_matheval Viewer • Updated May 5 • 1.55k • 34