arxiv:2402.17139
Sherry Yang
sherryy
models 10
sherryy/Qwen2-0.5B-GRPO-test
sherryy/best5-next10-nopizza-nonomad_sft_90 • Text Generation • 8B
sherryy/pizza_rwr_2k-1k • Text Generation • 8B • 2
sherryy/pizza_rwr_k10_iter1 • Text Generation • 8B • 1
sherryy/pizza_rwr_iter1 • Text Generation • 8B
sherryy/pizza_rwr_k10 • Text Generation • 8B • 2
sherryy/pizza_rwr • Text Generation • 8B • 2
sherryy/pizza_sft_90 • Text Generation • 8B • 2
sherryy/pizza_sft • Text Generation • 8B • 2
sherryy/math-baseline • Text Generation • 8B • 2
datasets 14
sherryy/best5-next10-nopizza-nonomad_sft_90 • 78.6k • 29
sherryy/pizza_rwr_k10_iter1 • 24.4k • 3
sherryy/pizza_rwr_iter1 • 42.4k • 2
sherryy/pizza_rwr • 83k • 22
sherryy/tree_dataset • 11.1k • 3
sherryy/pizza_sft • 37.8k • 23
sherryy/pizza_dpo • 5.61k • 2
sherryy/math12k • 12.5k • 6
sherryy/random-acts-of-pizza • 59.5k • 24
sherryy/test_data • 5