arxiv:2501.03895
Yang Feng
fengyang0317
AI & ML interests
None yet
Organizations
None yet
models 10
fengyang0317/sft_output
Updated
fengyang0317/SmolLM2-FT-DPO
Text Generation • 0.1B • Updated
fengyang0317/SmolLM2-FT-MyDataset
Text Generation • 0.1B • Updated
fengyang0317/ppo-CartPole-v1
Reinforcement Learning • Updated
fengyang0317/unit4
Updated
fengyang0317/dqn-SpaceInvadersNoFrameskip-v4
Reinforcement Learning • Updated
fengyang0317/Taxi-v3
Reinforcement Learning • Updated
fengyang0317/q-FrozenLake-v1-4x4-noSlippery
Reinforcement Learning • Updated
fengyang0317/ppo-Huggy
Reinforcement Learning • Updated • 16
fengyang0317/whisper-small-dv
Automatic Speech Recognition • 0.2B • Updated
datasets 10
fengyang0317/commonsense
Viewer • Updated • 10.6k • 6
fengyang0317/prosqa
Viewer • Updated • 18.7k • 12
fengyang0317/prontoqa
Viewer • Updated • 10k • 8
fengyang0317/gsm8k
Viewer • Updated • 387k • 7
fengyang0317/listops-32
Viewer • Updated • 100k • 16
fengyang0317/listops-64
Viewer • Updated • 100k • 196
fengyang0317/listops-128
Viewer • Updated • 100k • 43
fengyang0317/listops-d20
Viewer • Updated • 100k • 11
fengyang0317/listops-1000
Viewer • Updated • 100k • 21
fengyang0317/imagenet-1k
Viewer • Updated • 22 • 18