Fardan/llama3.2-1b-alpha_rank_128_64_reasoning_instruct_1k_steps_merged Text Generation • 1B • Updated 13 days ago • 46
Fardan/Qwen2.5-1.5B-Instruct-DPO-Human-Like-DPO-Dataset Text Generation • 2B • Updated 28 days ago • 28