Gemma 3 (1B) model with GRPO training
Sarthak Thakur
sarthak247
Models (7)
sarthak247/gemma-3-1B-GRPO-float16 • Text Generation • 1.0B • Updated • 4
sarthak247/gemma-3-1B-GRPO-Adapter • Updated
sarthak247/Wan2.1-T2V-1.3B-nf4 • Text-to-Video • Updated • 20 • 5
sarthak247/qwen2.5-grpo-gsm8k-250steps-gguf • 3B • Updated • 14
sarthak247/qwen2.5-grpo-gsm8k-250steps-lora-adapters • Updated
sarthak247/qwen2.5-grpo-gsm8k-250steps-fp16 • Text Generation • Updated • 1
sarthak247/codellama-7b-humaneval-java-fim • Updated • 2