Models

72,982

Active filters: reinforcement-learning

Adilbai/stock-trading-rl-agent

Reinforcement Learning • Updated Jan 8 • 136 • 142

nvidia/GEAR-SONIC

Reinforcement Learning • Updated 25 days ago • 41

JonusNattapong/AI-XAUUSD-Trading

Reinforcement Learning • Updated Oct 10, 2025 • 33

nvidia/NitroGen

Reinforcement Learning • Updated Feb 5 • 528

Accio-Lab/Metis-8B-RL

Image-Text-to-Text • 9B • Updated 26 days ago • 492 • 4

THU-KEG/LongWriter-Zero-32B

Text Generation • 33B • Updated Jul 3, 2025 • 179 • 113

IntelliGrow/FetchPickAndPlace-v4

Reinforcement Learning • Updated Aug 16, 2025 • 177 • 3

zai-org/GLM-TTS

Text-to-Speech • Updated Jan 12 • 1.29k • 336

DocPereira/PEAL_V4_LHP_Zero_Entropy_Controlled

Reinforcement Learning • Updated about 9 hours ago • 278 • 1

OpenDataArena/ODA-Fin-RL-8B

Reinforcement Learning • 8B • Updated Mar 10 • 96 • 2

MBZUAI/MediX-R1-8B

Image-Text-to-Text • 9B • Updated Feb 27 • 167 • 5

nvidia/EGM-8B

Image-Text-to-Text • 9B • Updated 26 days ago • 567 • 7

LeonOverload/PRIMO-R1-7B

Video-Text-to-Text • 8B • Updated 15 days ago • 19 • 1

LeonOverload/PRIMO-COT-SFT-7B

Video-Text-to-Text • 849k • Updated 15 days ago • 35 • 1

Camais03/camie-crafter

Reinforcement Learning • Updated Mar 29 • 23 • 5

zlab-princeton/Vero-Qwen25-7B

Image-Text-to-Text • 8B • Updated 29 days ago • 39 • 1

Huggggooo/ProtoCycle-7B

Text Generation • 8B • Updated 17 days ago • 790 • 1

mradermacher/ProtoCycle-7B-GGUF

Reinforcement Learning • 8B • Updated 17 days ago • 458 • 1

mradermacher/PRIMO-R1-7B-GGUF

Reinforcement Learning • 8B • Updated 14 days ago • 561 • 1

mradermacher/PRIMO-COT-SFT-7B-GGUF

Reinforcement Learning • 8B • Updated 14 days ago • 588 • 1

Falconss1/VideoThinker-R1-3B

Video-Text-to-Text • 4B • Updated about 18 hours ago • 18 • 1

mradermacher/VideoThinker-R1-3B-GGUF

Question Answering • 3B • Updated 12 days ago • 940 • 1

imran785/medical-triage-qwen-3b-trained

Reinforcement Learning • 3B • Updated 10 days ago • 242 • 1

eressss/among-agents-qwen-1.5b-finetuned

Text Generation • Updated 10 days ago • 1

anshumanatrey/pharmarl-llama-3b-trained-anshuman

Text Generation • Updated 7 days ago • 41 • 1

NickupAI/alphabypass3

Reinforcement Learning • Updated 9 days ago • 2

Phanindra2503/Reinforce-CartPole-v1

Reinforcement Learning • Updated 5 days ago • 1

lllyx/Qwen3-4B-Base-GRPO

Text Generation • 4B • Updated 3 days ago • 64 • 1

SpatialReward/SpatialReward-8B

Image-Text-to-Text • 770k • Updated about 22 hours ago • 1

ValueFX9507/Tifa-Deepsex-14b-CoT-GGUF-Q4

Reinforcement Learning • 15B • Updated Feb 13, 2025 • 1.59k • 830