Nitesh Kumar Sharma
carbene101
AI & ML interests
LLMs, OCR
Recent Activity
reacted to sergiopaniego's post with 🔥
3 days ago
New TRL + OpenEnv example!
Fine-tune an LLM to play Sudoku using an RL environment via OpenEnv.
Includes a script that runs on one or more GPUs with vLLM, plus a Colab-ready notebook.
Enjoy!
Notebook: https://colab.research.google.com/github/huggingface/trl/blob/main/examples/notebooks/openenv_sudoku_grpo.ipynb
Script: https://github.com/huggingface/trl/blob/main/examples/scripts/openenv/sudoku.py
upvoted a paper
2 months ago
Architecture Decoupling Is Not All You Need For Unified Multimodal Model
reacted to sergiopaniego's post with 🔥
2 months ago
We've just added several example scripts to TRL showing how to train models with GRPO using some of the new OpenEnv environments.
Train a model to interact with a browser (BrowserGym Env), play Wordle (Wordle Env), and moooore!
TRL (GRPO + vLLM) + OpenEnv! ⚡️
Go play with them: https://github.com/huggingface/trl/tree/main/examples/scripts/openenv
Examples list: https://huggingface.co/docs/trl/main/en/example_overview#scripts