Hugging Face
22.6 TFLOPS
Dominick Wirzba
Chronuid
0 followers · 20 following
dominick-wirzba-a46898115
AI & ML interests
None yet
Recent Activity
reacted to sergiopaniego's post about 15 hours ago
OpenEnv is growing fast in tutorials. If you're looking to get started with RL environments, check them out:
> evaluate your agents using OpenEnv
> learn how rewards work via rubrics
> connect agents via MCP
> many moreeeee!
Anything you think it's missing?
https://meta-pytorch.org/OpenEnv/tutorials/index.html
reacted to danielhanchen's post about 15 hours ago
We collaborated with NVIDIA to teach you how we made LLM training ~25% faster!
Learn how 3 optimizations help your home GPU train models faster:
1. Packed-sequence metadata caching
2. Double-buffered checkpoint reloads
3. Faster MoE routing
Guide: https://unsloth.ai/blog/nvidia-collab
GitHub: https://github.com/unslothai/unsloth
reacted to qgallouedec's post 16 days ago
TRL v1.2 introduces the SSDTrainer

Simple Self-Distillation (SSD) from Apple's paper "Embarrassingly Simple Self-Distillation Improves Code Generation" is now available as an experimental trainer in TRL.

The recipe is as minimal as the name suggests: sample completions from the model itself at a training-time temperature, then fine-tune on those raw, unverified samples with plain cross-entropy. No reward model. No verifier. No teacher model. No reinforcement learning. Just prompts and the model.

```python
from trl.experimental.ssd import SSDConfig, SSDTrainer

trainer = SSDTrainer(
    model="Qwen/Qwen3-4B-Instruct",
    args=SSDConfig(temperature=0.6, top_k=20, top_p=0.95),
    train_dataset=dataset,
)
trainer.train()
```

v1.2 also ships expanded tool-calling support (LLaMA 3.1 / 3.2, DeepSeek-V3), another round of KTO ↔ DPO alignment getting us closer to promoting KTO to stable, a big GRPO simplification for overlong tool results, deprecation of `use_transformers_paged`, and key fixes for VLM response parsing.

Full release notes: https://github.com/huggingface/trl/releases/tag/v1.2.0
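The sampling half of the recipe in the post above (temperature, `top_k`, `top_p`) can be illustrated with a small standalone sketch. This is not TRL's implementation: the function names `filter_probs` and `sample_token` are hypothetical, the logits are toy numbers, and real libraries may order or renormalise the filters differently. It only assumes the common convention of applying temperature first, then top-k, then nucleus (top-p) filtering.

```python
import math
import random


def filter_probs(logits, temperature=0.6, top_k=20, top_p=0.95):
    """Return {token_index: probability} after temperature, top-k and top-p filtering.

    Hypothetical helper for illustration only; not part of TRL.
    """
    # Temperature scaling followed by a numerically stable softmax.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Top-k: keep the k most probable tokens and renormalise over them.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]
    k_mass = sum(probs[i] for i in ranked)

    # Top-p: keep the smallest prefix whose cumulative mass reaches top_p.
    kept, mass = [], 0.0
    for i in ranked:
        kept.append(i)
        mass += probs[i] / k_mass
        if mass >= top_p:
            break

    # Renormalise over the surviving tokens so the result sums to 1.
    kept_mass = sum(probs[i] for i in kept)
    return {i: probs[i] / kept_mass for i in kept}


def sample_token(logits, rng, **kwargs):
    """Draw one token index from the filtered distribution."""
    dist = filter_probs(logits, **kwargs)
    tokens = list(dist)
    return rng.choices(tokens, weights=[dist[t] for t in tokens], k=1)[0]


if __name__ == "__main__":
    toy_logits = [2.0, 1.0, 0.0, -5.0]  # made-up values for a 4-token vocabulary
    rng = random.Random(0)
    print(filter_probs(toy_logits, top_k=2))
    print(sample_token(toy_logits, rng, top_k=2))
```

In the recipe as the post describes it, completions sampled this way would then be used directly as fine-tuning targets with plain cross-entropy; that training step is left out of the sketch.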
Organizations
Chronuid's activity
liked a model 5 months ago
google/functiongemma-270m-it • Text Generation • Updated Jan 14 • 55.6k • 982
liked 2 models 12 months ago
OS-Copilot/OS-Atlas-Pro-7B • Image-Text-to-Text • 8B • Updated Nov 19, 2024 • 2.02k • 28
jinaai/jina-embeddings-v3 • Feature Extraction • 0.6B • Updated Apr 8 • 3.04M • 1.14k