De-tuned models
juiceb0xc0de
AI & ML interests
destroying heuristic determination in 4 dimensions to flood the engines with diversity and a lot of swear words
Recent Activity
updated a model 3 days ago: juiceb0xc0de/gemma-4-e2b-saes-rolling
published a model 3 days ago: juiceb0xc0de/gemma-4-e2b-saes-rolling
replied to their post 3 days ago
Gemma-4-E2B SAE Atlas — Work in Progress
JumpReLU Sparse Autoencoders trained on every layer of Gemma-4-E2B-it using an adaptive Lagrangian controller. Training is in progress, and I'm publishing layers live as they come hot off the press for anyone interested in following along. I will be making further adjustments for finer resolution, but the early data should be helpful, I think. I'm just a bartender, so don't trust everything I say. 🤗 The Lagrangian math is pretty cool: it auto-steers the trainer, taking the guesswork out of hyperparameter adjustments.
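For anyone wondering what "auto-steering" could look like, here is a rough sketch of the general idea, not the actual training code from this project (that is still unpublished). It assumes the textbook JumpReLU form, where a pre-activation passes through only if it exceeds its threshold, and a plain dual-ascent update on a sparsity multiplier. The names `DualAscentController`, `target_l0`, and `step_size` are made up for illustration.

```python
def jump_relu(pre_acts, thresholds):
    """JumpReLU: keep a pre-activation unchanged if it exceeds its threshold, else output 0."""
    return [z if z > t else 0.0 for z, t in zip(pre_acts, thresholds)]


def l0_norm(acts):
    """Count how many features are active (non-zero)."""
    return sum(1 for a in acts if a != 0.0)


class DualAscentController:
    """Hypothetical controller (an assumption, not the post's actual method).

    Treats "measured L0 <= target L0" as a constraint and adjusts the
    Lagrange multiplier `lam` by dual ascent: the sparsity penalty weight
    rises while the SAE is too dense and decays once it is sparse enough.
    """

    def __init__(self, target_l0, step_size=0.01, init_lam=0.0):
        self.target_l0 = target_l0
        self.step_size = step_size
        self.lam = init_lam

    def update(self, measured_l0):
        # Positive violation means too many active features.
        violation = measured_l0 - self.target_l0
        # Multipliers for inequality constraints must stay non-negative.
        self.lam = max(0.0, self.lam + self.step_size * violation)
        return self.lam


if __name__ == "__main__":
    acts = jump_relu([0.5, 2.0, -1.0, 1.5], [1.0, 1.0, 1.0, 1.0])
    print(acts)  # [0.0, 2.0, 0.0, 1.5]

    controller = DualAscentController(target_l0=1, step_size=0.1)
    # In a training loop, the total loss would be something like
    # reconstruction_loss + controller.lam * sparsity_term.
    print(controller.update(l0_norm(acts)))
```

In a real trainer the threshold is learned and the step function needs a straight-through gradient estimator, both of which this sketch leaves out. The point is only that the multiplier moves on its own in response to measured sparsity, so it does not have to be hand-tuned.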
Full paper and methodology to come whenever I get around to writing them up. There's a lot of work to be done. For now though, enjoy! 🤗
https://huggingface.co/juiceb0xc0de/gemma-4-e2b-saes