GlubLM (36M)
the language model that already forgot this sentence
GlubLM is a 36-million-parameter transformer that plays the character of a goldfish with a 10-second memory. Inspired by GuppyLM by Arman BD and by Ted Lasso's meditation on the goldfish as "the happiest animal on earth", GlubLM has a hard 96-token context window: it physically cannot remember what was just said.
Try it live: browser demo | pixel-art desk pet
Architecture
- Parameters: 36,055,680 (36.1M)
- Layers: 8 decoder-only transformer blocks
- Hidden dim: 640
- Attention heads: 10 (head dim 64)
- FFN dim: 1280 (SwiGLU, effective intermediate 2560)
- Normalization: RMSNorm
- Position encoding: Rotary (RoPE)
- Vocabulary: 5,120 Byte-Level BPE
- Max context: 96 tokens (hard cap, the "10-second memory")
- Weight-tied LM head
- No bias terms
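The dimensions above fully determine the parameter count. A quick sanity check in plain Python, assuming each block has four square attention projections (q, k, v, o), a SwiGLU FFN with gate/up/down projections, two RMSNorm gain vectors per block plus one final norm, tied embeddings counted once, and no biases, all consistent with the spec above:

```python
# Parameter-count sanity check for the architecture listed above.
vocab, d_model, n_layers, d_ffn = 5120, 640, 8, 1280

embed = vocab * d_model                      # tied with the LM head, counted once
attn = 4 * d_model * d_model                 # q, k, v, o projections
ffn = 2 * d_model * d_ffn + d_ffn * d_model  # gate + up, then down (SwiGLU)
norms = 2 * d_model                          # pre-attn and pre-FFN RMSNorm gains

total = embed + n_layers * (attn + ffn + norms) + d_model  # + final RMSNorm
print(total)  # 36055680, matching the stated 36.1M
```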
Intended use
This model is a toy. It exists to:
- Explore the design tension between "small + simple" (GuppyLM's thesis) and "small + modern" (GlubLM's hypothesis)
- Demonstrate an LLM-generated dataset pipeline using a multi-agent Claude team
- Be a fun browser demo and a pixel-art desk pet companion
Do not use GlubLM for anything serious. It literally forgets within a sentence.
Training data
Trained on DenSec02/glublm-60k-ted, a 60,549-sample dataset of single-turn goldfish conversations generated by a team of four coordinated Claude agents (generator, critic, diversifier, persona-guardian). Composition: v4 balanced mix (20K poetic + 15K supplement + 5K conversational + 15K forgetful) augmented with v5.1 empathic/introspective hotfix (1K samples) + v5.2 multi-anchor self-awareness recovery (500 samples).
Explicit exclusions: no references to football, soccer, coaches, teams, or any Ted Lasso show characters.
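The exclusion list lends itself to a simple keyword gate. A minimal sketch of such a filter; the function name and term list are illustrative, not the actual persona-guardian pipeline code:

```python
# Hypothetical exclusion filter: reject samples mentioning banned topics.
# The term list below is illustrative, not the real (exhaustive) one.
BANNED_TERMS = {"football", "soccer", "coach", "team", "lasso"}

def passes_exclusions(sample: str) -> bool:
    """Return True only if the sample mentions none of the banned terms."""
    text = sample.lower()
    return not any(term in text for term in BANNED_TERMS)

print(passes_exclusions("glub glub, the castle is new!"))  # True
print(passes_exclusions("my coach says swim faster"))      # False
```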
Training
- Hardware: NVIDIA RTX 3060 12GB (local)
- Framework: PyTorch 2.x, BF16 mixed precision
- Optimizer: AdamW (β1=0.9, β2=0.95), weight decay 0.1
- LR schedule: cosine with 5% warmup, peak 3e-4
- Batch size: 64
- Epochs: 15
- Dropout: 0.1 (residual), 0.0 (attention)
- Gradient clipping: 1.0
- Final loss: 1.1442
- Wall time: ~15 minutes
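The LR schedule above (cosine with 5% warmup, peak 3e-4) can be written as one small pure function. This is a sketch of the standard formula, not the repo's exact implementation:

```python
import math

def lr_at(step: int, total_steps: int,
          peak: float = 3e-4, warmup_frac: float = 0.05) -> float:
    """Linear warmup to `peak`, then cosine decay to zero."""
    warmup = max(1, int(warmup_frac * total_steps))
    if step < warmup:
        return peak * step / warmup
    progress = (step - warmup) / max(1, total_steps - warmup)
    return 0.5 * peak * (1.0 + math.cos(math.pi * progress))

print(lr_at(0, 1000))     # 0.0
print(lr_at(50, 1000))    # 0.0003 (peak, reached at end of warmup)
print(lr_at(1000, 1000))  # 0.0 (fully decayed)
```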
Evaluation (v2 cross-model judge)
Dual-judge evaluation using Claude Sonnet 4.6 and Opus 4.7 on a 30-prompt rubric across 4 axes (integer 1-5 scale). Each axis aggregates 30 prompts × 3 seeds × 2 passes = 180 scoring rows per judge.
Per-axis score (mean)
| Axis | Sonnet 4.6 | Opus 4.7 |
|---|---|---|
| Conversational Quality | 4.01 | 4.15 |
| Goldfish Identity | 3.89 | 3.67 |
| Forgetful Trait | 3.80 | 3.81 |
| Length Appropriateness | 4.77 | 4.57 |
Cross-judge agreement (Cohen's quadratic-weighted kappa)
| Axis | Kappa | Interpretation |
|---|---|---|
| Conversational Quality | 0.77 | substantial |
| Goldfish Identity | 0.83 | almost perfect |
| Forgetful Trait | 0.86 | almost perfect |
| Length Appropriateness | 0.59 | moderate |
Interpretation: Sonnet and Opus reach substantial-to-almost-perfect agreement on 3/4 axes (and moderate agreement on Length Appropriateness), validating that the rubric is interpreted consistently across LLM judges. Opus is systematically ~0.2 points stricter than Sonnet on the Identity axis (stricter rubric application, not judge bias).
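For reference, quadratic-weighted kappa can be computed from paired integer ratings as follows. This is a self-contained sketch of the standard formula, not the eval harness code, and the example rating lists are made up:

```python
def quadratic_weighted_kappa(a, b, k=5):
    """Cohen's kappa with quadratic disagreement weights
    for ratings on an integer 1..k scale."""
    n = len(a)
    # Observed co-occurrence matrix (as probabilities)
    obs = [[0.0] * k for _ in range(k)]
    for x, y in zip(a, b):
        obs[x - 1][y - 1] += 1 / n
    # Marginal rating distributions for the chance-agreement term
    pa = [a.count(i + 1) / n for i in range(k)]
    pb = [b.count(i + 1) / n for i in range(k)]
    num = den = 0.0
    for i in range(k):
        for j in range(k):
            w = (i - j) ** 2 / (k - 1) ** 2  # quadratic disagreement weight
            num += w * obs[i][j]
            den += w * pa[i] * pb[j]
    return 1.0 - num / den

# Perfect agreement scores exactly 1.0
print(quadratic_weighted_kappa([1, 2, 3, 4, 5], [1, 2, 3, 4, 5]))  # 1.0

# Two hypothetical judges that mostly agree
judge_a = [4, 4, 5, 3, 4, 2, 5, 3]
judge_b = [4, 5, 5, 3, 4, 3, 5, 3]
print(quadratic_weighted_kappa(judge_a, judge_b))
```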
Full methodology + 108-row long-format scores: eval/report_crossmodel.md.
Limitations & biases
- Hard context limit: 96 tokens. Inputs longer than a few short sentences will be truncated.
- Goldfish worldview: the model genuinely does not understand human abstractions outside the bowl.
- Dataset bias: the dataset was generated by Claude (Anthropic), so it inherits Claude's language patterns filtered through the goldfish persona.
- Single-turn only: multi-turn memory is a non-goal.
- English only.
- Stochastic and occasionally incoherent: 36M params on 60K samples is small. Do not expect reliability.
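The 96-token cap means long inputs are effectively left-truncated to their most recent tokens. A minimal sketch of that behavior, assuming the runtime keeps the newest tokens (the helper name is illustrative):

```python
MAX_CONTEXT = 96  # the model's hard context cap

def clamp_context(token_ids: list[int]) -> list[int]:
    """Keep only the newest MAX_CONTEXT tokens: the '10-second memory'."""
    return token_ids[-MAX_CONTEXT:]

ids = list(range(200))           # a 200-token prompt
clamped = clamp_context(ids)
print(len(clamped), clamped[0])  # 96 104 (the first 104 tokens are forgotten)
```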
How to use
```python
from huggingface_hub import hf_hub_download
from safetensors.torch import load_model

from glublm.config import ModelConfig
from glublm.inference import generate
from glublm.model import GlubLM
from glublm.tokenizer import GlubTokenizer

# Download the tokenizer and weights from the Hub
tok_path = hf_hub_download("DenSec02/glublm-36m", "tokenizer.json")
weights_path = hf_hub_download("DenSec02/glublm-36m", "model.safetensors")

# Build the model and load the checkpoint
tok = GlubTokenizer.from_file(tok_path)
cfg = ModelConfig(vocab_size=tok.vocab_size)
model = GlubLM(cfg)
load_model(model, weights_path)

print(generate(model=model, tokenizer=tok, prompt="hello", max_new_tokens=24))
```
Or try it in-browser with zero setup:
- Chat demo (simple web UI)
- Desk pet companion (pixel-art PWA)
- Colab notebook (train your own goldfish)
License
AGPL-3.0 - see LICENSE.
Citation
```bibtex
@software{glublm_2026,
  author = {Sepede, Dennis},
  title  = {GlubLM: a 36M goldfish language model with a 10-second memory},
  year   = {2026},
  url    = {https://github.com/Den-Sec/glublm}
}
```