GlubLM (36M)
the language model that already forgot this sentence
GlubLM is a 36-million-parameter transformer that plays the character of a goldfish with a 10-second memory. Inspired by GuppyLM by Arman BD and by Ted Lasso's meditation on the goldfish as "the happiest animal on earth", GlubLM has a hard 96-token context window: it physically cannot remember what was just said.
Try it live: browser demo | pixel-art desk pet
Architecture
- Parameters: 36,055,680 (36.1M)
- Layers: 8 decoder-only transformer blocks
- Hidden dim: 640
- Attention heads: 10 (head dim 64)
- FFN dim: 1280 (SwiGLU, effective intermediate 2560)
- Normalization: RMSNorm
- Position encoding: Rotary (RoPE)
- Vocabulary: 5,120 Byte-Level BPE
- Max context: 96 tokens (hard cap, the "10-second memory")
- Weight-tied LM head
- No bias terms
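The dimensions above fully determine the parameter count. A quick sanity check in plain Python, assuming each block has four square attention projections (q, k, v, o), a SwiGLU FFN with gate/up/down projections, two RMSNorm gain vectors per block plus one final norm, tied embeddings counted once, and no biases, all consistent with the spec above:

```python
# Parameter-count sanity check for the architecture listed above.
vocab, d_model, n_layers, d_ffn = 5120, 640, 8, 1280

embed = vocab * d_model                      # tied with the LM head, counted once
attn = 4 * d_model * d_model                 # q, k, v, o projections
ffn = 2 * d_model * d_ffn + d_ffn * d_model  # gate + up, then down (SwiGLU)
norms = 2 * d_model                          # pre-attn and pre-FFN RMSNorm gains

total = embed + n_layers * (attn + ffn + norms) + d_model  # + final RMSNorm
print(total)  # 36055680, matching the stated 36.1M
```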
Intended use
This model is a toy. It exists to:
- Explore the design tension between "small + simple" (GuppyLM's thesis) and "small + modern" (GlubLM's hypothesis)
- Demonstrate an LLM-generated dataset pipeline using a multi-agent Claude team
- Be a fun browser demo and a pixel-art desk pet companion
Do not use GlubLM for anything serious. It literally forgets within a sentence.
Training data
Trained on DenSec02/glublm-60k-ted, a 60,549-sample dataset of single-turn goldfish conversations generated by a team of four coordinated Claude agents (generator, critic, diversifier, persona-guardian). Composition: v4 balanced mix (20K poetic + 15K supplement + 5K conversational + 15K forgetful) augmented with v5.1 empathic/introspective hotfix (1K samples) + v5.2 multi-anchor self-awareness recovery (500 samples).
Explicit exclusions: no references to football, soccer, coaches, teams, or any Ted Lasso show characters.
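The exclusion list lends itself to a simple keyword gate. A minimal sketch of such a filter; the function name and term list are illustrative, not the actual persona-guardian pipeline code:

```python
# Hypothetical exclusion filter: reject samples mentioning banned topics.
# The term list below is illustrative, not the real (exhaustive) one.
BANNED_TERMS = {"football", "soccer", "coach", "team", "lasso"}

def passes_exclusions(sample: str) -> bool:
    """Return True only if the sample mentions none of the banned terms."""
    text = sample.lower()
    return not any(term in text for term in BANNED_TERMS)

print(passes_exclusions("glub glub, the castle is new!"))  # True
print(passes_exclusions("my coach says swim faster"))      # False
```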
Training
- Hardware: NVIDIA RTX 3060 12GB (local)
- Framework: PyTorch 2.x, BF16 mixed precision
- Optimizer: AdamW (β1=0.9, β2=0.95), weight decay 0.1
- LR schedule: cosine with 5% warmup, peak 3e-4
- Batch size: 64
- Epochs: 15
- Dropout: 0.1 (residual), 0.0 (attention)
- Gradient clipping: 1.0
- Final loss: 1.1442
- Wall time: ~15 minutes
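The LR schedule above (cosine with 5% warmup, peak 3e-4) can be written as one small pure function. This is a sketch of the standard formula, not the repo's exact implementation:

```python
import math

def lr_at(step: int, total_steps: int,
          peak: float = 3e-4, warmup_frac: float = 0.05) -> float:
    """Linear warmup to `peak`, then cosine decay to zero."""
    warmup = max(1, int(warmup_frac * total_steps))
    if step < warmup:
        return peak * step / warmup
    progress = (step - warmup) / max(1, total_steps - warmup)
    return 0.5 * peak * (1.0 + math.cos(math.pi * progress))

print(lr_at(0, 1000))     # 0.0
print(lr_at(50, 1000))    # 0.0003 (peak, reached at end of warmup)
print(lr_at(1000, 1000))  # 0.0 (fully decayed)
```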
Evaluation (v2 cross-model judge)
Dual-judge evaluation using Claude Sonnet 4.6 and Opus 4.7 on a 30-prompt rubric across 4 axes (integer 1-5 scale). Each axis aggregates 30 prompts × 3 seeds × 2 passes = 180 scoring rows per judge.
Per-axis score (mean)
| Axis | Sonnet 4.6 | Opus 4.7 |
|---|---|---|
| Conversational Quality | 4.01 | 4.15 |
| Goldfish Identity | 3.89 | 3.67 |
| Forgetful Trait | 3.80 | 3.81 |
| Length Appropriateness | 4.77 | 4.57 |
Cross-judge agreement (Cohen's quadratic-weighted kappa)
| Axis | Kappa | Interpretation |
|---|---|---|
| Conversational Quality | 0.77 | substantial |
| Goldfish Identity | 0.83 | almost perfect |
| Forgetful Trait | 0.86 | almost perfect |
| Length Appropriateness | 0.59 | moderate |
Interpretation: Sonnet and Opus reach substantial-to-almost-perfect agreement on 3/4 axes (and moderate agreement on Length Appropriateness), validating that the rubric is interpreted consistently across LLM judges. Opus is systematically ~0.2 points stricter than Sonnet on the Identity axis (stricter rubric application, not judge bias).
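For reference, quadratic-weighted kappa can be computed from paired integer ratings as follows. This is a self-contained sketch of the standard formula, not the eval harness code, and the example rating lists are made up:

```python
def quadratic_weighted_kappa(a, b, k=5):
    """Cohen's kappa with quadratic disagreement weights
    for ratings on an integer 1..k scale."""
    n = len(a)
    # Observed co-occurrence matrix (as probabilities)
    obs = [[0.0] * k for _ in range(k)]
    for x, y in zip(a, b):
        obs[x - 1][y - 1] += 1 / n
    # Marginal rating distributions for the chance-agreement term
    pa = [a.count(i + 1) / n for i in range(k)]
    pb = [b.count(i + 1) / n for i in range(k)]
    num = den = 0.0
    for i in range(k):
        for j in range(k):
            w = (i - j) ** 2 / (k - 1) ** 2  # quadratic disagreement weight
            num += w * obs[i][j]
            den += w * pa[i] * pb[j]
    return 1.0 - num / den

# Perfect agreement scores exactly 1.0
print(quadratic_weighted_kappa([1, 2, 3, 4, 5], [1, 2, 3, 4, 5]))  # 1.0

# Two hypothetical judges that mostly agree
judge_a = [4, 4, 5, 3, 4, 2, 5, 3]
judge_b = [4, 5, 5, 3, 4, 3, 5, 3]
print(quadratic_weighted_kappa(judge_a, judge_b))
```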
Full methodology + 108-row long-format scores: eval/report_crossmodel.md.
Limitations & biases
- Hard context limit: 96 tokens. Inputs longer than a few short sentences will be truncated.
- Goldfish worldview: the model genuinely does not understand human abstractions outside the bowl.
- Dataset bias: the dataset was generated by Claude (Anthropic), so it inherits Claude's language patterns filtered through the goldfish persona.
- Single-turn only: multi-turn memory is a non-goal.
- English only.
- Stochastic and occasionally incoherent: 36M params on 60K samples is small. Do not expect reliability.
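The 96-token cap means long inputs are effectively left-truncated to their most recent tokens. A minimal sketch of that behavior, assuming the runtime keeps the newest tokens (the helper name is illustrative):

```python
MAX_CONTEXT = 96  # the model's hard context cap

def clamp_context(token_ids: list[int]) -> list[int]:
    """Keep only the newest MAX_CONTEXT tokens: the '10-second memory'."""
    return token_ids[-MAX_CONTEXT:]

ids = list(range(200))           # a 200-token prompt
clamped = clamp_context(ids)
print(len(clamped), clamped[0])  # 96 104 (the first 104 tokens are forgotten)
```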
How to use
```python
from huggingface_hub import hf_hub_download
from safetensors.torch import load_model

from glublm.config import ModelConfig
from glublm.inference import generate
from glublm.model import GlubLM
from glublm.tokenizer import GlubTokenizer

# Download the tokenizer and weights from the Hub
tok_path = hf_hub_download("DenSec02/glublm-36m", "tokenizer.json")
weights_path = hf_hub_download("DenSec02/glublm-36m", "model.safetensors")

# Build the model and load the checkpoint
tok = GlubTokenizer.from_file(tok_path)
cfg = ModelConfig(vocab_size=tok.vocab_size)
model = GlubLM(cfg)
load_model(model, weights_path)

print(generate(model=model, tokenizer=tok, prompt="hello", max_new_tokens=24))
```
Or try it in-browser with zero setup:
- Chat demo (simple web UI)
- Desk pet companion (pixel-art PWA)
- Colab notebook (train your own goldfish)
License
AGPL-3.0 - see LICENSE.
Citation
```bibtex
@software{glublm_2026,
  author = {Sepede, Dennis},
  title  = {GlubLM: a 36M goldfish language model with a 10-second memory},
  year   = {2026},
  url    = {https://github.com/Den-Sec/glublm}
}
```