---
license: agpl-3.0
library_name: pytorch
tags:
- tiny-lm
- goldfish
- transformer
- rope
- swiglu
pipeline_tag: text-generation
base_model: []
---

# GlubLM (36M)

> *the language model that already forgot this sentence*

**GlubLM** is a 36-million-parameter transformer that plays the character of a goldfish with a 10-second memory. Inspired by [GuppyLM](https://github.com/arman-bd/guppylm) by Arman BD and Ted Lasso's meditation on the goldfish as "the happiest animal on earth", GlubLM has a hard 96-token context window: it *physically* cannot remember what was just said.

Try it live: [browser demo](https://den-sec.github.io/glublm/) | [pixel-art desk pet](https://den-sec.github.io/glublm/desk-pet/)

## Architecture

- **Parameters**: 36,055,680 (36.1M)
- **Layers**: 8 decoder-only transformer blocks
- **Hidden dim**: 640
- **Attention heads**: 10 (head dim 64)
- **FFN dim**: 1280 (SwiGLU, effective intermediate 2560)
- **Normalization**: RMSNorm
- **Position encoding**: Rotary (RoPE)
- **Vocabulary**: 5,120 Byte-Level BPE
- **Max context**: 96 tokens (hard cap, the "10-second memory")
- **Weight-tied LM head**
- **No bias terms**

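As a sanity check, the headline parameter count follows from the numbers above. This is a back-of-envelope sketch assuming a three-matrix SwiGLU FFN, tied embeddings counted once, and two RMSNorm gains per block plus a final one; the layer structure is inferred from the card, not read from the repo's code:

```python
# Hypothetical reconstruction of the 36,055,680 figure from the list above.
V, d, L, ffn = 5120, 640, 8, 1280

embed = V * d                       # token embeddings (tied with the LM head)
attn = 4 * d * d                    # Q, K, V, O projections, no bias
swiglu = 3 * d * ffn                # gate, up, and down matrices
norms = 2 * d                       # two RMSNorm gains per block
per_layer = attn + swiglu + norms

total = embed + L * per_layer + d   # + final RMSNorm gain
print(total)                        # -> 36055680
```

The count lands exactly on the stated figure, which suggests no biases and no separate output head, consistent with the list above.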
## Intended use

This model is a toy. It exists to:
1. Explore the design tension between "small + simple" (GuppyLM's thesis) and "small + modern" (GlubLM's hypothesis)
2. Demonstrate an LLM-generated dataset pipeline using a multi-agent Claude team
3. Be a fun browser demo and a pixel-art desk pet companion

**Do not use GlubLM for anything serious.** It literally forgets within a sentence.

## Training data

Trained on [`DenSec02/glublm-60k-ted`](https://huggingface.co/datasets/DenSec02/glublm-60k-ted), a 60,549-sample dataset of single-turn goldfish conversations generated by a team of four coordinated Claude agents (generator, critic, diversifier, persona-guardian). Composition: the v4 balanced mix (20K poetic + 15K supplement + 5K conversational + 15K forgetful), augmented with the v5.1 empathic/introspective hotfix (1K samples) and the v5.2 multi-anchor self-awareness recovery (500 samples).

**Explicit exclusions**: no references to football, soccer, coaches, teams, or any Ted Lasso show characters.

## Training

- **Hardware**: NVIDIA RTX 3060 12GB (local)
- **Framework**: PyTorch 2.x, BF16 mixed precision
- **Optimizer**: AdamW (β1=0.9, β2=0.95), weight decay 0.1
- **LR schedule**: cosine with 5% warmup, peak 3e-4
- **Batch size**: 64
- **Epochs**: 15
- **Dropout**: 0.1 (residual), 0.0 (attention)
- **Gradient clipping**: 1.0
- **Final loss**: 1.1442
- **Wall time**: ~15 minutes

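The LR schedule above can be sketched as a plain function. This is a minimal sketch of cosine decay with 5% linear warmup; the decay floor of 0 and the `total_steps` argument are assumptions, since the card does not state a minimum LR:

```python
import math

PEAK_LR = 3e-4  # peak learning rate from the training list above

def lr_at(step: int, total_steps: int, warmup_frac: float = 0.05) -> float:
    """Cosine LR schedule with linear warmup, peaking at PEAK_LR."""
    warmup = max(1, int(total_steps * warmup_frac))
    if step < warmup:
        # linear ramp from ~0 up to the peak over the first 5% of steps
        return PEAK_LR * (step + 1) / warmup
    # cosine decay from the peak down to 0 over the remaining steps
    progress = (step - warmup) / max(1, total_steps - warmup)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))
```

With AdamW this would typically be wired in via a `torch.optim.lr_scheduler.LambdaLR`, but the pure function keeps the shape easy to inspect.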
## Evaluation (v2 cross-model judge)

Dual-judge evaluation using Claude Sonnet 4.6 and Opus 4.7 on a 30-prompt rubric across 4 axes (integer 1-5 scale). Each axis aggregates 30 prompts × 3 seeds × 2 passes = 180 scoring rows per judge.

### Per-axis score (mean)

| Axis | Sonnet 4.6 | Opus 4.7 |
|---|---:|---:|
| Conversational Quality | 4.01 | 4.15 |
| Goldfish Identity | 3.89 | 3.67 |
| Forgetful Trait | 3.80 | 3.81 |
| Length Appropriateness | 4.77 | 4.57 |

### Cross-judge agreement (Cohen's quadratic-weighted kappa)

| Axis | Kappa | Interpretation |
|---|---:|---|
| Conversational Quality | 0.77 | substantial |
| Goldfish Identity | 0.83 | almost perfect |
| Forgetful Trait | 0.86 | almost perfect |
| Length Appropriateness | 0.59 | moderate |

**Interpretation**: Sonnet and Opus agree almost perfectly on 3/4 axes, validating that the rubric is interpreted consistently across LLM judges. Opus is systematically ~0.2 points stricter than Sonnet on the Identity axis (stricter rubric application, not judge bias).

Full methodology + 108-row long-format scores: [`eval/report_crossmodel.md`](https://github.com/Den-Sec/glublm/blob/master/eval/report_crossmodel.md).

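The agreement statistic in the table can be reproduced from paired rating lists in a few lines. This is a minimal sketch of Cohen's quadratic-weighted kappa for 1-5 ordinal ratings; the function name and inputs are illustrative, not the repo's eval code:

```python
from collections import Counter

def quadratic_weighted_kappa(a: list[int], b: list[int], n_cats: int = 5) -> float:
    """Cohen's kappa with quadratic weights for ordinal ratings in 1..n_cats."""
    assert len(a) == len(b) and len(a) > 0
    n = len(a)
    obs = Counter(zip(a, b))        # joint counts of (judge A, judge B) ratings
    ha, hb = Counter(a), Counter(b) # marginal counts per judge
    num = den = 0.0
    for i in range(1, n_cats + 1):
        for j in range(1, n_cats + 1):
            w = (i - j) ** 2 / (n_cats - 1) ** 2       # quadratic disagreement weight
            num += w * obs[(i, j)] / n                 # observed weighted disagreement
            den += w * (ha[i] / n) * (hb[j] / n)       # chance-expected disagreement
    return 1.0 - num / den
```

Perfect agreement gives 1.0, chance-level agreement gives ~0, so the 0.83-0.86 values in the table indicate near-identical rating distributions between the two judges.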
## Limitations & biases

- **Hard context limit**: 96 tokens. Inputs longer than a few short sentences will be truncated.
- **Goldfish worldview**: the model genuinely does not understand human abstractions outside the bowl.
- **Dataset bias**: the dataset was generated by Claude (Anthropic), so it inherits Claude's language patterns filtered through the goldfish persona.
- **Single-turn only**: multi-turn memory is a non-goal.
- **English only**.
- **Stochastic and occasionally incoherent**: 36M params on 60K samples is small. Do not expect reliability.

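The hard context cap means inference has to drop the oldest tokens before generating. A minimal sketch of left-truncation, assuming the newest tokens are kept and the repo may handle this differently:

```python
MAX_CONTEXT = 96  # hard cap from the architecture section

def clamp_context(token_ids: list[int], max_new_tokens: int) -> list[int]:
    """Keep only the newest tokens so prompt + generation fits the 96-token cap."""
    budget = MAX_CONTEXT - max_new_tokens   # room left for the prompt
    if len(token_ids) > budget:
        return token_ids[-budget:]          # goldfish memory: forget the oldest
    return token_ids
```

Anything beyond the last few short sentences is simply gone before the model ever sees it, which is the "10-second memory" behaving as designed.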
## How to use

```python
from glublm.config import ModelConfig
from glublm.model import GlubLM
from glublm.tokenizer import GlubTokenizer
from glublm.inference import generate
from huggingface_hub import hf_hub_download
from safetensors.torch import load_model

# Download the tokenizer and weights from the Hub
tok_path = hf_hub_download("DenSec02/glublm-36m", "tokenizer.json")
weights_path = hf_hub_download("DenSec02/glublm-36m", "model.safetensors")

# Build the model from the config and load the checkpoint
tok = GlubTokenizer.from_file(tok_path)
cfg = ModelConfig(vocab_size=tok.vocab_size)
model = GlubLM(cfg)
load_model(model, weights_path)

print(generate(model=model, tokenizer=tok, prompt="hello", max_new_tokens=24))
```

Or try it in-browser with zero setup:
- [Chat demo](https://den-sec.github.io/glublm/) (simple web UI)
- [Desk pet companion](https://den-sec.github.io/glublm/desk-pet/) (pixel-art PWA)
- [Colab notebook](https://colab.research.google.com/github/Den-Sec/glublm/blob/master/notebooks/train_colab.ipynb) (train your own goldfish)

## License

AGPL-3.0 - see [LICENSE](https://github.com/Den-Sec/glublm/blob/master/LICENSE).

## Citation

```bibtex
@software{glublm_2026,
  author = {Sepede, Dennis},
  title  = {GlubLM: a 36M goldfish language model with a 10-second memory},
  year   = {2026},
  url    = {https://github.com/Den-Sec/glublm}
}
```