🤗 Just released Rain-100M, an experimental ~97M-parameter Qwen3-style language model trained from random initialization.
Repo: raincandy-u/Rain-100M
Data: HuggingFaceFW/fineweb-edu, ~3B tokens, English only
Tokenizer: custom 16k BPE, context length 4096
Architecture: 12 Transformer layers, hidden size 768, 12 attention heads, MLP size 2048, SiLU activation, bf16 weights
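For reference, here is a rough sketch of how the stated hyperparameters could map onto transformers' Qwen3Config. The head dimension and KV-head count are not given above, so the values marked below are my assumptions, not confirmed settings:

```python
# Illustrative sketch only: the stated Rain-100M hyperparameters expressed as a
# transformers Qwen3Config. head_dim and num_key_value_heads are assumptions.
from transformers import Qwen3Config

config = Qwen3Config(
    vocab_size=16_000,             # custom 16k BPE tokenizer
    hidden_size=768,
    intermediate_size=2048,        # MLP size
    num_hidden_layers=12,
    num_attention_heads=12,
    num_key_value_heads=12,        # assumption: plain MHA, no GQA stated in the post
    head_dim=64,                   # assumption: 768 / 12
    hidden_act="silu",
    max_position_embeddings=4096,  # context length
)
print(config)
```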
Rain-100M is a raw base model (not instruction-tuned or safety-aligned), aimed at small-scale research, debugging training pipelines, and CPU/edge experiments. If you run evaluations, finetunes, or visualizations with it, I would be very interested in your results!
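Getting started should look roughly like this, assuming the repo ships standard transformers-compatible weights and tokenizer files (check the model card for the exact snippet):

```python
# Minimal sketch: load Rain-100M as a plain causal LM and sample a continuation.
# Since it is a raw base model, prompt it with plain text, not a chat template.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "raincandy-u/Rain-100M"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

inputs = tokenizer("The water cycle begins when", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=True, temperature=0.8)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```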