A scaled-down version of the Gemma architecture trained on the Tiny Shakespeare dataset.
## Model
- Architecture: Gemma (Transformer Decoder)
- Attention: Multi-Query Attention (MQA)
- Hidden Size: 768
- Number of Layers: 12
- Number of Query Heads: 2
- Number of KV Heads: 1
- Sequence Length: 128 (Block Size)
- Vocabulary Size: 65 (Character-level encoding)
- Total Training Steps: 3,500
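The configuration above (2 query heads sharing a single K/V head) can be sketched as follows. This is a minimal NumPy illustration of Multi-Query Attention with the listed shapes, not the repository's actual code:

```python
import numpy as np

# Hypothetical shapes matching the config above: 2 query heads, 1 shared KV head.
hidden, n_q_heads, seq = 768, 2, 128
head_dim = hidden // n_q_heads  # 384 per query head

rng = np.random.default_rng(0)
q = rng.standard_normal((n_q_heads, seq, head_dim))
k = rng.standard_normal((seq, head_dim))  # single K head shared by all queries
v = rng.standard_normal((seq, head_dim))  # single V head shared by all queries

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

# Every query head attends over the same K/V head (the defining trait of MQA).
scores = q @ k.T / np.sqrt(head_dim)                    # (2, 128, 128)
mask = np.triu(np.ones((seq, seq)), k=1).astype(bool)   # causal mask
scores = np.where(mask, -np.inf, scores)
out = softmax(scores) @ v                               # (2, 128, 384)
```

Sharing one K/V head shrinks the KV cache by the number of query heads, which is why MQA is popular for inference-heavy deployments.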
## Architecture
- RMSNorm
- GeGLU
- RoPE
- Embedding Scaling
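Two of the features above are easy to show concretely. Below is a hedged NumPy sketch of RMSNorm (normalize by the root-mean-square, with no mean subtraction, unlike LayerNorm) and Gemma-style embedding scaling (multiply token embeddings by the square root of the hidden size); it illustrates the math, not this repo's implementation:

```python
import numpy as np

hidden = 768

def rmsnorm(x, weight, eps=1e-6):
    # Divide by the RMS of each vector, then apply a learned per-channel scale.
    rms = np.sqrt(np.mean(x * x, axis=-1, keepdims=True) + eps)
    return x / rms * weight

x = np.random.default_rng(1).standard_normal((4, hidden))
y = rmsnorm(x, np.ones(hidden))

# Gemma scales token embeddings by sqrt(hidden_size) before the first layer.
scaled_emb = x * np.sqrt(hidden)
```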
## Usage
You can load this model directly with the Hugging Face `transformers` library:
```python
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("parkneurals/Gemma")

# Note: This model uses a custom character-level tokenizer.
# Use the provided char_map.json for encoding/decoding.
```
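Encoding and decoding with a character map might look like the sketch below. Note this assumes `char_map.json` maps each character to an integer id; the actual file format in this repo may differ, so treat this as illustrative only:

```python
import json

# Stand-in for json.load(open("char_map.json")); the real file's
# format is an assumption here.
char_map = json.loads('{"a": 0, "b": 1, " ": 2}')
id_to_char = {i: c for c, i in char_map.items()}

def encode(text):
    # Map each character to its integer id.
    return [char_map[c] for c in text]

def decode(ids):
    # Map each id back to its character and join.
    return "".join(id_to_char[i] for i in ids)

ids = encode("ab a")
text = decode(ids)  # round-trips back to "ab a"
```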
Inference is slow because the RoPE rotation is recomputed in every layer for each token. The model was built purely for learning purposes, so please bear with it.
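The usual fix for that slowdown is to precompute the RoPE cos/sin tables once and reuse them across layers and tokens. A hedged NumPy sketch (hypothetical helpers, not this repo's code):

```python
import numpy as np

def rope_cache(seq_len, head_dim, base=10000.0):
    # Precompute cos/sin tables once; every layer can then reuse them.
    inv_freq = 1.0 / base ** (np.arange(0, head_dim, 2) / head_dim)
    angles = np.outer(np.arange(seq_len), inv_freq)  # (seq_len, head_dim // 2)
    return np.cos(angles), np.sin(angles)

def apply_rope(x, cos, sin):
    # x: (seq_len, head_dim); rotate each even/odd channel pair by its angle.
    x1, x2 = x[:, 0::2], x[:, 1::2]
    out = np.empty_like(x)
    out[:, 0::2] = x1 * cos - x2 * sin
    out[:, 1::2] = x1 * sin + x2 * cos
    return out

cos, sin = rope_cache(128, 384)   # matches block size 128, head dim 384 above
x = np.ones((128, 384))
rotated = apply_rope(x, cos, sin)
```

Because the rotation only depends on position and channel index, the two tables can be built once at model load time instead of per layer per token.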