🧠 Gemma-4-E4B-Opus-4.6-Reasoning-Distill

A reasoning-enhanced Gemma-4 model distilled from Claude Opus 4.5/4.6 chain-of-thought data


📖 Model Description

This model is a fine-tuned version of Google's Gemma-4-E4B (4 billion effective parameters), optimized for step-by-step reasoning and problem-solving tasks. It was trained on high-quality chain-of-thought reasoning data distilled from Claude Opus 4.5 and 4.6, Anthropic's most capable reasoning models.

Key Features

  • 🎯 Enhanced Reasoning: Trained on curated Opus reasoning traces for improved logical thinking
  • 💭 Thinking Blocks: Uses <think>...</think> format for transparent reasoning
  • 📊 Mathematical Excellence: Strong performance on math and quantitative problems
  • 🔧 Efficient: 4B effective parameters - runs on consumer GPUs (8GB+ VRAM)
  • ⚡ Fast: Optimized with Unsloth for up to 2× faster inference

🏋️ Training Details

Base Model

  • Model: unsloth/gemma-4-E4B-it (Google Gemma-4 4B Instruct)
  • Architecture: Transformer decoder-only
  • Parameters: 4 billion effective (42.4M trainable with LoRA)

Training Configuration

| Parameter | Value |
|---|---|
| Method | LoRA (Low-Rank Adaptation) |
| LoRA Rank (r) | 16 |
| LoRA Alpha | 32 |
| Target Modules | `q_proj`, `k_proj`, `v_proj`, `o_proj`, `gate_proj`, `up_proj`, `down_proj` |
| Trainable Parameters | 42,401,792 (0.53% of total) |
| Precision | 4-bit quantization (QLoRA) |
| Epochs | 1 |
| Batch Size | 4 (effective) |
| Learning Rate | 2e-4 |
| LR Scheduler | Cosine |
| Warmup Steps | 100 |
| Max Sequence Length | 4096 tokens |
| Optimizer | AdamW 8-bit |
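The adapter settings above can be reproduced with the `peft` library. This is a hedged sketch: the actual training script is not published, so `lora_dropout`, `bias`, and the trainer wiring are assumptions, not documented values.

```python
from peft import LoraConfig

# Mirrors the table above: rank 16, alpha 32, all attention + MLP projections.
# lora_dropout and bias are assumptions; the card does not state them.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    lora_dropout=0.0,
    bias="none",
    task_type="CAUSAL_LM",
)
```

A config like this would then be passed to a PEFT-aware trainer (e.g. TRL's `SFTTrainer`) alongside the 4-bit-loaded base model.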

Training Results

| Metric | Value |
|---|---|
| Final Loss | 1.3031 |
| Total Steps | 1,184 |
| Training Time | 1.49 hours |
| Hardware | NVIDIA RTX 3090 (24GB) |
| VRAM Usage | ~15GB |

📚 Training Data

This model was trained on 4,736 high-quality reasoning examples from three carefully curated datasets:

| Dataset | Samples | Description |
|---|---|---|
| nohurry/Opus-4.6-Reasoning-3000x-filtered | 2,326 | Filtered Opus 4.6 reasoning traces |
| Crownelius/Opus-4.6-Reasoning-3300x | 2,160 | Opus 4.6 problem-solving examples |
| TeichAI/claude-4.5-opus-high-reasoning-250x | 250 | High-quality Opus 4.5 samples |
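As a quick consistency check, the per-dataset sample counts above sum to the stated 4,736-example total:

```python
# Sample counts taken from the dataset table above
sources = {
    "nohurry/Opus-4.6-Reasoning-3000x-filtered": 2326,
    "Crownelius/Opus-4.6-Reasoning-3300x": 2160,
    "TeichAI/claude-4.5-opus-high-reasoning-250x": 250,
}

total = sum(sources.values())
print(total)  # 4736
```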

Data Categories

The training data covers diverse reasoning domains:

  • 📐 Mathematics: Algebra, calculus, number theory, geometry
  • 💻 Programming: Algorithm design, debugging, code optimization
  • 🧪 Science: Physics, chemistry, biology problems
  • 🧩 Logic: Puzzles, deductive reasoning, constraint satisfaction
  • 📝 Word Problems: Real-world scenario problem-solving

💬 Response Format

The model uses a structured thinking format for transparent reasoning:

```
<think>
[Internal reasoning process]
- Step 1: Understand the problem
- Step 2: Break down into sub-problems
- Step 3: Apply relevant knowledge
- Step 4: Verify the solution
</think>

[Final structured answer with clear formatting]
```
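Because the reasoning is wrapped in literal `<think>` tags, downstream code can separate it from the final answer. A minimal sketch (the helper name is ours, not part of the model's API):

```python
import re

def split_reasoning(text: str) -> tuple[str, str]:
    """Split model output into (reasoning, final answer)."""
    match = re.search(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    if match is None:
        return "", text.strip()  # no thinking block was emitted
    reasoning = match.group(1).strip()
    answer = text[match.end():].strip()  # everything after </think>
    return reasoning, answer

sample = "<think>\n5*2 + 3*3 = 19, so change is 50 - 19 = 31\n</think>\n\nThey should receive **$31** in change."
thought, answer = split_reasoning(sample)
print(answer)  # They should receive **$31** in change.
```

Note the fallback: if sampling skips the thinking block, the whole output is treated as the answer.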

🚀 Usage

With Transformers

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Ayodele01/gemma-4-E4B-Opus-4.6-Reasoning-Distill"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",
    device_map="auto",
)

# Example: math problem
prompt = """Solve this step by step:
A store sells apples for $2 each and oranges for $3 each.
If someone buys 5 apples and 3 oranges, and pays with a $50 bill,
how much change should they receive?"""

messages = [{"role": "user", "content": prompt}]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,  # needed so the model answers rather than continues the user turn
    return_tensors="pt",
).to(model.device)

outputs = model.generate(
    inputs,
    max_new_tokens=1024,
    temperature=0.7,
    do_sample=True,
)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

With Unsloth (Recommended for Speed)

```python
from unsloth import FastModel

model, tokenizer = FastModel.from_pretrained(
    "Ayodele01/gemma-4-E4B-Opus-4.6-Reasoning-Distill",
    max_seq_length=4096,
    load_in_4bit=True,
)

# Use for inference...
```

📊 Example Outputs

Math Problem

Input:

Calculate the derivative of f(x) = x³ + 2x² - 5x + 3

Output:

<think>
To find the derivative, I'll apply the power rule to each term:
- For x³: d/dx(x³) = 3x²
- For 2x²: d/dx(2x²) = 4x
- For -5x: d/dx(-5x) = -5
- For 3: d/dx(3) = 0 (constant)
</think>

The derivative of f(x) = x³ + 2x² - 5x + 3 is:

**f'(x) = 3x² + 4x - 5**
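The answer in this example can be spot-checked numerically with a central difference (a quick sketch, independent of the model):

```python
def f(x):
    return x**3 + 2 * x**2 - 5 * x + 3

def f_prime(x):
    return 3 * x**2 + 4 * x - 5  # the model's answer

# Central differences approximate f'(x) with O(h^2) error
h = 1e-6
for x in (-2.0, 0.0, 1.5):
    numeric = (f(x + h) - f(x - h)) / (2 * h)
    assert abs(numeric - f_prime(x)) < 1e-4
print("f'(x) = 3x^2 + 4x - 5 matches the numeric derivative")
```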

Logic Problem

Input:

If all roses are flowers, and some flowers fade quickly, can we conclude that some roses fade quickly?

Output:

<think>
Let me analyze this syllogism:
- Premise 1: All roses are flowers (Roses ⊆ Flowers)
- Premise 2: Some flowers fade quickly (∃x: Flower(x) ∧ FadesQuickly(x))
- Question: Do some roses fade quickly?

The "some flowers" that fade quickly might or might not include roses.
We only know roses are a subset of flowers, not which specific flowers fade.
</think>

**No, we cannot conclude that some roses fade quickly.**

This is a logical fallacy. While all roses are flowers, the flowers that fade quickly might be entirely non-rose flowers (like tulips or daisies). The premises don't provide enough information to make this conclusion.
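The invalidity argued above can be confirmed by exhibiting a counterexample world: a brute-force sketch over a tiny universe (our own illustration, not model output).

```python
from itertools import product

# Universe of two objects; each gets flags (is_rose, is_flower, fades_quickly)
objects = ["a", "b"]

def entails():
    """Return False if some world satisfies both premises but not the conclusion."""
    for world in product([False, True], repeat=3 * len(objects)):
        props = {o: world[3 * i:3 * i + 3] for i, o in enumerate(objects)}
        all_roses_flowers = all(not r or f for r, f, _ in props.values())
        some_flower_fades = any(f and q for _, f, q in props.values())
        some_rose_fades = any(r and q for r, _, q in props.values())
        if all_roses_flowers and some_flower_fades and not some_rose_fades:
            return False  # counterexample: premises hold, conclusion fails
    return True

print(entails())  # False: the inference is invalid
```

A fading non-rose flower (one tulip, no roses at all) is exactly the counterexample the search finds.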

⚠️ Limitations

  • English Only: Primarily trained on English data
  • Knowledge Cutoff: Inherits base model's training cutoff
  • Specialized Domain: Best suited to reasoning tasks; general-chat quality may vary
  • Hallucination Risk: May generate plausible-sounding but incorrect reasoning
  • Context Length: Optimal performance up to 4096 tokens

🔒 Ethical Considerations

  • This model inherits biases from both the base Gemma-4 model and the Opus training data
  • Not intended for medical, legal, or safety-critical applications
  • Users should verify important outputs independently
  • Follow Gemma's usage guidelines and license terms

📄 License

This model is released under the Gemma License, which requires:

  • Compliance with Google's Gemma Terms of Use
  • Appropriate use in accordance with responsible AI practices

🙏 Acknowledgments

  • Google for the Gemma-4 base model
  • Anthropic for Claude Opus (source of distilled reasoning data)
  • Unsloth for efficient training optimizations
  • Dataset creators: nohurry, Crownelius, TeichAI

📬 Citation

```bibtex
@misc{gemma4-opus-reasoning-distill,
  author = {Ayodele01},
  title = {Gemma-4-E4B-Opus-4.6-Reasoning-Distill},
  year = {2026},
  publisher = {HuggingFace},
  url = {https://huggingface.co/Ayodele01/gemma-4-E4B-Opus-4.6-Reasoning-Distill}
}
```

Made with ❤️ using Unsloth

Fine-tuned on 2026-04-03
