# 🧠 Gemma-4-E4B-Opus-4.6-Reasoning-Distill

## 📖 Model Description
This model is a fine-tuned version of Google's Gemma-4-E4B (4 billion parameters) specifically optimized for step-by-step reasoning and problem-solving tasks. It has been trained on high-quality chain-of-thought reasoning data distilled from Claude Opus 4.5 and 4.6, Anthropic's most capable reasoning models.
### Key Features

- 🎯 Enhanced Reasoning: Trained on curated Opus reasoning traces for improved logical thinking
- 💭 Thinking Blocks: Uses the `<think>...</think>` format for transparent reasoning
- 📊 Mathematical Excellence: Strong performance on math and quantitative problems
- 🔧 Efficient: 4B parameters - runs on consumer GPUs (8GB+ VRAM)
- ⚡ Fast: Optimized with Unsloth for 2x faster inference
## 🏋️ Training Details

### Base Model

- Model: `unsloth/gemma-4-E4B-it` (Google Gemma-4 4B Instruct)
- Architecture: Transformer decoder-only
- Parameters: 4 billion (42.4M trainable with LoRA)

### Training Configuration
| Parameter | Value |
|---|---|
| Method | LoRA (Low-Rank Adaptation) |
| LoRA Rank (r) | 16 |
| LoRA Alpha | 32 |
| Target Modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| Trainable Parameters | 42,401,792 (0.53% of total) |
| Precision | 4-bit quantization (QLoRA) |
| Epochs | 1 |
| Batch Size | 4 (effective) |
| Learning Rate | 2e-4 |
| LR Scheduler | Cosine |
| Warmup Steps | 100 |
| Max Sequence Length | 4096 tokens |
| Optimizer | AdamW 8-bit |
### Training Results
| Metric | Value |
|---|---|
| Final Loss | 1.3031 |
| Total Steps | 1,184 |
| Training Time | 1.49 hours |
| Hardware | NVIDIA RTX 3090 (24GB) |
| VRAM Usage | ~15GB |
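The reported step count is consistent with the dataset size and configuration above; a quick sanity check, assuming one optimizer step per effective batch:

```python
# Sanity check: total steps = (samples / effective batch size) * epochs.
num_samples = 4736       # training examples (see Training Data below)
effective_batch = 4      # from the configuration table
epochs = 1

total_steps = (num_samples // effective_batch) * epochs
print(total_steps)  # 1184, matching the reported total steps
```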
## 📚 Training Data

This model was trained on 4,736 high-quality reasoning examples from three carefully curated datasets:
| Dataset | Samples | Description |
|---|---|---|
| nohurry/Opus-4.6-Reasoning-3000x-filtered | 2,326 | Filtered Opus 4.6 reasoning traces |
| Crownelius/Opus-4.6-Reasoning-3300x | 2,160 | Opus 4.6 problem-solving examples |
| TeichAI/claude-4.5-opus-high-reasoning-250x | 250 | High-quality Opus 4.5 samples |
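The per-dataset sample counts in the table sum to the stated total:

```python
# Per-dataset sample counts from the table above.
samples = {
    "nohurry/Opus-4.6-Reasoning-3000x-filtered": 2326,
    "Crownelius/Opus-4.6-Reasoning-3300x": 2160,
    "TeichAI/claude-4.5-opus-high-reasoning-250x": 250,
}
total = sum(samples.values())
print(total)  # 4736
```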
### Data Categories
The training data covers diverse reasoning domains:
- 📐 Mathematics: Algebra, calculus, number theory, geometry
- 💻 Programming: Algorithm design, debugging, code optimization
- 🧪 Science: Physics, chemistry, biology problems
- 🧩 Logic: Puzzles, deductive reasoning, constraint satisfaction
- 📝 Word Problems: Real-world scenario problem-solving
## 💬 Response Format
The model uses a structured thinking format for transparent reasoning:
```
<think>
[Internal reasoning process]
- Step 1: Understand the problem
- Step 2: Break down into sub-problems
- Step 3: Apply relevant knowledge
- Step 4: Verify the solution
</think>
[Final structured answer with clear formatting]
```
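For downstream use, the reasoning block can be separated from the final answer with a small helper; a sketch (the `split_reasoning` name and regex are illustrative, not part of the model's API):

```python
import re

def split_reasoning(text: str) -> tuple[str, str]:
    """Separate the <think>...</think> block from the final answer."""
    match = re.search(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    if not match:
        return "", text.strip()  # model emitted no thinking block
    reasoning = match.group(1).strip()
    answer = text[match.end():].strip()
    return reasoning, answer

thinking, answer = split_reasoning("<think>apply the power rule</think>\nf'(x) = 3x² + 4x - 5")
print(answer)  # f'(x) = 3x² + 4x - 5
```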
## 🚀 Usage

### With Transformers
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Ayodele01/gemma-4-E4B-Opus-4.6-Reasoning-Distill"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",
    device_map="auto",
)

# Example: Math problem
prompt = """Solve this step by step:
A store sells apples for $2 each and oranges for $3 each.
If someone buys 5 apples and 3 oranges, and pays with a $50 bill,
how much change should they receive?"""

messages = [{"role": "user", "content": prompt}]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,  # append the assistant turn marker
    return_tensors="pt",
).to(model.device)

outputs = model.generate(
    inputs,
    max_new_tokens=1024,
    temperature=0.7,
    do_sample=True,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
### With Unsloth (Recommended for Speed)
```python
from unsloth import FastModel

model, tokenizer = FastModel.from_pretrained(
    "Ayodele01/gemma-4-E4B-Opus-4.6-Reasoning-Distill",
    max_seq_length=4096,
    load_in_4bit=True,
)
# Use for inference...
```
## 📊 Example Outputs

### Math Problem

Input:

```
Calculate the derivative of f(x) = x³ + 2x² - 5x + 3
```
Output:

```
<think>
To find the derivative, I'll apply the power rule to each term:
- For x³: d/dx(x³) = 3x²
- For 2x²: d/dx(2x²) = 4x
- For -5x: d/dx(-5x) = -5
- For 3: d/dx(3) = 0 (constant)
</think>
```
The derivative of f(x) = x³ + 2x² - 5x + 3 is:
**f'(x) = 3x² + 4x - 5**
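Answers like this can be checked mechanically; a minimal power-rule sketch over polynomial coefficients (the `poly_derivative` helper is illustrative, highest-degree coefficient first):

```python
def poly_derivative(coeffs):
    """Differentiate a polynomial given its coefficients, highest degree first."""
    n = len(coeffs) - 1  # polynomial degree
    # Power rule: c * x^k  ->  c * k * x^(k-1); the constant term drops out.
    return [c * (n - i) for i, c in enumerate(coeffs[:-1])]

# f(x) = x³ + 2x² - 5x + 3  ->  f'(x) = 3x² + 4x - 5
print(poly_derivative([1, 2, -5, 3]))  # [3, 4, -5]
```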
### Logic Problem

Input:

```
If all roses are flowers, and some flowers fade quickly, can we conclude that some roses fade quickly?
```
Output:

```
<think>
Let me analyze this syllogism:
- Premise 1: All roses are flowers (Roses ⊆ Flowers)
- Premise 2: Some flowers fade quickly (∃x: Flower(x) ∧ FadesQuickly(x))
- Question: Do some roses fade quickly?
The "some flowers" that fade quickly might or might not include roses.
We only know roses are a subset of flowers, not which specific flowers fade.
</think>
```
**No, we cannot conclude that some roses fade quickly.**
This is a logical fallacy. While all roses are flowers, the flowers that fade quickly might be entirely non-rose flowers (like tulips or daisies). The premises don't provide enough information to make this conclusion.
## ⚠️ Limitations
- English Only: Primarily trained on English data
- Knowledge Cutoff: Inherits base model's training cutoff
- Specialized Domain: Best suited for reasoning tasks; quality on general chat may vary
- Hallucination Risk: May generate plausible-sounding but incorrect reasoning
- Context Length: Optimal performance up to 4096 tokens
## 🔒 Ethical Considerations
- This model inherits biases from both the base Gemma-4 model and the Opus training data
- Not intended for medical, legal, or safety-critical applications
- Users should verify important outputs independently
- Follow Gemma's usage guidelines and license terms
## 📄 License
This model is released under the Gemma License, which requires:
- Compliance with Google's Gemma Terms of Use
- Appropriate use in accordance with responsible AI practices
## 🙏 Acknowledgments
- Google for the Gemma-4 base model
- Anthropic for Claude Opus (source of distilled reasoning data)
- Unsloth for efficient training optimizations
- Dataset creators: nohurry, Crownelius, TeichAI
## 📬 Citation

```bibtex
@misc{gemma4-opus-reasoning-distill,
  author = {Ayodele01},
  title = {Gemma-4-E4B-Opus-4.6-Reasoning-Distill},
  year = {2026},
  publisher = {HuggingFace},
  url = {https://huggingface.co/Ayodele01/gemma-4-E4B-Opus-4.6-Reasoning-Distill}
}
```
Made with ❤️ using Unsloth
Fine-tuned on 2026-04-03