🧠 Gemma-4-E4B-Opus-4.6-Reasoning-Distill

A reasoning-enhanced Gemma-4 model distilled from Claude Opus 4.5/4.6 chain-of-thought data


📖 Model Description

This model is a fine-tuned version of Google's Gemma-4-E4B (4 billion effective parameters), optimized for step-by-step reasoning and problem-solving tasks. It was trained on high-quality chain-of-thought reasoning data distilled from Claude Opus 4.5 and 4.6, Anthropic's most capable reasoning models.

Key Features

  • 🎯 Enhanced Reasoning: Trained on curated Opus reasoning traces for improved logical thinking
  • 💭 Thinking Blocks: Uses <think>...</think> format for transparent reasoning
  • 📊 Mathematical Excellence: Strong performance on math and quantitative problems
  • 🔧 Efficient: 4B effective parameters - runs on consumer GPUs (8GB+ VRAM)
  • ⚡ Fast: Optimized with Unsloth for up to 2× faster inference

🏋️ Training Details

Base Model

  • Model: unsloth/gemma-4-E4B-it (Google Gemma-4 4B Instruct)
  • Architecture: Transformer decoder-only
  • Parameters: 4 billion effective (42.4M trainable with LoRA)

Training Configuration

| Parameter | Value |
|---|---|
| Method | LoRA (Low-Rank Adaptation) |
| LoRA Rank (r) | 16 |
| LoRA Alpha | 32 |
| Target Modules | `q_proj`, `k_proj`, `v_proj`, `o_proj`, `gate_proj`, `up_proj`, `down_proj` |
| Trainable Parameters | 42,401,792 (0.53% of total) |
| Precision | 4-bit quantization (QLoRA) |
| Epochs | 1 |
| Batch Size | 4 (effective) |
| Learning Rate | 2e-4 |
| LR Scheduler | Cosine |
| Warmup Steps | 100 |
| Max Sequence Length | 4096 tokens |
| Optimizer | AdamW 8-bit |
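The adapter settings above can be reproduced with the `peft` library. This is a hedged sketch: the actual training script is not published, so `lora_dropout`, `bias`, and the trainer wiring are assumptions, not documented values.

```python
from peft import LoraConfig

# Mirrors the table above: rank 16, alpha 32, all attention + MLP projections.
# lora_dropout and bias are assumptions; the card does not state them.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    lora_dropout=0.0,
    bias="none",
    task_type="CAUSAL_LM",
)
```

A config like this would then be passed to a PEFT-aware trainer (e.g. TRL's `SFTTrainer`) alongside the 4-bit-loaded base model.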

Training Results

| Metric | Value |
|---|---|
| Final Loss | 1.3031 |
| Total Steps | 1,184 |
| Training Time | 1.49 hours |
| Hardware | NVIDIA RTX 3090 (24GB) |
| VRAM Usage | ~15GB |

📚 Training Data

This model was trained on 4,736 high-quality reasoning examples from three carefully curated datasets:

| Dataset | Samples | Description |
|---|---|---|
| nohurry/Opus-4.6-Reasoning-3000x-filtered | 2,326 | Filtered Opus 4.6 reasoning traces |
| Crownelius/Opus-4.6-Reasoning-3300x | 2,160 | Opus 4.6 problem-solving examples |
| TeichAI/claude-4.5-opus-high-reasoning-250x | 250 | High-quality Opus 4.5 samples |
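As a quick consistency check, the per-dataset sample counts above sum to the stated 4,736-example total:

```python
# Sample counts taken from the dataset table above
sources = {
    "nohurry/Opus-4.6-Reasoning-3000x-filtered": 2326,
    "Crownelius/Opus-4.6-Reasoning-3300x": 2160,
    "TeichAI/claude-4.5-opus-high-reasoning-250x": 250,
}

total = sum(sources.values())
print(total)  # 4736
```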

Data Categories

The training data covers diverse reasoning domains:

  • 📐 Mathematics: Algebra, calculus, number theory, geometry
  • 💻 Programming: Algorithm design, debugging, code optimization
  • 🧪 Science: Physics, chemistry, biology problems
  • 🧩 Logic: Puzzles, deductive reasoning, constraint satisfaction
  • 📝 Word Problems: Real-world scenario problem-solving

💬 Response Format

The model uses a structured thinking format for transparent reasoning:

```
<think>
[Internal reasoning process]
- Step 1: Understand the problem
- Step 2: Break down into sub-problems
- Step 3: Apply relevant knowledge
- Step 4: Verify the solution
</think>

[Final structured answer with clear formatting]
```
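Because the reasoning is wrapped in literal `<think>` tags, downstream code can separate it from the final answer. A minimal sketch (the helper name is ours, not part of the model's API):

```python
import re

def split_reasoning(text: str) -> tuple[str, str]:
    """Split model output into (reasoning, final answer)."""
    match = re.search(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    if match is None:
        return "", text.strip()  # no thinking block was emitted
    reasoning = match.group(1).strip()
    answer = text[match.end():].strip()  # everything after </think>
    return reasoning, answer

sample = "<think>\n5*2 + 3*3 = 19, so change is 50 - 19 = 31\n</think>\n\nThey should receive **$31** in change."
thought, answer = split_reasoning(sample)
print(answer)  # They should receive **$31** in change.
```

Note the fallback: if sampling skips the thinking block, the whole output is treated as the answer.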

🚀 Usage

With Transformers

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Ayodele01/gemma-4-E4B-Opus-4.6-Reasoning-Distill"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",
    device_map="auto",
)

# Example: math problem
prompt = """Solve this step by step:
A store sells apples for $2 each and oranges for $3 each.
If someone buys 5 apples and 3 oranges, and pays with a $50 bill,
how much change should they receive?"""

messages = [{"role": "user", "content": prompt}]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,  # needed so the model answers rather than continues the user turn
    return_tensors="pt",
).to(model.device)

outputs = model.generate(
    inputs,
    max_new_tokens=1024,
    temperature=0.7,
    do_sample=True,
)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

With Unsloth (Recommended for Speed)

```python
from unsloth import FastModel

model, tokenizer = FastModel.from_pretrained(
    "Ayodele01/gemma-4-E4B-Opus-4.6-Reasoning-Distill",
    max_seq_length=4096,
    load_in_4bit=True,
)

# Use for inference...
```

📊 Example Outputs

Math Problem

Input:

Calculate the derivative of f(x) = x³ + 2x² - 5x + 3

Output:

<think>
To find the derivative, I'll apply the power rule to each term:
- For x³: d/dx(x³) = 3x²
- For 2x²: d/dx(2x²) = 4x
- For -5x: d/dx(-5x) = -5
- For 3: d/dx(3) = 0 (constant)
</think>

The derivative of f(x) = x³ + 2x² - 5x + 3 is:

**f'(x) = 3x² + 4x - 5**
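The answer in this example can be spot-checked numerically with a central difference (a quick sketch, independent of the model):

```python
def f(x):
    return x**3 + 2 * x**2 - 5 * x + 3

def f_prime(x):
    return 3 * x**2 + 4 * x - 5  # the model's answer

# Central differences approximate f'(x) with O(h^2) error
h = 1e-6
for x in (-2.0, 0.0, 1.5):
    numeric = (f(x + h) - f(x - h)) / (2 * h)
    assert abs(numeric - f_prime(x)) < 1e-4
print("f'(x) = 3x^2 + 4x - 5 matches the numeric derivative")
```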

Logic Problem

Input:

If all roses are flowers, and some flowers fade quickly, can we conclude that some roses fade quickly?

Output:

<think>
Let me analyze this syllogism:
- Premise 1: All roses are flowers (Roses ⊆ Flowers)
- Premise 2: Some flowers fade quickly (∃x: Flower(x) ∧ FadesQuickly(x))
- Question: Do some roses fade quickly?

The "some flowers" that fade quickly might or might not include roses.
We only know roses are a subset of flowers, not which specific flowers fade.
</think>

**No, we cannot conclude that some roses fade quickly.**

This is a logical fallacy. While all roses are flowers, the flowers that fade quickly might be entirely non-rose flowers (like tulips or daisies). The premises don't provide enough information to make this conclusion.
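The invalidity argued above can be confirmed by exhibiting a counterexample world: a brute-force sketch over a tiny universe (our own illustration, not model output).

```python
from itertools import product

# Universe of two objects; each gets flags (is_rose, is_flower, fades_quickly)
objects = ["a", "b"]

def entails():
    """Return False if some world satisfies both premises but not the conclusion."""
    for world in product([False, True], repeat=3 * len(objects)):
        props = {o: world[3 * i:3 * i + 3] for i, o in enumerate(objects)}
        all_roses_flowers = all(not r or f for r, f, _ in props.values())
        some_flower_fades = any(f and q for _, f, q in props.values())
        some_rose_fades = any(r and q for r, _, q in props.values())
        if all_roses_flowers and some_flower_fades and not some_rose_fades:
            return False  # counterexample: premises hold, conclusion fails
    return True

print(entails())  # False: the inference is invalid
```

A fading non-rose flower (one tulip, no roses at all) is exactly the counterexample the search finds.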

⚠️ Limitations

  • English Only: Primarily trained on English data
  • Knowledge Cutoff: Inherits base model's training cutoff
  • Specialized Domain: Best suited to reasoning tasks; general-chat quality may vary
  • Hallucination Risk: May generate plausible-sounding but incorrect reasoning
  • Context Length: Optimal performance up to 4096 tokens

🔒 Ethical Considerations

  • This model inherits biases from both the base Gemma-4 model and the Opus training data
  • Not intended for medical, legal, or safety-critical applications
  • Users should verify important outputs independently
  • Follow Gemma's usage guidelines and license terms

📄 License

This model is released under the Gemma License, which requires:

  • Compliance with Google's Gemma Terms of Use
  • Appropriate use in accordance with responsible AI practices

🙏 Acknowledgments

  • Google for the Gemma-4 base model
  • Anthropic for Claude Opus (source of distilled reasoning data)
  • Unsloth for efficient training optimizations
  • Dataset creators: nohurry, Crownelius, TeichAI

📬 Citation

```bibtex
@misc{gemma4-opus-reasoning-distill,
  author = {Ayodele01},
  title = {Gemma-4-E4B-Opus-4.6-Reasoning-Distill},
  year = {2026},
  publisher = {HuggingFace},
  url = {https://huggingface.co/Ayodele01/gemma-4-E4B-Opus-4.6-Reasoning-Distill}
}
```

Made with ❤️ using Unsloth

Fine-tuned on 2026-04-03
