# Gemma-4-E2B Hard Reasoning

A fine-tuned version of Google's Gemma-4-E2B-it specialized for complex reasoning tasks with chain-of-thought (CoT) capabilities.
## Model Description
This model was fine-tuned on 13,181 high-quality reasoning examples spanning mathematics, science, logic, and analytical thinking. The training data includes detailed thinking processes wrapped in <think> tags, teaching the model to show its reasoning step-by-step.
## Key Features

- 🧠 **Chain-of-thought reasoning**: the model outputs its thinking process in `<think>` tags
- 📚 **Multi-domain expertise**: math, physics, computer science, logic, philosophy
- 🎯 **Hard-problem focus**: trained on "hardcore"-difficulty examples
- ⚡ **Efficient LoRA fine-tuning**: only 32M trainable parameters
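Because the reasoning is delimited by `<think>` tags, downstream code can separate the thought process from the final answer. A minimal sketch of such a parser (the helper name and sample string are illustrative, not part of the model's API):

```python
import re

def split_thinking(text: str) -> tuple[str, str]:
    """Split generated text into (thinking, answer).

    Assumes at most one <think>...</think> block; if none is
    present, the whole text is treated as the answer.
    """
    match = re.search(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    if match is None:
        return "", text.strip()
    thinking = match.group(1).strip()
    answer = (text[:match.start()] + text[match.end():]).strip()
    return thinking, answer

sample = "<think>2 + 2 = 4, no carrying needed.</think>The answer is 4."
thinking, answer = split_thinking(sample)
print(thinking)  # 2 + 2 = 4, no carrying needed.
print(answer)    # The answer is 4.
```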
## Training Details
| Parameter | Value |
|---|---|
| Base Model | google/gemma-4-E2B-it |
| Model Parameters | E2B |
| Training Method | LoRA (Low-Rank Adaptation) |
| LoRA Rank | 16 |
| LoRA Alpha | 32 |
| Epochs | 1 |
| Batch Size | None × 4 (effective) |
| Learning Rate | 0.0002 |
| Max Sequence Length | 2048 |
| Precision | 4-bit quantization |
| Framework | Unsloth + TRL |
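The trainable-parameter count follows directly from the LoRA configuration: each adapted weight matrix of shape (d_out, d_in) gains two low-rank factors of shapes (d_out, r) and (r, d_in), contributing r · (d_in + d_out) parameters. A back-of-the-envelope check (the matrix shapes below are hypothetical, chosen only to illustrate the arithmetic, not taken from the actual model):

```python
def lora_param_count(shapes: list[tuple[int, int]], rank: int) -> int:
    """Trainable parameters added by LoRA: rank * (d_out + d_in) per matrix."""
    return sum(rank * (d_out + d_in) for d_out, d_in in shapes)

# Hypothetical: one block's q/k/v/o projections, all 2048 x 2048
shapes = [(2048, 2048)] * 4
per_layer = lora_param_count(shapes, rank=16)
print(per_layer)  # 262144 trainable parameters per such layer at rank 16
```

Summed over all adapted layers, counts on this order of magnitude add up to the ~32M trainable parameters reported above.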
### Training Results
| Metric | Value |
|---|---|
| Final Loss | 1.8077 |
| Training Time | 3.55 hours |
| Steps | 3295 |
## Datasets
This model was trained on two high-quality reasoning datasets:
- gemini-3.1-pro-hard-high-reasoning (3,150 samples)
- gemini-3-pro-10000x-hard-high-reasoning (10,031 samples)
Total: 13,181 reasoning examples
### Data Format
Each example contains:
- Domain: e.g., "Analytic Number Theory", "Quantum Mechanics"
- Concept: Specific topic within the domain
- Difficulty: Primarily "Hardcore" level
- Text: The problem or question
- Model Thoughts: Detailed reasoning process (wrapped in `<think>` tags)
- Model Response: Final answer with explanation
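Examples in this format are typically flattened into a single assistant turn whose content is the thoughts wrapped in `<think>` tags followed by the response. A sketch of that assembly (field names follow the list above; the exact template the authors used is not specified, so this is an assumption):

```python
def to_messages(example: dict) -> list[dict]:
    """Convert a dataset row into chat messages with CoT in <think> tags."""
    assistant = (
        f"<think>\n{example['model_thoughts']}\n</think>\n"
        f"{example['model_response']}"
    )
    return [
        {"role": "user", "content": example["text"]},
        {"role": "assistant", "content": assistant},
    ]

row = {
    "domain": "Number Theory",
    "concept": "Proof by Contradiction",
    "difficulty": "Hardcore",
    "text": "Prove that there are infinitely many primes.",
    "model_thoughts": "Use Euclid's argument...",
    "model_response": "Suppose finitely many primes exist...",
}
messages = to_messages(row)
print(messages[1]["content"].startswith("<think>"))  # True
```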
## Usage

### With Transformers + PEFT
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load base model
base_model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-4-E2B-it",
    torch_dtype="auto",
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("google/gemma-4-E2B-it")

# Load LoRA adapter
model = PeftModel.from_pretrained(base_model, "YOUR_USERNAME/gemma4-e2b-Gemini-3.1-Pro-Reasoning-Distill")

# Generate with thinking
messages = [{
    "role": "user",
    "content": "Prove that the square root of 2 is irrational."
}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=2048, temperature=0.7, do_sample=True)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
### With Unsloth (Faster)

```python
from unsloth import FastModel

model, tokenizer = FastModel.from_pretrained(
    "YOUR_USERNAME/gemma4-e2b-Gemini-3.1-Pro-Reasoning-Distill",
    max_seq_length=2048,
    load_in_4bit=True,
)
# Use the same generation code as above
```
## Example Output

**Input:**

> **Domain:** Number Theory
> **Concept:** Proof by Contradiction
>
> Prove that there are infinitely many prime numbers.

**Output:**
<think>
I need to prove there are infinitely many primes. The classic approach is Euclid's proof
by contradiction...
Assume there are finitely many primes: p₁, p₂, ..., pₙ
Consider N = (p₁ × p₂ × ... × pₙ) + 1
N is either prime or composite. If prime, we found a new prime not in our list.
If composite, it must have a prime factor. But N leaves remainder 1 when divided
by any pᵢ, so its prime factors aren't in our list.
Either way, contradiction! So there must be infinitely many primes.
</think>
**Proof (Euclid's Theorem):**
Suppose, for contradiction, that there are only finitely many primes: p₁, p₂, ..., pₙ.
Consider the number N = p₁p₂...pₙ + 1.
Since N > 1, either N is prime or N has a prime factorization.
- If N is prime, then N is a prime not in our list (since N > pᵢ for all i), contradiction.
- If N is composite, let p be a prime factor of N. Then p must be one of p₁, ..., pₙ.
But N ≡ 1 (mod pᵢ) for all i, so p cannot divide N, contradiction.
Therefore, there must be infinitely many primes. ∎
## Limitations
- May occasionally produce incorrect mathematical derivations
- Performance varies by domain (strongest in pure math, logic)
- Long responses may become repetitive
- Should be verified for critical applications
## License
This model inherits the Gemma license from the base model.
## Citation

```bibtex
@misc{gemma4-Gemini-3.1-Pro-Reasoning-Distill-2026,
  title={Gemma-4-E2B Hard Reasoning},
  author={Fine-tuned with Unsloth},
  year={2026},
  publisher={HuggingFace},
  url={https://huggingface.co/Ayodele01/gemma4-e2b-Gemini-3.1-Pro-Reasoning-Distill}
}
```
## Acknowledgments
- Google for the Gemma-4 model family
- Unsloth for efficient fine-tuning
- Roman1111111 for the reasoning datasets