# Gemma-4-E2B Hard Reasoning

A fine-tuned version of Google's Gemma-4-E2B-it specialized for complex reasoning tasks with chain-of-thought (CoT) capabilities.
## Model Description
This model was fine-tuned on 13,181 high-quality reasoning examples spanning mathematics, science, logic, and analytical thinking. The training data includes detailed thinking processes wrapped in <think> tags, teaching the model to show its reasoning step-by-step.
## Key Features

- 🧠 **Chain-of-thought reasoning**: the model outputs its thinking process in `<think>` tags
- 📚 **Multi-domain expertise**: math, physics, computer science, logic, philosophy
- 🎯 **Hard-problem focus**: trained on "hardcore"-difficulty examples
- ⚡ **Efficient LoRA fine-tuning**: only 32M trainable parameters
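Because the reasoning is delimited by `<think>` tags, downstream code can separate the thought process from the final answer. A minimal sketch of such a parser (the helper name and sample string are illustrative, not part of the model's API):

```python
import re

def split_thinking(text: str) -> tuple[str, str]:
    """Split generated text into (thinking, answer).

    Assumes at most one <think>...</think> block; if none is
    present, the whole text is treated as the answer.
    """
    match = re.search(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    if match is None:
        return "", text.strip()
    thinking = match.group(1).strip()
    answer = (text[:match.start()] + text[match.end():]).strip()
    return thinking, answer

sample = "<think>2 + 2 = 4, no carrying needed.</think>The answer is 4."
thinking, answer = split_thinking(sample)
print(thinking)  # 2 + 2 = 4, no carrying needed.
print(answer)    # The answer is 4.
```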
## Training Details
| Parameter | Value |
|---|---|
| Base Model | google/gemma-4-E2B-it |
| Model Parameters | E2B |
| Training Method | LoRA (Low-Rank Adaptation) |
| LoRA Rank | 16 |
| LoRA Alpha | 32 |
| Epochs | 1 |
| Batch Size | None × 4 (effective) |
| Learning Rate | 0.0002 |
| Max Sequence Length | 2048 |
| Precision | 4-bit quantization |
| Framework | Unsloth + TRL |
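The trainable-parameter count follows directly from the LoRA configuration: each adapted weight matrix of shape (d_out, d_in) gains two low-rank factors of shapes (d_out, r) and (r, d_in), contributing r · (d_in + d_out) parameters. A back-of-the-envelope check (the matrix shapes below are hypothetical, chosen only to illustrate the arithmetic, not taken from the actual model):

```python
def lora_param_count(shapes: list[tuple[int, int]], rank: int) -> int:
    """Trainable parameters added by LoRA: rank * (d_out + d_in) per matrix."""
    return sum(rank * (d_out + d_in) for d_out, d_in in shapes)

# Hypothetical: one block's q/k/v/o projections, all 2048 x 2048
shapes = [(2048, 2048)] * 4
per_layer = lora_param_count(shapes, rank=16)
print(per_layer)  # 262144 trainable parameters per such layer at rank 16
```

Summed over all adapted layers, counts on this order of magnitude add up to the ~32M trainable parameters reported above.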
### Training Results
| Metric | Value |
|---|---|
| Final Loss | 1.8077 |
| Training Time | 3.55 hours |
| Steps | 3295 |
## Datasets
This model was trained on two high-quality reasoning datasets:
- gemini-3.1-pro-hard-high-reasoning (3,150 samples)
- gemini-3-pro-10000x-hard-high-reasoning (10,031 samples)
Total: 13,181 reasoning examples
### Data Format
Each example contains:
- Domain: e.g., "Analytic Number Theory", "Quantum Mechanics"
- Concept: Specific topic within the domain
- Difficulty: Primarily "Hardcore" level
- Text: The problem or question
- Model Thoughts: Detailed reasoning process (wrapped in `<think>` tags)
- Model Response: Final answer with explanation
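Examples in this format are typically flattened into a single assistant turn whose content is the thoughts wrapped in `<think>` tags followed by the response. A sketch of that assembly (field names follow the list above; the exact template the authors used is not specified, so this is an assumption):

```python
def to_messages(example: dict) -> list[dict]:
    """Convert a dataset row into chat messages with CoT in <think> tags."""
    assistant = (
        f"<think>\n{example['model_thoughts']}\n</think>\n"
        f"{example['model_response']}"
    )
    return [
        {"role": "user", "content": example["text"]},
        {"role": "assistant", "content": assistant},
    ]

row = {
    "domain": "Number Theory",
    "concept": "Proof by Contradiction",
    "difficulty": "Hardcore",
    "text": "Prove that there are infinitely many primes.",
    "model_thoughts": "Use Euclid's argument...",
    "model_response": "Suppose finitely many primes exist...",
}
messages = to_messages(row)
print(messages[1]["content"].startswith("<think>"))  # True
```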
## Usage

### With Transformers + PEFT
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load base model
base_model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-4-E2B-it",
    torch_dtype="auto",
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("google/gemma-4-E2B-it")

# Load LoRA adapter
model = PeftModel.from_pretrained(base_model, "YOUR_USERNAME/gemma4-e2b-Gemini-3.1-Pro-Reasoning-Distill")

# Generate with thinking
messages = [{
    "role": "user",
    "content": "Prove that the square root of 2 is irrational."
}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=2048, temperature=0.7, do_sample=True)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
### With Unsloth (Faster)

```python
from unsloth import FastModel

model, tokenizer = FastModel.from_pretrained(
    "YOUR_USERNAME/gemma4-e2b-Gemini-3.1-Pro-Reasoning-Distill",
    max_seq_length=2048,
    load_in_4bit=True,
)
# Use the same generation code as above
```
## Example Output

**Input:**

> **Domain:** Number Theory
> **Concept:** Proof by Contradiction
>
> Prove that there are infinitely many prime numbers.

**Output:**
<think>
I need to prove there are infinitely many primes. The classic approach is Euclid's proof
by contradiction...
Assume there are finitely many primes: p₁, p₂, ..., pₙ
Consider N = (p₁ × p₂ × ... × pₙ) + 1
N is either prime or composite. If prime, we found a new prime not in our list.
If composite, it must have a prime factor. But N leaves remainder 1 when divided
by any pᵢ, so its prime factors aren't in our list.
Either way, contradiction! So there must be infinitely many primes.
</think>
**Proof (Euclid's Theorem):**
Suppose, for contradiction, that there are only finitely many primes: p₁, p₂, ..., pₙ.
Consider the number N = p₁p₂...pₙ + 1.
Since N > 1, either N is prime or N has a prime factorization.
- If N is prime, then N is a prime not in our list (since N > pᵢ for all i), contradiction.
- If N is composite, let p be a prime factor of N. Then p must be one of p₁, ..., pₙ.
But N ≡ 1 (mod pᵢ) for all i, so p cannot divide N, contradiction.
Therefore, there must be infinitely many primes. ∎
## Limitations
- May occasionally produce incorrect mathematical derivations
- Performance varies by domain (strongest in pure math, logic)
- Long responses may become repetitive
- Should be verified for critical applications
## License
This model inherits the Gemma license from the base model.
## Citation

```bibtex
@misc{gemma4-Gemini-3.1-Pro-Reasoning-Distill-2026,
  title={Gemma-4-E2B Hard Reasoning},
  author={Fine-tuned with Unsloth},
  year={2026},
  publisher={HuggingFace},
  url={https://huggingface.co/Ayodele01/gemma4-e2b-Gemini-3.1-Pro-Reasoning-Distill}
}
```
## Acknowledgments
- Google for the Gemma-4 model family
- Unsloth for efficient fine-tuning
- Roman1111111 for the reasoning datasets