GPT-2.5-Math
GPT-2.5-Math is an upgraded version of BikoRiko/GPT-2.4-High-Pro, featuring an expanded architecture and fine-tuning specialized for mathematical reasoning.
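A minimal loading sketch using the transformers library. The repo id below is inferred from the model name and is an assumption; check the model page for the actual hosted path.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed repo id (inferred from the model name); verify on the model page.
model_id = "BikoRiko/GPT-2.5-Math"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
```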
Model Details
- Architecture: GPT-2 with 6 additional layers, ~0.2B total parameters (see the config sketch after this list).
- Training Hardware: NVIDIA H100 (via Modal.com).
- Dataset: 5% subset of microsoft/orca-math-word-problems-200k.
- Objective: Fine-tuned to solve math word problems and logical queries.
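As a rough illustration of how "6 additional layers" relates to the ~0.2B parameter count, here is a config sketch. It assumes the base is GPT-2 small (12 layers, 768-dim hidden states), which is not stated above; 12 + 6 = 18 layers lands near 0.2B parameters once embeddings are counted.

```python
from transformers import GPT2Config, GPT2LMHeadModel

# Assumption: GPT-2 small as the base (n_layer=12), extended to 18 layers.
# All other dimensions are left at GPT-2 defaults (n_embd=768, n_head=12).
config = GPT2Config(n_layer=18)
model = GPT2LMHeadModel(config)

# Prints roughly 1.7e8, i.e. in the ballpark of the stated ~0.2B.
print(sum(p.numel() for p in model.parameters()))
```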
Performance
The model is fine-tuned for mathematical reasoning. At ~0.2B parameters it is small, but it shows the beginnings of logical grounding on basic word problems, as in the example below.
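Continuing from the loading sketch above, a quick generation example. The prompt format and generation settings here are illustrative assumptions, not necessarily the format the model was trained on.

```python
# Hypothetical prompt format; adjust to match the fine-tuning data.
prompt = "Question: A box holds 12 pencils. How many pencils are in 7 boxes?\nAnswer:"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=64,
    do_sample=False,
    pad_token_id=tokenizer.eos_token_id,  # GPT-2 has no dedicated pad token
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```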
Training Details
- Optimizer: AdamW
- Precision: mixed precision via torch.amp (see the training-loop sketch after this list)
- Epochs: 3
- Learning Rate: 5e-5
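A minimal mixed-precision fine-tuning sketch matching the settings listed above (AdamW, torch.amp, 3 epochs, lr 5e-5). Here `model` and `train_loader` are assumptions: a causal LM and a DataLoader yielding tokenized batches that include labels.

```python
import torch
from torch.optim import AdamW

device = "cuda"
model.to(device)
optimizer = AdamW(model.parameters(), lr=5e-5)
scaler = torch.amp.GradScaler("cuda")  # loss scaling for fp16 stability

for epoch in range(3):
    for batch in train_loader:
        batch = {k: v.to(device) for k, v in batch.items()}
        optimizer.zero_grad()
        # Forward pass runs in mixed precision (torch.amp).
        with torch.amp.autocast("cuda"):
            loss = model(**batch).loss  # labels are part of the batch
        scaler.scale(loss).backward()
        scaler.step(optimizer)
        scaler.update()
```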