| --- |
| license: apache-2.0 |
| datasets: |
| - TIGER-Lab/MATH-plus |
| language: |
| - en |
| tags: |
| - torchtune |
| - minerva-math |
| library_name: transformers |
| pipeline_tag: text-generation |
| --- |
| |
| # jrc/phi3-mini-math |
|
|
|
|
Math majors, who needs 'em? This model is a LoRA finetune of Phi-3 Mini on math instruction data, built to answer your math questions.
|
|
| ## How to Get Started with the Model |
|
|
| Use the code below to get started with the model. |
|
|
| ```python |
| # Load model directly |
| from transformers import AutoTokenizer, AutoModelForCausalLM |
| |
| tokenizer = AutoTokenizer.from_pretrained("jrc/phi3-mini-math", trust_remote_code=True) |
| model = AutoModelForCausalLM.from_pretrained("jrc/phi3-mini-math", trust_remote_code=True) |
| ``` |
|
|
| ## Training Details |
|
|
Phi-3 Mini was finetuned using [torchtune](https://github.com/pytorch/torchtune); the training script and config file are located in this repository.
|
|
| ```bash |
tune run --nproc_per_node 4 lora_finetune_distributed.py --config mini_lora.yaml
| ``` |
|
|
| You can see a full Weights & Biases run [here](https://api.wandb.ai/links/jcummings/hkey76vj). |
|
|
| ### Training Data |
|
|
|
|
This model was finetuned on the following dataset:
|
|
| * [TIGER-Lab/MATH-plus](https://huggingface.co/datasets/TIGER-Lab/MATH-plus): An advanced math-specific dataset with 894k samples. |
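To take a quick look at the data yourself, here is a small sketch using the `datasets` library; the split name comes from the dataset card, and streaming avoids downloading all 894k samples just to inspect the schema:

```python
from datasets import load_dataset

# Stream a single example to inspect the fields without a full download.
ds = load_dataset("TIGER-Lab/MATH-plus", split="train", streaming=True)
print(next(iter(ds)))
```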
|
|
| #### Hardware |
|
|
* GPUs: 4 x NVIDIA A100
* Peak VRAM per GPU: 29 GB
* Wall-clock training time: 10 hours
|
|
| ## Evaluation |
|
|
The finetuned model is evaluated on [Minerva-MATH](https://research.google/blog/minerva-solving-quantitative-reasoning-problems-with-language-models/) using the [EleutherAI LM Evaluation Harness](https://github.com/EleutherAI/lm-evaluation-harness) via torchtune's `eleuther_eval` recipe.
|
|
| ```bash |
| tune run eleuther_eval --config eleuther_evaluation \ |
checkpointer.checkpoint_dir=./lora-phi3-math \
| tasks=["minerva_math"] \ |
| batch_size=32 |
| ``` |
|
|
| | Tasks |Version|Filter|n-shot| Metric |Value | |Stderr| |
| |------------------------------------|-------|------|-----:|-----------|-----:|---|-----:| |
| |minerva_math |N/A |none | 4|exact_match|0.1670|± |0.0051| |
| | - minerva_math_algebra | 1|none | 4|exact_match|0.2502|± |0.0126| |
| | - minerva_math_counting_and_prob | 1|none | 4|exact_match|0.1329|± |0.0156| |
| | - minerva_math_geometry | 1|none | 4|exact_match|0.1232|± |0.0150| |
| | - minerva_math_intermediate_algebra| 1|none | 4|exact_match|0.0576|± |0.0078| |
| | - minerva_math_num_theory | 1|none | 4|exact_match|0.1148|± |0.0137| |
| | - minerva_math_prealgebra | 1|none | 4|exact_match|0.3077|± |0.0156| |
| | - minerva_math_precalc | 1|none | 4|exact_match|0.0623|± |0.0104| |
| |
These results are a large improvement over the base Phi-3 Mini model on the same benchmark.
| |
| ## Model Card Contact |
| |
| Drop me a line at @official_j3rck |