# NVIDIA-Nemotron-3-Nano-4B-oQ4
This model was quantized with the oQ mixed-precision quantization scheme.
## Quantization details
- Model type: nemotron_h
- Bits: 4
- Group size: 64
- Format: MLX safetensors
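
Since the checkpoint is in MLX safetensors format, it can be loaded with the `mlx-lm` package; a minimal sketch (assuming `pip install mlx-lm`, Apple-silicon hardware, and that the weights download on first use — the prompt text is illustrative):

```python
# Sketch: load this 4-bit MLX checkpoint and run a short generation.
# Requires Apple silicon; weights are fetched from the Hub on first call.
from mlx_lm import load, generate

model, tokenizer = load("RepublicOfKorokke/NVIDIA-Nemotron-3-Nano-4B-oQ4")
response = generate(
    model,
    tokenizer,
    prompt="Explain group-size quantization in one sentence.",
    max_tokens=128,
)
print(response)
```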
## Benchmark
| Model | File size | MMLU | JMMLU | HELLASWAG | ARC_CHALLENGE | GSM8K |
|---|---|---|---|---|---|---|
| NVIDIA-Nemotron-3-Nano-4B-oQ3.5 | 1.86 GB | 53.7% | 45.3% | 66.0% | 79.3% | 74.0% |
| NVIDIA-Nemotron-3-Nano-4B-oQ4 | 2.19 GB | 61.0% | 51.7% | 71.7% | 81.7% | 81.3% |
### Detail
| Model | Benchmark | Accuracy | Correct | Total | Time(s) |
|---|---|---|---|---|---|
| NVIDIA-Nemotron-3-Nano-4B-oQ3.5 | MMLU | 53.7% | 161 | 300 | 572.9 |
| NVIDIA-Nemotron-3-Nano-4B-oQ3.5 | JMMLU | 45.3% | 136 | 300 | 159.8 |
| NVIDIA-Nemotron-3-Nano-4B-oQ3.5 | HELLASWAG | 66.0% | 198 | 300 | 200.4 |
| NVIDIA-Nemotron-3-Nano-4B-oQ3.5 | ARC_CHALLENGE | 79.3% | 238 | 300 | 114.9 |
| NVIDIA-Nemotron-3-Nano-4B-oQ3.5 | GSM8K | 74.0% | 222 | 300 | 904.0 |
| NVIDIA-Nemotron-3-Nano-4B-oQ4 | MMLU | 61.0% | 183 | 300 | 612.2 |
| NVIDIA-Nemotron-3-Nano-4B-oQ4 | JMMLU | 51.7% | 155 | 300 | 162.3 |
| NVIDIA-Nemotron-3-Nano-4B-oQ4 | HELLASWAG | 71.7% | 215 | 300 | 210.4 |
| NVIDIA-Nemotron-3-Nano-4B-oQ4 | ARC_CHALLENGE | 81.7% | 245 | 300 | 116.1 |
| NVIDIA-Nemotron-3-Nano-4B-oQ4 | GSM8K | 81.3% | 244 | 300 | 885.6 |
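
The Accuracy column is simply Correct / Total expressed as a percentage, rounded to one decimal place; a quick consistency check over the rows transcribed from the detail table above can be sketched as:

```python
# Verify each reported Accuracy equals 100 * Correct / Total to one decimal.
# Tuples: (model suffix, benchmark, reported %, correct, total),
# transcribed from the detail table.
rows = [
    ("oQ3.5", "MMLU",          53.7, 161, 300),
    ("oQ3.5", "JMMLU",         45.3, 136, 300),
    ("oQ3.5", "HELLASWAG",     66.0, 198, 300),
    ("oQ3.5", "ARC_CHALLENGE", 79.3, 238, 300),
    ("oQ3.5", "GSM8K",         74.0, 222, 300),
    ("oQ4",   "MMLU",          61.0, 183, 300),
    ("oQ4",   "JMMLU",         51.7, 155, 300),
    ("oQ4",   "HELLASWAG",     71.7, 215, 300),
    ("oQ4",   "ARC_CHALLENGE", 81.7, 245, 300),
    ("oQ4",   "GSM8K",         81.3, 244, 300),
]
mismatches = [r for r in rows if round(100 * r[3] / r[4], 1) != r[2]]
print(mismatches)  # [] -- every row is internally consistent
```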
## Model tree for RepublicOfKorokke/NVIDIA-Nemotron-3-Nano-4B-oQ4
Base model lineage:
- nvidia/NVIDIA-Nemotron-Nano-12B-v2-Base
- nvidia/NVIDIA-Nemotron-Nano-12B-v2 (finetuned)
- nvidia/NVIDIA-Nemotron-Nano-9B-v2 (finetuned)
- nvidia/NVIDIA-Nemotron-3-Nano-4B-BF16 (finetuned)