NVIDIA-Nemotron-3-Nano-4B-oQ4

This model was quantized using oQ mixed-precision quantization.

Quantization details

  • Model type: nemotron_h
  • Bits: 4
  • Group size: 64
  • Format: MLX safetensors
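A rough back-of-envelope check on the settings above, assuming MLX-style group-wise affine quantization in which each group of 64 weights shares a 16-bit scale and a 16-bit bias (an assumption; oQ is mixed precision, so some layers may use different bit widths):

```python
def effective_bits_per_weight(bits=4, group_size=64, scale_bits=16, bias_bits=16):
    """Effective storage cost per weight for group-wise affine quantization:
    each group of `group_size` weights shares one scale and one bias."""
    return bits + (scale_bits + bias_bits) / group_size

bpw = effective_bits_per_weight()     # 4 + 32/64 = 4.5 bits per weight
approx_gb = 4e9 * bpw / 8 / 1e9       # ~2.25 GB for ~4B weights
```

With ~4B parameters this estimate lands near the 2.19 GB file size reported below; embeddings, metadata, and any higher-precision layers account for the difference.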

Benchmark

| Model | File size | MMLU | JMMLU | HELLASWAG | ARC_CHALLENGE | GSM8K |
|---|---|---|---|---|---|---|
| NVIDIA-Nemotron-3-Nano-4B-oQ3.5 | 1.86 GB | 53.7% | 45.3% | 66.0% | 79.3% | 74.0% |
| NVIDIA-Nemotron-3-Nano-4B-oQ4 | 2.19 GB | 61.0% | 51.7% | 71.7% | 81.7% | 81.3% |

Detail

| Model | Benchmark | Accuracy | Correct | Total | Time (s) |
|---|---|---|---|---|---|
| NVIDIA-Nemotron-3-Nano-4B-oQ3.5 | MMLU | 53.7% | 161 | 300 | 572.9 |
| NVIDIA-Nemotron-3-Nano-4B-oQ3.5 | JMMLU | 45.3% | 136 | 300 | 159.8 |
| NVIDIA-Nemotron-3-Nano-4B-oQ3.5 | HELLASWAG | 66.0% | 198 | 300 | 200.4 |
| NVIDIA-Nemotron-3-Nano-4B-oQ3.5 | ARC_CHALLENGE | 79.3% | 238 | 300 | 114.9 |
| NVIDIA-Nemotron-3-Nano-4B-oQ3.5 | GSM8K | 74.0% | 222 | 300 | 904.0 |
| NVIDIA-Nemotron-3-Nano-4B-oQ4 | MMLU | 61.0% | 183 | 300 | 612.2 |
| NVIDIA-Nemotron-3-Nano-4B-oQ4 | JMMLU | 51.7% | 155 | 300 | 162.3 |
| NVIDIA-Nemotron-3-Nano-4B-oQ4 | HELLASWAG | 71.7% | 215 | 300 | 210.4 |
| NVIDIA-Nemotron-3-Nano-4B-oQ4 | ARC_CHALLENGE | 81.7% | 245 | 300 | 116.1 |
| NVIDIA-Nemotron-3-Nano-4B-oQ4 | GSM8K | 81.3% | 244 | 300 | 885.6 |
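The Accuracy column is simply Correct / Total rounded to one decimal place; a minimal sketch (helper name is hypothetical):

```python
def accuracy(correct, total):
    """Percentage accuracy, rounded to one decimal as in the table above."""
    return round(100 * correct / total, 1)

# Spot-check a few rows from the detail table.
rows = [(161, 300, 53.7), (183, 300, 61.0), (244, 300, 81.3)]
for correct, total, reported in rows:
    assert accuracy(correct, total) == reported
```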