Edit: 2026-03-25

Recommended Quantized Model: RepublicOfKorokke/Nemotron-Cascade-2-30B-A3B-oQ3.5


This model was converted to MLX format from nvidia/Nemotron-Cascade-2-30B-A3B using mlx-lm version 0.30.7.

Conversion Command

$ uv run mlx_lm.convert --hf-path nvidia/Nemotron-Cascade-2-30B-A3B --mlx-path Nemotron-Cascade-2-30B-A3B-mlx-mxfp4 -q --q-mode mxfp4 --q-group-size 32
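For rough intuition about the resulting file size, the effective bits per weight of block-scaled 4-bit quantization can be estimated as the element width plus the amortized cost of the per-group scale. The sketch below assumes the MXFP4 layout stores one shared 8-bit scale per group of 32 4-bit elements, and ignores tensors kept in higher precision, so it is only an estimate.

```python
# Rough bits-per-weight estimate for block-scaled 4-bit quantization.
# Assumed layout: groups of 32 weights, 4-bit elements, one shared
# 8-bit scale per group (the MXFP4 convention); unquantized tensors
# such as embeddings are ignored.

def bits_per_weight(elem_bits: int, group_size: int, scale_bits: int) -> float:
    """Element bits plus the per-group scale amortized over the group."""
    return elem_bits + scale_bits / group_size

def est_size_gb(n_params: float, bpw: float) -> float:
    """Estimated file size in decimal gigabytes."""
    return n_params * bpw / 8 / 1e9

bpw = bits_per_weight(elem_bits=4, group_size=32, scale_bits=8)
print(bpw)                                # 4.25 bits per weight
print(round(est_size_gb(30e9, bpw), 2))   # 15.94 GB for ~30B params
```

This lands in the same ballpark as the ~15.65 GB file size reported below; the exact figure depends on the true parameter count and on which tensors are actually quantized.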

Benchmark

| Model | File size | MMLU | JMMLU | HELLASWAG | ARC_CHALLENGE | GSM8K |
| --- | --- | --- | --- | --- | --- | --- |
| Nemotron-Cascade-2-30B-A3B-mlx-mxfp4 | 15.65 GB | 45.7% | 33.7% | 28.3% | 72.3% | 82.3% |
| Nemotron-Cascade-2-30B-A3B-oQ3.5 | 13.32 GB | 65.7% | 61.0% | 76.0% | 85.7% | 89.7% |
| Nemotron-Cascade-2-30B-A3B-oQ4 | 16.91 GB | 68.3% | 61.7% | - | - | - |

Detail

| Model | Benchmark | Accuracy | Correct | Total | Time (s) |
| --- | --- | --- | --- | --- | --- |
| Nemotron-Cascade-2-30B-A3B-mlx-mxfp4 | MMLU | 45.7% | 137 | 300 | 946.7 |
| Nemotron-Cascade-2-30B-A3B-mlx-mxfp4 | JMMLU | 33.7% | 101 | 300 | 558.5 |
| Nemotron-Cascade-2-30B-A3B-mlx-mxfp4 | HELLASWAG | 28.3% | 85 | 300 | 647.0 |
| Nemotron-Cascade-2-30B-A3B-mlx-mxfp4 | GSM8K | 82.3% | 247 | 300 | 1781.6 |
| Nemotron-Cascade-2-30B-A3B-mlx-mxfp4 | ARC_CHALLENGE | 72.3% | 217 | 300 | 462.5 |
| Nemotron-Cascade-2-30B-A3B-oQ3.5 | MMLU | 65.7% | 197 | 300 | 696.0 |
| Nemotron-Cascade-2-30B-A3B-oQ3.5 | JMMLU | 61.0% | 183 | 300 | 294.7 |
| Nemotron-Cascade-2-30B-A3B-oQ3.5 | HELLASWAG | 76.0% | 228 | 300 | 314.4 |
| Nemotron-Cascade-2-30B-A3B-oQ3.5 | GSM8K | 89.7% | 269 | 300 | 992.6 |
| Nemotron-Cascade-2-30B-A3B-oQ3.5 | ARC_CHALLENGE | 85.7% | 257 | 300 | 204.5 |
| Nemotron-Cascade-2-30B-A3B-oQ4 | MMLU | 68.3% | 205 | 300 | 572.7 |
| Nemotron-Cascade-2-30B-A3B-oQ4 | JMMLU | 61.7% | 185 | 300 | 239.2 |
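The Accuracy column appears to be simply Correct / Total rounded to one decimal place; a quick sketch to verify a couple of rows (the row values are taken from the table above):

```python
# Accuracy in the detail table is correct / total, as a percentage
# rounded to one decimal place.
def accuracy_pct(correct: int, total: int) -> float:
    return round(100 * correct / total, 1)

print(accuracy_pct(197, 300))  # 65.7 -> matches the oQ3.5 MMLU row
print(accuracy_pct(269, 300))  # 89.7 -> matches the oQ3.5 GSM8K row
```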