Tinker-Stack/Nemotron-3-Nano-30B-A3B-IQ4_XS-GGUF at main

1 contributor

History: 6 commits

Update: Flash Attention benchmarks (+35% sustained), 26.7 tok/s production config

189d56e verified about 2 months ago

.gitattributes
1.6 kB
Upload nvidia_Nemotron-3-Nano-30B-A3B-IQ4_XS.gguf with huggingface_hub about 2 months ago

Modelfile
195 Bytes
Upload Modelfile with huggingface_hub about 2 months ago

README.md
8.39 kB
Update: Flash Attention benchmarks (+35% sustained), 26.7 tok/s production config about 2 months ago

nvidia_Nemotron-3-Nano-30B-A3B-IQ4_XS.gguf
18.1 GB (xet)
Upload nvidia_Nemotron-3-Nano-30B-A3B-IQ4_XS.gguf with huggingface_hub about 2 months ago