Tinker-Stack/Nemotron-3-Nano-30B-A3B-IQ4_XS-GGUF

Text Generation
GGUF
nemotron
nvidia
Mixture of Experts
mixture-of-experts
mamba
tool-calling
reasoning
llama-cpp
ollama
11gb-vram
rtx-2080-ti
turing
imatrix
conversational
  • 1 contributor
History: 6 commits
Latest commit by Tinker-Stack: Update: Flash Attention benchmarks (+35% sustained), 26.7 tok/s production config
189d56e, verified, about 2 months ago
  • .gitattributes
    1.6 kB
    Upload nvidia_Nemotron-3-Nano-30B-A3B-IQ4_XS.gguf with huggingface_hub about 2 months ago
  • Modelfile
    195 Bytes
    Upload Modelfile with huggingface_hub about 2 months ago
  • README.md
    8.39 kB
    Update: Flash Attention benchmarks (+35% sustained), 26.7 tok/s production config about 2 months ago
  • nvidia_Nemotron-3-Nano-30B-A3B-IQ4_XS.gguf
    18.1 GB (xet)
    Upload nvidia_Nemotron-3-Nano-30B-A3B-IQ4_XS.gguf with huggingface_hub about 2 months ago