nvidia
/
Gemma-4-31B-IT-NVFP4

Text Generation
Safetensors
Model Optimizer
gemma4
nvidia
ModelOpt
Gemma-4-31B-IT
lighthouse
quantized
NVFP4
conversational
modelopt

Why not quantize the MATRICES of Wq, Wk, Wv, Wo?

#5 opened 3 days ago by
BeetSoup

This version is still too large for a single 5090 card

10
#4 opened 4 days ago by
iwaitu

Why does this 4-bit version have a 32.7 GB size?

10
#3 opened 4 days ago by
alexcardo