Gemma 4 31B GGUF MXFP4

This repo contains a GGUF export of google/gemma-4-31B quantized to dense MXFP4, plus the matching mmproj GGUF for image input support.

Files

  • gemma-4-31B.MXFP4.gguf: main text model quantized to dense MXFP4.
  • mmproj-gemma-4-31B.f16.gguf: multimodal projector required for image input.

LM Studio note

The final GGUF embeds a Gemma 4-compatible chat template that keeps thinking enabled even in runtimes that pass enable_thinking=false or do not expose a separate reasoning toggle.

This repo also ships tokenizer_config.json and chat_template.jinja sidecars with the Gemma 4 response_schema for the <|channel>thought\n...<channel|> reasoning block, so frontends that look beyond GGUF metadata can both elicit and parse reasoning more reliably.
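For frontends that parse the raw model output themselves, a minimal sketch of splitting the reasoning block from the visible response, assuming the delimiter strings exactly as stated above (`<|channel>thought\n` ... `<channel|>`); the function name is illustrative, not part of any shipped file:

```python
import re

# Reasoning-block delimiters as described in this card's
# response_schema: "<|channel>thought\n" ... "<channel|>".
THOUGHT_RE = re.compile(r"<\|channel>thought\n(.*?)<channel\|>", re.DOTALL)

def split_reasoning(text: str):
    """Return (thoughts, visible): the extracted reasoning blocks and
    the response text with those blocks removed."""
    thoughts = THOUGHT_RE.findall(text)
    visible = THOUGHT_RE.sub("", text).strip()
    return thoughts, visible
```

A frontend can then show `visible` to the user and render `thoughts` in a collapsible reasoning pane.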

This is the base google/gemma-4-31B checkpoint, not google/gemma-4-31B-it. The reasoning-aware, tool-capable template is embedded so the runtime keeps both tool formatting and thinking support, but instruction-following and tool-calling quality may differ from the -it model.
