Gemma 4 31B GGUF MXFP4

This repo contains a GGUF export of google/gemma-4-31B quantized to dense MXFP4, plus the matching mmproj GGUF for image input support.

Files

  • gemma-4-31B.MXFP4.gguf: main text model quantized to dense MXFP4.
  • mmproj-gemma-4-31B.f16.gguf: multimodal projector required for image input.

LM Studio note

The final GGUF embeds a Gemma 4-compatible chat template that keeps thinking enabled even in runtimes that pass enable_thinking=false or do not expose a separate reasoning toggle.

This repo also ships tokenizer_config.json and chat_template.jinja sidecars with the Gemma 4 response_schema for the <|channel>thought\n...<channel|> reasoning block, so frontends that look beyond GGUF metadata can both elicit and parse reasoning more reliably.
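For frontends that parse the raw model output themselves, a minimal sketch of splitting the reasoning block from the visible response, assuming the delimiter strings exactly as stated above (`<|channel>thought\n` ... `<channel|>`); the function name is illustrative, not part of any shipped file:

```python
import re

# Reasoning-block delimiters as described in this card's
# response_schema: "<|channel>thought\n" ... "<channel|>".
THOUGHT_RE = re.compile(r"<\|channel>thought\n(.*?)<channel\|>", re.DOTALL)

def split_reasoning(text: str):
    """Return (thoughts, visible): the extracted reasoning blocks and
    the response text with those blocks removed."""
    thoughts = THOUGHT_RE.findall(text)
    visible = THOUGHT_RE.sub("", text).strip()
    return thoughts, visible
```

A frontend can then show `visible` to the user and render `thoughts` in a collapsible reasoning pane.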

This is the base google/gemma-4-31B checkpoint, not google/gemma-4-31B-it. The reasoning-aware, tool-capable template is embedded so the runtime keeps both tool formatting and thinking support, but instruction-following and tool-calling quality may differ from the -it model.
