Gemma 4 26B A4B GGUF MXFP4_MOE

This repo contains a GGUF export of google/gemma-4-26B-A4B quantized to MXFP4_MOE, plus the matching mmproj GGUF for image input support.

Files

  • gemma-4-26B-A4B.MXFP4_MOE.gguf: main text model quantized to MXFP4_MOE.
  • mmproj-gemma-4-26B-A4B.f16.gguf: multimodal projector required for image input.
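For image input the main model and the mmproj file must be loaded together. A hypothetical llama.cpp invocation is sketched below; `llama-mtmd-cli` with `-m`, `--mmproj`, `--image`, and `-p` is the standard llama.cpp multimodal interface, but the image path is a placeholder and the build must actually support this architecture:

```shell
# Hypothetical example; requires a llama.cpp build that supports
# Gemma 4 multimodal MoE models and the MXFP4_MOE quantization.
llama-mtmd-cli \
  -m gemma-4-26B-A4B.MXFP4_MOE.gguf \
  --mmproj mmproj-gemma-4-26B-A4B.f16.gguf \
  --image photo.jpg \
  -p "Describe this image."
```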

LM Studio note

google/gemma-4-26B-A4B is a Gemma 4 Mixture-of-Experts model, so the correct GGUF quantization target for the text model is MXFP4_MOE rather than dense MXFP4.
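For intuition about what the 4-bit format stores: MXFP4 packs weights into 32-element blocks of FP4 (E2M1) values that share one power-of-two scale, and the `_MOE` variant applies this layout across the expert tensors. A minimal Python sketch of one block, as an illustration of the format only, not llama.cpp's actual MXFP4_MOE packing or rounding:

```python
# Illustrative sketch of one MXFP4 block: 32 values stored as FP4 (E2M1)
# magnitudes plus a sign, sharing a single power-of-two scale.
# Conceptual model only -- not llama.cpp's real MXFP4_MOE kernels.
import math

FP4_E2M1 = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]  # representable magnitudes

def quantize_block(block):
    """Quantize 32 floats to (shared power-of-two scale, per-value codes)."""
    assert len(block) == 32
    amax = max(abs(x) for x in block)
    # Pick the power-of-two scale that maps the largest |x| into (3, 6].
    exp = math.ceil(math.log2(amax / 6.0)) if amax > 0 else 0
    scale = 2.0 ** exp
    codes = []
    for x in block:
        mag = min(abs(x) / scale, 6.0)                  # clamp to FP4 max
        q = min(FP4_E2M1, key=lambda v: abs(v - mag))   # nearest FP4 magnitude
        codes.append((q, -1.0 if x < 0 else 1.0))
    return scale, codes

def dequantize_block(scale, codes):
    return [sign * q * scale for q, sign in codes]
```

Each block then costs 32 × 4 bits plus one shared 8-bit exponent, i.e. about 4.25 bits per weight.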

The final GGUF now embeds a Gemma 4-compatible chat template that forces thinking on, even in runtimes that pass enable_thinking=false or do not expose a separate reasoning toggle.
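In Jinja terms, a template forces thinking by opening the thought channel unconditionally in the generation prompt, so any `enable_thinking` variable the runtime passes simply never gets consulted. A minimal sketch under the usual Gemma `<start_of_turn>`/`<end_of_turn>` turn markers; this is illustrative only and is not the shipped chat template:

```jinja
{#- Sketch: always open the thought channel; enable_thinking is ignored -#}
{%- for message in messages -%}
<start_of_turn>{{ message['role'] }}
{{ message['content'] }}<end_of_turn>
{%- endfor -%}
{%- if add_generation_prompt -%}
<start_of_turn>model
<|channel>thought
{%- endif -%}
```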

This repo also ships tokenizer_config.json and chat_template.jinja sidecars with the Gemma 4 response_schema for the <|channel>thought\n...<channel|> reasoning block, so frontends that look beyond GGUF metadata can both elicit and parse reasoning more reliably.
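For a frontend that only sees raw completion text, separating the reasoning block can be as simple as matching those delimiters. A minimal Python sketch, illustrative only; the shipped chat_template.jinja and response_schema remain the authoritative description of the format:

```python
# Sketch: split the <|channel>thought\n...<channel|> reasoning block
# out of a raw completion. Minimal illustration, not a full parser.
import re

THOUGHT_RE = re.compile(r"<\|channel>thought\n(.*?)<channel\|>", re.DOTALL)

def split_reasoning(text):
    """Return (thought, visible_answer); thought is None if absent."""
    m = THOUGHT_RE.search(text)
    if not m:
        return None, text
    answer = text[:m.start()] + text[m.end():]
    return m.group(1), answer.strip()
```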

This is the base google/gemma-4-26B-A4B checkpoint, not a separate instruction-tuned -it variant. The reasoning-aware, tool-capable template is embedded so the runtime keeps both tool formatting and thinking support, but runtime compatibility still depends on the GGUF engine supporting Gemma 4 multimodal MoE models and MXFP4_MOE.

Model stats

  • Downloads last month: 4,074
  • Format: GGUF
  • Model size: 25B params
  • Architecture: gemma4
  • Quantization: 4-bit (MXFP4_MOE)

Model tree: tianrui6641/gemma-4-26b-a4b-gguf-mxfp4-moe (quantized from google/gemma-4-26B-A4B)