Gemma 4 26B A4B GGUF MXFP4_MOE
This repo contains a GGUF export of google/gemma-4-26B-A4B quantized to MXFP4_MOE, plus the matching mmproj GGUF for image input support.
Files
- gemma-4-26B-A4B.MXFP4_MOE.gguf: main text model quantized to MXFP4_MOE.
- mmproj-gemma-4-26B-A4B.f16.gguf: multimodal projector required for image input.
LM Studio note
google/gemma-4-26B-A4B is a Gemma 4 Mixture-of-Experts model, so the correct GGUF quantization target for the text model is MXFP4_MOE rather than dense MXFP4.
The final GGUF now embeds a Gemma 4-compatible chat template that forces thinking on, even in runtimes that pass enable_thinking=false or do not expose a separate reasoning toggle.
This repo also ships tokenizer_config.json and chat_template.jinja sidecars carrying the Gemma 4 response_schema for the <|channel>thought\n...<channel|> reasoning block, so frontends that look beyond the GGUF metadata can both elicit and parse reasoning more reliably.
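For frontends that do not parse the sidecar templates, the reasoning block can be split out of raw model output directly. The sketch below is illustrative only: the `split_reasoning` helper is not part of this repo, and the `<|channel>thought\n...<channel|>` delimiters are taken from this card's description of the response_schema; check them against the actual chat_template.jinja before relying on this.

```python
import re

# Delimiters as described in this card's response_schema note; verify against
# the shipped chat_template.jinja, since the exact tokens may differ.
THOUGHT_RE = re.compile(r"<\|channel>thought\n(.*?)<channel\|>", re.DOTALL)

def split_reasoning(text: str) -> tuple[str, str]:
    """Return (reasoning, answer): the concatenated contents of any thought
    blocks, and the remaining text with those blocks removed."""
    reasoning = "\n".join(m.group(1).strip() for m in THOUGHT_RE.finditer(text))
    answer = THOUGHT_RE.sub("", text).strip()
    return reasoning, answer

example = "<|channel>thought\nCheck units first.<channel|>The answer is 42."
r, a = split_reasoning(example)
print(r)  # Check units first.
print(a)  # The answer is 42.
```

A frontend would typically show `answer` in the chat transcript and render `reasoning` in a collapsible panel.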
This is the base google/gemma-4-26B-A4B checkpoint, not a separate instruction-tuned -it variant. The reasoning-aware, tool-capable template is embedded so the runtime keeps both tool formatting and thinking support, but runtime compatibility still depends on the GGUF engine supporting Gemma 4 multimodal MoE models and MXFP4_MOE.