Recommended way to run this model:
llama-server -hf ggml-org/gemma-4-E2B-it-GGUF
Then, access http://localhost:8080
Chat template
8-bit
16-bit
Base model