Qwen2.5-Coder-7B-Instruct – GGUF Quants

Quantized GGUF versions of Qwen/Qwen2.5-Coder-7B-Instruct, Alibaba's specialized code-generation model trained on 5.5 trillion tokens of code data. It achieves performance comparable to much larger general-purpose models on coding benchmarks, making it a strong choice for local code assistance.

Available Files

File                                     Quant     Size       Use Case
Qwen2.5-Coder-7B-Instruct-Q8_0.gguf      Q8_0      ~7.7 GB    Maximum quality
Qwen2.5-Coder-7B-Instruct-Q6_K.gguf      Q6_K      ~6.0 GB    Near-lossless
Qwen2.5-Coder-7B-Instruct-Q5_K_M.gguf    Q5_K_M    ~5.2 GB    High quality
Qwen2.5-Coder-7B-Instruct-Q4_K_M.gguf    Q4_K_M    ~4.4 GB    Recommended default
Qwen2.5-Coder-7B-Instruct-Q3_K_M.gguf    Q3_K_M    ~3.5 GB    Low VRAM
Qwen2.5-Coder-7B-Instruct-IQ4_XS.gguf    IQ4_XS    ~3.9 GB    Imatrix 4-bit
Qwen2.5-Coder-7B-Instruct-IQ3_XXS.gguf   IQ3_XXS   ~2.9 GB    Imatrix 3-bit
Qwen2.5-Coder-7B-Instruct-IQ2_M.gguf     IQ2_M     ~2.5 GB    Imatrix 2-bit
Qwen2.5-Coder-7B-Instruct-IQ1_S.gguf     IQ1_S     ~1.8 GB    Extreme compression
Qwen2.5-Coder-7B-Instruct-fp16.gguf      FP16      ~14.8 GB   Full precision
imatrix.dat                              -         -          Importance matrix

Usage

./llama-cli -m Qwen2.5-Coder-7B-Instruct-Q4_K_M.gguf \
  --ctx-size 8192 -n 1024 -e \
  -p "<|im_start|>system\nYou are an expert programmer.<|im_end|>\n<|im_start|>user\nWrite a Python function to sort a list.<|im_end|>\n<|im_start|>assistant\n"
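The prompt above follows Qwen's ChatML template. When scripting, it is easy to get the token order wrong, so here is a minimal sketch of assembling it programmatically (the `build_chatml_prompt` helper name is ours, not part of llama.cpp or Qwen's tooling):

```python
def build_chatml_prompt(system: str, user: str) -> str:
    """Assemble a single-turn Qwen ChatML prompt; the model generates
    its reply after the trailing <|im_start|>assistant marker."""
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

prompt = build_chatml_prompt(
    "You are an expert programmer.",
    "Write a Python function to sort a list.",
)
print(prompt)
```

For multi-turn use, append further `<|im_start|>user … <|im_end|>` / `<|im_start|>assistant … <|im_end|>` pairs in the same pattern.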

About Qwen2.5-Coder-7B

  • Parameters: 7B
  • Training: 5.5T tokens of code data
  • Context: 128K tokens
  • License: Apache 2.0
  • Strengths: Code completion, debugging, code explanation, and fill-in-the-middle (FIM) for IDE integrations

Best-in-class local code assistance at the 7B scale.
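For the FIM use case mentioned above, Qwen2.5-Coder uses dedicated special tokens: the model completes the span between a prefix and a suffix. A sketch of the prompt layout (the helper name is ours; the token order follows the Qwen2.5-Coder documentation):

```python
def build_fim_prompt(prefix: str, suffix: str) -> str:
    """Fill-in-the-middle prompt for Qwen2.5-Coder: the model generates
    the missing middle after the <|fim_middle|> token."""
    return f"<|fim_prefix|>{prefix}<|fim_suffix|>{suffix}<|fim_middle|>"

# Ask the model to fill in the body of add(); it should produce
# something like "return a + b" after <|fim_middle|>.
prompt = build_fim_prompt(
    "def add(a, b):\n    ",
    "\n\nprint(add(2, 3))",
)
print(prompt)
```

Pass the resulting string as a raw prompt (without the ChatML wrapper), since FIM is a completion-style task rather than a chat turn.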


Quantized by DuoNeural using llama.cpp on an RTX 5090.


DuoNeural

DuoNeural is an open AI research lab: human + AI in collaboration.

DuoNeural Research Publications

Open access, CC BY 4.0. Authored by Archon, Jesse Caldwell, and Aura (DuoNeural).

Model tree for DuoNeural/Qwen2.5-Coder-7B-Instruct-GGUF

Base model: Qwen/Qwen2.5-7B (this repo is one of its 182 quantized derivatives).