Qwen3.5-27B-TQ3_1S

Qwen3.5-27B-TQ3_1S is a GGUF quantization of Qwen/Qwen3.5-27B using TQ3_1S, a 3.5-bit Walsh-Hadamard-transform weight format.

Files

  • Qwen3.5-27B-TQ3_1S.gguf
  • mmproj-BF16.gguf

Runtime Requirement

This model requires the public TurboQuant runtime fork:

  • https://github.com/turbo-tan/llama.cpp-tq3

It will not load correctly on stock llama.cpp or other runtimes that do not include TQ3_1S.
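Building the fork follows the standard llama.cpp CMake flow. A minimal sketch, assuming a CUDA backend; the `-DGGML_CUDA=ON` flag is an example, not a fork-specific requirement — adjust for Metal, Vulkan, or CPU-only builds:

```shell
# Clone the TurboQuant fork and build llama-server.
git clone https://github.com/turbo-tan/llama.cpp-tq3
cd llama.cpp-tq3
cmake -B build -DGGML_CUDA=ON          # example backend flag; adjust for your hardware
cmake --build build --config Release -j
# The server binary lands in ./build/bin/llama-server
```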

Text-Only Run

./build/bin/llama-server \
  -m /path/to/Qwen3.5-27B-TQ3_1S.gguf \
  -ngl 99 -c 8192 -np 1 \
  -ctk q8_0 -ctv q8_0 -fa on \
  --cache-ram 0 --no-warmup --jinja \
  --reasoning off --reasoning-budget 0 --reasoning-format deepseek
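Once the server is up you can query it through its OpenAI-compatible chat endpoint. A minimal sketch, assuming the default port 8080 and a local instance; the prompt text is illustrative:

```shell
# Send a chat completion request to the running llama-server instance.
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "user", "content": "Summarize the GGUF file format in two sentences."}
    ]
  }'
```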

Vision / Image Input

For image input, use the included projector:

./build/bin/llama-server \
  -m /path/to/Qwen3.5-27B-TQ3_1S.gguf \
  --mmproj /path/to/mmproj-BF16.gguf \

  -ngl 99 -c 8192 -np 1 \
  -ctk q8_0 -ctv q8_0 -fa on \
  --cache-ram 0 --no-warmup --jinja \
  --reasoning off --reasoning-budget 0 --reasoning-format deepseek \
  --no-mmproj-offload

If your frontend says image input is unsupported, it is almost always talking to an older server instance that was started without --mmproj.
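An image request can be sent through the same OpenAI-compatible endpoint using a base64 data URI. A sketch, assuming a JPEG named image.jpg, the default port 8080, and a server version whose multimodal endpoint accepts `image_url` content parts:

```shell
# Encode the image and send it alongside a text prompt.
IMG_B64=$(base64 -w0 image.jpg)   # on macOS use: base64 -i image.jpg
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "user", "content": [
        {"type": "text", "text": "Describe this image."},
        {"type": "image_url", "image_url": {"url": "data:image/jpeg;base64,'"$IMG_B64"'"}}
      ]}
    ]
  }'
```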

Quality

Perplexity on wiki.test.raw, context 512, all 580 chunks:

Format   PPL                Size
Q4_0     7.2431 +/- 0.0482  14.4 GB
TQ3_1S   7.2570 +/- 0.0480  12.9 GB
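The numbers above can be reproduced with the llama-perplexity tool that ships in the same build. A sketch, assuming the fork's build directory layout and a local copy of wiki.test.raw; `-c 512` matches the context size used in the table:

```shell
# Run the perplexity benchmark at context 512, fully offloaded.
./build/bin/llama-perplexity \
  -m /path/to/Qwen3.5-27B-TQ3_1S.gguf \
  -f /path/to/wiki.test.raw \
  -c 512 -ngl 99
```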

Base Model

  • Qwen/Qwen3.5-27B
License

Same license terms as the base model apply.
