Qwen3.5-27B-TQ3_1S

Qwen3.5-27B-TQ3_1S is a GGUF quantization of Qwen/Qwen3.5-27B using TQ3_1S, a 3.5-bit Walsh-Hadamard-transform weight format.

Files

  • Qwen3.5-27B-TQ3_1S.gguf
  • mmproj-BF16.gguf

Runtime Requirement

This model requires the public TurboQuant runtime fork:

  • https://github.com/turbo-tan/llama.cpp-tq3

It will not load correctly on stock llama.cpp or other runtimes that do not include TQ3_1S.
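Building the fork follows the standard llama.cpp CMake flow. A minimal sketch, assuming a CUDA backend; the `-DGGML_CUDA=ON` flag is an example, not a fork-specific requirement — adjust for Metal, Vulkan, or CPU-only builds:

```shell
# Clone the TurboQuant fork and build llama-server.
git clone https://github.com/turbo-tan/llama.cpp-tq3
cd llama.cpp-tq3
cmake -B build -DGGML_CUDA=ON          # example backend flag; adjust for your hardware
cmake --build build --config Release -j
# The server binary lands in ./build/bin/llama-server
```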

Text-Only Run

./build/bin/llama-server \
  -m /path/to/Qwen3.5-27B-TQ3_1S.gguf \
  -ngl 99 -c 8192 -np 1 \
  -ctk q8_0 -ctv q8_0 -fa on \
  --cache-ram 0 --no-warmup --jinja \
  --reasoning off --reasoning-budget 0 --reasoning-format deepseek
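Once the server is up you can query it through its OpenAI-compatible chat endpoint. A minimal sketch, assuming the default port 8080 and a local instance; the prompt text is illustrative:

```shell
# Send a chat completion request to the running llama-server instance.
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "user", "content": "Summarize the GGUF file format in two sentences."}
    ]
  }'
```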

Vision / Image Input

For image input, use the included projector:

./build/bin/llama-server \
  -m /path/to/Qwen3.5-27B-TQ3_1S.gguf \
  --mmproj /path/to/mmproj-BF16.gguf \

  -ngl 99 -c 8192 -np 1 \
  -ctk q8_0 -ctv q8_0 -fa on \
  --cache-ram 0 --no-warmup --jinja \
  --reasoning off --reasoning-budget 0 --reasoning-format deepseek \
  --no-mmproj-offload

If your frontend says image input is unsupported, it is almost always talking to an older server instance that was started without --mmproj.
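An image request can be sent through the same OpenAI-compatible endpoint using a base64 data URI. A sketch, assuming a JPEG named image.jpg, the default port 8080, and a server version whose multimodal endpoint accepts `image_url` content parts:

```shell
# Encode the image and send it alongside a text prompt.
IMG_B64=$(base64 -w0 image.jpg)   # on macOS use: base64 -i image.jpg
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "user", "content": [
        {"type": "text", "text": "Describe this image."},
        {"type": "image_url", "image_url": {"url": "data:image/jpeg;base64,'"$IMG_B64"'"}}
      ]}
    ]
  }'
```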

Quality

Perplexity on wiki.test.raw, context 512, all 580 chunks:

Format   PPL                Size
Q4_0     7.2431 +/- 0.0482  14.4 GB
TQ3_1S   7.2570 +/- 0.0480  12.9 GB
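The numbers above can be reproduced with the llama-perplexity tool that ships in the same build. A sketch, assuming the fork's build directory layout and a local copy of wiki.test.raw; `-c 512` matches the context size used in the table:

```shell
# Run the perplexity benchmark at context 512, fully offloaded.
./build/bin/llama-perplexity \
  -m /path/to/Qwen3.5-27B-TQ3_1S.gguf \
  -f /path/to/wiki.test.raw \
  -c 512 -ngl 99
```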

Base Model

  • Qwen/Qwen3.5-27B
License

Same license terms as the base model apply.
