# Qwen3.5-27B-TQ3_4S
Clean base TQ3_4S GGUF release for Qwen3.5-27B.
TQ3_4S is a 3.5-bit weight format that applies a Walsh-Hadamard transform before quantization and stores four sub-block scales (one per 8 weights) in each 32-weight block.
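As a rough illustration of the transform step (not the actual TQ3_4S kernel), a Walsh-Hadamard rotation spreads outlier weights across a block before low-bit rounding; the block size of 32 matches the format's block length, but everything else below is a simplified sketch:

```python
import numpy as np

def hadamard(n: int) -> np.ndarray:
    """Sylvester construction of an n x n Hadamard matrix (n a power of two)."""
    H = np.array([[1.0]])
    while H.shape[0] < n:
        H = np.block([[H, H], [H, -H]])
    return H

rng = np.random.default_rng(0)
block = rng.normal(size=32)              # one 32-weight block
H = hadamard(32) / np.sqrt(32)           # orthonormal: H @ H.T == I
transformed = H @ block                  # rotate the block before quantizing
restored = H.T @ transformed             # exact inverse, since H is orthogonal
assert np.allclose(restored, block)
```

Because the rotation is orthogonal, it is exactly invertible at dequantization time; quantization error is only introduced by the low-bit rounding applied to the transformed values.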
## Summary

- Format: TQ3_4S
- Model size: about 12.9 GiB
- Target runtime: public TurboQuant-enabled llama.cpp
- Intended use: local inference on consumer GPUs
- Multimodal projector included: `mmproj-BF16.gguf`
## Quality

Perplexity for Qwen3.5-27B on `wiki.test.raw` at `c=2048`:

| Format | PPL | Size |
|---|---|---|
| TQ3_4S | 6.8224 +/- 0.04534 | 12.9 GiB |
| Q3_K_S | 6.8630 +/- 0.04583 | 11.4 GiB |
| TQ3_1S | 6.9807 +/- 0.04690 | 12.9 GiB |
| EXL3 3.0bpw | 7.027580 | ~13.0 GiB |
Notes:

- `TQ3_4S` and `Q3_K_S` are full-pass `llama-perplexity` results.
- `TQ3_1S` is also a full-pass `llama-perplexity` result at `c=2048`.
- `EXL3 3.0bpw` is from a local 145 x 2048 eval, not `llama-perplexity`.
- This 27B result should not be read as evidence that plain `TQ3_4S` works equally well on smaller dense models.
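The full-pass numbers above come from `llama-perplexity`. With the fork built as in the Runtime section, an invocation along these lines reproduces the setup (paths are placeholders; `wiki.test.raw` is the wikitext-2 raw test split):

```shell
# Full-pass perplexity at context length 2048.
./build/bin/llama-perplexity \
  -m /path/to/Qwen_Qwen3.5-27B-TQ3_4S.gguf \
  -f /path/to/wiki.test.raw \
  -c 2048 \
  -ngl 99
```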
## Runtime

This model requires the public TurboQuant runtime fork.

Build and run:

```shell
git clone https://github.com/turbo-tan/llama.cpp-tq3.git
cd llama.cpp-tq3
cmake -B build -DGGML_CUDA=ON -DCMAKE_BUILD_TYPE=Release
cmake --build build -j

./build/bin/llama-server \
  -m /path/to/Qwen_Qwen3.5-27B-TQ3_4S.gguf \
  -ngl 99 \
  -fa on \
  -c 8192 \
  -ctk q8_0 -ctv q8_0 \
  --cache-ram 0 \
  --no-warmup --jinja \
  --reasoning off --reasoning-budget 0 --reasoning-format deepseek \
  --port 8090
```
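Once the server is up on port 8090, it exposes llama-server's OpenAI-compatible HTTP API; a quick smoke test from the command line:

```shell
curl http://localhost:8090/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "user", "content": "Say hello in one short sentence."}
    ],
    "max_tokens": 64
  }'
```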
## Vision / Image Input

Use the included projector:

```shell
./build/bin/llama-server \
  -m /path/to/Qwen_Qwen3.5-27B-TQ3_4S.gguf \
  --mmproj /path/to/mmproj-BF16.gguf \
  -ngl 99 -c 8192 -np 1 \
  -ctk q8_0 -ctv q8_0 -fa on \
  --cache-ram 0 --no-warmup --jinja \
  --reasoning off --reasoning-budget 0 --reasoning-format deepseek \
  --no-mmproj-offload
```

If your frontend reports that image input is unsupported, it is usually still pointing at a server instance that was started without `--mmproj`.
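To check image input without a frontend, you can send a local image as a base64 data URI through the same chat endpoint (a sketch; `base64 -w0` is the GNU coreutils form, and the path is a placeholder):

```shell
# Requires a server instance started with --mmproj as shown above.
IMG_B64=$(base64 -w0 /path/to/image.png)
curl http://localhost:8090/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [{
      "role": "user",
      "content": [
        {"type": "text", "text": "Describe this image."},
        {"type": "image_url",
         "image_url": {"url": "data:image/png;base64,'"$IMG_B64"'"}}
      ]
    }]
  }'
```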
## Notes
This upload is the clean base TQ3_4S release, not the private KLD-guided mixed-precision variants.
## Credits
- llama.cpp
- Qwen3.5-27B
- Walsh-Hadamard / transform-quantization line including RaBitQ, TurboQuant, and related work