Qwopus3.5-27B-v3-TQ3_4S

TQ3_4S is a 3.5-bit Walsh-Hadamard-transform weight format that stores four per-8-weight scales in each 32-weight block.

This release is a TQ3_4S GGUF quantization of Jackrong/Qwopus3.5-27B-v3, which is itself derived from the Qwen3.5-27B family.
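A back-of-the-envelope size check follows from the nominal bit rate. This sketch counts only the 3.5-bit weight payload; the per-block scale overhead and GGUF metadata (whose exact widths are not stated in this card) add somewhat on top:

```shell
# Rough payload-only size estimate for 27B parameters at 3.5 bits/weight.
# Scale overhead and non-quantized tensors are ignored, so the real file is larger.
awk 'BEGIN {
  params = 27e9                       # parameter count
  bpw    = 3.5                        # nominal bits per weight (payload only)
  bytes  = params * bpw / 8
  printf "~%.1f GiB\n", bytes / (1024^3)
}'
```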

Quantization Source

  • HF source checkout:
    • Jackrong/Qwopus3.5-27B-v3
  • upstream family:
    • Qwen/Qwen3.5-27B
  • F16 GGUF used as the quantization source:
    • Qwopus3.5-27B-v3-f16.gguf
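The card does not state how the F16 GGUF was produced. A typical path, assuming llama.cpp's stock converter was used on the HF checkout, would be:

```shell
# Hypothetical conversion step; paths are placeholders.
python3 convert_hf_to_gguf.py \
  /path/to/Qwopus3.5-27B-v3 \
  --outfile /path/to/Qwopus3.5-27B-v3-f16.gguf \
  --outtype f16
```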

Quantized with:

./build/bin/llama-quantize \
  /path/to/Qwopus3.5-27B-v3-f16.gguf \
  /path/to/Qwopus3.5-27B-v3-TQ3_4S.gguf \
  TQ3_4S \
  8   # thread count

Quality

Full-pass perplexity on wiki.test.raw at context size 2048:

  • Final PPL = 6.3433 +/- 0.03999
  • Median chunk PPL = 6.1953
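The numbers above can be reproduced with llama.cpp's perplexity tool; the exact invocation is not given in this card, but a sketch matching the stated settings would be:

```shell
# Hypothetical reproduction of the PPL run; paths are placeholders.
./build/bin/llama-perplexity \
  -m /path/to/Qwopus3.5-27B-v3-TQ3_4S.gguf \
  -f /path/to/wiki.test.raw \
  -c 2048
```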

Runtime Validation

Validated on a clean checkout of the public llama.cpp-tq3 main branch:

  • runtime commit: 62eb27dce
  • strict chat smoke:
    • prompt: Write ONLY the word ok.
    • response: ok
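The smoke test above can be run with a one-shot llama-cli invocation; the flags below are a sketch, not necessarily the exact command used for validation:

```shell
# Hypothetical single-prompt smoke test; path is a placeholder.
./build/bin/llama-cli \
  -m /path/to/Qwopus3.5-27B-v3-TQ3_4S.gguf \
  -p "Write ONLY the word ok." \
  -n 8
```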

Validated server profile:

./build/bin/llama-server \
  -m /path/to/Qwopus3.5-27B-v3-TQ3_4S.gguf \
  -a qwopus35-27b-v3-tq3_4s \
  --host 127.0.0.1 --port 8080 \
  -ngl 99 -c 4096 -np 1 \
  -ctk q8_0 -ctv q8_0 -fa on \
  --no-warmup --jinja \
  --reasoning off --reasoning-budget 0 --reasoning-format deepseek
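With the server running under the profile above, it can be queried through the OpenAI-compatible endpoint using the alias set by `-a`:

```shell
# Query the running server; host/port match the profile above.
curl -s http://127.0.0.1:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "qwopus35-27b-v3-tq3_4s",
        "messages": [{"role": "user", "content": "Write ONLY the word ok."}],
        "max_tokens": 8
      }'
```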

Notes

  • This is a weight quantization release for the Qwopus v3 model line.
  • The TQ3_4S runtime is provided by:
    • turbo-tan/llama.cpp-tq3
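Building the runtime follows the standard llama.cpp CMake flow. The repository URL below is an assumption inferred from the `turbo-tan/llama.cpp-tq3` name; the pinned commit is the one noted under Runtime Validation:

```shell
# Assumed repository location; substitute the actual URL if it differs.
git clone https://github.com/turbo-tan/llama.cpp-tq3
cd llama.cpp-tq3
git checkout 62eb27dce   # runtime commit noted above
cmake -B build
cmake --build build --config Release -j
```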
