Qwopus3.5-27B-v3-TQ3_4S

TQ3_4S is a 3.5-bit Walsh-Hadamard-transform weight format that stores four per-8-weight scales in each 32-weight block.

This release is a TQ3_4S GGUF quantization of Jackrong/Qwopus3.5-27B-v3, which is itself derived from the Qwen3.5-27B family.
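A back-of-the-envelope size check follows from the nominal bit rate. This sketch counts only the 3.5-bit weight payload; the per-block scale overhead and GGUF metadata (whose exact widths are not stated in this card) add somewhat on top:

```shell
# Rough payload-only size estimate for 27B parameters at 3.5 bits/weight.
# Scale overhead and non-quantized tensors are ignored, so the real file is larger.
awk 'BEGIN {
  params = 27e9                       # parameter count
  bpw    = 3.5                        # nominal bits per weight (payload only)
  bytes  = params * bpw / 8
  printf "~%.1f GiB\n", bytes / (1024^3)
}'
```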

Quantization Source

  • HF source checkout:
    • Jackrong/Qwopus3.5-27B-v3
  • upstream family:
    • Qwen/Qwen3.5-27B
  • F16 GGUF used as the quantization source:
    • Qwopus3.5-27B-v3-f16.gguf
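The card does not state how the F16 GGUF was produced. A typical path, assuming llama.cpp's stock converter was used on the HF checkout, would be:

```shell
# Hypothetical conversion step; paths are placeholders.
python3 convert_hf_to_gguf.py \
  /path/to/Qwopus3.5-27B-v3 \
  --outfile /path/to/Qwopus3.5-27B-v3-f16.gguf \
  --outtype f16
```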

Quantized with:

./build/bin/llama-quantize \
  /path/to/Qwopus3.5-27B-v3-f16.gguf \
  /path/to/Qwopus3.5-27B-v3-TQ3_4S.gguf \
  TQ3_4S \
  8   # thread count

Quality

Full-pass perplexity on wiki.test.raw at context size 2048:

  • Final PPL = 6.3433 +/- 0.03999
  • Median chunk PPL = 6.1953
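The numbers above can be reproduced with llama.cpp's perplexity tool; the exact invocation is not given in this card, but a sketch matching the stated settings would be:

```shell
# Hypothetical reproduction of the PPL run; paths are placeholders.
./build/bin/llama-perplexity \
  -m /path/to/Qwopus3.5-27B-v3-TQ3_4S.gguf \
  -f /path/to/wiki.test.raw \
  -c 2048
```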

Runtime Validation

Validated on a clean checkout of the public llama.cpp-tq3 main branch:

  • runtime commit: 62eb27dce
  • strict chat smoke:
    • prompt: Write ONLY the word ok.
    • response: ok
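The smoke test above can be run with a one-shot llama-cli invocation; the flags below are a sketch, not necessarily the exact command used for validation:

```shell
# Hypothetical single-prompt smoke test; path is a placeholder.
./build/bin/llama-cli \
  -m /path/to/Qwopus3.5-27B-v3-TQ3_4S.gguf \
  -p "Write ONLY the word ok." \
  -n 8
```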

Validated server profile:

./build/bin/llama-server \
  -m /path/to/Qwopus3.5-27B-v3-TQ3_4S.gguf \
  -a qwopus35-27b-v3-tq3_4s \
  --host 127.0.0.1 --port 8080 \
  -ngl 99 -c 4096 -np 1 \
  -ctk q8_0 -ctv q8_0 -fa on \
  --no-warmup --jinja \
  --reasoning off --reasoning-budget 0 --reasoning-format deepseek
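With the server running under the profile above, it can be queried through the OpenAI-compatible endpoint using the alias set by `-a`:

```shell
# Query the running server; host/port match the profile above.
curl -s http://127.0.0.1:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "qwopus35-27b-v3-tq3_4s",
        "messages": [{"role": "user", "content": "Write ONLY the word ok."}],
        "max_tokens": 8
      }'
```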

Notes

  • This is a weight quantization release for the Qwopus v3 model line.
  • The TQ3_4S runtime is provided by:
    • turbo-tan/llama.cpp-tq3
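Building the runtime follows the standard llama.cpp CMake flow. The repository URL below is an assumption inferred from the `turbo-tan/llama.cpp-tq3` name; the pinned commit is the one noted under Runtime Validation:

```shell
# Assumed repository location; substitute the actual URL if it differs.
git clone https://github.com/turbo-tan/llama.cpp-tq3
cd llama.cpp-tq3
git checkout 62eb27dce   # runtime commit noted above
cmake -B build
cmake --build build --config Release -j
```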
