schuttdev/hipfire-qwen3.5-27b

+---
+license: mit
+base_model: Qwen/Qwen3.5-27B
+tags:
+  - hipfire
+  - amd
+  - rdna
+  - quantized
+  - qwen3.5
+library_name: hipfire
+---
+# Qwen3.5-27B for hipfire
+Pre-quantized **Qwen3.5-27B** (DeltaNet hybrid) for [hipfire](https://github.com/Kaden-Schutt/hipfire), a Rust-native LLM inference engine for AMD RDNA GPUs.
+Quantized from [Qwen/Qwen3.5-27B](https://huggingface.co/Qwen/Qwen3.5-27B).
+## Files
+| File | Quant | Size | Min VRAM | Speed (5700 XT) |
+|------|-------|------|----------|-----------------|
+| `qwen3.5-27b.q4.hfq` | HFQ4 | 14.3 GB | 16 GB | TBD |
+| `qwen3.5-27b.hfq6.hfq` | HFQ6 | 21.4 GB | 24 GB | TBD |
+## GPU Compatibility
+| GPU | VRAM | HFQ4 | HFQ6 |
+|-----|------|------|------|
+| RX 5700 XT | 8 GB | No | No |
+| RX 6800 XT | 16 GB | Yes | No |
+| RX 7900 XTX | 24 GB | Yes | Yes |
+| RX 9070 | 16 GB | Yes | No |
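The Yes/No cells above follow from comparing each card's VRAM with the Min VRAM column of the Files table. Here is a minimal shell sketch of that check. The function names `min_vram_gb` and `fits` are illustrative only and are not part of the hipfire CLI.

```shell
# Minimum VRAM per quant in GB, copied from the Files table.
min_vram_gb() {
  case "$1" in
    HFQ4) echo 16 ;;
    HFQ6) echo 24 ;;
    *) echo "unknown quant: $1" >&2; return 1 ;;
  esac
}

# fits QUANT CARD_VRAM_GB
# Prints Yes when the card has at least the minimum VRAM for the quant, else No.
fits() {
  need=$(min_vram_gb "$1") || return 1
  if [ "$2" -ge "$need" ]; then
    echo Yes
  else
    echo No
  fi
}

fits HFQ4 16   # a 16 GB card such as the RX 6800 XT: prints Yes
fits HFQ6 16   # same card with HFQ6: prints No
```

This only checks the published minimum. Actual headroom also depends on context length and whatever else is using the GPU.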
+## Usage
+```bash
+# Install hipfire
+curl -L https://raw.githubusercontent.com/Kaden-Schutt/hipfire/master/scripts/install.sh | bash
+# Pull and run
+hipfire pull qwen3.5:27b
+hipfire run qwen3.5:27b "Hello"
+```
+## Quantization Formats
+- **HFQ4**: 4-bit weights in groups of 256 (0.53 bytes per weight). Fastest option.
+- **HFQ6**: 6-bit weights in groups of 256 (0.78 bytes per weight). Highest quality, about 15% slower than HFQ4.
+Both formats embed the tokenizer and model config.
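The B/w (bytes per weight) figures give a quick way to estimate file size: multiply by the parameter count. The sketch below assumes a nominal 27 billion parameters (taken from the model name, not an exact count) and 1 GB = 10^9 bytes. `estimate_gb` is a hypothetical helper, not a hipfire command.

```shell
# estimate_gb PARAMS_IN_BILLIONS BYTES_PER_WEIGHT
# Prints the approximate weight size in GB (billions of params x bytes per weight).
estimate_gb() {
  awk -v p="$1" -v b="$2" 'BEGIN { printf "%.1f\n", p * b }'
}

estimate_gb 27 0.53   # HFQ4: prints 14.3
estimate_gb 27 0.78   # HFQ6: prints 21.1
```

The HFQ4 estimate matches the listed 14.3 GB. The HFQ6 estimate comes out slightly under the listed 21.4 GB, which is expected from a rounded parameter count and rounded B/w values.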
+## About hipfire
+hipfire is a Rust + HIP inference engine for AMD consumer GPUs (RDNA1–RDNA4), with no Python in the hot path. The project reports a 9x speedup over llama.cpp+ROCm on the same hardware.
+- GitHub: [Kaden-Schutt/hipfire](https://github.com/Kaden-Schutt/hipfire)
+- All models: [docs/MODELS.md](https://github.com/Kaden-Schutt/hipfire/blob/master/docs/MODELS.md)
+## License
+The model weights are subject to the original [Qwen license](https://huggingface.co/Qwen/Qwen3.5-27B). The hipfire engine is MIT-licensed.