# Qwen3.5-397B-A17B-Uncensored-REAP35 GGUF
Expert-pruned version of Qwen3.5-397B-A17B-Uncensored using REAP (Router-weighted Expert Activation Pruning): roughly 35% of experts are removed, reducing each MoE layer from 512 to 332 experts.
Based on the uncensored fine-tune by timteh673.
## Available Quants
| File | Quant | Size | Description |
|---|---|---|---|
| Qwen3.5-397B-A17B-Uncensored-REAP35-Q8_0 | Q8_0 | 259 GB | Full-precision pruned weights, for re-quantization |
More quant variants coming soon.
## REAP Pruning Method
REAP scores each expert using imatrix calibration data and uniformly removes the lowest-scoring experts from every MoE layer.
Each expert receives a score based on two signals captured during calibration inference:
- Activation count: how many times the expert was selected by the router
- Activation magnitude: sum of squared input activations when the expert was active

The final score is: `normalized_count × normalized_magnitude`
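The scoring and pruning steps above can be sketched as follows. This is an illustrative reconstruction, not the actual REAP implementation; the function names and the `keep_fraction` parameter are assumptions made for the example.

```python
def reap_scores(counts, magnitudes):
    """Hypothetical sketch of REAP expert scoring from calibration stats.

    counts[i]     -- times the router selected expert i during calibration
    magnitudes[i] -- sum of squared input activations while expert i was active
    """
    max_c, max_m = max(counts), max(magnitudes)
    # Normalize each signal to [0, 1] so neither dominates, then multiply:
    # score_i = normalized_count_i * normalized_magnitude_i
    return [(c / max_c) * (m / max_m) for c, m in zip(counts, magnitudes)]

def experts_to_keep(scores, keep_fraction=0.65):
    """Indices of the top keep_fraction experts by score (the rest are pruned)."""
    n_keep = round(len(scores) * keep_fraction)
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:n_keep])
```

Because the same `keep_fraction` is applied to every MoE layer, the pruning is uniform: each layer loses the same number of experts, just not the same expert indices.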
- Base model: Qwen3.5-397B-A17B-Uncensored (512 experts, 10 active per layer)
- Pruned: 512 → 332 experts per layer (~35% removed)
- Calibration data: unsloth calibration dataset (base model imatrix)
## Model Tree

- Repository: Goldkoron/Qwen3.5-397B-A17B-Uncensored-REAP35
- Base model: Qwen/Qwen3.5-397B-A17B